Skip to content

feat(spanner): Batch DML#5139

Merged
olavloite merged 4 commits intogoogleapis:mainfrom
olavloite:spanner-batch-dml
Mar 27, 2026
Merged

feat(spanner): Batch DML#5139
olavloite merged 4 commits intogoogleapis:mainfrom
olavloite:spanner-batch-dml

Conversation

@olavloite
Copy link
Copy Markdown
Contributor

Adds support for Batch DML statements. Normally, a Batch DML statement returns an array of update counts containing the exact update count of each statement.

A Batch DML statement can be partly executed by Spanner and fail after N statements. In that case, the client returns a specific error that contains the update counts of the first N statements, and an error for the N+1th statement.

It is also possible that a Batch DML statement is partly executed, and that the read/write transaction is aborted by Spanner after N statements. This is propagated as a 'normal' error to trigger a transaction retry.

Adds support for Batch DML statements. Normally, a Batch DML statement returns an
array of update counts containing the exact update count of each statement.

A Batch DML statement can be partly executed by Spanner and fail after N statements.
In that case, the client returns a specific error that contains the update counts
of the first N statements, and an error for the N+1th statement.

It is also possible that a Batch DML statement is partly executed, and that the
read/write transaction is aborted by Spanner after N statements. This is propagated
as a 'normal' error to trigger a transaction retry.
@olavloite olavloite requested a review from a team as a code owner March 25, 2026 16:25
@product-auto-label product-auto-label Bot added the api: spanner Issues related to the Spanner API. label Mar 25, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 99.53704% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.92%. Comparing base (a2d4241) to head (1758ce3).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/spanner/src/read_write_transaction.rs 98.49% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5139      +/-   ##
==========================================
+ Coverage   97.90%   97.92%   +0.02%     
==========================================
  Files         214      215       +1     
  Lines       42809    43228     +419     
==========================================
+ Hits        41914    42333     +419     
  Misses        895      895              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dbolduc dbolduc self-requested a review March 25, 2026 17:14
Copy link
Copy Markdown
Member

@dbolduc dbolduc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits, but I want you to think more about the return type of execute_batch_update because I am not convinced by it (yet).

Comment thread src/spanner/src/batch_dml.rs Outdated
Comment thread src/spanner/src/batch_dml.rs Outdated
Comment thread src/spanner/src/batch_dml.rs Outdated
Comment thread src/spanner/src/batch_dml.rs Outdated
let result = process_response(response);
let err = result.expect_err("should return error");
let batch_err = BatchUpdateError::extract(&err);
assert!(batch_err.is_none());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
assert!(batch_err.is_none());
assert!(batch_err.is_none(), "{batch_err:?}");

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread src/spanner/src/batch_dml.rs Outdated
assert!(batch_err.is_none());
assert_eq!(
err.status().expect("status").code,
google_cloud_gax::error::rpc::Code::Aborted
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
google_cloud_gax::error::rpc::Code::Aborted
Code::Aborted

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread src/spanner/src/batch_dml.rs
/// # Ok(())
/// # }
/// ```
pub async fn execute_batch_update(&self, batch: BatchDml) -> crate::Result<Vec<i64>> {
Copy link
Copy Markdown
Member

@dbolduc dbolduc Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit weird to me that there is a Vec<i64> normally, and in the error case.

If update_counts should always be returned, did you consider something like:

pub struct BatchUpdateResponse {
  pub update_counts: Vec<i64>,
  pub result: crate::Result<()>,
}

It's a little clunky, but worth considering. Maybe you are afraid it does not force the application to look at the result field?


Q: So the application can learn something from the update_counts in the success case, and that is why we return it. Is that right?

If not, we could return a std::result::Result<(), BatchUpdateError>. Where the source of the BatchUpdateError might be a crate::Error, with an empty update_counts if there is some gRPC error.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: So the application can learn something from the update_counts in the success case, and that is why we return it. Is that right?

The application can learn something from it in both the success and failure case:

  1. Success: Verify that the number of rows affected is what you expect. This is often used by query generators and other frameworks (e.g. ORMs) to verify that the update they generated actually updated the number of rows that they expected.
  2. Failure: In theory, you might want to verify the same as in the success case. But the primary use case is to understand which statement caused the failure.

The way that applications then can determine which statement failed is a bit clunky, but this is actually a relatively standard way of handling this type of errors across many programming languages. This is the standard in Java, .NET, and Python for all database drivers. And to a certain degree also Go, but as Go has multiple return arguments, it is there returned as both a result and an error, instead of putting the update counts in the error.

Maybe you are afraid it does not force the application to look at the result field?

Yes, this is the primary reason why I would prefer not to return a BatchUpdateResponse with a result or status field. I am absolutely sure that there will be applications that get that wrong and think that their batch succeeded when it didn't.

If not, we could return a std::result::Result<(), BatchUpdateError>. Where the source of the BatchUpdateError might be a crate::Error, with an empty update_counts if there is some gRPC error.

I actually considered that. The primary reason I did not choose that was to keep the signature of this method as equal as possible to the other execute_... methods that we have (in this case it would be std::result::Result<Vec<i64>, BatchUpdateError>, as we definitely want to return the update counts in case of success).

One (tiny) additional drawback of that approach is that it is harder for an application to distinguish between a gRPC error and an actual BatchUpdateError that failed for the first statement. Although we could consider making the list of update_counts optional and return None in that case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the primary reason why I would prefer not to return a BatchUpdateResponse with a result or status field. I am absolutely sure that there will be applications that get that wrong and think that their batch succeeded when it didn't.

Alright, makes sense to me. I suspected something like this.

let tx = ReadWriteTransactionBuilder::new(db_client)
.begin_transaction()
.await
.expect("Failed to build transaction");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: we prefer to use ? and have our tests return anyhow::Result<()>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

.add_statement("UPDATE Users SET Name = 'Alice' WHERE Id = 1")
.add_statement("UPDATE Users SET Name = 'Bob' WHERE Id = 2");

let counts = tx.execute_batch_update(batch.build()).await.unwrap();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment thread src/spanner/src/transaction_runner.rs Outdated
Comment on lines +508 to +509
attempt += 1;
if attempt == 1 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use mockall::Sequence with separate mock.expect_foo() calls.

Here is an example:

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (missed this one yesterday)

Comment on lines +196 to +210
/// match transaction.execute_batch_update(batch).await {
/// Ok(update_counts) => {
/// println!("All statements succeeded. Update counts: {:?}", update_counts);
/// }
/// Err(e) => {
/// if let Some(batch_error) = BatchUpdateError::extract(&e) {
/// println!(
/// "Batch failed halfway. Successful update counts before failure: {:?}",
/// batch_error.update_counts
/// );
/// } else {
/// println!("The entire batch failed, or an internal error occurred: {}", e);
/// }
/// }
/// }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would look like:

let resp = transaction.execute_batch_update(batch).await;
match resp.result {
  Ok(()) => println!("All statements succeeded. Update counts: {:?}", resp.update_counts),
  Err(e) => println!("Batch failed. Successful update counts before failure: {:?}", resp.update_counts),
}

(The case where update_counts.is_empty() does not seem like a special case to me 🤷)

/// # Ok(())
/// # }
/// ```
pub async fn execute_batch_update(&self, batch: BatchDml) -> crate::Result<Vec<i64>> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the primary reason why I would prefer not to return a BatchUpdateResponse with a result or status field. I am absolutely sure that there will be applications that get that wrong and think that their batch succeeded when it didn't.

Alright, makes sense to me. I suspected something like this.

@olavloite olavloite merged commit cf86949 into googleapis:main Mar 27, 2026
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the Spanner API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants