Spark: backport PR #15512 to v3.4, v3.5, v4.0 for WAP branch delete fix #16245
Merged
amogh-jahagirdar merged 2 commits into apache:main on May 7, 2026
Conversation
…lete fix When WAP is enabled via spark.wap.branch, canDeleteWhere() previously scanned the main branch while deleteWhere() committed to the WAP branch. This could cause canDeleteWhere() to incorrectly approve a metadata-only delete based on data that was never on the WAP branch, surfacing as "Cannot delete file where some, but not all, rows match filter" at commit time. Resolve the scan branch the same way deleteWhere resolves the write branch (with a fall-back to main when the WAP branch has not been created yet), and pass it through canDeleteUsingMetadata. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Iceberg style requires an empty line between a control flow block and the following statement. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
kevinjqliu
approved these changes
May 7, 2026
huaxingao
approved these changes
May 7, 2026
singhpk234
approved these changes
May 7, 2026
amogh-jahagirdar
approved these changes
May 7, 2026
Backport of #15512 to Spark v3.4, v3.5, and v4.0.
Bug
When WAP (Write-Audit-Publish) is enabled via `spark.wap.branch`, `canDeleteWhere()` and `deleteWhere()` could target different branches:

- `canDeleteWhere()` scanned the table identifier branch (null → main), because the WAP branch is only a session config and not part of the identifier.
- `deleteWhere()` resolved the WAP branch before committing.

This could cause `canDeleteWhere()` to incorrectly approve a metadata-only delete based on data that was never on the WAP branch, surfacing at commit time as:

```
Cannot delete file where some, but not all, rows match filter
```
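The mismatch described above can be modeled in a few lines of plain Java. This is an illustrative sketch only: the class, method names, and signatures are stand-ins for the real resolution logic in `SparkTable.java`, not the actual code.

```java
import java.util.Map;

// Minimal model (hypothetical names) of the pre-fix behavior: the scan and
// write paths resolved the branch independently, so with spark.wap.branch
// set they could disagree.
public class BranchMismatch {
    static final String MAIN = "main";

    // canDeleteWhere() before the fix: only the identifier branch (null -> main),
    // ignoring the session-level WAP config entirely.
    static String scanBranchBeforeFix(String identifierBranch, Map<String, String> conf) {
        return identifierBranch != null ? identifierBranch : MAIN;
    }

    // deleteWhere(): the WAP branch from the session config wins when set.
    static String writeBranch(String identifierBranch, Map<String, String> conf) {
        String wap = conf.get("spark.wap.branch");
        if (wap != null) {
            return wap;
        }
        return identifierBranch != null ? identifierBranch : MAIN;
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of("spark.wap.branch", "audit");
        // Before the fix the two paths disagree: scan=main, write=audit.
        System.out.println("scan=" + scanBranchBeforeFix(null, conf));
        System.out.println("write=" + writeBranch(null, conf));
    }
}
```

The fix makes the scan path use the same resolution as the write path, so both return the WAP branch here.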
Not a clean backport
The v4.1 fix relies on APIs that don't exist in older versions, so it had to be adapted. Two real differences:
1. `SparkTableUtil.determineReadBranch` doesn't exist in v3.4/v3.5/v4.0. It was added only in v4.1 by #15288 ("Spark 4.1: Align handling of branches in reads and writes"), which restructured read/write branch resolution to share logic, support option-based branches, and add `validateReadBranch`/`validateWriteBranch`. That refactor touched 19 files and is v4.1-only, so backporting it as a prerequisite isn't appropriate. The equivalent logic this fix needs is inlined here as a small private `scanBranchForDelete()` in `SparkTable.java`. The behavior matches v4.1 for the scenarios these versions support (the inlined version skips option-based branch resolution because older versions don't have option-based branches on `SparkTable` in the first place).
2. The `SnapshotUtil.schemaFor(table(), branch)` call. In v4.1's `canDeleteUsingMetadata`, the `StrictMetricsEvaluator` uses a class-level `schema` field and doesn't take `branch` at all. In v3.4/v3.5/v4.0, that line is `new StrictMetricsEvaluator(SnapshotUtil.schemaFor(table(), branch), deleteExpr)`. I changed `branch` → `scanBranch` there too, so the schema lookup stays consistent with the scan ref. v4.1's diff doesn't touch that line because it doesn't exist in v4.1.

Cosmetic: the new tests use `spark.conf().set`/`unset` directly (matching v4.1's PR) instead of the `withSQLConf(...)` lambda style that older WAP tests in this file use, because `withSQLConf` takes a `Runnable`, which can't throw the checked `NoSuchTableException` that `append(...)` declares.

The fix in
`deleteWhere` itself didn't need backporting: older versions already call `determineWriteBranch` there. The bug was only on the `canDeleteWhere` read path.

Test plan
- `compileJava` and `compileTestJava` pass for v3.4, v3.5, v4.0
- `spark-extensions` module: `TestDelete#testDeleteToWapBranchCanDeleteWhereScansWapBranch` and `TestDelete#testMetadataDeleteToWapBranchCommitsToWapBranch`

🤖 Generated with Claude Code
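Since the description above inlines the resolution rather than backporting `determineReadBranch`, here is a rough standalone sketch of what the fall-back logic amounts to. All names, the `Map` standing in for the table's refs, and the exact precedence are assumptions based on the description, not the actual patch.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical standalone sketch of scanBranchForDelete(): resolve the scan
// branch the same way the write path does, falling back to main when the
// WAP branch has not been created on the table yet.
public class ScanBranchSketch {
    static final String MAIN_BRANCH = "main";

    static String scanBranchForDelete(
            String identifierBranch,        // branch from the table identifier, may be null
            Optional<String> wapBranch,     // spark.wap.branch session config, if set
            Map<String, ?> tableRefs) {     // refs that currently exist on the table
        if (identifierBranch != null) {
            return identifierBranch;        // an explicit identifier branch wins
        }
        if (wapBranch.isPresent() && tableRefs.containsKey(wapBranch.get())) {
            return wapBranch.get();         // scan the WAP branch once it exists
        }
        return MAIN_BRANCH;                 // WAP branch not created yet: fall back to main
    }

    public static void main(String[] args) {
        Map<String, ?> refs = Map.of("main", 1, "audit", 1);
        // WAP branch exists: scan it.
        System.out.println(scanBranchForDelete(null, Optional.of("audit"), refs));
        // WAP branch not created yet: fall back to main.
        System.out.println(scanBranchForDelete(null, Optional.of("audit"), Map.of("main", 1)));
    }
}
```

The fall-back matters because a WAP branch is typically created lazily by the first write, so a delete issued before any WAP write must still be able to scan main.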
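The `Runnable` limitation behind the cosmetic test-style note above is a general Java rule, sketched here with hypothetical stand-ins for `withSQLConf` and `append`:

```java
// Runnable.run() declares no throws clause, so a lambda body that calls a
// method declaring a checked exception does not compile inside it. The
// helper and method below are stand-ins, not the real test utilities.
public class RunnableCheckedDemo {
    // Hypothetical stand-in for a withSQLConf(...)-style helper.
    static void withConf(Runnable body) {
        body.run();
    }

    // Stand-in for a method declaring a checked exception, like append(...).
    static void append() throws Exception {}

    public static void main(String[] args) {
        // withConf(() -> append());   // does NOT compile: unhandled checked Exception
        withConf(() -> {               // a workaround: wrap in an unchecked exception
            try {
                append();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        System.out.println("ok");
    }
}
```

Setting and unsetting the conf directly sidesteps the wrapper entirely, which is why the backported tests follow v4.1's style rather than this file's older `withSQLConf` pattern.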