perf: linear reverse-op construction in diff_dicts/diff_sets #38
Merged
Replace the per-iteration rops.insert(0, ...) calls and the key_rops.extend(rops) accumulator in diff_dicts with bucket-based assembly: collect input-only, output-only, and common reverse-op chunks separately, then splice them together once at the end. Same treatment for diff_sets. Op order is preserved bit-for-bit (existing tests pass unchanged); time drops from O(n^2) to O(n) for dicts with large common-key sets. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Korijn
approved these changes
May 26, 2026
Collaborator
|
😎 |
Merged
berendkleinhaneveld
added a commit
that referenced
this pull request
May 26, 2026
This version includes the following:
* ci: Update versions of steps and let setup-uv manage the python version (#40)
* perf: trim common prefix/suffix in diff_lists (#36)
* perf: use str.replace in Pointer escape/unescape (#37)
* fix: produce() snapshots mutable values when recording patches (#34)
* perf: linear reverse-op construction in diff_dicts/diff_sets (#38)
* fix: iapply raises on invalid patch paths (#35)
* fix: to_json mutating caller's ops (#33)
Summary
diff_dicts and diff_sets in patchdiff/diff.py built their reverse-op list with rops.insert(0, …) inside loops, and diff_dicts further prepended each common-key chunk via key_rops.extend(rops); rops = key_rops. Both patterns are O(n²) in the number of keys/values.

This PR replaces them with bucket-based assembly: collect input-only, output-only, and common reverse-op chunks separately, then splice once at the end. Op order is preserved bit-for-bit; all existing tests pass without modification.
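The two patterns described above can be illustrated with a minimal sketch. It is not the code from patchdiff/diff.py: the function names are hypothetical and the "ops" are plain strings. It only shows that prepending inside a loop and collecting first, then splicing once, produce the same order.

```python
def reversed_by_prepend(items):
    """Quadratic pattern: every insert(0, ...) shifts all elements already stored."""
    rops = []
    for item in items:
        rops.insert(0, item)
    return rops


def reversed_by_append(items):
    """Linear pattern: append inside the loop, reverse once at the end."""
    bucket = []
    for item in items:
        bucket.append(item)
    bucket.reverse()
    return bucket


def prepend_chunks_old(chunks):
    """Quadratic pattern: each chunk is prepended by copying the whole accumulator."""
    rops = []
    for chunk in chunks:
        key_rops = list(chunk)
        key_rops.extend(rops)  # copies everything accumulated so far
        rops = key_rops
    return rops


def prepend_chunks_new(chunks):
    """Linear pattern: keep the chunks in a bucket, splice once in reverse chunk order."""
    bucket = [list(chunk) for chunk in chunks]
    rops = []
    for chunk in reversed(bucket):
        rops.extend(chunk)
    return rops


items = ["r1", "r2", "r3"]
chunks = [["a1", "a2"], ["b1"], ["c1", "c2"]]
same_items = reversed_by_prepend(items) == reversed_by_append(items)
same_chunks = prepend_chunks_old(chunks) == prepend_chunks_new(chunks)
```

Both comparisons evaluate to True, which is the sense in which a rewrite of this kind can keep the op order identical while removing the repeated shifting and copying.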
Op-order statement
The op order is unchanged. The new code reproduces the exact layering the old insert(0) + key_rops.extend(rops) pattern produced: the common-key chunks that were prepended via key_rops.extend(rops); rops = key_rops, and the two groups of ops that were prepended via insert(0).

The 13 existing equality-on-rops tests in tests/test_diff.py and the 9 apply tests in tests/test_apply.py pass without changes.

Round-trip property tests
Added in tests/test_diff.py:
* test_dict_diff_roundtrip_property: 25 randomized dict pairs (mixed nested values: ints, strings, tuples, dicts, lists, sets) asserting apply(a, ops) == b and apply(b, rops) == a.
* test_set_diff_roundtrip_property: 25 randomized set pairs asserting the same.

Both pass.
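The shape of such a round-trip property check can be sketched as follows. This is an illustration only: toy_diff_sets and toy_apply are hypothetical stand-ins written for this example, not patchdiff functions, and the op format (tuples of an action and a value) is an assumption.

```python
import random


def toy_diff_sets(a, b):
    """Hypothetical stand-in for a set diff: returns (ops, rops)."""
    removed = sorted(a - b)
    added = sorted(b - a)
    ops = [("remove", v) for v in removed] + [("add", v) for v in added]
    rops = [("remove", v) for v in added] + [("add", v) for v in removed]
    return ops, rops


def toy_apply(source, ops):
    """Apply ops to a copy of the set so the input is left untouched."""
    result = set(source)
    for action, value in ops:
        if action == "add":
            result.add(value)
        else:
            result.remove(value)
    return result


def check_roundtrip(pairs=25, seed=0):
    """Assert the forward and reverse round trip on randomized set pairs."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(pairs):
        a = {rng.randrange(20) for _ in range(rng.randrange(10))}
        b = {rng.randrange(20) for _ in range(rng.randrange(10))}
        ops, rops = toy_diff_sets(a, b)
        assert toy_apply(a, ops) == b
        assert toy_apply(b, rops) == a
    return pairs


checked = check_roundtrip()
```

A check like this does not depend on the exact order of the ops, so it complements the equality-on-rops tests rather than replacing them.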
Performance
Reproducer from the plan (3-run mean, M-series Mac):
Growth ratio (n → 2n) on the branch:
That is the linear shape the plan asked for (was 2.6× / 3.0× / 3.3× on master).
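As a rough guide to reading growth ratios, the sketch below models only the element shifts caused by insert(0, ...) and compares the n → 2n ratio with that of purely linear work. The sizes are illustrative and the cost model is an assumption made for this example; it ignores all other work done by the diff.

```python
def shifts_for_prepend(n):
    """Element moves when n items are added with insert(0, ...): 0 + 1 + ... + (n - 1)."""
    return n * (n - 1) // 2


def linear_cost(n):
    """Cost model for append-then-splice: one unit of work per item."""
    return n


def growth_ratios(cost, sizes):
    """Ratio cost(2n) / cost(n) for each n in sizes."""
    return [cost(2 * n) / cost(n) for n in sizes]


sizes = [500, 1000, 2000]  # illustrative sizes, not the reproducer's
quadratic = growth_ratios(shifts_for_prepend, sizes)
linear = growth_ratios(linear_cost, sizes)
```

Under this model the purely quadratic ratio approaches 4 and the linear one is exactly 2. A measured ratio between those two values would be consistent with a mix of linear and quadratic work.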
New benchmark
Added test_dict_diff_large_common[500|1000|2000] to benchmarks/benchmark.py so this win is locked in. Branch numbers:

Test plan
* uv run pytest tests/test_diff.py tests/test_apply.py -v: 24 passed (22 existing + 2 new round-trip)
* uv run pytest -x -q: 270 passed
* uv run pytest benchmarks/benchmark.py --benchmark-only -k "dict or set": passes
* uv run ruff check: clean
* uv run ruff format --check: clean

🤖 Generated with Claude Code