Skip to content

perf: linear reverse-op construction in diff_dicts/diff_sets#38

Merged
berendkleinhaneveld merged 1 commit into
masterfrom
perf/diff-dicts-linear-rops
May 26, 2026
Merged

perf: linear reverse-op construction in diff_dicts/diff_sets#38
berendkleinhaneveld merged 1 commit into
masterfrom
perf/diff-dicts-linear-rops

Conversation

@berendkleinhaneveld
Copy link
Copy Markdown
Collaborator

Summary

diff_dicts and diff_sets in patchdiff/diff.py built their reverse-op list with rops.insert(0, …) inside loops, and diff_dicts further prepended each common-key chunk via key_rops.extend(rops); rops = key_rops. Both patterns are O(n²) in the number of keys/values.

This PR replaces them with bucket-based assembly: collect input-only, output-only, and common reverse-op chunks separately, then splice once at the end. Op order is preserved bit-for-bit — all existing tests pass without modification.

Op-order statement

The op order is unchanged. The new code reproduces the exact layering the old insert(0) + key_rops.extend(rops) pattern produced:

  • common chunks in reverse iteration order (later common keys are prepended ahead of earlier ones, matching key_rops.extend(rops); rops = key_rops)
  • followed by output-only rops in reverse iteration order (matching insert(0))
  • followed by input-only rops in reverse iteration order (matching insert(0))

The 13 existing equality-on-rops tests in tests/test_diff.py and the 9 apply tests in tests/test_apply.py pass without changes.

Round-trip property tests

Added in tests/test_diff.py:

  • test_dict_diff_roundtrip_property — 25 randomized dict pairs (mixed nested values: ints, strings, tuples, dicts, lists, sets) asserting apply(a, ops) == b and apply(b, rops) == a.
  • test_set_diff_roundtrip_property — 25 randomized set pairs asserting the same.

Both pass.

Performance

Reproducer from the plan (3-run mean, M-series Mac):

n before (master) after (this PR) speedup
500 0.87 ms 0.86 ms 1.0×
1000 2.28 ms 1.71 ms 1.3×
2000 6.80 ms 3.36 ms 2.0×
4000 22.46 ms 7.29 ms 3.1×

Growth ratio (n → 2n) on the branch:

  • 500 → 1000: 2.0×
  • 1000 → 2000: 2.0×
  • 2000 → 4000: 2.2×

That is the linear shape the plan asked for (was 2.6× / 3.0× / 3.3× on master).

New benchmark

Added test_dict_diff_large_common[500|1000|2000] to benchmarks/benchmark.py so this win is locked in. Branch numbers:

n mean
500 893 µs
1000 1799 µs
2000 3677 µs

Test plan

  • uv run pytest tests/test_diff.py tests/test_apply.py -v — 24 passed (22 existing + 2 new round-trip)
  • uv run pytest -x -q — 270 passed
  • uv run pytest benchmarks/benchmark.py --benchmark-only -k "dict or set" — passes
  • uv run ruff check — clean
  • uv run ruff format --check — clean

🤖 Generated with Claude Code

Replace the per-iteration rops.insert(0, ...) calls and the
key_rops.extend(rops) accumulator in diff_dicts with bucket-based
assembly: collect input-only, output-only, and common reverse-op
chunks separately, then splice them together once at the end. Same
treatment for diff_sets. Op order is preserved bit-for-bit (existing
tests pass unchanged); time drops from O(n^2) to O(n) for dicts with
large common-key sets.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Korijn
Copy link
Copy Markdown
Collaborator

Korijn commented May 26, 2026

😎

@berendkleinhaneveld berendkleinhaneveld marked this pull request as ready for review May 26, 2026 11:25
@berendkleinhaneveld berendkleinhaneveld merged commit 2944284 into master May 26, 2026
9 of 10 checks passed
@berendkleinhaneveld berendkleinhaneveld deleted the perf/diff-dicts-linear-rops branch May 26, 2026 11:25
berendkleinhaneveld added a commit that referenced this pull request May 26, 2026
This version includes the following:

* ci: Update versions of steps and let setup-uv manage the python version (#40)
* perf: trim common prefix/suffix in diff_lists (#36)
* perf: use str.replace in Pointer escape/unescape (#37)
* fix: produce() snapshots mutable values when recording patches (#34)
* perf: linear reverse-op construction in diff_dicts/diff_sets (#38)
* fix: iapply raises on invalid patch paths (#35)
* fix: to_json mutating caller's ops (#33)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants