perf: use str.replace in Pointer escape/unescape #37

Merged
berendkleinhaneveld merged 2 commits into master from perf/pointer-escape-str-replace on May 26, 2026

@berendkleinhaneveld
Collaborator

Summary

Replaces the regex-based escape/unescape helpers in patchdiff/pointer.py with str.replace. For single-character substitutions, str.replace is ~3× faster than a compiled regex, and Pointer.__str__ calls escape(str(t)) for every token in the pointer — so every to_json / to_str_paths call benefits.

The four module-level re.compile objects (slash_re, tilde_re, tilde0_re, tilde1_re) and import re are removed. grep confirms no other file in the repo imports those names.
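For illustration, here is a minimal sketch of what str.replace-based helpers could look like. The names escape and unescape come from this PR; the function bodies are an assumption derived from RFC 6901 and are not copied from patchdiff/pointer.py.

```python
def escape(token: str) -> str:
    """Escape a reference token per RFC 6901 ('~' -> '~0', '/' -> '~1')."""
    # '~' is handled first; otherwise the '~' introduced by '/' -> '~1'
    # would itself be escaped a second time.
    return token.replace("~", "~0").replace("/", "~1")


def unescape(token: str) -> str:
    """Inverse of escape: '~1' -> '/' first, then '~0' -> '~'."""
    return token.replace("~1", "/").replace("~0", "~")


print(escape("has/slash"))       # has~1slash
print(escape("~01"))             # ~001
print(unescape(escape("~/mix"))) # ~/mix
```

Both helpers are plain chained calls with no module-level state, which is why the compiled patterns and import re become unnecessary in a version like this.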

Micro-bench (reproducer from plan)

import timeit
from patchdiff.pointer import escape, unescape
print('escape  100k:', timeit.timeit(lambda: escape('somekey_42'),  number=100_000) * 1000, 'ms')
print('unescape 100k:', timeit.timeit(lambda: unescape('somekey_42'), number=100_000) * 1000, 'ms')
              escape 100k   unescape 100k
master        23.1 ms       23.5 ms
this branch   6.8 ms        7.5 ms
speedup       3.4×          3.1×

(Target was ≥2× on escape.)
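To compare the two approaches without checking out both branches, a self-contained sketch along these lines may help. The regex variant is a hypothetical reconstruction: the names tilde_re and slash_re appear in this PR, but their patterns and the bodies of escape_regex and escape_replace are assumptions. Absolute timings depend on the machine and Python version, so no expected numbers are given.

```python
import re
import timeit

# Assumed patterns; the PR only names the compiled objects.
tilde_re = re.compile(r"~")
slash_re = re.compile(r"/")


def escape_regex(token: str) -> str:
    """Regex-based escape, roughly how the old helper might have looked."""
    return slash_re.sub("~1", tilde_re.sub("~0", token))


def escape_replace(token: str) -> str:
    """str.replace-based escape."""
    return token.replace("~", "~0").replace("/", "~1")


def bench_ms(fn, token: str = "somekey_42", number: int = 100_000) -> float:
    """Return total wall time in milliseconds for `number` calls of fn(token)."""
    return timeit.timeit(lambda: fn(token), number=number) * 1000


regex_ms = bench_ms(escape_regex)
replace_ms = bench_ms(escape_replace)
print(f"regex   100k: {regex_ms:.1f} ms")
print(f"replace 100k: {replace_ms:.1f} ms")
print(f"speedup: {regex_ms / replace_ms:.1f}x")
```

The token 'somekey_42' contains neither '~' nor '/', so this measures the common no-op path, the same case the reproducer above exercises.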

Macro-bench

uv run pytest benchmarks/benchmark.py --benchmark-only -k "pointer"

The pointer-append and pointer-evaluate groups are unchanged, as expected (escape only runs on serialization, not on append/evaluate):

test_pointer_append                  180.88 ns
test_pointer_evaluate_deep_list      1.28 us
test_pointer_evaluate_deep_dict      2.21 us

Correctness

  • Full existing test suite passes unchanged (269 tests).
  • Added test_escape_unescape_roundtrip in tests/test_pointer.py covering: '', 'plain', 'has/slash', 'has~tilde', '~/mix', '///', '~~~', and the tricky '~01' (which must escape to '~001', not be confused with the RFC 6901 ~0 + '1' sequence). For each token the test asserts both escape(t) matches RFC 6901 and unescape(escape(t)) == t.

RFC 6901 escape order is preserved: on escape, ~ → ~0 runs before / → ~1; on unescape the order is reversed, with ~1 → / running before ~0 → ~.
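Why the order matters can be shown with a short sketch. The correct helpers below are an assumed str.replace implementation following RFC 6901, and the *_wrong_order variants are hypothetical, written only to demonstrate the failure mode; neither is taken from patchdiff/pointer.py.

```python
def escape(token: str) -> str:
    return token.replace("~", "~0").replace("/", "~1")


def unescape(token: str) -> str:
    return token.replace("~1", "/").replace("~0", "~")


def escape_wrong_order(token: str) -> str:
    # '/' first: the '~' it introduces is then escaped again.
    return token.replace("/", "~1").replace("~", "~0")


def unescape_wrong_order(token: str) -> str:
    # '~0' first: the '~' it produces can pair with a following '1'.
    return token.replace("~0", "~").replace("~1", "/")


print(escape("a/b"))                 # a~1b
print(escape_wrong_order("a/b"))     # a~01b  (decodes to 'a~1b', not 'a/b')
print(unescape("~01"))               # ~1
print(unescape_wrong_order("~01"))   # /      (should be '~1')

# Roundtrip over the tokens listed in the Correctness section.
for t in ["", "plain", "has/slash", "has~tilde", "~/mix", "///", "~~~", "~01"]:
    assert unescape(escape(t)) == t
```

With the correct order, every '~' in the escaped output is followed by exactly one of '0' or '1', which is what makes the two-pass unescape unambiguous.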

Test plan

  • uv run pytest tests/test_pointer.py -v
  • uv run pytest -x -q
  • uv run pytest benchmarks/benchmark.py --benchmark-only -k "pointer"
  • uv run ruff check
  • uv run ruff format --check

🤖 Generated with Claude Code

Replaces the regex-based escape helpers in patchdiff/pointer.py with
str.replace. For single-character substitutions this is ~3x faster than a
compiled regex, and Pointer.__str__ calls escape() per token on every
serialization.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@berendkleinhaneveld berendkleinhaneveld marked this pull request as ready for review May 26, 2026 10:44
@berendkleinhaneveld berendkleinhaneveld merged commit 6331f2d into master May 26, 2026
@berendkleinhaneveld berendkleinhaneveld deleted the perf/pointer-escape-str-replace branch May 26, 2026 11:53
berendkleinhaneveld added a commit that referenced this pull request May 26, 2026
This version includes the following:

* ci: Update versions of steps and let setup-uv manage the python version (#40)
* perf: trim common prefix/suffix in diff_lists (#36)
* perf: use str.replace in Pointer escape/unescape (#37)
* fix: produce() snapshots mutable values when recording patches (#34)
* perf: linear reverse-op construction in diff_dicts/diff_sets (#38)
* fix: iapply raises on invalid patch paths (#35)
* fix: to_json mutating caller's ops (#33)