Skip to content

Events auto corpus bench#751

Merged
tamirms merged 18 commits into
rpc-hackfrom
events-auto-corpus-bench
May 22, 2026
Merged

Events auto corpus bench#751
tamirms merged 18 commits into
rpc-hackfrom
events-auto-corpus-bench

Conversation

@tamirms
Copy link
Copy Markdown
Contributor

@tamirms tamirms commented May 21, 2026

Summary

  • eventstore.Query API: new QueryOptions{MaxEvents, LedgerRange} shape covering all the semantics getEvents needs against a single chunk (multi-filter AND/OR, ledger-window clipping, result truncation). 21 tests in
    eventstore/query_test.go. Reader interface gets a FetchRange(ctx, start, count) primitive + ctx on Lookup/All.

  • bench-fullhistory cold-events / hot-events rewritten to drive eventstore.Query(). Default mode is auto-generated corpus from (chunk, seed) — no checked-in artifact files, no separate "prepare queries" step, one command:

    bench-fullhistory cold-events -chunk 5999 -cold-events-dir -iters 7000 -seed 42

The corpus picks the 3 highest-volume contracts in the chunk plus the top 12 (position, value) topic terms from those contracts' 4-topic events, then partitions the 15-term universe across K filters per iter via round-robin with
category-collision recovery. K is sampled from {1, 2, 3, 5, 8, 12, 15} per iter so the bench measures a distribution; CSV gains an n_unique_terms column for cost-class filtering. Hand-authored -queries <file> JSON corpus stays supported
as a secondary input.

  • streamhash dep bump to pick up lazy RAM-index decode (landed in tamirms/streamhash main as part of this work). Cold tx-hash bench p50 drops from 18.5 ms → 4.2 ms because mphf_open no longer eagerly decodes ~1M RAM index entries per
    Open.

Commits

  • d506630beventstore.Query API + Reader ctx refactor + initial JSON-corpus bench + streamhash bump
  • 25914c10 — Auto-corpus replaces JSON-corpus as the bench default; one-shot scan owns term picking; bench iterates a corpus rather than a file
  • 52751682 — Picker simplification: drop per-position cap / growth loop / trim loop; greedy top-15 over picked contracts' 4-topic events
  • e00a2f4f — Code-review polish: delete dead helpers, reuse parseIntList, collapse benchRequest into generatedRequest, fix JSON-path n_unique_terms, etc.

Test plan

  • go test ./cmd/stellar-rpc/internal/fullhistory/pkg/stores/eventstore/ — all 21 Query tests + existing eventstore tests pass.
  • go vet clean across internal/fullhistory/... and scripts/bench-fullhistory/.
  • Auto-corpus bench end-to-end against chunk 5999 (10K ledgers, 9.9M events): K=15 p50=63 ms, aggregate p50=30 ms / p99=65 ms, ops/s=31.
  • Cold tx-hash bench against chunk 5900 with bumped streamhash: total p50=4.2 ms (vs 18.5 ms before, 4.4× speedup driven entirely by mphf_open improvement).
  • -queries <file> JSON path verified to still work (CSV n_unique_terms column now meaningful on that path too).

Out-of-scope follow-ups

  • seed_events.go still writes a termCorpus JSON that nothing reads now — recommend deleting in a separate cleanup PR.
  • No -dump-corpus flag; the auto-corpus is fully reproducible from (chunk, seed) so artifact files aren't needed yet.
  • The picker doesn't try to maximize per-filter intersection cardinality — the chunk's natural value distribution decides it. If a future workload demands deterministic worst-case-cost requests, add a Latin-partition shuffle on top of the
    existing round-robin (~20 LOC).

Tamir Sen and others added 4 commits May 21, 2026 22:10
eventstore.Query becomes the canonical multi-filter query path,
with QueryOptions for MaxEvents + LedgerRange. The bench-fullhistory
cold-events / hot-events subcommands are rewritten to drive Query()
against a hand-authored JSON corpus of requests, replacing the old
single-Lookup --scenario flag.

eventstore.Reader interface:
- Lookup and All now take ctx (Lookup did I/O on cold without one;
  All did the same). Existing callers updated.
- New FetchRange(ctx, start, count uint32) iter.Seq2[Payload, error]
  primitive — All() now wraps it with start=0, count=EventCount;
  Query's match-all path streams through it directly instead of
  materialising []uint32{0..count}. ColdReader's FetchRange dispatches
  to packfile.ReadRange; HotStore's drives rocksdb.Store.IterateRange.
- ErrFetchRangeOutOfBounds sentinel + shared validateFetchRange
  helper so hot/cold error formats match.

eventstore.Query / QueryOptions:
- Filter shape: ContractID []byte + Topics [MaxTopicCount][]byte.
  nil/empty = wildcard within a filter; AND across non-empty fields,
  OR across filters; ≤15 filters / ≤15 unique terms guaranteed by
  the caller.
- QueryOptions.MaxEvents (0 = unlimited) caps the bitmap before
  FetchEvents; lowest event IDs win, which is ledger-ascending order.
- QueryOptions.LedgerRange (zero value = whole ingested chunk) clips
  the union bitmap to the ledger-window's eventID range. Resolves
  against Reader.Offsets() so out-of-chunk bounds clip gracefully.
- Per-filter intersection sorts inputs by GetCardinality (smallest
  first) before FastAnd, per roaring's docs. Single-constraint
  filters skip FastAnd entirely (avoid a Clone); single-filter
  unions skip FastOr (avoid another Clone).
- Term-key dedup across filters → one LookupKeys call for ≤15
  unique terms. Equivalence verified via a countingReader test.

bench-fullhistory cold-events / hot-events:
- New -queries <file> JSON corpus (query_corpus.go); each request is
  { name? string, filters, maxEvents? *int, ledgerRange? object }.
  DisallowUnknownFields rejects typos; hex inputs are length-
  validated (32-byte ContractID; non-empty topics) so a mistake in
  the JSON surfaces as an error rather than a silent zero-match.
- Per-iter shape unchanged: cold evicts + fresh-opens per iter; hot
  shares one reader + warmup. CSV row carries query_idx (0-based JSON
  position) as the demux key; stats prints per-query lines AND an
  aggregate line.

streamhash bump:
- v0.0.0-20260521214846-d69a858ed230 picks up the lazy RAM-index
  decode that lands in tamirms/streamhash. Cold tx-hash bench p50
  drops from 18.5 ms to 4.2 ms because mphf_open no longer eagerly
  decodes ~1M block entries on every Open.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
cold-events and hot-events now default to an auto-generated request
corpus derived from the target chunk — no checked-in artifact files,
no separate "prepare queries" step, no JSON middleman in the common
path. The hand-authored -queries <file> JSON corpus stays supported
as a secondary input for edge-case workloads.

The corpus is generated entirely from (chunk, seed) and is fully
reproducible — same params produce the same workload across runs
and across machines.

Pipeline (default mode):

1. One-shot scan of the chunk's eventstore picks 15 high-volume
   terms (corpus.go:scanForTopTerms): 3 contracts with the most
   4-topic events + the top 3 most-frequent topic values at each
   of the 4 topic positions, aggregated over those 3 contracts.
   The (3 × 5) layout fills eventstore.Query's ≤15-unique-term
   ceiling exactly and is the smallest term-set that lets K=3
   partitions place one-term-per-category in each filter without
   dropping terms.

2. Each iter's request is built by shuffling the 15 terms and
   round-robin partitioning across K filters with category-
   collision recovery (corpus.Next): for each shuffled term, try
   filter (i mod K) first; on category collision, walk forward
   through filters to find an empty slot for that category.
   For K ≥ 3 with the 3-per-category term set every term lands
   (no drops); for K=1/K=2 the slot shortage forces partial
   coverage. One algorithm, no per-K templates.

3. K is sampled per iter from -buckets (default 1,2,3,5,8,12,15
   — same shape as PR #749's stratification, now a CLI flag).

Bench output:

- CSV per-iter row drops query_idx and adds n_unique_terms.
  Columns: n_filters, n_unique_terms, [open_ns,] query_ns,
  n_events, total_ns. n_unique_terms = 15 marks max-cost iters
  (full LookupKeys saturation); operators isolate the heavy band
  via post-hoc filtering.
- Stats output prints per-class lines (grouped by K bucket label
  in auto-corpus mode, JSON-entry label in -queries mode) plus an
  aggregate line.

Notable: the prepare-events-queries subcommand I was about to add
is gone before it landed. The bench owns corpus generation; no
separate generator subcommand, no JSON file format to maintain,
no two-step operator workflow. Reproducibility from (chunk, seed)
replaces the artifact-on-disk model.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Strip the per-position cap + N-growing + budget-trim heuristics
inherited from PR #749's writeFilter15. They were responding to a
misread "max 3 per position" v2-spec constraint that doesn't exist
in our actual spec (the only constraint is ≤15 unique terms
total). The cap forced balanced category sizes, which in turn
required the growth + trim loops to handle chunks whose topic
positions can't supply 3 distinct values.

Replace with a greedy top-15 picker that mirrors what the budget
ceiling actually says:

  1. Pick top termsPerCategory (3) contracts by 4-topic event count
     (these are the K=3 partition anchors — each filter gets one).
  2. Aggregate (position, value) pairs over those contracts' 4-topic
     events.
  3. Take the top (15 − 3) = 12 by frequency, regardless of
     position. Done.

No per-position cap. No growth loop. No trim loop. The chunk's
natural value distribution decides how the 12-term topic budget
splits across positions.

Trade-off observed on chunk 5999 (cold-events bench, 700 iters,
seed 42):

                Growth picker      Elegant picker     Δ
  K=3 p50         17.4 ms            23.6 ms            +35%
  K=15 p50        55.4 ms            62.1 ms            +12%
  Aggregate p50   27.1 ms            30.6 ms            +13%
  Aggregate p99   59.8 ms            64.6 ms            +8%

The elegant picker emits 3 contracts + topic [1,9,0,2] (skewed
toward topic[1]; topic[2] gets zero coverage because every topic[2]
value in this chunk has lower frequency than the 12th-ranked
topic[1] value). The skew produces filters with more wildcard
slots, which means larger per-filter bitmaps and unions that
reliably hit the MaxEvents=1000 fetch cap — strictly more
expensive per request, which is what the bench is here to
exercise.

n_unique_terms-per-iter is now variable (K=3 may use 9-15 terms
depending on partition collision-recovery vs the previous picker's
guaranteed 14). The bench's CSV reports the actual count per iter
so operators can filter by cost class post-hoc.

Code: picker shrinks from ~130 LOC (cap + grow + trim + balanced
emit) to ~50 LOC (single pass + top-N sort). math import dropped.

Caveat documented in the picker comment: topic positions that
don't appear in the top 12 (e.g. topic[2] on chunk 5999) get zero
coverage in any generated query. This doesn't affect Query cost
modeling (the storage layer treats positions symmetrically) but
means the bench won't exercise topic[2] filter lookups on chunks
where topic[2] is low-frequency.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Mechanical fixes from a three-reviewer pass on the auto-corpus
work. No behavioral changes (verified by re-running cold-events
against chunk 5999 with the same seed; per-K p50 within 1-3% of
the pre-refactor numbers, all within run-to-run noise).

- Delete chunkLedgerRange (dead code) + drop the chunk import from
  corpus.go. The constructor was inlining LedgerRange{Start, End}
  already; the helper had no callers.

- Delete sortStringsAscending. Use sort.Strings — the package
  already imports sort transitively, so the hand-rolled insertion
  sort was unmotivated.

- parseBuckets now delegates to parseIntList (bench_grid.go) for
  the actual CSV-int parse. Drops ~25 LOC of byte-walking and
  fmt.Sscanf in favor of the package's established list-parser.
  Behavior consistent (empty input → defaultBuckets, otherwise
  comma-separated ints with whitespace tolerance).

- Collapse benchRequest into generatedRequest. The two-type
  wrapper existed only to add a label field; pushing label into
  generatedRequest itself drops the wrapper closure and lets the
  cold/hot bench files use corpus.Next directly.

- preparedQuery now carries nUniqueTerms, computed at prepareQueries
  time as the sum of non-empty (contractId + topic[0..3]) slots
  across the JSON request's filters. Fixes a bug where the JSON
  path wrote n_unique_terms=0 on every CSV row, silently breaking
  post-hoc filtering. (Count is a conservative upper bound — same
  value reused across filters gets counted twice; Query's runtime
  dedupe would collapse to fewer unique terms. Matches what
  LookupKeys actually does pre-dedupe.)

- intListString formats []int as "[1,2,3]" for the source-label
  log line. Default %v formatting produced "[1 2 3 5 8 12 15]"
  (space-separated, looks like a slice literal); the new format
  is human-friendlier and matches what -buckets accepts.

- kLabels caches the "K=N" demux-key strings so corpus.Next
  doesn't allocate a fresh string per iter. With 7 K-buckets and
  thousands of iters, the previously-allocated strings were
  duplicates; the map lookup + lazy fill replaces per-iter
  fmt.Sprintf.

- Stale comments in corpus.go updated: dropped references to the
  growth/trim/balanced-3-per-category picker (removed two commits
  ago in 5275168). Updated Next's docstring to describe what the
  greedy picker actually produces (chunk-dependent unbalanced
  term set; K≥3 partitions may drop terms when a category is
  over-populated).

- gofmt -w pass on the package — minor whitespace cleanup in the
  generatedRequest struct.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@socket-security
Copy link
Copy Markdown

socket-security Bot commented May 21, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Updatedgolang/​github.com/​tamirms/​streamhash@​v0.0.0-20260410181758-ca41413750cb ⏵ v0.0.0-20260521214846-d69a858ed23099 +1100100100100

View full report

Tamir Sen and others added 14 commits May 21, 2026 23:43
Drop dead code now that the auto-corpus path is the only events
bench shape:

- Delete seed_events.go entirely. The command duplicated the
  functionality of cold-events-ingest + hot-events-ingest (which
  have richer per-stage timing), and its only added behavior was
  writing a termCorpus JSON file that nothing in the codebase
  ever read. The termCorpus type, the cidSet/t0Set/t1Set sampling
  loop (with its admitted-broken `switch k[0] & 0x07` block), the
  hexKeys helper, and the --corpus-out / --max-terms flags all go.

- Delete query_corpus.go entirely. The -queries <file> escape
  hatch from cold-events / hot-events depended on this loader and
  its filterRequest / queryRequest / preparedQuery types, but no
  tool on this branch emits the JSON format and the auto-corpus
  is now the only source. Removing the JSON path collapses ~250
  LOC of loader, validator, options-translator, and
  bench-side branching.

- Strip -queries flag, newJSONRequestSource, and the
  newColdRequestSource / newHotRequestSource branching from the
  event bench files. cmdColdEvents and cmdHotEvents now construct
  a corpus directly from (chunk, seed) — single straight-line
  setup path, no dispatch.

- Move PubnetPassphrase from seed_events.go (now gone) into
  main.go so the surviving ingest benches (cold/hot-events-ingest,
  cold/hot-tx-hash) keep their network-passphrase constant.

- Drop seed-events from the main.go command dispatch + usage
  block. Update cold-events / hot-events usage text to describe
  the auto-corpus shape (no more "--queries <file>" mention).

- Update corpus.go's top-of-file comment: the picker emits "top 12
  (position, value) topic terms" (not "top 3 per position") and
  the JSON escape-hatch sentence is gone.

Verified end-to-end against chunk 5999: auto-corpus bench
produces identical per-K p50 numbers within run-to-run noise
(K=15 p50 = 62ms; aggregate p50 = 29ms).

Net: -559 / +74 lines across the bench package; 2 files deleted;
no behavior change in the auto-corpus path; the -queries
hand-authoring path is gone.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… perf)

Round-3 code-review fixes across the auto-corpus bench. Verified by
re-running cold-events against chunk 5999: per-K p50 numbers
identical to the previous run within ~1% noise, and the cross-tab of
n_filters × n_unique_terms shows the same distribution.

Real bug fixed:

- corpus.Next now compacts zero-value filter entries before
  returning. Without this, a partition that left some filter index
  unvisited (category collision, or len(terms) < K) would produce
  a Filter{} entry that Query.hasMatchAllFilter interprets as
  "match all events in chunk" — silently turning a K-filter
  benchmark into a full-chunk fetch and corrupting cost numbers.
  Compaction is invisible on chunk 5999's term distribution but
  protects against pathological shapes on other chunks.

  The returned k and label are now derived from the post-compaction
  filter count so CSV's n_filters and stats labels reflect what
  was actually dispatched, not what was sampled.

Perf wins:

- scanForTopTerms maps now key on raw byte strings (string(b))
  instead of hex.EncodeToString output. Saves 40M hex-encode calls
  + ~2× the map-key memory during the scan on a 10M-event chunk.
  encoding/hex import dropped.

- corpus.Next leaves QueryOptions.LedgerRange at its zero value.
  Query.LedgerRange.resolve clips that to the chunk's
  actually-ingested bounds — semantically identical to passing
  {FirstLedger, LastLedger} explicitly but skips a per-iter
  range-bitmap intersection (200-300 KB allocation per query on
  a 10K-ledger chunk). The corpus's lr field is now dead and
  removed.

Reuse + style:

- bench_cold_events / bench_hot_events use createCSV from
  bench_grid.go instead of open-coding the
  MkdirAll + Create + Fprintln dance. ~14 lines removed.

- intListString moved from bench_cold_events.go to main.go where
  shared bench utilities live. It's used by both events bench
  files; main.go is the right home.

- PubnetPassphrase renamed to pubnetPassphrase (lowercase). It's
  declared in a main package so the exported form was meaningless
  — matches the convention of ledgersPerChunk / chunkFirstLedger
  / chunksPerBucket. Updated all callers (cold/hot-tx-hash,
  cold/hot-events-ingest).

- Dropped unused numTermCategories constant from corpus.go and
  the stale "skip-on-marshal-failure" comment got a one-line
  cross-reference to events.TermsFor's contrasting reject policy.

Reviewer findings deferred (call-out only):

- Query's union.ToArray() materializes the full bitmap even when
  MaxEvents truncates to a small slice. Real production win for
  high-cardinality queries (1-4 MB alloc per call). Skipped here
  because it touches Query() rather than the bench, and the patch
  needs separate testing.

- corpus.Next's per-iter Perm + Filter slice allocations
  (~720 B/iter) are below the cold-loop's noise floor. Worth
  refactoring only if hot iters get faster.

- sort.Slice's reflective comparator in Query's per-filter sort
  could move to slices.SortFunc. Marginal; same scope concern as
  union.ToArray above.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Query was calling union.ToArray() to materialise the entire union
bitmap into a []uint32 before slicing down to MaxEvents. For
high-cardinality queries that's tens of MB allocated only to keep
the first 1000 entries (a ~100× over-alloc on a 100K-event union).

Switch to roaring.Iterator and stop after MaxEvents — the iterator
yields strictly ascending, deduplicated uint32s (matching
FetchEvents's sorted-no-dupes precondition), and the destination
slice is pre-sized to exactly the truncation target.

Behaviour preserved — Query's contract is identical (returns the
N lowest-eventID matches when truncated), all 21 query_test.go
cases still pass including the four MaxEvents truncation tests.
Result sets verified identical across the bench's K=15 iters
(1000 events returned by every iter, in both pre- and post-fix
runs).

Cold-events bench on chunk 5999 (300 iters, seed 42):

  K     before    after     Δ
  K=1   8.6 ms    8.7 ms    no-op (small bitmaps)
  K=3   24.1 ms   23.7 ms   no-op
  K=5   29.7 ms   28.9 ms   -3%
  K=8   38.4 ms   34.6 ms   -10%
  K=12  54.9 ms   34.5 ms   -37%
  K=15  62.7 ms   32.2 ms   -49%

  aggregate p50  28.3 → 27.7 ms  -2%
  aggregate p99  65.5 → 37.7 ms  -42%
  ops/s          32   → 41      +28%

K=15 saturates the worst case: 15 single-term filters means each
filter's bitmap is huge (entire contract or entire topic-value
slice), so the K-way union covers most of the chunk. The previous
ToArray spent ~30 ms allocating + populating a slice we
immediately threw most of away. Iterator-driven extraction with
correct up-front sizing eliminates that overhead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Four changes, batched together because they're all small Query-path
adjustments. Cold-bench numbers preserved within ~1% of the prior
run (the borrow change benefits hot-path only; the others are
correctness / clarity wins below noise on cold).

(1) HotStore.Lookup / LookupKeys no longer Clone the mirror bitmap

  membitmaps.Get used to defensively .Clone() the stored bitmap on
  every dense-mode read so callers could mutate freely. The only
  non-test consumer is eventstore.Query, and the audit confirms
  Query never mutates the LookupKeys output (FastAnd / FastOr are
  read-only on inputs, and the one mutation site -- And-with-range
  -- already does its own explicit Clone). Removing the defensive
  Clone eliminates a per-key roaring container-copy on every hot
  Lookup.

  Contract change: BitmapIndex.Get, Reader.Lookup, and
  Reader.LookupKeys now document that the returned bitmap is
  BORROWED -- callers must not mutate, and must not hold the
  pointer across a write that touches the same term. In the hot
  store specifically, IngestLedgerEvents serialises reads against
  writes via HotStore.mu so callers between Ingest calls see a
  stable bitmap; the borrow contract just requires not mutating
  what they observe.

  Cold reader path unchanged -- it never had a clone, every Lookup
  unmarshals a fresh bitmap from index.pack. Effective caller
  contract is uniform across hot and cold ("treat as read-only").

  Estimated hot-path win for high-cardinality terms: ~ms-scale per
  query on K=15 against popular contracts. Bench-irrelevant on cold
  (no shared state to clone).

  TestMemBitmaps_GetReturnsClone inverted to
  TestMemBitmaps_DenseGetReturnsLiveReference: pins the new borrow
  semantics so a future regression that re-introduces Clone fails
  loudly. TestHotStore_LookupReturnsCloneNotLive likewise inverted.
  TestMemBitmaps_All_ConcurrentGetIsSafe rewritten to do only
  read-only inspection of Get/All results (sparse-mode clone test
  deleted as redundant under the new uniform contract).

(2) Per-filter FastAnd input sort: sort.Slice -> slices.SortFunc

  Reflective comparator -> typed comparator. For 5 elements the
  win is sub-microsecond and invisible on the bench, but the typed
  form is the modern Go idiom and reads cleaner.

(3) Single-filter+range path: union.Clone()+union.And(rangeBM)
    -> roaring.And(union, rangeBM)

  roaring.And computes the intersection directly into a fresh
  bitmap. Clone+And first copies every container (including ones
  And would immediately discard) and then computes the intersection
  in place. Same allocation count but less work.

  Bench-invisible because cold-events leaves LedgerRange at its
  zero value (intentional -- Query.LedgerRange.resolve clips to
  the chunk's bounds), so the hasRange branch doesn't fire. Real
  production win for explicit-range getEvents queries that happen
  to use a single filter.

(4) Final ID extraction: union.ToArray()-truncate-iterator
    -> ManyIterator + NextMany

  Iterator yields one element per HasNext+Next pair (two function
  calls per element). ManyIterator.NextMany fills a buffer of N
  elements per call by walking containers in bulk. Tens of
  microseconds saved per K=15 query. Below noise but cleaner.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WriteColdIndex now calls bitmap.Clone().RunOptimize() before
MarshalBinary on every term's bitmap. The clone is required because
idx.All yields the live mirror state — concurrent Lookup queries
borrow the same pointer and could be iterating it via FastAnd/FastOr,
so a direct in-place RunOptimize would race against the read side.
The clone is private to the freeze loop body and dropped after the
bytes are serialised.

Effect on chunk 5999 (representative SAC-heavy chunk):

  index.pack:    108MB → 93MB   (-14%, dropped ~15MB)
  freeze CPU:    +~1s per chunk  (one-shot, not in the hot path;
                                  Clone dominates the added cost)

The size win concentrates in 23 high-cardinality terms with natural
clustering (popular contracts, common topic[0]=transfer). 99% of
terms are sparse-mode and don't compress, but the cost of running
RunOptimize on them is negligible (RunOptimize is a fast scan that
no-ops when no contiguous runs exist).

This change requires the paired fix in (*Bitmap).lazyOR — without
it, FastOr at query time hits the runContainer16.lazyIOR → ior →
inplaceUnion → Add → searchRange slow path (O(N · logR) per
element), and K=15 p99 regresses from 35ms to 215ms. The upstream
issue (RoaringBitmap/roaring#81) has been open since December 2016
with the lazyIOR/lazyOR stubs on runContainer16 unimplemented. The
go.mod replace in this commit pins to the tamirms/roaring fork
which contains a one-site workaround in (*Bitmap).lazyOR that
pre-promotes runContainer16 slots to bitmapContainer before lazy
union (mirrors the pre-promotion already done in parallel.go
ParHeapOr).

Query results on chunk 5999 with both fixes in place (1000 iters):

  K     baseline           runopt + fork
  1     8.7 / 19.6ms       8.0 / 19.5ms        -8% / -1%
  2    14.3 / 24.6ms      11.7 / 22.9ms       -18% / -7%
  3    23.7 / 26.8ms      20.6 / 23.8ms       -13% / -11%
  5    28.6 / 31.7ms      23.0 / 26.6ms       -20% / -16%
  8    34.8 / 38.3ms      24.7 / 28.2ms       -29% / -26%
  12   34.6 / 39.4ms      22.1 / 39.9ms       -36% / +1%
  15   32.3 / 35.3ms      19.6 / 38.1ms       -39% / +8%

  Aggregate p50/p99:      20.7 / 35.6ms    -26% / -5%
  Aggregate throughput:   40 → 53 ops/s    +33%

Faster at every K, all the way to p99. The smaller index.pack means
faster cold-cache OpenColdReader + per-Lookup ReadItems reads, and
RUN-encoded bitmap intersections (in single-filter queries) are
asymptotically faster than BITMAP container intersections of the
same cardinality.

The go.mod replace is a temporary pin. Revert to the upstream module
once RoaringBitmap/roaring merges the fastaggregation fix and tags a
release.
Zero branch-introduced lint issues; 28 pre-existing failures on main
are out of scope.

Mechanical fixes:
  - Misspell: en_GB → en_US (serialise → serialize, etc.)
  - Modernize: range-over-int, min(), slices.Sort
  - Prealloc: capacity hints on append loops in tests + benches
  - Staticcheck ST1005: lowercase error strings
  - Staticcheck ST1011: drop unit suffix on time.Duration variable
  - Revive unused-parameter: rename to _
  - gci: import ordering

Targeted fixes:
  - cache.go: deleted dead mincore/residency funcs (unused, errcheck)
  - cold_store.go: wrap long fmt.Errorf line
  - query.go: drop named returns on resolve, wrap long signature

Justified //nolint annotations (with rationale):
  - corpus.go defaultBuckets/kLabels: const-shaped lookup tables
  - query.go topicFieldByPosition: immutable lookup array
  - query.go Query: gocognit/cyclop/funlen — linear pipeline with
    bitmap-ownership invariants that splitting would scatter
  - query.go MaxEvents conversion: bounded above by MaxEvents (int)
  - feewindow.go IngestFees: gocognit/cyclop — linear V0/V1/V2 ingest
    pipeline with BeginTx/Commit/Rollback ownership
  - cold_store.go loadHeader: funcorder — grouped near init/Open
  - bench-fullhistory/main.go fatal: goprintffuncname — 180 callsites

.golangci.yml: extend the existing bench-fullhistory exclusion to
cover all ad-hoc CLI scripts under cmd/stellar-rpc/scripts/, with the
same rationale (long flag-parsing, fmt.Print* UI, deterministic
math/rand, defer-before-Exit). Adds gocritic and nestif to the
exclusion set.
Four interlocking changes that clean up the events package's
concurrency story end-to-end. Resolves the unsafe borrow contract
that prior reviewers flagged on HotStore.Lookup vs IngestLedgerEvents
(the mirror's RWMutex didn't span the borrowed pointer's lifetime),
and removes the redundant HotStore-level lock + closed flag.

(1) events.Bitmaps + events.ConcurrentBitmaps replace memBitmaps

  - Bitmaps is a named map[TermKey]*roaring.Bitmap with one method
    (AddTo). Used by cold-backfill ingest and as the freeze-handoff
    type. No synchronization; the caller guarantees single-threaded
    access (build then hand off to WriteColdIndex).
  - ConcurrentBitmaps is the in-memory event index for live ingest:
    one writer, many concurrent readers. Each per-term entry is a
    pair of atomic.Pointer (sparse list and dense bitmap); AddTo
    publishes new copies via COW rather than mutating in place, so
    readers atomic.Load and operate on immutable snapshots for as
    long as they hold the pointer. No lock spans the borrow.
  - The struct-level RWMutex on ConcurrentBitmaps now protects only
    the map's structure (insert when a new key arrives). For existing
    keys both AddTo and Get bypass it entirely — atomic load/store
    on the entry's pointers.
  - ConcurrentBitmaps.Snapshot returns a uniquely-owned Bitmaps by
    Cloning each bitmap. Used by HotStore.Index().Snapshot() at
    freeze time so WriteColdIndex receives bitmaps it can mutate
    freely.
  - events.BitmapIndex interface deleted — concrete types replace
    the abstraction at every call site.

(2) events.LedgerOffsets + events.ConcurrentLedgerOffsets

  - Mirrors the bitmaps split. Simple LedgerOffsets has no mutex
    (single-threaded build-then-read via sync.OnceValues handoff in
    the cold reader, or hand-off from cold backfill to ColdWriter).
  - ConcurrentLedgerOffsets uses a fixed-cap backing array
    [chunk.LedgersPerChunk]uint32 plus atomic.Uint32 counter for
    lock-free single-writer + many-reader. Reader.Offsets() returns
    a *LedgerOffsets via Snapshot at query entry — one allocation
    per query, not per EventIDs call.

(3) HotStore.mu and HotStore.closed deleted

  - HotStore.mu only serialised IngestLedgerEvents against itself; the
    single-writer contract makes the lock dead weight (one ingest
    goroutine per chunk by orchestrator-level construction).
  - HotStore.closed duplicated chunkStore's own atomic.Bool +
    CompareAndSwap (rocksdb.Store.Close is idempotent). Delegate
    everywhere: Close → chunkStore.Close, read-path guards →
    chunkStore.IsClosed.
  - HotStore is now a 4-field struct holding just chunkStore + chunkID
    + ConcurrentBitmaps + ConcurrentLedgerOffsets.

(4) WriteColdIndex takes events.Bitmaps directly

  - Drops the BitmapIndex interface parameter; takes a concrete
    Bitmaps the caller hands over. Documents that the bitmaps are
    uniquely owned and will be mutated in place via RunOptimize.
  - Cold backfill: passes its in-progress Bitmaps directly (zero
    extra Clones vs the previous "always Clone defensively" path —
    ~hours saved over a multi-chunk backfill).
  - Live-chunk freeze: passes hotStore.Index().Snapshot(). Snapshot
    Clones each bitmap so the freeze path can mutate freely without
    affecting concurrent readers.

Other:
  - Batch AddTo in IngestLedgerEvents: aggregate per-key event IDs
    per ledger, then call ConcurrentBitmaps.AddTo(key, ids...) once
    per unique key. For popular terms receiving many events in one
    ledger this turns N COW clones into 1.
  - Stale docstrings fixed in query.go ("owned bitmaps" → borrow
    semantics) and hot_store.go ("Get always clones" → live-snapshot).
  - QueryOptions.MaxEvents < 0 now returns an error up front (was
    silently bypassing the cap via uint64 wraparound).
  - Tests updated: ConcurrentBitmaps state inspected via
    atomic.Pointer.Load; Snapshot uniqueness pinned; lock-free
    ConcurrentLedgerOffsets read-write tested under -race. The full
    events + eventstore suites pass under -race; lint clean.
…doc drift

(1) ConcurrentBitmaps.Get / Snapshot promotion-window race fix

  Reviewer flagged a real bug: during sparse→dense promotion, AddTo
  Stores bm then Stores empty ids. A reader's two Loads can straddle
  both writes, observing (bm=nil, ids=empty) for a term that is in
  fact populated — and silently return nil. HotStore.Lookup would
  then surface ErrTermNotFound and Query would drop that filter.
  -race does not catch it because there is no unsynchronised access;
  it is a memory-model visibility / interleaving bug, not a data
  race.

  Fix: in both Get and Snapshot, if both bm.Load() and ids.Load()
  come back empty, re-Load bm before returning nil. An entry that
  exists in the map must have either a non-nil bm or non-empty ids
  (newConcurrentTermEntry never leaves both empty), so the re-Load
  resolves the promotion-window case. Added
  TestConcurrentBitmaps_PromotionWindowGetNeverReturnsNil to pin the
  contract.

(2) Avoid duplicate r.Offsets() per Query

  Query takes one snapshot at entry (line 165) but eventIDRange used
  to call r.Offsets() again on both helper paths (fetchAllInRange
  and ledgerRangeBitmap). With ConcurrentLedgerOffsets.Snapshot
  allocating per call, that's 2-3 fresh 40KB slices per Query.

  Refactor eventIDRange / fetchAllInRange / ledgerRangeBitmap to
  take the *events.LedgerOffsets the caller already has. One
  snapshot per Query, plus a single consistent view across all of
  Query's lookups.

(3) Stale docstrings cleared

  - hot_store.go:261 referenced the deleted BitmapIndex.Get
    contract; replace with the ConcurrentBitmaps live-snapshot
    semantics.
  - hot_store.go:634 referenced the deleted events.MemBitmaps type;
    update to events.ConcurrentBitmaps.
(1) Add Snapshot promotion-window regression test

  Mirrors the Get version. The Snapshot path has the same re-Load-bm
  fallback for the mid-promotion (bm=nil, ids=empty) window, but
  the regression test only covered Get. A break in just Snapshot's
  fallback would silently produce an index.pack missing entries
  for whichever terms happen to be mid-promotion at freeze time.

(2) Reader.Offsets docstring updated

  Previous text said "implementations share a cached snapshot
  across readers". After the LedgerOffsets split, HotStore allocates
  a fresh *LedgerOffsets per call via ConcurrentLedgerOffsets.Snapshot.
  ColdReader still caches. Document both contracts.

(3) Reader.Lookup docstring updated

  Replace the deleted BitmapIndex.Get reference with the new
  ConcurrentBitmaps atomic-snapshot semantics.

(4) Replace TestHotStore_LookupReturnsBorrowedLiveBitmap

  The old test mutated the returned bitmap and asserted that the
  mutation bled through to the next Lookup. That encoded the OLD
  "live mirror reference" contract; with Shape D it would actually
  corrupt the supposedly-immutable shared snapshot. Replace with
  TestHotStore_LookupReturnsImmutableSnapshot: a prior Lookup result
  stays unchanged after a subsequent IngestLedgerEvents, and a new
  Lookup observes the new snapshot.

(5) HotStore.Index doc + small test cleanup

  Drop the unreachable internal call form from the docstring; small
  test loop drops an unused index variable.
(1) Doc drift in hot_store.go

  - warmupOffsets docstring + warmup-section block-comment still
    referenced *events.LedgerOffsets after the type split. Replace
    with *events.ConcurrentLedgerOffsets.
  - Add an explicit ordering-invariant comment at IngestLedgerEvents
    Phase 3 explaining why mirror.AddTo runs BEFORE offsets.Append.
    The reverse order would let a concurrent Query see an offsets
    snapshot whose count includes ledger-N IDs the mirror has not
    yet published; the bitmap intersection would silently miss
    those IDs.
  - Document the perKeyIDs initial-capacity choice (64 ≈ a few ×
    unique terms per typical ledger; map grows correctly past).

(2) New: TestHotStore_QueryUnderConcurrentIngest

  Drives end-to-end eventstore.Query in one goroutine while
  IngestLedgerEvents extends the mirror + offsets in another, for
  200 iters. Catches:
  - data races in the Query → LookupKeys → FastAnd → FetchEvents
    pipeline,
  - mirror-before-offsets ordering violations (would show as a
    missing-RocksDB-row error from FetchEvents),
  - per-Query offsets snapshot inconsistency (would show as a
    decode-from-zero-payload error).

(3) New: TestConcurrentLedgerOffsets_AppendPastCapacity

  Pins the bounds check on the fixed-size backing array. Filling
  exactly LedgersPerChunk slots succeeds; the next Append surfaces
  an error instead of panicking on slice index.

(4) .golangci.yml — gomoddirectives allow-list

  The temporary replace directive for github.com/RoaringBitmap/roaring/v2
  was triggering gomoddirectives. The replace is intentional and
  documented; allow-list it (with the revert criterion in the
  go.mod comment).
Three coordinated changes that recover the +42% hot-ingest regression
the prior cleanup landed, while keeping the borrow-race fix and
adding stronger correctness:

(1) termState — single atomic.Pointer per term map entry

  Replaces the two-atomic.Pointer concurrentTermEntry (ids and bm
  fields, each their own atomic.Pointer) with a single
  atomic.Pointer[termState] where termState carries both fields as
  plain Go values. Every AddTo publishes the whole (ids, bm) pair
  in one atomic.Store, eliminating the promotion-window
  observability bug entirely (was previously patched via a re-Load
  fallback in Get/Snapshot).

  Net effect:
   - 1 atomic op per Get instead of up to 2.
   - Snapshot+Get no longer need the re-Load workaround.
   - Memory: +24B per term map entry (the *atomic.Pointer
     indirection vs an inline struct); ~50MB more for chunk 5999.
     Negligible.

(2) SetCopyOnWrite on dense termState bitmaps

  Every promoted (dense) termState bitmap is created with
  CopyOnWrite enabled. Subsequent AddTo's Clone is now O(1) shallow
  copy that shares container pointers; only the touched
  containers get deep-copied via getWritableContainerAtIndex.

  This turns per-ingest-call Clone cost from O(bitmap-size) to
  O(touched-containers), the dominant write-side cost. For chunk
  5999 hot ingest: +42% wall regression -> +7% wall regression.

  Concurrency: roaring's Clone has a side effect on the source
  bitmap (markAllAsNeedingCopyOnWrite). The single-writer ingest
  contract means only one goroutine ever Clones in AddTo. Snapshot
  also Clones each dense bitmap; the orchestrator runs Snapshot
  only after ingest stops on the chunk, so no concurrent Clone
  races. Documented as the Snapshot single-caller contract.

(3) Warmup batches per-term before calling AddTo

  Without batching, warmupIndex calls mirror.AddTo() once per
  (term, eventID) row in events_index. With COW each AddTo does a
  shallow Clone that allocates ~24KB of slice headers plus marks
  the source's needCopyOnWrite. For chunk 5999's ~50M index rows
  that was ~1.2 TB of allocation in a few minutes — GC saturated,
  warmup took 30+ minutes.

  Fix: build a single-threaded events.Bitmaps during warmup
  (per-term buffer flushed when the iteration's term changes —
  rocksdb's byte-sorted iteration delivers term-major order for
  free), then convert to ConcurrentBitmaps via the new
  NewConcurrentBitmapsFromBitmaps constructor. Each term then
  gets a single ConcurrentBitmaps AddTo with all its IDs at once,
  recovering baseline warmup time (~1:40 for chunk 5999).

(4) Snapshot contract + test alignment

  Snapshot's docstring now documents the single-caller contract.
  The previous concurrent Snapshot+AddTo race tests
  (PromotionWindowSnapshotNeverDropsTerm,
  ConcurrentReadOnlyAccessIsSafe) are replaced with sequential
  variants (SnapshotPostPromotionIncludesAllTerms,
  ConcurrentGetIsSafe) that match the production contract. The
  internal-state assertions in older tests are updated for the
  termState shape (p.Load().bm/.ids instead of te.bm.Load()).

(5) Lint: replace-local: false for gomoddirectives

  Match the linter version some environments use that flags the
  roaring fork replace directive without it.

Bench A/B against pre-cleanup baseline (98f6632):

  cold-events-ingest:   71s  → 67s   (-5%, noise)
  cold-events query:    21ms → 21ms  (parity)
  hot-events-ingest:    2:33 → 2:45  (+7% wall, +5% p50)
  hot warmup:           1:40 → 1:40  (parity; previously 30+ min broken)
  hot-events query:     10.7 → 11.3ms p50 (+5%, per-K parity)

All tests pass under -race.
Aggregates findings from three parallel review agents on commit
00aa039 (termState + COW + warmup batching).

(1) Drop the per-Query *LedgerOffsets snapshot

  HotStore.Offsets() previously allocated a fresh 40KB []uint32
  copy of the live backing array on every Query. Query reads only
  5 scalars out of it (LedgerCount, StartLedger, EndLedger,
  EventIDs at two boundary ledgers). At 1000 queries that was
  40MB of allocation driving GC tail (most of the +37% max query
  regression).

  Replace with ConcurrentLedgerOffsets.View(): returns a
  *LedgerOffsets whose offsets slice shares the live backing
  array, capped to the count visible at call time. ~24-byte
  allocation per Query instead of 40KB.

  Safety: writer-side Append only writes at positions ≥ count
  before its atomic.Store of the new count, so any reader that
  observed the prior count sees stably-written positions
  [0, count). The slice cap is set equal to its len so a stray
  Append on the view forks rather than mutating the live array
  (read-only contract documented).

  Snapshot() kept for the freeze path's "I need to hand off the
  data and the live array might mutate later" use case
  (ColdWriter.Finish), and so other paths that retain the
  pointer across mutation continue to work.

(2) Document the borrowed-bitmap forbidden methods

  The safety argument for ConcurrentBitmaps.Get's no-clone-on-
  read contract hinges on readers not calling any roaring method
  that writes through needCopyOnWrite. Previously documented
  abstractly ("don't mutate"); now an explicit list lives in the
  Get docstring (Clone, RunOptimize, AddRange, Add, AddMany,
  SetCopyOnWrite, *Writable* accessors) so future query-path
  changes don't silently land a race against concurrent AddTo's
  Clone.

(3) Snapshot docstring clarifies COW inheritance

  Returned dense bitmaps inherit CopyOnWrite=true via Clone.
  Document why container-replacement mutators (RunOptimize,
  AddRange spanning new high-16 blocks — which is what
  WriteColdIndex does) stay safe, and why write-through-pointer
  mutators (Add, AddMany on existing containers) would alias
  back to the live mirror.

(4) Test coverage

  - TestConcurrentBitmaps_DenseAddToSetsCopyOnWrite pins
    SetCopyOnWrite(true) at three call sites (promotion in
    AddTo, newTermState over-threshold initial batch,
    inherited via Clone in subsequent AddTo). A regression that
    drops the flag would silently undo the +42% hot-ingest
    recovery; this test catches it.
  - TestNewConcurrentBitmapsFromBitmaps_DirectlyPinsContract
    covers the warmup-side conversion: cardinality survives,
    CopyOnWrite enabled, nil source entries skipped, post-
    conversion AddTo preserves COW.
  - TestConcurrentLedgerOffsets_ViewSharesBacking pins the
    View's frozen-cap semantic: source appends after View() do
    not extend the view.

(5) Doc drift

  - TermsFor doc in index.go: references the renamed
    HotStore.IngestLedgerEvents, cold-events-ingest backfill,
    events.Bitmaps, and ConcurrentBitmaps.Snapshot. Previous
    text mentioned long-deleted Writer.IngestLedgerEvents /
    Backfill / EventIndex / WriteIndex.
  - TestConcurrentBitmaps_GetDuringPromotionNeverReturnsNil
    docstring rewritten — the original two-Store visibility bug
    it pinned is structurally impossible under the termState
    design; the test still has value as a -race probe so the
    test stays, just with a docstring that matches reality.
…consistency

Add a defensive post-filter that runs after FetchEvents to verify each
candidate event actually matches the user's filter — TermKey is a
non-cryptographic xxh3_128 over attacker-controllable topic values, so
the index is treated as a candidate filter and the post-filter enforces
result correctness. Two dispatched paths per Payload:

  - view path: xdr.ContractEventView navigation, topic.Raw() bytes,
    bytes.Equal. Zero per-event allocation; aliases into ContractEventBytes.
  - struct path: per-event MarshalBinary on constrained topics, bytes.Equal.

Both paths lazy-resolve so events that fail every clause's ContractID
check never touch the topics.

Paired with a new view-mode read path:
  events.Payload.UnmarshalView aliases data[off:off+eventLen] into
  ContractEventBytes instead of calling ContractEvent.UnmarshalBinary.
  HotStore/ColdReader gain a UseXDRViews option; FetchEvents +
  FetchRange dispatch to UnmarshalView when set. Iterator paths
  (HotStore.FetchRange, ColdReader.FetchEvents, ColdReader.FetchRange)
  bytes.Clone before UnmarshalView to take ownership from the borrowed
  iterator buffer. BatchMultiGet path already copies; no extra clone.

Bench A/B on chunk 5999 (9.9M events):
  Hot ingest:  6m57s → 2m48s  (2.5× — extract 6.81×, write 1.49×)
  Cold ingest: 4m59s → 1m08s  (4.4× — lcm_decode 6.77×, term_idx 3.71×)
  Hot query:   p50 12.57ms → 7.08ms (1.77×), max -6%
  Cold query:  p50 20.96ms → 16.99ms (1.23×)

Store consistency cleanup spans eventstore/ledger/txhash + benches:
  - ledger/cold_store.go split into cold_reader.go + cold_writer.go
  - ledger: NewColdStoreReader → OpenColdReader, ColdStoreReader →
    ColdReader (drop redundant Store infix)
  - txhash + ledger: NewHotStore → OpenHotStore (align with eventstore)
  - txhash.HotStore.Get → Lookup (parallel with ColdReader.Lookup; signals
    may-not-find)
  - Bench renames: tx_hash → txhash, tx_page → txpage
  - Bench flag normalization: -hot-events-dir → -hot-dir,
    -num-packfiles → -num-chunks, --xdr-views default false everywhere
  - Mode-agnostic corpus picker reads topics via ContractEventView
    when payload arrives as view; single reader serves corpus + iters
  - Close docstring + concurrency contracts on txhash/ledger HotStore

Tests:
  - postfilter_test.go: struct/view equivalence on 8 ScVal arms (Symbol,
    Bytes, String, U64, I128, Vec, Map, Address); collision injection;
    missing topic; nil ContractId; V!=0 body; multi-clause same-position;
    validateFilters; mixed-mode dispatch precedence; FuzzPostFilterStructVsView
  - payload_test.go: UnmarshalView round-trip + alias semantics + error
    paths + stale-state clearing
  - ledger: split cold_reader_test.go + cold_writer_test.go

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- gci/gofmt: payload.go Payload field comments + query.go var block
- gocognit/cyclop/gocyclo: nolint on Query + matchesAnyFilter* (the
  lazy-cache invariants don't survive helper extraction)
- godoclint: txhash.HotStore docstring updated from Get to Lookup
- misspell: behaviour→behavior, materialised→materialized, Sanitise→Sanitize
- nonamedreturns: makeTestEvent + makeTestEventScVal drop named returns

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@tamirms tamirms merged commit 16090ae into rpc-hack May 22, 2026
15 checks passed
@tamirms tamirms deleted the events-auto-corpus-bench branch May 22, 2026 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant