
[codex] Stabilize Compute ingest memory #8

Merged
sorenbs merged 21 commits into main from codex/compute-cleanup on Apr 25, 2026

Conversation


@sorenbs (Member) commented Apr 25, 2026

Summary

  • constrain low-memory Compute ingest/background work: smaller segment targets, a single upload lane, disabled segment workers, deferred/capped companion/index wakeups, connection close on low-memory appends, and post-append GC
  • move R2 object store access off Bun S3/S3File APIs onto signed fetch, streaming file uploads, and streaming response reads
  • harden the Compute demo for external Streams targets, local R2 latency/endpoint testing, and colocated direct append pause controls
  • document risky Bun memory APIs and link the relevant open Bun issues
  • fix packaged local touch worker resolution so @prisma/streams-local can load the worker from the published package layout
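The signed-fetch streaming upload in the second bullet can be sketched roughly as below. This is an illustrative shape, not this repo's actual code: `signedStreamingPut` and the injectable `fetchImpl` are assumptions, and the presigned URL is assumed to come from an existing S3 SigV4 signer. The point is that the body is a stream, so the file is never fully buffered the way a whole-object S3File write can be.

```typescript
// Sketch: PUT a segment to a presigned R2 URL with a streaming body via
// plain fetch, instead of going through Bun's S3/S3File APIs.
// `fetchImpl` is injectable only so the shape is testable.
async function signedStreamingPut(
  url: string,
  body: ReadableStream<Uint8Array>,
  contentLength: number,
  fetchImpl: typeof fetch = fetch,
): Promise<void> {
  const res = await fetchImpl(url, {
    method: "PUT",
    headers: { "content-length": String(contentLength) },
    body, // streamed from disk/network; no whole-file buffer
    // Required by Node's fetch (undici) for streaming request bodies:
    duplex: "half",
  } as RequestInit);
  if (!res.ok) throw new Error(`upload failed: HTTP ${res.status}`);
}
```

In real use the `ReadableStream` would come from something like `Bun.file(path).stream()`, which is what makes the upload constant-memory.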

Memory/stress evidence

  • native Bun S3 path reproduced high pressure around 850.9 MB RSS / 834.4 MB anon RSS
  • signed fetch path reduced this to around 533.8 MB RSS / 506.1 MB anon RSS
  • streamed path completed the 900k event run at 474.8 MB RSS / 461.2 MB anon RSS, cgroup peak 637.1 MB, settled 204.7 MB RSS / 188.9 MB anon RSS, OOMKilled=false, uploaded 18/18, companions 18/18, search families complete

Verification

  • bun run verify
  • bun run test:conformance
  • bun run test:conformance:local
  • bun run test:node-local-package
  • bun run test:bun-local-package
  • bun run test:bun-server-package
  • bun run build:npm-packages
  • bun pm pack --dry-run in dist/npm/streams-local
  • bun pm pack --dry-run in dist/npm/streams-server

Note: the first local conformance attempt collided with the server-mode conformance run on 127.0.0.1:8787; rerunning local conformance serially passed.

sorenbs added 20 commits on April 24, 2026 at 16:19
Change omitted-sort non-scoring search requests to use offset:desc instead of primary timestamp sorting. Text-scoring queries keep _score, timestamp, offset ordering, and docs now describe the default split.

Perf repro measurements (SEARCH_PERF_REPRO=1 bun test test/search_perf_repro.test.ts): before default-sort/broad=2233.69ms parseCalls=32817; after=85.20ms parseCalls=119. Offset-desc repro stayed ~2338.68ms before / 2291.94ms after. Small no-L0 repro stayed ~2290.26ms before / 2319.66ms after.

Verification before commit: bun run typecheck; bun run check:result-policy; bun test. Also fixed the aggregate uncompanioned-prefix test helper race so the full suite passes consistently.
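The default-sort split described above can be sketched as follows (identifiers are illustrative, not this repo's actual names): an omitted sort on a non-scoring request falls back to offset:desc, while text-scoring queries keep the _score, timestamp, offset ordering.

```typescript
// Sketch of the default-sort split: sort keys chosen when the caller
// omits an explicit sort.
type SortKey = { field: string; order: "asc" | "desc" };

function defaultSort(hasTextScoring: boolean): SortKey[] {
  if (hasTextScoring) {
    // Scoring queries keep the relevance-first ordering.
    return [
      { field: "_score", order: "desc" },
      { field: "timestamp", order: "desc" },
      { field: "offset", order: "desc" },
    ];
  }
  // Pure filter / match-all scans: newest offsets first, so no
  // per-record timestamp parsing is needed just to order results.
  return [{ field: "offset", order: "desc" }];
}
```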
Use the DSB3 segment footer for offset-desc search scans so the reader walks blocks newest-to-oldest and stops once the requested page is full. The old no-footer fallback remains for legacy/corrupt segment shapes.

Perf repro measurements (SEARCH_PERF_REPRO=1 bun test test/search_perf_repro.test.ts): before default/offset-small=85.20ms/2291.94ms/2319.66ms; after=25.13ms/64.41ms/64.16ms. Offset-desc newest-segment indexed time dropped from ~2290ms to ~63ms.

Verification before commit: bun run typecheck; bun run check:result-policy; bun test.
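The footer-guided newest-to-oldest walk described above can be sketched like this. It is a minimal illustration, not the real DSB3 reader: the types are assumptions, and it assumes the footer lists blocks oldest-to-newest. The key property is the early stop once enough records are covered, which is what cut the scan times.

```typescript
// Sketch: pick the segment blocks needed for an offset-desc page using
// the footer's per-block record counts, stopping once the page is full.
interface BlockMeta {
  firstOffset: number;
  recordCount: number;
}

function blocksForDescPage(
  footerBlocks: BlockMeta[], // assumed oldest-to-newest
  pageSize: number,
): BlockMeta[] {
  const picked: BlockMeta[] = [];
  let covered = 0;
  // Walk the footer backwards so the newest block comes first.
  for (let i = footerBlocks.length - 1; i >= 0 && covered < pageSize; i--) {
    picked.push(footerBlocks[i]);
    covered += footerBlocks[i].recordCount;
  }
  return picked; // only these blocks need fetching and decoding
}
```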
Add coverage fields and response headers for candidate doc IDs, decoded records, JSON parse time, segment payload bytes fetched, sort time, and peak hits held. The perf repro now prints these counters so broad scans can be distinguished from index-only or block-limited paths.

Perf repro measurements (SEARCH_PERF_REPRO=1 bun test test/search_perf_repro.test.ts): before metrics latency=25.13ms/64.41ms/64.16ms; after=22.98ms/51.53ms/68.11ms. Benefit is diagnostic: after metrics reported default candidateDocIds=2048 decodedRecords=113 segmentBytesFetched=5801497 peakHitsHeld=100; offset-desc candidateDocIds=32768 decodedRecords=1 segmentBytesFetched=195859446 peakHitsHeld=1; small-no-L0 candidateDocIds=32768 decodedRecords=106 segmentBytesFetched=195891695 peakHitsHeld=100.

Verification before commit: bun run typecheck; bun run check:result-policy; bun test.
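The coverage counters above could be surfaced roughly as below. Field and header names here are illustrative assumptions, not the repo's actual wire format; the shape just shows how the counters printed by the perf repro map onto response headers.

```typescript
// Sketch: diagnostic search-coverage counters attached as response
// headers, so broad scans can be told apart from index-only paths.
interface SearchCoverage {
  candidateDocIds: number;
  decodedRecords: number;
  jsonParseMs: number;
  segmentBytesFetched: number;
  sortTimeMs: number;
  peakHitsHeld: number;
}

function coverageHeaders(c: SearchCoverage): Record<string, string> {
  return {
    "x-search-candidate-doc-ids": String(c.candidateDocIds),
    "x-search-decoded-records": String(c.decodedRecords),
    "x-search-json-parse-ms": String(c.jsonParseMs),
    "x-search-segment-bytes-fetched": String(c.segmentBytesFetched),
    "x-search-sort-ms": String(c.sortTimeMs),
    "x-search-peak-hits-held": String(c.peakHitsHeld),
  };
}
```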
Measurements:

- Existing repros before (after metrics): default=25.13ms, offset-desc=64.41ms, small-no-L0=64.16ms.

- Existing repros after: default=21.25ms, offset-desc=61.15ms, small-no-L0=65.70ms.

- Added WAL-tail rare exact repro: cold cache=866.18ms, warm cache=0.65ms, warm candidate_doc_ids=1, warm scanned_tail_docs=1.

Verification:

- bun run typecheck

- bun run check:result-policy

- bun test
Measurements:

- Existing repros before: default=21.25ms, offset-desc=61.15ms, small-no-L0=65.70ms, WAL warm=0.65ms.

- Existing repros after: default=23.97ms, offset-desc=43.33ms, small-no-L0=62.55ms, WAL warm=0.49ms.

- Added exact-only sealed repro: elapsed=1183.96ms, candidate_doc_ids=1, parseCalls=5, families=[exact]. This demonstrates doc-level exact postings avoid parsing every source candidate; block decode remains for later index-only work.

Verification:

- bun run typecheck

- bun run check:result-policy

- bun test
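The doc-level exact-postings idea above (candidate_doc_ids=1 without parsing every source candidate) can be sketched as a plain postings intersection. This is an assumption about the shape, not the repo's index format: an exact term maps straight to doc IDs, so only matching docs ever reach the decoder.

```typescript
// Sketch: resolve exact terms via doc-level postings lists instead of
// parsing every source candidate; intersect when multiple terms are given.
function exactCandidates(
  postings: Map<string, number[]>, // exact term -> sorted doc IDs
  terms: string[],
): number[] {
  let result: number[] | null = null;
  for (const term of terms) {
    const list = postings.get(term) ?? [];
    result = result === null ? list : result.filter((id) => list.includes(id));
    if (result.length === 0) break; // short-circuit: nothing can match
  }
  return result ?? [];
}
```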
Measurements:

- Existing repros before: default=23.97ms, offset-desc=43.33ms, small-no-L0=62.55ms, exact-only sealed=1183.96ms.

- After: default=22.39ms, offset-desc=49.63ms, small-no-L0=71.71ms, exact-only sealed=27.48ms.

- Exact-only decoded_records dropped from 32768 to 15 while candidate_doc_ids stayed 1; this is the expected top-k/late-block-decode improvement for indexed offset-desc queries.

Verification:

- bun run typecheck

- bun run check:result-policy

- bun test
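The top-k / late-block-decode improvement above (decoded_records 32768 → 15 while candidate_doc_ids stayed 1) follows this general shape, sketched with illustrative names: rank candidates by offset first, keep only the requested page, and decode records just for the survivors.

```typescript
// Sketch: top-k selection before decode, so only k records are ever
// decoded regardless of candidate count. `decodeRecord` stands in for
// the real block decoder.
function topKLateDecode<T>(
  candidates: { docId: number; offset: number }[],
  k: number,
  decodeRecord: (docId: number) => T,
): T[] {
  const top = [...candidates]
    .sort((a, b) => b.offset - a.offset) // offset-desc ranking, no decode
    .slice(0, k);
  return top.map((c) => decodeRecord(c.docId)); // decode only the survivors
}
```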
Measurements:

- Before: default=22.39ms / 5.8MB fetched, offset-desc=49.63ms / 195.9MB fetched, small-no-L0=71.71ms / 195.9MB fetched, exact-only=27.48ms / 99.7MB fetched.

- After: default=27.48ms / 325489B fetched, offset-desc=17.61ms / 193625B fetched, small-no-L0=35.25ms / 820836B fetched, exact-only=5.60ms / 133119B fetched.

Verification:

- bun run typecheck

- bun run check:result-policy

- bun test
Measurements:

- Before: default=27.48ms, offset-desc=17.61ms, small-no-L0=35.25ms, WAL warm=0.86ms, exact-only=5.60ms.

- After: default=27.06ms, offset-desc=17.65ms, small-no-L0=33.15ms, WAL warm=0.52ms, exact-only=5.37ms.

- Existing local repros are flat because decoded sections are already tiny; this adds the bounded decoded-section cache advertised by config for repeated interactive searches.

Verification:

- bun run typecheck

- bun run check:result-policy

- bun test
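The bounded decoded-section cache mentioned above can be sketched as a small LRU. This is a minimal illustration, assuming an entry-count bound; the real cache is presumably byte-bounded per its config.

```typescript
// Sketch: bounded LRU for decoded sections, exploiting Map's insertion
// order to find the least-recently-used entry.
class DecodedSectionCache<V> {
  private map = new Map<string, V>();
  constructor(private capacity: number) {}

  get(key: string): V | undefined {
    const v = this.map.get(key);
    if (v !== undefined) {
      this.map.delete(key); // re-insert to refresh recency
      this.map.set(key, v);
    }
    return v;
  }

  set(key: string, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Map iterates in insertion order, so the first key is LRU.
      this.map.delete(this.map.keys().next().value!);
    }
  }
}
```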
Measurements:

- New explicit timestamp-desc repro before: 2327.85ms, parseCalls=32881, indexedSegments=16, decodedRecords=32768, segmentBytesFetched=92935801, sortTimeMs=12, peakHitsHeld=32768.

- New explicit timestamp-desc repro after: 172.87ms, parseCalls=2071, indexedSegments=1, decodedRecords=2048, segmentBytesFetched=5801497, sortTimeMs=0, peakHitsHeld=100.

- Existing repros after: default=27.56ms, offset-desc=17.45ms, small-no-L0=34.67ms, WAL warm=0.56ms, exact-only=5.77ms.

Verification:

- SEARCH_PERF_REPRO=1 bun test test/search_perf_repro.test.ts

- bun run typecheck

- bun run check:result-policy

- bun test
The deployed demo server had next_offset=100000 but segment_count=0, sealed_through=-1, uploaded_through=-1, pending_bytes=102255073. _search therefore reported possible_missing_wal_rows=100000 and scanned_tail_docs=0 for environment:"staging".

The segmenter worker path was resolved relative to the bundled compute/entry.js module: ./segment/segmenter_worker.js pointed under compute/segment, while the bundle actually emits the worker at ../segment/segmenter_worker.js.

Verification:

- bun test test/compute/worker_module_url.test.ts test/compute/bundle_build.test.ts

- bun run typecheck

- bun run check:result-policy

- bun test
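The worker-path fix above amounts to resolving against the emitted module's own URL rather than a source-tree path. A minimal sketch, with the relative path taken from the commit message and the helper name illustrative:

```typescript
// Sketch: resolve the segmenter worker relative to the bundled module
// (callers would pass import.meta.url), so the published package layout
// (worker one directory up from the bundle, not under ./segment) resolves.
function segmenterWorkerUrl(moduleUrl: string): URL {
  return new URL("../segment/segmenter_worker.js", moduleUrl);
}
```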
Enforce the documented 500-key page limit for routing key listing and add a regression test. Also align docs with current runtime defaults, profile/schema behavior, coverage fields, package requirements, and portable links.
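Enforcing the 500-key page limit could look roughly like this (constant and function names are illustrative, not the repo's actual identifiers): clamp the caller's requested page size instead of trusting it.

```typescript
// Sketch: clamp a requested routing-key page size to the documented
// 500-key maximum, defaulting when the caller omits or garbles it.
const MAX_ROUTING_KEY_PAGE = 500;

function clampPageSize(requested: number | undefined): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return MAX_ROUTING_KEY_PAGE;
  }
  return Math.max(1, Math.min(Math.trunc(requested), MAX_ROUTING_KEY_PAGE));
}
```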
Comment thread on src/compute/demo_site.ts (code excerpt):

      payload: unknown,
      headers?: HeadersInit,
    ): Response {
      return new Response(JSON.stringify(payload), {
sorenbs merged commit 283b470 into main on Apr 25, 2026
6 checks passed
sorenbs deleted the codex/compute-cleanup branch on April 25, 2026 at 05:18