diff --git a/plans/PLAN-CLI-PROGRESS-OUTPUT.md b/plans/PLAN-CLI-PROGRESS-OUTPUT.md
new file mode 100644
index 0000000..c71e6dc
--- /dev/null
+++ b/plans/PLAN-CLI-PROGRESS-OUTPUT.md
@@ -0,0 +1,212 @@
+# Plan: CLI progress output (Phase 1 — stream + heartbeats)
+
+Status: **active**. This plan implements
+[`propose/CLI-PROGRESS-OUTPUT-PROPOSE.md`](../propose/CLI-PROGRESS-OUTPUT-PROPOSE.md).
+
+Depends on: **none** (orthogonal to graph schema / ontology). No `ontology_version` bump and no re-index requirement.
+
+## Goal
+
+- **Stop buffering** lifecycle subprocess output until exit: relay each child’s **stdout and stderr** to the operator’s **stderr** as bytes arrive (verbatim), while still accumulating the same tail windows for structured results (`RefreshIndexOutput`, CLI failure payloads).
+- **Bracket opaque phases** with honest stderr lines: cocoindex wrap (`[lance] …`), pipeline header/footer (`java_codebase_rag/cli.py`), pass-start lines and **5 s** heartbeats in `build_ast_graph.py` (verbose path only).
+- **Preserve contracts**: machine-readable **`java-codebase-rag` stdout** for `init` / `increment` / `reprocess` / `erase` stays **byte-for-byte identical** to today; under `--quiet`, **stderr matches a per-subcommand baseline from today** (no **new** bytes from streaming relay, `[lance]` wrap, header/footer, pass starts, or heartbeats). Pre-existing stderr stays unchanged — notably `increment --quiet` always prints the multi-line Kuzu staleness warning today and continues to; `meta` / `tables` / `diagnose-ignore` / `analyze-pr` unchanged.
+
+## Principles (do not relitigate in review)
+
+- **Stream first; no pretty UI in this round** — no `rich` / `tqdm` / `click`, no ANSI, no TTY-only rendering (deferred to a future `CLI-PRETTY-OUTPUT` propose).
+- **Stderr = human channel; CLI stdout = agent/CI contract** — relayed subprocess bytes go to **stderr**, not stdout.
+- **No parsing or reformatting** of cocoindex or graph-builder lines; wrap lines are additive CLI-owned prefixes only.
+- **Summary line grep parity** — existing `[passN] …` summary strings in `build_ast_graph.py` stay **verbatim**; only **new** start/heartbeat lines are added.
+- **Heartbeat cadence fixed at 5 s** (integer seconds in messages); not configurable in this rollout.
+- **Quiet is sacred** — `quiet=True` / `--quiet` must keep capture-only subprocess behaviour (no live relay) **and** must not add any **new** stderr markers from this work. **Parity = stderr byte-for-byte equal to today's baseline per subcommand**, not `stderr == ""` for every command (`increment --quiet` already emits the staleness warning block).
+- **Five improvements, two implementation PRs after propose** — align with propose §6: structural streaming PR, then cosmetic PR (+ docs). Propose’s “PR-PROG-1 = propose merge” is documentation land; once the propose is on the target branch, implementation starts at propose’s PR-PROG-2.
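
The quiet-parity principle above has two acceptable test shapes: exact baseline equality, or absence of the new markers. A minimal sketch of both follows. The helper names (`new_markers`, `assert_quiet_parity`) and the regexes are assumptions made for illustration; only the marker prefixes themselves (`[lance]`, `java-codebase-rag … ·`, `[passN] starting`, `[passN] running`) come from this plan, and the baseline string in the usage note is a placeholder, not the real staleness warning text.

```python
import re

# Markers this work introduces; none of them may appear under --quiet.
# The patterns are illustrative guesses at how to match the plan's prefixes.
NEW_MARKER_PATTERNS = [
    re.compile(r"^\[lance\] ", re.MULTILINE),
    re.compile(r"^\[pass\d+\] starting", re.MULTILINE),
    re.compile(r"^\[pass\d+\] running", re.MULTILINE),
    re.compile(r"^java-codebase-rag .*\u00b7", re.MULTILINE),  # header/footer
]


def new_markers(stderr_text):
    """Return the patterns of any new progress markers found in stderr."""
    return [p.pattern for p in NEW_MARKER_PATTERNS if p.search(stderr_text)]


def assert_quiet_parity(stderr_text, baseline):
    """Baseline equality first; the marker scan gives a clearer failure message."""
    leaked = new_markers(stderr_text)
    assert not leaked, f"new progress markers leaked under --quiet: {leaked}"
    assert stderr_text == baseline, "stderr differs from the recorded baseline"
```

Usage: `assert_quiet_parity(captured_stderr, recorded_baseline)` passes for a non-empty baseline such as a recorded `increment --quiet` warning block, which is the case a blanket `stderr == ""` assertion would get wrong.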
+
+## PR breakdown — overview
+
+| PR | Scope | Ontology bump | Files touched (approx) | Test buckets | Independent of |
+| --- | --- | --- | --- | --- | --- |
+| PR-1 | Propose merge / scope lock (if not already on main) | no | `propose/CLI-PROGRESS-OUTPUT-PROPOSE.md` | none | none |
+| PR-2 | Live stream stdout+stderr; cocoindex `[lance]` wrap; full buffers for tails; quiet parity | no | `server.py`, `java_codebase_rag/pipeline.py`, `java_codebase_rag/cli.py`, new test module | unit + integration (stdout invariant, quiet) | PR-1 if propose already merged |
+| PR-3 | Pass-start lines, 5 s heartbeats (`build_ast_graph.py` + `write`), pipeline header/footer, README + `docs/JAVA-CODEBASE-RAG-CLI.md` + `docs/AGENT-GUIDE.md` + `--help` one-liner | no | `build_ast_graph.py`, `java_codebase_rag/cli.py`, `README.md`, CLI docs, agent guide, tests | heartbeat ordering, header/footer, quiet extension, stdout invariant regression | PR-2 |
+
+Landing order: **PR-1 (optional) → PR-2 → PR-3**.
+
+## Ground truth vs propose (implementation must not miss this)
+
+| Topic | Decision |
+| --- | --- |
+| Where `reprocess` buffers today | `server.py::run_refresh_pipeline` uses `asyncio.create_subprocess_exec` with `PIPE` + `communicate()`. |
+| Where `init` / `increment` buffer today | `java_codebase_rag/pipeline.py` uses `subprocess.run(..., capture_output=True)` for cocoindex and for `build_ast_graph.py`. **PR-2 must stream both paths**, not only `run_refresh_pipeline`, or `init` remains silent until exit. |
+| `erase` | Uses `pipeline.run_cocoindex_drop` and in-process deletes; propose’s header/footer still apply in PR-3. Optional: stream `cocoindex drop` stderr in non-quiet later; not required for propose UC6 if drop stays fast. |
+| MCP stdio rule | Tool handlers must not write to stdout; **`run_refresh_pipeline` is CLI-only today** — keep relay on **stderr** only. |
+
+---
+
+# PR-1 — Propose merge (optional)
+
+## File-by-file changes
+
+### 1. `propose/CLI-PROGRESS-OUTPUT-PROPOSE.md`
+
+- Land or refresh status so the propose is the reviewed anchor for Phase 1 scope and Phase 2 deferrals.
+
+## Tests for PR-1
+
+- None (documentation only).
+
+## Definition of done (PR-1)
+
+- Propose is merged to the integration branch with §7 decisions intact.
+
+## Implementation step list
+
+| # | Step | File(s) | Done when |
+| --- | --- | --- | --- |
+| 1 | Confirm propose merged | `propose/CLI-PROGRESS-OUTPUT-PROPOSE.md` | PR-2 branch can rebase on it |
+
+---
+
+# PR-2 — Stream subprocess I/O + cocoindex wrap
+
+## File-by-file changes
+
+### 1. `server.py` (`run_refresh_pipeline`)
+
+- Replace `communicate()`-style buffering for cocoindex and `build_ast_graph.py` with **concurrent async readers** on both `stdout` and `stderr`.
+- While `quiet=False`: relay each chunk **verbatim** to **`sys.stderr`** (preserve bytes/decoding strategy; document `errors="replace"` if kept).
+- Always append to in-memory strings for **`RefreshIndexOutput`** (`stdout`/`stderr`/`graph_stdout`/`graph_stderr`) so tail clipping (`clip` / last N chars) matches current field semantics.
+- Emit propose Appendix A lines around cocoindex only when `quiet=False`:
+  - `[lance] running cocoindex update (project_root=)`
+  - `[lance] cocoindex update finished in s (exit=)`
+- When `quiet=True`: keep **capture-only** behaviour (no relay), identical tail attachment semantics.
+
+### 2. `java_codebase_rag/pipeline.py`
+
+- Refactor `run_cocoindex_update` and `run_build_ast_graph` (and optionally `run_cocoindex_drop`) so non-quiet paths **stream** child stdout+stderr to the parent stderr while still returning `CompletedProcess`-compatible strings (full or tailed — match whatever the CLI expects today for failure messages).
+- Quiet paths: retain `capture_output=True` (or equivalent) with no relay.
+
+### 3. `java_codebase_rag/cli.py`
+
+- `_cmd_init` / `_cmd_increment`: call streaming-aware pipeline helpers; emit the same `[lance]` bracket lines around cocoindex as `run_refresh_pipeline` (shared small helper in `pipeline.py` or `cli.py` to avoid drift).
+- Do **not** change stdout JSON / pprint payloads or exit-code mapping.
+
+## Tests for PR-2
+
+Prefer a dedicated module `tests/test_cli_progress_stdout_invariant.py` (name from propose §6) grouping stdout baseline checks.
+
+1. `test_stream_relay_arrives_before_wait` — asyncio (or threaded) fake child: bytes written to child stdout/stderr appear on a **sink** before process exit, proving no end-of-process batching in non-quiet mode.
+2. `test_refresh_pipeline_quiet_stderr_baseline` — `run_refresh_pipeline(quiet=True)`: stderr has **no new** progress markers from this work; compare to baseline or assert relay/wrap lines absent (subprocess output captured only, as today).
+3. `test_cli_lifecycle_stdout_invariant_init` — `java-codebase-rag init --quiet` (tiny fixture, temp index dir): captured **stdout** matches a **checked-in baseline** string (propose §3.4).
+4. `test_cli_lifecycle_stdout_invariant_reprocess` — same for `reprocess --quiet` when CI can run the pipeline; if full cocoindex is unavoidable, gate behind `JAVA_CODEBASE_RAG_RUN_HEAVY` **only as a last resort** — prefer stubbed subprocesses or payload builders so default `pytest tests` stays ungated.
+
+## Definition of done (PR-2)
+
+- Interactive `init`, `increment`, and `reprocess` show live subprocess output on stderr (when not `--quiet`).
+- `RefreshIndexOutput` fields remain populated for success and failure cases with comparable tail limits.
+- Ruff + pytest green per `AGENTS.md`.
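
The PR-2 relay described above (concurrent readers on both pipes, verbatim relay unless quiet, buffers always kept for tails) might look roughly like the sketch below. It is not the `run_refresh_pipeline` implementation: `run_and_relay`, `_pump`, the `sink` parameter and the `tail_chars` default are hypothetical names and values chosen for illustration, and the `[lance]` wrap lines are left out because their exact wording lives in propose Appendix A.

```python
import asyncio
import codecs
import sys


async def _pump(stream, chunks, sink, quiet):
    """Read one child pipe to EOF; keep every chunk, relay it unless quiet."""
    # Incremental decoding avoids mangling a multi-byte character that is
    # split across two reads; errors="replace" mirrors the plan's decode rule.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    while True:
        chunk = await stream.read(4096)  # returns as soon as bytes are available
        if not chunk:  # b"" signals EOF
            break
        chunks.append(chunk)
        if not quiet:
            sink.write(decoder.decode(chunk))
            sink.flush()


async def run_and_relay(argv, *, quiet=False, sink=None, tail_chars=4000):
    """Hypothetical helper: returns (exit_code, stdout_tail, stderr_tail)."""
    sink = sys.stderr if sink is None else sink
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out_chunks, err_chunks = [], []
    # Both pipes are drained concurrently so neither can fill up and block the child.
    await asyncio.gather(
        _pump(proc.stdout, out_chunks, sink, quiet),
        _pump(proc.stderr, err_chunks, sink, quiet),
    )
    exit_code = await proc.wait()

    def tail(chunks):
        text = b"".join(chunks).decode("utf-8", errors="replace")
        return text[-tail_chars:]

    return exit_code, tail(out_chunks), tail(err_chunks)
```

Because chunks are relayed as they arrive, output from the two pipes can interleave at chunk boundaries; that matches the "verbatim, may be partial lines" rule rather than fighting it. In quiet mode the same code path captures without writing to the sink, so tail semantics stay identical between modes.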
+
+## Implementation step list
+
+| # | Step | File(s) | Done when |
+| --- | --- | --- | --- |
+| 1 | Extract or implement async byte relay helper | `server.py` (+ small `_cli_progress.py` only if it reduces duplication) | Both cocoindex and graph subprocesses covered |
+| 2 | Mirror streaming for sync CLI path | `pipeline.py`, `cli.py` | `init`/`increment` no longer buffer until exit |
+| 3 | Add `[lance]` wrap in both server and CLI cocoindex call sites | `server.py`, `cli.py` | Wording matches propose Appendix A |
+| 4 | Add tests | `tests/test_cli_progress_stdout_invariant.py` (+ streaming unit module as needed) | All PR-2 tests pass |
+
+---
+
+# PR-3 — Pass starts, heartbeats, pipeline header/footer, docs
+
+## File-by-file changes
+
+### 1. `build_ast_graph.py`
+
+- For each pass **1–6** and the **write** block (verbose mode): print **start** line from propose Appendix B **before** work; keep existing **summary** lines unchanged.
+- Add **heartbeat** context manager (or task) emitting `[passN] running … s elapsed` every **5 s** on stderr, `flush=True`, guarded by a **small lock** so heartbeat lines do not interleave mid-line with other prints; cancel on pass exit **including exceptions** (propose §8).
+- Suppress start lines + heartbeats when not verbose (`--quiet` path from CLI already drops `--verbose`).
+
+### 2. `java_codebase_rag/cli.py`
+
+- Wrap `init` / `increment` / `reprocess` / `erase` with **header** and **footer** (propose Appendix A) on stderr when not `--quiet`.
+- Timer uses monotonic clock; durations `X.XX` two decimal places; middle-dot `·` separators; `exit=` on footer.
+- `refresh` alias path: deprecation line remains; header/footer bracket **`reprocess`** semantics (same subcommand label as executed handler).
+
+### 3. `README.md` (CLI section)
+
+- One sentence: lifecycle commands stream subprocess progress to **stderr** (including relayed child stdout); `--quiet` suppresses it; stdout remains the machine contract.
+
+### 4. `docs/JAVA-CODEBASE-RAG-CLI.md`
+
+- Same operator-facing note under output / lifecycle area.
+
+### 5. `docs/AGENT-GUIDE.md`
+
+- Same note for agent operators driving the CLI.
+
+### 6. `java_codebase_rag/cli.py` (`build_parser` description)
+
+- One sentence in top-level `--help` description string (propose UC16).
+
+## Tests for PR-3
+
+Use `tests/test_cli_quiet_parity.py` for quiet regression (propose §8 / §6).
+
+1. `test_pass_heartbeat_fires_when_pass_slowed` — inject a delay stub or env-controlled slow fixture so a pass exceeds 5 s; assert at least one heartbeat line **before** summary.
+2. `test_pass_start_before_pass_body` — start line appears before first pass-specific verbose output.
+3. `test_pipeline_header_footer_present` — non-quiet lifecycle command includes header regex and footer regex on stderr.
+4. `test_cli_quiet_stderr_baseline_per_subcommand` — for `init` / `increment` / `reprocess` / `erase --yes` with `--quiet`, captured **stderr** equals a **checked-in per-subcommand baseline** from current behaviour (or assert absence of new markers: `[lance]`, `java-codebase-rag … ·`, `[passN] starting`, `[passN] running …`). **Expect non-empty baseline for `increment --quiet`** (staleness warning).
+5. Re-run / extend `test_cli_lifecycle_stdout_invariant_*` from PR-2 — stdout baselines still match.
+
+## Definition of done (PR-3)
+
+- Output spec in propose Appendix A satisfied for normal `init` / `reprocess` runs (modulo real cocoindex volume).
+- Documentation and `--help` mention stderr streaming + `--quiet`.
+- Full `tests` suite + ruff per repo workflow.
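
The heartbeat described for `build_ast_graph.py` (5 s cadence, single locked `print`, cancellation on exit including exceptions) can be sketched with stdlib `threading` alone. This is one possible shape under stated assumptions, not the planned code: `emit`, the `interval` and `stream` parameters exist here only to make the sketch testable, and the message wording is a stand-in for the Appendix B string.

```python
import contextlib
import sys
import threading
import time

# One module-level lock shared by every writer so lines never interleave mid-line.
_PRINT_LOCK = threading.Lock()


def emit(line, stream=None):
    """Write one whole line under the lock; lock scope is the single print call."""
    stream = sys.stderr if stream is None else stream
    with _PRINT_LOCK:
        print(line, file=stream, flush=True)


@contextlib.contextmanager
def heartbeat(label, interval=5.0, stream=None):
    """Emit an elapsed-time line every `interval` seconds until the block exits."""
    stop = threading.Event()
    start = time.monotonic()

    def worker():
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop ends promptly when the pass finishes.
        while not stop.wait(interval):
            elapsed = int(time.monotonic() - start)  # integer seconds
            emit(f"[{label}] running … {elapsed} s elapsed", stream)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, so the worker cannot leak.
        stop.set()
        thread.join()
```

Usage would be `with heartbeat("pass2"): run_pass_2()`, with the existing summary line printed through `emit` after the block. A pass that finishes in under one interval produces no heartbeat at all, which is the behaviour the propose expects for small repos.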
+
+## Implementation step list
+
+| # | Step | File(s) | Done when |
+| --- | --- | --- | --- |
+| 1 | Implement `heartbeat()` context manager | `build_ast_graph.py` | Exception-safe cancellation + lock |
+| 2 | Insert Appendix B start strings | `build_ast_graph.py` | Grep parity on old summaries |
+| 3 | Header/footer helpers | `cli.py` | All four lifecycle verbs wrapped |
+| 4 | Docs + help string | `README.md`, `docs/*.md`, `cli.py` | Single consistent sentence |
+| 5 | Tests | `tests/…` | PR-3 tests + PR-2 invariants green |
+
+---
+
+# Cross-PR risks and mitigations
+
+| # | Risk | Severity | Mitigation |
+| --- | --- | --- | --- |
+| 1 | Line interleaving from concurrent stderr writes | Medium | Module-level `threading.Lock` around **single** `print(..., flush=True)` calls (propose §8). |
+| 2 | Heartbeat thread leaks on exception | Medium | Context manager `__exit__` always cancels background worker; unit test exception path. |
+| 3 | Quiet tests assert `stderr == ""` for `increment --quiet` and fail | Medium | Lock rule in propose §3.3 + plan: **baseline parity**; record `increment` quiet stderr fixture including staleness block. |
+| 4 | Accidental timestamp or progress on stdout | High | Baseline byte comparison tests; code review: only `_emit` / `print` to stdout for payloads. |
+| 5 | `init` path forgotten | High | Explicit `pipeline.py` + `cli.py` scope in PR-2; grep for `capture_output=True` after PR-2. |
+| 6 | UTF-8 decode errors on relay | Low | Keep `errors="replace"` consistent with today’s decode of captured bytes. |
+
+# Out of scope
+
+- Pretty rendering, colors, progress bars, `rich` / `tqdm` / `click`.
+- Changing summary line text in `build_ast_graph.py`.
+- `meta`, `tables`, `diagnose-ignore`, `analyze-pr` output or timing.
+- Configurable heartbeat interval or `--format=json` for human progress.
+- Parsing or summarizing cocoindex output beyond the two `[lance]` lines.
+- i18n / translated stderr.
+
+# Whole-plan done definition
+
+1. Long `init` / `reprocess` runs emit visible stderr at most ~5 s apart during graph passes (verbose) and stream cocoindex + builder child output live when not `--quiet`.
+2. `--quiet` lifecycle runs: stdout payloads match checked baselines; **stderr matches per-subcommand baselines from today** (no new markers from this work; `increment --quiet` baseline includes the existing staleness warning).
+3. Documentation and `--help` describe stderr streaming and `--quiet` suppression.
+4. `propose/CLI-PROGRESS-OUTPUT-PROPOSE.md` status updated to **completed** when the feature set is merged; this plan moved to `plans/completed/` after the final PR lands.
+
+# Tracking
+
+- `PR-1` (propose): _pending_
+- `PR-2` (stream + wrap): _pending_
+- `PR-3` (heartbeats + docs): _pending_
+
+Optional follow-up: add `plans/CURSOR-PROMPTS-CLI-PROGRESS-OUTPUT.md` using `plans/completed/CURSOR-PROMPTS-TIER1B.md` as the structural template for per-PR Cursor handoffs.
diff --git a/propose/CLI-PROGRESS-OUTPUT-PROPOSE.md b/propose/CLI-PROGRESS-OUTPUT-PROPOSE.md
index dbbebdc..a5085da 100644
--- a/propose/CLI-PROGRESS-OUTPUT-PROPOSE.md
+++ b/propose/CLI-PROGRESS-OUTPUT-PROPOSE.md
@@ -3,7 +3,7 @@
 **Status**: draft
 **Author**: Dmitriy Teriaev + Perplexity Computer
 **Date**: 2026-05-11
-**Last amended**: 2026-05-13 (pass5/pass6 + cocoindex stdout tee + risk-table wording)
+**Last amended**: 2026-05-13 (pass5/pass6 + cocoindex stdout tee + risk wording + quiet stderr baseline parity §3.3)
 
 ## TL;DR
 
@@ -12,7 +12,7 @@
 - **This propose ships only the minimal mode (Mode 1).** Phase 2 — TTY-pretty rendering with `rich` / progress bars / colors — is deferred to a separate later propose (`CLI-PRETTY-OUTPUT-PROPOSE.md`) and explicitly out of scope here. The split is intentional: minimal mode fixes 80% of the perceived "is it stuck?" problem at 20% of the work and zero new dependencies.
 - **What ships in Mode 1**: (a) **stream** each subprocess's **stdout and stderr** live to the operator (relay **verbatim** to the parent process's stderr — the human channel) instead of buffering until `communicate()` returns, (b) **wrap** cocoindex with one-line announcements (`[lance] running cocoindex update…` / `[lance] done in X.XXs`), (c) **heartbeat** lines every ~5 s during long passes in `build_ast_graph.py`, (d) per-pass start lines (today's pipeline only prints per-pass *end* lines — opening "now starting pass 2" lines close the silent-gap perception), (e) a one-line overall **pipeline header** and **footer** in the CLI driver.
 - **Scope hard-cap: lifecycle commands only** (`init`, `increment`, `reprocess`, `erase`). `meta`, `tables`, `diagnose-ignore`, `analyze-pr` stay byte-for-byte identical in this round.
-- **Backwards-compatibility invariant**: machine-readable **CLI stdout** for all existing commands stays byte-for-byte identical. New human-facing text (including **relayed subprocess stdout**, not only stderr) is written to **stderr** or suppressed under `--quiet` / `quiet=True`.
+- **Backwards-compatibility invariant**: machine-readable **CLI stdout** for all existing commands stays byte-for-byte identical. New human-facing text (including **relayed subprocess stdout**, not only stderr) is written to **stderr** or suppressed under `--quiet` / `quiet=True` (see §3.3: **baseline parity** for overall stderr, not global empty string).
 - **No new runtime dependencies.** `rich` / `tqdm` / `click` deliberately deferred to Phase 2.
 - **Migration shape**: **3 PRs** — propose merge → stream + wrap (the structural fix) → heartbeats + start lines + pipeline header/footer (the cosmetic-but-real fix). Tests focus on the streaming invariant + `--quiet` parity; no progress-bar UI tests in this round.
 
@@ -44,7 +44,7 @@ This frame rules out:
 2. **CLI stdout is the agent contract; stderr is the human channel.** Every new line **we synthesize** lands on stderr. Relayed subprocess bytes also go to stderr so they are visible before exit. **CLI** stdout payloads for `meta` / `tables` / `analyze-pr` remain byte-for-byte identical.
 3. **No new runtime dependencies.** No `rich`, no `tqdm`, no `click`. Pure stdlib `time` / `sys` / `threading`.
 4. **Honest about partial knowledge.** When a pass cannot announce a percentage (e.g. cocoindex internals are opaque to us), we say "running…" with elapsed time, not a fake bar. Mirrors the "partial fidelity is loud" principle from CLI-SCENARIOS §2.
-5. **`--quiet` is sacred.** The existing `--quiet` flag (which sets `quiet=True` and drops `--verbose` from the graph builder) must continue to suppress *every* new line this propose adds, including the pipeline header / footer and heartbeats. CI consumers depend on it.
+5. **`--quiet` is sacred.** The `--quiet` flag (which sets `quiet=True` and drops `--verbose` from the graph builder) must continue to suppress *every* new line **this propose adds**, including the pipeline header / footer and heartbeats. **Overall** `--quiet` stderr behaviour follows §3.3: **baseline equality** with today per subcommand, not “stderr is always empty” (`increment --quiet` already prints the staleness warning). CI consumers depend on no *additional* noise from this work.
 6. **Cardinal-number discipline.** This propose locks **5 user-visible improvements** (stream, cocoindex wrap, heartbeats, pass-start lines, pipeline header/footer) across **3 PRs**. Adding a 6th improvement in this round requires a propose amendment, not a drive-by. Mirrors [`propose/completed/CLI-SCENARIOS-PROPOSE.md`](completed/CLI-SCENARIOS-PROPOSE.md) §6.
 7. **Heartbeat cadence is fixed at ~5 s.** Not adjustable in this round (one knob, one default; matches CLI-SCENARIOS "one source of truth per config knob" principle). A future propose may make it configurable if a real consumer needs it.
 8. **No structural change to the pipeline.** We surface existing phases; we do not split, merge, reorder, or rename passes in `build_ast_graph.py`.
@@ -173,7 +173,11 @@ The user now sees something happening at most ~5 s apart for the entire duration
 
 ### 3.3 What `--quiet` looks like
 
-`java-codebase-rag init --quiet` produces **no stderr output** except errors (today's behaviour, preserved). stdout (the machine-readable summary the CLI driver prints at exit) is byte-for-byte identical to today's output. CI logs and agent-sandbox runs see no change in line count or content.
+**Stdout:** the machine-readable summary the CLI prints at exit is byte-for-byte identical to today's output (same as §3.4).
+
+**Stderr — quiet parity rule (locked):** `--quiet` must not add any **new** stderr bytes from this propose: no relayed subprocess output, no `[lance]` wrap lines, no pipeline header/footer, no pass-start or heartbeat lines. **Pre-existing CLI stderr is unchanged.** Today, `increment --quiet` always prints a fixed multi-line **Kuzu graph may be stale** warning to stderr before cocoindex (`java_codebase_rag/cli.py`); that block is outside this propose and must remain bit-identical in `--quiet` runs. Tests therefore assert **per-subcommand stderr baselines** recorded from current behaviour (or assert absence of new markers such as `[lance]`, `java-codebase-rag ·`, `[passN] starting`, `[passN] running …`), not `stderr == ""` for every lifecycle command.
+
+On a **success** path with no extra warnings, `init --quiet` / `reprocess --quiet` / `erase --yes` typically still have empty stderr, as today.
 
 ### 3.4 Stdout invariant (locked)
 
@@ -181,15 +185,16 @@ For each of `init` / `increment` / `reprocess` / `erase`, the **`java-codebase-r
 
 ## §4 — Use-case re-walk
 
-Walking 16 realistic invocations through the proposed surface. Each row records the **mode** (interactive vs CI / agent), the **observable change** post-PR-PROG-3, and whether the **stdout invariant** holds.
+Walking 17 realistic invocations through the proposed surface. Each row records the **mode** (interactive vs CI / agent), the **observable change** post-PR-PROG-3, and whether the **stdout invariant** holds.
 
 | # | Invocation | Mode | Observable change | Stdout invariant |
 |---|---|---|---|---|
 | UC1 | `java-codebase-rag init` on 4500-file estate | Interactive | 5 s max silence; pipeline header + cocoindex wrap + per-pass start/heartbeat/summary + footer | Identical |
 | UC2 | `java-codebase-rag init` on a 50-file toy repo | Interactive | Same lines, but most heartbeats never fire (passes finish in <5 s). Header / footer / start / summary still print. | Identical |
-| UC3 | `java-codebase-rag init --quiet` | Interactive | No stderr output (today's behaviour) | Identical |
+| UC3 | `java-codebase-rag init --quiet` | Interactive | No **new** stderr from this propose; stderr matches today's baseline (usually empty on success) | Identical |
 | UC4 | `java-codebase-rag reprocess` in CI (non-TTY, output redirected) | CI / agent | Stderr lines now appear in the CI log in real time instead of one final burst — fine for line-oriented CI consumers; no ANSI escapes | Identical |
 | UC5 | `java-codebase-rag increment` (small Lance delta, full graph rebuild) | Interactive | Cocoindex wrap shows quick exit (e.g. 2 s); graph rebuild still gets heartbeats. User can tell which side is slow. | Identical |
+| UC5b | `java-codebase-rag increment --quiet` | CI / agent | Same multi-line Kuzu staleness stderr warning as today; still no `[lance]` / header / relayed subprocess lines | Identical |
 | UC6 | `java-codebase-rag erase --yes` | Interactive | Pipeline header + footer; no cocoindex / pass lines (no subprocess work) | Identical |
 | UC7 | Cursor agent runs `init` in a sandbox shell | CI / agent | Sees lines streamed instead of a single burst; agent can detect progress / hang on its own | Identical |
 | UC8 | User pipes output to a file: `java-codebase-rag init 2> log.txt` | CI / agent | Log file fills as the command runs (was: log file appears empty until exit, then fills) | Identical |
@@ -204,10 +209,10 @@ Walking 16 realistic invocations through the proposed surface. Each row records
 
 **Result of the re-walk:**
 
-- 16 of 16 invocations: stdout invariant holds.
-- 16 of 16: observable stderr change is improvement, never regression.
-- 0 of 16: requires a 6th improvement, an ANSI / pretty rendering, or a percentage-bar.
-- UC4 / UC8 / UC12 explicitly exercise the **non-TTY / agent / log-file** consumer to validate that no ANSI / redraw / TTY-only construct sneaks in.
+- 17 of 17 invocations: stdout invariant holds.
+- 17 of 17: observable stderr change is improvement, never regression (quiet rows use the §3.3 baseline rule).
+- 0 of 17: requires a 6th improvement, an ANSI / pretty rendering, or a percentage-bar.
+- UC4 / UC8 / UC12 / UC5b explicitly exercise the **non-TTY / agent / log-file** consumer to validate that no ANSI / redraw / TTY-only construct sneaks in.
 
 No surface revisions triggered.
 
@@ -241,7 +246,7 @@ No surface revisions triggered.
 **Purpose**: structural fix. `server.py:run_refresh_pipeline` no longer buffers subprocess output until completion; both child streams are relayed live to the parent's stderr while buffers retain tails for `RefreshIndexOutput`. Cocoindex gets the wrap-around `[lance] running…` / `[lance] finished in …` lines.
 **Tests**:
 - Unit test for the streaming relay (asyncio tasks read from fake stdout+stderr pipes and write to a captured sink in real time, not after `.wait()`).
-- `--quiet` parity test: stderr is empty when `quiet=True`, identical to today.
+- `--quiet` parity test: captured **stderr** matches a **per-subcommand baseline** from current behaviour (or asserts absence of new markers only); not `stderr == ""` for every command — `increment --quiet` already emits the staleness warning block today.
 - Stdout invariant test: `init` against the fixture repo produces a stdout byte-string identical to a recorded baseline.
 
 ### PR-PROG-3 — heartbeats + pass-start lines + pipeline header/footer
@@ -252,7 +257,7 @@ No surface revisions triggered.
 - Heartbeat fires at least once when a fixture pass is artificially slowed to >5 s; does not fire on a fast pass.
 - Pass-start line is emitted before any pass-internal output.
 - Pipeline header / footer wrap the whole command.
-- `--quiet` parity: every new line type is suppressed.
+- `--quiet` parity: every **new** line type from this propose is suppressed; stderr matches **recorded baselines** per subcommand where non-empty stderr already exists today (see §3.3).
 - Stdout invariant test (regression on the PR-PROG-2 test).
 - Docs: README + AGENT-GUIDE.md gain a one-sentence note that lifecycle commands stream subprocess progress to **stderr** (including relayed child stdout) and `--quiet` suppresses it.
 
@@ -266,7 +271,7 @@ Total: 3 PRs.
 4. **CLI stdout is the agent contract; human progress uses stderr.** Payloads printed by `java-codebase-rag` to **its own stdout** stay byte-for-byte identical. Child-process stdout is no longer “invisible until exit,” but it is **streamed to stderr** for humans and **accumulated** for the same structured return values as today — not printed on the CLI's stdout.
 5. **No new runtime dependencies.** Pure stdlib. `rich` / `tqdm` / `click` deferred to Phase 2.
 6. **Heartbeat cadence locked at 5 s.** Not configurable in this round.
-7. **`--quiet` suppresses every new line.** No new line type bypasses the quiet path. CI / agent consumers see no behavioural change in `--quiet` mode.
+7. **`--quiet` suppresses every new stderr byte from this propose** (relay, `[lance]` wrap, header/footer, pass starts, heartbeats). Pre-existing CLI stderr (notably `increment`'s staleness warning) is unchanged; parity is **baseline equality**, not global empty stderr.
 8. **Five improvements, three PRs, locked.** Adding a 6th improvement in this round requires an amendment to this propose.
 9. **Per-pass start lines are net-new; summary lines preserved verbatim.** Grep parity invariant: any consumer that today greps for `[passN] parsed` / `[passN] emitted` / etc. continues to match.
 10. **No ANSI escapes, no TTY detection, no redraw-in-place.** Mode 1 is plain-text in all environments. TTY-aware rendering is Phase 2.
@@ -282,7 +287,7 @@ Total: 3 PRs.
 | Heartbeat thread interleaves with the main thread's prints, corrupting line atomicity | All heartbeat writes use `print(..., file=sys.stderr, flush=True)` with a module-level `threading.Lock` shared between heartbeat and pass-end writers. Lock scope is the single `print` call. |
 | 5 s cadence wrong for some environments (too noisy / too quiet) | Cadence is locked for this round (§7 decision #6). If a real consumer reports a problem, a future propose can introduce a configurable cadence. Two-PR cost to defer is small. |
 | User confused by new lines breaking their muscle memory for the old summary-only output | Existing summary lines are preserved verbatim (§7 decision #9). Anyone grepping `[passN] parsed` keeps working. README + AGENT-GUIDE.md get a one-sentence note in PR-PROG-3. |
-| `--quiet` parity bug: one of the new line types slips through quiet mode | Dedicated unit test in PR-PROG-3 (`test_cli_quiet_parity.py`) runs every lifecycle command with `--quiet` against a fixture and asserts captured stderr is empty (or matches today's empty baseline). |
+| `--quiet` parity bug: one of the new line types slips through quiet mode | Dedicated unit test in PR-PROG-3 (`test_cli_quiet_parity.py`) runs each lifecycle command with `--quiet` against a fixture and asserts **stderr matches a recorded per-subcommand baseline** from current behaviour (or asserts absence of new markers). Does **not** require empty stderr for `increment --quiet`, which today prints the Kuzu staleness warning (§3.3). |
 | Stdout invariant test breaks because timing / wall-clock leaks into stdout | Baseline test redirects only stderr for capture; stdout is asserted against a string baseline that includes no timestamps. If a timestamp accidentally lands on stdout, the test fails — by design. |
 | Non-TTY environments (CI / agent sandboxes) get noisy because we strip nothing | The new lines are line-oriented, no ANSI, no redraws. Line-oriented CI logs are the *target* shape, not an accident. Verified by UC4 / UC8 / UC12. |
 | cocoindex prints a huge amount of output and floods the user terminal | Out of scope — we pass cocoindex output through verbatim by §2 principle 4. If this becomes a real problem, a future propose may add a `--lance-quiet` flag. |
@@ -308,7 +313,7 @@ java-codebase-rag · finished in s (exit=)
 Rules:
 - **CLI-owned** lines (header, footer, `[lance] …` wrap, `[passN] …` heartbeats / starts) go to **stderr**.
 - **Subprocess stdout and stderr** are **relayed verbatim to stderr** as bytes arrive (may be partial lines); the CLI's **own stdout** is unchanged.
-- Every synthesized line is suppressed when `--quiet` (relayed child bytes are not forwarded in quiet mode — same capture behaviour as today).
+- Every synthesized line is suppressed when `--quiet` (relayed child bytes are not forwarded in quiet mode — same capture behaviour as today). **Overall** `--quiet` stderr follows §3.3: **byte-for-byte baseline parity** with today per subcommand, not global empty stderr (see `increment --quiet` staleness warning).
 - `` is integer seconds (no decimals on heartbeats); pipeline header / footer use `s` (two decimals).
 - The pipeline-header / footer lines use the U+00B7 middle dot (`·`) as a separator. No other special characters.
 - No ANSI escapes anywhere.
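
The duration and separator rules above (integer seconds on heartbeats, two decimals on header / footer, U+00B7 separator, monotonic timer) can be illustrated with a few small helpers. This is a sketch under assumptions: the function names are hypothetical, and the position of the subcommand label in the footer is a guess, since the Appendix A template is only partially visible in this diff.

```python
import time

MIDDLE_DOT = "\u00b7"  # U+00B7, the separator named in the rules


def heartbeat_seconds(elapsed: float) -> str:
    """Heartbeat lines carry integer seconds, no decimals."""
    return str(int(elapsed))


def footer_line(subcommand: str, elapsed: float, exit_code: int) -> str:
    """Footer carries two-decimal seconds and the exit code.

    The layout (subcommand label before the middle dot) is an assumption.
    """
    return (
        f"java-codebase-rag {subcommand} {MIDDLE_DOT} "
        f"finished in {elapsed:.2f}s (exit={exit_code})"
    )


def timed(run):
    """Run a zero-argument callable; return (its result, elapsed seconds).

    time.monotonic() is used so wall-clock adjustments cannot skew durations.
    """
    start = time.monotonic()
    result = run()
    return result, time.monotonic() - start
```

Keeping the formatting in pure functions like these makes the "no ANSI, one special character" rule easy to assert in a unit test without spawning the CLI.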