wait_for_text: two-call state/capture race within a single poll tick

**Type:** architecture · **Tier:** deferred · **Tool:** `wait_for_text`

## What's happening

Each poll in `wait_for_text` runs two tmux subprocess calls in sequence:

1. `_read_pane_state` issues `display-message` to read `history_size`, `cursor_y`, `pane_height`, `pane_pid`, `pane_dead`.
2. `pane.capture_pane(start=start_line, end=None, join_wrapped=True)` issues `capture-pane`, where `start_line = baseline_abs - state.history_size + 1`.

Between (1) and (2), tmux can scroll more lines into history. tmux's `capture-pane` computes `top = gd->hsize + n` against the **live** hsize at capture time ([cmd-capture-pane.c#L158](https://github.com/tmux/tmux/blob/134ba6c/cmd-capture-pane.c#L158)), not the hsize we sampled in step 1. So when N new rows scroll between the two calls:

- We pass `n = baseline_abs - hsize_at_step1 + 1`
- tmux computes `top = hsize_at_step2 + n = baseline_abs + 1 + (hsize_at_step2 - hsize_at_step1)`
- The captured window starts N rows past the row we wanted; those N rows are invisible to the wait this tick.
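
The arithmetic above can be checked with a small model. This is a sketch only: `capture_top` and `missed_rows` are hypothetical helper names, and the model ignores the clamping tmux applies to out-of-range offsets.

```python
def capture_top(hsize_live: int, n: int) -> int:
    """Absolute row where the capture starts: top = hsize + n, using the live hsize."""
    return hsize_live + n


def missed_rows(baseline_abs: int, hsize_step1: int, hsize_step2: int) -> int:
    """Rows skipped this tick when history grows between the two tmux calls."""
    n = baseline_abs - hsize_step1 + 1        # start offset passed to capture-pane
    wanted_top = baseline_abs + 1             # first row after the baseline
    actual_top = capture_top(hsize_step2, n)  # resolved against hsize at capture time
    return actual_top - wanted_top


print(missed_rows(baseline_abs=120, hsize_step1=100, hsize_step2=100))  # 0
print(missed_rows(baseline_abs=120, hsize_step1=100, hsize_step2=103))  # 3
```

With no scroll between the calls the window starts exactly at `baseline_abs + 1`; with three rows scrolled in, it starts three rows late.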

## When it matters

Single-tick latency under bursty output. The next poll usually picks the missed rows back up — unless the missed rows have already scrolled past the visible region and been collected by [`grid_collect_history`](https://github.com/tmux/tmux/blob/134ba6c/grid.c#L386), at which point the rollover guard fires and the wait raises. So the bug surface is:

- One-tick `interval` of latency on transient bursts (default 50 ms; bounded).
- Permanent miss only at the moment of history rollover — but rollover now raises.

In other words: the race exists but its impact is bounded by `interval` and capped at "raise" rather than "silently wrong" thanks to the rollover guard.

## Options under consideration

### 1. Re-read after capture, retry on drift

```python
state_pre = await asyncio.to_thread(_read_pane_state, pane)
start_line = baseline_abs - state_pre.history_size + 1
lines = await asyncio.to_thread(pane.capture_pane, start=start_line, ..., join_wrapped=True)
state_post = await asyncio.to_thread(_read_pane_state, pane)
delta = state_post.history_size - state_pre.history_size
if delta > 0:
    # capture started `delta` rows too late; re-issue with adjusted start
    ...
```

Raises per-tick subprocess cost from 2 tmux calls to 3 (the extra state read), and to 4, double the current cost, when drift is detected and the capture is re-issued. Complicates the `_PaneState` invariant set: each tick now tracks two state reads. Test matrix grows.
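
One possible shape for the retry, sketched synchronously against an in-memory stand-in rather than real tmux. `FakePane`, `scroll_after_read`, and `capture_with_drift_retry` are hypothetical names for illustration, not part of libtmux or this codebase. Unlike the snippet above, this variant also verifies the re-issued capture with another state read, so a single drift costs 5 calls.

```python
from dataclasses import dataclass, field


@dataclass
class FakePane:
    """Hypothetical stand-in: `rows` is indexed by absolute row number."""
    rows: list[str]
    history_size: int
    scroll_after_read: list[int] = field(default_factory=list)

    def read_history_size(self) -> int:
        size = self.history_size
        if self.scroll_after_read:
            # Simulate output arriving right after the state read.
            self.history_size += self.scroll_after_read.pop(0)
        return size

    def capture(self, start: int) -> list[str]:
        # Like tmux, resolve the offset against the live history size.
        return self.rows[max(self.history_size + start, 0):]


def capture_with_drift_retry(pane: FakePane, baseline_abs: int, max_retries: int = 3) -> list[str]:
    pre = pane.read_history_size()
    lines: list[str] = []
    for _ in range(max_retries + 1):
        lines = pane.capture(start=baseline_abs - pre + 1)
        post = pane.read_history_size()
        if post == pre:
            return lines  # no drift: the window started where we asked
        pre = post        # drift: reuse the fresh read and capture again
    return lines          # still drifting; fall back to next-tick recovery


pane = FakePane(rows=[f"row{i}" for i in range(10)], history_size=4, scroll_after_read=[2])
print(capture_with_drift_retry(pane, baseline_abs=5))
```

Bounding the retries keeps a continuously scrolling pane from starving the poll loop; when the bound is hit, the caller is no worse off than under option 3.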

### 2. Chain in a single tmux command

Build one `pane.cmd(...)` invocation that issues `display-message ; capture-pane` with tmux's [`\;` chaining](https://github.com/tmux/tmux/blob/134ba6c/cmd-queue.c). One stdout stream needs to be split by the caller. Drops out of libtmux's typed API. Tightly couples to tmux's chaining quirks.
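
A rough sketch of what the chained call and the stdout split might look like. `build_chained_argv` and `split_chained_output` are hypothetical helpers, and the sketch assumes that `display-message -p` prints exactly one line before the capture output and that tmux runs both commands without processing pane output in between; neither assumption is verified here.

```python
def build_chained_argv(target: str, start: int) -> list[str]:
    """argv for one tmux invocation; a bare ';' argument separates the two commands."""
    return [
        "tmux",
        "display-message", "-p", "-t", target, "#{history_size}",
        ";",
        "capture-pane", "-p", "-J", "-t", target, "-S", str(start),
    ]


def split_chained_output(stdout: str) -> tuple[int, list[str]]:
    """Split combined stdout: first line is history_size, the rest are captured rows."""
    first, _, rest = stdout.partition("\n")
    return int(first), rest.splitlines()


print(build_chained_argv("%1", -50))
print(split_chained_output("103\nfoo\nbar\n"))  # (103, ['foo', 'bar'])
```

The start offset still has to be chosen before the chained read returns, so a real implementation would presumably over-capture and then trim using the history size that comes back with the capture.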

### 3. Document, rely on next-tick recovery (current behavior)

Acceptable because:
- The miss is bounded by `interval` (default 50 ms).
- Permanent misses now raise rather than silently return wrong results, courtesy of the rollover guard.
- The deterministic alternative for command-completion synchronization is [`wait_for_channel`](https://github.com/tmux-python/libtmux-mcp/blob/main/src/libtmux_mcp/tools/wait_for_tools.py) composed with `tmux wait-for -S` — zero polling, zero races.
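
For comparison, the channel-based pattern can be sketched as follows. The helper names are hypothetical and this does not show how `wait_for_channel` is actually implemented; it only illustrates composing a command with `tmux wait-for -S` and blocking on the same channel.

```python
import shlex


def with_completion_signal(command: str, channel: str) -> str:
    """Shell line that runs `command`, then signals `channel` regardless of exit status."""
    return f"{command}; tmux wait-for -S {shlex.quote(channel)}"


def wait_argv(channel: str) -> list[str]:
    """argv for a tmux client that blocks until `channel` is signalled."""
    return ["tmux", "wait-for", channel]


# The first string would be sent to the pane; the argv would be run by the waiter.
print(with_completion_signal("make test", "build_done"))
print(wait_argv("build_done"))
```

Using `;` rather than `&&` means the signal fires even when the command fails, so the waiter is not left blocked on a non-zero exit.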

## Recommendation

Stay on option 3 until real-world telemetry shows flaky single-tick misses. The blast radius is small and the agent-facing escape hatch (`wait_for_channel`) is already documented in the `wait_for_text` "When NOT to use this" section. Re-evaluate if a stress-test fixture starts catching missed transitions.

## References

- tmux capture-pane start math: [cmd-capture-pane.c#L158](https://github.com/tmux/tmux/blob/134ba6c/cmd-capture-pane.c#L158) (`top = gd->hsize + n`)
- tmux history collection: [grid.c#L386](https://github.com/tmux/tmux/blob/134ba6c/grid.c#L386) (`grid_collect_history`)
- tmux command queue / chaining: [cmd-queue.c](https://github.com/tmux/tmux/blob/134ba6c/cmd-queue.c)
