## Problem

The README makes claims that need verification through the docs-as-specs pipeline.

## Current State (Jan 2026)

After running `/validate-docs`, the coverage is better than expected:

| Category | Coverage |
|----------|----------|
| README features with specs | 6/7 (86%) |
| Spec assertions with tests | 34/35 (97%) |

## Critical Gap: Agent Workflow

The README prominently features this workflow:

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│    Spawn    │────►│    Work     │────►│     PR      │────►│    Close    │
│   (main)    │     │  (k8s ns)   │     │  (GitHub)   │     │  (summary)  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
```

**This has NO spec and NO tests.** It's the headline feature but completely unverified.

## Other Gaps

- [ ] **chat.md**: "Messages persist across page reloads" - no test
- [ ] **Terminology**: Specs say "Sessions", some UI/tests say "INBOX"
- [ ] **Skipped tests**: Several spec behaviors have skipped tests due to flakiness

## Infrastructure Claims (Out of Scope for E2E)

These README claims are about infrastructure, not user behavior:

- "Workers continue after browser closes" - DBOS guarantee
- "Task state persists across restarts" - DBOS guarantee
- "Workers in isolated K8s namespaces" - Deployment architecture

These should be verified via integration tests, not E2E.

## Action Items

1. [ ] Create `docs/specs/agent-workflow.md`
2. [ ] Add message persistence test to chat spec coverage
3. [ ] Fix skipped tests or remove spec assertions they cover
4. [ ] Standardize terminology (Sessions vs INBOX)

## What IS Working Well

The docs-as-specs approach is solid:

- `sessions.md` → 21/21 assertions tested (100%)
- `layout.md` → 7/7 assertions tested (100%)
- `chat.md` → 6/7 assertions tested (86%)

The pipeline works, we just need to extend it to the agent workflow.
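
As a quick cross-check, the per-spec counts above can be summed to reproduce the overall "34/35 (97%)" figure. This is a minimal Python sketch of that arithmetic only; `SpecCoverage` and `overall` are hypothetical names chosen for illustration and say nothing about how `/validate-docs` is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecCoverage:
    """Tested vs. total assertions for one spec file (hypothetical helper)."""

    spec: str
    tested: int
    total: int

    @property
    def percent(self) -> int:
        # Rounded to the nearest whole percent, matching the figures in this note.
        return round(100 * self.tested / self.total)


# Counts copied from the per-spec list in this note.
SPECS = [
    SpecCoverage("sessions.md", 21, 21),
    SpecCoverage("layout.md", 7, 7),
    SpecCoverage("chat.md", 6, 7),
]


def overall(specs: list[SpecCoverage]) -> tuple[int, int, int]:
    """Return (tested, total, rounded percent) summed across all specs."""
    tested = sum(s.tested for s in specs)
    total = sum(s.total for s in specs)
    return tested, total, round(100 * tested / total)


if __name__ == "__main__":
    for s in SPECS:
        print(f"{s.spec}: {s.tested}/{s.total} ({s.percent}%)")
    tested, total, pct = overall(SPECS)
    print(f"overall: {tested}/{total} ({pct}%)")
```

The single untested assertion in the total is the `chat.md` message-persistence behavior listed under Other Gaps, so closing that item would bring assertion coverage to 35/35.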