diff --git a/memory/PLAN.md b/memory/PLAN.md index 258879f8..da0fa52e 100644 --- a/memory/PLAN.md +++ b/memory/PLAN.md @@ -17,7 +17,7 @@ The interaction model is mature: four-phase interview, interviewer-autonomous qu The next product arc is the **Conversational Workspace Runtime** umbrella (`docs/design/CONVERSATIONAL_WORKSPACE_RUNTIME.md`) plus a stronger semantic/generative substrate. The umbrella synthesizes MULTI_CHAT, SIDE_CHAT, PATCH_LEDGER, and CONTINUOUS_WORKSPACE_HYBRID into five sub-tracks: workspace shell (Track 1, shipped as `continuous-workspace` / FE-709), inline secondary-chat runtime over the existing chat/turn substrate (`chat-runtime-secondary-chats`), reconciliation runtime absorption (`reconciliation-runtime`), changeset ledger (`changeset-ledger`), and transcript-first chat context provision (`chat-context-provision`). The shell is now the stable host; schema-level `thread` is deferred until chat/turn proves insufficient. Secondary chats are the near-term runtime primitive for side, reconciliation, qa, and strategy conversations. The chat runtime is the critical unblocker for reconciliation absorption; chat context provision can proceed against chat/turn with explicit transcript snapshots and graph-item handles. The changeset ledger runs in parallel. The umbrella supersedes the independent side-chat V4a persistence horizon — persistent side-chat history becomes inline secondary chats in the workspace. The FE-705 branch contributes an integration substrate — a local agent capability CLI and external LLM-as-user probe harness — that should be reconciled into main before graph-review and scenario-options work depends on generated completed-spec fixtures. After that, the highest-coordination work is intent-graph semantics and the semantic changeset ledger; FE-701 should follow soon after the FE-705 reconciliation because the current schema already carries transitional multi-chat / reconciliation placeholders that only become coherent once `changeset` / `change` owns semantic mutation history. Lower-coordination provider, gitignore, and web-research work can proceed in parallel. -The **orchestrator / Petri-net execution substrate** is committed (2026-05-21) to Petri as the forward execution model, justified by parallelism, simulation, and resume value claims. The dual-engine PoC (FE-730 / PR #143) validated the substrate but left the engine as a serial first-enabled interpreter with hand-compiled nets, collapsed mechanical/semantic completion, and leaked control state outside the net. The next moves evolve the Petri engine through a phased plan: Phase 0 (compiler/interpreter/firing-policy extraction) closes `orchestrator-poc`; Phases 1–2 (`petri-semantic-lanes`, `petri-parallel-execution`) are the near-horizon new frontier items under umbrella H-6476; Phases 3–4 (graph compilation, simulation oracle) are on the horizon pending `intent-graph-semantics` (FE-700) and relation-policy readiness. The north-star design is `docs/next/architecture/plan-graph-petri-orchestration.md`. +The **orchestrator / Petri-net execution substrate** is committed (2026-05-21) to Petri as the forward execution model, justified by parallelism, simulation, and resume value claims. Phases 0–2 are done: the dual-engine PoC (Phase 0, FE-730) validated the substrate and extracted the compiler/interpreter; Phase 1 (FE-738) added two-lane mechanical+semantic subnets, the compiler topology/wiring split, and §7 event vocabulary; Phase 2 (FE-743) added parallel firing policy with greedy token claiming, shared resource pool tokens bounding global concurrency, and worktree-per-slice isolation — the decision gate passed (parallel measurably beats serial on wall clock). Phases 3–4 (graph compilation, simulation oracle) are on the horizon pending `intent-graph-semantics` (FE-700) and relation-policy readiness. The north-star design is `docs/next/architecture/plan-graph-petri-orchestration.md`. The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agent-mutation design notes are reconciled into one direction. `docs/design/MULTI_CHAT.md` is the substrate document. `docs/design/SIDE_CHAT.md` describes side-chat V1 / V2 / V3.0 / V3.1 / V4 phasing on top of that substrate. `docs/design/PATCH_LEDGER.md` remains historical deeper design pressure for semantic mutation history, but canonical future-facing vocabulary is `changeset` / `change`. The product-layer ontology trajectory is split out as `docs/design/INTENT_GRAPH_SEMANTICS.md` and `docs/design/BEHAVIORAL_KERNELS.md`; broader synthesis lives in `docs/archive/design/INTENT_SPEC_EVOLUTION.md`. FE-705's branch-local strategy/proposal notes add scenario options, graph-review oracle, chat-local strategies, and concern/dependency mapping; those notes should become a canonical design doc when the branch is integrated. Coordination uses a substrate-strangler posture: keep existing frontend REST/SSE contracts stable while route adapters and capability adapters converge on shared server-owned handlers, then cut over UI flows only after parity and changeset-backed authority exist. The dev-layer self-tooling trajectory lives in `docs/design/ln-skills/EVOLUTION.md`. @@ -30,12 +30,12 @@ The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agen ### Recently Completed +- `petri-parallel-execution` (FE-743) — parallel firing policy, shared resource pool tokens, worktree-per-slice isolation. Decision gate passed: parallel measurably beats serial on wall clock for multi-slice plans. Follows `petri-semantic-lanes` (FE-738). - `petri-semantic-lanes` (FE-738) — two-lane subnet, compiler topology/wiring split, engine factory, semantic rework budget, §7 events. PR #148. Criterion (5) stale-graph deferred → `petri-graph-compilation`. ### Next -1. `petri-parallel-execution` — parallel firing, shared resource pools, worktree-per-slice coordination; the categorical break where petri earns its complexity. Decision gate: if petri doesn't beat proc on wall clock, pause petri investment. Follows `petri-semantic-lanes`. -3. `intent-graph-semantics` — highest-coordination semantic substrate after FE-705 reconciliation. +1. `intent-graph-semantics` — highest-coordination semantic substrate after FE-705 reconciliation. 4. `changeset-ledger` — Track 4 of the runtime umbrella; parallel with Track 2; semantic history spine needed before canonical proposal acceptance, direct-edit atomicity, and productized scenario options. 5. `chat-context-provision` — Track 5 of the runtime umbrella recast as transcript-first context; can proceed against chat/turn once secondary-chat entry/anchor shape is settled. 6. `reconciliation-runtime` — Track 3 of the runtime umbrella; after Track 2 + Track 4 provide the secondary-chat surface and durable attribution. @@ -93,12 +93,29 @@ The May 2026 intent-spec, multi-chat, changeset-ledger, prompt/context, and agen - **Traceability:** Requirements 46–50; spec §2 (layer split), §4 (canonical slice-net), §6 (transition contracts), §7 (event model), §8 (failure-mode nets), §10 (prototypes A–C). - **Design docs:** `docs/next/architecture/plan-graph-petri-orchestration.md`; `docs/design/orchestrator.md`; umbrella H-6476. +### petri-graph-compilation + +- **Name:** Petri graph compilation — compile nets from plan-graph + relation policy +- **Linear:** unassigned in this plan snapshot +- **Kind:** structural +- **Status:** horizon (blocked on `intent-graph-semantics` FE-700) +- **Objective:** Compile Petri nets from workspace plan-graph nodes and relation-policy edges rather than from YAML plan fixtures. Relation kinds (`plan.depends_on`, `plan.verified_by_oracle`, `plan.introduces_design`, etc.) compile into topology-level requirements (prerequisite tokens, guard predicates, semantic-lane join conditions). Extends the FE-700 relation-policy registry. +- **Why now / unlocks:** Without graph compilation, the Petri engine only runs hand-authored YAML plans. Graph compilation makes the engine a planning oracle (simulate before executing) and connects execution to the semantic workspace. +- **Open design constraints (from PR #143 / FE-743 review):** + - **Declarative output arcs:** Current topology declares only input places; output routing lives in fire closures (conditional on report payloads). FE-738's `HandlerDescriptor` declares candidate outputs (`onTrue`/`onFalse`/`onPass`/`onFail`) but selection is runtime. This limits formal analyzability (reachability, deadlock detection, simulation) to input-side structure. Phase 3 should move conditional routing into the topology — explicit guard predicates + declared output arcs per branch — so the compiled net is formally analyzable end-to-end. + - **Token state enrichment:** Open question whether more metadata should move from reports into tokens (richer typed token payloads per spec §3). FE-738 added `reworkCount`, FE-743 added pool tokens with `agentPoolSize`, but the boundary between control state (tokens) and substantive handoff state (reports) is a design choice this frontier needs to resolve as the token taxonomy gets richer. + - **Epic verification sandbox scope:** Per-slice sandbox isolation means `verify-epic` can't see all slices' artifacts. Currently `verify-epic` falls back to the parent sandbox dir. The production fix is to merge per-slice sandboxes into an epic-scoped dir before epic verification runs. +- **Acceptance:** TBD — depends on FE-700 relation-policy shape. +- **Verification:** Compiled-net topology tests against plan-graph fixtures; reachability assertions for relation-policy-derived gates; comparison of compiled vs hand-authored net shapes. +- **Traceability:** Requirements 46–50; spec §5 (relation-policy compilation), §6 (transition contracts). +- **Design docs:** `docs/next/architecture/plan-graph-petri-orchestration.md` §5–§6; umbrella H-6476. + ### petri-parallel-execution - **Name:** Petri parallel execution — concurrent firing, resource pools, worktree-per-slice -- **Linear:** unassigned (create under umbrella H-6476) +- **Linear:** FE-743 - **Kind:** structural -- **Status:** not-started +- **Status:** done - **Objective:** Replace the serial `while(true) { transitions.find() }` interpreter with a parallel firing policy that can advance multiple enabled transitions concurrently. Convert per-slice `test-agent`/`code-agent` tokens (already present in PoC at `engine-petri.ts:134-149`) into shared capped resource pools that bound global concurrency. Add worktree-per-slice isolation (one worktree per active slice, not just per run). This is the categorical break where the Petri engine earns its complexity over proc. - **Why now / unlocks:** Parallelism is the primary value claim for petri over proc (per PR #143's own verdict and the spec doc's working conclusion). Without it, both engines are serial and proc wins on simplicity. If petri doesn't beat proc on wall clock time for multi-slice plans, the investment should pause. - **Acceptance:** (1) Multi-slice plans execute with real parallelism (multiple transitions firing concurrently). (2) Resource pool tokens limit global concurrency to configured agent capacity. (3) Each active slice has its own worktree. (4) No fan-out starvation, dead-place, or unreached-slice bugs (regressions from PoC bug-fix rounds). (5) Wall-clock improvement measurable on a 3+ slice fixture vs serial execution. (6) Contract test suite still passes for both engines (proc remains serial). @@ -485,9 +502,9 @@ workspace-gitignore-assist productized-web-research TRACK F — Petri-net execution substrate (umbrella H-6476) -orchestrator-poc (Phase 0: compiler extraction — closing) - └──→ petri-semantic-lanes (Phase 1: two-lane subnet + §7 events) - └──→ petri-parallel-execution (Phase 2: concurrent firing + resource pools) +orchestrator-poc (Phase 0: compiler extraction — done) + └──→ petri-semantic-lanes (Phase 1: two-lane subnet + §7 events — done) + └──→ petri-parallel-execution (Phase 2: concurrent firing + resource pools — done) └──→ petri-graph-compilation (Phase 3: compile from plan-graph + relation policy) ├──→ depends on intent-graph-semantics (FE-700) for relation-policy gates └──→ petri-simulation-oracle (Phase 4: reachability, deadlock, resume) diff --git a/src/orchestrator/src/cook-cli.test.ts b/src/orchestrator/src/cook-cli.test.ts index e4c961fa..d67f12ee 100644 --- a/src/orchestrator/src/cook-cli.test.ts +++ b/src/orchestrator/src/cook-cli.test.ts @@ -11,6 +11,11 @@ describe('parseCookArgs', () => { expect(opts.verbose).toBe(false); }); + it('parses --policy=parallel', () => { + const opts = parseCookArgs(['./f', '--policy=parallel']); + expect(opts.policy).toBe('parallel'); + }); + it('parses --policy=serial', () => { const opts = parseCookArgs(['./f', '--policy=serial']); expect(opts.policy).toBe('serial'); diff --git a/src/orchestrator/src/cook-cli.ts b/src/orchestrator/src/cook-cli.ts index add26b83..fc5ee7ec 100644 --- a/src/orchestrator/src/cook-cli.ts +++ b/src/orchestrator/src/cook-cli.ts @@ -7,7 +7,7 @@ import type { FiringPolicy } from './petri-net.js'; import { createPiActions } from './pi-actions.js'; import { loadPlan } from './plan-loader.js'; import { BunTestRunner } from './test-runner.js'; -import { createWorktree } from './worktree.js'; +import { createSandbox } from './worktree.js'; export type CookOptions = { dir: string; @@ -26,8 +26,8 @@ export function parseCookArgs(args: string[]): CookOptions { const arg = args[i]!; if (arg.startsWith('--policy=')) { const val = arg.split('=')[1]!; - if (val !== 'serial') { - throw new Error(`Unknown policy: ${val}. Use serial.`); + if (val !== 'serial' && val !== 'parallel') { + throw new Error(`Unknown policy: ${val}. Use serial or parallel.`); } policy = val; } else if (arg.startsWith('--max-retries=')) { @@ -44,7 +44,7 @@ export function parseCookArgs(args: string[]): CookOptions { } if (!dir) { - throw new Error('Usage: brunch cook [--policy=serial] [--max-retries=N] [--verbose]'); + throw new Error('Usage: brunch cook [--policy=serial|parallel] [--max-retries=N] [--verbose]'); } return { dir: resolve(dir), policy, maxRetries, verbose }; @@ -74,7 +74,7 @@ export async function runCook(opts: CookOptions): Promise { const plan = loadPlan(planPath); const launchCwd = process.env.BRUNCH_LAUNCH_CWD || process.cwd(); - const { worktreeDir, runDir } = createWorktree(launchCwd); + const { sandboxDir, runDir } = createSandbox(launchCwd); const reportsPath = join(runDir, 'reports.jsonl'); const epicCount = plan.epics.length; @@ -86,7 +86,7 @@ export async function runCook(opts: CookOptions): Promise { console.error(` policy ${opts.policy}`); console.error(` plan ${epicCount} epics, ${sliceCount} slices`); console.error(` retries ${opts.maxRetries}`); - console.error(` worktree ${worktreeDir}`); + console.error(` sandbox ${sandboxDir}`); console.error(` reports ${reportsPath}`); console.error(''); @@ -100,7 +100,7 @@ export async function runCook(opts: CookOptions): Promise { const result = await engine.run({ plan, - worktreeDir, + sandboxDir, actions, reports, testRunner, diff --git a/src/orchestrator/src/engine-contract.test.ts b/src/orchestrator/src/engine-contract.test.ts index 23b49b32..7ab80ca9 100644 --- a/src/orchestrator/src/engine-contract.test.ts +++ b/src/orchestrator/src/engine-contract.test.ts @@ -10,7 +10,10 @@ import type { ActionContext, ActionHandlers, OrchestratorInput, Plan, RunCtx, Te // Shared engine list for parameterized tests // --------------------------------------------------------------------------- -const engines = [{ name: 'serial', create: () => createOrchestrator('serial') }] as const; +const engines = [ + { name: 'serial', create: () => createOrchestrator('serial') }, + { name: 'parallel', create: () => createOrchestrator('parallel') }, +] as const; // --------------------------------------------------------------------------- // Reusable fake factory — per-test closures instead of module-level state @@ -126,6 +129,48 @@ function createFakes(opts?: { return { callOrder, reports, actions, testRunner }; } +// --------------------------------------------------------------------------- +// Concurrency-tracking wrapper — reusable across parallel/pool tests +// --------------------------------------------------------------------------- + +type ConcurrencyTracker = { maxConcurrent: number }; + +/** + * Wrap action handlers with concurrency tracking. Each wrapped handler + * increments an active counter, yields to allow interleaving under + * Promise.allSettled, calls the original, then decrements. + * + * @param actions Original action handlers to wrap + * @param onlyKeys If provided, only wrap these action keys (others pass through) + */ +function withConcurrencyTracking( + actions: ActionHandlers, + onlyKeys?: Set, +): { tracked: ActionHandlers; tracker: ConcurrencyTracker } { + let active = 0; + const tracker: ConcurrencyTracker = { maxConcurrent: 0 }; + + const tracked: ActionHandlers = {}; + for (const [key, handler] of Object.entries(actions)) { + if (onlyKeys && !onlyKeys.has(key)) { + tracked[key] = handler!; + } else { + tracked[key] = async (ctx: ActionContext) => { + active++; + tracker.maxConcurrent = Math.max(tracker.maxConcurrent, active); + try { + await new Promise((resolve) => setTimeout(resolve, 5)); + return await handler!(ctx); + } finally { + active--; + } + }; + } + } + + return { tracked, tracker }; +} + // --------------------------------------------------------------------------- // Contract test #1 — single epic, single slice, happy path // --------------------------------------------------------------------------- @@ -157,7 +202,7 @@ describe('Engine contract test #1 — single epic, single slice, happy path', () const fakes = createFakes(); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -170,7 +215,7 @@ describe('Engine contract test #1 — single epic, single slice, happy path', () const fakes = createFakes(); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -184,7 +229,7 @@ describe('Engine contract test #1 — single epic, single slice, happy path', () const fakes = createFakes(); await create().run({ plan: simplePlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -204,7 +249,7 @@ describe('Engine contract test #1 — single epic, single slice, happy path', () const fakes = createFakes(); await create().run({ plan: simplePlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -252,8 +297,6 @@ const depPlan: Plan = { }; describe('Engine contract test #2 — intra-epic slice dependencies', () => { - const engines = [{ name: 'serial', create: () => createOrchestrator('serial') }] as const; - for (const { name, create } of engines) { describe(name, () => { it('completes both slices in dependency order', async () => { @@ -347,7 +390,7 @@ describe('Engine contract test #2 — intra-epic slice dependencies', () => { const engine = create(); const result = await engine.run({ plan: depPlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: depActions, reports, testRunner: depTestRunner, @@ -402,7 +445,7 @@ describe('Engine contract test #3 — epic dependencies', () => { const fakes = createFakes(); const result = await create().run({ plan: epicDepPlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -452,7 +495,7 @@ describe('Engine contract test #4 — epic verification passes', () => { const fakes = createFakes({ verifyEpicResult: true }); const result = await create().run({ plan: verifyPlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -492,7 +535,7 @@ describe('Engine contract test #5 — epic verification fails', () => { const fakes = createFakes({ verifyEpicResult: false }); const result = await create().run({ plan: verifyFailPlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -516,7 +559,7 @@ describe('Engine contract test #6 — retry loop', () => { const fakes = createFakes({ testRunResults: [false, true] }); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -542,7 +585,7 @@ describe('Engine contract test #7 — retry exhaustion', () => { const fakes = createFakes({ testRunResults: [false] }); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -565,7 +608,7 @@ describe('Engine contract test #8 — multi-cycle needs more', () => { const fakes = createFakes({ evalSequence: [false, false, true] }); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -595,7 +638,7 @@ describe('Engine contract test #9 — action handler throws', () => { const fakes = createFakes({ throwOnAction: 'write-tests' }); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -625,7 +668,7 @@ describe('Engine contract test #10 — semantic gate rejects then accepts', () = }); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -656,7 +699,7 @@ describe('Engine contract test #11 — semantic rework exhaustion halts', () => }); const result = await create().run({ plan: simplePlan, - worktreeDir: '/tmp/f', + sandboxDir: '/tmp/f', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -682,10 +725,11 @@ describe('Adapter: compiled net shape (topology-only — no runtime bindings)', const blueprint = compileTopology(simplePlan, { maxRetries: 3 }); // simplePlan: 1 epic, 1 slice (no deps) + // Pool places: pool:test-agent, pool:code-agent = 2 // Epic places: epic:epic-1:done = 1 - // Mechanical places: spec-ready, test-agent, code-agent, failing-tests, - // untested-code, needs-more, done-spec, completed, eligible, - // retry-budget = 10 + // Mechanical places: spec-ready, failing-tests, untested-code, + // needs-more, done-spec, completed, eligible, + // retry-budget = 8 // Semantic places: semantic-budget, semantic-satisfied = 2 // Total places: 13 expect(blueprint.places.length).toBe(13); @@ -723,12 +767,13 @@ describe('Adapter: compiled net shape (topology-only — no runtime bindings)', const blueprint = compileTopology(depPlan, { maxRetries: 3 }); // depPlan: 1 epic, 2 slices (slice-b depends on slice-a) + // Pool places: pool:test-agent, pool:code-agent = 2 // Epic places: epic:epic-1:done = 1 - // Slice-a places: 12 (8 standard + eligible + retry-budget + semantic-budget + semantic-satisfied) - // Slice-b places: 12 (8 standard + eligible + retry-budget + semantic-budget + semantic-satisfied) + // Slice-a places: 10 (6 mechanical + eligible + retry-budget + semantic-budget + semantic-satisfied) + // Slice-b places: 10 (same) // Dep-signal places: slice:slice-a:dep-signal:slice-b = 1 - // Total: 26 - expect(blueprint.places.length).toBe(26); + // Total: 24 + expect(blueprint.places.length).toBe(24); // Transitions: // slice-a: slice-ready, evaluate, write-tests, write-code, run-tests, assess-semantic, return-done = 7 @@ -766,7 +811,7 @@ describe('Adapter: §7 event vocabulary', () => { }; const input: OrchestratorInput = { plan: simplePlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -812,7 +857,7 @@ describe('Adapter: §7 event vocabulary', () => { }; const input: OrchestratorInput = { plan: simplePlan, - worktreeDir: '/tmp/fake', + sandboxDir: '/tmp/fake', actions: fakes.actions, reports: fakes.reports, testRunner: fakes.testRunner, @@ -828,3 +873,287 @@ describe('Adapter: §7 event vocabulary', () => { expect(halted.length).toBe(1); }); }); + +// --------------------------------------------------------------------------- +// Contract test #12 — parallel fires concurrently +// --------------------------------------------------------------------------- + +describe('Engine contract test #12 — parallel fires concurrently', () => { + const threeSlicePlan: Plan = { + epics: [{ id: 'e1', summary: 'Three independent slices', depends_on: [], verification: [] }], + slices: [ + { + id: 'p1', + epic_id: 'e1', + definition: 'S1', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't1' }], + }, + { + id: 'p2', + epic_id: 'e1', + definition: 'S2', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't2' }], + }, + { + id: 'p3', + epic_id: 'e1', + definition: 'S3', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't3' }], + }, + ], + }; + + it('parallel: multiple action handlers execute concurrently for independent slices', async () => { + const fakes = createFakes({ evalSequence: [true], semanticResults: [true] }); + const { tracked, tracker } = withConcurrencyTracking(fakes.actions); + + const engine = createOrchestrator('parallel'); + const result = await engine.run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: tracked, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3 }, + }); + + expect(result.status).toBe('completed'); + // Under parallel policy, independent slices fire concurrently. + expect(tracker.maxConcurrent).toBeGreaterThan(1); + }); + + it('serial: action handlers execute one at a time', async () => { + const fakes = createFakes({ evalSequence: [true], semanticResults: [true] }); + const { tracked, tracker } = withConcurrencyTracking(fakes.actions); + + const engine = createOrchestrator('serial'); + const result = await engine.run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: tracked, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3 }, + }); + + expect(result.status).toBe('completed'); + expect(tracker.maxConcurrent).toBe(1); + }); + + it('parallel: wall-clock time is faster than serial for independent slices', async () => { + const DELAY_MS = 20; + + function createDelayedFakes() { + const f = createFakes({ evalSequence: [true], semanticResults: [true] }); + const delayed: ActionHandlers = {}; + for (const [key, handler] of Object.entries(f.actions)) { + delayed[key] = async (ctx: ActionContext) => { + await new Promise((resolve) => setTimeout(resolve, DELAY_MS)); + return handler!(ctx); + }; + } + return { ...f, actions: delayed }; + } + + // Serial run + const serialFakes = createDelayedFakes(); + const t0 = Date.now(); + await createOrchestrator('serial').run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: serialFakes.actions, + reports: serialFakes.reports, + testRunner: serialFakes.testRunner, + policy: { maxRetries: 3 }, + }); + const serialMs = Date.now() - t0; + + // Parallel run + const parallelFakes = createDelayedFakes(); + const t1 = Date.now(); + await createOrchestrator('parallel').run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: parallelFakes.actions, + reports: parallelFakes.reports, + testRunner: parallelFakes.testRunner, + policy: { maxRetries: 3 }, + }); + const parallelMs = Date.now() - t1; + + // Parallel should be measurably faster — at least 20% improvement + expect(parallelMs).toBeLessThan(serialMs * 0.85); + }); +}); + +// --------------------------------------------------------------------------- +// Contract test #13 — resource pool bounds concurrency +// --------------------------------------------------------------------------- + +describe('Engine contract test #13 — resource pool bounds concurrency', () => { + const threeSlicePlan: Plan = { + epics: [{ id: 'e1', summary: 'Three independent slices', depends_on: [], verification: [] }], + slices: [ + { + id: 'r1', + epic_id: 'e1', + definition: 'S1', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't1' }], + }, + { + id: 'r2', + epic_id: 'e1', + definition: 'S2', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't2' }], + }, + { + id: 'r3', + epic_id: 'e1', + definition: 'S3', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't3' }], + }, + ], + }; + + const agentActions = new Set(['evaluate-done', 'write-tests', 'write-code']); + + it('parallel + agentPoolSize=1: only 1 agent-consuming action at a time', async () => { + const fakes = createFakes({ evalSequence: [true], semanticResults: [true] }); + const { tracked, tracker } = withConcurrencyTracking(fakes.actions, agentActions); + + const result = await createOrchestrator('parallel').run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: tracked, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3, agentPoolSize: 1 }, + }); + + expect(result.status).toBe('completed'); + expect(tracker.maxConcurrent).toBe(1); + }); + + it('parallel + agentPoolSize=2: at most 2 agent-consuming actions at a time', async () => { + const fakes = createFakes({ evalSequence: [true], semanticResults: [true] }); + const { tracked, tracker } = withConcurrencyTracking(fakes.actions, agentActions); + + const result = await createOrchestrator('parallel').run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: tracked, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3, agentPoolSize: 2 }, + }); + + expect(result.status).toBe('completed'); + expect(tracker.maxConcurrent).toBe(2); + }); + + it('default agentPoolSize (unbounded) preserves full concurrency', async () => { + const fakes = createFakes({ evalSequence: [true], semanticResults: [true] }); + const { tracked, tracker } = withConcurrencyTracking(fakes.actions, agentActions); + + const result = await createOrchestrator('parallel').run({ + plan: threeSlicePlan, + sandboxDir: '/tmp/f', + actions: tracked, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3 }, + }); + + expect(result.status).toBe('completed'); + expect(tracker.maxConcurrent).toBe(3); + }); +}); + +// --------------------------------------------------------------------------- +// Adapter test — sandbox-per-slice isolation +// --------------------------------------------------------------------------- + +describe('Adapter: sandbox-per-slice isolation', () => { + it('each action handler receives a per-slice sandboxDir', async () => { + const sandboxDirs = new Map(); + + const fakes = createFakes({ evalSequence: [true], semanticResults: [true] }); + const trackingActions: ActionHandlers = {}; + for (const [key, handler] of Object.entries(fakes.actions)) { + trackingActions[key] = async (ctx: ActionContext) => { + sandboxDirs.set(`${ctx.slice.id}:${key}`, ctx.sandboxDir); + return handler!(ctx); + }; + } + + const engine = createOrchestrator('serial'); + const result = await engine.run({ + plan: simplePlan, + sandboxDir: '/tmp/run', + actions: trackingActions, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3 }, + }); + + expect(result.status).toBe('completed'); + // Every action should receive sandboxDir = /tmp/run/ + for (const [key, dir] of sandboxDirs) { + const sliceId = key.split(':')[0]!; + expect(dir).toBe(`/tmp/run/${sliceId}`); + } + // At least evaluate-done and assess-semantic were called + expect(sandboxDirs.size).toBeGreaterThanOrEqual(2); + }); + + it('verify-epic receives the parent sandboxDir (not per-slice)', async () => { + const verifyPlan: Plan = { + epics: [ + { + id: 'ev', + summary: 'Verified', + depends_on: [], + verification: [{ kind: 'integration-test', target: 't' }], + }, + ], + slices: [ + { + id: 'sv', + epic_id: 'ev', + definition: 'S', + depends_on: [], + verification: [{ kind: 'unit-test', target: 't' }], + }, + ], + }; + + let verifyEpicSandboxDir = ''; + const fakes = createFakes({ evalSequence: [true], semanticResults: [true], verifyEpicResult: true }); + const trackingActions: ActionHandlers = {}; + for (const [key, handler] of Object.entries(fakes.actions)) { + trackingActions[key] = async (ctx: ActionContext) => { + if (key === 'verify-epic') verifyEpicSandboxDir = ctx.sandboxDir; + return handler!(ctx); + }; + } + + const result = await createOrchestrator('serial').run({ + plan: verifyPlan, + sandboxDir: '/tmp/run', + actions: trackingActions, + reports: fakes.reports, + testRunner: fakes.testRunner, + policy: { maxRetries: 3 }, + }); + + expect(result.status).toBe('completed'); + // verify-epic gets the parent sandbox, not /tmp/run/sv + expect(verifyEpicSandboxDir).toBe('/tmp/run'); + }); +}); diff --git a/src/orchestrator/src/net-compiler.ts b/src/orchestrator/src/net-compiler.ts index c3a6df33..be74f9f6 100644 --- a/src/orchestrator/src/net-compiler.ts +++ b/src/orchestrator/src/net-compiler.ts @@ -5,6 +5,9 @@ // 3. compilePlan(input, ctx) → PetriNet (convenience wrapper) // --------------------------------------------------------------------------- +import { mkdirSync } from 'node:fs'; +import { resolve, sep } from 'node:path'; + import type { NetBlueprint, TokenSeed, TransitionSkeleton } from './net-blueprint.js'; import { PetriNet } from './petri-net.js'; import type { Token } from './petri-net.js'; @@ -23,6 +26,19 @@ function ep(epicId: string, place: string): string { return `epic:${epicId}:${place}`; } +/** Resolve a per-slice sandbox under the run root; reject path-escape ids. */ +function sliceSandboxDir(rootSandboxDir: string, sliceId: string): string { + if (!sliceId || sliceId.includes('..') || sliceId.includes('/') || sliceId.includes('\\')) { + throw new Error(`Invalid slice id: ${sliceId}`); + } + const root = resolve(rootSandboxDir); + const dir = resolve(root, sliceId); + if (dir !== root && !dir.startsWith(root + sep)) { + throw new Error(`Invalid slice id: ${sliceId}`); + } + return dir; +} + // --------------------------------------------------------------------------- // Pass 1 — compileTopology: pure function, no closures over runtime state. // Same Plan + Policy → same blueprint. Trivially snapshot-testable. @@ -33,6 +49,19 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { const transitions: TransitionSkeleton[] = []; const initialTokens: { place: string; token: TokenSeed }[] = []; + // Shared agent resource pool places + const poolTestAgent = 'pool:test-agent'; + const poolCodeAgent = 'pool:code-agent'; + places.push(poolTestAgent, poolCodeAgent); + + const poolSize = policy.agentPoolSize ?? plan.slices.length; + for (let i = 0; i < poolSize; i++) { + initialTokens.push( + { place: poolTestAgent, token: { sliceId: '', epicId: '' } }, + { place: poolCodeAgent, token: { sliceId: '', epicId: '' } }, + ); + } + // Epic-level places for (const epic of plan.epics) { places.push(ep(epic.id, 'done')); @@ -65,11 +94,9 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { const sid = slice.id; const base: TokenSeed = { sliceId: sid, epicId: epic.id }; - // Places — mechanical lane + // Places — mechanical lane (agent tokens are in shared pools, not per-slice) for (const name of [ 'spec-ready', - 'test-agent', - 'code-agent', 'failing-tests', 'untested-code', 'needs-more', @@ -83,16 +110,12 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { places.push(p(sid, 'semantic-budget')); places.push(p(sid, 'semantic-satisfied')); - // Retry + semantic budget places + // Retry budget + eligibility gate places.push(p(sid, 'retry-budget')); - - // Eligibility gate places.push(p(sid, 'eligible')); - // Initial tokens + // Initial tokens (agent tokens seeded in pools above) initialTokens.push( - { place: p(sid, 'test-agent'), token: { ...base } }, - { place: p(sid, 'code-agent'), token: { ...base } }, { place: p(sid, 'semantic-budget'), token: { ...base, reworkCount: 0 } }, { place: p(sid, 'retry-budget'), token: { ...base, retryCount: 0 } }, ); @@ -127,7 +150,7 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { // Evaluate transitions.push({ id: `${sid}:evaluate`, - inputs: [p(sid, 'spec-ready'), p(sid, 'test-agent')], + inputs: [p(sid, 'spec-ready'), poolTestAgent], contract: { kind: 'mechanical', lane: 'mechanical', @@ -142,14 +165,14 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { routeField: 'done', onTrue: [p(sid, 'done-spec')], onFalse: [p(sid, 'needs-more')], - agentReturnPlace: p(sid, 'test-agent'), + agentReturnPlace: poolTestAgent, }, }); // Write tests transitions.push({ id: `${sid}:write-tests`, - inputs: [p(sid, 'needs-more'), p(sid, 'test-agent')], + inputs: [p(sid, 'needs-more'), poolTestAgent], contract: { kind: 'mechanical', lane: 'mechanical', actor: 'test-agent' }, handler: { kind: 'action', @@ -158,14 +181,14 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { epicId: epic.id, onTrue: [p(sid, 'failing-tests')], onFalse: [], - agentReturnPlace: p(sid, 'test-agent'), + agentReturnPlace: poolTestAgent, }, }); // Write code transitions.push({ id: `${sid}:write-code`, - inputs: [p(sid, 'failing-tests'), p(sid, 'code-agent')], + inputs: [p(sid, 'failing-tests'), poolCodeAgent], contract: { kind: 'mechanical', lane: 'mechanical', actor: 'coding-agent' }, handler: { kind: 'action', @@ -174,7 +197,7 @@ export function compileTopology(plan: Plan, policy: RunPolicy): NetBlueprint { epicId: epic.id, onTrue: [p(sid, 'untested-code')], onFalse: [], - agentReturnPlace: p(sid, 'code-agent'), + agentReturnPlace: poolCodeAgent, }, }); @@ -313,6 +336,11 @@ export function wireHandlers(blueprint: NetBlueprint, input: OrchestratorInput, net.addPlace(place); } + // Create per-slice sandbox directories + for (const slice of plan.slices) { + mkdirSync(sliceSandboxDir(input.sandboxDir, slice.id), { recursive: true }); + } + // Register transitions with wired fire handlers for (const skel of blueprint.transitions) { const h = skel.handler; @@ -330,7 +358,13 @@ export function wireHandlers(blueprint: NetBlueprint, input: OrchestratorInput, const { actionKey, sliceId, epicId, routeField, onTrue, onFalse, agentReturnPlace } = h; const slice = plan.slices.find((s) => s.id === sliceId)!; const epic = plan.epics.find((e) => e.id === epicId)!; - const actCtx: ActionContext = { slice, epic, plan, worktreeDir: input.worktreeDir, reports }; + const actCtx: ActionContext = { + slice, + epic, + plan, + sandboxDir: sliceSandboxDir(input.sandboxDir, sliceId), + reports, + }; const baseToken: Token = { sliceId, epicId }; fire = async (consumed) => { @@ -364,7 +398,7 @@ export function wireHandlers(blueprint: NetBlueprint, input: OrchestratorInput, const retryToken = consumed[1]!; const retryCount = retryToken.retryCount ?? 0; - const result = await testRunner.run(target, input.worktreeDir); + const result = await testRunner.run(target, sliceSandboxDir(input.sandboxDir, sliceId)); const reportId = createReport(reports, { epicId, sliceId, @@ -399,7 +433,13 @@ export function wireHandlers(blueprint: NetBlueprint, input: OrchestratorInput, const { actionKey, sliceId, epicId, onSatisfied, onRejected, budgetPlace, maxReworks } = h; const slice = plan.slices.find((s) => s.id === sliceId)!; const epic = plan.epics.find((e) => e.id === epicId)!; - const actCtx: ActionContext = { slice, epic, plan, worktreeDir: input.worktreeDir, reports }; + const actCtx: ActionContext = { + slice, + epic, + plan, + sandboxDir: sliceSandboxDir(input.sandboxDir, sliceId), + reports, + }; const baseToken: Token = { sliceId, epicId }; fire = async (consumed) => { @@ -460,7 +500,16 @@ export function wireHandlers(blueprint: NetBlueprint, input: OrchestratorInput, const { actionKey, epicId, representativeSliceId, onPassOutputs } = h; const epic = plan.epics.find((e) => e.id === epicId)!; const slice = plan.slices.find((s) => s.id === representativeSliceId)!; - const actCtx: ActionContext = { slice, epic, plan, worktreeDir: input.worktreeDir, reports }; + // Epic verification runs against the parent sandbox (not a per-slice dir) + // so it can see artifacts from all slices. TODO: merge per-slice sandboxes + // into an epic-scoped dir once parallel slice isolation is production-ready. + const actCtx: ActionContext = { + slice, + epic, + plan, + sandboxDir: input.sandboxDir, + reports, + }; fire = async () => { const reportId = await actions[actionKey]!(actCtx); diff --git a/src/orchestrator/src/petri-net.ts b/src/orchestrator/src/petri-net.ts index 754a2872..a87677ac 100644 --- a/src/orchestrator/src/petri-net.ts +++ b/src/orchestrator/src/petri-net.ts @@ -6,8 +6,7 @@ export type Token = { reportId?: string; sliceId: string; epicId: string; - /** Retry counter — carried on retry-budget tokens. Phase 0 extension - * to move retry state into the net instead of leaking to ctx.retries. */ + /** Retry counter — carried on retry-budget tokens. */ retryCount?: number; /** Semantic rework counter — carried on semantic-budget tokens. * Prevents infinite rework loops when assess-semantic always rejects. */ @@ -40,10 +39,10 @@ export type TransitionDef = { /** * Firing policy determines how the interpreter selects the next enabled - * transition. Phase 0 ships only `serial` (first-enabled); Phase 2 will - * add `parallel` (all-enabled concurrently). + * transition. `serial` fires first-enabled one at a time; `parallel` + * fires all enabled transitions concurrently via greedy token claiming. */ -export type FiringPolicy = 'serial'; +export type FiringPolicy = 'serial' | 'parallel'; // --------------------------------------------------------------------------- // §7 Event vocabulary — structured events emitted by the interpreter. @@ -85,6 +84,8 @@ function placeName(placeId: string): string { return placeId; } +type TransitionClaim = { transition: TransitionDef; consumed: Token[] }; + export class PetriNet { private places = new Map(); private transitions: TransitionDef[] = []; @@ -103,11 +104,6 @@ export class PetriNet { this.transitions.push(def); } - hasTokens(placeId: string): boolean { - const tokens = this.places.get(placeId); - return !!tokens && tokens.length > 0; - } - /** Returns the number of registered places. */ get placeCount(): number { return this.places.size; @@ -123,10 +119,19 @@ export class PetriNet { return this.transitions; } + /** True when every input place of `t` has at least one token. */ + private isEnabled(t: TransitionDef): boolean { + return t.inputs.every((p) => { + const tokens = this.places.get(p); + return tokens && tokens.length > 0; + }); + } + /** True when any non-resource place still holds tokens (actual deadlock, not clean completion). */ private hasWorkBearingTokens(): boolean { for (const [placeId, tokens] of this.places) { if (tokens.length === 0) continue; + if (placeId.startsWith('pool:')) continue; const name = placeName(placeId); if (BENIGN_RESIDUAL_PLACES.has(name)) continue; return true; @@ -134,20 +139,49 @@ export class PetriNet { return false; } - async run(_policy: FiringPolicy, shouldHalt?: () => boolean, eventSink?: NetEventSink): Promise { - // Phase 0: only serial policy — find first enabled, fire, repeat. + private restoreClaim({ transition, consumed }: TransitionClaim): void { + for (let i = 0; i < transition.inputs.length; i++) { + this.addToken(transition.inputs[i]!, consumed[i]!); + } + } + + private depositClaim( + { transition, consumed: _consumed }: TransitionClaim, + outputs: { place: string; token: Token }[], + eventSink?: NetEventSink, + ): void { + const producedPlaces: string[] = []; + for (const { place, token } of outputs) { + this.addToken(place, token); + producedPlaces.push(place); + } + eventSink?.emit({ + kind: 'transition_fired', + ts: new Date().toISOString(), + transitionId: transition.id, + contract: transition.contract, + consumed: transition.inputs, + produced: producedPlaces, + }); + } + + async run(policy: FiringPolicy, shouldHalt?: () => boolean, eventSink?: NetEventSink): Promise { + if (policy === 'serial') { + await this.runSerial(shouldHalt, eventSink); + } else { + await this.runParallel(shouldHalt, eventSink); + } + } + + /** Serial policy — find first enabled transition, fire, repeat. */ + private async runSerial(shouldHalt?: () => boolean, eventSink?: NetEventSink): Promise { while (true) { if (shouldHalt?.()) { eventSink?.emit({ kind: 'net_halted', ts: new Date().toISOString() }); break; } - const enabled = this.transitions.find((t) => - t.inputs.every((p) => { - const tokens = this.places.get(p); - return tokens && tokens.length > 0; - }), - ); + const enabled = this.transitions.find((t) => this.isEnabled(t)); if (!enabled) { if (this.hasWorkBearingTokens()) { eventSink?.emit({ kind: 'net_deadlocked', ts: new Date().toISOString() }); @@ -155,29 +189,91 @@ export class PetriNet { break; } - // Consume one token per input place - const consumed: Token[] = []; + const claim: TransitionClaim = { transition: enabled, consumed: [] }; for (const p of enabled.inputs) { - consumed.push(this.places.get(p)!.shift()!); + claim.consumed.push(this.places.get(p)!.shift()!); } - // Fire — handler decides outputs - const outputs = await enabled.fire(consumed); - const producedPlaces: string[] = []; - for (const { place, token } of outputs) { - this.addToken(place, token); - producedPlaces.push(place); + try { + const outputs = await enabled.fire(claim.consumed); + this.depositClaim(claim, outputs, eventSink); + } catch (err) { + this.restoreClaim(claim); + throw err; } + } + } - // Emit transition_fired event - eventSink?.emit({ - kind: 'transition_fired', - ts: new Date().toISOString(), - transitionId: enabled.id, - contract: enabled.contract, - consumed: enabled.inputs, - produced: producedPlaces, - }); + /** + * Parallel policy — find all enabled transitions, claim tokens greedily, + * fire all claimed transitions concurrently via Promise.allSettled, repeat. + * + * Successful fires in a batch are committed before checking halt, matching + * serial semantics where each completed fire persists. Handler rejections + * roll back the entire batch to avoid partial net state. + */ + private async runParallel(shouldHalt?: () => boolean, eventSink?: NetEventSink): Promise { + while (true) { + if (shouldHalt?.()) { + eventSink?.emit({ kind: 'net_halted', ts: new Date().toISOString() }); + break; + } + + const allEnabled = this.transitions.filter((t) => this.isEnabled(t)); + + if (allEnabled.length === 0) { + if (this.hasWorkBearingTokens()) { + eventSink?.emit({ kind: 'net_deadlocked', ts: new Date().toISOString() }); + } + break; + } + + const claims: TransitionClaim[] = []; + for (const t of allEnabled) { + if (!this.isEnabled(t)) continue; + + const consumed: Token[] = []; + for (const p of t.inputs) { + consumed.push(this.places.get(p)!.shift()!); + } + claims.push({ transition: t, consumed }); + } + + if (claims.length === 0) break; + + const results = await Promise.allSettled( + claims.map(async (claim) => ({ + claim, + outputs: await claim.transition.fire(claim.consumed), + })), + ); + + let firstError: unknown; + const fulfilled: { claim: TransitionClaim; outputs: { place: string; token: Token }[] }[] = []; + + for (const result of results) { + if (result.status === 'fulfilled') { + fulfilled.push(result.value); + } else { + firstError ??= result.reason; + } + } + + if (firstError) { + for (const claim of claims) { + this.restoreClaim(claim); + } + throw firstError; + } + + for (const { claim, outputs } of fulfilled) { + this.depositClaim(claim, outputs, eventSink); + } + + if (shouldHalt?.()) { + eventSink?.emit({ kind: 'net_halted', ts: new Date().toISOString() }); + break; + } } } } diff --git a/src/orchestrator/src/pi-actions.ts b/src/orchestrator/src/pi-actions.ts index 92ee16c2..4064d823 100644 --- a/src/orchestrator/src/pi-actions.ts +++ b/src/orchestrator/src/pi-actions.ts @@ -46,7 +46,7 @@ function runPi(opts: { model: string; promptFile: string; task: string; - worktreeDir: string; + sandboxDir: string; }): string { const start = Date.now(); @@ -71,7 +71,7 @@ function runPi(opts: { opts.task, ], { - cwd: opts.worktreeDir, + cwd: opts.sandboxDir, encoding: 'utf8', timeout: 300_000, maxBuffer: 10 * 1024 * 1024, @@ -129,7 +129,7 @@ export function createPiActions(opts?: { verbose?: boolean; runStart?: number }) model: 'claude-haiku-4-5', promptFile: join(promptsDir, 'evaluator.md'), task, - worktreeDir: ctx.worktreeDir, + sandboxDir: ctx.sandboxDir, }); const parsed = extractJson(raw) as { done?: boolean; reasoning?: string } | undefined; const done = !!parsed?.done; @@ -156,7 +156,7 @@ export function createPiActions(opts?: { verbose?: boolean; runStart?: number }) model: 'claude-sonnet-4-6', promptFile: join(promptsDir, 'test-writer.md'), task, - worktreeDir: ctx.worktreeDir, + sandboxDir: ctx.sandboxDir, }); return report(ctx, 'test-writer', 'tests-written', { @@ -174,7 +174,7 @@ export function createPiActions(opts?: { verbose?: boolean; runStart?: number }) model: 'claude-sonnet-4-6', promptFile: join(promptsDir, 'code-writer.md'), task, - worktreeDir: ctx.worktreeDir, + sandboxDir: ctx.sandboxDir, }); return report(ctx, 'code-writer', 'code-written', { @@ -199,7 +199,7 @@ export function createPiActions(opts?: { verbose?: boolean; runStart?: number }) model: 'claude-sonnet-4-6', promptFile: join(promptsDir, 'test-writer.md'), task: writeTask, - worktreeDir: ctx.worktreeDir, + sandboxDir: ctx.sandboxDir, }); let allPassed = true; @@ -207,7 +207,7 @@ export function createPiActions(opts?: { verbose?: boolean; runStart?: number }) try { const { execSync } = await import('node:child_process'); const output = execSync(`bun test ${v.target}`, { - cwd: ctx.worktreeDir, + cwd: ctx.sandboxDir, encoding: 'utf8', timeout: 60_000, stdio: ['ignore', 'pipe', 'pipe'], diff --git a/src/orchestrator/src/test-runner.ts b/src/orchestrator/src/test-runner.ts index 29effd72..7bff8ddf 100644 --- a/src/orchestrator/src/test-runner.ts +++ b/src/orchestrator/src/test-runner.ts @@ -3,10 +3,10 @@ import { execSync } from 'node:child_process'; import type { TestResult, TestRunner } from './types.js'; export class BunTestRunner implements TestRunner { - async run(target: string, worktreeDir: string): Promise { + async run(target: string, sandboxDir: string): Promise { try { const output = execSync(`bun test ${target}`, { - cwd: worktreeDir, + cwd: sandboxDir, encoding: 'utf8', timeout: 60_000, stdio: ['ignore', 'pipe', 'pipe'], diff --git a/src/orchestrator/src/types.ts b/src/orchestrator/src/types.ts index 8ea18f0a..f964e44d 100644 --- a/src/orchestrator/src/types.ts +++ b/src/orchestrator/src/types.ts @@ -55,7 +55,7 @@ export type ActionContext = { slice: Slice; epic: Epic; plan: Plan; - worktreeDir: string; + sandboxDir: string; reports: ReportSink; }; @@ -74,7 +74,7 @@ export type TestResult = { }; export interface TestRunner { - run(target: string, worktreeDir: string): Promise; + run(target: string, sandboxDir: string): Promise; } // --------------------------------------------------------------------------- @@ -85,11 +85,14 @@ export type RunPolicy = { maxRetries: number; /** Maximum semantic rework cycles per slice before halting. Defaults to maxRetries. */ maxSemanticReworks?: number; + /** Number of tokens per shared agent pool (test-agent, code-agent). + * Defaults to slice count (unbounded — one token per slice). */ + agentPoolSize?: number; }; export type OrchestratorInput = { plan: Plan; - worktreeDir: string; + sandboxDir: string; actions: ActionHandlers; reports: ReportSink; testRunner: TestRunner; diff --git a/src/orchestrator/src/worktree.test.ts b/src/orchestrator/src/worktree.test.ts index ea03c9be..07b5869f 100644 --- a/src/orchestrator/src/worktree.test.ts +++ b/src/orchestrator/src/worktree.test.ts @@ -4,33 +4,33 @@ import { join } from 'node:path'; import { afterEach, describe, expect, it } from 'vitest'; -import { createWorktree } from './worktree.js'; +import { createSandbox } from './worktree.js'; -describe('createWorktree', () => { +describe('createSandbox', () => { const dirs: string[] = []; afterEach(() => { for (const d of dirs) rmSync(d, { recursive: true, force: true }); dirs.length = 0; }); - it('creates worktree under baseDir/.cook/runs//worktree/', () => { + it('creates sandbox under baseDir/.cook/runs//worktree/', () => { const baseDir = mkdtempSync(join(tmpdir(), 'cook-wt-')); dirs.push(baseDir); - const info = createWorktree(baseDir, 'test-run-1'); + const info = createSandbox(baseDir, 'test-run-1'); expect(info.runId).toBe('test-run-1'); expect(info.runDir).toBe(join(baseDir, '.cook', 'runs', 'test-run-1')); - expect(info.worktreeDir).toBe(join(baseDir, '.cook', 'runs', 'test-run-1', 'worktree')); - expect(existsSync(info.worktreeDir)).toBe(true); + expect(info.sandboxDir).toBe(join(baseDir, '.cook', 'runs', 'test-run-1', 'worktree')); + expect(existsSync(info.sandboxDir)).toBe(true); }); it('generates a runId when not provided', () => { const baseDir = mkdtempSync(join(tmpdir(), 'cook-wt-')); dirs.push(baseDir); - const info = createWorktree(baseDir); + const info = createSandbox(baseDir); expect(info.runId).toBeTruthy(); - expect(existsSync(info.worktreeDir)).toBe(true); + expect(existsSync(info.sandboxDir)).toBe(true); }); it('does not write to a separate fixture directory', () => { @@ -38,7 +38,7 @@ describe('createWorktree', () => { const fixtureDir = mkdtempSync(join(tmpdir(), 'cook-fixture-')); dirs.push(baseDir, fixtureDir); - createWorktree(baseDir, 'isolated-run'); + createSandbox(baseDir, 'isolated-run'); // Fixture dir must not have a .cook/ directory expect(existsSync(join(fixtureDir, '.cook'))).toBe(false); diff --git a/src/orchestrator/src/worktree.ts b/src/orchestrator/src/worktree.ts index 86afc4ac..7c22cc39 100644 --- a/src/orchestrator/src/worktree.ts +++ b/src/orchestrator/src/worktree.ts @@ -2,20 +2,20 @@ import { randomUUID } from 'node:crypto'; import { mkdirSync } from 'node:fs'; import { join } from 'node:path'; -export type WorktreeInfo = { +export type SandboxInfo = { runId: string; runDir: string; - worktreeDir: string; + sandboxDir: string; }; /** * Create an isolated run directory under `baseDir/.cook/runs//`. * `baseDir` should be cwd (not the fixture directory) so fixtures stay pristine. */ -export function createWorktree(baseDir: string, runId?: string): WorktreeInfo { +export function createSandbox(baseDir: string, runId?: string): SandboxInfo { const id = runId ?? randomUUID(); const runDir = join(baseDir, '.cook', 'runs', id); - const worktreeDir = join(runDir, 'worktree'); - mkdirSync(worktreeDir, { recursive: true }); - return { runId: id, runDir, worktreeDir }; + const sandboxDir = join(runDir, 'worktree'); + mkdirSync(sandboxDir, { recursive: true }); + return { runId: id, runDir, sandboxDir }; } diff --git a/src/server/cli.ts b/src/server/cli.ts index c703c026..fdc4c030 100644 --- a/src/server/cli.ts +++ b/src/server/cli.ts @@ -22,7 +22,7 @@ if (args.has('--help') || args.has('-h') || args.has('help')) { console.log(' cook [flags] Run the orchestrator on a plan directory.'); console.log(''); console.log('Cook flags:'); - console.log(' --engine=proc|petri Execution engine (default: petri)'); + console.log(' --policy=serial|parallel Firing policy (default: serial)'); console.log(' --max-retries=N Retry budget per slice (default: 3)'); console.log(' --verbose, -v Show raw pi-agent output'); process.exit(0);