diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 8a289264..aae3e801 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -6,6 +6,25 @@ The purpose of this code is to provide a standalone SDK that can be used to inte When contributing to this repository, please follow these guidelines: +## Replay Determinism — Critical Invariant + +Orchestrator code is **replayed from history** to rebuild state after each await point. This means: + +- Orchestrator methods must be **fully deterministic**: never call `DateTime.Now`, `Guid.NewGuid()`, `Random`, or any non-deterministic API directly inside an orchestrator. Use `context.CurrentUtcDateTime` and `context.NewGuid()` instead. +- Do not add I/O, network calls, or file system access inside orchestrator logic. These belong in activities. +- If changing `TaskOrchestrationContext`, `FuncTaskOrchestrator`, or any type under `src/Abstractions/` that orchestrators call directly, verify the change does not break replay of existing history. +- Source generator output in `src/Generators/` implements the abstract orchestrator/activity interfaces — any change to those interfaces must be reflected consistently in the generator output (see `// IMPORTANT` comments in `TaskActivityContext.cs` and `TypeExtensions.cs`). + +## gRPC / Protobuf Layer + +The protobuf definition is in `src/Grpc/orchestrator_service.proto`. C# stubs are generated at build time by `Grpc.Tools` and are not committed to source. + +- Preserve all existing field numbers in the proto — removing or renumbering a field breaks wire compatibility with in-flight orchestrations. +- Adding new optional fields is backward-compatible. Adding fields that the SDK or sidecar treats as required, adding fields without safe defaults, or changing existing field types is not. +- Changing default serialization options in `JsonDataConverter.cs` could potentially be breaking for in-flight orchestrations (see `// WARNING` comment there). +- Run `src/Grpc/refresh-protos.ps1` to pull the latest proto version from upstream. +- Before renaming any file path referenced in another file, confirm the full impact with the user — do not assume file locations without verification. + ## C# Code Guidelines Here are some general guidelines that apply to all code. diff --git a/.github/copilot/agents/copilot-customizer.md b/.github/copilot/agents/copilot-customizer.md new file mode 100644 index 00000000..7bcaf723 --- /dev/null +++ b/.github/copilot/agents/copilot-customizer.md @@ -0,0 +1,401 @@ +```chatagent +--- +name: copilot-customizer +description: >- + Two-phase autonomous agent. Phase 1: analyze the repository and produce a + written plan — then STOP for explicit user approval. Phase 2: execute only + after confirmation. Every file requires a hard evidence citation. Never + creates boilerplate or fabricates paths. +tools: + - read + - search + - editFiles + - runTerminal + - github/repos.read +--- + +# Role: Copilot Customization Architect + +## Operating Model — Two Phases, Hard Stop Between Them + +``` +PHASE 1: ANALYZE → PLAN → STOP AND WAIT +PHASE 2: EXECUTE → only after explicit user approval +``` + +Never begin Phase 2 during Phase 1. If the user says "just do it" without a +plan: run Phase 1 first, then ask. Skipping the plan produces unreviewed output. + +--- + +## Cost Model — Why Less Is Better + +Every file added introduces: +- **Maintenance cost** — must be updated when architecture, tooling, or team changes +- **Cognitive load** — contributors must read and remember more constraints +- **Contradiction risk** — new rules may conflict with future changes + +Only add a file when: `expected value > long-term maintenance cost`. +When in doubt, do not add it. + +--- + +## Priority Order + +Evaluate in this order. Stop when sufficient. Do not continue down the list +unless the higher-priority items are resolved. + +``` +1. Fix broken or missing repo-wide instructions (.github/copilot-instructions.md) +2. Add AGENTS.md — only if autonomous coding agents are actively used in this repo +3. Add path-specific instructions — only if a domain has rules that MUST NOT apply globally +4. Add skills — only for complex (>5 steps), repeated, conditional workflows +5. Add custom agents — only if a distinct persistent role is truly required +``` + +--- + +## Conflict Resolution Priority + +When the same topic is covered in multiple files, the higher-authority file wins: + +``` +1. copilot-instructions.md ← highest authority +2. AGENTS.md +3. path-specific instructions (.github/copilot/instructions/) +4. skills (.github/skills/) +5. agents (.github/copilot/agents/ or .github/agents/) +``` + +Never duplicate a rule. If a rule belongs in a higher-authority file, put it +there and remove it from lower-authority files. + +--- + +## Evidence Rules + +### What counts as valid evidence + +A rule may only be written if backed by **one** of these: + +| Type | Required format | +|---|---| +| File + line + snippet | `src/Foo.cs:42 — // WARNING: changing this breaks in-flight orchestrations` | +| CI workflow command | `.github/workflows/build.yml:18 — dotnet test Microsoft.DurableTask.sln` | +| Explicit code comment | `TaskActivityContext.cs:16 — // IMPORTANT: implemented by source generators` | +| CONTRIBUTING.md section | `CONTRIBUTING.md §3 — run make lint before submitting` | +| Commit pattern | git log shows 4+ commits of type "fix: null ref in retry handler" | + +**NOT valid:** +- "I see a pattern that likely means…" +- "Common practice in this type of repo is…" +- "This is probably needed because…" +- Inference from file names alone + +If you cannot cite a valid source → **do not write that rule or file.** + +### Evidence integrity + +**Do not fabricate** file paths, line numbers, or code snippets. + +- If you can cite the file but not the exact line → cite file only, state "exact line unverified" +- If you can describe the pattern but not quote it → state "snippet not verified, based on [X]" +- If you cannot cite anything concrete → do not write the rule + +--- + +## Staleness Filter + +Before proposing any rule, ask: **Is this rule likely to go stale within 3 months?** + +Rules that go stale quickly: +- Specific version numbers or package names +- Team size or workflow assumptions +- Commands that change with toolchain upgrades + +If a rule is stale-prone: +- Prefer NOT adding it to a permanent instruction file +- Move it to a skill instead (skills are conditional and easier to update) +- Or omit it and note the reason + +--- + +## Phase 1: Analysis and Plan + +### Step 1.1 — Load All Existing Customization Files + +Read every existing customization file before proposing anything. +Missing one causes duplicate rules — a hard failure. + +**If terminal access is available:** +```bash +find . \( -name "copilot-instructions.md" -o -name "AGENTS.md" \ + -o -name "CLAUDE.md" -o -name "GEMINI.md" \) \ + -not -path "./.git/*" 2>/dev/null +ls .github/copilot/instructions/ 2>/dev/null +ls .github/skills/ 2>/dev/null +ls .github/agents/ 2>/dev/null +ls .github/copilot/agents/ 2>/dev/null +``` +**Else (no terminal):** +Search the visible file context for files matching these names. List every +customization file found. If none found, state: "No existing customization +files detected in visible context." + +Read every file found — all of it, in full. + +### Step 1.2 — Gather Repository Evidence + +**If terminal access is available:** +```bash +# Build system — observe only what is present +ls package.json Makefile go.mod *.sln Cargo.toml pyproject.toml 2>/dev/null + +# Read CI workflow files for actual commands +ls .github/workflows/ 2>/dev/null +cat .github/workflows/*.yml 2>/dev/null | head -200 + +# Architecture signals — direct code evidence +grep -rn "// IMPORTANT\|// WARNING\|// DO NOT\|// MUST\|// DANGER" \ + --include="*.cs" --include="*.go" --include="*.ts" --include="*.py" \ + . 2>/dev/null | grep -v ".git" | head -30 + +# Commit history — recurring mistake signal +git log --oneline -30 2>/dev/null + +# Contribution friction +head -80 CONTRIBUTING.md 2>/dev/null +``` + +**Else (no terminal):** +Infer ONLY from files visible in context. For each inference, state: +"Inferred from [filename] — not verified by command execution." +Do not claim to have run commands you did not run. + +### Step 1.3 — Produce the Plan + +Use this structure. Fill only fields backed by hard evidence. +If a field lacks evidence, write: `INSUFFICIENT EVIDENCE — omitted`. +Do not fabricate content to fill the structure. + +``` +═══════════════════════════════════════════════════ +COPILOT CUSTOMIZER — PHASE 1 PLAN +═══════════════════════════════════════════════════ + +REPO: [name — one sentence, from observed files only] +STACK: [observed languages/frameworks — from actual file presence only] + +TERMINAL ACCESS: [yes | no — affects evidence confidence] + +EXISTING CUSTOMIZATION FILES: + [path] — [what it covers, one line] + (none detected) if empty + +EVIDENCE COLLECTED: + [finding] — [exact source: file:line+snippet or command output excerpt] + [finding] — [INFERRED from context — not command-verified] if no terminal + +PROPOSED CHANGES: + [path] — [CREATE | IMPROVE | DELETE] + └─ Type: [repo-wide | path-specific | skill | agent | AGENTS.md] + └─ Evidence: [exact citation] + └─ Gap: [what Copilot lacks that this fixes] + └─ Authority level: [1–5 per conflict resolution table] + └─ Stale risk: [low | medium | high — why] + └─ Cost justification: [why value > maintenance cost] + └─ Conflicts with existing: [none | resolution] + +WILL NOT CREATE: + [what] — [reason: no evidence | already covered | too generic | stale-prone] + +PRIORITY DECISIONS: + 1. Repo-wide instructions: [create/improve/skip — evidence or "no evidence"] + 2. AGENTS.md: [create/skip — evidence or "no evidence"] + 3. Path instructions: [create/skip — evidence or "no evidence"] + 4. Skills: [create/skip — evidence or "no evidence"] + 5. Custom agents: [create/skip — evidence or "no evidence"] + +═══════════════════════════════════════════════════ +WAITING FOR APPROVAL. +Reply "proceed" to execute, or give feedback to revise. +═══════════════════════════════════════════════════ +``` + +**STOP. Do not write any files until the user explicitly approves.** + +--- + +## Phase 2: Execution (Only After User Approval) + +### Step 2.1 — Path Verification + +Before writing any instruction that references a file path: + +**If terminal access is available:** +```bash +ls 2>/dev/null || echo "NOT FOUND" +find . -name "" 2>/dev/null | grep -v ".git" +``` + +**Else (no terminal):** +Search visible context for the path. If not found in context: +- Do NOT include the path in output +- Mark as: `UNVERIFIED PATH — needs user confirmation` + +If path cannot be verified by either method: +- Do NOT write the instruction +- Flag it to the user: "Cannot verify [path] — please confirm it exists" + +### Step 2.2 — Write Files + +For every rule before writing it, pass these three gates: + +**Evidence gate:** +> Can I cite the exact source (file:line or command) that proves this rule is needed? +> No → do not write the rule. + +**Testability gate:** +> Can I describe a concrete code change that would violate this rule in a detectable way? +> No → do not write the instruction. + +**Staleness gate:** +> Is this rule likely to go stale within 3 months? +> Yes → move to a skill or omit. + +**Prohibited language gate:** +If any of these words appear in an instruction — delete the whole instruction: +`ensure`, `make sure`, `be careful`, `strive to`, `try to`, `consider`, +`follow best practices`, `good practice`, `recommended approach` + +#### File Formats + +**`.github/copilot-instructions.md`** — repo-wide, always active +- Prefer under 100 lines. If longer, justify why each section earns its place. +- Architectural invariants + coding standards + review priorities only. +- Do NOT include build/test commands here. + +**`.github/copilot/instructions/.md`** — path-scoped +``` +--- +applyTo: "src/storage/**" +--- +[rules — testable, evidence-backed, path-verified] +``` + +**`.github/skills//SKILL.md`** — conditional playbook + +Skills live in named directories. The file must be `SKILL.md`. +``` +NOT: .github/copilot/skills/name.md ← wrong +YES: .github/skills/name/SKILL.md ← correct +``` + +Create a skill only when ALL three are true: +- task is conditional (not always relevant) +- task has > 5 steps +- task is reused across multiple sessions + +**`.github/copilot/agents/.md` or `.github/agents/.agent.md`** + +Use `chatagent` fence. YAML lives inside the fence, not at file root: +``` +\`\`\`chatagent +--- +name: +description: >- + One sentence. +tools: + - read + - editFiles + - runTerminal +--- +# Role: ... +\`\`\` +``` + +Create an agent only when ALL three are true: +- requires persistent persona across a multi-turn session +- requires tool orchestration (not just passive instructions) +- role is meaningfully distinct from repo-wide instructions + +### Step 2.3 — Conflict Check + +``` +AUTHORITY: Each rule lives in exactly one file at the right authority level +OVERLAP: No rule appears in more than one file +ACCURACY: Every path verified (or explicitly flagged as unverified) + No fabricated file paths, line numbers, or snippets +TESTABILITY: Every instruction can be violated in a detectable way +LANGUAGE: No prohibited language present +``` + +### Step 2.4 — Commit or Report + +**If git is available:** +```bash +git add .github/copilot-instructions.md \ + .github/copilot/instructions/ \ + .github/skills/ \ + .github/copilot/agents/ \ + .github/agents/ \ + AGENTS.md 2>/dev/null; true +git status +git commit -m "feat(copilot): add evidence-based Copilot customizations + +Each file justified by direct repository evidence. +Conflict-checked against existing customization files." +``` + +**If git is NOT available:** +``` +COMMIT PLAN: + Files to stage: [list each file] + Suggested message: feat(copilot): [what changed and why] + Open concerns before committing: [any unverified paths or uncertain rules] +``` + +--- + +## Phase 3: Post-Run Reflection (Required on Every Run) + +``` +POST-RUN REFLECTION +═══════════════════ +TERMINAL ACCESS: [yes | no — affected evidence confidence how?] + +MISTAKES MADE: + [any path cited that turned out wrong] + [any rule removed due to insufficient evidence] + [any fabrication caught and corrected] + (none) if clean + +INFERENCES MADE (no-terminal runs): + [every claim based on context inference rather than command execution] + [confidence level: high | medium | low] + +OVER-ENGINEERING AVOIDED: + [what was considered but cut — too generic, stale-prone, no evidence, cost > value] + +STALE RISK IDENTIFIED: + [which generated rules are most likely to go stale, and why] + +AGENT SPEC IMPROVEMENTS NEEDED: + [what in this agent's instructions caused confusion or poor output] + [what should change in the next version] + (none) if clean +``` + +--- + +## Hard Constraints + +- Never begin Phase 2 without explicit user approval of the Phase 1 plan +- Never cite a path without verifying it — or marking it UNVERIFIED +- Never fabricate file paths, line numbers, or code snippets +- Never present an inference as a verified fact +- Never duplicate a rule that exists at a higher authority level +- Never write an untestable instruction +- Never create files when no material gap exists +- Never skip the post-run reflection +``` diff --git a/.github/copilot/instructions/analyzers-generators.md b/.github/copilot/instructions/analyzers-generators.md new file mode 100644 index 00000000..9e321add --- /dev/null +++ b/.github/copilot/instructions/analyzers-generators.md @@ -0,0 +1,28 @@ +--- +applyTo: "src/Analyzers/**,src/Generators/**,test/Analyzers.Tests/**,test/Generators.Tests/**" +--- +# Roslyn Analyzers and Source Generators + +## Runtime vs. Compile-Time Boundary (production code only) + +Code in `src/Analyzers/` runs at **compile time only** inside the compiler process. Code in `src/Generators/` runs at **compile time only** to emit new C# source files. The constraints below apply to these production projects, not to their test projects. + +- Do not reference any NuGet package that has a runtime dependency (e.g., `Microsoft.Extensions.*`, gRPC libs). Analyzer/generator projects may only reference `Microsoft.CodeAnalysis.*` packages. +- Do not use `System.Reflection` APIs — use Roslyn symbol APIs (`INamedTypeSymbol`, `IMethodSymbol`, etc.) instead. +- Generator output must reproduce the same source given the same input — generators must be **deterministic and idempotent**. + +## Interface Consistency Constraint + +`TaskActivityContext` (in `src/Abstractions/`) is implemented by the source generator output. If you change any `abstract` member on `TaskActivityContext` or `TaskOrchestrationContext`, you must also update the generator templates in `src/Generators/` to match. The `// IMPORTANT` comments in those files mark the exact coupling points. + +## Testing Analyzers + +- Analyzer tests use `Microsoft.CodeAnalysis.CSharp.Analyzer.Testing` — do not use xUnit directly without that infrastructure. +- Each analyzer test must provide the exact source snippet that triggers the diagnostic and the expected `DiagnosticResult` with descriptor ID, location, and message. +- Verify both the diagnostic fires on bad code and does NOT fire on correct code (false positive check). + +## Testing Generators + +- Generator tests use `Microsoft.CodeAnalysis.CSharp.SourceGenerators.Testing` — use `GeneratorDriver` or the test helper wrappers in `test/Generators.Tests/`. +- Compare generated output with explicit expected source strings (inline in the tests or via shared helpers). When changing generator templates, update all affected expected outputs or helper expectations. +- Do not emit `#pragma warning disable` in generated output without explicit justification. diff --git a/.github/copilot/instructions/grpc-worker.md b/.github/copilot/instructions/grpc-worker.md new file mode 100644 index 00000000..13e248c7 --- /dev/null +++ b/.github/copilot/instructions/grpc-worker.md @@ -0,0 +1,42 @@ +--- +applyTo: "src/Grpc/**,src/Worker/**,src/InProcessTestHost/**,test/Grpc.IntegrationTests/**" +--- +# gRPC Worker and Integration Tests + +## Wire Compatibility Rules + +`src/Grpc/orchestrator_service.proto` defines the wire contract between the SDK and the sidecar process. + +- Do not remove, rename, or renumber any existing proto field or RPC. Field numbers are permanent. +- Adding new optional fields is backward-compatible. Adding required fields or changing field types is not. +- When updating the proto from the upstream `microsoft/durabletask-protobuf` repository, run `src/Grpc/refresh-protos.ps1` to pull the latest proto version. C# stubs are generated at build time by `Grpc.Tools` — not committed to source. +- Proto-consumer code in `src/Shared/Grpc/ProtoUtils.cs`, `src/Client/Grpc/ProtoUtils.cs`, and `src/Worker/Grpc/` must be updated to handle any new fields added. + +## Worker Dispatch Loop + +`src/Worker/Core/` contains the orchestration and activity dispatch loop. This code runs under concurrent load. + +- The `IEnumerable` returned by the orchestrator engine may be lazily evaluated — enumerate it exactly once (see `// IMPORTANT` in `src/InProcessTestHost/Sidecar/Dispatcher/TaskOrchestrationDispatcher.cs`). +- The receive loop in `src/InProcessTestHost/Sidecar/Dispatcher/WorkItemDispatcher.cs` assumes a single logical thread — do not introduce concurrent access to its internal state without an explicit concurrency mechanism. +- Changing `DurableTaskWorkerOptions.DataConverter`, `DurableTaskWorkerOptions.Versioning.DefaultVersion`, or `DurableTaskClientOptions.DefaultVersion` is a breaking change for in-flight orchestrations. Add a `// WARNING` comment and update the XML doc if you touch those properties. + +## Writing Integration Tests + +Integration tests in `test/Grpc.IntegrationTests/` require a gRPC sidecar. Follow this pattern: + +1. Inherit from `IntegrationTestBase` — it manages the `GrpcSidecarFixture` lifecycle. +2. Register orchestrators and activities using `StartWorkerAsync(b => b.AddTasks(...))`. +3. Schedule orchestrations via the injected `DurableTaskClient` from the fixture. +4. Await completion with a timeout by using a `CancellationTokenSource`, for example: + + ```csharp + using CancellationTokenSource cts = new CancellationTokenSource(timeout); + OrchestrationMetadata result = await client.WaitForInstanceCompletionAsync( + instanceId, + getInputsAndOutputs: true, + cancellationToken: cts.Token); + ``` + +5. Do not use `Task.Delay` to wait for orchestration completion when `WaitForInstanceCompletionAsync` or external events are available; if you must poll, use bounded `Task.Delay`-based polling with clear timeouts. + +Add integration tests (not just unit tests) when the behavior change touches the gRPC dispatch path, retry logic, or cancellation handling. diff --git a/.github/skills/breaking-change-check/SKILL.md b/.github/skills/breaking-change-check/SKILL.md new file mode 100644 index 00000000..816587c1 --- /dev/null +++ b/.github/skills/breaking-change-check/SKILL.md @@ -0,0 +1,77 @@ +--- +name: breaking-change-check +description: Use when adding, removing, or changing any public API surface in this repo (method signatures, class members, interface members, enum values, constructor parameters, or serialization behavior). Guides a systematic backward-compatibility check before committing. +--- + +# Breaking Change Safety Check + +## When to Use + +Invoke this skill whenever you are about to: +- Remove or rename a `public` or `protected` method, property, class, or interface +- Change a method signature (parameter types, return type, parameter count, optionality) +- Change a public `enum` value or add a new value to a `[Flags]` enum +- Change a default value that affects serialization or wire behavior +- Modify `TaskOrchestrationContext`, `TaskActivityContext`, `DurableTaskClientOptions`, or `DurableTaskWorkerOptions` + +## Step 1 — Identify the Changed API Surface + +Read the file containing the change. List every `public` or `protected` member being modified. + +For each, determine: is this member shipped in a `Microsoft.DurableTask.*` NuGet package? +To decide this, inspect `src//.csproj` and our build/CI configuration: look for NuGet packaging metadata (such as ``, `IsPackable`, `GeneratePackageOnBuild`, or inclusion in a packing target or release artifact). If the project is packed into a `Microsoft.DurableTask.*` NuGet, assume customers may depend on it. + +## Step 2 — Verify All Callers Inside This Repo + +For each changed member, search the full solution before modifying the signature: + +```bash +git grep -n "MemberName" -- '*.cs' +``` + +Also search `test/` and `samples/` — these are first-party consumers that must be updated alongside the change. + +## Step 3 — Check Orchestration Replay Safety + +If the changed API is callable from inside an orchestrator (anything reachable from `TaskOrchestrationContext`): +- Would the new behavior produce a different result when replaying existing orchestration history? +- If yes: the change is NOT backward-compatible. It requires a new orchestration version (via `TaskVersion`) or an explicit feature flag. Do not proceed without documenting this. + +## Step 4 — Check Serialization Compatibility + +If the change affects a type serialized to/from the gRPC wire format or the JSON data converter: +- Adding new optional properties with sensible defaults is generally backward-compatible. +- Removing properties, renaming them, or changing their serialized representation is breaking. +- Inspect `src/Abstractions/Converters/JsonDataConverter.cs` for any serializer configuration that applies. + +## Step 5 — Classify the Breaking Change + +Reference: https://github.com/dotnet/runtime/blob/main/docs/coding-guidelines/breaking-change-rules.md + +Classify as one of: +- **Binary breaking** — existing compiled assemblies will fail to load against the new version +- **Source breaking** — existing source code will fail to compile against the new version +- **Behavioral breaking** — existing code compiles and loads but produces different results + +Binary and source breaking changes require explicit maintainer approval (noted in PR summary or linked issue). +Behavioral breaking changes on serialization or wire paths require a documented migration plan. + +## Step 6 — Document in the PR + +If the change is breaking (any category), add this block to the PR description: + +``` +## Breaking Change +Type: [binary | source | behavioral] +Impact: [what breaks and who is affected] +Migration: [what callers must do to upgrade to this version] +``` + +If not breaking, state explicitly: "No breaking change — [reason]." + +## Common Mistakes in This Repo + +- Changing a `DataConverter` default in `DurableTaskWorkerOptions` without noting it affects in-flight orchestrations +- Modifying the `TaskActivityContext` abstract interface without updating the source generator output in `src/Generators/` +- Adding a required constructor parameter to a type registered via DI (breaks existing DI registrations) +- Changing a proto field number or removing a proto field (breaks wire compatibility — field numbers are permanent)