From 049300d7484023069213e1c1714a8a8e21616526 Mon Sep 17 00:00:00 2001
From: sc516
Date: Sat, 23 May 2026 20:27:05 +0900
Subject: [PATCH] Add production incident triage instructions

---
 docs/README.instructions.md                   |  1 +
 ...production-incident-triage.instructions.md | 95 +++++++++++++++++++
 2 files changed, 96 insertions(+)
 create mode 100644 instructions/production-incident-triage.instructions.md

diff --git a/docs/README.instructions.md b/docs/README.instructions.md
index 1211113f4..3ae3e28ae 100644
--- a/docs/README.instructions.md
+++ b/docs/README.instructions.md
@@ -162,6 +162,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-instructions) for guidelines on
 | [Power Platform MCP Custom Connector Development](../instructions/power-platform-mcp-development.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpower-platform-mcp-development.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpower-platform-mcp-development.instructions.md) | Instructions for developing Power Platform custom connectors with Model Context Protocol (MCP) integration for Microsoft Copilot Studio |
 | [PowerShell Cmdlet Development Guidelines](../instructions/powershell.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpowershell.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpowershell.instructions.md) | PowerShell cmdlet and scripting best practices based on Microsoft guidelines |
 | [PowerShell Pester v5 Testing Guidelines](../instructions/powershell-pester-5.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpowershell-pester-5.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpowershell-pester-5.instructions.md) | PowerShell Pester testing best practices based on Pester v5 conventions |
+| [Production Incident Triage Instructions](../instructions/production-incident-triage.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fproduction-incident-triage.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fproduction-incident-triage.instructions.md) | Evidence-first production incident triage instructions for GitHub Copilot, focused on separating observed facts, system layers, timelines, read-only checks, and safe next actions. |
 | [Project Context](../instructions/moodle.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fmoodle.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fmoodle.instructions.md) | Instructions for GitHub Copilot to generate code in a Moodle project context. |
 | [Python MCP Server Development](../instructions/python-mcp-server.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpython-mcp-server.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fpython-mcp-server.instructions.md) | Instructions for building Model Context Protocol (MCP) servers using the Python SDK |
 | [Quarkus](../instructions/quarkus.instructions.md)
[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fquarkus.instructions.md)
[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fquarkus.instructions.md) | Quarkus development standards and instructions |
diff --git a/instructions/production-incident-triage.instructions.md b/instructions/production-incident-triage.instructions.md
new file mode 100644
index 000000000..16728e16f
--- /dev/null
+++ b/instructions/production-incident-triage.instructions.md
@@ -0,0 +1,95 @@
+---
+applyTo: '**'
+description: 'Evidence-first production incident triage instructions for GitHub Copilot, focused on separating observed facts, system layers, timelines, read-only checks, and safe next actions.'
+---
+
+# Production Incident Triage Instructions
+
+Use these instructions when helping with production incidents, broken deploys, customer-facing errors, degraded jobs, webhook failures, data mismatches, or urgent operational alerts.
+
+## Core Behavior
+
+- Work from evidence first. Separate observed facts from inferred causes.
+- Mark uncertain statements with `[Assumption]`.
+- Use absolute timestamps when available, including time zone.
+- Do not treat a recent deploy, admin action, or alert as the cause until the timeline proves it.
+- Prefer read-only checks before recommending data changes, restarts, rollbacks, or destructive commands.
+- Redact secrets, tokens, credentials, payment details, and private customer data. Describe only the location and remediation path.
+- Keep the next action small enough to increase certainty or reduce impact without widening blast radius.
+
+## Triage Flow
+
+1. Restate the symptom in plain language.
+2. Identify whether the incident is ongoing, intermittent, recovered, or unproven.
+3. Build a timeline from reports, logs, deploys, migrations, cron runs, webhook events, and monitoring alerts.
+4. Split the system path into layers:
+   - frontend reachability and rendered state
+   - API route, controller, resolver, or backend handler
+   - authentication and authorization contract
+   - database, cache, queue, or object storage state
+   - worker, cron, event bus, webhook, or provider callback
+   - deploy, runtime, network, CDN, or platform infrastructure
+5. For each plausible layer, name the fastest read-only proof:
+   - URL or route probe
+   - log query
+   - database select
+   - queue depth or job status
+   - health endpoint
+   - provider dashboard event
+   - commit, release, or deploy comparison
+6. Rank causes by evidence strength, not by convenience.
+7. Recommend one safe next action and explain what result would prove or disprove the current hypothesis.
+
+## Output Format
+
+Use this structure for incident responses:
+
+```markdown
+## Current Read
+
+- Definitely happening:
+- Not proven yet:
+- Current status:
+
+## Timeline
+
+| Time | Evidence | Meaning |
+|---|---|---|
+
+## Layer Checks
+
+| Layer | Read-only check | Evidence needed |
+|---|---|---|
+
+## Likely Causes
+
+1. Cause:
+   Evidence:
+   Why it might be wrong:
+
+## Safe Next Action
+
+- Action:
+- Expected proof:
+- Rollback or stop condition:
+```
+
+## Money, Auth, and Production Data
+
+For incidents involving balances, invoices, payments, authentication, authorization, or customer data:
+
+- Require read-only proof before mutation.
+- Identify the single source of truth for the value or permission.
+- Compare frontend display, API response, and database state before naming a cause.
+- If correction is needed, propose the smallest scoped mutation, the audit log entry, and the post-verification query.
+- Do not suggest broad backfills or manual edits without a reversible plan and explicit verification.
+
+## Review Checklist
+
+Before finalizing an incident answer, verify:
+
+- The answer does not expose secrets or private data.
+- Each cause is tied to concrete evidence or marked as `[Assumption]`.
+- The timeline uses specific times rather than vague ordering.
+- At least one read-only check is provided for each major hypothesis.
+- The recommended next action is safer than a broad restart, rollback, or data mutation.
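
Three of the rules in the instructions file lend themselves to a small illustration: marking unsupported causes with `[Assumption]`, ranking causes by evidence, and writing timeline rows with absolute, zone-qualified times. The Python sketch below is one possible reading of those rules, not something the patch itself contains. Every name in it (`Cause`, `label_cause`, `rank_causes`, `timeline_row`) is hypothetical, the demo data is invented for illustration, and counting evidence items is only a crude stand-in for "evidence strength".

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Cause:
    """A candidate cause plus the concrete evidence (if any) supporting it."""
    description: str
    evidence: list[str] = field(default_factory=list)


def label_cause(cause: Cause) -> str:
    """Prefix a cause with `[Assumption]` when no evidence is attached."""
    if cause.evidence:
        return f"{cause.description} (evidence: {'; '.join(cause.evidence)})"
    return f"[Assumption] {cause.description}"


def rank_causes(causes: list[Cause]) -> list[Cause]:
    """Order causes by number of evidence items, most supported first.

    Assumption: item count approximates evidence strength. A real triage
    would weigh the quality of each item, not just how many there are.
    """
    return sorted(causes, key=lambda c: len(c.evidence), reverse=True)


def timeline_row(when: datetime, evidence: str, meaning: str) -> str:
    """Render one `| Time | Evidence | Meaning |` row with a UTC timestamp."""
    if when.tzinfo is None:
        # Naive datetimes are ambiguous, so reject them outright.
        raise ValueError("timeline entries need a time zone")
    stamp = when.astimezone(timezone.utc).isoformat(timespec="seconds")
    return f"| {stamp} | {evidence} | {meaning} |"


if __name__ == "__main__":
    # Illustrative data only; none of it comes from a real incident.
    jst = timezone(timedelta(hours=9))
    print(timeline_row(datetime(2026, 5, 23, 20, 27, 5, tzinfo=jst),
                       "deploy finished", "precedes first error report"))
    candidates = [
        Cause("recent deploy broke the webhook handler"),
        Cause("queue backlog", ["queue depth rising", "worker logs show timeouts"]),
    ]
    for cause in rank_causes(candidates):
        print(label_cause(cause))
```

Rejecting naive datetimes instead of guessing a zone keeps the timeline consistent with the "absolute timestamps, including time zone" rule. A team could equally choose to normalize to a local zone rather than UTC.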