diff --git a/README.md b/README.md index 507b59f..d262f00 100644 --- a/README.md +++ b/README.md @@ -28,7 +28,6 @@ Public background documents: - [Why CodeGate Exists](docs/why-codegate.md) - [Public Evidence Map](docs/public-evidence-map.md) -- [Feature Evidence Ledger](docs/feature-evidence-ledger.md) ## What CodeGate Is @@ -176,9 +175,6 @@ Current checks include: - Dependabot cooldown and execution-risk checks - Workflow hygiene checks (concurrency gates, obfuscation, unsafe conditional trust) -Track the current workflow-audit coverage and backlog in the [workflow audit parity checklist](docs/workflow-audit-parity-checklist.md). -Real public validation fixtures and source provenance are documented in [workflow audit real-case corpus](docs/workflow-audit-real-cases.md). - Examples: ```bash diff --git a/docs/feature-evidence-ledger.md b/docs/feature-evidence-ledger.md deleted file mode 100644 index f65f613..0000000 --- a/docs/feature-evidence-ledger.md +++ /dev/null @@ -1,98 +0,0 @@ -# CodeGate Feature Evidence Ledger - -Last updated: 2026-03-22 - -## Purpose - -This document tracks which CodeGate feature families are worth carrying forward, how complete they are in the current product, what public evidence supports them, and whether we have already validated them against temp-only real-world samples. - -Scale used in this ledger: - -- **Evidence strength** - - `Strong`: direct public incidents, advisories, or live marketplace samples - - `Moderate`: strong analog evidence or official guidance, but limited direct public artifacts - - `Weak`: mostly preventive/product-hardening value, limited public incident evidence -- **Status** - - `Mature`: shipped and validated on real or representative cases - - `Implemented`: shipped, but real-world validation is still limited - - `Partial`: some behavior exists, but coverage or confidence is incomplete - -## Feature Families - -| Feature family | What CodeGate does | Status | Evidence strength | Public evidence | Real temp-only validation | Recommendation | -| ----------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------: | ----------------: | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------- | -| Cross-tool config discovery | Finds project and user-scope config/instruction/plugin surfaces across Claude, Codex, Cursor, Windsurf, Kiro, Cline, Roo, Zed, Gemini CLI, Copilot, and Junie. | Implemented | Moderate | Supported by the attack surfaces described in [Claude Code project-file research](https://research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/), [Codex CLI project-local config RCE](https://research.checkpoint.com/2025/openai-codex-cli-command-injection-vulnerability/), [Cursor MCPoison](https://research.checkpoint.com/2025/cursor-vulnerability-mcpoison/), and [AWS Kiro / Amazon Q bulletin](https://aws.amazon.com/security/security-bulletins/AWS-2025-019/). | Indirectly validated by scanning real samples from public skill repos. | Keep | -| Environment override detection | Flags hostile endpoint/base-URL and env redirection settings in project/user configs. | Mature | Strong | [Check Point on Claude Code `ANTHROPIC_BASE_URL` exfiltration](https://research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/) directly justifies this family. | Supported by existing fixtures; no new public temp sample rerun needed. | Keep | -| Command-surface detection | Flags executable commands in MCP configs, hooks, workflows, object templates, and markdown execute blocks. | Mature | Strong | [Codex CLI command injection](https://research.checkpoint.com/2025/openai-codex-cli-command-injection-vulnerability/), [Claude Code project-file RCE](https://research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/), and [AWS-2025-019](https://aws.amazon.com/security/security-bulletins/AWS-2025-019/) all depend on committed command surfaces. | Validated on live `security-review` and `frankenphp` skill files. | Keep | -| Consent-bypass / auto-approval detection | Flags `alwaysAllow`, `autoApprove`, `yolo`, enterprise bypass flags, and remote-MCP policies that suppress review or HITL. | Mature | Strong | [Cursor MCPoison](https://research.checkpoint.com/2025/cursor-vulnerability-mcpoison/) and [AWS-2025-019](https://aws.amazon.com/security/security-bulletins/AWS-2025-019/) both show trust/confirmation bypass as a primary failure mode. | Validated via fixtures; public config repros should be added to the evidence queue. | Keep | -| Rule / skill maliciousness detection | Flags hidden payloads, override language, hidden Unicode, remote-shell instructions, session-transfer patterns, and similar hostile content in `SKILL.md` / rule markdown. | Mature | Strong | [Snyk ToxicSkills](https://snyk.io/blog/toxicskills-malicious-ai-agent-skills-clawhub/), [Snyk Clawdhub campaign](https://snyk.io/articles/clawdhub-malicious-campaign-ai-agent-skills/), and [Snyk skill threat model](https://snyk.io/jp/articles/skill-md-shell-access/) strongly justify this family. | Validated on live `security-review`, `frankenphp`, `browser-use`, and `remote-browser` samples. | Keep and strengthen | -| Session / cookie / profile transfer detection | Flags cookie export/import, session sharing, profile sync, real-browser reuse, and public tunnel patterns. | Mature | Strong | Live public skills such as [`browser-use`](https://github.com/browser-use/browser-use/blob/main/skills/browser-use/SKILL.md), [`remote-browser`](https://github.com/browser-use/browser-use/blob/main/skills/remote-browser/SKILL.md), and [`kernel-agent-browser`](https://github.com/kernel/skills/blob/main/plugins/kernel-cli/skills/kernel-agent-browser/SKILL.md) provide direct evidence. | Validated on all three public samples above. | Keep and strengthen | -| Bootstrap control-point detection | Flags skills that bootstrap global/latest tools, write `.claude` hooks/settings/agents or `CLAUDE.md`, and require restart to activate the new control points. | Mature | Strong | The public [`create-beads-orchestration`](https://github.com/AvivK5498/The-Claude-Protocol/blob/main/skills/create-beads-orchestration/SKILL.md) skill demonstrates this pattern directly. | Validated against the live public `create-beads-orchestration` sample after the March 7 hardening pass. | Keep and strengthen | -| IDE / workspace security settings detection | Flags risky workspace settings and AI-tool settings that turn committed repo files into execution/config vectors. | Implemented | Strong | [VS Code Workspace Trust](https://code.visualstudio.com/docs/editing/workspaces/workspace-trust) explicitly warns that tasks, debugging, workspace settings, extensions, and AI agents can execute code from unfamiliar workspaces. | Indirect validation via discovery and fixture coverage. | Keep | -| Git hook detection | Flags suspicious repo hooks and supports allowlisting known-safe hooks. | Implemented | Strong | [Git hooks docs](https://git-scm.com/docs/githooks.html) confirm hooks are executable programs triggered by Git events; [CrowdStrike on CVE-2025-48384](https://www.crowdstrike.com/en-us/blog/crowdstrike-falcon-blocks-git-vulnerability-cve-2025-48384/) shows malicious hook placement via Git write primitives is a real attack path; [Claude Code project-file RCE](https://research.checkpoint.com/2026/rce-and-api-token-exfiltration-through-claude-code-project-files-cve-2025-59536/) also highlighted hooks. | Validated by fixtures; public malicious repo examples should be added to the evidence queue. | Keep | -| Symlink escape detection | Flags symlinks from repo-controlled surfaces into sensitive local files or system paths. | Implemented | Moderate | [CrowdStrike on CVE-2025-48384](https://www.crowdstrike.com/en-us/blog/crowdstrike-falcon-blocks-git-vulnerability-cve-2025-48384/) is strong analog evidence for repo-controlled file redirection; this feature is mostly preventative hardening. | Validated by fixtures only. | Keep | -| Plugin / extension manifest provenance and integrity checks | Checks source URLs, local paths, install scripts, permissions, provenance, publisher identity, signatures, attestation, transparency, version pinning, and marketplace/domain consistency. | Implemented | Strong | [Eclipse Open VSX advisory](https://blogs.eclipse.org/post/mika%C3%ABl-barbero/eclipse-open-vsx-registry-security-advisory), [Open VSX October 2025 follow-up](https://blogs.eclipse.org/post/mika%C3%ABl-barbero/open-vsx-security-update-october-2025), [JFrog on compromised Amazon Q VS Code extension](https://research.jfrog.com/post/amazon-q-vs-code-extension-compromised-with-malicious-code/), and [ReversingLabs on malicious VS Code extensions](https://www.reversinglabs.com/blog/malicious-vs-code-fake-image) directly support this family. | Not yet revalidated with fresh live manifests in the current audit pass. | Keep and strengthen | -| Marketplace provenance / signature / attestation policy | Flags missing digest/signature/attestation metadata, issuer trust-anchor problems, transparency proof failures, bypass flags, and unstable release channels. | Implemented | Strong | Same extension-registry incidents above, especially [Open VSX](https://blogs.eclipse.org/post/mika%C3%ABl-barbero/eclipse-open-vsx-registry-security-advisory) and [JFrog’s Amazon Q extension compromise](https://research.jfrog.com/post/amazon-q-vs-code-extension-compromised-with-malicious-code/), justify stronger marketplace integrity controls. | Not yet revalidated with temp-only live manifest pulls. | Keep and strengthen | -| MCP rug-pull detection | Hashes MCP configs and reports `NEW_SERVER` / `CONFIG_CHANGE` across scans. | Mature | Strong | [Cursor MCPoison](https://research.checkpoint.com/2025/cursor-vulnerability-mcpoison/) is direct evidence that a previously trusted MCP entry can change behavior silently; [Invariant tool poisoning](https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks) and [Snyk MCP research](https://snyk.io/articles/mcp-security-research-brief-securing-tools-skill-execution/) reinforce the need for change tracking. | Validated by tests and product hardening work; not a live-sample feature in isolation. | Keep | -| MCP tool-poisoning detection | Uses deep scan to analyze remote tool descriptions for hidden or agent-visible malicious instructions. | Implemented | Strong | [Invariant Tool Poisoning Attacks](https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks) is direct evidence for this feature. | Validated via deterministic tests and deep-scan integration; public malicious MCP descriptions should be added to the live evidence queue. | Keep | -| Toxic-flow detection | Looks for compound attack paths where one tool/resource poisons or redirects another. | Implemented | Strong | [Invariant Toxic Flow Analysis](https://invariantlabs.ai/blog/toxic-flow-analysis) and [GitHub MCP exploit](https://invariantlabs.ai/blog/mcp-github-vulnerability) directly justify this family. | Validated in tests; live public validation still limited. | Keep and strengthen | -| Remote MCP domain / header governance | Flags non-allowlisted domains, credential-bearing headers, routing overrides, and risky remote-MCP policy combinations. | Implemented | Moderate | [AWS-2025-019](https://aws.amazon.com/security/security-bulletins/AWS-2025-019/) and [MSRC variant-hunting research](https://www.microsoft.com/en-us/msrc/blog/2025/11/msrc-variant-hunting-from-multi-tenant-authorization-to-model-context-protocol/) provide analog evidence for authorization and trust-boundary failures, but public artifact quality is weaker here. | Mostly validated by fixtures today. | Keep, but tighten scope before expanding | -| Safe local text analysis | Uses tool-less Claude to analyze local instruction files as inert text without fetching URLs or executing commands. | Partial | Strong | Strongly justified by the live public skills above, because static heuristics alone missed some cases until this path was added. | Validated on `security-review`, `browser-use`, `remote-browser`, and `create-beads-orchestration`. | Keep and expand carefully | -| Deep scan of remote resources | Fetches approved remote MCP/package metadata and runs model-assisted analysis on the fetched text/metadata. | Implemented | Strong | [Invariant Tool Poisoning](https://invariantlabs.ai/blog/mcp-security-notification-tool-poisoning-attacks), [GitHub MCP exploit](https://invariantlabs.ai/blog/mcp-github-vulnerability), and [Snyk MCP research](https://snyk.io/articles/mcp-security-research-brief-securing-tools-skill-execution/) all justify analyzing remote tool metadata. | Validated, but many public targets fail with auth/network issues before analysis. | Keep and strengthen | -| Wrapper mode / TOCTOU recheck | Scans before launch, blocks dangerous launches, and rechecks the scanned local config surface immediately before starting the tool. | Mature | Weak | This is mainly a product-control safeguard rather than a public incident-driven feature, though it directly mitigates the trust-drift pattern seen in [Cursor MCPoison](https://research.checkpoint.com/2025/cursor-vulnerability-mcpoison/). | Validated by CLI tests and recent product hardening work. | Keep | -| Remediation and undo | Applies guided or safe fixes, creates backups, and restores with hash-verified undo. | Mature | Weak | Product-value feature; not threat-evidence driven. It matters because the scanner is intended to block and help users recover safely, not just detect. | Validated by Layer 4 and CLI tests. | Keep | -| Reporting and policy controls | Provides terminal/JSON/SARIF/Markdown/HTML output, granular suppression, rule-pack filtering, OWASP shaping, allowlists, trusted directories, and domain policy overrides. | Mature | Weak | Product/usability feature, justified by operator workflow rather than external incidents. | Validated by tests and recent hardening work. | Keep | - -## Current Product Read - -### Tier A: strongest value, should remain central - -- Rule / skill maliciousness detection -- Command-surface detection -- Consent-bypass detection -- Environment override detection -- Plugin / marketplace provenance and integrity checks -- MCP rug-pull detection -- MCP tool-poisoning and toxic-flow analysis -- Session / cookie / profile transfer detection -- Bootstrap control-point detection - -### Tier B: valuable but still needs broader live-sample validation - -- IDE / workspace security settings detection -- Git hook detection -- Symlink escape detection -- Deep scan of remote resources -- Remote MCP domain / header governance - -### Tier C: product-control families, keep for operator value - -- Wrapper mode / TOCTOU recheck -- Remediation and undo -- Reporting and policy controls - -## Recommended Next Validation Queue - -1. **Plugin / marketplace provenance and integrity** - - Pull fresh public extension manifests and registry metadata into temp projects. - - Confirm CodeGate flags missing provenance / attestation / signature controls on real samples. - -2. **MCP poisoning / toxic-flow** - - Find public MCP metadata or description examples that exercise hidden-instruction and cross-tool shadowing patterns. - - Validate CodeGate’s deep-scan findings against those artifacts. - -3. **Consent-bypass / enterprise policy surfaces** - - Collect real public configs from Kiro, Cline, or Amazon Q ecosystems that demonstrate auto-approval or HITL-bypass patterns. - -4. **Git hooks / symlink escape** - - Find safe public repro repositories or advisories with minimal PoCs and stage only the relevant files in temp projects. - -5. **Remote MCP domain / header governance** - - Reassess whether the current granularity is worth the maintenance cost if strong public examples remain sparse. - -## Provisional Product Decisions - -- **Keep as-is:** cross-tool discovery, env overrides, command surfaces, consent bypass, MCP rug-pull detection, wrapper TOCTOU, remediation/undo, reporting/policy controls. -- **Keep and strengthen:** rule/skill maliciousness, session transfer, bootstrap control points, plugin marketplace provenance/integrity, MCP poisoning, toxic flow, safe local text analysis, deep remote-resource scan. -- **Keep but narrow/watch carefully:** remote MCP domain/header governance. -- **No immediate candidate for removal** surfaced in this audit, but the remote MCP header/domain policy family is the closest area to re-scope if evidence remains mostly analog rather than direct. diff --git a/docs/public-evidence-map.md b/docs/public-evidence-map.md index c2db7cf..afb48e9 100644 --- a/docs/public-evidence-map.md +++ b/docs/public-evidence-map.md @@ -2,8 +2,6 @@ This document summarizes public incident patterns that motivated CodeGate and maps them to the defensive checks CodeGate provides. -For the full feature-by-feature ledger with status and links, see [feature-evidence-ledger.md](./feature-evidence-ledger.md). - ## Evidence Themes 1. Repository files can become execution paths. diff --git a/docs/workflow-audit-parity-checklist.md b/docs/workflow-audit-parity-checklist.md deleted file mode 100644 index 53babc4..0000000 --- a/docs/workflow-audit-parity-checklist.md +++ /dev/null @@ -1,69 +0,0 @@ -# CodeGate Workflow Audit Parity Checklist - -Use this checklist to track the workflow-audit detectors implemented in CodeGate and the backlog that remains. - -## Wave A - -- [x] `dangerous-triggers` -- [x] `excessive-permissions` -- [x] `known-vulnerable-actions` -- [x] `template-injection` -- [x] `unpinned-uses` -- [x] `artipacked` -- [x] `cache-poisoning` -- [x] `github-env` -- [x] `insecure-commands` -- [x] `self-hosted-runner` -- [x] `overprovisioned-secrets` -- [x] `secrets-outside-env` -- [x] `secrets-inherit` -- [x] `use-trusted-publishing` -- [x] `undocumented-permissions` - -## Wave B - -- [x] `archived-uses` -- [x] `stale-action-refs` -- [x] `forbidden-uses` -- [x] `ref-confusion` -- [x] `ref-version-mismatch` -- [x] `impostor-commit` -- [x] `unpinned-images` - -## Wave C - -- [x] `anonymous-definition` -- [x] `concurrency-limits` -- [x] `superfluous-actions` -- [x] `misfeature` -- [x] `obfuscation` -- [x] `unsound-condition` -- [x] `unsound-contains` - -## Wave D - -- [x] `dependabot-cooldown` -- [x] `dependabot-execution` - -## Wave E - -- [x] `hardcoded-container-credentials` -- [x] `unredacted-secrets` -- [x] `bot-conditions` - -## Wave F - -- [x] `workflow-call-boundary` -- [x] `workflow-artifact-trust-chain` -- [x] `workflow-oidc-untrusted-context` -- [x] `workflow-pr-target-checkout-head` -- [x] `workflow-dynamic-matrix-injection` -- [x] `workflow-secret-exfiltration` -- [x] `dependabot-auto-merge` -- [x] `workflow-local-action-mutation` - -## Notes - -- Checked items are implemented in CodeGate. -- Unchecked items remain in the backlog. -- The checklist is intentionally limited to CodeGate workflow-audit terminology. diff --git a/docs/workflow-audit-real-cases.md b/docs/workflow-audit-real-cases.md deleted file mode 100644 index ec479ac..0000000 --- a/docs/workflow-audit-real-cases.md +++ /dev/null @@ -1,94 +0,0 @@ -# Workflow Audit Real-Case Corpus - -This document tracks real public workflow/dependabot examples used to validate workflow-audit detections locally. - -## Local Corpus - -Root: - -- `test-fixtures/workflow-audits/real-cases/` -- `test-fixtures/workflow-audits/real-cases/index.json` - -Each fixture is commit-pinned to keep source provenance stable. - -## Cases - -1. `RC-01-bot-conditions` - -- Expected rule: `bot-conditions` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-01-bot-conditions/.github/workflows/claude-dependabot.yml` - -2. `RC-02-obfuscation` - -- Expected rule: `workflow-obfuscation` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-02-obfuscation/.github/workflows/pipeline-electron-lint.yml` - -3. `RC-03-concurrency-limits` - -- Expected rule: `workflow-concurrency-limits` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-03-concurrency-limits/.github/workflows/label.yml` - -4. `RC-04-dependabot-execution` - -- Expected rule: `dependabot-execution` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-04-dependabot-execution/.github/dependabot.yml` - -5. `RC-05-workflow-pr-target-checkout-head` - -- Expected rule: `workflow-pr-target-checkout-head` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-05-workflow-pr-target-checkout-head/.github/workflows/tests.yml` - -6. `RC-06-workflow-artifact-trust-chain` - -- Expected rule: `workflow-artifact-trust-chain` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-06-workflow-artifact-trust-chain/.github/workflows/runtime_build_and_test.yml` - -7. `RC-07-workflow-call-boundary` - -- Expected rule: `workflow-call-boundary` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-07-workflow-call-boundary/.github/workflows/daily.yml` - -8. `RC-08-workflow-secret-exfiltration` - -- Expected rule: `workflow-secret-exfiltration` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-08-workflow-secret-exfiltration/.github/workflows/db-pro.yaml` - -9. `RC-09-workflow-oidc-untrusted-context` - -- Expected rule: `workflow-oidc-untrusted-context` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-09-workflow-oidc-untrusted-context/.github/workflows/frontend-lint.yml` - -10. `RC-10-dependabot-auto-merge` - -- Expected rule: `dependabot-auto-merge` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-10-dependabot-auto-merge/.github/workflows/dependabot-auto-merge.yml` - -11. `RC-11-workflow-local-action-mutation` - -- Expected rule: `workflow-local-action-mutation` -- Source: -- Local file: `test-fixtures/workflow-audits/real-cases/RC-11-workflow-local-action-mutation/.github/workflows/frontend-lint.yml` - -## Validation - -Run targeted test: - -```bash -npm test -- tests/layer2/workflow-real-cases.test.ts -``` - -Run CLI manually: - -```bash -codegate scan test-fixtures/workflow-audits/real-cases/RC-06-workflow-artifact-trust-chain --workflow-audits --no-tui --format json -``` diff --git a/tests/meta/workflow-audit-parity-contract.test.ts b/tests/meta/workflow-audit-parity-contract.test.ts deleted file mode 100644 index c199682..0000000 --- a/tests/meta/workflow-audit-parity-contract.test.ts +++ /dev/null @@ -1,69 +0,0 @@ -import { existsSync, readFileSync } from "node:fs"; -import { resolve } from "node:path"; -import { describe, expect, it } from "vitest"; - -const root = resolve(process.cwd()); -const checklistPath = resolve(root, "docs/workflow-audit-parity-checklist.md"); - -const expectedCheckedAuditIds = [ - "dangerous-triggers", - "excessive-permissions", - "known-vulnerable-actions", - "template-injection", - "unpinned-uses", - "artipacked", - "cache-poisoning", - "github-env", - "insecure-commands", - "self-hosted-runner", - "overprovisioned-secrets", - "secrets-outside-env", - "secrets-inherit", - "use-trusted-publishing", - "undocumented-permissions", - "archived-uses", - "stale-action-refs", - "forbidden-uses", - "ref-confusion", - "ref-version-mismatch", - "impostor-commit", - "unpinned-images", - "anonymous-definition", - "concurrency-limits", - "superfluous-actions", - "misfeature", - "obfuscation", - "unsound-condition", - "unsound-contains", - "dependabot-cooldown", - "dependabot-execution", - "hardcoded-container-credentials", - "unredacted-secrets", - "bot-conditions", - "workflow-call-boundary", - "workflow-artifact-trust-chain", - "workflow-pr-target-checkout-head", - "workflow-secret-exfiltration", - "workflow-oidc-untrusted-context", - "workflow-dynamic-matrix-injection", - "dependabot-auto-merge", - "workflow-local-action-mutation", -] as const; - -function readChecklist(): string { - return readFileSync(checklistPath, "utf8"); -} - -describe("workflow audit parity checklist contract", () => { - it("exists at the documented location", () => { - expect(existsSync(checklistPath)).toBe(true); - }); - - it("marks every currently implemented workflow audit id as checked", () => { - const checklist = readChecklist(); - - for (const auditId of expectedCheckedAuditIds) { - expect(checklist).toContain(`- [x] \`${auditId}\``); - } - }); -});