fix(scanners): correct scanner exit-code handling and stop duplicate skip logs #156
Merged
…skip logs

Triage of failures seen running `codehub analyze`/`scan` on a foreign Python/uv project surfaced four real adapter bugs:

osv-scanner exit-code misclassification. osv-scanner v2 reserves exit 1-126 for findings, 127 for "general error", 128 for "no packages". The shared `invokeScanner` treated only 0/1 as clean, so a normal vulns-found run (exit 1) was flagged and a 127 was reported as a bare "exit code 127". Give osv its own exit-code interpreter (`osvExitAdvisory`), and drop the `--offline-vulnerabilities` default that, without a synced DB, made osv walk the tree then fail 127. Switch to the canonical `scan source` form (matches ci.yml). The "filesystem walk for root: /" line is osv's own internal log, not a `/` scan root; confirmed benign.

bandit exit-2 usage error. When bandit is installed without the `bandit[sarif]` extra, argparse rejects `-f sarif` (exit 2 + usage banner); it does NOT fall back to text. Detect that exact case and emit an actionable "install bandit[sarif]" advisory instead of the misleading "stdout was not valid JSON" note.

Duplicate skip lines. The runner routed `onWarn` to the `skipped` status AND re-emitted `result.skipped` on the terminal event, double-printing the same line (the two identical `pip-audit skipped: ...` lines) and labeling ran-but-nonzero advisories as "skipped". Add a `warn` status, route `onWarn` to it, and coalesce the terminal event so each note prints once. Surface `warn` distinctly in the CLI scan reporter.

scip-python mise-shim failure. A version-manager shim (mise/asdf) that resolves on PATH but has no version pinned exits non-zero before the real indexer runs. `runIndexer` threw on this, producing the alarming "python indexer scip-python exited 1". Detect the "No version is set for shim" pattern and return a graceful, actionable skip instead.

The dead-code ghost-community warning is working as designed (members are all literally `dead`, >=2 members): informational, not a bug; left as-is.
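A minimal sketch of what an osv-specific exit-code interpreter might look like, using only the exit-code ranges quoted in the commit message. The real `osvExitAdvisory` signature is not shown on this page, so the function name `interpretOsvExit`, the `OsvOutcome` type, and the advisory wording are all assumptions made for illustration.

```typescript
// Hypothetical result shape; the actual adapter types are not shown in this PR.
type OsvOutcome =
  | { kind: "clean" }
  | { kind: "findings" }
  | { kind: "advisory"; note: string };

// Maps an osv-scanner v2 exit code to an outcome, following the ranges
// quoted in the commit message (1-126 findings, 127 general error, 128 no packages).
function interpretOsvExit(code: number): OsvOutcome {
  if (code === 0) return { kind: "clean" };
  // The scan ran and reported findings: not an error for the adapter.
  if (code >= 1 && code <= 126) return { kind: "findings" };
  if (code === 127) {
    return { kind: "advisory", note: "osv-scanner general error; try `codehub db-sync`" };
  }
  if (code === 128) {
    return { kind: "advisory", note: "osv-scanner: no packages discovered" };
  }
  // Anything else (for example a signal-derived code) is reported verbatim.
  return { kind: "advisory", note: `osv-scanner exited with unexpected code ${code}` };
}
```

Keeping the mapping in a pure function like this makes each code easy to cover with a table-driven test, which appears to be what the "+7" scanner tests exercise.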
Tests: scanners 88 pass (+7), scip-ingest 66 pass (+4). pip-audit binary-missing remains a graceful exit-0 skip (user-env, install hint correct).
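The shim handling described above could be sketched roughly as follows. This is an illustration only: `classifyIndexerExit`, the `IndexerResult` type, and the hint text are hypothetical, and the only detail taken from the PR is the "No version is set for shim" error text.

```typescript
// Hypothetical result shape for an indexer run.
type IndexerResult =
  | { status: "ok" }
  | { status: "skipped"; reason: string };

// Error text quoted in this PR; other version managers may word it differently.
const SHIM_NO_VERSION = /No version is set for shim/i;

// Turns a version-manager shim failure into a graceful skip,
// while any other non-zero exit still throws.
function classifyIndexerExit(language: string, exitCode: number, stderr: string): IndexerResult {
  if (exitCode === 0) return { status: "ok" };
  if (SHIM_NO_VERSION.test(stderr)) {
    return {
      status: "skipped",
      // Assumed hint wording, not the message the real code prints.
      reason: `${language} skipped: version-manager shim has no version pinned`,
    };
  }
  // A genuine crash (for example a traceback) is still an error.
  throw new Error(`${language} indexer exited ${exitCode}`);
}
```

Matching on the stderr text rather than on the exit code alone keeps genuine indexer crashes loud, which is the behaviour the scip-ingest tests are described as checking.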
Merged
4 tasks
theagenticguy added a commit that referenced this pull request on May 29, 2026
…rict) (#159)

## Summary

Extends `codehub doctor` so an operator can verify a deployment's full **parse + index toolchain**, not just node/pnpm/native bindings. Closes the gap where `doctor` was silent about the two things most likely to make `analyze` quietly under-perform: missing vendored grammars and missing SCIP indexers.

Builds on the existing `doctor` framework (same `Check` interface, `ok/warn/fail`, exit 0/1/2). `@ladybugdb/core` and the scanner binaries were already covered; this adds the missing rows.

## New checks

**1. Vendored WASM grammars** (1 row)

Asserts all **16** blobs ship in `@opencodehub/ingestion`'s `vendor/wasms/` with valid `\0asm` magic. This mirrors the prepublish gate `verify-vendor-wasms.mjs`, but runs against the *installed* package so it validates a real deployment. **Always `fail` on absence/corruption** (never a soft skip: a shipped artifact being gone means parsing is broken).

**2. SCIP indexers** (1 row per language)

`typescript, python, go, rust, java, ruby, c/c++, c#, kotlin, cobol`. Probes `<bin> --version`, `~/.codehub/bin`, and JAR assets under `~/.codehub`. Hints route setup-installable indexers to `codehub setup --scip=<flag>` and system toolchains (go/rust/java SDKs) to the user's package manager.

## The `--strict` flag (skip = fail, opt-in)

By default an absent indexer is **`warn`**: the analyze pipeline skips an unavailable language gracefully (a Python-only box doesn't need `scip-go`), matching the lenient runtime behavior. `--strict` escalates every absent indexer to **`fail` (exit 2)** for release/CI gates. Vendored WASMs are `fail` in both modes.

```
codehub doctor           → scip-ruby absent = WARN (exit 1)
codehub doctor --strict  → scip-ruby absent = FAIL (exit 2)
```

This deliberately reconciles with #156: runtime stays lenient; the diagnostic gate can be strict.
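The warn-to-fail escalation and the 0/1/2 exit codes can be illustrated with a small sketch. The `CheckResult` shape, the `absentIndexer` marker, and the function names below are assumptions for the example; the real `Check` interface is not shown on this page.

```typescript
type Status = "ok" | "warn" | "fail";

interface CheckResult {
  name: string;
  status: Status;
  // Hypothetical marker: true when the row reports an absent SCIP indexer.
  absentIndexer?: boolean;
}

// Under --strict, an absent indexer that would be a warning becomes a failure.
// Other rows (including the always-fail WASM check) are left untouched.
function applyStrict(results: CheckResult[], strict: boolean): CheckResult[] {
  if (!strict) return results;
  return results.map((r): CheckResult =>
    r.absentIndexer === true && r.status === "warn" ? { ...r, status: "fail" } : r,
  );
}

// Exit code follows the worst status: 0 all ok, 1 any warn, 2 any fail.
function exitCodeFor(results: CheckResult[]): 0 | 1 | 2 {
  if (results.some((r) => r.status === "fail")) return 2;
  if (results.some((r) => r.status === "warn")) return 1;
  return 0;
}
```

Applying the escalation as a separate pass over the results leaves the individual checks unaware of the flag, which is one plausible way to keep the lenient runtime path and the strict gate sharing the same probes.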
## Implementation note

The vendor dir resolves via `import.meta.resolve("@opencodehub/ingestion")`, **not** `createRequire().resolve()`: the package's `exports` map declares only the ESM `import` condition, so the require form throws `ERR_PACKAGE_PATH_NOT_EXPORTED`. Caught during testing: the require path only "worked" locally via the monorepo fallback and would have falsely FAILed the WASM check in a real global `npm i -g` install. The `import.meta.resolve` path is verified to resolve with a bogus repoRoot (i.e. no monorepo).

## Verification

- `codehub doctor` → new rows render; exit 1 with absent indexers
- `codehub doctor --strict` → indexer rows FAIL, exit 2; OK indexers (go/rust here) stay OK; wasm stays OK
- **262/262 cli tests pass** (8 new: wasm ok/fail, indexer warn-vs-strict-fail, JAR-by-file resolution, exit-code escalation)
- typecheck + biome clean

## Test plan

- [x] vendored-wasms ok against real install; fail-capable (never warn)
- [x] indexer warn (default) → fail (--strict) for the same absence
- [x] JAR indexer (kotlin) resolves by file presence
- [x] runDoctor exit code: 1 lenient, 2 strict
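For the vendored-WASM check described under "New checks", the `\0asm` magic test amounts to comparing the first four bytes of each blob. The sketch below works on in-memory bytes so it stays self-contained; the function names and the `blobs` map are hypothetical, and the real check presumably reads the files from the resolved `vendor/wasms/` directory.

```typescript
// The WebAssembly binary magic: "\0asm".
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

function hasWasmMagic(bytes: Uint8Array): boolean {
  return bytes.length >= WASM_MAGIC.length && WASM_MAGIC.every((b, i) => bytes[i] === b);
}

// Reports "fail" if any expected blob is missing or does not start with the magic.
// There is deliberately no "warn" outcome, matching the never-soft-skip rule.
function checkVendoredWasms(
  expected: string[],
  blobs: Map<string, Uint8Array>,
): { status: "ok" | "fail"; problems: string[] } {
  const problems: string[] = [];
  for (const name of expected) {
    const bytes = blobs.get(name);
    if (bytes === undefined) problems.push(`${name}: missing`);
    else if (!hasWasmMagic(bytes)) problems.push(`${name}: bad magic`);
  }
  return { status: problems.length === 0 ? "ok" : "fail", problems };
}
```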
theagenticguy pushed a commit that referenced this pull request on May 29, 2026
🤖 Automated release via release-please

---

<details><summary>analysis: 0.3.2</summary>

## [0.3.2](analysis-v0.3.1...analysis-v0.3.2) (2026-05-29)

### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported node range ([#155](#155)) ([a723e53](a723e53))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.2
    * @opencodehub/wiki bumped to 0.2.2
</details>

<details><summary>cli: 0.5.4</summary>

## [0.5.4](cli-v0.5.3...cli-v0.5.4) (2026-05-29)

### Features

* **cli:** doctor checks vendored wasm grammars + scip indexers (--strict) ([#159](#159)) ([36a241e](36a241e))

### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported node range ([#155](#155)) ([a723e53](a723e53))
* **scanners:** correct scanner exit-code handling and stop duplicate skip logs ([#156](#156)) ([5d30eb4](5d30eb4))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/ingestion bumped to 0.4.4
    * @opencodehub/mcp bumped to 0.4.3
    * @opencodehub/pack bumped to 0.2.3
    * @opencodehub/scanners bumped to 0.2.1
    * @opencodehub/search bumped to 0.2.2
    * @opencodehub/storage bumped to 0.2.2
    * @opencodehub/wiki bumped to 0.2.2
</details>

<details><summary>cobol-proleap: 0.1.8</summary>

## [0.1.8](cobol-proleap-v0.1.7...cobol-proleap-v0.1.8) (2026-05-29)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/ingestion bumped to 0.4.4
</details>

<details><summary>ingestion: 0.4.4</summary>

## [0.4.4](ingestion-v0.4.3...ingestion-v0.4.4) (2026-05-29)

### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported node range ([#155](#155)) ([a723e53](a723e53))
* **ingestion:** vendor graphty Leiden to drop node-pty install fetch ([#157](#157)) ([790ca4e](790ca4e))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/scip-ingest bumped to 0.2.4
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>mcp: 0.4.3</summary>

## [0.4.3](mcp-v0.4.2...mcp-v0.4.3) (2026-05-29)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/pack bumped to 0.2.3
    * @opencodehub/scanners bumped to 0.2.1
    * @opencodehub/search bumped to 0.2.2
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>pack: 0.2.3</summary>

## [0.2.3](pack-v0.2.2...pack-v0.2.3) (2026-05-29)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/ingestion bumped to 0.4.4
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>scanners: 0.2.1</summary>

## [0.2.1](scanners-v0.2.0...scanners-v0.2.1) (2026-05-29)

### Bug Fixes

* **scanners:** correct scanner exit-code handling and stop duplicate skip logs ([#156](#156)) ([5d30eb4](5d30eb4))
</details>

<details><summary>scip-ingest: 0.2.4</summary>

## [0.2.4](scip-ingest-v0.2.3...scip-ingest-v0.2.4) (2026-05-29)

### Bug Fixes

* **scanners:** correct scanner exit-code handling and stop duplicate skip logs ([#156](#156)) ([5d30eb4](5d30eb4))
* **scip-ingest:** prepend ~/.codehub/bin to indexer spawn PATH ([#160](#160)) ([4418db9](4418db9))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
</details>

<details><summary>search: 0.2.2</summary>

## [0.2.2](search-v0.2.1...search-v0.2.2) (2026-05-29)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>storage: 0.2.2</summary>

## [0.2.2](storage-v0.2.1...storage-v0.2.2) (2026-05-29)

### Bug Fixes

* **storage:** retry transient lbug WAL→checkpoint race in bulkLoad ([#161](#161)) ([450714c](450714c))
</details>

<details><summary>wiki: 0.2.2</summary>

## [0.2.2](wiki-v0.2.1...wiki-v0.2.2) (2026-05-29)

### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported node range ([#155](#155)) ([a723e53](a723e53))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>root: 0.6.5</summary>

## [0.6.5](root-v0.6.4...root-v0.6.5) (2026-05-29)

### Features

* **cli:** doctor checks vendored wasm grammars + scip indexers (--strict) ([#159](#159)) ([36a241e](36a241e))

### Bug Fixes

* **ci:** isolate verify-global-install into a per-run npm prefix ([#162](#162)) ([3b59373](3b59373))
* **deps:** bump qs 6.15.1→6.15.2 and tmp 0.2.4→0.2.6 to clear osv findings ([#151](#151)) ([2f798ec](2f798ec))
* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported node range ([#155](#155)) ([a723e53](a723e53))
* **ingestion:** vendor graphty Leiden to drop node-pty install fetch ([#157](#157)) ([790ca4e](790ca4e))
* **scanners:** correct scanner exit-code handling and stop duplicate skip logs ([#156](#156)) ([5d30eb4](5d30eb4))
* **scip-ingest:** prepend ~/.codehub/bin to indexer spawn PATH ([#160](#160)) ([4418db9](4418db9))
* **storage:** retry transient lbug WAL→checkpoint race in bulkLoad ([#161](#161)) ([450714c](450714c))
</details>

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
Fixes a cluster of scanner/indexer robustness bugs surfaced by running `codehub analyze` on an external Python/uv project. Each was misreporting a ran-but-nonzero scanner as a hard skip, or emitting misleading/duplicate diagnostics. Grounded against osv-scanner v2 and bandit exit-code semantics.

Issues fixed

1. osv-scanner "exit code 127" despite running fine: osv v2 reserves exit `1–126` = vulns found, `127` = general error, `128` = no packages (per osv docs). The shared invoker treated only 0/1 as clean, so 127 surfaced as a bare error. Root trigger: the wrapper passed `--offline-vulnerabilities` by default, which on a repo with no synced DB makes osv walk the tree, then fail to load the offline DB → exit 127.
→ Added an osv-specific exit-code interpreter (127 → "general error, try `codehub db-sync`"; 128 → "no packages discovered"). Dropped offline-by-default; use the canonical `scan source --recursive .` form (matches `ci.yml`). The `root: /` line is osv's own internal log; the adapter correctly roots at the repo dir (`cwd=projectPath`, arg `.`).

2. bandit "exit code 2 + usage:": bandit exits 2 on argparse errors; `-f sarif` is invalid without the `bandit[sarif]` extra installed → usage banner. (The old "falls back to text" assumption was false.)
→ Detect exit-2 + `usage: bandit` and emit an actionable "install `bandit[sarif]`" advisory; suppress the misleading "stdout was not valid JSON" note.

3. Duplicate skip messages: the runner routed `onWarn` to status `"skipped"` AND re-emitted the terminal note, so lines printed twice and ran-but-nonzero advisories were mislabeled "skipped".
→ Added a distinct `"warn"` status (scan ran, here's a note) and coalesced the terminal event so each note prints once.

4. scip-python "mise ERROR No version is set for shim": `runIndexer` threw on any non-zero exit; a mise/asdf shim with no pinned version resolves on PATH but exits non-zero before the real indexer runs, producing an alarming "indexer failed".
→ Detect the version-manager-shim failure pattern and return a graceful `skipped` (logged as the calmer "python skipped — …") with an actionable hint. A genuine traceback still throws.

Out of scope (user-env, not codehub bugs)

pip-audit "binary not found" is a graceful skip already (the only bug was the duplicate print, fixed by item 3 above). The dead-code ghost-community warning is correctly guarded and informational.
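The de-duplication from item 3 can be pictured with a small reporter that remembers which notes it has already printed. This is a sketch under assumptions, not the runner's actual code: `createScanReporter`, its `warn`/`finish` methods, and the line format are hypothetical names chosen for the example.

```typescript
type ScanStatus = "ok" | "warn" | "skipped";

// Builds a reporter that prints each (scanner, note) pair at most once,
// so a note raised mid-run is not repeated by the terminal event.
function createScanReporter(print: (line: string) => void) {
  const printed = new Set<string>();

  const printOnce = (scanner: string, status: ScanStatus, note: string): void => {
    const key = `${scanner}\n${note}`;
    if (printed.has(key)) return;
    printed.add(key);
    print(`${scanner} ${status}: ${note}`);
  };

  return {
    // The scan ran, but there is something to tell the user.
    warn(scanner: string, note: string): void {
      printOnce(scanner, "warn", note);
    },
    // Terminal event: only prints a skip note that has not been shown yet.
    finish(scanner: string, skippedNote?: string): void {
      if (skippedNote !== undefined) printOnce(scanner, "skipped", skippedNote);
    },
  };
}
```

With this shape, a ran-but-nonzero advisory is labeled `warn` and appears once, while a true skip such as a missing binary is still reported as `skipped` by the terminal event.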
Verification
- `@opencodehub/scanners`: 88 tests pass (+7: osv exit 1/127/128 + argv, bandit exit-2, runner de-dup + warn)
- `@opencodehub/scip-ingest`: 66 tests pass (+4: mise/asdf shim detection + genuine-crash still throws)

Note for reviewer

Dropping `--offline-vulnerabilities` is a deliberate posture change (osv does online lookups by default now); flagging for sign-off.
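To make the posture change concrete, here is a rough sketch of how the argv for osv-scanner might be assembled with online lookups as the default. The `buildOsvArgs` helper and its `offline` option are hypothetical (the PR does not say whether an opt-in remains), and any output-format flags the adapter passes are omitted.

```typescript
interface OsvArgOptions {
  // Hypothetical opt-in; after this PR the adapter no longer passes the flag by default.
  offline?: boolean;
}

// Builds the canonical `scan source --recursive .` invocation.
// The adapter is described as running it with cwd set to the project path.
function buildOsvArgs(opts: OsvArgOptions = {}): string[] {
  const args = ["scan", "source", "--recursive"];
  if (opts.offline === true) {
    // Only meaningful when an offline DB has been synced beforehand.
    args.push("--offline-vulnerabilities");
  }
  args.push(".");
  return args;
}
```

Calling `buildOsvArgs()` yields `scan source --recursive .`, the form the description says matches `ci.yml`.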