
fix(scanners): correct scanner exit-code handling and stop duplicate skip logs#156

Merged: theagenticguy merged 1 commit into main from fix/scanner-adapter-robustness-pr on May 29, 2026

@theagenticguy
Owner

Summary

Fixes a cluster of scanner/indexer robustness bugs surfaced by running codehub analyze on an external Python/uv project. Each was misreporting a ran-but-nonzero scanner as a hard skip, or emitting misleading/duplicate diagnostics. Grounded against osv-scanner v2 and bandit exit-code semantics.

Issues fixed

1. osv-scanner "exit code 127" despite running fine — osv v2 reserves exit 1–126 = vulns found, 127 = general error, 128 = no packages (per osv docs). The shared invoker treated only 0/1 as clean, so 127 surfaced as a bare error. Root trigger: the wrapper passed --offline-vulnerabilities by default, which on a repo with no synced DB makes osv walk the tree, then fail to load the offline DB → exit 127.
→ Added an osv-specific exit-code interpreter (127 → "general error, try codehub db-sync"; 128 → "no packages discovered"). Dropped offline-by-default; use the canonical scan source --recursive . form (matches ci.yml). The root: / line is osv's own internal log — the adapter correctly roots at the repo dir (cwd=projectPath, arg .).

2. bandit "exit code 2 + usage:" — bandit exits 2 on argparse errors; -f sarif is invalid without the bandit[sarif] extra installed → usage banner. (The old "falls back to text" assumption was false.)
→ Detect exit-2 + usage: bandit and emit an actionable "install bandit[sarif]" advisory; suppress the misleading "stdout was not valid JSON" note.

3. Duplicate skip messages — the runner routed onWarn to status "skipped" AND re-emitted the terminal note, so lines printed twice and ran-but-nonzero advisories were mislabeled "skipped".
→ Added a distinct "warn" status (scan ran, here's a note) and coalesced the terminal event so each note prints once.

4. scip-python "mise ERROR No version is set for shim": runIndexer threw on any non-zero exit. A mise/asdf shim with no pinned version resolves on PATH but exits non-zero before the real indexer runs, producing an alarming "indexer failed".
→ Detect the version-manager-shim failure pattern and return a graceful skipped (logged as the calmer "python skipped — …") with an actionable hint. A genuine traceback still throws.

Out of scope (user-env, not codehub bugs)

pip-audit "binary not found" is a graceful skip already (the only bug was the duplicate print, fixed by item 3 above). The dead-code ghost-community warning is correctly guarded and informational.

Verification

  • @opencodehub/scanners: 88 tests pass (+7: osv exit 1/127/128 + argv, bandit exit-2, runner de-dup + warn)
  • @opencodehub/scip-ingest: 66 tests pass (+4: mise/asdf shim detection + genuine-crash still throws)

Note for reviewer

Dropping --offline-vulnerabilities is a deliberate posture change (osv does online lookups by default now) — flagging for sign-off.

…skip logs

Triage of failures seen running `codehub analyze`/`scan` on a foreign
Python/uv project surfaced four real adapter bugs:

osv-scanner exit-code misclassification. osv-scanner v2 reserves exit
1-126 for findings, 127 for "general error", 128 for "no packages". The
shared `invokeScanner` treated only 0/1 as clean, so a normal vulns-found
run (exit 1) was flagged and a 127 was reported as a bare "exit code 127".
Give osv its own exit-code interpreter (osvExitAdvisory), and drop the
`--offline-vulnerabilities` default that — without a synced DB — made osv
walk the tree then fail 127. Switch to the canonical `scan source` form
(matches ci.yml). The "filesystem walk for root: /" line is osv's own
internal log, not a `/` scan root — confirmed benign.
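
For illustration, the exit-code ranges described above could be mapped roughly as follows. This is a minimal sketch, not the actual `osvExitAdvisory` code: the function name `interpretOsvExit`, the `OsvExitInterpretation` type, and the note wording are all hypothetical.

```typescript
// Hypothetical result shape; the real adapter's types are not shown in this PR.
type OsvExitInterpretation =
  | { kind: "clean" }
  | { kind: "findings" }
  | { kind: "advisory"; note: string };

// Maps an osv-scanner v2 exit code to an outcome, using the ranges quoted above:
// 0 = clean, 1-126 = findings, 127 = general error, 128 = no packages.
function interpretOsvExit(code: number): OsvExitInterpretation {
  if (code === 0) return { kind: "clean" };
  if (code >= 1 && code <= 126) return { kind: "findings" };
  if (code === 127) {
    return { kind: "advisory", note: "osv-scanner general error; try `codehub db-sync`" };
  }
  if (code === 128) {
    return { kind: "advisory", note: "osv-scanner: no packages discovered" };
  }
  // Anything else is outside the documented ranges; surface it without guessing.
  return { kind: "advisory", note: `osv-scanner exited with unexpected code ${code}` };
}
```

Keeping the interpreter a pure function of the exit code makes the 1/127/128 cases easy to cover in unit tests without spawning the binary.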

bandit exit-2 usage error. When bandit is installed without the
`bandit[sarif]` extra, argparse rejects `-f sarif` (exit 2 + usage banner)
— it does NOT fall back to text. Detect that exact case and emit an
actionable "install bandit[sarif]" advisory instead of the misleading
"stdout was not valid JSON" note.
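
The detection described above might look something like this. It is a sketch under assumptions: the `ScannerRun` shape, the function name, and the advisory wording are hypothetical, and it assumes the argparse banner begins a line with `usage: bandit`.

```typescript
// Hypothetical shape for a finished scanner process.
interface ScannerRun {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Returns an advisory when bandit rejected its arguments (exit 2 plus an
// argparse "usage: bandit" banner), or undefined when no special handling applies.
function banditUsageAdvisory(run: ScannerRun): string | undefined {
  if (run.exitCode !== 2) return undefined;
  // Check both streams so the match does not depend on where the banner lands.
  const output = `${run.stderr}\n${run.stdout}`;
  if (!/^usage: bandit\b/m.test(output)) return undefined;
  return "bandit rejected `-f sarif`; install the bandit[sarif] extra";
}
```

Requiring both the exit code and the banner keeps other exit-2 failures on the generic error path.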

Duplicate skip lines. The runner routed `onWarn` to the `skipped` status
AND re-emitted `result.skipped` on the terminal event, double-printing the
same line (the two identical `pip-audit skipped: ...` lines) and labeling
ran-but-nonzero advisories as "skipped". Add a `warn` status, route
`onWarn` to it, and coalesce the terminal event so each note prints once.
Surface `warn` distinctly in the CLI scan reporter.
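
One way to picture the coalescing is below. This is a simplified sketch, not the runner's actual code: `ScanReporter`, `ScanEvent`, and the line format are hypothetical, and it assumes the first event carrying a given note decides the label.

```typescript
type ScanStatus = "ok" | "warn" | "skipped";

interface ScanEvent {
  scanner: string;
  status: ScanStatus;
  note?: string;
}

// Collects printable lines, emitting each (scanner, note) pair at most once,
// even when both an onWarn callback and the terminal event carry the same note.
class ScanReporter {
  private readonly seen = new Set<string>();
  readonly lines: string[] = [];

  report(event: ScanEvent): void {
    if (event.note === undefined) return;
    const key = `${event.scanner}\u0000${event.note}`;
    if (this.seen.has(key)) return; // already printed; drop the repeat
    this.seen.add(key);
    this.lines.push(`${event.scanner} ${event.status}: ${event.note}`);
  }
}
```

With this shape, a note reported as `warn` and repeated on the terminal event prints once, under the `warn` label.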

scip-python mise-shim failure. A version-manager shim (mise/asdf) that
resolves on PATH but has no version pinned exits non-zero before the real
indexer runs. runIndexer threw on this, producing the alarming
"python indexer scip-python exited 1". Detect the "No version is set for
shim" pattern and return a graceful, actionable skip instead.
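
As a rough sketch of that classification: the function name, the `IndexerOutcome` type, and the hint text are hypothetical. The pattern comes from the mise error quoted above, and asdf's wording may differ.

```typescript
type IndexerOutcome =
  | { kind: "ok" }
  | { kind: "skipped"; reason: string };

// Pattern taken from the mise error text quoted in this PR.
const SHIM_NO_VERSION = /No version is set for shim/i;

// Classifies an indexer exit: success, a graceful skip for an unpinned
// version-manager shim, or a thrown error for any other failure.
function classifyIndexerExit(language: string, exitCode: number, stderr: string): IndexerOutcome {
  if (exitCode === 0) return { kind: "ok" };
  if (SHIM_NO_VERSION.test(stderr)) {
    return {
      kind: "skipped",
      reason: `${language} skipped: version-manager shim has no version pinned; pin one or install the indexer directly`,
    };
  }
  // A genuine crash (for example a Python traceback) still surfaces as an error.
  throw new Error(`${language} indexer exited ${exitCode}: ${stderr.trim()}`);
}
```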

The dead-code ghost-community warning is working as designed (members are
all literally `dead`, >=2 members) — informational, not a bug; left as-is.

Tests: scanners 88 pass (+7), scip-ingest 66 pass (+4). pip-audit
binary-missing remains a graceful exit-0 skip (user-env, install hint
correct).
@theagenticguy theagenticguy merged commit 5d30eb4 into main May 29, 2026
41 of 45 checks passed
@theagenticguy theagenticguy deleted the fix/scanner-adapter-robustness-pr branch May 29, 2026 11:56
@github-actions github-actions Bot mentioned this pull request May 29, 2026
theagenticguy added a commit that referenced this pull request May 29, 2026
…rict) (#159)

## Summary

Extends `codehub doctor` so an operator can verify a deployment's full
**parse + index toolchain**, not just node/pnpm/native bindings. Closes
the gap where `doctor` was silent about the two things most likely to
make `analyze` quietly under-perform: missing vendored grammars and
missing SCIP indexers.

Builds on the existing `doctor` framework (same `Check` interface,
`ok/warn/fail`, exit 0/1/2) — `@ladybugdb/core` and the scanner binaries
were already covered; this adds the missing rows.

## New checks

**1. Vendored WASM grammars** (1 row)
Asserts all **16** blobs ship in `@opencodehub/ingestion`'s
`vendor/wasms/` with valid `\0asm` magic — mirrors the prepublish gate
`verify-vendor-wasms.mjs`, but runs against the *installed* package so
it validates a real deployment. **Always `fail` on absence/corruption**
(never a soft skip — a shipped artifact being gone means parsing is
broken).
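
A minimal sketch of the magic-byte validation, assuming the blobs have already been read into memory. The names `hasWasmMagic`, `checkVendoredWasms`, and `VendoredBlob` are illustrative only and are not taken from `verify-vendor-wasms.mjs`.

```typescript
// The first four bytes of every WebAssembly binary: 0x00 'a' 's' 'm'.
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

function hasWasmMagic(bytes: Uint8Array): boolean {
  if (bytes.length < WASM_MAGIC.length) return false;
  return WASM_MAGIC.every((expected, i) => bytes[i] === expected);
}

// Hypothetical input shape: one entry per file found in vendor/wasms/.
interface VendoredBlob {
  name: string;
  bytes: Uint8Array;
}

// Returns "ok" only when the expected number of blobs is present and each one
// starts with the magic bytes; there is deliberately no "warn" outcome.
function checkVendoredWasms(
  blobs: VendoredBlob[],
  expectedCount: number,
): { status: "ok" | "fail"; problems: string[] } {
  const problems: string[] = [];
  if (blobs.length !== expectedCount) {
    problems.push(`expected ${expectedCount} grammars, found ${blobs.length}`);
  }
  blobs.forEach((blob) => {
    if (!hasWasmMagic(blob.bytes)) problems.push(`${blob.name}: missing WebAssembly magic bytes`);
  });
  return { status: problems.length === 0 ? "ok" : "fail", problems };
}
```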

**2. SCIP indexers** (1 row per language)
`typescript, python, go, rust, java, ruby, c/c++, c#, kotlin, cobol`.
Probes `<bin> --version`, `~/.codehub/bin`, and JAR assets under
`~/.codehub`. Hints route setup-installable indexers to `codehub setup
--scip=<flag>` and system toolchains (go/rust/java SDKs) to the user's
package manager.

## The `--strict` flag (skip = fail, opt-in)

By default an absent indexer is **`warn`** — the analyze pipeline skips
an unavailable language gracefully (a Python-only box doesn't need
`scip-go`), matching the lenient runtime behavior. `--strict` escalates
every absent indexer to **`fail` (exit 2)** for release/CI gates.
Vendored WASMs are `fail` in both modes.

```
codehub doctor            → scip-ruby absent = WARN (exit 1)
codehub doctor --strict   → scip-ruby absent = FAIL (exit 2)
```

This deliberately reconciles with #156: runtime stays lenient; the
diagnostic gate can be strict.
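
The escalation rule and exit-code mapping can be sketched as two small functions. The names `indexerStatus` and `doctorExitCode` are hypothetical; the real `Check` interface is not reproduced here.

```typescript
type DoctorStatus = "ok" | "warn" | "fail";

// An absent indexer is a warning by default and a failure under --strict.
function indexerStatus(present: boolean, strict: boolean): DoctorStatus {
  if (present) return "ok";
  return strict ? "fail" : "warn";
}

// Exit-code convention from the description: 0 = all ok, 1 = any warn, 2 = any fail.
function doctorExitCode(statuses: DoctorStatus[]): 0 | 1 | 2 {
  if (statuses.some((s) => s === "fail")) return 2;
  if (statuses.some((s) => s === "warn")) return 1;
  return 0;
}
```

Because strictness only changes how an absence is labeled, the same probe results feed both the lenient and the strict run.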

## Implementation note

The vendor dir resolves via
`import.meta.resolve("@opencodehub/ingestion")`, **not**
`createRequire().resolve()` — the package's `exports` map declares only
the ESM `import` condition, so the require form throws
`ERR_PACKAGE_PATH_NOT_EXPORTED`. Caught during testing: the require path
only "worked" locally via the monorepo fallback and would have falsely
FAILed the WASM check in a real global `npm i -g` install. The
`import.meta.resolve` path is verified to resolve with a bogus repoRoot
(i.e. no monorepo).

## Verification
- `codehub doctor` → new rows render; exit 1 with absent indexers
- `codehub doctor --strict` → indexer rows FAIL, exit 2; OK indexers
(go/rust here) stay OK; wasm stays OK
- **262/262 cli tests pass** (8 new: wasm ok/fail, indexer
warn-vs-strict-fail, JAR-by-file resolution, exit-code escalation)
- typecheck + biome clean

## Test plan
- [x] vendored-wasms ok against real install; fail-capable (never warn)
- [x] indexer warn (default) → fail (--strict) for the same absence
- [x] JAR indexer (kotlin) resolves by file presence
- [x] runDoctor exit code: 1 lenient, 2 strict
theagenticguy pushed a commit that referenced this pull request May 29, 2026
🤖 Automated release via release-please
---


<details><summary>analysis: 0.3.2</summary>

## [0.3.2](analysis-v0.3.1...analysis-v0.3.2) (2026-05-29)


### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported
node range
([#155](#155))
([a723e53](a723e53))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.2
    * @opencodehub/wiki bumped to 0.2.2
</details>

<details><summary>cli: 0.5.4</summary>

## [0.5.4](cli-v0.5.3...cli-v0.5.4) (2026-05-29)


### Features

* **cli:** doctor checks vendored wasm grammars + scip indexers
(--strict)
([#159](#159))
([36a241e](36a241e))


### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported
node range
([#155](#155))
([a723e53](a723e53))
* **scanners:** correct scanner exit-code handling and stop duplicate
skip logs
([#156](#156))
([5d30eb4](5d30eb4))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/ingestion bumped to 0.4.4
    * @opencodehub/mcp bumped to 0.4.3
    * @opencodehub/pack bumped to 0.2.3
    * @opencodehub/scanners bumped to 0.2.1
    * @opencodehub/search bumped to 0.2.2
    * @opencodehub/storage bumped to 0.2.2
    * @opencodehub/wiki bumped to 0.2.2
</details>

<details><summary>cobol-proleap: 0.1.8</summary>

## [0.1.8](cobol-proleap-v0.1.7...cobol-proleap-v0.1.8) (2026-05-29)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/ingestion bumped to 0.4.4
</details>

<details><summary>ingestion: 0.4.4</summary>

## [0.4.4](ingestion-v0.4.3...ingestion-v0.4.4) (2026-05-29)


### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported
node range
([#155](#155))
([a723e53](a723e53))
* **ingestion:** vendor graphty Leiden to drop node-pty install fetch
([#157](#157))
([790ca4e](790ca4e))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/scip-ingest bumped to 0.2.4
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>mcp: 0.4.3</summary>

## [0.4.3](mcp-v0.4.2...mcp-v0.4.3) (2026-05-29)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/pack bumped to 0.2.3
    * @opencodehub/scanners bumped to 0.2.1
    * @opencodehub/search bumped to 0.2.2
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>pack: 0.2.3</summary>

## [0.2.3](pack-v0.2.2...pack-v0.2.3) (2026-05-29)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
    * @opencodehub/ingestion bumped to 0.4.4
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>scanners: 0.2.1</summary>

## [0.2.1](scanners-v0.2.0...scanners-v0.2.1) (2026-05-29)


### Bug Fixes

* **scanners:** correct scanner exit-code handling and stop duplicate
skip logs
([#156](#156))
([5d30eb4](5d30eb4))
</details>

<details><summary>scip-ingest: 0.2.4</summary>

## [0.2.4](scip-ingest-v0.2.3...scip-ingest-v0.2.4) (2026-05-29)


### Bug Fixes

* **scanners:** correct scanner exit-code handling and stop duplicate
skip logs
([#156](#156))
([5d30eb4](5d30eb4))
* **scip-ingest:** prepend ~/.codehub/bin to indexer spawn PATH
([#160](#160))
([4418db9](4418db9))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.2
</details>

<details><summary>search: 0.2.2</summary>

## [0.2.2](search-v0.2.1...search-v0.2.2) (2026-05-29)


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>storage: 0.2.2</summary>

## [0.2.2](storage-v0.2.1...storage-v0.2.2) (2026-05-29)


### Bug Fixes

* **storage:** retry transient lbug WAL→checkpoint race in bulkLoad
([#161](#161))
([450714c](450714c))
</details>

<details><summary>wiki: 0.2.2</summary>

## [0.2.2](wiki-v0.2.1...wiki-v0.2.2) (2026-05-29)


### Bug Fixes

* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported
node range
([#155](#155))
([a723e53](a723e53))


### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.2
</details>

<details><summary>root: 0.6.5</summary>

## [0.6.5](root-v0.6.4...root-v0.6.5) (2026-05-29)


### Features

* **cli:** doctor checks vendored wasm grammars + scip indexers
(--strict)
([#159](#159))
([36a241e](36a241e))


### Bug Fixes

* **ci:** isolate verify-global-install into a per-run npm prefix
([#162](#162))
([3b59373](3b59373))
* **deps:** bump qs 6.15.1→6.15.2 and tmp 0.2.4→0.2.6 to clear osv
findings
([#151](#151))
([2f798ec](2f798ec))
* **deps:** downgrade write-file-atomic 8.0.0→7.0.1 to match supported
node range
([#155](#155))
([a723e53](a723e53))
* **ingestion:** vendor graphty Leiden to drop node-pty install fetch
([#157](#157))
([790ca4e](790ca4e))
* **scanners:** correct scanner exit-code handling and stop duplicate
skip logs
([#156](#156))
([5d30eb4](5d30eb4))
* **scip-ingest:** prepend ~/.codehub/bin to indexer spawn PATH
([#160](#160))
([4418db9](4418db9))
* **storage:** retry transient lbug WAL→checkpoint race in bulkLoad
([#161](#161))
([450714c](450714c))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>