Teach kmir to reduce SMIR K definitions over a set of CFG roots#845
Teach kmir to reduce SMIR K definitions over a set of CFG roots#845
Conversation
Resolve conflicts in __main__.py, kmir.py, and options.py by keeping both the PR's cfg_roots feature and master's refactored prove pipeline (with _prove.py, add_module, max_workers, break_on_function support). - __main__.py: add --cfg-roots alongside master's --add-module and --max-workers args - kmir.py: keep master's delegated prove_program (via _prove.py) - options.py: add cfg_roots field to master's unified ProveOpts class - _prove.py: use cfg_roots for reduce_to when available Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After merging origin/master, fix the cfg_roots feature so that start_symbol is dynamically combined at usage time rather than baked into cfg_roots during __init__. This prevents stale roots when start_symbol is reassigned after construction (e.g. in tests with multiple start symbols per file). Also adapts compute_closure call sites to the new Sequence[Ty] signature and fixes minor style issues (quote style, black formatting). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add INFO-level logging in _prove.py showing before/after symbol table reduction stats (original items, reduced items, pruning percentage, root count, elapsed time) to make the cfg_roots feature observable. Add test_cfg_roots_prove_pipeline integration test that proves with a single root vs multiple roots using arith.smir.json (where add and mul are independent leaf functions), verifying that multi-root reduces to strictly more items than single-root. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cfg_roots Testing ReportMerged Bug FixFound and fixed a bug where New: Reduction Stats LoggingAdded INFO-level logging showing before/after symbol table reduction: Integration TestAdded
Real-World Demo: solana-token
|
| Scenario | Items After Reduction | Pruned | Pruning % | Kompile Time | Total Time |
|---|---|---|---|---|---|
Single root (test_process_approve) |
128 | 794 | 86.1% | ~30s | 1m40s |
| All 32 test roots | 548 | 374 | 40.6% | ~70s | 3m45s |
Key observation: Using all 32 test roots retains 4.3x more items (548 vs 128), which increases kompile time by ~2.3x. The function equations count (3132) stays the same because reduce_to only filters items, not types or allocs.
Recommendation: --cfg-roots should be used selectively — only include roots that share dependencies with the start symbol, rather than all test entry points. The feature is most valuable for cases where the call graph analysis misses indirect references needed by the proof.
Tests Passing
make check✅make test-unit(38 tests) ✅test_cfg_roots_prove_pipeline✅kmir provewith/without--cfg-roots✅
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4bdb7ea9ea
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
jberthold
left a comment
There was a problem hiding this comment.
The Codex review raises a valid point - the given option might be ignored if it is passed to a prove command and continues a proof.
Maybe the code can still be merged but as a stand-alone tool like the kmir link command? That decouples it from proving where artefacts are saved and reloaded.
Address reviewer feedback (jberthold): decouple SMIR reduction from `prove` into a standalone CLI command, following the `kmir link` pattern. - Add `kmir reduce` command: takes SMIR JSON + roots → outputs reduced SMIR JSON. Supports comma-separated roots or @file syntax. - Add `ReduceOpts` dataclass in options.py - Remove `--cfg-roots` from `prove`/`prove-rs` to avoid proof cache invalidation issues (proof identity doesn't include cfg_roots) - Keep reduction stats logging in _prove.py for start_symbol reduction - Update integration test to verify standalone reduce behavior Intended workflow: kmir link *.smir.json -o linked.smir.json kmir reduce linked.smir.json -r root1,root2 -o reduced.smir.json kmir prove-rs --smir reduced.smir.json --start-symbol root1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressing reviewer feedback@jberthold Good suggestion — I've refactored New
|
jberthold
left a comment
There was a problem hiding this comment.
LGTM now (with a minor comment), thanks
| original_items = len(smir_info.items) | ||
| t0 = time.monotonic() | ||
| smir_info = smir_info.reduce_to(opts.start_symbol) | ||
| reduction_ms = (time.monotonic() - t0) * 1000 | ||
| reduced_items = len(smir_info.items) | ||
| pruned = original_items - reduced_items | ||
| pct = (pruned / original_items * 100) if original_items > 0 else 0 | ||
| _LOGGER.info( | ||
| f'Symbol table reduction: {original_items} -> {reduced_items} items' | ||
| f' ({pruned} pruned, {pct:.1f}%), {reduction_ms:.1f}ms' | ||
| ) |
No description provided.