
Add YAML recipe system for model optimization #1013

Open
sungsooha wants to merge 18 commits into NVIDIA:main from sungsooha:add-recipe-system

Conversation

@sungsooha sungsooha commented Mar 10, 2026

What does this PR do?

Type of change: New feature

Add a YAML-based recipe system (modelopt.torch.recipes) for defining and validating model optimization workflows. Recipes provide a human-readable interface on top of ModelOpt's quantization, sparsity, pruning, and distillation APIs.

What's included:

  • Schema — Pydantic models for recipe validation with preset resolution (preset: fp8 → FP8_DEFAULT_CFG)
  • Pipeline planner — multi-technique execution ordering (pruning → sparsity → quantization → distillation)
  • Experiment controller — sweep job matrix generation (models × recipes × clusters) with per-model/launcher eval overrides
  • Bridge — recipe-to-hf_ptq.py argument translation
  • 11 example recipes organized by technique (PTQ, QAT, multi-technique, auto-quantize)
  • 34 unit tests, all passing. Zero modification to existing files.
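As a hedged sketch of the preset-resolution idea described above (the mapping name PRESET_TO_CFG and its entries are assumptions for illustration; the real registry is Pydantic-validated and lives under modelopt.torch.recipes.schema):

```python
# Illustrative sketch only: maps short YAML preset names to ModelOpt
# config names. The concrete dict and its entries are assumed here.
PRESET_TO_CFG = {
    "fp8": "FP8_DEFAULT_CFG",
    "nvfp4": "NVFP4_DEFAULT_CFG",
}

def resolve_preset(name: str) -> str:
    """Map a short preset name from a recipe YAML to a ModelOpt config name."""
    if name not in PRESET_TO_CFG:
        raise KeyError(f"Unknown preset '{name}'. Available: {sorted(PRESET_TO_CFG)}")
    return PRESET_TO_CFG[name]

print(resolve_preset("fp8"))  # FP8_DEFAULT_CFG
```

Unknown preset names fail fast with the list of available options, which is the kind of error message the schema layer is meant to surface.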

Usage

import modelopt.torch.quantization as mtq
from modelopt.torch.recipes import load_recipe

result = load_recipe("examples/recipes/ptq/ptq_fp8.yaml")
config = result["quantize_config"]
model = mtq.quantize(model, config, forward_loop=calibrate)

# Validate a recipe
python -m modelopt.torch.recipes.cli validate examples/recipes/ptq/ptq_fp8.yaml

# Show execution plan
python -m modelopt.torch.recipes.cli plan examples/recipes/multi_technique/sparse_quantize.yaml --verbose

# Dry-run experiment (2 models × 4 recipes × 2 clusters = 16 jobs)
python -m modelopt.torch.recipes.cli dry-run --config examples/recipes/experiments/sweep_demo.yaml
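The job matrix behind the dry-run above is a Cartesian product. A minimal stdlib sketch (model, recipe, and cluster names here are illustrative; the real SweepController additionally applies per-model and launcher eval overrides):

```python
# Sweep job generation as a Cartesian product of models x recipes x clusters.
from itertools import product

models = ["Qwen/Qwen3-8B", "meta-llama/Llama-3.1-8B-Instruct"]
recipes = ["ptq_fp8", "ptq_nvfp4", "qat_int8", "auto_quantize"]
clusters = ["cluster_a", "cluster_b"]  # assumed cluster names

jobs = [
    {"model": m, "recipe": r, "cluster": c}
    for m, r, c in product(models, recipes, clusters)
]
print(len(jobs))  # 2 models x 4 recipes x 2 clusters = 16
```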

Testing

python -m pytest tests/unit/torch/recipes/ -v  # 34 passed, 0.11s

Existing quantization tests unaffected (357 passed, 32 skipped).

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ (additive only — no existing files modified)
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ (no new dependencies — pydantic and pyyaml already in project)
  • Did you write any new necessary tests?: ✅ (34 tests)
  • Did you update Changelog?: ❌ (pending review approval)

Summary by CodeRabbit

  • New Features
    • YAML-based recipe system with loader, preset/format registry, pipeline planning, sweep orchestration, and a new CLI for validate/plan/dry-run.
  • Examples
    • Many new example recipes: PTQ/QAT, FP8/NVFP4, AWQ, sparsity, pruning, distillation, multi‑technique flows, auto‑quantize, and sweep demos.
  • Documentation
    • New recipes documentation with quick-start, usage examples, execution order, and command examples.
  • Tests
    • Extensive unit tests for loading, schema validation, bridge mapping, planning, sweep controller, and CLI.

@sungsooha sungsooha requested review from a team as code owners March 10, 2026 00:09

copy-pr-bot bot commented Mar 10, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a YAML-driven recipe system and Python tooling to author, validate, resolve, plan, and sweep multi‑technique model optimization workflows (PTQ/QAT, sparsity, pruning, distillation, auto‑quantize). Includes format/preset registries, resolver, pipeline planner, sweep controller, CLI, example recipes, and unit tests.

Changes

  • Examples / Recipes (examples/recipes/README.md, examples/recipes/...)
    Add README and ~20 example YAML recipes (ptq, qat, multi_technique, auto, experiments) with usage, dry-run/plan examples, and export guidance.
  • Recipe package API & bridge (modelopt/torch/recipes/__init__.py, modelopt/torch/recipes/bridge.py, modelopt/torch/recipes/cli.py)
    Add top-level loader load_recipe, bridge functions (recipe_to_hf_ptq_args, summarize_recipe), a CLI (main) with subcommands (validate, dry-run, plan, info), and re-export public API symbols.
  • Pipeline planner (modelopt/torch/recipes/pipeline.py)
    Add pipeline planner with TECHNIQUE_ORDER/TECHNIQUE_API, PipelineStep/PipelinePlan dataclasses, plan construction (plan_pipeline, load_and_plan), and dry_run rendering.
  • Sweep / Experiment system (modelopt/torch/recipes/experiment/...)
    Add SweepConfig, SweepController, SweepJob, job generation (Cartesian product), validation, multi-phase dry_run report, and JSON export.
  • Schema: models & resolver (modelopt/torch/recipes/schema/models.py, modelopt/torch/recipes/schema/resolver.py)
    Add comprehensive Pydantic recipe schema (RecipeConfig, sections for quantization/auto/distill/prune/sparsity/export) and resolver resolve_recipe that converts validated recipes to mtq-compatible dicts (quantize_config, auto_quantize_kwargs, calibration, export), with overrides, staged quantizers, and KV-cache merging.
  • Presets & formats (modelopt/torch/recipes/schema/_bundled_presets.py, modelopt/torch/recipes/schema/presets.py, modelopt/torch/recipes/schema/formats.py)
    Add bundled preset snapshot, lazy tiered preset registry (live → bundled) with get_preset/get_preset_source/list_presets, and FORMAT_REGISTRY/KV_FORMAT_REGISTRY with accessors get_format/get_kv_format.
  • Schema package init / re-exports (modelopt/torch/recipes/schema/__init__.py, modelopt/torch/recipes/experiment/__init__.py)
    Add init modules that re-export core classes/registries (RecipeConfig, FORMAT_REGISTRY, presets accessors, SweepConfig/SweepController/SweepJob) for simpler imports.
  • Tests & fixtures (tests/unit/torch/recipes/*)
    Add pytest fixtures and ~12 unit test modules covering schema validation, loader/resolver, bridge mapping, pipeline planning, sweep controller dry-runs/job export, and example YAML compliance.
  • Examples: multi-technique recipes (examples/recipes/multi_technique/*)
    Add combined workflows (sparse→quantize, prune→quantize, fp8→distill) with per-stage calibration/training and export sections.
  • Examples: ptq & qat recipes (examples/recipes/ptq/*, examples/recipes/qat/*)
    Add PTQ and QAT recipe variants (fp8, nvfp4/awq, local_hessian, overrides, custom W4A8, int8 QAT) with calibration and kv_cache settings.
  • Examples: auto & experiments/sweeps (examples/recipes/auto/auto_quantize.yaml, examples/recipes/experiments/*)
    Add auto-quantize per-layer search recipe and sweep/demo YAMLs to drive Cartesian-product experiment runs with eval settings and launcher overrides.
  • Project metadata (pyproject.toml)
    Add runtime dependency pyyaml>=5.0 and console script entry modelopt-recipes = "modelopt.torch.recipes.cli:main".
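The fixed execution order enforced by the planner can be sketched as a filter over a canonical list. TECHNIQUE_ORDER is named in the planner summary above; the concrete values here simply mirror the stated order (pruning → sparsity → quantization → distillation) and are otherwise an assumption:

```python
# Canonical technique ordering: requested techniques are returned in this
# fixed order regardless of how the recipe lists them.
TECHNIQUE_ORDER = ["pruning", "sparsity", "quantization", "distillation"]

def plan_pipeline(techniques: set[str]) -> list[str]:
    """Return the requested techniques in canonical execution order."""
    unknown = techniques - set(TECHNIQUE_ORDER)
    if unknown:
        raise ValueError(f"Unknown techniques: {sorted(unknown)}")
    return [t for t in TECHNIQUE_ORDER if t in techniques]

print(plan_pipeline({"quantization", "sparsity"}))  # ['sparsity', 'quantization']
```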

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Loader as Recipe Loader
    participant Validator as Schema Validator
    participant Resolver as Recipe Resolver
    participant Planner as Pipeline Planner
    participant Controller as Sweep Controller

    User->>Loader: load_recipe(yaml_path)
    Loader->>Loader: read YAML
    Loader->>Validator: RecipeConfig.model_validate(...)
    Validator-->>Loader: RecipeConfig
    Loader->>Resolver: resolve_recipe(RecipeConfig)
    Resolver-->>Loader: resolved mtq-compatible dict
    Loader-->>User: resolved dict

    User->>Controller: SweepConfig.from_yaml(sweep.yaml)
    Controller->>Loader: load_recipe(referenced_recipe)
    Loader->>Resolver: resolve_recipe(RecipeConfig)
    Controller->>Planner: plan_pipeline(RecipeConfig)
    Planner-->>Controller: PipelinePlan
    Controller-->>User: dry_run summary / exported jobs
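The load → validate → resolve flow in the diagram can be approximated with a stdlib-only stand-in (plain dicts instead of YAML and Pydantic; the key names below are illustrative guesses at the resolved shape):

```python
# Stand-in for the loader flow: validate the raw recipe dict, then resolve
# it to an mtq-compatible result dict. Not the real implementation.
from typing import Any

def validate(raw: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for RecipeConfig.model_validate: require a technique section."""
    if not any(k in raw for k in ("quantization", "sparsity", "prune", "distill")):
        raise ValueError("recipe must define at least one technique section")
    return raw

def resolve(recipe: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for resolve_recipe: turn a preset into a config name."""
    preset = recipe.get("quantization", {}).get("preset", "fp8")
    return {"quantize_config": f"{preset.upper()}_DEFAULT_CFG"}

def load_recipe(raw: dict[str, Any]) -> dict[str, Any]:
    return resolve(validate(raw))

print(load_recipe({"quantization": {"preset": "fp8"}}))
# {'quantize_config': 'FP8_DEFAULT_CFG'}
```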

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 68.29%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (3 passed)

  • Description Check (✅ Passed): check skipped; CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): the PR title 'Add YAML recipe system for model optimization' accurately and concisely summarizes the primary changeset, which introduces a comprehensive YAML-based recipe system with schema, pipeline planning, experiment control, CLI, examples, and tests.
  • Security Anti-Patterns (✅ Passed): the PR introduces a YAML recipe system with pyyaml>=5.0 (MIT-licensed). No security anti-patterns detected: no unsafe torch.load/numpy.load, no hardcoded trust_remote_code, no eval/exec on untrusted input, no # nosec bypasses. Pydantic validation is used for all YAML inputs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 16

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

🟡 Minor comments (16)
examples/recipes/experiments/experiment.yaml-37-39 (1)

37-39: ⚠️ Potential issue | 🟡 Minor

This per-model override never matches any sweep model.

model_overrides is keyed by codellama/CodeLlama-34b-Instruct, but the sweep only includes Qwen/Qwen3-8B and meta-llama/Llama-3.1-8B-Instruct. As written, this example never exercises per-model overrides.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/experiments/experiment.yaml` around lines 37 - 39, The
model_overrides mapping is using the key "codellama/CodeLlama-34b-Instruct"
which doesn't match any sweep models (the sweep contains "Qwen/Qwen3-8B" and
"meta-llama/Llama-3.1-8B-Instruct"), so the per-model override never runs;
update the model_overrides keys to match a sweep model (e.g., replace
"codellama/CodeLlama-34b-Instruct" with "Qwen/Qwen3-8B" or
"meta-llama/Llama-3.1-8B-Instruct"), or add additional entries that exactly
match your intended sweep model names so the overrides apply.
examples/recipes/experiments/experiment.yaml-6-8 (1)

6-8: ⚠️ Potential issue | 🟡 Minor

Update the usage example to the new CLI/path.

This still points to the old command and the old examples/sweeps/... location, so users following it will hit a dead path.

Suggested fix
 # Usage:
-#   modelopt-recipes dry-run --config examples/sweeps/experiment.yaml
+#   python -m modelopt.torch.recipes.cli dry-run --config examples/recipes/experiments/experiment.yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/experiments/experiment.yaml` around lines 6 - 8, The usage
example still references the old command and path string "modelopt-recipes
dry-run --config examples/sweeps/experiment.yaml"; update that line to the new
CLI and location by replacing the old command and path with the current one
(e.g., "modelopt recipes dry-run --config examples/recipes/experiment.yaml") so
the example points to the correct CLI invocation and the examples/recipes path;
search for the exact string "modelopt-recipes dry-run --config
examples/sweeps/experiment.yaml" in the file and update it.
examples/recipes/README.md-84-86 (1)

84-86: ⚠️ Potential issue | 🟡 Minor

Add a language to this fenced block.

This block trips markdownlint as-is. Use something like text so the docs stay lint-clean.

Suggested fix
-```
+```text
 pruning → sparsity → quantization → distillation
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @examples/recipes/README.md around lines 84 - 86, the fenced code block
containing the line "pruning → sparsity → quantization → distillation" needs a
language tag to satisfy markdownlint; update that fence from ``` to ```text
(i.e., add the language identifier "text" to the existing triple-backtick fence
that wraps the pruning → ... line).
<details>
<summary>examples/recipes/multi_technique/distill_fp8.yaml-3-5 (1)</summary><blockquote>

`3-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Fix the stage-order description in the header.**

Line 3 says distillation happens before quantization, but this recipe and Line 5 define the opposite order. The comment should not contradict the pipeline.


<details>
<summary>Suggested fix</summary>

```diff
-# Distill from a larger teacher model, then quantize the student to FP8.
+# Quantize the student to FP8, then distill from a larger teacher model.
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/multi_technique/distill_fp8.yaml` around lines 3 - 5, Update
the header comment so the stage order matches the pipeline: replace the
contradictory first line "Distill from a larger teacher model, then quantize the
student to FP8." with a line that reflects the actual execution order used
later, e.g. "Quantize the student to FP8, then distill (fine-tune with teacher
guidance)." Adjust the wording around the existing "Execution order:
quantization → distillation (fine-tune with teacher guidance)" phrase to ensure
both lines consistently state quantization → distillation; locate the lines by
searching for the exact header text "Distill from a larger teacher model, then
quantize the student to FP8." and "Execution order: quantization → distillation
(fine-tune with teacher guidance)" and make the first line mirror the execution
order.
```

</details>

</blockquote></details>
<details>
<summary>examples/recipes/experiments/sweep_demo.yaml-6-8 (1)</summary><blockquote>

`6-8`: _⚠️ Potential issue_ | _🟡 Minor_

**Refresh this usage example to the recipe CLI.**

The command and path here still reference the old demo script layout, not the new recipe CLI/examples path introduced by this PR.


<details>
<summary>Suggested fix</summary>

```diff
 # Usage:
-#   .venv/bin/python scripts/demo_big_pareto_sweep.py --config examples/sweeps/sweep_demo.yaml --dry-run
+#   python -m modelopt.torch.recipes.cli dry-run --config examples/recipes/experiments/sweep_demo.yaml
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/experiments/sweep_demo.yaml` around lines 6 - 8, Update the
usage example to show the new recipe CLI invocation instead of the old demo
script: replace the old command with the recipe CLI entrypoint (use the
repository's CLI/entrypoint name) invoking the same sweep_demo.yaml config
(sweep_demo.yaml) and include the --dry-run flag so the example runs against the
new examples/recipes layout; ensure the text shows the correct CLI pattern
(e.g., "<cli-entrypoint> run --config sweep_demo.yaml --dry-run") so readers can
adapt it to the repo's actual entrypoint.
```

</details>

</blockquote></details>
<details>
<summary>examples/recipes/qat/qat_int8.yaml-10-16 (1)</summary><blockquote>

`10-16`: _⚠️ Potential issue_ | _🟡 Minor_

**Align the metadata with the actual algorithm.**

The description/tags say AWQ, but the recipe is configured for SmoothQuant (`preset: int8_sq`, `algorithm: smoothquant`). That makes this example misleading in generated summaries and docs.


<details>
<summary>Suggested fix</summary>

```diff
 metadata:
   name: qat_int8
-  description: "INT8 quantization-aware training with AWQ initialization"
-  tags: [qat, int8, awq]
+  description: "INT8 quantization-aware training with SmoothQuant initialization"
+  tags: [qat, int8, smoothquant]
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/qat/qat_int8.yaml` around lines 10 - 16, The recipe metadata
claims AWQ but the quantization config uses SmoothQuant (quantization.preset:
int8_sq and quantization.algorithm.method: smoothquant); either update the
description and tags to reference SmoothQuant (e.g., description: "INT8
quantization-aware training with SmoothQuant initialization" and tags: [qat,
int8, smoothquant]) or change the quantization block to AWQ-compatible values
(replace preset and algorithm.method with the appropriate AWQ preset/algorithm
names) so the metadata and quantization settings match.
```

</details>

</blockquote></details>
<details>
<summary>examples/recipes/README.md-90-99 (1)</summary><blockquote>

`90-99`: _⚠️ Potential issue_ | _🟡 Minor_

**Import `mtq` in the Python example.**

Line 98 uses `mtq.quantize(...)`, but `mtq` is never imported, so this snippet raises `NameError` when copied verbatim.

<details>
<summary>Suggested fix</summary>

```diff
 ```python
+import modelopt.torch.quantization as mtq
 from modelopt.torch.recipes import load_recipe
 
 # Load and resolve a recipe
 result = load_recipe("examples/recipes/ptq/ptq_fp8.yaml")
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/README.md` around lines 90 - 99, The example calls
mtq.quantize but never imports mtq; add an import for
modelopt.torch.quantization (e.g. import modelopt.torch.quantization as mtq) at
the top of the snippet before using load_recipe so mtq.quantize is defined;
update the example near load_recipe and the use of
mtq.quantize/forward_loop=calibrate accordingly.
```

</details>

</blockquote></details>
<details>
<summary>examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml-28-32 (1)</summary><blockquote>

`28-32`: _⚠️ Potential issue_ | _🟡 Minor_

**Use the actual last-layer index in this example.**

`*layers.21*` contradicts the file name/description and Line 7’s `layers.31` note for 32-layer models. As written, this recipe skips the first and 22nd layers, not the first and last one. If this is meant to be a 32-layer example, the pattern should be `*layers.31*`; otherwise the example name/description should be updated.


<details>
<summary>📝 Proposed fix</summary>

```diff
     - pattern: "*layers.0*"
       enable: false
-    - pattern: "*layers.21*"
+    - pattern: "*layers.31*"
       enable: false
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml` around lines 28 - 32,
The override pattern currently disables "*layers.0*" and "*layers.21*" which
mismatches the recipe name/description (32-layer model) and the note referencing
"layers.31"; update the second override pattern from "*layers.21*" to
"*layers.31*" so the recipe actually skips the first and last layers, or if
intended to skip layer 21 instead, update the file name/description and any
references (e.g., the "layers.31" note) to reflect a non-32-layer example.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/cli.py-47-63 (1)</summary><blockquote>

`47-63`: _⚠️ Potential issue_ | _🟡 Minor_

**Catch `dry-run` load/export failures like the other subcommands.**

`SweepConfig.from_yaml()`, `controller.dry_run()`, and `controller.export_jobs()` can all raise, and this branch currently lets those exceptions escape as raw tracebacks. That makes malformed configs and unwritable output paths much rougher to diagnose than `validate`/`plan`.

<details>
<summary>💡 Suggested fix</summary>

```diff
     elif args.command == "dry-run":
-        from modelopt.torch.recipes.experiment import SweepConfig, SweepController
-
-        config = SweepConfig.from_yaml(args.config)
-        errors = config.validate()
-        if errors:
-            print("Sweep config validation failed:")
-            for e in errors:
-                print(f"  - {e}")
-            sys.exit(1)
-
-        controller = SweepController(config)
-        print(controller.dry_run(verbose=args.verbose))
-
-        if args.output:
-            controller.export_jobs(args.output)
-            print(f"\nJobs exported to: {args.output}")
+        try:
+            from modelopt.torch.recipes.experiment import SweepConfig, SweepController
+
+            config = SweepConfig.from_yaml(args.config)
+            errors = config.validate()
+            if errors:
+                print("Sweep config validation failed:", file=sys.stderr)
+                for error in errors:
+                    print(f"  - {error}", file=sys.stderr)
+                sys.exit(1)
+
+            controller = SweepController(config)
+            print(controller.dry_run(verbose=args.verbose))
+
+            if args.output:
+                controller.export_jobs(args.output)
+                print(f"\nJobs exported to: {args.output}")
+        except Exception as exc:
+            print(f"Dry run failed: {exc}", file=sys.stderr)
+            sys.exit(1)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/cli.py` around lines 47 - 63, Wrap the entire
"dry-run" branch in a try/except that mirrors the other subcommands: catch
exceptions from SweepConfig.from_yaml, SweepController(config),
controller.dry_run(verbose=...), and controller.export_jobs(...) and print a
clear error message (including the exception text) then sys.exit(1); keep
existing validation behavior (call config.validate() first), and ensure
export_jobs is also guarded by its own try/except so failures when writing to
args.output are reported and cause exit instead of leaking a raw traceback.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/schema/resolver.py-1-6 (1)</summary><blockquote>

`1-6`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/schema/resolver.py` around lines 1 - 6, This file is
missing the required NVIDIA Apache 2.0 license header; add the standard
multi-line NVIDIA Apache 2.0 license block at the very top of
modelopt/torch/recipes/schema/resolver.py (above the module docstring) so the
file complies with the repository's licensing guidelines for all new Python
files under modelopt; ensure the license text matches the project's canonical
header exactly.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/bridge.py-1-5 (1)</summary><blockquote>

`1-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/bridge.py` around lines 1 - 5, The file is missing the
required NVIDIA Apache 2.0 license header; add the project's standard
Apache-2.0/NVIDIA license block at the top of bridge.py (above the module
docstring) so all new Python files under modelopt include the license; ensure
the header matches the existing project license text/style (including copyright
holder and SPDX identifier) used elsewhere in the repo.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/experiment/controller.py-1-5 (1)</summary><blockquote>

`1-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/experiment/controller.py` around lines 1 - 5, This
file is missing the required NVIDIA Apache 2.0 license header; prepend the
standard NVIDIA Apache 2.0 license block to the top of the module (above the
existing module docstring in controller.py / the "Sweep Controller" module) so
all new Python files under modelopt include the license header as per
guidelines; ensure the header is the full required text and that no other file
content is moved below it.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/experiment/config.py-1-5 (1)</summary><blockquote>

`1-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/experiment/config.py` around lines 1 - 5, The file is
missing the required NVIDIA Apache 2.0 license header; add the standard
multi-line NVIDIA Apache 2.0 license header at the very top of
modelopt/torch/recipes/experiment/config.py before the existing module docstring
(the triple-quoted string starting "Sweep configuration schema."), ensuring the
header exactly matches the project's canonical license text and includes
copyright year and holder.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/schema/__init__.py-1-16 (1)</summary><blockquote>

`1-16`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

Otherwise, this is a clean re-export module with appropriate `__all__` definition.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/schema/__init__.py` around lines 1 - 16, This file is
missing the required NVIDIA Apache 2.0 license header; add the standard license
comment block at the very top of the module (before the module docstring) so the
file that re-exports RecipeConfig, FORMAT_REGISTRY, KV_FORMAT_REGISTRY,
get_preset, get_preset_source, list_presets, and resolve_recipe contains the
NVIDIA Apache 2.0 header per project guidelines.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/__init__.py-1-16 (1)</summary><blockquote>

`1-16`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the NVIDIA Apache 2.0 license header.


<details>
<summary>Add license header at the top of the file</summary>

```diff
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 """Recipe system and experiment controller for NVIDIA Model Optimizer.
```
</details>

As per coding guidelines: "NVIDIA Apache 2.0 license header required on all new Python/C++/CUDA files".

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/__init__.py` around lines 1 - 16, Add the required
NVIDIA Apache 2.0 license header to the top of this Python module
(modelopt.torch.recipes __init__.py) so the file begins with the standard
multi-line copyright and Apache 2.0 license block; keep the existing module
docstring and usage examples after the header, and ensure the header appears
before any code or docstring (so symbols like load_recipe and the module-level
docstring remain unchanged and located after the license block).
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/bridge.py-55-59 (1)</summary><blockquote>

`55-59`: _⚠️ Potential issue_ | _🟡 Minor_

**Unused `recipe_path` parameter.**

The `recipe_path` parameter is declared but never used in the function body. Either remove it or include it in the returned summary dict if it's intended for context.


<details>
<summary>Option 1: Remove unused parameter</summary>

```diff
-def summarize_recipe(recipe_path: str, resolved: dict[str, Any], raw_yaml: dict) -> dict[str, Any]:
+def summarize_recipe(resolved: dict[str, Any], raw_yaml: dict) -> dict[str, Any]:
     """Extract human-readable summary from recipe YAML and resolved config.

-    Returns a dict with keys: preset, algorithm, kv_cache, overrides_count, type.
+    Returns a dict with keys: type, preset, algorithm, kv_cache, overrides_count.
     """
```
</details>
<details>
<summary>Option 2: Include in returned summary</summary>

```diff
     if quant:
         algo = quant.get("algorithm", "default")
         if isinstance(algo, dict):
             algo = algo.get("method", "default")

         return {
             "type": "quantize",
+            "recipe_path": recipe_path,
             "preset": quant.get("preset", "custom"),
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/bridge.py` around lines 55 - 59, The parameter
recipe_path on summarize_recipe is declared but unused; add it to the returned
summary dict so callers retain context: update summarize_recipe (and its
docstring) to include a "recipe_path" key in the returned dict alongside preset,
algorithm, kv_cache, overrides_count, and type, populating it with the incoming
recipe_path value; ensure any code that relies on those keys continues to work
and adjust tests/types if needed.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (6)</summary><blockquote>

<details>
<summary>modelopt/torch/recipes/schema/formats.py (1)</summary><blockquote>

`135-148`: **Return copies from the public format getters.**

Unlike `get_preset()` in `modelopt/torch/recipes/schema/presets.py` on Lines 121-127, these accessors hand back the shared registry dicts. A caller-side merge or `update()` will mutate global defaults and leak into later recipe resolutions.


<details>
<summary>♻️ Proposed fix</summary>

```diff
+import copy
 from typing import Any
@@
 def get_format(name: str) -> dict[str, dict[str, Any]]:
     """Look up a format by name. Raises KeyError if not found."""
     if name not in FORMAT_REGISTRY:
         available = sorted(FORMAT_REGISTRY.keys())
         raise KeyError(f"Unknown format '{name}'. Available: {available}")
-    return FORMAT_REGISTRY[name]
+    return copy.deepcopy(FORMAT_REGISTRY[name])
@@
 def get_kv_format(name: str) -> dict[str, Any]:
     """Look up a KV cache format by name. Raises KeyError if not found."""
     if name not in KV_FORMAT_REGISTRY:
         available = sorted(KV_FORMAT_REGISTRY.keys())
         raise KeyError(f"Unknown KV cache format '{name}'. Available: {available}")
-    return KV_FORMAT_REGISTRY[name]
+    return copy.deepcopy(KV_FORMAT_REGISTRY[name])
```
</details>
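To see why this matters, here is a standalone sketch of the mutation hazard using a toy registry (the names `_REGISTRY`, `get_format_shared`, and `get_format_copied` are illustrative, not ModelOpt code):

```python
import copy

# Toy registry (illustrative only) mirroring the shared-dict pattern.
_REGISTRY = {"fp8": {"weight": {"num_bits": 8, "enable": True}}}

def get_format_shared(name: str) -> dict:
    # Hands back a reference into the registry (the flagged pattern).
    return _REGISTRY[name]

def get_format_copied(name: str) -> dict:
    # Hands back a deep copy, so caller-side edits cannot leak back.
    return copy.deepcopy(_REGISTRY[name])

# A caller-side edit through the shared reference corrupts the global default:
shared = get_format_shared("fp8")
shared["weight"]["num_bits"] = 4
leaked = _REGISTRY["fp8"]["weight"]["num_bits"]  # 4: default was mutated

# Reset, then repeat with the copying accessor:
_REGISTRY["fp8"]["weight"]["num_bits"] = 8
safe = get_format_copied("fp8")
safe["weight"]["num_bits"] = 4
intact = _REGISTRY["fp8"]["weight"]["num_bits"]  # 8: default untouched
```

A shallow `dict.copy()` would not be enough here, since the nested `"weight"` dict would still be shared; hence `copy.deepcopy` in the proposed fix.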

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/schema/formats.py` around lines 135 - 148, get_format
and get_kv_format currently return references into the shared FORMAT_REGISTRY
and KV_FORMAT_REGISTRY so callers who merge/update will mutate global defaults;
change both functions to return copies (deep copies if values are nested dicts)
of the registry entries instead of the original dict objects (i.e., copy the
dict returned from FORMAT_REGISTRY[name] and KV_FORMAT_REGISTRY[name]) so
callers can modify the result without affecting the registries.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/experiment/controller.py (2)</summary><blockquote>

`80-86`: **Job IDs have gaps when recipes are invalid.**

The `job_id` is incremented before checking if the recipe is valid, resulting in non-sequential job IDs when some recipes fail validation. This may cause confusion in the dry-run output where jobs are displayed as "Job X/Y" but X has gaps.


<details>
<summary>Increment job_id only for valid jobs</summary>

```diff
         jobs: list[SweepJob] = []
-        job_id = 0
         for model, recipe_path, launcher in itertools.product(
             self.config.models, self.config.recipes, self.config.launchers
         ):
-            job_id += 1
             if recipe_path in self._validation_errors:
                 continue  # Skip invalid recipes

             resolved = self._validated_recipes[recipe_path]
             raw_yaml = self._raw_yamls[recipe_path]
             hf_ptq = recipe_to_hf_ptq_args(resolved)
             hf_ptq["pyt_ckpt_path"] = model

             jobs.append(SweepJob(
-                job_id=job_id,
+                job_id=len(jobs) + 1,
                 model=model,
```
</details>
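A minimal standalone sketch of the `len(jobs) + 1` pattern (toy `Job` dataclass and inputs, not the actual `SweepJob`):

```python
import itertools
from dataclasses import dataclass

@dataclass
class Job:  # stand-in for SweepJob
    job_id: int
    model: str
    recipe: str

models = ["Qwen/Qwen3-8B", "meta-llama/Llama-3.1-8B-Instruct"]
recipes = ["ptq_fp8.yaml", "broken.yaml"]
validation_errors = {"broken.yaml"}  # recipes that failed validation

jobs: list[Job] = []
for model, recipe in itertools.product(models, recipes):
    if recipe in validation_errors:
        continue  # skipped combinations consume no ID
    jobs.append(Job(job_id=len(jobs) + 1, model=model, recipe=recipe))

job_ids = [j.job_id for j in jobs]  # sequential 1..N despite the skips
```

Deriving the ID from `len(jobs)` keeps "Job X/Y" output gap-free without tracking a separate counter.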

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/experiment/controller.py` around lines 80 - 86, The
job_id is incremented before validating the recipe, causing gaps when a recipe
is skipped; in the loop that iterates over itertools.product(self.config.models,
self.config.recipes, self.config.launchers) (variables model, recipe_path,
launcher) move the job_id increment so it happens only for valid jobs (i.e.,
after checking recipe_path against self._validation_errors) or increment
conditionally inside the branch that proceeds with the job; update any "Job X/Y"
calculations to rely on this adjusted job_id so dry-run outputs remain
sequential.
```

</details>

---

`108-145`: **`dry_run` calls both `validate_recipes` and `generate_jobs`, which internally calls `validate_recipes` again.**

The `dry_run` method explicitly calls `validate_recipes()` at line 145, but `generate_jobs()` at line 212 will call `validate_recipes()` again if `_validated_recipes` is empty. Since `validate_recipes` populates `_validated_recipes`, the second call is a no-op, but this could be clearer.


<details>
<summary>Add comment or restructure for clarity</summary>

```diff
         # Phase 1: Recipe Validation
         section("Phase 1: Recipe Validation (load_recipe → resolved config)")

-        validation_results = self.validate_recipes()
+        validation_results = self.validate_recipes()  # Populates _validated_recipes for generate_jobs()
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/experiment/controller.py` around lines 108 - 145, The
dry_run method currently calls validate_recipes() and later calls
generate_jobs(), which itself calls validate_recipes() when _validated_recipes
is empty; remove the redundant explicit validate_recipes() invocation in dry_run
(or alternatively set/ensure _validated_recipes is populated once before calling
generate_jobs) and add a short comment near dry_run explaining that
generate_jobs() will validate recipes lazily using _validated_recipes to avoid
duplicate work; refer to the dry_run, generate_jobs, validate_recipes methods
and the _validated_recipes cache when making the change.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/experiment/config.py (1)</summary><blockquote>

`121-124`: **Consider documenting the zero-jobs edge case.**

When any of `models`, `recipes`, or `launchers` is empty, `total_jobs` returns 0. This is mathematically correct but downstream code dividing by `total_jobs` could fail. The `validate()` method catches these cases, so this is just a documentation note.


<details>
<summary>Optional docstring enhancement</summary>

```diff
     `@property`
     def total_jobs(self) -> int:
-        """Total number of jobs in the sweep (Cartesian product)."""
+        """Total number of jobs in the sweep (Cartesian product).
+
+        Returns 0 if any dimension (models, recipes, launchers) is empty.
+        Call validate() first to ensure all dimensions are populated.
+        """
         return len(self.models) * len(self.recipes) * len(self.launchers)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/experiment/config.py` around lines 121 - 124, The
docstring for the total_jobs property should explicitly mention the zero-jobs
edge case: when any of models, recipes, or launchers is empty total_jobs will be
0, which can cause downstream divisions by total_jobs; note that validate() is
responsible for catching/raising on empty lists before execution. Update the
total_jobs docstring (in the total_jobs property) to include this behavior and a
brief note pointing callers to validate() to ensure non-zero job counts.
```

</details>

</blockquote></details>
<details>
<summary>tests/unit/torch/recipes/test_pipeline.py (1)</summary><blockquote>

`152-161`: **Consider adding a guard for missing examples directory.**

The test iterates over `examples/recipes` but doesn't handle the case where the directory might not exist or be empty. This could cause confusing test failures if the examples directory structure changes.


<details>
<summary>Proposed improvement</summary>

```diff
 def test_load_and_plan_all_examples():
     """All example recipes produce valid pipeline plans."""
     from pathlib import Path
     recipes_dir = Path("examples/recipes")
+    assert recipes_dir.exists(), f"Examples directory not found: {recipes_dir}"
+    yaml_files = [p for p in sorted(recipes_dir.rglob("*.yaml")) if "experiments" not in p.parts]
+    assert len(yaml_files) > 0, "No recipe YAML files found"
-    for path in sorted(recipes_dir.rglob("*.yaml")):
-        if "experiments" in path.parts:
-            continue  # skip sweep/experiment configs
+    for path in yaml_files:
         plan = load_and_plan(str(path))
         assert len(plan.steps) >= 1, f"{path} produced empty plan"
         assert plan.recipe_path == str(path)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/torch/recipes/test_pipeline.py` around lines 152 - 161, The test
test_load_and_plan_all_examples assumes examples/recipes exists; add a guard
that checks the Path("examples/recipes") (recipes_dir) exists and is a directory
and that recipes_dir.rglob("*.yaml") yields at least one file before iterating;
if the directory is missing or no recipe files are found, call pytest.skip with
a clear message so load_and_plan is not invoked on a non-existent/empty dataset.
Use the existing recipes_dir and load_and_plan identifiers to locate where to
add the checks.
```

</details>

</blockquote></details>
<details>
<summary>modelopt/torch/recipes/schema/resolver.py (1)</summary><blockquote>

`34-57`: **Move imports to the top of the file.**

The imports at lines 48-57 are placed after the function definition `_update_quant_cfg_with_kv_cache`. While this works, it's non-idiomatic Python and can cause confusion. Consider moving all imports to the top of the file after the module docstring.


<details>
<summary>Reorganize imports</summary>

```diff
 from __future__ import annotations

 import copy
 from typing import Any

+from .formats import FORMAT_REGISTRY, get_format, get_kv_format
+from .presets import get_preset
+from .models import (
+    AlgorithmConfig,
+    AutoQuantizeSection,
+    OverrideEntry,
+    QuantizationSection,
+    QuantizerSpec,
+    RecipeConfig,
+)
+
 # Default disabled quantizer patterns — inlined from
 # modelopt.torch.quantization.config._default_disabled_quantizer_cfg
 # to avoid torch dependency at import time.
 _DEFAULT_DISABLED_QUANTIZER_CFG: dict[str, Any] = {
     ...
 }


 def _update_quant_cfg_with_kv_cache(
     ...
 ):
     ...

-from .formats import FORMAT_REGISTRY, get_format, get_kv_format
-from .presets import get_preset
-from .models import (
-    AlgorithmConfig,
-    AutoQuantizeSection,
-    OverrideEntry,
-    QuantizationSection,
-    QuantizerSpec,
-    RecipeConfig,
-)
```
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/schema/resolver.py` around lines 34 - 57, The
module-level imports (FORMAT_REGISTRY, get_format, get_kv_format from .formats;
get_preset from .presets; and AlgorithmConfig, AutoQuantizeSection,
OverrideEntry, QuantizationSection, QuantizerSpec, RecipeConfig from .models)
should be moved to the top of the file immediately after the module docstring
and before the function _update_quant_cfg_with_kv_cache; update the file so all
imports live together at the top in that order, keeping the helper function
_update_quant_cfg_with_kv_cache defined below them and ensuring no other code is
split by late imports.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @modelopt/torch/recipes/cli.py:

  • Line 1: This module currently starts with a module docstring instead of the
    required NVIDIA Apache 2.0 license header; add the standard NVIDIA Apache 2.0
    header block at the very top of the file (above the existing triple-quoted
    docstring) and ensure the same header is added to all other new Python/C++/CUDA
    files introduced in this PR; locate the insertion point at the module-level
    docstring in cli.py (top of file) and prepend the exact SPDX/Apache 2.0 header
    used across the codebase.
  • Around line 9-13: Add a console-script entry so the CLI function main() in
    modelopt.torch.recipes.cli is exposed as the installable command; update
    packaging metadata (e.g., pyproject.toml) to include an entry under
    [project.scripts] like modelopt-recipes = "modelopt.torch.recipes.cli:main" (or
    add the equivalent console_scripts mapping in setup.cfg/setup.py) so the
    advertised modelopt-recipes command points to the main() function.

In @modelopt/torch/recipes/experiment/__init__.py:

  • Around line 1-9: Add the standard NVIDIA Apache 2.0 license header to the top
    of this new Python module so it complies with project guidelines; insert the
    header comment block above the module docstring in
    modelopt/torch/recipes/experiment/__init__.py (before the triple-quoted
    description) and ensure it remains a valid Python comment, then keep the
    existing imports and __all__ unchanged (SweepConfig, SweepController, SweepJob).

In @modelopt/torch/recipes/pipeline.py:

  • Around line 50-57: The PipelinePlan currently only stores
    steps/export/metadata and drops the top-level recipe.model block; add a model
    field to the PipelinePlan dataclass (e.g., model: dict[str, Any] | None) and
    ensure load_and_plan() populates this field from recipe.model (including
    recipe.model.path and flags like trust_remote_code and attn_implementation) when
    constructing the plan and when returning the plan as an artifact so consumers
    can reconstruct the full workflow; update any code that builds the plan around
    PipelinePlan(...) (lines building the plan and the return at load_and_plan()) to
    copy the model block intact into the new model field.
  • Around line 156-177: The planners (_plan_pruning and similarly
    _plan_distillation) are discarding any technique-specific fields by manually
    recreating a small config dict; instead serialize the entire validated section
    and preserve all fields. Fix by calling section.model_dump() into a single dict
    (e.g. full = section.model_dump()), then build config from that dict (e.g.
    config = {k:v for k,v in full.items() if k not in ("calibration","training")})
    and extract calibration/training by popping or reading those keys and leaving
    the rest intact; if any field names must be normalized, rename only those
    specific keys before constructing the
    PipelineStep(technique="pruning"/"distillation",
    api_call=TECHNIQUE_API["pruning"/"distillation"], config=config,
    calibration=calib, training=training).
  • Around line 1-11: This new module is missing the required NVIDIA Apache 2.0
    license header; add the standard NVIDIA Apache 2.0 header block at the very top
    of modelopt/torch/recipes/pipeline.py immediately before the module docstring so
    the file begins with the license header followed by the existing triple-quoted
    docstring and code (do not alter the docstring text). Ensure the header exactly
    matches the repository's canonical NVIDIA Apache 2.0 header used in other
    Python/C++/CUDA files.

In @modelopt/torch/recipes/schema/_bundled_presets.py:

  • Around line 1-10: The generated file _bundled_presets.py is missing the
    required NVIDIA Apache 2.0 license header; update the file so the standard
    NVIDIA Apache 2.0 header (the same header used across the repo) is prepended to
    the top of modelopt/torch/recipes/schema/_bundled_presets.py before the module
    docstring; also update the generator (scripts/sync_presets.py) so future
    regenerations emit that header automatically to avoid repeated violations.

In @modelopt/torch/recipes/schema/formats.py:

  • Around line 1-12: This file lacks the required NVIDIA Apache 2.0 source
    header; add the standard NVIDIA Apache 2.0 license header block at the very top
    of modelopt/torch/recipes/schema/formats.py (before the module docstring and
    before any imports such as "from future import annotations"), using the
    project's canonical header text (including copyright year/owner and SPDX
    identifier) so the file begins with the full license comment followed by the
    existing module docstring and imports.
  • Around line 42-78: FORMAT_REGISTRY is missing an entry for "mxint8", causing
    format lookups to fail; add a "mxint8" key to the FORMAT_REGISTRY dict (the same
    structure as the existing "mxfp8"/"mxfp6"/"mxfp4" entries) under
    modelopt/torch/recipes/schema/formats.py with "weight" and "activation" subkeys
    and settings that match the mxint8 quantizer defaults defined in
    modelopt/torch/recipes/schema/_bundled_presets.py (Lines ~390-409) — ensure the
    num_bits, block_sizes (e.g., dynamic/-1 and scale_bits), and enable flags mirror
    those bundled defaults so format: mxint8 resolves correctly.

In @modelopt/torch/recipes/schema/models.py:

  • Around line 68-93: The validator in QuantizationSection (class
    QuantizationSection) currently only checks preset vs custom but doesn't enforce
    that mode="qat" must include a TrainingConfig; update validate_preset_or_custom
    (or add an additional @model_validator with mode="after") to raise a ValueError
    when self.mode == "qat" and self.training is None/empty, instructing the user to
    provide a training config for QAT; keep the existing preset vs
    weights/activations check intact and return self when valid.
  • Around line 1-6: This new module (modelopt/torch/recipes/schema/models.py) is
    missing the required NVIDIA Apache 2.0 license header; add the standard
    multi-line NVIDIA Apache 2.0 header immediately at the top of the file before
    the existing module docstring so the header precedes the triple-quoted string,
    ensuring the file begins with the exact required license text used across the
    repo.
  • Around line 54-66: The OverrideEntry model allows entries with neither
    selector set (both pattern and module_class optional), so add a Pydantic
    validator on OverrideEntry (e.g., a @root_validator or equivalent) that enforces
    at least one selector is provided: if both pattern and module_class are falsy,
    raise a ValidationError/ValueError with a clear message; update the error text
    to mention OverrideEntry, pattern, and module_class so invalid entries like "-
    format: fp8" fail validation and downstream behavior is prevented.

In @modelopt/torch/recipes/schema/presets.py:

  • Around line 1-18: The file module-level header in presets.py is missing the
    required NVIDIA Apache 2.0 license header; add the standard NVIDIA Apache 2.0
    source header block (including copyright line, SPDX identifier, and full license
    notice per project guidelines) at the very top of
    modelopt.torch.recipes.schema.presets (before any imports or code) so the file
    contains the canonical NVIDIA header used across the repo.
  • Around line 29-118: The current try/except masks missing-symbol bugs because
    it tries to directly import many names (including NVFP4_FP8_MHA_CONFIG which
    doesn't exist) and catches ImportError broadly; either correct the symbol to the
    correct name (likely NVFP4_FP8_MHA_CFG) in the import list, or change the
    pattern to "import modelopt.torch.quantization.config as config" then read
    attributes (config.NVFP4_FP8_MHA_CONFIG) so module import failures are caught
    but missing attributes raise AttributeError instead of silently falling back;
    update references around _PRESET_REGISTRY accordingly so missing symbols fail
    fast.

In @tests/unit/torch/recipes/conftest.py:

  • Around line 1-5: Prepend the repository's required NVIDIA Apache 2.0 license
    header to this new Python module by inserting the full license comment block
    above the existing module docstring (the top triple-quoted string in
    conftest.py). Ensure the header matches the canonical NVIDIA Apache 2.0 template
    used across the repo (including copyright line, license URL, and SPDX
    identifier) and that no other content moves above it.
  • Around line 23-29: In pytest_collection_modifyitems, stop importing the deep
    path and instead import the top-level package (use "import modelopt") and catch
    ModuleNotFoundError; if the caught exception's name (exc.name) does not match
    the expected package name (e.g., "modelopt" or the actual distribution name like
    "nvidia-modelopt") re-raise the exception so real submodule ImportErrors aren't
    masked; only when the missing-package name matches should you set skip =
    pytest.mark.skip(reason="nvidia-modelopt not installed") and return to skip
    tests.

Minor comments:
In @examples/recipes/experiments/experiment.yaml:

  • Around line 37-39: The model_overrides mapping is using the key
    "codellama/CodeLlama-34b-Instruct" which doesn't match any sweep models (the
    sweep contains "Qwen/Qwen3-8B" and "meta-llama/Llama-3.1-8B-Instruct"), so the
    per-model override never runs; update the model_overrides keys to match a sweep
    model (e.g., replace "codellama/CodeLlama-34b-Instruct" with "Qwen/Qwen3-8B" or
    "meta-llama/Llama-3.1-8B-Instruct"), or add additional entries that exactly
    match your intended sweep model names so the overrides apply.
  • Around line 6-8: The usage example still references the old command and path
    string "modelopt-recipes dry-run --config examples/sweeps/experiment.yaml";
    update that line to the new CLI and location by replacing the old command and
    path with the current one (e.g., "modelopt-recipes dry-run --config
    examples/recipes/experiments/experiment.yaml") so the example points to the
    correct CLI invocation and the examples/recipes path; search for the exact string
    "modelopt-recipes dry-run --config examples/sweeps/experiment.yaml" in the file
    and update it.

In @examples/recipes/experiments/sweep_demo.yaml:

  • Around line 6-8: Update the usage example to show the new recipe CLI
    invocation instead of the old demo script: replace the old command with the
    recipe CLI entrypoint (use the repository's CLI/entrypoint name) invoking the
    same sweep_demo.yaml config and include the --dry-run flag so
    the example runs against the new examples/recipes layout; ensure the text shows
    the correct CLI pattern (e.g., "<entrypoint> run --config sweep_demo.yaml
    --dry-run") so readers can adapt it to the repo's actual entrypoint.

In @examples/recipes/multi_technique/distill_fp8.yaml:

  • Around line 3-5: Update the header comment so the stage order matches the
    pipeline: replace the contradictory first line "Distill from a larger teacher
    model, then quantize the student to FP8." with a line that reflects the actual
    execution order used later, e.g. "Quantize the student to FP8, then distill
    (fine-tune with teacher guidance)." Adjust the wording around the existing
    "Execution order: quantization → distillation (fine-tune with teacher guidance)"
    phrase to ensure both lines consistently state quantization → distillation;
    locate the lines by searching for the exact header text "Distill from a larger
    teacher model, then quantize the student to FP8." and "Execution order:
    quantization → distillation (fine-tune with teacher guidance)" and make the
    first line mirror the execution order.

In @examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml:

  • Around line 28-32: The override pattern currently disables "layers.0" and
    "layers.21" which mismatches the recipe name/description (32-layer model) and
    the note referencing "layers.31"; update the second override pattern from
    "layers.21" to "layers.31" so the recipe actually skips the first and last
    layers, or if intended to skip layer 21 instead, update the file
    name/description and any references (e.g., the "layers.31" note) to reflect a
    non-32-layer example.

In @examples/recipes/qat/qat_int8.yaml:

  • Around line 10-16: The recipe metadata claims AWQ but the quantization config
    uses SmoothQuant (quantization.preset: int8_sq and
    quantization.algorithm.method: smoothquant); either update the description and
    tags to reference SmoothQuant (e.g., description: "INT8 quantization-aware
    training with SmoothQuant initialization" and tags: [qat, int8, smoothquant]) or
    change the quantization block to AWQ-compatible values (replace preset and
    algorithm.method with the appropriate AWQ preset/algorithm names) so the
    metadata and quantization settings match.

In @examples/recipes/README.md:

  • Around line 84-86: The fenced code block containing the line "pruning →
    sparsity → quantization → distillation" in README.md needs a language tag to
    satisfy markdownlint; change that fence from a bare one to "text" so it reads
    as a text code fence (i.e., add the language identifier "text" to the existing
    triple-backtick fence that wraps the pruning → ... line).
  • Around line 90-99: The example calls mtq.quantize but never imports mtq; add
    an import for modelopt.torch.quantization (e.g. import
    modelopt.torch.quantization as mtq) at the top of the snippet before using
    load_recipe so mtq.quantize is defined; update the example near load_recipe and
    the use of mtq.quantize/forward_loop=calibrate accordingly.

In @modelopt/torch/recipes/__init__.py:

  • Around line 1-16: Add the required NVIDIA Apache 2.0 license header to the top
    of this Python module (modelopt.torch.recipes __init__.py) so the file begins
    with the standard multi-line copyright and Apache 2.0 license block; keep the
    existing module docstring and usage examples after the header, and ensure the
    header appears before any code or docstring (so symbols like load_recipe and the
    module-level docstring remain unchanged and located after the license block).

In @modelopt/torch/recipes/bridge.py:

  • Around line 1-5: The file is missing the required NVIDIA Apache 2.0 license
    header; add the project's standard Apache-2.0/NVIDIA license block at the top of
    bridge.py (above the module docstring) so all new Python files under modelopt
    include the license; ensure the header matches the existing project license
    text/style (including copyright holder and SPDX identifier) used elsewhere in
    the repo.
  • Around line 55-59: The parameter recipe_path on summarize_recipe is declared
    but unused; add it to the returned summary dict so callers retain context:
    update summarize_recipe (and its docstring) to include a "recipe_path" key in
    the returned dict alongside preset, algorithm, kv_cache, overrides_count, and
    type, populating it with the incoming recipe_path value; ensure any code that
    relies on those keys continues to work and adjust tests/types if needed.

In @modelopt/torch/recipes/cli.py:

  • Around line 47-63: Wrap the entire "dry-run" branch in a try/except that
    mirrors the other subcommands: catch exceptions from SweepConfig.from_yaml,
    SweepController(config), controller.dry_run(verbose=...), and
    controller.export_jobs(...) and print a clear error message (including the
    exception text) then sys.exit(1); keep existing validation behavior (call
    config.validate() first), and ensure export_jobs is also guarded by its own
    try/except so failures when writing to args.output are reported and cause exit
    instead of leaking a raw traceback.

In @modelopt/torch/recipes/experiment/config.py:

  • Around line 1-5: The file is missing the required NVIDIA Apache 2.0 license
    header; add the standard multi-line NVIDIA Apache 2.0 license header at the very
    top of modelopt/torch/recipes/experiment/config.py before the existing module
    docstring (the triple-quoted string starting "Sweep configuration schema."),
    ensuring the header exactly matches the project's canonical license text and
    includes copyright year and holder.

In @modelopt/torch/recipes/experiment/controller.py:

  • Around line 1-5: This file is missing the required NVIDIA Apache 2.0 license
    header; prepend the standard NVIDIA Apache 2.0 license block to the top of the
    module (above the existing module docstring in controller.py / the "Sweep
    Controller" module) so all new Python files under modelopt include the license
    header as per guidelines; ensure the header is the full required text and that
    no other file content is moved below it.

In @modelopt/torch/recipes/schema/__init__.py:

  • Around line 1-16: This file is missing the required NVIDIA Apache 2.0 license
    header; add the standard license comment block at the very top of the module
    (before the module docstring) so the file that re-exports RecipeConfig,
    FORMAT_REGISTRY, KV_FORMAT_REGISTRY, get_preset, get_preset_source,
    list_presets, and resolve_recipe contains the NVIDIA Apache 2.0 header per
    project guidelines.

In @modelopt/torch/recipes/schema/resolver.py:

  • Around line 1-6: This file is missing the required NVIDIA Apache 2.0 license
    header; add the standard multi-line NVIDIA Apache 2.0 license block at the very
    top of modelopt/torch/recipes/schema/resolver.py (above the module docstring) so
    the file complies with the repository's licensing guidelines for all new Python
    files under modelopt; ensure the license text matches the project's canonical
    header exactly.

Nitpick comments:
In @modelopt/torch/recipes/experiment/config.py:

  • Around line 121-124: The docstring for the total_jobs property should
    explicitly mention the zero-jobs edge case: when any of models, recipes, or
    launchers is empty total_jobs will be 0, which can cause downstream divisions by
    total_jobs; note that validate() is responsible for catching/raising on empty
    lists before execution. Update the total_jobs docstring (in the total_jobs
    property) to include this behavior and a brief note pointing callers to
    validate() to ensure non-zero job counts.

In @modelopt/torch/recipes/experiment/controller.py:

  • Around line 80-86: The job_id is incremented before validating the recipe,
    causing gaps when a recipe is skipped; in the loop that iterates over
    itertools.product(self.config.models, self.config.recipes,
    self.config.launchers) (variables model, recipe_path, launcher) move the job_id
    increment so it happens only for valid jobs (i.e., after checking recipe_path
    against self._validation_errors) or increment conditionally inside the branch
    that proceeds with the job; update any "Job X/Y" calculations to rely on this
    adjusted job_id so dry-run outputs remain sequential.
  • Around line 108-145: The dry_run method currently calls validate_recipes() and
    later calls generate_jobs(), which itself calls validate_recipes() when
    _validated_recipes is empty; remove the redundant explicit validate_recipes()
    invocation in dry_run (or alternatively set/ensure _validated_recipes is
    populated once before calling generate_jobs) and add a short comment near
    dry_run explaining that generate_jobs() will validate recipes lazily using
    _validated_recipes to avoid duplicate work; refer to the dry_run, generate_jobs,
    validate_recipes methods and the _validated_recipes cache when making the
    change.

In @modelopt/torch/recipes/schema/formats.py:

  • Around line 135-148: get_format and get_kv_format currently return references
    into the shared FORMAT_REGISTRY and KV_FORMAT_REGISTRY so callers who
    merge/update will mutate global defaults; change both functions to return copies
    (deep copies if values are nested dicts) of the registry entries instead of the
    original dict objects (i.e., copy the dict returned from FORMAT_REGISTRY[name]
    and KV_FORMAT_REGISTRY[name]) so callers can modify the result without affecting
    the registries.

In @modelopt/torch/recipes/schema/resolver.py:

  • Around line 34-57: The module-level imports (FORMAT_REGISTRY, get_format,
    get_kv_format from .formats; get_preset from .presets; and AlgorithmConfig,
    AutoQuantizeSection, OverrideEntry, QuantizationSection, QuantizerSpec,
    RecipeConfig from .models) should be moved to the top of the file immediately
    after the module docstring and before the function
    _update_quant_cfg_with_kv_cache; update the file so all imports live together at
    the top in that order, keeping the helper function
    _update_quant_cfg_with_kv_cache defined below them and ensuring no other code is
    split by late imports.

In @tests/unit/torch/recipes/test_pipeline.py:

  • Around line 152-161: The test test_load_and_plan_all_examples assumes
    examples/recipes exists; add a guard that checks the Path("examples/recipes")
    (recipes_dir) exists and is a directory and that recipes_dir.rglob("*.yaml")
    yields at least one file before iterating; if the directory is missing or no
    recipe files are found, call pytest.skip with a clear message so load_and_plan
    is not invoked on a non-existent/empty dataset. Use the existing recipes_dir and
    load_and_plan identifiers to locate where to add the checks.
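One way to express the guard is a small helper that the test calls before iterating; `recipe_files_or_skip` is a hypothetical name introduced here, not part of the test file:

```python
from pathlib import Path

import pytest


def recipe_files_or_skip(recipes_dir: Path) -> list[Path]:
    """Return recipe YAMLs, or skip the calling test when the dataset is absent."""
    if not recipes_dir.is_dir():
        pytest.skip(f"{recipes_dir} does not exist")
    files = sorted(recipes_dir.rglob("*.yaml"))
    if not files:
        pytest.skip(f"no *.yaml recipes under {recipes_dir}")
    return files
```

The test body then becomes `for path in recipe_files_or_skip(Path("examples/recipes")): load_and_plan(path)`, so a missing or empty examples tree produces a clear skip instead of a confusing failure.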

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `a654782e-65b6-4e46-be5b-5e37b34ba760`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between a56b6f356b4017847f9ef9850cf81bd0203ce5a3 and 71a0675a9c7365cfce6f13c1f54dc1b73a887f3c.

</details>

<details>
<summary>📒 Files selected for processing (35)</summary>

* `examples/recipes/README.md`
* `examples/recipes/auto/auto_quantize.yaml`
* `examples/recipes/experiments/experiment.yaml`
* `examples/recipes/experiments/sweep_demo.yaml`
* `examples/recipes/multi_technique/distill_fp8.yaml`
* `examples/recipes/multi_technique/prune_gradnas_fp8.yaml`
* `examples/recipes/multi_technique/sparse_quantize.yaml`
* `examples/recipes/ptq/custom_w4a8_overrides.yaml`
* `examples/recipes/ptq/ptq_fp8.yaml`
* `examples/recipes/ptq/ptq_nvfp4_awq.yaml`
* `examples/recipes/ptq/ptq_nvfp4_local_hessian.yaml`
* `examples/recipes/ptq/ptq_nvfp4_local_hessian_overrides.yaml`
* `examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml`
* `examples/recipes/qat/qat_int8.yaml`
* `modelopt/torch/recipes/__init__.py`
* `modelopt/torch/recipes/bridge.py`
* `modelopt/torch/recipes/cli.py`
* `modelopt/torch/recipes/experiment/__init__.py`
* `modelopt/torch/recipes/experiment/config.py`
* `modelopt/torch/recipes/experiment/controller.py`
* `modelopt/torch/recipes/pipeline.py`
* `modelopt/torch/recipes/schema/__init__.py`
* `modelopt/torch/recipes/schema/_bundled_presets.py`
* `modelopt/torch/recipes/schema/formats.py`
* `modelopt/torch/recipes/schema/models.py`
* `modelopt/torch/recipes/schema/presets.py`
* `modelopt/torch/recipes/schema/resolver.py`
* `tests/unit/torch/recipes/__init__.py`
* `tests/unit/torch/recipes/conftest.py`
* `tests/unit/torch/recipes/test_bridge.py`
* `tests/unit/torch/recipes/test_load_recipe.py`
* `tests/unit/torch/recipes/test_pipeline.py`
* `tests/unit/torch/recipes/test_schema.py`
* `tests/unit/torch/recipes/test_sweep_config.py`
* `tests/unit/torch/recipes/test_sweep_controller.py`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Migrate recipe system from nvidia-modelopt-recipes standalone package
into modelopt.torch.recipes. Provides YAML-based recipe configs for
model optimization with preset resolution, multi-technique pipeline
planning, and experiment orchestration (sweep + eval).

Structure:
- modelopt/torch/recipes/schema/  — Pydantic models, resolver, presets
- modelopt/torch/recipes/experiment/ — sweep config, job matrix controller
- modelopt/torch/recipes/pipeline.py — multi-technique execution planner
- modelopt/torch/recipes/bridge.py — recipe-to-hf_ptq translation
- modelopt/torch/recipes/cli.py — CLI entry point
- examples/recipes/ — 11 recipes (ptq/qat/multi_technique/auto) + experiments
- tests/unit/torch/recipes/ — 34 tests, all passing

No existing files modified. Zero breakage to existing tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
sungsooha and others added 3 commits March 9, 2026 17:43
QAT mode (quantize → fine-tune) needs training parameters. Reject
incomplete QAT recipes at schema validation time rather than producing
pipeline steps with no training config.

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Import the config module first, then read attributes from it. This way
ModuleNotFoundError (package not installed) falls back to bundled
presets, while AttributeError (typo or renamed symbol) fails fast
instead of silently degrading.

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Use ModuleNotFoundError instead of ImportError so genuine import
regressions in modelopt submodules surface as errors rather than
silently skipping tests.

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

♻️ Duplicate comments (2)
tests/unit/torch/recipes/conftest.py (1)

38-48: ⚠️ Potential issue | 🟠 Major

Don't mask real import regressions when deciding test skips.

Line 40 probes a deep submodule and Lines 43-44 swallow any ImportError, so a broken modelopt.torch.quantization import gets converted into skipped requires_modelopt tests instead of a failing collection. Probe the top-level package only and only suppress ModuleNotFoundError for the missing package itself.

📎 Suggested fix
 def pytest_collection_modifyitems(config, items):
     try:
-        import modelopt.torch.quantization.config  # noqa: F401
+        import modelopt  # noqa: F401
 
         return  # modelopt available, run all tests
-    except ImportError:
-        pass
+    except ModuleNotFoundError as exc:
+        if exc.name != "modelopt":
+            raise
     skip = pytest.mark.skip(reason="nvidia-modelopt not installed")
     for item in items:
         if "requires_modelopt" in item.keywords:
             item.add_marker(skip)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/torch/recipes/conftest.py` around lines 38 - 48, The test
collection is masking real import errors by importing a deep submodule and
catching all ImportError; update pytest_collection_modifyitems to probe only the
top-level package (import modelopt) and change the exception handling to only
suppress a ModuleNotFoundError for the top-level package (re-raise any other
ImportError/ModuleNotFoundError or those for submodules) so that genuine
regressions in modelopt.torch.quantization cause test failures; keep the skip
marker logic for items with the "requires_modelopt" keyword.
modelopt/torch/recipes/schema/formats.py (1)

81-93: ⚠️ Potential issue | 🟠 Major

Add mxint8 to FORMAT_REGISTRY.

The format is missing from the registry and will cause runtime failures when recipes reference format: mxint8. The configuration should match the other MX format patterns with weight/activation blocks.

Proposed fix
     "mxfp4": {
         "weight": {
             "num_bits": (2, 1),
             "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
             "enable": True,
         },
         "activation": {
             "num_bits": (2, 1),
             "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
             "enable": True,
         },
     },
+    "mxint8": {
+        "weight": {
+            "num_bits": 8,
+            "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
+            "enable": True,
+        },
+        "activation": {
+            "num_bits": 8,
+            "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
+            "enable": True,
+        },
+    },
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/schema/formats.py` around lines 81 - 93,
FORMAT_REGISTRY is missing an entry for "mxint8" which causes failures when
recipes use format: mxint8; add a new dictionary key "mxint8" to FORMAT_REGISTRY
mirroring the existing "mxfp4" structure with both "weight" and "activation"
entries, each containing "num_bits", "block_sizes", and "enable" fields (use the
same shapes/patterns as "mxfp4": num_bits tuples, block_sizes with -1:32,
"type":"dynamic", "scale_bits":(8,0), and enable=True) so recipes referencing
mxint8 resolve correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/recipes/experiments/experiment.yaml`:
- Around line 37-40: The per-model override under model_overrides uses the key
"codellama/CodeLlama-34b-Instruct" but the sweep only iterates Qwen/Qwen3-8B and
meta-llama/Llama-3.1-8B-Instruct, so the override never applies; update the
sweep to include "codellama/CodeLlama-34b-Instruct" or change the
model_overrides key to match one of the swept models (e.g., set the override key
to "Qwen/Qwen3-8B" or "meta-llama/Llama-3.1-8B-Instruct") so the per-model tasks
(simple_evals.humaneval, simple_evals.livecodebench_v5) actually take effect for
a swept model.
- Around line 6-8: The usage comment at the top of the file points to the wrong
example path; update the comment string in
examples/recipes/experiments/experiment.yaml so the suggested command uses the
actual file location (replace examples/sweeps/experiment.yaml with
examples/recipes/experiments/experiment.yaml) in the usage text near the top of
the file.

In `@examples/recipes/experiments/sweep_demo.yaml`:
- Around line 6-8: Update the usage example line to call the new CLI entrypoint
and correct path: replace the old command that references
scripts/demo_big_pareto_sweep.py and examples/sweeps/... with the new CLI
invocation (modelopt.torch.recipes.cli) pointing to
examples/recipes/experiments/sweep_demo.yaml and include the same --config and
--dry-run flags so the example becomes something like: python -m
modelopt.torch.recipes.cli --config examples/recipes/experiments/sweep_demo.yaml
--dry-run.

In `@examples/recipes/README.md`:
- Around line 90-99: The example uses load_recipe and then calls mtq.quantize
but never defines or imports mtq or calibrate, causing NameError; update the
snippet to import the quantization module (e.g., add an import for
modelopt.torch.quantization as mtq) and either (a) document that model and
calibrate are application-provided placeholders or (b) provide minimal
placeholder definitions for model and calibrate in the example so it runs;
ensure the example still uses load_recipe and the quantize(...) call
(referencing load_recipe, mtq.quantize, and calibrate) so readers can copy-paste
a runnable example.

In `@modelopt/torch/recipes/experiment/controller.py`:
- Around line 106-123: The SweepJob payload drops auto-quantize search config
because recipe_to_hf_ptq_args(resolved) only handles
resolved["quantize_config"]; detect when resolved contains auto_quantize_kwargs
and merge or replace the hf_ptq payload with those args plus pyt_ckpt_path and
the existing calibration/export fields. Update the logic around
recipe_to_hf_ptq_args and the hf_ptq assignment in the SweepJob creation
(reference recipe_to_hf_ptq_args, resolved, auto_quantize_kwargs, and the
SweepJob ptq_config construction) so that when resolved has auto_quantize_kwargs
the job.ptq_config.hf_ptq includes the auto-quantize search config rather than
being incomplete.
- Line 31: The module-level import of PyYAML (`import yaml`) causes a hard
dependency and will raise ModuleNotFoundError for users who didn't install
PyYAML; update SweepController usage to avoid this by either (A) adding PyYAML
to the project's core dependencies in pyproject.toml, or (B) gate the recipes
feature behind an optional extra (e.g., [recipes]) and move the `import yaml`
into the specific functions or methods that need it within the SweepController
implementation so the module can be imported without PyYAML installed; choose
one approach and apply it consistently (if gating, ensure any public API that
parses YAML lazily imports `yaml` and raises a clear error directing users to
install the optional extra).

In `@modelopt/torch/recipes/schema/resolver.py`:
- Around line 202-232: In _apply_override, pattern-based overrides currently
ignore override.weights/override.activations and module_class overrides
overwrite rather than merge existing entries; fix by (1) when override.pattern
is set, start from a deepcopy of quant_cfg.get(override.pattern, {}) (like you
do elsewhere) and if override.weights/override.activations are present, insert
"*weight_quantizer" and "*input_quantizer" using _resolve_quantizer_spec(…,
"weight"/"activation") into the entry instead of ignoring them, then merge other
fields as you already do and assign back to quant_cfg[override.pattern]; (2)
when override.module_class is set, begin from a deepcopy of
quant_cfg.get(override.module_class, {}) instead of creating a fresh mc_cfg so
existing entries (e.g. {"*": {"enable": False}}) are preserved, then merge or
update keys ("*weight_quantizer", "*input_quantizer", and "*" enable-only case)
rather than replacing the whole mc_cfg before assigning back to
quant_cfg[override.module_class].
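The merge-instead-of-replace behavior both points ask for boils down to starting from a deep copy of any existing entry. A minimal sketch (the helper name and the simplified flat merge are illustrative; the real `_apply_override` also resolves quantizer specs):

```python
import copy


def apply_override(quant_cfg: dict, key: str, new_fields: dict) -> None:
    """Merge an override into quant_cfg[key] instead of replacing the entry.

    Starting from a deep copy preserves prior settings such as
    {"enable": False} while layering on the new quantizer fields.
    """
    entry = copy.deepcopy(quant_cfg.get(key, {}))
    entry.update(new_fields)
    quant_cfg[key] = entry
```

With this shape, an existing `{"*": {"enable": False}}` entry for a module class survives a later override that only adds a `"*weight_quantizer"` block.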
- Around line 235-249: The resolver _resolve_auto_quantize currently builds
format_configs = [get_preset(fmt_entry.preset) for fmt_entry in section.formats]
but never applies AutoQuantizeSection.kv_cache, so kv_cache settings are
dropped; fix by propagating section.kv_cache into each emitted format config
(e.g., after creating format_configs or while mapping get_preset, set
format_config["kv_cache"] = section.kv_cache when section.kv_cache is present)
so that the returned kwargs["quantization_formats"] include the requested
kv_cache for every format; update _resolve_auto_quantize to check
section.kv_cache and mutate/merge it into each format_config before returning
kwargs.

---

Duplicate comments:
In `@modelopt/torch/recipes/schema/formats.py`:
- Around line 81-93: FORMAT_REGISTRY is missing an entry for "mxint8" which
causes failures when recipes use format: mxint8; add a new dictionary key
"mxint8" to FORMAT_REGISTRY mirroring the existing "mxfp4" structure with both
"weight" and "activation" entries, each containing "num_bits", "block_sizes",
and "enable" fields (use the same shapes/patterns as "mxfp4": num_bits tuples,
block_sizes with -1:32, "type":"dynamic", "scale_bits":(8,0), and enable=True)
so recipes referencing mxint8 resolve correctly.

In `@tests/unit/torch/recipes/conftest.py`:
- Around line 38-48: The test collection is masking real import errors by
importing a deep submodule and catching all ImportError; update
pytest_collection_modifyitems to probe only the top-level package (import
modelopt) and change the exception handling to only suppress a
ModuleNotFoundError for the top-level package (re-raise any other
ImportError/ModuleNotFoundError or those for submodules) so that genuine
regressions in modelopt.torch.quantization cause test failures; keep the skip
marker logic for items with the "requires_modelopt" keyword.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a1b92688-78a3-4506-a9d9-7e36c665dad1

📥 Commits

Reviewing files that changed from the base of the PR and between 71a0675 and 5ee6d3d.

📒 Files selected for processing (35)
  • examples/recipes/README.md
  • examples/recipes/auto/auto_quantize.yaml
  • examples/recipes/experiments/experiment.yaml
  • examples/recipes/experiments/sweep_demo.yaml
  • examples/recipes/multi_technique/distill_fp8.yaml
  • examples/recipes/multi_technique/prune_gradnas_fp8.yaml
  • examples/recipes/multi_technique/sparse_quantize.yaml
  • examples/recipes/ptq/custom_w4a8_overrides.yaml
  • examples/recipes/ptq/ptq_fp8.yaml
  • examples/recipes/ptq/ptq_nvfp4_awq.yaml
  • examples/recipes/ptq/ptq_nvfp4_local_hessian.yaml
  • examples/recipes/ptq/ptq_nvfp4_local_hessian_overrides.yaml
  • examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml
  • examples/recipes/qat/qat_int8.yaml
  • modelopt/torch/recipes/__init__.py
  • modelopt/torch/recipes/bridge.py
  • modelopt/torch/recipes/cli.py
  • modelopt/torch/recipes/experiment/__init__.py
  • modelopt/torch/recipes/experiment/config.py
  • modelopt/torch/recipes/experiment/controller.py
  • modelopt/torch/recipes/pipeline.py
  • modelopt/torch/recipes/schema/__init__.py
  • modelopt/torch/recipes/schema/_bundled_presets.py
  • modelopt/torch/recipes/schema/formats.py
  • modelopt/torch/recipes/schema/models.py
  • modelopt/torch/recipes/schema/presets.py
  • modelopt/torch/recipes/schema/resolver.py
  • tests/unit/torch/recipes/__init__.py
  • tests/unit/torch/recipes/conftest.py
  • tests/unit/torch/recipes/test_bridge.py
  • tests/unit/torch/recipes/test_load_recipe.py
  • tests/unit/torch/recipes/test_pipeline.py
  • tests/unit/torch/recipes/test_schema.py
  • tests/unit/torch/recipes/test_sweep_config.py
  • tests/unit/torch/recipes/test_sweep_controller.py
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/torch/recipes/__init__.py
🚧 Files skipped from review as they are similar to previous changes (16)
  • examples/recipes/qat/qat_int8.yaml
  • tests/unit/torch/recipes/test_sweep_controller.py
  • modelopt/torch/recipes/schema/models.py
  • examples/recipes/ptq/ptq_nvfp4_local_hessian_overrides.yaml
  • tests/unit/torch/recipes/test_pipeline.py
  • modelopt/torch/recipes/cli.py
  • modelopt/torch/recipes/pipeline.py
  • modelopt/torch/recipes/bridge.py
  • examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml
  • examples/recipes/ptq/ptq_fp8.yaml
  • examples/recipes/ptq/ptq_nvfp4_local_hessian.yaml
  • examples/recipes/multi_technique/prune_gradnas_fp8.yaml
  • examples/recipes/ptq/ptq_nvfp4_awq.yaml
  • tests/unit/torch/recipes/test_schema.py
  • tests/unit/torch/recipes/test_sweep_config.py
  • examples/recipes/multi_technique/distill_fp8.yaml

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
modelopt/torch/recipes/schema/models.py (1)

69-81: ⚠️ Potential issue | 🟠 Major

Add a selector validator to enforce that pattern or module_class is set.

An override entry like - format: fp8 currently passes validation even though it targets nothing. Without a selector, override application is undefined and will either silently no-op or behave unexpectedly downstream.

🛠️ Proposed fix
 class OverrideEntry(BaseModel):
     """Per-layer or per-module-class override."""

     pattern: str | None = None  # glob: "*lm_head*"
     module_class: str | None = None  # class: "nn.LayerNorm"
     enable: bool | None = None
     format: str | None = None
     scale_type: Literal["static", "dynamic"] | None = None  # shorthand for block_sizes.type
     weights: QuantizerSpec | None = None
     activations: QuantizerSpec | None = None
     num_bits: int | list[int] | None = None
     axis: int | None = None
+
+    @model_validator(mode="after")
+    def validate_selector(self):
+        """Ensure at least one selector is provided."""
+        if not self.pattern and not self.module_class:
+            raise ValueError(
+                "Override entry must specify either 'pattern' or 'module_class'."
+            )
+        return self
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/schema/models.py` around lines 69 - 81, Add a Pydantic
validator to OverrideEntry that enforces a selector by checking that at least
one of pattern or module_class is provided; inside the OverrideEntry class (the
model with fields pattern and module_class) implement a root validator (e.g.,
validate_selector) that raises a ValueError when both pattern and module_class
are None/empty so an override like "- format: fp8" cannot validate without a
selector.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@modelopt/torch/recipes/schema/models.py`:
- Around line 69-81: Add a Pydantic validator to OverrideEntry that enforces a
selector by checking that at least one of pattern or module_class is provided;
inside the OverrideEntry class (the model with fields pattern and module_class)
implement a root validator (e.g., validate_selector) that raises a ValueError
when both pattern and module_class are None/empty so an override like "- format:
fp8" cannot validate without a selector.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 13e1134a-96db-4dfe-ad48-e9c06f17e94c

📥 Commits

Reviewing files that changed from the base of the PR and between 5ee6d3d and 60bb0eb.

📒 Files selected for processing (4)
  • modelopt/torch/recipes/schema/models.py
  • modelopt/torch/recipes/schema/presets.py
  • tests/unit/torch/recipes/conftest.py
  • tests/unit/torch/recipes/test_schema.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unit/torch/recipes/conftest.py
  • tests/unit/torch/recipes/test_schema.py

- Add modelopt-recipes console script entry point in pyproject.toml
- Add pyyaml to core dependencies (used directly by recipe system)
- Preserve model config (path, trust_remote_code, attn_implementation)
  on PipelinePlan so consumers can reconstruct the full workflow
- Validate OverrideEntry requires pattern or module_class selector
- Validate QAT mode requires training config (previous commit)
- Fix override merging: pattern overrides now handle weights/activations
  fields; module_class overrides merge instead of replacing existing entries
- Fix auto_quantize resolver to apply kv_cache config to format candidates
- Preserve auto_quantize_kwargs in bridge output for sweep jobs
- Fix stale paths in experiment YAML usage comments
- Fix experiment.yaml model_override to match actual sweep models
- Add mtq import and placeholder note in README Python example

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
@sungsooha sungsooha requested a review from a team as a code owner March 10, 2026 00:55
Replace hand-picked field dicts with section.model_dump() via shared
_section_to_config() helper. This preserves technique-specific extra
fields that schema sections accept via model_config={"extra": "allow"}.

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
modelopt/torch/recipes/pipeline.py (2)

177-178: Inconsistent use of exclude_none in model_dump() call.

Other model_dump() calls (lines 154, 157, 190, 230, 234, 251) use exclude_none=True, but this one does not. This could result in None values appearing in the export config when other sections exclude them.

♻️ Proposed fix for consistency
     if recipe.export:
-        plan.export = recipe.export.model_dump()
+        plan.export = recipe.export.model_dump(exclude_none=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/pipeline.py` around lines 177 - 178, The export
assignment is missing the exclude_none flag on the model_dump call which makes
it inconsistent with other dumps; update the branch where recipe.export is used
so that plan.export = recipe.export.model_dump(exclude_none=True) (i.e., call
model_dump with exclude_none=True) to match the other model_dump usages and
prevent None values from appearing in the exported config.

301-303: Specify explicit encoding when opening the YAML file.

open(path) uses the system default encoding, which can vary across platforms. For YAML files (typically UTF-8), explicitly specifying encoding="utf-8" ensures consistent behavior.

♻️ Proposed fix
     path = Path(path)
-    with open(path) as f:
+    with open(path, encoding="utf-8") as f:
         raw = yaml.safe_load(f)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modelopt/torch/recipes/pipeline.py` around lines 301 - 303, The YAML file is
opened without an explicit encoding which can vary by platform; update the open
call to specify UTF-8 (either by using open(path, encoding="utf-8") or
Path(path).open(encoding="utf-8")) so that the code that calls
yaml.safe_load(raw) reads the file consistently; change the lines around
Path(path) and with open(path) to use the explicit encoding.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modelopt/torch/recipes/pipeline.py`:
- Around line 18-26: Docstring lists technique order as "sparsity → quantization
→ distillation" but the actual ordering in the module is defined by
TECHNIQUE_ORDER which begins with "pruning"; update the module docstring to
reflect the real order (include "pruning" first and match TECHNIQUE_ORDER) and
ensure references to RecipeConfig and PipelinePlan remain accurate so the
docstring and the code (TECHNIQUE_ORDER) stay consistent.

---

Nitpick comments:
In `@modelopt/torch/recipes/pipeline.py`:
- Around line 177-178: The export assignment is missing the exclude_none flag on
the model_dump call which makes it inconsistent with other dumps; update the
branch where recipe.export is used so that plan.export =
recipe.export.model_dump(exclude_none=True) (i.e., call model_dump with
exclude_none=True) to match the other model_dump usages and prevent None values
from appearing in the exported config.
- Around line 301-303: The YAML file is opened without an explicit encoding
which can vary by platform; update the open call to specify UTF-8 (either by
using open(path, encoding="utf-8") or Path(path).open(encoding="utf-8")) so that
the code that calls yaml.safe_load(raw) reads the file consistently; change the
lines around Path(path) and with open(path) to use the explicit encoding.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3f8af51a-1e48-4b68-951f-3714bb8836eb

📥 Commits

Reviewing files that changed from the base of the PR and between e8c983d and ace206a.

📒 Files selected for processing (1)
  • modelopt/torch/recipes/pipeline.py

codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 89.27681% with 43 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.41%. Comparing base (a56b6f3) to head (76c7614).
⚠️ Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
modelopt/torch/recipes/schema/formats.py 61.66% 23 Missing ⚠️
modelopt/torch/recipes/schema/resolver.py 89.39% 14 Missing ⚠️
modelopt/torch/recipes/__init__.py 58.33% 5 Missing ⚠️
modelopt/torch/recipes/utils.py 97.61% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1013      +/-   ##
==========================================
- Coverage   71.73%   70.41%   -1.33%     
==========================================
  Files         211      228      +17     
  Lines       23949    25860    +1911     
==========================================
+ Hits        17181    18209    +1028     
- Misses       6768     7651     +883     

sungsooha and others added 4 commits March 9, 2026 18:30
Adopt PR NVIDIA#1000's modelopt_recipes/ YAML fragment library as the
preferred preset source (Tier 1). The existing Python constant import
becomes Tier 2 (deprecated), and bundled snapshot becomes Tier 3.

New Tier 1 implements lightweight __base__ inheritance (no OmegaConf
dependency) that resolves composed recipe directories into the same
config dicts that Python constants produced.

Preset name → YAML path mapping covers all 30+ presets. Each preset
maps to a recipe directory (e.g., "fp8" → "general/ptq/fp8_default-fp8_kv")
containing model_quant.yml + kv_quant.yml with __base__ references to
atomic fragments in configs/ptq/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
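A lightweight `__base__` resolver of the kind this commit describes can be sketched in a few lines; `deep_merge` and `load_yaml_with_bases` are named after the public helpers later commits mention, but the bodies here are an assumed reconstruction, not the shipped code:

```python
from pathlib import Path

import yaml


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base, returning a new dict."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], value)
        else:
            out[key] = value
    return out


def load_yaml_with_bases(path: Path) -> dict:
    """Load a YAML file, recursively folding in its __base__ fragments."""
    data = yaml.safe_load(path.read_text(encoding="utf-8")) or {}
    bases = data.pop("__base__", [])
    if isinstance(bases, str):
        bases = [bases]
    merged: dict = {}
    for base_ref in bases:
        merged = deep_merge(merged, load_yaml_with_bases(path.parent / base_ref))
    return deep_merge(merged, data)
```

Each recipe fragment lists its parents under `__base__`; resolution walks the chain depth-first and lets the child's keys win, which is enough to compose `model_quant.yml` + `kv_quant.yml` directories without pulling in OmegaConf.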
- Fix docstring technique order to include pruning
- Use exclude_none=True consistently in model_dump() calls
- Specify explicit UTF-8 encoding for YAML file reads
- Add pragma: no cover for untestable tier-loading functions in presets.py
- Add comprehensive coverage tests for pipeline, resolver, bridge, formats, and config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Each source file now has a corresponding test file:
  pipeline.py      → test_pipeline.py
  bridge.py        → test_bridge.py
  schema/resolver.py → test_resolver.py (new)
  schema/presets.py  → test_presets.py (new, merged from test_yaml_presets.py + test_yaml_preset_e2e.py)
  schema/formats.py  → test_formats.py (new)
  experiment/config.py → test_sweep_config.py

Removed test_pipeline_coverage.py and test_yaml_presets.py — tests
redistributed to their proper homes. Total: 88 tests, all passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Test paths now match source paths:
  schema/models.py     → schema/test_models.py
  schema/resolver.py   → schema/test_resolver.py
  schema/presets.py    → schema/test_presets.py
  schema/formats.py    → schema/test_formats.py
  experiment/config.py → experiment/test_config.py
  experiment/controller.py → experiment/test_controller.py

Root-level tests (test_load_recipe.py, test_pipeline.py, test_bridge.py)
stay at the recipes/ test root matching their source location.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
sungsooha and others added 2 commits March 9, 2026 20:26
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
…t types, E2E tests

- Add forward-compatible load_config() imports (formats, presets, resolver)
  that try PR NVIDIA#1000's OmegaConf-based loader, fall back to inline values
- Align all num_bits/scale_bits to lists (not tuples) matching OmegaConf output
- Remove _bundled_presets.py (909 lines) — redundant with YAML + Python tiers
- Fix nn.* → torch.nn.* in disabled quantizer patterns (PR NVIDIA#1000 change)
- Add get_preset_info() API with recipe.yml metadata loading
- Add nvfp4_omlp_only preset, handle mamba_moe_fp8_aggressive as Tier-2-only
- Add E2E tests covering recipe→resolve→pipeline and sweep/experiment flows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
…nfig)

- Add recipes/utils.py with make_serializable() and try_import_load_config()
- Deduplicate _make_serializable from pipeline.py and controller.py
- Deduplicate load_config import boilerplate from formats.py, presets.py, resolver.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
sungsooha and others added 4 commits March 9, 2026 21:51
- Move test from test_pipeline.py to test_utils.py (test belongs with the function)
- Add tests for int key stringification, primitive passthrough, non-JSON types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
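Based on the behaviors the commit lists (int key stringification, primitive passthrough, non-JSON fallback), `make_serializable` plausibly looks like the following sketch; this is an assumed reconstruction, not the shipped utility:

```python
def make_serializable(obj):
    """Coerce config objects into JSON-friendly structures.

    Dict keys are stringified, tuples become lists, primitives pass
    through unchanged, and anything else falls back to str().
    """
    if isinstance(obj, dict):
        return {str(k): make_serializable(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [make_serializable(v) for v in obj]
    if isinstance(obj, (str, int, float, bool)) or obj is None:
        return obj
    return str(obj)  # fallback for non-JSON types
```

This matters for quantizer configs, which use tuple `num_bits` values and integer block-size keys (e.g. `{-1: 32}`) that `json.dumps` would otherwise reject or mangle.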
- Move deep_merge and load_yaml_with_bases to utils.py as public functions
- Make load_recipe_from_yaml and PRESET_YAML_MAP public in presets.py
- Replace private function tests (_resolve_block_sizes, _update_quant_cfg_with_kv_cache)
  with equivalent tests through resolve_recipe public API
- Remove redundant tests subsumed by E2E (test_load_recipe.py, duplicate controller test,
  duplicate pipeline all-examples test)
- Add happy-path tests for formats registry (core formats, weight/activation keys)
- Move deep_merge tests to test_utils.py alongside make_serializable tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Add dist.destroy_process_group() to init_process() so Gloo backend
C++ threads are properly torn down before process exit, preventing
"terminate called without an active exception" errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
This reverts commit 7d957f0.

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
…ML fixture tests

- Remove pipeline.py, bridge.py, cli.py, experiment/ (staged internally)
- Remove examples/recipes/ directory
- Remove CLI entry point from pyproject.toml
- Consolidate tests into 33 YAML fixture-driven test cases
- Add sparsity range validation to schema models
- 91 tests passing (67 parametrized fixtures + 24 unit tests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
@sungsooha sungsooha requested a review from ChenhanYu March 11, 2026 20:47