Add YAML recipe system for model optimization #1013
sungsooha wants to merge 18 commits into NVIDIA:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings. Use the following commands to manage reviews:
📝 Walkthrough

Adds a YAML-driven recipe system and Python tooling to author, validate, resolve, plan, and sweep multi-technique model optimization workflows (PTQ/QAT, sparsity, pruning, distillation, auto-quantize). Includes format/preset registries, a resolver, a pipeline planner, a sweep controller, a CLI, example recipes, and unit tests.
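The resolve step the walkthrough describes — look up a named preset, then layer per-pattern overrides on top — can be sketched minimally as below. This is a self-contained illustration only: the registry contents, preset name, and field names (`quant_cfg`, `overrides`, `enable`) are hypothetical stand-ins, not the PR's actual schema.

```python
import copy

# Illustrative stand-in for the PR's preset registry; the preset name and
# config shape here are hypothetical.
PRESET_REGISTRY = {
    "int8_sq": {
        "quant_cfg": {"*weight_quantizer": {"num_bits": 8}},
        "algorithm": "smoothquant",
    },
}


def resolve_recipe(recipe: dict) -> dict:
    """Resolve a parsed recipe dict into an mtq-style config dict: look up
    the named preset, deep-copy it (so the shared registry is never
    mutated), then apply per-pattern overrides on top."""
    quant = recipe["quantization"]
    resolved = copy.deepcopy(PRESET_REGISTRY[quant["preset"]])
    for override in quant.get("overrides", []):
        resolved["quant_cfg"][override["pattern"]] = {"enable": override["enable"]}
    return resolved


cfg = resolve_recipe({
    "quantization": {
        "preset": "int8_sq",
        "overrides": [{"pattern": "*layers.0*", "enable": False}],
    }
})
print(cfg["quant_cfg"]["*layers.0*"])  # {'enable': False}
```

Deep-copying before merging is what keeps a caller-side override from leaking into the global registry and affecting later resolutions.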
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Loader as Recipe Loader
    participant Validator as Schema Validator
    participant Resolver as Recipe Resolver
    participant Planner as Pipeline Planner
    participant Controller as Sweep Controller
    User->>Loader: load_recipe(yaml_path)
    Loader->>Loader: read YAML
    Loader->>Validator: RecipeConfig.model_validate(...)
    Validator-->>Loader: RecipeConfig
    Loader->>Resolver: resolve_recipe(RecipeConfig)
    Resolver-->>Loader: resolved mtq-compatible dict
    Loader-->>User: resolved dict
    User->>Controller: SweepConfig.from_yaml(sweep.yaml)
    Controller->>Loader: load_recipe(referenced_recipe)
    Loader->>Resolver: resolve_recipe(RecipeConfig)
    Controller->>Planner: plan_pipeline(RecipeConfig)
    Planner-->>Controller: PipelinePlan
    Controller-->>User: dry_run summary / exported jobs
```
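The controller half of the diagram above — expanding a sweep into concrete jobs over the Cartesian product of models × recipes × launchers, skipping recipes that failed validation — can be sketched as follows. Names mirror the PR's `SweepJob`/controller vocabulary, but this is an assumed minimal implementation, not the actual one.

```python
import itertools
from dataclasses import dataclass


@dataclass
class SweepJob:
    job_id: int
    model: str
    recipe: str
    launcher: str


def generate_jobs(models, recipes, launchers, invalid_recipes=frozenset()):
    """Expand the sweep's Cartesian product into jobs, numbering only the
    jobs that survive recipe validation so IDs stay sequential."""
    jobs = []
    for model, recipe, launcher in itertools.product(models, recipes, launchers):
        if recipe in invalid_recipes:
            continue  # skip recipes that failed validation
        jobs.append(SweepJob(job_id=len(jobs) + 1, model=model,
                             recipe=recipe, launcher=launcher))
    return jobs


jobs = generate_jobs(
    models=["Qwen/Qwen3-8B"],
    recipes=["ptq_fp8.yaml", "bad.yaml"],
    launchers=["local"],
    invalid_recipes={"bad.yaml"},
)
print(len(jobs))  # 1
```

Deriving `job_id` from `len(jobs) + 1` (rather than counting every product tuple) keeps a "Job X/Y" dry-run display gap-free when some recipes are invalid.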
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Force-pushed from 240c797 to 71a0675
Actionable comments posted: 16
Note: Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🟡 Minor comments (16)
<details>
<summary>examples/recipes/experiments/experiment.yaml-37-39 (1)</summary>

`37-39`: _⚠️ Potential issue_ | _🟡 Minor_

**This per-model override never matches any sweep model.**

`model_overrides` is keyed by `codellama/CodeLlama-34b-Instruct`, but the sweep only includes `Qwen/Qwen3-8B` and `meta-llama/Llama-3.1-8B-Instruct`. As written, this example never exercises per-model overrides.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/experiments/experiment.yaml` around lines 37-39: the `model_overrides` mapping uses the key `codellama/CodeLlama-34b-Instruct`, which doesn't match any sweep model (the sweep contains `Qwen/Qwen3-8B` and `meta-llama/Llama-3.1-8B-Instruct`), so the per-model override never runs. Update the `model_overrides` keys to match a sweep model, or add entries that exactly match the intended sweep model names so the overrides apply.

</details>
</details>

<details>
<summary>examples/recipes/experiments/experiment.yaml-6-8 (1)</summary>

`6-8`: _⚠️ Potential issue_ | _🟡 Minor_

**Update the usage example to the new CLI/path.**

This still points to the old command and the old `examples/sweeps/...` location, so users following it will hit a dead path.

<details>
<summary>Suggested fix</summary>

```diff
 # Usage:
-# modelopt-recipes dry-run --config examples/sweeps/experiment.yaml
+# python -m modelopt.torch.recipes.cli dry-run --config examples/recipes/experiments/experiment.yaml
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/experiments/experiment.yaml` around lines 6-8: the usage example still references the old command and path `modelopt-recipes dry-run --config examples/sweeps/experiment.yaml`; replace it with the current CLI invocation and the `examples/recipes` location so the example points to a working command.

</details>
</details>

<details>
<summary>examples/recipes/README.md-84-86 (1)</summary>

`84-86`: _⚠️ Potential issue_ | _🟡 Minor_

**Add a language to this fenced block.**

This block trips markdownlint as-is. Use something like `text` so the docs stay lint-clean.

<details>
<summary>Suggested fix</summary>

````diff
-```
+```text
 pruning → sparsity → quantization → distillation
 ```
````

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/README.md` around lines 84-86: the fenced code block containing "pruning → sparsity → quantization → distillation" needs a language tag to satisfy markdownlint; add the language identifier `text` to the existing triple-backtick fence that wraps that line.

</details>
</details>

<details>
<summary>examples/recipes/multi_technique/distill_fp8.yaml-3-5 (1)</summary>

`3-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Fix the stage-order description in the header.**

Line 3 says distillation happens before quantization, but this recipe and Line 5 define the opposite order. The comment should not contradict the pipeline.

<details>
<summary>Suggested fix</summary>

```diff
-# Distill from a larger teacher model, then quantize the student to FP8.
+# Quantize the student to FP8, then distill from a larger teacher model.
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/multi_technique/distill_fp8.yaml` around lines 3-5: update the header comment so the stage order matches the pipeline. Replace the contradictory first line "Distill from a larger teacher model, then quantize the student to FP8." with wording that reflects the actual execution order, e.g. "Quantize the student to FP8, then distill (fine-tune with teacher guidance)", so both it and the "Execution order: quantization → distillation (fine-tune with teacher guidance)" line consistently state quantization → distillation.

</details>
</details>

<details>
<summary>examples/recipes/experiments/sweep_demo.yaml-6-8 (1)</summary>

`6-8`: _⚠️ Potential issue_ | _🟡 Minor_

**Refresh this usage example to the recipe CLI.**

The command and path here still reference the old demo script layout, not the new recipe CLI/examples path introduced by this PR.

<details>
<summary>Suggested fix</summary>

```diff
 # Usage:
-# .venv/bin/python scripts/demo_big_pareto_sweep.py --config examples/sweeps/sweep_demo.yaml --dry-run
+# python -m modelopt.torch.recipes.cli dry-run --config examples/recipes/experiments/sweep_demo.yaml
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/experiments/sweep_demo.yaml` around lines 6-8: update the usage example to show the new recipe CLI invocation instead of the old demo script, invoking the same `sweep_demo.yaml` config with the dry-run behavior, so readers can adapt it to the repo's actual entrypoint.

</details>
</details>

<details>
<summary>examples/recipes/qat/qat_int8.yaml-10-16 (1)</summary>

`10-16`: _⚠️ Potential issue_ | _🟡 Minor_

**Align the metadata with the actual algorithm.**

The description/tags say AWQ, but the recipe is configured for SmoothQuant (`preset: int8_sq`, `algorithm: smoothquant`). That makes this example misleading in generated summaries and docs.

<details>
<summary>Suggested fix</summary>

```diff
 metadata:
   name: qat_int8
-  description: "INT8 quantization-aware training with AWQ initialization"
-  tags: [qat, int8, awq]
+  description: "INT8 quantization-aware training with SmoothQuant initialization"
+  tags: [qat, int8, smoothquant]
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/qat/qat_int8.yaml` around lines 10-16: the recipe metadata claims AWQ but the quantization config uses SmoothQuant (`quantization.preset: int8_sq` and `quantization.algorithm.method: smoothquant`); either update the description and tags to reference SmoothQuant, or change the quantization block to AWQ-compatible preset/algorithm values so the metadata and quantization settings match.

</details>
</details>

<details>
<summary>examples/recipes/README.md-90-99 (1)</summary>

`90-99`: _⚠️ Potential issue_ | _🟡 Minor_

**Import `mtq` in the Python example.**

Line 98 uses `mtq.quantize(...)`, but `mtq` is never imported, so this snippet raises `NameError` when copied verbatim.

<details>
<summary>Suggested fix</summary>

```diff
+import modelopt.torch.quantization as mtq
 from modelopt.torch.recipes import load_recipe
 
 # Load and resolve a recipe
 result = load_recipe("examples/recipes/ptq/ptq_fp8.yaml")
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/README.md` around lines 90-99: the example calls `mtq.quantize` but never imports `mtq`; add `import modelopt.torch.quantization as mtq` at the top of the snippet before `load_recipe` so `mtq.quantize` is defined.

</details>
</details>

<details>
<summary>examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml-28-32 (1)</summary>

`28-32`: _⚠️ Potential issue_ | _🟡 Minor_

**Use the actual last-layer index in this example.**

`*layers.21*` contradicts the file name/description and Line 7's `layers.31` note for 32-layer models. As written, this recipe skips the first and 22nd layers, not the first and last one. If this is meant to be a 32-layer example, the pattern should be `*layers.31*`; otherwise the example name/description should be updated.

<details>
<summary>📝 Proposed fix</summary>

```diff
   - pattern: "*layers.0*"
     enable: false
-  - pattern: "*layers.21*"
+  - pattern: "*layers.31*"
     enable: false
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml` around lines 28-32: the override patterns disable `*layers.0*` and `*layers.21*`, which mismatches the recipe name/description (32-layer model) and the note referencing `layers.31`; update the second pattern to `*layers.31*` so the recipe actually skips the first and last layers, or if layer 21 is intended, update the file name/description and references accordingly.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/cli.py-47-63 (1)</summary>

`47-63`: _⚠️ Potential issue_ | _🟡 Minor_

**Catch `dry-run` load/export failures like the other subcommands.**

`SweepConfig.from_yaml()`, `controller.dry_run()`, and `controller.export_jobs()` can all raise, and this branch currently lets those exceptions escape as raw tracebacks. That makes malformed configs and unwritable output paths much rougher to diagnose than `validate`/`plan`.

<details>
<summary>💡 Suggested fix</summary>

```diff
     elif args.command == "dry-run":
-        from modelopt.torch.recipes.experiment import SweepConfig, SweepController
-
-        config = SweepConfig.from_yaml(args.config)
-        errors = config.validate()
-        if errors:
-            print("Sweep config validation failed:")
-            for e in errors:
-                print(f"  - {e}")
-            sys.exit(1)
-
-        controller = SweepController(config)
-        print(controller.dry_run(verbose=args.verbose))
-
-        if args.output:
-            controller.export_jobs(args.output)
-            print(f"\nJobs exported to: {args.output}")
+        try:
+            from modelopt.torch.recipes.experiment import SweepConfig, SweepController
+
+            config = SweepConfig.from_yaml(args.config)
+            errors = config.validate()
+            if errors:
+                print("Sweep config validation failed:", file=sys.stderr)
+                for error in errors:
+                    print(f"  - {error}", file=sys.stderr)
+                sys.exit(1)
+
+            controller = SweepController(config)
+            print(controller.dry_run(verbose=args.verbose))
+
+            if args.output:
+                controller.export_jobs(args.output)
+                print(f"\nJobs exported to: {args.output}")
+        except Exception as exc:
+            print(f"Dry run failed: {exc}", file=sys.stderr)
+            sys.exit(1)
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/cli.py` around lines 47-63: wrap the entire "dry-run" branch in a try/except that mirrors the other subcommands — catch exceptions from `SweepConfig.from_yaml`, `SweepController(config)`, `controller.dry_run(verbose=...)`, and `controller.export_jobs(...)`, print a clear error message (including the exception text), then `sys.exit(1)`. Keep the existing validation behavior (call `config.validate()` first), and ensure failures when writing to `args.output` are reported and cause exit instead of leaking a raw traceback.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/schema/resolver.py-1-6 (1)</summary>

`1-6`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/schema/resolver.py` around lines 1-6: add the standard multi-line NVIDIA Apache 2.0 license block at the very top of the file, above the module docstring, matching the project's canonical header exactly.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/bridge.py-1-5 (1)</summary>

`1-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/bridge.py` around lines 1-5: add the project's standard Apache 2.0/NVIDIA license block at the top of the file, above the module docstring, matching the existing project license text and style (including copyright holder and SPDX identifier) used elsewhere in the repo.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/experiment/controller.py-1-5 (1)</summary>

`1-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/experiment/controller.py` around lines 1-5: prepend the full NVIDIA Apache 2.0 license block above the existing module docstring, without moving any other file content below it.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/experiment/config.py-1-5 (1)</summary>

`1-5`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/experiment/config.py` around lines 1-5: add the standard multi-line NVIDIA Apache 2.0 license header at the very top of the file, before the existing module docstring (the triple-quoted string starting "Sweep configuration schema."), ensuring the header matches the project's canonical license text and includes the copyright year and holder.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/schema/__init__.py-1-16 (1)</summary>

`1-16`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the license header. Otherwise, this is a clean re-export module with an appropriate `__all__` definition.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/schema/__init__.py` around lines 1-16: add the standard license comment block at the very top of the module (before the docstring) so the file that re-exports RecipeConfig, FORMAT_REGISTRY, KV_FORMAT_REGISTRY, get_preset, get_preset_source, list_presets, and resolve_recipe carries the NVIDIA Apache 2.0 header per project guidelines.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/__init__.py-1-16 (1)</summary>

`1-16`: _⚠️ Potential issue_ | _🟡 Minor_

**Missing NVIDIA Apache 2.0 license header.**

Per coding guidelines, all new Python files under `modelopt/` require the NVIDIA Apache 2.0 license header.

<details>
<summary>Add license header at the top of the file</summary>

```diff
+# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 """Recipe system and experiment controller for NVIDIA Model Optimizer.
```

</details>

As per coding guidelines: "NVIDIA Apache 2.0 license header required on all new Python/C++/CUDA files".

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/__init__.py` around lines 1-16: add the required NVIDIA Apache 2.0 license header at the top of the module so the file begins with the standard multi-line copyright and license block; keep the existing module docstring and usage examples after the header, so symbols like `load_recipe` and the module-level docstring remain unchanged below the license block.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/bridge.py-55-59 (1)</summary>

`55-59`: _⚠️ Potential issue_ | _🟡 Minor_

**Unused `recipe_path` parameter.**

The `recipe_path` parameter is declared but never used in the function body. Either remove it or include it in the returned summary dict if it's intended for context.

<details>
<summary>Option 1: Remove unused parameter</summary>

```diff
-def summarize_recipe(recipe_path: str, resolved: dict[str, Any], raw_yaml: dict) -> dict[str, Any]:
+def summarize_recipe(resolved: dict[str, Any], raw_yaml: dict) -> dict[str, Any]:
     """Extract human-readable summary from recipe YAML and resolved config.
 
-    Returns a dict with keys: preset, algorithm, kv_cache, overrides_count, type.
+    Returns a dict with keys: type, preset, algorithm, kv_cache, overrides_count.
     """
```

</details>

<details>
<summary>Option 2: Include in returned summary</summary>

```diff
     if quant:
         algo = quant.get("algorithm", "default")
         if isinstance(algo, dict):
             algo = algo.get("method", "default")
         return {
             "type": "quantize",
+            "recipe_path": recipe_path,
             "preset": quant.get("preset", "custom"),
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/bridge.py` around lines 55-59: the `recipe_path` parameter on `summarize_recipe` is declared but unused; add it to the returned summary dict (and docstring) as a `recipe_path` key alongside preset, algorithm, kv_cache, overrides_count, and type so callers retain context, and adjust tests/types if needed.

</details>
</details>

<details>
<summary>🧹 Nitpick comments (6)</summary>

<details>
<summary>modelopt/torch/recipes/schema/formats.py (1)</summary>

`135-148`: **Return copies from the public format getters.**

Unlike `get_preset()` in `modelopt/torch/recipes/schema/presets.py` on Lines 121-127, these accessors hand back the shared registry dicts. A caller-side merge or `update()` will mutate global defaults and leak into later recipe resolutions.

<details>
<summary>♻️ Proposed fix</summary>

```diff
+import copy
 from typing import Any
@@
 def get_format(name: str) -> dict[str, dict[str, Any]]:
     """Look up a format by name. Raises KeyError if not found."""
     if name not in FORMAT_REGISTRY:
         available = sorted(FORMAT_REGISTRY.keys())
         raise KeyError(f"Unknown format '{name}'. Available: {available}")
-    return FORMAT_REGISTRY[name]
+    return copy.deepcopy(FORMAT_REGISTRY[name])
@@
 def get_kv_format(name: str) -> dict[str, Any]:
     """Look up a KV cache format by name. Raises KeyError if not found."""
     if name not in KV_FORMAT_REGISTRY:
         available = sorted(KV_FORMAT_REGISTRY.keys())
         raise KeyError(f"Unknown KV cache format '{name}'. Available: {available}")
-    return KV_FORMAT_REGISTRY[name]
+    return copy.deepcopy(KV_FORMAT_REGISTRY[name])
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/schema/formats.py` around lines 135-148: `get_format` and `get_kv_format` currently return references into the shared FORMAT_REGISTRY and KV_FORMAT_REGISTRY, so callers who merge/update will mutate global defaults; change both functions to return copies (deep copies, since values are nested dicts) of the registry entries so callers can modify the result without affecting the registries.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/experiment/controller.py (2)</summary>

`80-86`: **Job IDs have gaps when recipes are invalid.**

The `job_id` is incremented before checking if the recipe is valid, resulting in non-sequential job IDs when some recipes fail validation. This may cause confusion in the dry-run output where jobs are displayed as "Job X/Y" but X has gaps.

<details>
<summary>Increment job_id only for valid jobs</summary>

```diff
     jobs: list[SweepJob] = []
-    job_id = 0
     for model, recipe_path, launcher in itertools.product(
         self.config.models, self.config.recipes, self.config.launchers
     ):
-        job_id += 1
         if recipe_path in self._validation_errors:
             continue  # Skip invalid recipes
 
         resolved = self._validated_recipes[recipe_path]
         raw_yaml = self._raw_yamls[recipe_path]
         hf_ptq = recipe_to_hf_ptq_args(resolved)
         hf_ptq["pyt_ckpt_path"] = model
 
         jobs.append(SweepJob(
-            job_id=job_id,
+            job_id=len(jobs) + 1,
             model=model,
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/experiment/controller.py` around lines 80-86: `job_id` is incremented before validating the recipe, causing gaps when a recipe is skipped; in the loop over `itertools.product(self.config.models, self.config.recipes, self.config.launchers)`, move the increment so it happens only for valid jobs (after the `self._validation_errors` check), and update any "Job X/Y" calculations so dry-run output remains sequential.

</details>

---

`108-145`: **`dry_run` calls both `validate_recipes` and `generate_jobs`, which internally calls `validate_recipes` again.**

The `dry_run` method explicitly calls `validate_recipes()` at line 145, but `generate_jobs()` at line 212 will call `validate_recipes()` again if `_validated_recipes` is empty. Since `validate_recipes` populates `_validated_recipes`, the second call is a no-op, but this could be clearer.

<details>
<summary>Add comment or restructure for clarity</summary>

```diff
         # Phase 1: Recipe Validation
         section("Phase 1: Recipe Validation (load_recipe → resolved config)")
-        validation_results = self.validate_recipes()
+        validation_results = self.validate_recipes()  # Populates _validated_recipes for generate_jobs()
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/experiment/controller.py` around lines 108-145: `dry_run` calls `validate_recipes()` and later `generate_jobs()`, which itself calls `validate_recipes()` when `_validated_recipes` is empty; either remove the redundant explicit call (ensuring `_validated_recipes` is populated once before `generate_jobs`) or add a short comment explaining that `generate_jobs()` validates recipes lazily via the `_validated_recipes` cache to avoid duplicate work.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/experiment/config.py (1)</summary>

`121-124`: **Consider documenting the zero-jobs edge case.**

When any of `models`, `recipes`, or `launchers` is empty, `total_jobs` returns 0. This is mathematically correct, but downstream code dividing by `total_jobs` could fail. The `validate()` method catches these cases, so this is just a documentation note.

<details>
<summary>Optional docstring enhancement</summary>

```diff
     @property
     def total_jobs(self) -> int:
-        """Total number of jobs in the sweep (Cartesian product)."""
+        """Total number of jobs in the sweep (Cartesian product).
+
+        Returns 0 if any dimension (models, recipes, launchers) is empty.
+        Call validate() first to ensure all dimensions are populated.
+        """
         return len(self.models) * len(self.recipes) * len(self.launchers)
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/experiment/config.py` around lines 121-124: update the `total_jobs` property docstring to state the zero-jobs edge case — when any of models, recipes, or launchers is empty, `total_jobs` is 0, which can break downstream division — and point callers to `validate()`, which is responsible for catching empty lists before execution.

</details>
</details>

<details>
<summary>tests/unit/torch/recipes/test_pipeline.py (1)</summary>

`152-161`: **Consider adding a guard for missing examples directory.**

The test iterates over `examples/recipes` but doesn't handle the case where the directory might not exist or be empty. This could cause confusing test failures if the examples directory structure changes.

<details>
<summary>Proposed improvement</summary>

```diff
 def test_load_and_plan_all_examples():
     """All example recipes produce valid pipeline plans."""
     from pathlib import Path
 
     recipes_dir = Path("examples/recipes")
+    assert recipes_dir.exists(), f"Examples directory not found: {recipes_dir}"
+    yaml_files = [p for p in sorted(recipes_dir.rglob("*.yaml")) if "experiments" not in p.parts]
+    assert len(yaml_files) > 0, "No recipe YAML files found"
-    for path in sorted(recipes_dir.rglob("*.yaml")):
-        if "experiments" in path.parts:
-            continue  # skip sweep/experiment configs
+    for path in yaml_files:
         plan = load_and_plan(str(path))
         assert len(plan.steps) >= 1, f"{path} produced empty plan"
         assert plan.recipe_path == str(path)
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `tests/unit/torch/recipes/test_pipeline.py` around lines 152-161: `test_load_and_plan_all_examples` assumes `examples/recipes` exists; add a guard that checks `recipes_dir` exists and is a directory and that `recipes_dir.rglob("*.yaml")` yields at least one file before iterating; if the directory is missing or no recipe files are found, call `pytest.skip` with a clear message so `load_and_plan` is not invoked on a non-existent or empty dataset.

</details>
</details>

<details>
<summary>modelopt/torch/recipes/schema/resolver.py (1)</summary>

`34-57`: **Move imports to the top of the file.**

The imports at lines 48-57 are placed after the function definition `_update_quant_cfg_with_kv_cache`. While this works, it's non-idiomatic Python and can cause confusion. Consider moving all imports to the top of the file after the module docstring.

<details>
<summary>Reorganize imports</summary>

```diff
 from __future__ import annotations
 
 import copy
 from typing import Any
 
+from .formats import FORMAT_REGISTRY, get_format, get_kv_format
+from .models import (
+    AlgorithmConfig,
+    AutoQuantizeSection,
+    OverrideEntry,
+    QuantizationSection,
+    QuantizerSpec,
+    RecipeConfig,
+)
+from .presets import get_preset
+
 # Default disabled quantizer patterns — inlined from
 # modelopt.torch.quantization.config._default_disabled_quantizer_cfg
 # to avoid torch dependency at import time.
 _DEFAULT_DISABLED_QUANTIZER_CFG: dict[str, Any] = { ... }
 
 def _update_quant_cfg_with_kv_cache( ... ):
     ...
 
-from .formats import FORMAT_REGISTRY, get_format, get_kv_format
-from .presets import get_preset
-from .models import (
-    AlgorithmConfig,
-    AutoQuantizeSection,
-    OverrideEntry,
-    QuantizationSection,
-    QuantizerSpec,
-    RecipeConfig,
-)
```

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed. In `modelopt/torch/recipes/schema/resolver.py` around lines 34-57: move the module-level imports (from `.formats`, `.presets`, and `.models`) to the top of the file immediately after the module docstring and before `_update_quant_cfg_with_kv_cache`, so all imports live together at the top and no code is split by late imports.

</details>
</details>
</details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/torch/recipes/cli.py`:
- Line 1: This module currently starts with a module docstring instead of the
required NVIDIA Apache 2.0 license header; add the standard NVIDIA Apache 2.0
header block at the very top of the file (above the existing triple-quoted
docstring) and ensure the same header is added to all other new Python/C++/CUDA
files introduced in this PR; locate the insertion point at the module-level
docstring in cli.py (top of file) and prepend the exact SPDX/Apache 2.0 header
used across the codebase.
- Around line 9-13: Add a console-script entry so the CLI function main() in
modelopt.torch.recipes.cli is exposed as the installable command; update
packaging metadata (e.g., pyproject.toml) to include an entry under
[project.scripts] like modelopt-recipes = "modelopt.torch.recipes.cli:main" (or
add the equivalent console_scripts mapping in setup.cfg/setup.py) so the
advertised modelopt-recipes command points to the main() function.
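A minimal sketch of the packaging change this comment asks for, assuming a pyproject.toml-based build (the command name and module path are taken from the comment itself):

```toml
[project.scripts]
# Exposes `modelopt-recipes` as an installable console command
# that dispatches to the CLI's main() function.
modelopt-recipes = "modelopt.torch.recipes.cli:main"
```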
In `@modelopt/torch/recipes/experiment/__init__.py`:
- Around line 1-9: Add the standard NVIDIA Apache 2.0 license header to the top
of this new Python module so it complies with project guidelines; insert the
header comment block above the module docstring in
modelopt/torch/recipes/experiment/init.py (before the triple-quoted
description) and ensure it remains a valid Python comment, then keep the
existing imports and all unchanged (SweepConfig, SweepController, SweepJob).
In `@modelopt/torch/recipes/pipeline.py`:
- Around line 50-57: The PipelinePlan currently only stores
steps/export/metadata and drops the top-level recipe.model block; add a model
field to the PipelinePlan dataclass (e.g., model: dict[str, Any] | None) and
ensure load_and_plan() populates this field from recipe.model (including
recipe.model.path and flags like trust_remote_code and attn_implementation) when
constructing the plan and when returning the plan as an artifact so consumers
can reconstruct the full workflow; update any code that builds the plan around
PipelinePlan(...) (lines building the plan and the return at load_and_plan()) to
copy the model block intact into the new model field.
- Around line 156-177: The planners (_plan_pruning and similarly
_plan_distillation) are discarding any technique-specific fields by manually
recreating a small config dict; instead serialize the entire validated section
and preserve all fields. Fix by calling section.model_dump() into a single dict
(e.g. full = section.model_dump()), then build config from that dict (e.g.
config = {k:v for k,v in full.items() if k not in ("calibration","training")})
and extract calibration/training by popping or reading those keys and leaving
the rest intact; if any field names must be normalized, rename only those
specific keys before constructing the
PipelineStep(technique="pruning"/"distillation",
api_call=TECHNIQUE_API["pruning"/"distillation"], config=config,
calibration=calib, training=training).
- Around line 1-11: This new module is missing the required NVIDIA Apache 2.0
license header; add the standard NVIDIA Apache 2.0 header block at the very top
of modelopt/torch/recipes/pipeline.py immediately before the module docstring so
the file begins with the license header followed by the existing triple-quoted
docstring and code (do not alter the docstring text). Ensure the header exactly
matches the repository's canonical NVIDIA Apache 2.0 header used in other
Python/C++/CUDA files.
In `@modelopt/torch/recipes/schema/_bundled_presets.py`:
- Around line 1-10: The generated file _bundled_presets.py is missing the
required NVIDIA Apache 2.0 license header; update the file so the standard
NVIDIA Apache 2.0 header (the same header used across the repo) is prepended to
the top of modelopt/torch/recipes/schema/_bundled_presets.py before the module
docstring; also update the generator (scripts/sync_presets.py) so future
regenerations emit that header automatically to avoid repeated violations.
In `@modelopt/torch/recipes/schema/formats.py`:
- Around line 1-12: This file lacks the required NVIDIA Apache 2.0 source
header; add the standard NVIDIA Apache 2.0 license header block at the very top
of modelopt/torch/recipes/schema/formats.py (before the module docstring and
before any imports such as "from future import annotations"), using the
project's canonical header text (including copyright year/owner and SPDX
identifier) so the file begins with the full license comment followed by the
existing module docstring and imports.
- Around line 42-78: FORMAT_REGISTRY is missing an entry for "mxint8", causing
format lookups to fail; add a "mxint8" key to the FORMAT_REGISTRY dict (the same
structure as the existing "mxfp8"/"mxfp6"/"mxfp4" entries) under
modelopt/torch/recipes/schema/formats.py with "weight" and "activation" subkeys
and settings that match the mxint8 quantizer defaults defined in
modelopt/torch/recipes/schema/_bundled_presets.py (Lines ~390-409) — ensure the
num_bits, block_sizes (e.g., dynamic/-1 and scale_bits), and enable flags mirror
those bundled defaults so format: mxint8 resolves correctly.
In `@modelopt/torch/recipes/schema/models.py`:
- Around line 68-93: The validator in QuantizationSection (class
QuantizationSection) currently only checks preset vs custom but doesn't enforce
that mode="qat" must include a TrainingConfig; update validate_preset_or_custom
(or add an additional `@model_validator` with mode="after") to raise a ValueError
when self.mode == "qat" and self.training is None/empty, instructing the user to
provide a training config for QAT; keep the existing preset vs
weights/activations check intact and return self when valid.
- Around line 1-6: This new module (modelopt/torch/recipes/schema/models.py) is
missing the required NVIDIA Apache 2.0 license header; add the standard
multi-line NVIDIA Apache 2.0 header immediately at the top of the file before
the existing module docstring so the header precedes the triple-quoted string,
ensuring the file begins with the exact required license text used across the
repo.
- Around line 54-66: The OverrideEntry model allows entries with neither
selector set (both pattern and module_class optional), so add a Pydantic
validator on OverrideEntry (e.g., a `@root_validator` or equivalent) that enforces
at least one selector is provided: if both pattern and module_class are falsy,
raise a ValidationError/ValueError with a clear message; update the error text
to mention OverrideEntry, pattern, and module_class so invalid entries like "-
format: fp8" fail validation and downstream behavior is prevented.
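A minimal sketch of the requested selector check, using a stdlib dataclass stand-in (the real OverrideEntry is a Pydantic model, so the actual fix would live in a Pydantic validator):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverrideEntry:
    """Stand-in for the Pydantic model; illustrates the validation only."""

    pattern: Optional[str] = None
    module_class: Optional[str] = None

    def __post_init__(self):
        # Enforce what the comment asks for: at least one selector.
        if not self.pattern and not self.module_class:
            raise ValueError(
                "OverrideEntry requires 'pattern' or 'module_class' to be set"
            )
```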
In `@modelopt/torch/recipes/schema/presets.py`:
- Around line 1-18: The file module-level header in presets.py is missing the
required NVIDIA Apache 2.0 license header; add the standard NVIDIA Apache 2.0
source header block (including copyright line, SPDX identifier, and full license
notice per project guidelines) at the very top of
modelopt.torch.recipes.schema.presets (before any imports or code) so the file
contains the canonical NVIDIA header used across the repo.
- Around line 29-118: The current try/except masks missing-symbol bugs because
it tries to directly import many names (including NVFP4_FP8_MHA_CONFIG which
doesn't exist) and catches ImportError broadly; either correct the symbol to the
correct name (likely NVFP4_FP8_MHA_CFG) in the import list, or change the
pattern to "import modelopt.torch.quantization.config as config" then read
attributes (config.NVFP4_FP8_MHA_CONFIG) so module import failures are caught
but missing attributes raise AttributeError instead of silently falling back;
update references around _PRESET_REGISTRY accordingly so missing symbols fail
fast.
In `@tests/unit/torch/recipes/conftest.py`:
- Around line 1-5: Prepend the repository's required NVIDIA Apache 2.0 license
header to this new Python module by inserting the full license comment block
above the existing module docstring (the top triple-quoted string in
conftest.py). Ensure the header matches the canonical NVIDIA Apache 2.0 template
used across the repo (including copyright line, license URL, and SPDX
identifier) and that no other content moves above it.
- Around line 23-29: In pytest_collection_modifyitems, stop importing the deep
path and instead import the top-level package (use "import modelopt") and catch
ModuleNotFoundError; if the caught exception's name (exc.name) does not match
the expected package name (e.g., "modelopt" or the actual distribution name like
"nvidia-modelopt") re-raise the exception so real submodule ImportErrors aren't
masked; only when the missing-package name matches should you set skip =
pytest.mark.skip(reason="nvidia-modelopt not installed") and return to skip
tests.
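The skip logic being requested can be sketched as a small helper (the function name is illustrative; in conftest.py this check would sit inside pytest_collection_modifyitems):

```python
def package_missing(package_name: str) -> bool:
    """True only when the top-level package itself is not installed.

    A ModuleNotFoundError raised by a broken submodule is re-raised,
    so real import regressions fail collection instead of being
    silently converted into skipped tests.
    """
    try:
        __import__(package_name)
        return False
    except ModuleNotFoundError as exc:
        if exc.name != package_name:
            raise  # submodule failure: surface it, don't skip
        return True
```

Only when this returns True would the `requires_modelopt` skip marker be applied.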
Minor comments:
In `@examples/recipes/experiments/experiment.yaml`:
- Around line 37-39: The model_overrides mapping is using the key
"codellama/CodeLlama-34b-Instruct" which doesn't match any sweep models (the
sweep contains "Qwen/Qwen3-8B" and "meta-llama/Llama-3.1-8B-Instruct"), so the
per-model override never runs; update the model_overrides keys to match a sweep
model (e.g., replace "codellama/CodeLlama-34b-Instruct" with "Qwen/Qwen3-8B" or
"meta-llama/Llama-3.1-8B-Instruct"), or add additional entries that exactly
match your intended sweep model names so the overrides apply.
- Around line 6-8: The usage example still references the old command and path
string "modelopt-recipes dry-run --config examples/sweeps/experiment.yaml";
update that line to the new CLI and location by replacing the old command and
path with the current one (e.g., "modelopt recipes dry-run --config
examples/recipes/experiment.yaml") so the example points to the correct CLI
invocation and the examples/recipes path; search for the exact string
"modelopt-recipes dry-run --config examples/sweeps/experiment.yaml" in the file
and update it.
In `@examples/recipes/experiments/sweep_demo.yaml`:
- Around line 6-8: Update the usage example to show the new recipe CLI
invocation instead of the old demo script: replace the old command with the
recipe CLI entrypoint (use the repository's CLI/entrypoint name) invoking the
same sweep_demo.yaml config (sweep_demo.yaml) and include the --dry-run flag so
the example runs against the new examples/recipes layout; ensure the text shows
the correct CLI pattern (e.g., " run --config sweep_demo.yaml
--dry-run") so readers can adapt it to the repo's actual entrypoint.
In `@examples/recipes/multi_technique/distill_fp8.yaml`:
- Around line 3-5: Update the header comment so the stage order matches the
pipeline: replace the contradictory first line "Distill from a larger teacher
model, then quantize the student to FP8." with a line that reflects the actual
execution order used later, e.g. "Quantize the student to FP8, then distill
(fine-tune with teacher guidance)." Adjust the wording around the existing
"Execution order: quantization → distillation (fine-tune with teacher guidance)"
phrase to ensure both lines consistently state quantization → distillation;
locate the lines by searching for the exact header text "Distill from a larger
teacher model, then quantize the student to FP8." and "Execution order:
quantization → distillation (fine-tune with teacher guidance)" and make the
first line mirror the execution order.
In `@examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml`:
- Around line 28-32: The override pattern currently disables "layers.0" and
"layers.21" which mismatches the recipe name/description (32-layer model) and
the note referencing "layers.31"; update the second override pattern from
"layers.21" to "layers.31" so the recipe actually skips the first and last
layers, or if intended to skip layer 21 instead, update the file
name/description and any references (e.g., the "layers.31" note) to reflect a
non-32-layer example.
In `@examples/recipes/qat/qat_int8.yaml`:
- Around line 10-16: The recipe metadata claims AWQ but the quantization config
uses SmoothQuant (quantization.preset: int8_sq and
quantization.algorithm.method: smoothquant); either update the description and
tags to reference SmoothQuant (e.g., description: "INT8 quantization-aware
training with SmoothQuant initialization" and tags: [qat, int8, smoothquant]) or
change the quantization block to AWQ-compatible values (replace preset and
algorithm.method with the appropriate AWQ preset/algorithm names) so the
metadata and quantization settings match.
In `@examples/recipes/README.md`:
- Around line 84-86: The fenced code block containing the line "pruning →
sparsity → quantization → distillation" in README.md needs a language tag to
satisfy markdownlint; update that fenced block from a bare ``` fence to ```text so it reads
as a text code fence (i.e., add the language identifier "text" to the existing
triple-backtick fence that wraps the pruning → ... line).
- Around line 90-99: The example calls mtq.quantize but never imports mtq; add
an import for modelopt.torch.quantization (e.g. import
modelopt.torch.quantization as mtq) at the top of the snippet before using
load_recipe so mtq.quantize is defined; update the example near load_recipe and
the use of mtq.quantize/forward_loop=calibrate accordingly.
In `@modelopt/torch/recipes/__init__.py`:
- Around line 1-16: Add the required NVIDIA Apache 2.0 license header to the top
of this Python module (modelopt.torch.recipes init.py) so the file begins
with the standard multi-line copyright and Apache 2.0 license block; keep the
existing module docstring and usage examples after the header, and ensure the
header appears before any code or docstring (so symbols like load_recipe and the
module-level docstring remain unchanged and located after the license block).
In `@modelopt/torch/recipes/bridge.py`:
- Around line 1-5: The file is missing the required NVIDIA Apache 2.0 license
header; add the project's standard Apache-2.0/NVIDIA license block at the top of
bridge.py (above the module docstring) so all new Python files under modelopt
include the license; ensure the header matches the existing project license
text/style (including copyright holder and SPDX identifier) used elsewhere in
the repo.
- Around line 55-59: The parameter recipe_path on summarize_recipe is declared
but unused; add it to the returned summary dict so callers retain context:
update summarize_recipe (and its docstring) to include a "recipe_path" key in
the returned dict alongside preset, algorithm, kv_cache, overrides_count, and
type, populating it with the incoming recipe_path value; ensure any code that
relies on those keys continues to work and adjust tests/types if needed.
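A sketch of the suggested change, assuming the summary keys named in the comment (the real summarize_recipe signature and key sources may differ):

```python
from typing import Any, Optional


def summarize_recipe(resolved: dict, recipe_path: Optional[str] = None) -> dict:
    """Return a summary that now carries recipe_path for caller context."""
    quant_cfg: dict[str, Any] = resolved.get("quantize_config", {})
    return {
        "recipe_path": recipe_path,  # previously declared but unused
        "preset": resolved.get("preset"),
        "kv_cache": resolved.get("kv_cache"),
        "overrides_count": len(quant_cfg.get("overrides", [])),
        "type": resolved.get("type"),
    }
```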
In `@modelopt/torch/recipes/cli.py`:
- Around line 47-63: Wrap the entire "dry-run" branch in a try/except that
mirrors the other subcommands: catch exceptions from SweepConfig.from_yaml,
SweepController(config), controller.dry_run(verbose=...), and
controller.export_jobs(...) and print a clear error message (including the
exception text) then sys.exit(1); keep existing validation behavior (call
config.validate() first), and ensure export_jobs is also guarded by its own
try/except so failures when writing to args.output are reported and cause exit
instead of leaking a raw traceback.
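The guard pattern being asked for can be sketched like this (names are illustrative, not the actual cli.py API):

```python
import sys


def run_guarded(step_name, fn):
    """Run one CLI step; on failure print a clear message and exit(1)
    instead of leaking a raw traceback to the user."""
    try:
        return fn()
    except Exception as exc:
        print(f"error during {step_name}: {exc}", file=sys.stderr)
        sys.exit(1)
```

In the dry-run branch, config loading, `dry_run`, and `export_jobs` would each run through such a guard.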
In `@modelopt/torch/recipes/experiment/config.py`:
- Around line 1-5: The file is missing the required NVIDIA Apache 2.0 license
header; add the standard multi-line NVIDIA Apache 2.0 license header at the very
top of modelopt/torch/recipes/experiment/config.py before the existing module
docstring (the triple-quoted string starting "Sweep configuration schema."),
ensuring the header exactly matches the project's canonical license text and
includes copyright year and holder.
In `@modelopt/torch/recipes/experiment/controller.py`:
- Around line 1-5: This file is missing the required NVIDIA Apache 2.0 license
header; prepend the standard NVIDIA Apache 2.0 license block to the top of the
module (above the existing module docstring in controller.py / the "Sweep
Controller" module) so all new Python files under modelopt include the license
header as per guidelines; ensure the header is the full required text and that
no other file content is moved below it.
In `@modelopt/torch/recipes/schema/__init__.py`:
- Around line 1-16: This file is missing the required NVIDIA Apache 2.0 license
header; add the standard license comment block at the very top of the module
(before the module docstring) so the file that re-exports RecipeConfig,
FORMAT_REGISTRY, KV_FORMAT_REGISTRY, get_preset, get_preset_source,
list_presets, and resolve_recipe contains the NVIDIA Apache 2.0 header per
project guidelines.
In `@modelopt/torch/recipes/schema/resolver.py`:
- Around line 1-6: This file is missing the required NVIDIA Apache 2.0 license
header; add the standard multi-line NVIDIA Apache 2.0 license block at the very
top of modelopt/torch/recipes/schema/resolver.py (above the module docstring) so
the file complies with the repository's licensing guidelines for all new Python
files under modelopt; ensure the license text matches the project's canonical
header exactly.
Nitpick comments:
In `@modelopt/torch/recipes/experiment/config.py`:
- Around line 121-124: The docstring for the total_jobs property should
explicitly mention the zero-jobs edge case: when any of models, recipes, or
launchers is empty total_jobs will be 0, which can cause downstream divisions by
total_jobs; note that validate() is responsible for catching/raising on empty
lists before execution. Update the total_jobs docstring (in the total_jobs
property) to include this behavior and a brief note pointing callers to
validate() to ensure non-zero job counts.
In `@modelopt/torch/recipes/experiment/controller.py`:
- Around line 80-86: The job_id is incremented before validating the recipe,
causing gaps when a recipe is skipped; in the loop that iterates over
itertools.product(self.config.models, self.config.recipes,
self.config.launchers) (variables model, recipe_path, launcher) move the job_id
increment so it happens only for valid jobs (i.e., after checking recipe_path
against self._validation_errors) or increment conditionally inside the branch
that proceeds with the job; update any "Job X/Y" calculations to rely on this
adjusted job_id so dry-run outputs remain sequential.
- Around line 108-145: The dry_run method currently calls validate_recipes() and
later calls generate_jobs(), which itself calls validate_recipes() when
_validated_recipes is empty; remove the redundant explicit validate_recipes()
invocation in dry_run (or alternatively set/ensure _validated_recipes is
populated once before calling generate_jobs) and add a short comment near
dry_run explaining that generate_jobs() will validate recipes lazily using
_validated_recipes to avoid duplicate work; refer to the dry_run, generate_jobs,
validate_recipes methods and the _validated_recipes cache when making the
change.
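The job-id fix from the first comment can be sketched as follows (hypothetical names; the real controller iterates self.config.models/recipes/launchers and tracks invalid recipes in _validation_errors):

```python
import itertools


def generate_jobs(models, recipes, launchers, invalid_recipes=()):
    """Assign job ids only to jobs that pass validation, so dry-run
    numbering stays sequential with no gaps for skipped recipes."""
    jobs = []
    job_id = 0
    for model, recipe, launcher in itertools.product(models, recipes, launchers):
        if recipe in invalid_recipes:
            continue  # skipped recipes consume no job id
        job_id += 1  # increment only for jobs that will actually run
        jobs.append({"id": job_id, "model": model, "recipe": recipe, "launcher": launcher})
    return jobs
```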
In `@modelopt/torch/recipes/schema/formats.py`:
- Around line 135-148: get_format and get_kv_format currently return references
into the shared FORMAT_REGISTRY and KV_FORMAT_REGISTRY so callers who
merge/update will mutate global defaults; change both functions to return copies
(deep copies if values are nested dicts) of the registry entries instead of the
original dict objects (i.e., copy the dict returned from FORMAT_REGISTRY[name]
and KV_FORMAT_REGISTRY[name]) so callers can modify the result without affecting
the registries.
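A sketch of the copy-on-read accessor, with a trimmed registry used purely for illustration:

```python
import copy

FORMAT_REGISTRY = {  # trimmed stand-in, not the real registry contents
    "fp8": {"weight": {"num_bits": (4, 3), "enable": True}},
}


def get_format(name: str) -> dict:
    """Return a deep copy so callers can merge/update the result
    without mutating the shared registry defaults."""
    return copy.deepcopy(FORMAT_REGISTRY[name])
```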
In `@modelopt/torch/recipes/schema/resolver.py`:
- Around line 34-57: The module-level imports (FORMAT_REGISTRY, get_format,
get_kv_format from .formats; get_preset from .presets; and AlgorithmConfig,
AutoQuantizeSection, OverrideEntry, QuantizationSection, QuantizerSpec,
RecipeConfig from .models) should be moved to the top of the file immediately
after the module docstring and before the function
_update_quant_cfg_with_kv_cache; update the file so all imports live together at
the top in that order, keeping the helper function
_update_quant_cfg_with_kv_cache defined below them and ensuring no other code is
split by late imports.
In `@tests/unit/torch/recipes/test_pipeline.py`:
- Around line 152-161: The test test_load_and_plan_all_examples assumes
examples/recipes exists; add a guard that checks the Path("examples/recipes")
(recipes_dir) exists and is a directory and that recipes_dir.rglob("*.yaml")
yields at least one file before iterating; if the directory is missing or no
recipe files are found, call pytest.skip with a clear message so load_and_plan
is not invoked on a non-existent/empty dataset. Use the existing recipes_dir and
load_and_plan identifiers to locate where to add the checks.</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: Path: .coderabbit.yaml **Review profile**: CHILL **Plan**: Pro **Run ID**: `a654782e-65b6-4e46-be5b-5e37b34ba760` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between a56b6f356b4017847f9ef9850cf81bd0203ce5a3 and 71a0675a9c7365cfce6f13c1f54dc1b73a887f3c. </details> <details> <summary>📒 Files selected for processing (35)</summary> * `examples/recipes/README.md` * `examples/recipes/auto/auto_quantize.yaml` * `examples/recipes/experiments/experiment.yaml` * `examples/recipes/experiments/sweep_demo.yaml` * `examples/recipes/multi_technique/distill_fp8.yaml` * `examples/recipes/multi_technique/prune_gradnas_fp8.yaml` * `examples/recipes/multi_technique/sparse_quantize.yaml` * `examples/recipes/ptq/custom_w4a8_overrides.yaml` * `examples/recipes/ptq/ptq_fp8.yaml` * `examples/recipes/ptq/ptq_nvfp4_awq.yaml` * `examples/recipes/ptq/ptq_nvfp4_local_hessian.yaml` * `examples/recipes/ptq/ptq_nvfp4_local_hessian_overrides.yaml` * `examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml` * `examples/recipes/qat/qat_int8.yaml` * `modelopt/torch/recipes/__init__.py` * `modelopt/torch/recipes/bridge.py` * `modelopt/torch/recipes/cli.py` * `modelopt/torch/recipes/experiment/__init__.py` * `modelopt/torch/recipes/experiment/config.py` * `modelopt/torch/recipes/experiment/controller.py` * `modelopt/torch/recipes/pipeline.py` * `modelopt/torch/recipes/schema/__init__.py` * `modelopt/torch/recipes/schema/_bundled_presets.py` * `modelopt/torch/recipes/schema/formats.py` * `modelopt/torch/recipes/schema/models.py` * `modelopt/torch/recipes/schema/presets.py` * `modelopt/torch/recipes/schema/resolver.py` * `tests/unit/torch/recipes/__init__.py` * `tests/unit/torch/recipes/conftest.py` * `tests/unit/torch/recipes/test_bridge.py` * 
`tests/unit/torch/recipes/test_load_recipe.py` * `tests/unit/torch/recipes/test_pipeline.py` * `tests/unit/torch/recipes/test_schema.py` * `tests/unit/torch/recipes/test_sweep_config.py` * `tests/unit/torch/recipes/test_sweep_controller.py` </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
71a0675 to 9e6865f Compare
Migrate recipe system from nvidia-modelopt-recipes standalone package into modelopt.torch.recipes. Provides YAML-based recipe configs for model optimization with preset resolution, multi-technique pipeline planning, and experiment orchestration (sweep + eval). Structure: - modelopt/torch/recipes/schema/ — Pydantic models, resolver, presets - modelopt/torch/recipes/experiment/ — sweep config, job matrix controller - modelopt/torch/recipes/pipeline.py — multi-technique execution planner - modelopt/torch/recipes/bridge.py — recipe-to-hf_ptq translation - modelopt/torch/recipes/cli.py — CLI entry point - examples/recipes/ — 11 recipes (ptq/qat/multi_technique/auto) + experiments - tests/unit/torch/recipes/ — 34 tests, all passing No existing files modified. Zero breakage to existing tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
9e6865f to 5ee6d3d Compare
QAT mode (quantize → fine-tune) needs training parameters. Reject incomplete QAT recipes at schema validation time rather than producing pipeline steps with no training config. Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Import the config module first, then read attributes from it. This way ModuleNotFoundError (package not installed) falls back to bundled presets, while AttributeError (typo or renamed symbol) fails fast instead of silently degrading. Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Use ModuleNotFoundError instead of ImportError so genuine import regressions in modelopt submodules surface as errors rather than silently skipping tests. Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Actionable comments posted: 8
♻️ Duplicate comments (2)
tests/unit/torch/recipes/conftest.py (1)
38-48: ⚠️ Potential issue | 🟠 Major: Don't mask real import regressions when deciding test skips.
Line 40 probes a deep submodule and Lines 43-44 swallow any `ImportError`, so a broken `modelopt.torch.quantization` import gets converted into skipped `requires_modelopt` tests instead of a failing collection. Probe the top-level package only and only suppress `ModuleNotFoundError` for the missing package itself.
📎 Suggested fix
def pytest_collection_modifyitems(config, items): try: - import modelopt.torch.quantization.config # noqa: F401 + import modelopt # noqa: F401 return # modelopt available, run all tests - except ImportError: - pass + except ModuleNotFoundError as exc: + if exc.name != "modelopt": + raise skip = pytest.mark.skip(reason="nvidia-modelopt not installed") for item in items: if "requires_modelopt" in item.keywords: item.add_marker(skip)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/torch/recipes/conftest.py` around lines 38 - 48, The test collection is masking real import errors by importing a deep submodule and catching all ImportError; update pytest_collection_modifyitems to probe only the top-level package (import modelopt) and change the exception handling to only suppress a ModuleNotFoundError for the top-level package (re-raise any other ImportError/ModuleNotFoundError or those for submodules) so that genuine regressions in modelopt.torch.quantization cause test failures; keep the skip marker logic for items with the "requires_modelopt" keyword.
modelopt/torch/recipes/schema/formats.py (1)
81-93: ⚠️ Potential issue | 🟠 Major: Add `mxint8` to `FORMAT_REGISTRY`.
The format is missing from the registry and will cause runtime failures when recipes reference `format: mxint8`. The configuration should match the other MX format patterns with weight/activation blocks.
Proposed fix
"mxfp4": { "weight": { "num_bits": (2, 1), "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)}, "enable": True, }, "activation": { "num_bits": (2, 1), "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)}, "enable": True, }, }, + "mxint8": { + "weight": { + "num_bits": 8, + "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)}, + "enable": True, + }, + "activation": { + "num_bits": 8, + "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)}, + "enable": True, + }, + }, }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/recipes/schema/formats.py` around lines 81 - 93, FORMAT_REGISTRY is missing an entry for "mxint8" which causes failures when recipes use format: mxint8; add a new dictionary key "mxint8" to FORMAT_REGISTRY mirroring the existing "mxfp4" structure with both "weight" and "activation" entries, each containing "num_bits", "block_sizes", and "enable" fields (use the same shapes/patterns as "mxfp4": num_bits tuples, block_sizes with -1:32, "type":"dynamic", "scale_bits":(8,0), and enable=True) so recipes referencing mxint8 resolve correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/recipes/experiments/experiment.yaml`:
- Around line 37-40: The per-model override under model_overrides uses the key
"codellama/CodeLlama-34b-Instruct" but the sweep only iterates Qwen/Qwen3-8B and
meta-llama/Llama-3.1-8B-Instruct, so the override never applies; update the
sweep to include "codellama/CodeLlama-34b-Instruct" or change the
model_overrides key to match one of the swept models (e.g., set the override key
to "Qwen/Qwen3-8B" or "meta-llama/Llama-3.1-8B-Instruct") so the per-model tasks
(simple_evals.humaneval, simple_evals.livecodebench_v5) actually take effect for
a swept model.
- Around line 6-8: The usage comment at the top of the file points to the wrong
example path; update the comment string in
examples/recipes/experiments/experiment.yaml so the suggested command uses the
actual file location (replace examples/sweeps/experiment.yaml with
examples/recipes/experiments/experiment.yaml) in the usage text near the top of
the file.
In `@examples/recipes/experiments/sweep_demo.yaml`:
- Around line 6-8: Update the usage example line to call the new CLI entrypoint
and correct path: replace the old command that references
scripts/demo_big_pareto_sweep.py and examples/sweeps/... with the new CLI
invocation (modelopt.torch.recipes.cli) pointing to
examples/recipes/experiments/sweep_demo.yaml and include the same --config and
--dry-run flags so the example becomes something like: python -m
modelopt.torch.recipes.cli --config examples/recipes/experiments/sweep_demo.yaml
--dry-run.
In `@examples/recipes/README.md`:
- Around line 90-99: The example uses load_recipe and then calls mtq.quantize
but never defines or imports mtq or calibrate, causing NameError; update the
snippet to import the quantization module (e.g., add an import for
modelopt.torch.quantization as mtq) and either (a) document that model and
calibrate are application-provided placeholders or (b) provide minimal
placeholder definitions for model and calibrate in the example so it runs;
ensure the example still uses load_recipe and the quantize(...) call
(referencing load_recipe, mtq.quantize, and calibrate) so readers can copy-paste
a runnable example.
In `@modelopt/torch/recipes/experiment/controller.py`:
- Around line 106-123: The SweepJob payload drops auto-quantize search config
because recipe_to_hf_ptq_args(resolved) only handles
resolved["quantize_config"]; detect when resolved contains auto_quantize_kwargs
and merge or replace the hf_ptq payload with those args plus pyt_ckpt_path and
the existing calibration/export fields. Update the logic around
recipe_to_hf_ptq_args and the hf_ptq assignment in the SweepJob creation
(reference recipe_to_hf_ptq_args, resolved, auto_quantize_kwargs, and the
SweepJob ptq_config construction) so that when resolved has auto_quantize_kwargs
the job.ptq_config.hf_ptq includes the auto-quantize search config rather than
being incomplete.
- Line 31: The module-level import of PyYAML (`import yaml`) causes a hard
dependency and will raise ModuleNotFoundError for users who didn't install
PyYAML; update SweepController usage to avoid this by either (A) adding PyYAML
to the project's core dependencies in pyproject.toml, or (B) gate the recipes
feature behind an optional extra (e.g., [recipes]) and move the `import yaml`
into the specific functions or methods that need it within the SweepController
implementation so the module can be imported without PyYAML installed; choose
one approach and apply it consistently (if gating, ensure any public API that
parses YAML lazily imports `yaml` and raises a clear error directing users to
install the optional extra).
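Option (B) from the comment can be sketched with a lazy import; the `[recipes]` extra name follows the comment's own example and is an assumption about the final packaging:

```python
def load_sweep_yaml(path):
    """Defer the PyYAML import until a YAML file is actually parsed,
    so importing the module works without PyYAML installed."""
    try:
        import yaml
    except ModuleNotFoundError as exc:
        raise ModuleNotFoundError(
            "PyYAML is required for the recipes feature; install it via the "
            "optional extra, e.g. pip install nvidia-modelopt[recipes]"
        ) from exc
    with open(path) as f:
        return yaml.safe_load(f)
```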
In `@modelopt/torch/recipes/schema/resolver.py`:
- Around line 202-232: In _apply_override, pattern-based overrides currently
ignore override.weights/override.activations and module_class overrides
overwrite rather than merge existing entries; fix by (1) when override.pattern
is set, start from a deepcopy of quant_cfg.get(override.pattern, {}) (like you
do elsewhere) and if override.weights/override.activations are present, insert
"*weight_quantizer" and "*input_quantizer" using _resolve_quantizer_spec(…,
"weight"/"activation") into the entry instead of ignoring them, then merge other
fields as you already do and assign back to quant_cfg[override.pattern]; (2)
when override.module_class is set, begin from a deepcopy of
quant_cfg.get(override.module_class, {}) instead of creating a fresh mc_cfg so
existing entries (e.g. {"*": {"enable": False}}) are preserved, then merge or
update keys ("*weight_quantizer", "*input_quantizer", and "*" enable-only case)
rather than replacing the whole mc_cfg before assigning back to
quant_cfg[override.module_class].
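The merge-instead-of-replace behavior described above can be sketched with plain dicts; `_resolve_quantizer_spec` is elided here, so the weight/activation specs are passed in pre-resolved (an assumption made for illustration).

```python
import copy


def apply_pattern_override(quant_cfg, pattern, weights=None, activations=None, extra=None):
    """Merge an override into any existing entry for `pattern` rather than
    replacing it, so keys like {"enable": False} survive.

    Sketch of the reviewer's fix; the "*weight_quantizer"/"*input_quantizer"
    key names follow the mtq config convention.
    """
    entry = copy.deepcopy(quant_cfg.get(pattern, {}))
    if weights is not None:
        entry["*weight_quantizer"] = weights
    if activations is not None:
        entry["*input_quantizer"] = activations
    if extra:
        entry.update(extra)
    quant_cfg[pattern] = entry
    return quant_cfg
```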
- Around line 235-249: The resolver _resolve_auto_quantize currently builds
format_configs = [get_preset(fmt_entry.preset) for fmt_entry in section.formats]
but never applies AutoQuantizeSection.kv_cache, so kv_cache settings are
dropped; fix by propagating section.kv_cache into each emitted format config
(e.g., after creating format_configs or while mapping get_preset, set
format_config["kv_cache"] = section.kv_cache when section.kv_cache is present)
so that the returned kwargs["quantization_formats"] include the requested
kv_cache for every format; update _resolve_auto_quantize to check
section.kv_cache and mutate/merge it into each format_config before returning
kwargs.
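The suggested kv_cache propagation can be sketched with plain dicts standing in for preset format configs (the real `get_preset` output shape is an assumption):

```python
def apply_kv_cache(format_configs, kv_cache):
    """Copy a section-level kv_cache setting into every format candidate so
    auto-quantize sweeps each format with the requested KV-cache precision.

    Sketch only; the actual config key layout may differ.
    """
    if kv_cache is None:
        return format_configs
    return [{**cfg, "kv_cache": kv_cache} for cfg in format_configs]
```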
---
Duplicate comments:
In `@modelopt/torch/recipes/schema/formats.py`:
- Around line 81-93: FORMAT_REGISTRY is missing an entry for "mxint8" which
causes failures when recipes use format: mxint8; add a new dictionary key
"mxint8" to FORMAT_REGISTRY mirroring the existing "mxfp4" structure with both
"weight" and "activation" entries, each containing "num_bits", "block_sizes",
and "enable" fields (use the same shapes/patterns as "mxfp4": num_bits tuples,
block_sizes with -1:32, "type":"dynamic", "scale_bits":(8,0), and enable=True)
so recipes referencing mxint8 resolve correctly.
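A sketch of the missing entry, mirroring the "mxfp4" shape described above; every value here is an assumption (MXINT8 taken as a plain 8-bit integer with an 8-bit shared scale), so verify against the real FORMAT_REGISTRY before use.

```python
# Hypothetical registry entry; values mirror the "mxfp4" pattern described
# in the review prompt, not copied from the actual FORMAT_REGISTRY.
MXINT8_ENTRY = {
    "weight": {
        "num_bits": 8,
        "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
        "enable": True,
    },
    "activation": {
        "num_bits": 8,
        "block_sizes": {-1: 32, "type": "dynamic", "scale_bits": (8, 0)},
        "enable": True,
    },
}
```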
In `@tests/unit/torch/recipes/conftest.py`:
- Around line 38-48: The test collection is masking real import errors by
importing a deep submodule and catching all ImportError; update
pytest_collection_modifyitems to probe only the top-level package (import
modelopt) and change the exception handling to only suppress a
ModuleNotFoundError for the top-level package (re-raise any other
ImportError/ModuleNotFoundError or those for submodules) so that genuine
regressions in modelopt.torch.quantization cause test failures; keep the skip
marker logic for items with the "requires_modelopt" keyword.
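The top-level probe the prompt asks for reduces to a helper that only swallows a missing `modelopt` package and re-raises everything else (names are assumptions):

```python
def modelopt_available():
    """Return False only when the top-level package is missing; re-raise
    import failures inside submodules so real regressions still surface.

    Sketch of the suggested conftest probe.
    """
    try:
        import modelopt  # noqa: F401
    except ModuleNotFoundError as exc:
        if exc.name != "modelopt":
            raise  # a dependency or submodule broke, not a missing install
        return False
    return True
```

`pytest_collection_modifyitems` would then add the skip marker to `requires_modelopt` items only when this returns False.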
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a1b92688-78a3-4506-a9d9-7e36c665dad1
📒 Files selected for processing (35)
- examples/recipes/README.md
- examples/recipes/auto/auto_quantize.yaml
- examples/recipes/experiments/experiment.yaml
- examples/recipes/experiments/sweep_demo.yaml
- examples/recipes/multi_technique/distill_fp8.yaml
- examples/recipes/multi_technique/prune_gradnas_fp8.yaml
- examples/recipes/multi_technique/sparse_quantize.yaml
- examples/recipes/ptq/custom_w4a8_overrides.yaml
- examples/recipes/ptq/ptq_fp8.yaml
- examples/recipes/ptq/ptq_nvfp4_awq.yaml
- examples/recipes/ptq/ptq_nvfp4_local_hessian.yaml
- examples/recipes/ptq/ptq_nvfp4_local_hessian_overrides.yaml
- examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml
- examples/recipes/qat/qat_int8.yaml
- modelopt/torch/recipes/__init__.py
- modelopt/torch/recipes/bridge.py
- modelopt/torch/recipes/cli.py
- modelopt/torch/recipes/experiment/__init__.py
- modelopt/torch/recipes/experiment/config.py
- modelopt/torch/recipes/experiment/controller.py
- modelopt/torch/recipes/pipeline.py
- modelopt/torch/recipes/schema/__init__.py
- modelopt/torch/recipes/schema/_bundled_presets.py
- modelopt/torch/recipes/schema/formats.py
- modelopt/torch/recipes/schema/models.py
- modelopt/torch/recipes/schema/presets.py
- modelopt/torch/recipes/schema/resolver.py
- tests/unit/torch/recipes/__init__.py
- tests/unit/torch/recipes/conftest.py
- tests/unit/torch/recipes/test_bridge.py
- tests/unit/torch/recipes/test_load_recipe.py
- tests/unit/torch/recipes/test_pipeline.py
- tests/unit/torch/recipes/test_schema.py
- tests/unit/torch/recipes/test_sweep_config.py
- tests/unit/torch/recipes/test_sweep_controller.py
✅ Files skipped from review due to trivial changes (1)
- tests/unit/torch/recipes/__init__.py
🚧 Files skipped from review as they are similar to previous changes (16)
- examples/recipes/qat/qat_int8.yaml
- tests/unit/torch/recipes/test_sweep_controller.py
- modelopt/torch/recipes/schema/models.py
- examples/recipes/ptq/ptq_nvfp4_local_hessian_overrides.yaml
- tests/unit/torch/recipes/test_pipeline.py
- modelopt/torch/recipes/cli.py
- modelopt/torch/recipes/pipeline.py
- modelopt/torch/recipes/bridge.py
- examples/recipes/ptq/ptq_nvfp4_skip_first_last.yaml
- examples/recipes/ptq/ptq_fp8.yaml
- examples/recipes/ptq/ptq_nvfp4_local_hessian.yaml
- examples/recipes/multi_technique/prune_gradnas_fp8.yaml
- examples/recipes/ptq/ptq_nvfp4_awq.yaml
- tests/unit/torch/recipes/test_schema.py
- tests/unit/torch/recipes/test_sweep_config.py
- examples/recipes/multi_technique/distill_fp8.yaml

♻️ Duplicate comments (1)
modelopt/torch/recipes/schema/models.py (1)
69-81: ⚠️ Potential issue | 🟠 Major
Add a selector validator to enforce that `pattern` or `module_class` is set.
An override entry like `- format: fp8` currently passes validation even though it targets nothing. Without a selector, override application is undefined and will either silently no-op or behave unexpectedly downstream.
🛠️ Proposed fix:

```diff
 class OverrideEntry(BaseModel):
     """Per-layer or per-module-class override."""

     pattern: str | None = None  # glob: "*lm_head*"
     module_class: str | None = None  # class: "nn.LayerNorm"
     enable: bool | None = None
     format: str | None = None
     scale_type: Literal["static", "dynamic"] | None = None  # shorthand for block_sizes.type
     weights: QuantizerSpec | None = None
     activations: QuantizerSpec | None = None
     num_bits: int | list[int] | None = None
     axis: int | None = None
+
+    @model_validator(mode="after")
+    def validate_selector(self):
+        """Ensure at least one selector is provided."""
+        if not self.pattern and not self.module_class:
+            raise ValueError(
+                "Override entry must specify either 'pattern' or 'module_class'."
+            )
+        return self
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/recipes/schema/models.py` around lines 69 - 81, Add a Pydantic validator to OverrideEntry that enforces a selector by checking that at least one of pattern or module_class is provided; inside the OverrideEntry class (the model with fields pattern and module_class) implement a root validator (e.g., validate_selector) that raises a ValueError when both pattern and module_class are None/empty so an override like "- format: fp8" cannot validate without a selector.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@modelopt/torch/recipes/schema/models.py`:
- Around line 69-81: Add a Pydantic validator to OverrideEntry that enforces a
selector by checking that at least one of pattern or module_class is provided;
inside the OverrideEntry class (the model with fields pattern and module_class)
implement a root validator (e.g., validate_selector) that raises a ValueError
when both pattern and module_class are None/empty so an override like "- format:
fp8" cannot validate without a selector.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 13e1134a-96db-4dfe-ad48-e9c06f17e94c
📒 Files selected for processing (4)
- modelopt/torch/recipes/schema/models.py
- modelopt/torch/recipes/schema/presets.py
- tests/unit/torch/recipes/conftest.py
- tests/unit/torch/recipes/test_schema.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/unit/torch/recipes/conftest.py
- tests/unit/torch/recipes/test_schema.py
- Add modelopt-recipes console script entry point in pyproject.toml
- Add pyyaml to core dependencies (used directly by recipe system)
- Preserve model config (path, trust_remote_code, attn_implementation) on PipelinePlan so consumers can reconstruct the full workflow
- Validate OverrideEntry requires pattern or module_class selector
- Validate QAT mode requires training config (previous commit)
- Fix override merging: pattern overrides now handle weights/activations fields; module_class overrides merge instead of replacing existing entries
- Fix auto_quantize resolver to apply kv_cache config to format candidates
- Preserve auto_quantize_kwargs in bridge output for sweep jobs
- Fix stale paths in experiment YAML usage comments
- Fix experiment.yaml model_override to match actual sweep models
- Add mtq import and placeholder note in README Python example

Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Replace hand-picked field dicts with section.model_dump() via shared
_section_to_config() helper. This preserves technique-specific extra
fields that schema sections accept via model_config={"extra": "allow"}.
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Actionable comments posted: 1
🧹 Nitpick comments (2)
modelopt/torch/recipes/pipeline.py (2)
177-178: Inconsistent use of `exclude_none` in `model_dump()` call.
Other `model_dump()` calls (lines 154, 157, 190, 230, 234, 251) use `exclude_none=True`, but this one does not. This could result in `None` values appearing in the export config when other sections exclude them.
♻️ Proposed fix for consistency:

```diff
 if recipe.export:
-    plan.export = recipe.export.model_dump()
+    plan.export = recipe.export.model_dump(exclude_none=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/recipes/pipeline.py` around lines 177 - 178, The export assignment is missing the exclude_none flag on the model_dump call which makes it inconsistent with other dumps; update the branch where recipe.export is used so that plan.export = recipe.export.model_dump(exclude_none=True) (i.e., call model_dump with exclude_none=True) to match the other model_dump usages and prevent None values from appearing in the exported config.
301-303: Specify explicit encoding when opening the YAML file.
`open(path)` uses the system default encoding, which can vary across platforms. For YAML files (typically UTF-8), explicitly specifying `encoding="utf-8"` ensures consistent behavior.
♻️ Proposed fix:

```diff
 path = Path(path)
-with open(path) as f:
+with open(path, encoding="utf-8") as f:
     raw = yaml.safe_load(f)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/recipes/pipeline.py` around lines 301 - 303, The YAML file is opened without an explicit encoding which can vary by platform; update the open call to specify UTF-8 (either by using open(path, encoding="utf-8") or Path(path).open(encoding="utf-8")) so that the code that calls yaml.safe_load(raw) reads the file consistently; change the lines around Path(path) and with open(path) to use the explicit encoding.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/torch/recipes/pipeline.py`:
- Around line 18-26: Docstring lists technique order as "sparsity → quantization
→ distillation" but the actual ordering in the module is defined by
TECHNIQUE_ORDER which begins with "pruning"; update the module docstring to
reflect the real order (include "pruning" first and match TECHNIQUE_ORDER) and
ensure references to RecipeConfig and PipelinePlan remain accurate so the
docstring and the code (TECHNIQUE_ORDER) stay consistent.
---
Nitpick comments:
In `@modelopt/torch/recipes/pipeline.py`:
- Around line 177-178: The export assignment is missing the exclude_none flag on
the model_dump call which makes it inconsistent with other dumps; update the
branch where recipe.export is used so that plan.export =
recipe.export.model_dump(exclude_none=True) (i.e., call model_dump with
exclude_none=True) to match the other model_dump usages and prevent None values
from appearing in the exported config.
- Around line 301-303: The YAML file is opened without an explicit encoding
which can vary by platform; update the open call to specify UTF-8 (either by
using open(path, encoding="utf-8") or Path(path).open(encoding="utf-8")) so that
the code that calls yaml.safe_load(raw) reads the file consistently; change the
lines around Path(path) and with open(path) to use the explicit encoding.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3f8af51a-1e48-4b68-951f-3714bb8836eb
📒 Files selected for processing (1)
modelopt/torch/recipes/pipeline.py
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1013 +/- ##
==========================================
- Coverage 71.73% 70.41% -1.33%
==========================================
Files 211 228 +17
Lines 23949 25860 +1911
==========================================
+ Hits 17181 18209 +1028
- Misses 6768 7651 +883
☔ View full report in Codecov by Sentry.
Adopt PR NVIDIA#1000's modelopt_recipes/ YAML fragment library as the preferred preset source (Tier 1). The existing Python constant import becomes Tier 2 (deprecated), and bundled snapshot becomes Tier 3. New Tier 1 implements lightweight __base__ inheritance (no OmegaConf dependency) that resolves composed recipe directories into the same config dicts that Python constants produced. Preset name → YAML path mapping covers all 30+ presets. Each preset maps to a recipe directory (e.g., "fp8" → "general/ptq/fp8_default-fp8_kv") containing model_quant.yml + kv_quant.yml with __base__ references to atomic fragments in configs/ptq/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
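The `__base__` composition described above boils down to loading each YAML fragment and folding it over its base with a recursive merge. A minimal sketch of that merge under assumed semantics (override wins, nested dicts merge); the helper that later landed in `recipes/utils.py` as `deep_merge` may differ in detail:

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into `base`: override values win, and
    nested dicts are merged rather than replaced. The input dicts are not
    mutated. Sketch only, not the project's actual implementation."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

A `load_yaml_with_bases`-style loader would read the fragment named by `__base__`, recurse, and then `deep_merge` the current fragment on top.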
- Fix docstring technique order to include pruning
- Use exclude_none=True consistently in model_dump() calls
- Specify explicit UTF-8 encoding for YAML file reads
- Add pragma: no cover for untestable tier-loading functions in presets.py
- Add comprehensive coverage tests for pipeline, resolver, bridge, formats, and config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Each source file now has a corresponding test file:
- pipeline.py → test_pipeline.py
- bridge.py → test_bridge.py
- schema/resolver.py → test_resolver.py (new)
- schema/presets.py → test_presets.py (new, merged from test_yaml_presets.py + test_yaml_preset_e2e.py)
- schema/formats.py → test_formats.py (new)
- experiment/config.py → test_sweep_config.py

Removed test_pipeline_coverage.py and test_yaml_presets.py — tests redistributed to their proper homes.
Total: 88 tests, all passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Test paths now match source paths:
- schema/models.py → schema/test_models.py
- schema/resolver.py → schema/test_resolver.py
- schema/presets.py → schema/test_presets.py
- schema/formats.py → schema/test_formats.py
- experiment/config.py → experiment/test_config.py
- experiment/controller.py → experiment/test_controller.py

Root-level tests (test_load_recipe.py, test_pipeline.py, test_bridge.py) stay at the recipes/ test root matching their source location.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Force-pushed from 4fbd778 to ee0ab8d
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
…t types, E2E tests
- Add forward-compatible load_config() imports (formats, presets, resolver) that try PR NVIDIA#1000's OmegaConf-based loader, fall back to inline values
- Align all num_bits/scale_bits to lists (not tuples) matching OmegaConf output
- Remove _bundled_presets.py (909 lines) — redundant with YAML + Python tiers
- Fix nn.* → torch.nn.* in disabled quantizer patterns (PR NVIDIA#1000 change)
- Add get_preset_info() API with recipe.yml metadata loading
- Add nvfp4_omlp_only preset, handle mamba_moe_fp8_aggressive as Tier-2-only
- Add E2E tests covering recipe→resolve→pipeline and sweep/experiment flows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Force-pushed from 21bd347 to 6acdb3b
…nfig)
- Add recipes/utils.py with make_serializable() and try_import_load_config()
- Deduplicate _make_serializable from pipeline.py and controller.py
- Deduplicate load_config import boilerplate from formats.py, presets.py, resolver.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Force-pushed from 58e1e9a to afcf409
- Move test from test_pipeline.py to test_utils.py (test belongs with the function)
- Add tests for int key stringification, primitive passthrough, non-JSON types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
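From those test names, the shared helper's behavior can be sketched; this is an approximation inferred from the commit, not the actual implementation in `recipes/utils.py`:

```python
def make_serializable(obj):
    """Recursively turn tuples into lists and non-string dict keys into
    strings so resolved configs survive a JSON/YAML round-trip.

    Behavior inferred from the test descriptions (int key stringification,
    primitive passthrough); the real helper may differ."""
    if isinstance(obj, dict):
        return {str(k): make_serializable(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [make_serializable(v) for v in obj]
    return obj
```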
- Move deep_merge and load_yaml_with_bases to utils.py as public functions
- Make load_recipe_from_yaml and PRESET_YAML_MAP public in presets.py
- Replace private function tests (_resolve_block_sizes, _update_quant_cfg_with_kv_cache) with equivalent tests through resolve_recipe public API
- Remove redundant tests subsumed by E2E (test_load_recipe.py, duplicate controller test, duplicate pipeline all-examples test)
- Add happy-path tests for formats registry (core formats, weight/activation keys)
- Move deep_merge tests to test_utils.py alongside make_serializable tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Add dist.destroy_process_group() to init_process() so Gloo backend C++ threads are properly torn down before process exit, preventing "terminate called without an active exception" errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
This reverts commit 7d957f0. Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
Force-pushed from dbdf9bc to 66a2e6e
…ML fixture tests
- Remove pipeline.py, bridge.py, cli.py, experiment/ (staged internally)
- Remove examples/recipes/ directory
- Remove CLI entry point from pyproject.toml
- Consolidate tests into 33 YAML fixture-driven test cases
- Add sparsity range validation to schema models
- 91 tests passing (67 parametrized fixtures + 24 unit tests)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sungsoo Ha <sungsooh@nvidia.com>
What does this PR do?
Type of change: New feature
Add a YAML-based recipe system (modelopt.torch.recipes) for defining and validating model optimization workflows. Recipes provide a human-readable interface on top of ModelOpt's quantization, sparsity, pruning, and distillation APIs.
What's included:
- Preset resolution (e.g. preset: fp8 → FP8_DEFAULT_CFG)
- hf_ptq.py argument translation
Usage
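A hypothetical recipe along these lines; the section and field names below are guesses inferred from the example filenames and PR summary, not the verified schema:

```yaml
# ptq_fp8.yaml-style sketch; every key below is an assumption.
model:
  path: meta-llama/Llama-3.1-8B
quantize:
  preset: fp8            # would resolve to mtq.FP8_DEFAULT_CFG
export:
  path: ./llama-fp8
```

Loading would go through `load_recipe(...)`, which returns an `mtq.quantize`-compatible config dict.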
Testing
python -m pytest tests/unit/torch/recipes/ -v  # 34 passed, 0.11s
Existing quantization tests unaffected (357 passed, 32 skipped).
Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).
- CONTRIBUTING.md: ✅ (no new dependencies — pydantic and pyyaml already in project)