Cortex-M: Thread target CPU/ISA through the AOT pass manager by rascani · Pull Request #19470 · pytorch/executorch

rascani · 2026-05-11T22:12:25Z

Summary

Introduce a CortexMCompileConfig dataclass (cpu + isa) that carries Cortex-M target information from the --target=cortex-m+int8 CLI string into CortexMPassManager. The full standard Cortex-M lineup is registered (M0, M0+, M3, M4, M7, M23, M33, M35P, M52, M55, M85), each with a sensible default ISA; the optional-DSP M33/M35P and optional-MVE M52/M55/M85 cases can be expressed via the isa= kwarg.

No pass reads the config yet, so this change is purely plumbing, but it positions both the upcoming AOT scratch-buffer sizing work (#16580) and the CMSIS-NN scalar (#17646) / DSP (#17644) backend support to plug in without re-plumbing the call site. Actually building for the new variants still requires

CortexMTester gains an optional config kwarg, and the Pico 2 MLP example now constructs CortexMPassManager with cpu='cortex-m33' to match the RP2350 hardware it targets.

Authored with Claude.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Introduce a CortexMCompileConfig dataclass (cpu + isa) that carries Cortex-M target information from the --target=cortex-m<variant>+int8 CLI string into CortexMPassManager. The full standard Cortex-M lineup is registered (M0, M0+, M3, M4, M7, M23, M33, M35P, M52, M55, M85), each with a sensible default ISA; the optional-DSP M33/M35P and optional-MVE M52/M55/M85 cases can be expressed via the isa= kwarg. No pass reads the config yet, so this change is purely plumbing — but it positions both the upcoming AOT scratch-buffer sizing work (pytorch#16580) and the M0+ (pytorch#17646) / M33 (pytorch#17644) backend support to plug in without re-plumbing the call site. Actually building for the new variants still requires Phase 2's MPS2 platform glue. CortexMTester gains an optional config kwarg, and the Pico 2 MLP example now constructs CortexMPassManager with cpu='cortex-m33' to match the RP2350 hardware it targets. Authored with Claude.

pytorch-bot · 2026-05-11T22:12:29Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19470

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Run pull request jobs on OSDC runners in shadow mode

❌ 1 New Failure, 3 Pending, 1 Unrelated Failure

As of commit 2ec0840 with merge base 02f9866 ():

NEW FAILURE - The following job has failed:

Lint / lintrunner-mypy (gh)
>>> Lint for backends/arm/test/passes/test_symbolic_value_range.py:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / test-custom-ops-linux / linux-job (gh) (matched linux rule in flaky-rules.json)
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-05-11T22:13:13Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

digantdesai · 2026-05-12T03:00:34Z

    "TOSA-1.0+INT",
    "TOSA-1.0+FP",
    "TOSA-1.0+INT+int16",
+    "cortex-m0+int8",


zingo · 2026-05-12T05:15:56Z

Really nice!

Erik-Lundell · 2026-05-12T07:31:19Z

+# Default ISA per CPU follows the most common configuration each core is
+# shipped with. M33/M35P optionally lack DSP, and M52/M55/M85 optionally
+# lack MVE; callers can pass `isa=` explicitly to override.
+_CPU_DEFAULT_ISA: dict[str, str] = {


Use the Cpu, Isa literals you defined for typing?

Switched to Enum

Erik-Lundell · 2026-05-12T07:42:45Z

+    """
+
+    cpu: Cpu = "cortex-m55"
+    isa: Isa | None = None


Use dataclass default/ default_factory instead of | None and post_init. Avoids annoying typing issues

Erik-Lundell · 2026-05-12T07:56:32Z

    "TOSA-1.0+INT+int16",
+    "cortex-m0+int8",
+    "cortex-m0plus+int8",
+    "cortex-m3+int8",


I don't think +int8 is a good model for a particular target. int8 is a result of quantization and cmsis_nn support, not target. If we suddenly start supporting for example int16 for some cortex-m, all such cortex-m targets will then support int16. I know that we do something like this for TOSA with +int16, but I at least don't think int8 should make it into the CortexMCompileConfig. It can be used in aot_arm_compiler for deciding quantization.

Agreed. I've dropped the +int8 on all cortex-m. It'll really just explode this list.

Erik-Lundell · 2026-05-12T07:57:01Z

    """Cortex-M/CMSIS-NN compilation path with no delegation."""
-    logging.info("Using Cortex-M/CMSIS-NN compilation path (no delegation)")
+    logging.info(
+        "Using Cortex-M/CMSIS-NN compilation path for cpu=%s isa=%s",


Nit: why not use f-strings

Erik-Lundell · 2026-05-12T07:58:17Z

        self.passes: list[PassClass] = (  # type: ignore[assignment]
            passes if passes is not None else self.pass_list  # type: ignore[assignment]
        )
+        self.config: CortexMCompileConfig = config or CortexMCompileConfig()


Add docstr to init about the default value of config

Erik-Lundell · 2026-05-12T07:59:17Z

+
+from executorch.backends.cortex_m.compile_config import CortexMCompileConfig
+
+_HAS_CMSIS_NN = find_spec("cmsis_nn") is not None


pytest has a function called import_or_skip or similar

Also, I think there is a danger in skipping tests when a dependency is missing. It could easily lead to something not being tested when we think it is.

Erik-Lundell · 2026-05-12T08:02:31Z

+from dataclasses import dataclass
+from typing import Literal
+
+Cpu = Literal[


Literals are fine I guess, but why not use Enums for minimal risk of typos? It is easy to go from string to enum in cases where you might need it (aot_arm_compiler)

Erik-Lundell · 2026-05-12T08:06:30Z

+# Default ISA per CPU follows the most common configuration each core is
+# shipped with. M33/M35P optionally lack DSP, and M52/M55/M85 optionally
+# lack MVE; callers can pass `isa=` explicitly to override.
+_CPU_DEFAULT_ISA: dict[str, str] = {


cmsis_nn has a function 'resolve_backend' https://github.com/ARM-software/CMSIS-NN/blob/main/Source/Bindings/arm_py_backend.cpp that should be used for default Isa.

Erik-Lundell

Hi! I was actually just working on something like this, there are patches on the way for aot memory planning: #19505

The main differences that I would like to see are:

Enums instead of literals
Involve cmsis_nn, for example have a .backend function that can be passed to cmsis_nn's buffer size calculationen functions. See my PR.
Possibly: An "any" option that could be overly pessimistic but always works.

Aligns CortexMTargetConfig with the design Erik proposes in pytorch#19505 while keeping the wider plumbing in place. The earlier CortexMCompileConfig is renamed to CortexMTargetConfig (and its file moved to target_config.py) to disambiguate from EdgeCompileConfig — this dataclass models a compilation *target*, not a step in the compile pipeline. Adopted from Erik's feedback: * CortexM enum replaces the Cpu/Isa Literals — typo-safe and IDE-friendly. * `.backend` property returns `cmsis_nn.Backend` directly, resolved via `cmsis_nn.resolve_backend(cmsis_nn.CortexM.<X>)`. The hand-rolled `_CPU_DEFAULT_ISA` dict is gone — cmsis_nn is the single source of truth for the CPU → backend mapping. * CortexMPass abstract base class added; CortexMPassManager.transform() uses signature inspection to inject both `exported_program` and `target_config` into passes that declare them (mirroring Erik's proposal). The pass manager also gains stricter validation — the exported_program must be a real ExportedProgram and the pass list must contain classes, not instances — failing fast instead of producing opaque errors deep in _transform. * cmsis_nn is now a hard dependency for the cortex_m tests: the top-level `import cmsis_nn` in test_target_config.py replaces the previous skipif-on-find_spec dance, addressing Erik's concern that skipping tests on missing deps can mask regressions. * `+int8` dropped from cortex-m target strings — quantization is a result of the export flow, not a CPU attribute. TARGETS, help text, from_target_string, CI script and README aligned. * Logging in `_to_edge_cortex_m` and the --delegate-ignored warning switched to f-strings. * `__init__` docstring on CortexMPassManager documents the exported_program / passes / target_config defaults (including the M55+MVE fallback that matches pre-config behaviour). * `import-not-found` removed from the cmsis_nn type-ignore — only `import-untyped` actually fires, and if cmsis_nn ever ships stubs the unused ignore will become a tripwire. Kept the optional `isa` override field for the optional-extension cases (M55 without MVE, M33 without DSP, etc.) — different from Erik's enum-only design, but the override remains useful for cores where ISA extensions are optional. A `_SUPPORTED_BACKENDS` table encodes the per-CPU architectural capability set so overrides validate at construction; forcing MVE on an M0 raises ValueError with the actual supported list. The SCALAR ⊂ DSP ⊂ MVE supersession reflects that an MVE-capable core also runs DSP and scalar code. Defers Erik's `ANY` proposal. In pytorch#19505 ANY falls back to MVE, but an honest "any cortex-m" choice would have to do worst-case scratch buffer planning across the ISA classes (which may not be MVE). Deferring until the scratch-buffer side lands and we can implement the worst-case analysis properly. Authored with Claude.

rascani · 2026-05-13T00:23:35Z

Hi! I was actually just working on something like this, there are patches on the way for aot memory planning: #19505

The main differences that I would like to see are:

Enums instead of literals

Involve cmsis_nn, for example have a .backend function that can be passed to cmsis_nn's buffer size calculationen functions. See my PR.

Possibly: An "any" option that could be overly pessimistic but always works.

Thanks for the feedback and so sorry we wound up duplicating effort. I've incorporated most of the feedback, but one of the things I punted on was an "Any" option. I think that's a valid use case where I want to generate a PTE that is compatible with any cortex-m target, but I want to think through how we'll do the scratch buffer allocation for it. One approach would be to continue to use temp allocator those. Another would be to memory plan for the worst case scratch buffer allocation. We'd probably need to query all of the ISA backends and pick the worst one.

CI's mypy env doesn't pip-install cmsis_nn (it's a native pybind11 module — type-checking-only envs typically skip such deps), so `import-not-found` fires there even though `import-untyped` is the one that fires locally. The combo `[import-not-found, import-untyped]` is honest about both states; dropping either half breaks one or the other environment. Authored with Claude.

Erik-Lundell · 2026-05-13T06:44:31Z

Hi! I was actually just working on something like this, there are patches on the way for aot memory planning: #19505
The main differences that I would like to see are:

Enums instead of literals

Involve cmsis_nn, for example have a .backend function that can be passed to cmsis_nn's buffer size calculationen functions. See my PR.

Possibly: An "any" option that could be overly pessimistic but always works.

Thanks for the feedback and so sorry we wound up duplicating effort. I've incorporated most of the feedback, but one of the things I punted on was an "Any" option. I think that's a valid use case where I want to generate a PTE that is compatible with any cortex-m target, but I want to think through how we'll do the scratch buffer allocation for it. One approach would be to continue to use temp allocator those. Another would be to memory plan for the worst case scratch buffer allocation. We'd probably need to query all of the ISA backends and pick the worst one.

Thanks! Looks good from my side now. I agree on the "any" option, we can save that for another time.

The Zephyr `hello-executorch` README has `` directives that the test-arm-backend-zephyr CI extracts and executes verbatim. The prior wording used `--target=cortex-m55+int8`, which the AOT compiler no longer accepts after the +int8 drop. Update both code blocks (and the surrounding prose) to use the new bare `--target=cortex-m55` spelling. Authored with Claude.

rascani requested review from AdrianLundell, mansnils and psiddh May 11, 2026 22:12

rascani requested a review from digantdesai as a code owner May 11, 2026 22:12

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 11, 2026

github-actions Bot added ciflow/trunk module: arm Issues related to arm backend labels May 11, 2026

Fix lints

3000f9c

digantdesai reviewed May 12, 2026

View reviewed changes

digantdesai approved these changes May 12, 2026

View reviewed changes

Erik-Lundell reviewed May 12, 2026

View reviewed changes

Erik-Lundell requested changes May 12, 2026

View reviewed changes

zingo mentioned this pull request May 12, 2026

Zephyr: Auto-export models from prj.conf #19369

Open

Erik-Lundell reviewed May 13, 2026

View reviewed changes

Comment thread backends/cortex_m/test/tester.py

Erik-Lundell approved these changes May 13, 2026

View reviewed changes

rascani merged commit a3b13f3 into pytorch:main May 13, 2026
595 of 598 checks passed

rascani deleted the cortex-m-compile-config branch May 13, 2026 18:20


		from executorch.backends.cortex_m.compile_config import CortexMCompileConfig

		_HAS_CMSIS_NN = find_spec("cmsis_nn") is not None

Conversation

rascani commented May 11, 2026 • edited by pytorch-bot Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Uh oh!

pytorch-bot Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19470

❗ 1 Active SEVs

❌ 1 New Failure, 3 Pending, 1 Unrelated Failure

Uh oh!

github-actions Bot commented May 11, 2026

This PR needs a release notes: label

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zingo commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Erik-Lundell left a comment

Choose a reason for hiding this comment

Uh oh!

rascani commented May 13, 2026

Uh oh!

Uh oh!

Erik-Lundell commented May 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

rascani commented May 11, 2026 •

edited by pytorch-bot Bot

Loading

pytorch-bot Bot commented May 11, 2026 •

edited

Loading

This PR needs a `release notes:` label

zingo commented May 12, 2026 •

edited

Loading