Cortex-M: build for any Cortex-M variant against Corstone-300 by rascani · Pull Request #19520 · pytorch/executorch

rascani · 2026-05-12T20:09:57Z

Summary

Extend the Cortex-M test pipeline so the cortex-m<variant>+int8 target strings registered in the AOT compile-config plumbing actually produce runnable, ISA-faithful binaries. The binary is built end-to-end with -mcpu=cortex-m<variant> — runner and core libraries alike — so CMSIS-NN's compile-time __ARM_FEATURE_DSP / __ARM_FEATURE_MVE selector exercises the matching kernel implementation. The Corstone-300 M55 simulator is an ISA superset of every earlier Cortex-M, so it executes binaries compiled for older cores without modification — the CI gate becomes "did the right CMSIS-NN code path execute correctly" rather than "did per-CPU silicon behave as expected".

The build pipeline learns the target CPU end-to-end:

* build_executorch.sh accepts --target_cpu, passes -DTARGET_CPU to the toolchain CMake, and stages per-CPU artifacts in cmake-out-<cpu> so they don't clobber each other.
* build_test_runner.sh derives target_cpu from --target (using the same cortex-m<X>+int8 regex as build_executor_runner.sh) and forwards it.
* build_executor_runner.sh derives the matching target_cpu, points ET_BUILD_DIR_PATH at cmake-out-<cpu>, passes -Dexecutorch_DIR explicitly so find_package doesn't silently fall back to a stale cmake-out if one exists, and supplies a dummy ETHOSU_TARGET_NPU_CONFIG=ethos-u55-128 so core_platform's ethosu_get_architecture() parser stays happy.
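
The target-to-CPU derivation described in the list above could look roughly like the following. This is only a sketch: the scripts' actual regex is not shown in this description, so a POSIX `case` pattern stands in for it, and the function names `derive_target_cpu` and `build_dir_for` are hypothetical.

```shell
# Hypothetical helper, not the scripts' actual code: map a
# "cortex-m<X>+int8" target string to its CPU name.
derive_target_cpu() {
  case "$1" in
    cortex-m?*+int8)
      # Strip the "+int8" suffix, leaving e.g. "cortex-m4".
      printf '%s\n' "${1%+int8}"
      ;;
    *)
      echo "unsupported target: $1" >&2
      return 1
      ;;
  esac
}

# Per-CPU build directory name, following the cmake-out-<cpu> convention.
# Note: "cpu" is a plain (global) shell variable to stay POSIX-compatible.
build_dir_for() {
  cpu="$(derive_target_cpu "$1")" || return 1
  printf 'cmake-out-%s\n' "${cpu}"
}
```

Because both the runner and the core libraries take their CPU from the same derived string, a target such as cortex-m4+int8 would map to the cortex-m4 CPU and the cmake-out-cortex-m4 directory in every script.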

Without these changes, build_executorch.sh defaulted to -mcpu=cortex-m55, so the core libraries (libexecutorch.a, libcortex_m_kernels.a, the bundled CMSIS-NN) baked in M55+MVE code paths. A runner built with -mcpu=cortex-m4 would link those libraries and execute MVE instructions on Corstone-300's M55 — passing bundled-IO checks while testing the wrong code path. The explicit -Dexecutorch_DIR is needed because CMake's find_package(HINTS ...) is not authoritative — a leftover cmake-out/lib/cmake/ExecuTorch/ from an earlier build was being preferred over the per-CPU dir we actually asked for.
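
As a rough illustration of pinning the package location, the helper below builds the explicit -Dexecutorch_DIR argument for a given CPU. It is a sketch under assumptions: the function name `executorch_dir_arg` and the root-directory parameter are hypothetical, and the lib/cmake/ExecuTorch suffix is taken from the stale path quoted above rather than from the scripts themselves.

```shell
# Hypothetical helper: print the CMake argument that points find_package
# at the per-CPU install tree instead of any leftover cmake-out.
executorch_dir_arg() {
  et_root="$1"  # assumed: directory that contains the cmake-out-<cpu> trees
  cpu="$2"      # e.g. cortex-m4
  printf '%s=%s/cmake-out-%s/lib/cmake/ExecuTorch\n' \
    "-Dexecutorch_DIR" "${et_root}" "${cpu}"
}
```

Setting `<PackageName>_DIR` on the command line tells CMake exactly where the package config file lives, so the search that would otherwise consider other locations is skipped.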

One transient patch is layered into the externally-fetched ethos-u/core_platform repo via the existing patch_repo mechanism: an #if defined(__ARM_ARCH_8M_MAIN__) || defined(__ARM_ARCH_8_1M_MAIN__) guard around the MPU init block in corstone-300/target.cpp. Without it, the Armv8-M-only ARM_MPU_RBAR / ARM_MPU_RLAR API breaks the build for older cores. The FVP doesn't enforce protection regions without an explicit setup, so simulation correctness is unaffected. The patch is a bridge — see TODO at corstone_utils.cmake:52 — pending upstream merge of the equivalent guard.

Inside our own runner, the optional Armv8.1-M PMU intrinsics (ARM_PMU_*) in arm_executor_runner.cpp and arm_perf_monitor.cpp are guarded on __ARM_ARCH_8_1M_MAIN__. Earlier cores get a zero cycle count rather than a compile error; functional correctness is unaffected. run_fvp.sh routes all cortex-m* targets except cortex-m85* to the Corstone-300 FVP.
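
The routing rule in run_fvp.sh can be pictured with a small predicate like the one below. It is a minimal sketch, not the script's actual code: the function name `routes_to_corstone_300` is hypothetical, and where cortex-m85* targets go instead is not stated here, so the sketch only answers yes or no.

```shell
# Hypothetical predicate: succeed when a target should run on the
# Corstone-300 FVP. Order matters: the cortex-m85* exception is
# checked before the general cortex-m* pattern.
routes_to_corstone_300() {
  case "$1" in
    cortex-m85*) return 1 ;;
    cortex-m*)   return 0 ;;
    *)           return 1 ;;
  esac
}
```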

Test Plan

Locally validated end-to-end on Corstone-300 with the qadd model:

* cortex-m55+int8: baseline, PASS; op_quantize_per_tensor.cpp.obj in cmake-out-cortex-m55 contains MVE instructions (vdup.16, vmax.s16).
* cortex-m4+int8: PASS; the same object in cmake-out-cortex-m4 has no MVE, only single-precision FP (vmul.f32, vcvt.s32.f32). CMSIS-NN selects the DSP path (1275 DSP opcodes in libcmsis-nn.a).
* cortex-m7+int8: PASS; same shape as M4.
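
A check along these lines can be scripted by scanning a disassembly for the MVE mnemonics named above. How the objects were actually inspected is not stated here, so this is only a sketch: the function name `has_mve_mnemonics` is hypothetical, and it looks for just the two mnemonics mentioned rather than the full MVE instruction set.

```shell
# Hypothetical check: read disassembly text on stdin and succeed if either
# of the two MVE mnemonics named in the test plan appears as a whole word.
has_mve_mnemonics() {
  grep -Eq '(^|[[:space:]])(vdup\.16|vmax\.s16)([[:space:]]|$)'
}
```

With an Arm GNU toolchain installed, the input could come from something like `arm-none-eabi-objdump -d <object file> | has_mve_mnemonics`, which would be expected to succeed for the cortex-m55 object and fail for the cortex-m4 one.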

Scalar-class variants (cortex-m{0,0plus,3,23}+int8) still need a follow-up: an Armv6-M HardFault_Handler guard in target.cpp and a core_software/cmsis.cmake ARMCM0plus directory-case fix. The target_cpu plumbing here already accommodates soft-float ABI builds, so the follow-up only needs those two additional fixes.

Authored with Claude.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell


pytorch-bot · 2026-05-12T20:10:02Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19520

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

Run pull jobs on OSDC in pull requests shadow mode

❌ 2 New Failures

As of commit 22f6080 with merge base 23a91d5:

NEW FAILURES - The following jobs have failed:

pull / test-mcu-cortex-m-backend / linux-job (gh)
RuntimeError: Command docker exec -t 706151546b63a321af667c6be333c36ed5e0244a8086913e73478e34f0fbcf74 /exec failed with exit code 1
pull / unittest / macos / macos-job (gh)
export/tests/test_target_recipes.py::TestTargetRecipes::test_mv2_model

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-05-12T20:10:47Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

rascani requested review from AdrianLundell, Erik-Lundell, mansnils, psiddh and zingo May 12, 2026 20:09

meta-cla Bot added the CLA Signed label May 12, 2026

github-actions Bot added the ciflow/trunk and module: arm labels May 12, 2026

rascani wants to merge 1 commit into pytorch:main from rascani:cortex-m-non-mve-corstone
