Fix empty CUDA_ARCHITECTURES when SM120 is the only arch #2804

Open
sudhakarsingh27 wants to merge 1 commit into NVIDIA:main from sudhakarsingh27:sudhakars/fix-sm120-cmake-arch
@sudhakarsingh27
Collaborator

When SM120 (Blackwell) is the only architecture in the build and gets moved to per-source COMPILE_OPTIONS, CMAKE_CUDA_ARCHITECTURES becomes empty, causing CMake to error with "CUDA_ARCHITECTURES is empty".

Set CMAKE_CUDA_ARCHITECTURES to OFF as a placeholder when it's empty. The actual arch flags are still added per-source via COMPILE_OPTIONS.
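
For readers who have not opened the diff, the guard described above might look roughly like the fragment below. This is a sketch under assumptions, not the PR's exact lines: the source file name `example_kernel.cu` and the `--generate-code` value are hypothetical, chosen only to show that a per-source flag is independent of `CMAKE_CUDA_ARCHITECTURES`.

```cmake
# Fragment intended to sit in a CMakeLists.txt after the arch-removal logic
# and before add_library(). Sketch only; the PR's actual wording may differ.

# If every requested arch was moved to per-source COMPILE_OPTIONS, the list
# is now empty. OFF tells CMake not to add architecture flags on its own.
if(NOT CMAKE_CUDA_ARCHITECTURES)
  set(CMAKE_CUDA_ARCHITECTURES OFF)
endif()

# Hypothetical per-source flag: file name and arch value are assumptions.
set_property(SOURCE example_kernel.cu APPEND PROPERTY COMPILE_OPTIONS
             "--generate-code=arch=compute_120,code=sm_120")
```

Because `if(NOT ...)` treats both an empty string and `OFF` as false, running the guard a second time is harmless.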

@sudhakarsingh27 sudhakarsingh27 requested a review from ptrendx March 26, 2026 00:59
@sudhakarsingh27 sudhakarsingh27 force-pushed the sudhakars/fix-sm120-cmake-arch branch from db87355 to 7e8e188 Compare March 26, 2026 01:01
@greptile-apps
Contributor

greptile-apps bot commented Mar 26, 2026

Greptile Summary

This PR fixes a CMake build failure that occurs when CMAKE_CUDA_ARCHITECTURES is left empty after all user-specified architectures (e.g. SM120 alone) are removed from the list and promoted into per-source COMPILE_OPTIONS. CMake treats an empty CUDA_ARCHITECTURES as a fatal error, so the fix inserts a guard that sets the variable to OFF — a documented CMake sentinel meaning "do not add automatic -gencode flags" — while leaving the actual per-source arch flags intact.

Key points:

  • The fix is correct and minimal: a 7-line if(NOT CMAKE_CUDA_ARCHITECTURES) block placed after all arch-removal loops and before add_library.
  • It handles not just SM120-only builds but also SM100-only, SM101-only, and SM110-only builds, as all four arch families are removed from CMAKE_CUDA_ARCHITECTURES in preceding blocks.
  • Setting CMAKE_CUDA_ARCHITECTURES to OFF is documented CMake behavior that disables automatic architecture flags; per-source COMPILE_OPTIONS are independent of it and continue to apply.
  • No behavioral change for builds that include at least one non-promoted architecture (e.g., 70 80 89 90 120).
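
The two cases in the key points (a promoted-only build versus a mixed build) can be checked in isolation with a small script-mode sketch. The function name `demo_filter_archs` is hypothetical and exists only for this illustration. The list of promoted families (100, 101, 110, 120) is taken from the review text above, and the real CMakeLists.txt may remove them differently.

```cmake
# demo_arch_guard.cmake -- run with: cmake -P demo_arch_guard.cmake
# Hypothetical helper for illustration only; not code from the PR.
function(demo_filter_archs input_archs out_var)
  set(archs ${input_archs})

  # Drop the arch families that the review says are promoted to
  # per-source COMPILE_OPTIONS.
  foreach(promoted 100 101 110 120)
    list(REMOVE_ITEM archs ${promoted})
  endforeach()

  # The guard: an empty list is replaced by the OFF placeholder.
  if(NOT archs)
    set(archs OFF)
  endif()

  set(${out_var} "${archs}" PARENT_SCOPE)
endfunction()

demo_filter_archs("120" only_120)
demo_filter_archs("70;80;89;90;120" mixed)

message(STATUS "120 alone          -> ${only_120}")
message(STATUS "70 80 89 90 120    -> ${mixed}")
```

Under these assumptions the first call should yield `OFF` and the second `70;80;89;90`, which matches the claim that builds with at least one non-promoted architecture are unaffected.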

Confidence Score: 5/5

  • Safe to merge — the change is a minimal, correct CMake build fix with no runtime or behavioral impact on existing multi-arch builds.
  • The fix is 7 lines, uses documented CMake semantics (CMAKE_CUDA_ARCHITECTURES OFF), is placed at exactly the right point in the file (after all arch-removal loops, before add_library), and does not affect any builds where at least one non-promoted architecture remains. The only concern is that the PR checklist boxes and the GitHub issue reference were left unfilled, which is a process note rather than a code concern.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| transformer_engine/common/CMakeLists.txt | Adds a 7-line guard that sets CMAKE_CUDA_ARCHITECTURES to OFF when it becomes empty after all architectures are moved to per-source COMPILE_OPTIONS; fix is correct, minimal, and handles SM100/SM101/SM110/SM120 single-arch builds. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User sets CMAKE_CUDA_ARCHITECTURES\ne.g. 120] --> B{Is CMAKE_CUDA_ARCHITECTURES defined?}
    B -- No --> C[Auto-set based on CUDA toolkit version\ne.g. 70 80 89 90 100 120]
    B -- Yes --> D[Use user-supplied value]
    C --> E[Arch processing loop]
    D --> E
    E --> F{120 in list?}
    F -- Yes --> G[Remove 120 from CMAKE_CUDA_ARCHITECTURES\nAdd to NVTE_GENERIC_ARCHS and NVTE_SPECIFIC_ARCHS]
    F -- No --> H{100/101/110 in list?}
    G --> H
    H -- Yes --> I[Remove from CMAKE_CUDA_ARCHITECTURES\nAdd to NVTE_GENERIC/SPECIFIC_ARCHS]
    H -- No --> J{CMAKE_CUDA_ARCHITECTURES empty?}
    I --> J
    J -- Yes --> K["set(CMAKE_CUDA_ARCHITECTURES OFF)\n✅ PR Fix — avoids CMake error"]
    J -- No --> L[Keep remaining archs in CMAKE_CUDA_ARCHITECTURES]
    K --> M[add_library transformer_engine]
    L --> M
    M --> N[Per-source COMPILE_OPTIONS applied\nfor NVTE_GENERIC_ARCHS and NVTE_SPECIFIC_ARCHS]
    N --> O[Build succeeds with correct arch flags]

Reviews (2): Last reviewed commit: "Fix empty CUDA_ARCHITECTURES when SM120 ..."

Signed-off-by: Sudhakar Singh <sudhakars@nvidia.com>
@sudhakarsingh27 sudhakarsingh27 force-pushed the sudhakars/fix-sm120-cmake-arch branch from 7e8e188 to b7e54b3 Compare March 26, 2026 01:09