Skip to content

Enhancement - Support AMDGPU_TARGETS in ROCm CMake builds#793

Open
polarG wants to merge 1 commit intomainfrom
dev/hongtaozhang/fix-rocm-without-gpu-arch-set
Open

Enhancement - Support AMDGPU_TARGETS in ROCm CMake builds#793
polarG wants to merge 1 commit intomainfrom
dev/hongtaozhang/fix-rocm-without-gpu-arch-set

Conversation

@polarG
Copy link
Copy Markdown
Contributor

@polarG polarG commented Mar 25, 2026

Description
Allow specifying AMD GPU architectures via the AMDGPU_TARGETS environment variable in rocm_common.cmake. This fixes builds on machines without a GPU (e.g., CI) where hipcc auto-detection fails. When unset, the previous auto-detection behavior is preserved.

@polarG polarG requested a review from a team as a code owner March 25, 2026 22:44
Copilot AI review requested due to automatic review settings March 25, 2026 22:44
@polarG polarG self-assigned this Mar 25, 2026
@polarG polarG added enhancement New feature or request containers SuperBench Containers ROCm labels Mar 25, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds support for driving ROCm/HIP GPU architecture selection from the AMDGPU_TARGETS environment variable during CMake configuration, to make ROCm micro-benchmark builds work on hosts where hipcc GPU auto-detection is unavailable (e.g., CI without a GPU).

Changes:

  • Read AMDGPU_TARGETS from the environment and convert it to a CMake list for use during configuration.
  • Cache AMDGPU_TARGETS and (for CMake ≥ 3.21) set CMAKE_HIP_ARCHITECTURES when the env var is provided.
  • Preserve the prior “auto-detection” path when AMDGPU_TARGETS is not set, and add status logging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +47 to +52
string(REPLACE " " ";" HIP_ARCH_LIST "$ENV{AMDGPU_TARGETS}")
set(AMDGPU_TARGETS ${HIP_ARCH_LIST} CACHE STRING "AMD GPU targets to compile for")
if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.21.0)
set(CMAKE_HIP_ARCHITECTURES ${HIP_ARCH_LIST})
endif()
message(STATUS "Using AMDGPU_TARGETS from environment: ${HIP_ARCH_LIST}")
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(DEFINED ENV{AMDGPU_TARGETS}) is true even when the environment variable exists but is empty. In that case HIP_ARCH_LIST becomes empty and this will cache an empty AMDGPU_TARGETS, potentially breaking the HIP toolchain configuration instead of falling back to auto-detection. Consider checking that $ENV{AMDGPU_TARGETS} is non-empty (after trimming) before taking this branch.

Suggested change
string(REPLACE " " ";" HIP_ARCH_LIST "$ENV{AMDGPU_TARGETS}")
set(AMDGPU_TARGETS ${HIP_ARCH_LIST} CACHE STRING "AMD GPU targets to compile for")
if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.21.0)
set(CMAKE_HIP_ARCHITECTURES ${HIP_ARCH_LIST})
endif()
message(STATUS "Using AMDGPU_TARGETS from environment: ${HIP_ARCH_LIST}")
# Copy and trim the environment variable to ensure it is non-empty after stripping whitespace.
set(_amdgpu_targets_raw "$ENV{AMDGPU_TARGETS}")
string(STRIP "${_amdgpu_targets_raw}" _amdgpu_targets)
if(NOT _amdgpu_targets STREQUAL "")
string(REPLACE " " ";" HIP_ARCH_LIST "${_amdgpu_targets}")
set(AMDGPU_TARGETS ${HIP_ARCH_LIST} CACHE STRING "AMD GPU targets to compile for")
if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.21.0)
set(CMAKE_HIP_ARCHITECTURES ${HIP_ARCH_LIST})
endif()
message(STATUS "Using AMDGPU_TARGETS from environment: ${HIP_ARCH_LIST}")
else()
if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.21.0)
set(CMAKE_HIP_ARCHITECTURES OFF)
endif()
message(STATUS "AMDGPU_TARGETS is defined but empty, relying on hipcc auto-detection")
endif()

Copilot uses AI. Check for mistakes.
# - CMAKE_HIP_ARCHITECTURES is the native CMake variable for HIP (requires >= 3.21)
if(DEFINED ENV{AMDGPU_TARGETS})
string(REPLACE " " ";" HIP_ARCH_LIST "$ENV{AMDGPU_TARGETS}")
set(AMDGPU_TARGETS ${HIP_ARCH_LIST} CACHE STRING "AMD GPU targets to compile for")
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set(AMDGPU_TARGETS ... CACHE STRING ...) will not override an existing cached AMDGPU_TARGETS value. If a build directory is reconfigured with a different AMDGPU_TARGETS environment variable, this code may silently keep using the old cached value. Decide the intended precedence (env vs cache) and implement it explicitly (e.g., only set when the cache entry is unset, or use FORCE when env should win).

Suggested change
set(AMDGPU_TARGETS ${HIP_ARCH_LIST} CACHE STRING "AMD GPU targets to compile for")
set(AMDGPU_TARGETS ${HIP_ARCH_LIST} CACHE STRING "AMD GPU targets to compile for" FORCE)

Copilot uses AI. Check for mistakes.
Comment on lines +43 to +45
# Both the CMake variable AMDGPU_TARGETS and CMAKE_HIP_ARCHITECTURES must be set:
# - AMDGPU_TARGETS is read by ROCm's hip-config-amd.cmake to set --offload-arch flags
# - CMAKE_HIP_ARCHITECTURES is the native CMake variable for HIP (requires >= 3.21)
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says both AMDGPU_TARGETS and CMAKE_HIP_ARCHITECTURES must be set, but the micro-benchmark CMake projects in this repo are LANGUAGES CXX and compile via hipcc as the C++ compiler (they don’t enable the HIP language). In that setup, CMAKE_HIP_ARCHITECTURES is unlikely to be consumed by CMake at all, so this note can be misleading. Please adjust the comment (and/or the logic) to reflect what actually drives offload arch selection here (typically AMDGPU_TARGETS as read by hip-config-amd.cmake).

Suggested change
# Both the CMake variable AMDGPU_TARGETS and CMAKE_HIP_ARCHITECTURES must be set:
# - AMDGPU_TARGETS is read by ROCm's hip-config-amd.cmake to set --offload-arch flags
# - CMAKE_HIP_ARCHITECTURES is the native CMake variable for HIP (requires >= 3.21)
# In this repository's micro-benchmarks, AMDGPU_TARGETS is what actually drives
# --offload-arch selection, via ROCm's hip-config-amd.cmake and hipcc.
# CMAKE_HIP_ARCHITECTURES is set (when supported by CMake >= 3.21) for compatibility
# with projects that enable the CMake HIP language, but is not required here.

Copilot uses AI. Check for mistakes.
# - AMDGPU_TARGETS is read by ROCm's hip-config-amd.cmake to set --offload-arch flags
# - CMAKE_HIP_ARCHITECTURES is the native CMake variable for HIP (requires >= 3.21)
if(DEFINED ENV{AMDGPU_TARGETS})
string(REPLACE " " ";" HIP_ARCH_LIST "$ENV{AMDGPU_TARGETS}")
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string(REPLACE " " ";" ...) will turn repeated spaces (or leading/trailing spaces) into empty list elements (e.g., gfx90a gfx942 -> gfx90a;;gfx942). That can propagate into AMDGPU_TARGETS/CMAKE_HIP_ARCHITECTURES and potentially generate invalid arch flags. Consider normalizing whitespace first (e.g., regex replace [ \t]+ with ; and stripping) or using a list-splitting helper that avoids empty entries.

Suggested change
string(REPLACE " " ";" HIP_ARCH_LIST "$ENV{AMDGPU_TARGETS}")
# Normalize whitespace to avoid empty list elements:
# 1) strip leading/trailing whitespace
# 2) collapse runs of spaces/tabs into a single list separator ';'
set(_AMDGPU_TARGETS_RAW "$ENV{AMDGPU_TARGETS}")
string(STRIP "${_AMDGPU_TARGETS_RAW}" _AMDGPU_TARGETS_STRIPPED)
string(REGEX REPLACE "[ \t]+" ";" HIP_ARCH_LIST "${_AMDGPU_TARGETS_STRIPPED}")

Copilot uses AI. Check for mistakes.
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.69%. Comparing base (036c471) to head (e0ac11f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #793   +/-   ##
=======================================
  Coverage   85.69%   85.69%           
=======================================
  Files         103      103           
  Lines        7890     7890           
=======================================
  Hits         6761     6761           
  Misses       1129     1129           
Flag Coverage Δ
cpu-python3.10-unit-test 70.42% <ø> (ø)
cpu-python3.7-unit-test 69.85% <ø> (ø)
cuda-unit-test 83.60% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

containers SuperBench Containers enhancement New feature or request ROCm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants