fix: AE-2197: stable gpu sorting by jhcipar · Pull Request #285 · runpod/flash

jhcipar · 2026-03-25T14:34:21Z

Summary

Canonicalizes gpuIds strings so comparisons and hashing are stable even when input order/spacing differs
Adds GpuGroup.normalize_gpu_ids_str(...) and uses it in both GPU ID generation (to_gpu_ids_str) and ServerlessResource input sync, so raw gpuIds inputs are normalized consistently.

Why

Previously, semantically equivalent gpuIds values could produce different strings depending on token order, which could cause avoidable config diffs and unstable hash/equality behavior, and trigger lots of releases
This change makes gpuIds representation deterministic across code paths

runpod-Henrik

PR #285 — fix: AE-2197: stable gpu sorting

Repo: runpod/flash
Severity: LOW
Key findings: No test covers the actual stated goal — that the same GPU set in different orders produces the same config hash and avoids a spurious redeploy.

1. Missing test for the user scenario this fix is meant to prevent

The purpose of this fix is: a user who deploys with gpuIds="AMPERE_24,ADA_24" and then redeploys the same config with gpuIds="ADA_24,AMPERE_24" should not trigger an unnecessary redeploy.

No test validates this end-to-end. The new tests cover the sort function in isolation, but nothing constructs two resource configs with the same GPU set in different orders and asserts they produce the same config_hash. If the normalization were ever disconnected from the hash path, this scenario would silently regress — the sort tests would still pass.

Nits

from_gpu_ids_str parses self.gpuIds but does not write back the normalized form. No bug today, but it creates an implicit ordering dependency that could cause confusion for anyone reading that code path later.

Verdict

PASS WITH NITS — the fix is correct and the spurious-redeploy scenario is resolved. Add a hash-level round-trip test (same GPU set, different input order, same config_hash) to lock in the stated goal.

🤖 Reviewed by Henrik's AI-Powered Bug Finder

fix: stable sorted gpu ids

54e34eb

jhcipar requested review from KAJdev, deanq and runpod-Henrik March 25, 2026 14:34

KAJdev approved these changes Mar 25, 2026

View reviewed changes

runpod-Henrik reviewed Mar 26, 2026

View reviewed changes

Merge branch 'main' into jhcipar/AE-2197/gpuids-order-normalization

e14b2d8

jhcipar merged commit 759e0f6 into main Mar 26, 2026
4 checks passed

jhcipar deleted the jhcipar/AE-2197/gpuids-order-normalization branch March 26, 2026 23:49

runpod-release-please-bot bot mentioned this pull request Mar 26, 2026

chore: release 1.11.1 #291

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: AE-2197: stable gpu sorting#285

fix: AE-2197: stable gpu sorting#285
jhcipar merged 2 commits intomainfrom
jhcipar/AE-2197/gpuids-order-normalization

jhcipar commented Mar 25, 2026

Uh oh!

runpod-Henrik left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jhcipar commented Mar 25, 2026

Uh oh!

runpod-Henrik left a comment

Choose a reason for hiding this comment

PR #285 — fix: AE-2197: stable gpu sorting

1. Missing test for the user scenario this fix is meant to prevent

Nits

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants