fix: AE-2322: network volume size bounds, forbid extra args by jhcipar · Pull Request #290 · runpod/flash

jhcipar · 2026-03-25T21:48:58Z

validation for network volume config:

name cannot be an empty string or whitespace-only string
forbid extra args (should probably be on base resource config, but i think it's fine to just have it on network volume for now)
limit size so failures happen at config time
reduce default network volume size

KAJdev · 2026-03-25T21:51:02Z

src/runpod_flash/core/resources/network_volume.py

    id: Optional[str] = Field(default=None)
    name: Optional[str] = None
-    size: Optional[int] = Field(default=100, gt=0)  # Size in GB
+    size: Optional[int] = Field(default=100, gt=0, le=4096)  # Size in GB


default should probably be lower, like 20GB? 10GB? Also I believe the minimum is 10GB.

yeah it's 10gb, updated. 20gb seems sensible for a default

promptless · 2026-03-25T21:52:41Z

📝 Documentation updates detected!

New suggestion: Document valid size range for network volumes

Tip: Configure how Promptless handles changelogs in Agent Settings 📋

KAJdev

just fix the assert in the test looks good

Copilot

Pull request overview

Adds stricter validation for NetworkVolume configuration so invalid volume definitions fail fast at model-construction/config time.

Changes:

Enforce NetworkVolume name is not empty/whitespace-only and forbid unknown fields via Pydantic config.
Introduce size bounds (min 10GB, max 4096GB) and change the default size to 20GB.
Update/unit-extend tests and manifest expectations to reflect the new validation and defaults.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`tests/unit/test_p2_remaining_gaps.py`	Updates a gap-test expectation to reflect new name validation behavior and size constraint wording.
`tests/unit/test_p2_gaps.py`	Updates VOL-* tests for new min-size constraint and new default size.
`tests/unit/resources/test_network_volume.py`	Adds focused unit tests for empty/whitespace names, size bounds, and unknown-field rejection.
`tests/unit/cli/commands/build_utils/test_manifest.py`	Updates manifest extraction test to assert the new default volume size.
`src/runpod_flash/core/resources/network_volume.py`	Implements `extra="forbid"`, adds name validator, and applies min/max size constraints plus default size change.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/runpod_flash/core/resources/network_volume.py

runpod-Henrik

PR #290 — fix: AE-2322: network volume size bounds, forbid extra args

Repo: runpod/flash
Severity: MEDIUM
Key findings: Volume creation will fail silently after this PR ships — the volume gets created in RunPod but the SDK throws an error, leaving users with orphaned paid resources.

1. Size bounds and default change

The 10 GB minimum, 4 TB ceiling, and 20 GB default are correct and well-motivated. The unit test coverage for the new bounds is thorough.

2. Bug: volume creation broken end-to-end (high confidence)

The unit tests all pass with mocked API responses, but live volume creation will fail. When RunPod actually creates a network volume, the API response includes fields the SDK doesn't declare (createdAt, podCount, isPublic, mountPath, etc.). This PR makes the SDK reject any unknown fields as invalid — so the SDK throws a validation error immediately after the volume is created.

The failure scenario: user runs flash deploy, the network volume gets created and charged in RunPod, the SDK crashes, and the deploy appears to have failed. The user retries, potentially creating multiple orphaned volumes. There is no E2E test covering live volume creation — the unit tests mock the API response and only return the four fields the SDK expects, which is why this isn't caught.

This needs a live volume creation test before shipping, or the model needs to be changed to tolerate extra fields from the API response.

3. Issue: orphaned volumes cannot be removed via the CLI

flash undeploy does not delete network volumes — the undeploy path for NetworkVolume raises NotImplementedError and tells users to delete via the RunPod UI or API. A user who hits the creation bug (section 2) and retries will accumulate orphaned paid volumes with no CLI path to clean them up. They have to find and delete them manually in the RunPod console. This makes the section 2 failure significantly worse than it appears.

4. No documentation on network volume lifecycle

There is no user-facing documentation covering: how to find and reuse an existing volume by ID, what happens to a volume across redeploys, how to delete a volume (the CLI cannot do it — users must use the RunPod UI or API, which is not mentioned anywhere), or what to do when a volume is left orphaned by a failed deploy. The SDK reference shows the constructor syntax but not the operational lifecycle. This PR adds size constraints and a new default — none of those appear in any doc either.

Given that volumes are persistent paid resources and this PR makes the failure mode worse (creation succeeds but SDK crashes), documenting the lifecycle — especially deletion and recovery — should ship alongside this fix.

Nit: users with existing endpoints will see unexpected redeployment

Any endpoint currently configured with the old 100 GB volume default will appear as "drifted" on next flash deploy and trigger a full redeploy. Not a bug introduced by this PR, but users will be surprised. Worth a note in the release.

Verdict

NEEDS WORK — the size validation is correct but the change breaks live volume creation, and orphaned volumes from failed deploys cannot be cleaned up from the CLI. Needs a live test or a code fix before merging.

🤖 Reviewed by Henrik's AI-Powered Bug Finder

runpod-Henrik

Correction to prior review

Section 2 (Bug: volume creation broken end-to-end) was wrong and should be disregarded.

I claimed the RunPod API returns extra fields (createdAt, podCount, etc.) that would cause extra="forbid" to reject the response. I verified this by calling GET /v1/networkvolumes directly — the API returns only {id, name, dataCenterId, size}, which matches the model exactly. extra="forbid" is safe. There is no breakage here.

The Nit about unexpected redeployment was also wrong — the default size stays 100 GB.

Sections 3 and 4 from the original review remain valid (CLI cannot delete volumes, lifecycle not documented). Updated verdict: PASS WITH NITS.

🤖 Reviewed by Henrik's AI-Powered Bug Finder

promptless · 2026-03-26T23:51:35Z

📝 Documentation updates detected!

Updated existing suggestion: Document valid size range for network volumes

Tip: Assign suggestions to team members in the Promptless dashboard to claim work 👥

fix: network volume size bounds, forbid extra args

3b59f1a

jhcipar requested review from KAJdev, deanq and runpod-Henrik March 25, 2026 21:49

KAJdev requested changes Mar 25, 2026

View reviewed changes

chore: update defaults and min network volume size

f035a07

KAJdev approved these changes Mar 25, 2026

View reviewed changes

chore: update test_manifest

f00e70d

deanq requested a review from Copilot March 25, 2026 23:20

Copilot started reviewing on behalf of deanq March 25, 2026 23:21 View session

Copilot AI reviewed Mar 25, 2026

View reviewed changes

src/runpod_flash/core/resources/network_volume.py Outdated Show resolved Hide resolved

src/runpod_flash/core/resources/network_volume.py Show resolved Hide resolved

promptless bot mentioned this pull request Mar 26, 2026

docs: Document valid size range for network volumes runpod/docs#598

Open

runpod-Henrik reviewed Mar 26, 2026

View reviewed changes

jhcipar added 3 commits March 26, 2026 16:43

chore: default size back to 100 so we dont break old volumes

50c8eef

chore: failing test

abdeb59

chore: failing test again

9f540ad

runpod-Henrik reviewed Mar 26, 2026

View reviewed changes

Merge branch 'main' into jhcipar/AE-2322/network-volume-validation

dec0e4a

deanq approved these changes Mar 26, 2026

View reviewed changes

chore: brief docs on nv lifecycle

c962f71

jhcipar merged commit 6cc7ec2 into main Mar 26, 2026
4 checks passed

jhcipar deleted the jhcipar/AE-2322/network-volume-validation branch March 26, 2026 23:49

runpod-release-please-bot bot mentioned this pull request Mar 26, 2026

chore: release 1.11.1 #291

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: AE-2322: network volume size bounds, forbid extra args#290

fix: AE-2322: network volume size bounds, forbid extra args#290
jhcipar merged 8 commits intomainfrom
jhcipar/AE-2322/network-volume-validation

jhcipar commented Mar 25, 2026 •

edited

Loading

Uh oh!

KAJdev Mar 25, 2026

Uh oh!

jhcipar Mar 25, 2026

Uh oh!

promptless bot commented Mar 25, 2026

Uh oh!

KAJdev left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

runpod-Henrik left a comment

Uh oh!

runpod-Henrik left a comment

Uh oh!

Uh oh!

promptless bot commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

jhcipar commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

KAJdev Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

jhcipar Mar 25, 2026

Choose a reason for hiding this comment

Uh oh!

promptless bot commented Mar 25, 2026

Uh oh!

KAJdev left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

runpod-Henrik left a comment

Choose a reason for hiding this comment

PR #290 — fix: AE-2322: network volume size bounds, forbid extra args

1. Size bounds and default change

2. Bug: volume creation broken end-to-end (high confidence)

3. Issue: orphaned volumes cannot be removed via the CLI

4. No documentation on network volume lifecycle

Nit: users with existing endpoints will see unexpected redeployment

Verdict

Uh oh!

runpod-Henrik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

promptless bot commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

jhcipar commented Mar 25, 2026 •

edited

Loading