93 changes: 80 additions & 13 deletions flash/configuration/parameters.mdx
This page provides a complete reference for all parameters available on the `Endpoint` decorator.

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `dependencies` | `list[str]` | Python packages to install | `None` |
| `system_dependencies` | `list[str]` | System packages to install (apt) | `None` |
| `accelerate_downloads` | `bool` | Enable download acceleration | `True` |
| `volume` | `NetworkVolume` or list | Network volume(s) for persistent storage | `None` |
| `datacenter` | `DataCenter`, list, or `None` | Datacenter(s) for deployment | `None` (all DCs) |
| `env` | `dict[str, str]` | Environment variables | `None` |
| `gpu_count` | `int` | GPUs per worker | `1` |
| `execution_timeout_ms` | `int` | Max execution time in milliseconds | `0` (no limit) |

### volume

**Type**: `NetworkVolume` or `list[NetworkVolume]`
**Default**: `None`

Attaches one or more network volumes for persistent storage. Volumes are mounted at `/runpod-volume/`. Flash uses the volume `name` to find an existing volume or create a new one. Each volume is tied to a specific datacenter.

```python
from runpod_flash import Endpoint, GpuGroup, DataCenter, NetworkVolume

# Single volume in a specific datacenter
vol = NetworkVolume(name="model-cache", size=100, datacenter=DataCenter.US_GA_2)

@Endpoint(
name="model-server",
gpu=GpuGroup.ANY,
datacenter=DataCenter.US_GA_2,
volume=vol
)
async def serve(data):
...
```
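The find-or-create lookup described above can be sketched as follows. This is assumed behavior shown for illustration only; `find_or_create_volume` and its dict-based registry are hypothetical, not part of the Flash SDK:

```python
def find_or_create_volume(name: str, registry: list[dict], size: int = 100) -> dict:
    # Look up an existing volume by name; create and register one if absent.
    for vol in registry:
        if vol["name"] == name:
            return vol
    vol = {"name": name, "size": size}
    registry.append(vol)
    return vol
```

Calling it twice with the same name returns the same volume rather than creating a duplicate.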

For multi-datacenter deployments, pass a list of volumes (one per datacenter):
```python
from runpod_flash import Endpoint, GpuGroup, DataCenter, NetworkVolume

volumes = [
NetworkVolume(name="models-us", size=100, datacenter=DataCenter.US_GA_2),
NetworkVolume(name="models-eu", size=100, datacenter=DataCenter.EU_RO_1),
]

@Endpoint(
name="global-server",
gpu=GpuGroup.ANY,
datacenter=[DataCenter.US_GA_2, DataCenter.EU_RO_1],
volume=volumes
)
async def serve(data):
...
```

<Warning>
Only one network volume is allowed per datacenter. If you specify multiple volumes in the same datacenter, deployment will fail.
</Warning>

**Use cases**:
- Share large models across workers
- Persist data between runs
See [Storage](/flash/configuration/storage) for setup instructions.

### datacenter

**Type**: `DataCenter`, `list[DataCenter]`, `str`, `list[str]`, or `None`
**Default**: `None` (all available datacenters)

Specifies the datacenter(s) for worker deployment. When set to `None`, the endpoint is available in all datacenters.

```python
from runpod_flash import Endpoint, GpuGroup, DataCenter

# Deploy to all available datacenters (default)
@Endpoint(name="global", gpu=GpuGroup.ANY)
async def process(data): ...

# Deploy to a single datacenter
@Endpoint(
name="us-workers",
gpu=GpuGroup.ANY,
datacenter=DataCenter.US_GA_2
)
async def process(data): ...

# Deploy to multiple datacenters
@Endpoint(
name="multi-region",
gpu=GpuGroup.ANY,
datacenter=[DataCenter.US_GA_2, DataCenter.EU_RO_1]
)
async def process(data): ...

# String DC IDs also work
@Endpoint(
name="us-workers",
gpu=GpuGroup.ANY,
datacenter="US-GA-2"
)
async def process(data): ...
```

**Available datacenters**:


| Value | Location |
|-------|----------|
| `DataCenter.US_CA_2` | US - California |
| `DataCenter.US_GA_2` | US - Georgia |
| `DataCenter.US_IL_1` | US - Illinois |
| `DataCenter.US_KS_2` | US - Kansas |
| `DataCenter.US_MD_1` | US - Maryland |
| `DataCenter.US_MO_1` | US - Missouri |
| `DataCenter.US_MO_2` | US - Missouri |
| `DataCenter.US_NC_1` | US - North Carolina |
| `DataCenter.US_NC_2` | US - North Carolina |
| `DataCenter.US_NE_1` | US - Nebraska |
| `DataCenter.US_WA_1` | US - Washington |
| `DataCenter.EU_CZ_1` | Europe - Czech Republic |
| `DataCenter.EU_RO_1` | Europe - Romania |
| `DataCenter.EUR_IS_1` | Europe - Iceland |
| `DataCenter.EUR_NO_1` | Europe - Norway |
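Comparing this table with the string example above suggests that a string datacenter ID maps to the enum member name by swapping hyphens for underscores. A small hypothetical helper (not part of the SDK) makes the assumed mapping explicit:

```python
def dc_member_name(dc_id: str) -> str:
    # "US-GA-2" -> "US_GA_2", matching the DataCenter member names above.
    return dc_id.replace("-", "_")
```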

<Note>
CPU endpoints are restricted to `CPU_DATACENTERS`, which currently only includes `EU_RO_1`.
</Note>

### env
42 changes: 38 additions & 4 deletions flash/configuration/storage.mdx
If you specify a custom size that exceeds the instance limit, deployment will fail.

## Network volumes

Network volumes provide persistent storage that survives worker restarts. Each volume is tied to a specific datacenter. Use volumes to share data between endpoint functions or to persist data between runs.

### Attaching network volumes

Attach a network volume using the `volume` parameter. Flash uses the volume `name` to find an existing volume or create a new one. Specify the `datacenter` parameter to control where the volume is created:

```python
from runpod_flash import Endpoint, GpuType, DataCenter, NetworkVolume

vol = NetworkVolume(name="model-cache", size=100, datacenter=DataCenter.US_GA_2)

@Endpoint(
name="persistent-storage",
gpu=GpuType.NVIDIA_A100_80GB_PCIe,
datacenter=DataCenter.US_GA_2,
volume=vol
)
async def process(data: dict) -> dict:
# Access files at /runpod-volume/
...
```

You can also reference an existing volume by ID:

```python
vol = NetworkVolume(id="vol_abc123")
```

### Multi-datacenter volumes

For endpoints deployed across multiple datacenters, pass a list of volumes (one per datacenter):

```python
from runpod_flash import Endpoint, GpuType, DataCenter, NetworkVolume

volumes = [
NetworkVolume(name="models-us", size=100, datacenter=DataCenter.US_GA_2),
NetworkVolume(name="models-eu", size=100, datacenter=DataCenter.EU_RO_1),
]

@Endpoint(
name="global-inference",
gpu=GpuType.NVIDIA_A100_80GB_PCIe,
datacenter=[DataCenter.US_GA_2, DataCenter.EU_RO_1],
volume=volumes
)
async def process(data: dict) -> dict:
# Workers in each region access their local volume at /runpod-volume/
...
```

<Warning>
Only one network volume is allowed per datacenter. If you specify multiple volumes in the same datacenter, deployment will fail.
</Warning>
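This rule can be checked locally before deploying. The following pre-flight helper is illustrative only and not part of the Flash SDK; it assumes volume specs expose a `datacenter` key:

```python
from collections import Counter

def check_one_volume_per_datacenter(volumes: list[dict]) -> None:
    # Deployment fails if two volumes land in the same datacenter,
    # so raise early with the offending datacenter IDs.
    counts = Counter(v["datacenter"] for v in volumes)
    duplicates = sorted(dc for dc, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"multiple volumes share datacenter(s): {duplicates}")
```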

### Accessing network volume files

Network volumes mount at `/runpod-volume/` and can be accessed like a regular filesystem:
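As a sketch of the pattern, a read-through cache on the mounted volume might look like this. The helper and directory layout are illustrative, not part of the SDK; `base` defaults to the documented mount point:

```python
from pathlib import Path

def cached_read(name: str, compute, base: str = "/runpod-volume") -> bytes:
    # Return cached bytes for `name`; compute and store them on first access.
    path = Path(base) / "cache" / name
    if path.exists():
        return path.read_bytes()
    data = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data
```

Because the volume persists across worker restarts, the expensive `compute()` step runs only once per cached file.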
27 changes: 24 additions & 3 deletions flash/create-endpoints.mdx
Environment variables are excluded from configuration hashing.

## Persistent storage

Attach a network volume for persistent storage across workers. Each volume is tied to a specific datacenter. Flash uses the volume `name` to find an existing volume or create a new one:

```python
from runpod_flash import Endpoint, GpuGroup, DataCenter, NetworkVolume

vol = NetworkVolume(name="model-cache", size=100, datacenter=DataCenter.US_GA_2)

@Endpoint(
name="model-server",
gpu=GpuGroup.ANY,
datacenter=DataCenter.US_GA_2,
volume=vol
)
async def serve(data: dict) -> dict:
# Access files at /runpod-volume/
...
```

For multi-datacenter deployments, pass a list of volumes (one per datacenter):

```python
from runpod_flash import Endpoint, GpuGroup, DataCenter, NetworkVolume

volumes = [
NetworkVolume(name="models-us", size=100, datacenter=DataCenter.US_GA_2),
NetworkVolume(name="models-eu", size=100, datacenter=DataCenter.EU_RO_1),
]

@Endpoint(
name="global-server",
gpu=GpuGroup.ANY,
datacenter=[DataCenter.US_GA_2, DataCenter.EU_RO_1],
volume=volumes
)
async def serve(data: dict) -> dict:
...
```
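To keep names and regions consistent as the datacenter list grows, the volume list can be generated programmatically. This helper is a hypothetical convenience, not a Flash API:

```python
def plan_regional_volumes(datacenter_ids: list[str],
                          prefix: str = "models",
                          size: int = 100) -> list[dict]:
    # One volume spec per datacenter; the name embeds the datacenter ID
    # so each region's volume is easy to identify.
    return [
        {"name": f"{prefix}-{dc.lower()}", "size": size, "datacenter": dc}
        for dc in datacenter_ids
    ]
```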

See [Flash storage](/flash/configuration/storage) for setup instructions.

## Endpoint parameters
5 changes: 3 additions & 2 deletions flash/custom-docker-images.mdx
All standard `Endpoint` parameters work with custom images:

```python
from runpod_flash import Endpoint, GpuType, DataCenter, NetworkVolume, PodTemplate

vol = NetworkVolume(name="model-storage", size=100, datacenter=DataCenter.US_GA_2)

vllm = Endpoint(
name="custom-vllm",
    # ... (parameters collapsed in the diff view)
    env={
        "MODEL_PATH": "/models/llama",
        "MAX_BATCH_SIZE": "32"
    },
    datacenter=DataCenter.US_GA_2,
    volume=vol,
    execution_timeout_ms=300000,  # 5 minutes
    template=PodTemplate(containerDiskInGb=100)
)
```
2 changes: 1 addition & 1 deletion flash/overview.mdx
## Limitations

- Flash is currently only available for macOS and Linux. Windows support is in development.
- CPU endpoints are restricted to the `EU-RO-1` datacenter. GPU endpoints can deploy to [multiple datacenters](/flash/configuration/parameters#datacenter).
- Flash can rapidly scale workers across multiple endpoints, and you may hit your maximum worker threshold quickly. Contact [Runpod support](https://www.runpod.io/contact) to increase your account's capacity if needed.

## Tutorials
4 changes: 4 additions & 0 deletions release-notes.mdx
</Card>
</CardGroup>

## Flash: Multi-datacenter deployments

Flash now supports deploying endpoints to [multiple datacenters](/flash/configuration/parameters#datacenter) simultaneously. Pass a list of datacenters to distribute your workload across regions for improved availability and reduced latency. You can also attach [network volumes per datacenter](/flash/configuration/storage#multi-datacenter-volumes) for region-specific data access.

</Update>

<Update label="February 2026">