release: graphn 0.1.6 (LoRA support + custom-model update endpoint) by kunwar-vp · Pull Request #16 · voltagepark/graphn-sdk-python

kunwar-vp · 2026-05-21T19:03:31Z

Why now

Two spec-sync PRs (#12 on 2026-05-14, #15 on 2026-05-16) regenerated _generated/ on main but neither bumped pyproject.toml. The release workflow only fires when pyproject.toml changes, so the new control-plane surface — two new endpoints and the LoRA-adapter fields — has been sitting in main for a week with no PyPI release. This PR closes that gap and also wires the new surface through the hand-curated ergonomic layer so customers get typed accessors, not just model.model_extra["artifact_type"].

What's in 0.1.6

New high-level surface on `client.custom_models` (sync + async)

- `update(model_id, *, name=..., min_replicas=..., max_replicas=..., cooldown_seconds=..., extra=...)`: wraps `PATCH /v1/{workspaceId}/custom-models/{modelId}`. Mutates the live deployment in place: no rolling restart, no downtime. An empty body is refused client-side with `ValidationError(code="empty_update")`, one round-trip earlier than the server's 422. The `extra` mapping lets callers PATCH future fields without waiting for an SDK release.
- `supported_architectures()`: wraps `GET /v1/{workspaceId}/custom-models/supported-architectures`. Returns a typed `SupportedArchitectures` catalog where each `ArchitectureInfo` carries the capability tags (`tool_calling`, `vision`, `image_input`, `video_input`, `streaming`, `json_mode`) the architecture exposes. Intended for driving UI architecture/capability filters before calling `validate()`.
- `create(..., base_model_id=...)`: wires up the LoRA-import hint. Required on `weight_source=s3_*` to classify the bundle as an adapter at create-time (omitting it routes the bundle through the base path, and the deployment ends up `failed`). Optional on `weight_source=huggingface`, where it overrides `adapter_config.json::base_model_name_or_path` from the upstream repo (useful when the recorded base id isn't a valid HF id, e.g. a local filesystem path used during training).
- `validate(..., model_size_gb=...)`: caller-supplied weight-size hint that lets the platform skip the HF head-bytes probe. Useful for very large models (405B-class) where the probe otherwise stalls the validate response.
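
To make the empty-update guard and the `extra` pass-through concrete, here is a minimal, self-contained sketch of how a PATCH body could be assembled. It is not the SDK's code: `build_update_body` and the local `ValidationError` class are hypothetical stand-ins, using `None` as the "not set" marker is an assumption, and so is the rule that explicit keyword arguments win over `extra` on a key collision.

```python
from typing import Any, Mapping, Optional


class ValidationError(ValueError):
    """Stand-in for the SDK's ValidationError; only `code` is modelled here."""

    def __init__(self, message: str, *, code: str) -> None:
        super().__init__(message)
        self.code = code


def build_update_body(
    *,
    name: Optional[str] = None,
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
    cooldown_seconds: Optional[int] = None,
    extra: Optional[Mapping[str, Any]] = None,
) -> dict[str, Any]:
    """Collect only the fields the caller set into a PATCH body (hypothetical helper)."""
    body: dict[str, Any] = {}

    # Forward-compat fields go in first so that explicit keyword
    # arguments overwrite them on a key collision (an assumption).
    if extra:
        body.update(extra)

    named = {
        "name": name,
        "min_replicas": min_replicas,
        "max_replicas": max_replicas,
        "cooldown_seconds": cooldown_seconds,
    }
    body.update({key: value for key, value in named.items() if value is not None})

    # Refuse an empty PATCH before any request is sent.
    if not body:
        raise ValidationError(
            "update() needs at least one field to change", code="empty_update"
        )
    return body


# Usage: scale a deployment and pass through a field the SDK does not know yet.
body = build_update_body(min_replicas=2, extra={"future_field": True})
```

Failing locally keeps the error on the caller's side of the network, which is the "one round-trip earlier" behavior described above.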

Typed LoRA fields on existing Pydantic models

- `CustomModel`: `artifact_type` (`Literal["base", "lora"] | None`), `base_model_id`, `lora_adapter_name`, `lora_rank`. `artifact_type` is `None` on responses from control planes that predate the LoRA work; treat that as `"base"` for compatibility.
- `ValidateModelResponse`: `artifact_type`, `detected_base_model_id`, `lora_rank`. When `artifact_type == "lora"`, the existing `architectures`, `num_params`, `estimated_memory_gb`, and `max_context_length` fields describe the base model resolved from `adapter_config.json`, not the adapter itself.
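
The "treat `None` as `"base"`" rule is easy to forget at call sites, so a small normalizing helper may be worth keeping next to any code that branches on the artifact type. The sketch below is illustrative only: `effective_artifact_type` is a hypothetical name, and the `ArtifactType` alias is redefined locally so the snippet runs without the SDK installed.

```python
from typing import Literal, Optional

# Local stand-in for the exported ArtifactType alias.
ArtifactType = Literal["base", "lora"]


def effective_artifact_type(artifact_type: Optional[ArtifactType]) -> ArtifactType:
    """Map a legacy response's missing artifact_type to "base"."""
    return "base" if artifact_type is None else artifact_type


def is_lora(artifact_type: Optional[ArtifactType]) -> bool:
    """True only when the response explicitly reports a LoRA adapter."""
    return effective_artifact_type(artifact_type) == "lora"
```

Callers can then branch on `is_lora(model.artifact_type)` without special-casing older control planes.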

New public exports

ArchitectureInfo, SupportedArchitectures, ArtifactType from graphn (and graphn.custom_models).
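
As a rough picture of how the architecture catalog could drive a capability filter, the sketch below models `ArchitectureInfo` with a local dataclass. The field names `name` and `capabilities`, the helper `filter_by_capabilities`, and the architecture names are all assumptions made for illustration; the real types may be shaped differently.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass(frozen=True)
class ArchitectureInfo:
    """Local stand-in; the field names are assumptions, not the SDK's schema."""

    name: str
    capabilities: frozenset[str] = field(default_factory=frozenset)


def filter_by_capabilities(
    architectures: Iterable[ArchitectureInfo], required: Iterable[str]
) -> list[str]:
    """Return names of architectures that expose every required capability tag."""
    needed = frozenset(required)
    return [arch.name for arch in architectures if needed <= arch.capabilities]


# Hypothetical catalog entries, not taken from the real endpoint.
catalog = [
    ArchitectureInfo("ExampleTextModel", frozenset({"streaming", "json_mode"})),
    ArchitectureInfo(
        "ExampleVisionModel", frozenset({"streaming", "vision", "image_input"})
    ),
]
vision_capable = filter_by_capabilities(catalog, ["vision", "streaming"])
```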

Generated-layer change worth noting

CustomModelCreate.huggingface_model_id is now required on the generated attrs dataclass (was str | Unset). The server has rejected its absence with 422 on every weight_source since 0.1.3 (voltagepark/takao#1997) and the hand-curated client.custom_models.create already raised ValidationError client-side for S3 imports — so this is the generated type catching up, not a behavior change. Callers using the keyword-only ergonomic API are unaffected.

Verification

- `ruff check src tests`: clean
- `pytest -q`: 57 passed (43 existing + 14 new)
- `mypy src/graphn`: no new errors in any file this PR touches (3 pre-existing `no-any-return` errors in `_transport.py` and `tts.py` are on main and not regressions)

New test coverage (14 cases)

| Scenario | Test |
| --- | --- |
| LoRA classification for S3 imports | `test_create_s3_lora_passes_base_model_id` |
| HF base-id override flow | `test_create_huggingface_lora_override_passes_base_model_id` |
| Typed LoRA accessors after `get` | `test_get_returns_typed_lora_fields` |
| Back-compat with legacy control planes (no `artifact_type`) | `test_get_legacy_response_treats_artifact_type_as_none` |
| LoRA fields on `validate` response | `test_validate_returns_lora_fields` |
| `model_size_gb` forwarded on validate | `test_validate_forwards_model_size_gb` |
| PATCH happy path + body shape | `test_update_sends_patch_with_body` |
| Empty PATCH refused client-side | `test_update_rejects_empty_body` |
| Update 404 mapped to `NotFoundError` | `test_update_404_maps_to_not_found` |
| Forward-compat `extra` mapping | `test_update_extra_passes_through` |
| `supported_architectures` returns typed catalog | `test_supported_architectures_returns_typed` |
| Async PATCH happy path | `test_async_update_sends_patch_with_body` |
| Async empty-PATCH guard | `test_async_update_rejects_empty_body` |
| Async `supported_architectures` | `test_async_supported_architectures` |

What this PR deliberately does NOT touch

- `examples/`: no example script demonstrates `update` / `supported_architectures` / LoRA imports today, and adding ones large enough to be useful would be a couple hundred lines on top of this diff. Happy to follow up with an `examples/import_lora.py` and `examples/live_update.py` in a separate PR.
- `README.md`: the Scope callout and scope table are not updated. Same reasoning: keeping this PR focused on the code surface + tests.

Release pipeline

The auto-tag job in `.github/workflows/release.yml` will:

1. Read `version = "0.1.6"` from `pyproject.toml`.
2. Match `## [0.1.6]` in `CHANGELOG.md` ✓.
3. Create and push `v0.1.6`.
4. Trigger build + publish in the same workflow run, shipping to PyPI via OIDC trusted publishing.
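
The version/changelog gate in the steps above can be approximated locally before merging. The following is a sketch of that check, not the workflow's actual implementation: `release_tag` is a hypothetical helper, and reading the version with a regular expression instead of a TOML parser is a simplification.

```python
import re
from typing import Optional


def release_tag(pyproject_text: str, changelog_text: str) -> Optional[str]:
    """Return the tag to push (e.g. "v0.1.6"), or None if the gate would not pass."""
    # Simplified read of `version = "..."`; a real job would likely parse the TOML.
    version_match = re.search(
        r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE
    )
    if version_match is None:
        return None
    version = version_match.group(1)

    # The changelog must contain a heading line starting with "## [<version>]".
    heading = re.compile(rf"^## \[{re.escape(version)}\]", re.MULTILINE)
    if heading.search(changelog_text) is None:
        return None
    return f"v{version}"


# Usage with inline file contents; in practice these would be read from disk.
pyproject = '[project]\nname = "graphn"\nversion = "0.1.6"\n'
changelog = "# Changelog\n\n## [0.1.6] - 2026-05-21\n"
tag = release_tag(pyproject, changelog)
```

Returning `None` rather than raising mirrors a job that simply skips tagging when the changelog entry is missing; whether the real workflow skips or fails in that case is not stated here.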

pip install graphn==0.1.6 should resolve ~1 minute after merge.

Two spec-sync PRs (#12 on 2026-05-14, #15 on 2026-05-16) landed regenerated _generated/ on main but neither bumped pyproject, so the new control-plane surface has been sitting in the source tree unreleased on PyPI for a week. This PR closes the gap: bumps to 0.1.6, ships matching ergonomic wrappers + typed fields on the hand-curated resource layer, and adds tests so the new surface is covered, not just compiled.

New high-level surface on client.custom_models (sync + async):

- update(model_id, *, name=..., min_replicas=..., max_replicas=..., cooldown_seconds=..., extra=...) issues PATCH /v1/{ws}/custom-models/{id}. In-place mutation of the live deployment - no rolling restart, no downtime. Empty PATCH is refused client-side with ValidationError(code="empty_update") one round-trip earlier than the server's 422; an `extra` mapping lets callers PATCH future fields without an SDK release.
- supported_architectures() returns a typed SupportedArchitectures catalog from GET /v1/{ws}/custom-models/supported-architectures. Each ArchitectureInfo carries the capability tags (tool_calling, vision, image_input, video_input, streaming, json_mode) the architecture exposes. Intended for driving UI architecture/capability filters before calling validate().
- create(..., base_model_id=...) wires up the LoRA-import hint. Required on weight_source=s3_* to classify the bundle as an adapter at create-time; optional on weight_source=huggingface where it overrides adapter_config.json::base_model_name_or_path from the upstream repo (useful when the recorded base id isn't a valid HF id, e.g. a local filesystem path used during training).
- validate(..., model_size_gb=...) lets callers skip the HF head-bytes probe by supplying a weight-size hint, useful for very large models (405B-class) where the probe stalls validate.

Typed LoRA fields on the existing Pydantic types:

- CustomModel: artifact_type ("base"|"lora"|None), base_model_id, lora_adapter_name, lora_rank. artifact_type is None on responses from control planes that predate the LoRA work - treat that as "base" for compatibility.
- ValidateModelResponse: artifact_type (defaults to "base" on fresh responses, None on legacy), detected_base_model_id, lora_rank. When artifact_type == "lora", the architectures / num_params / estimated_memory_gb / max_context_length fields describe the base model resolved from adapter_config.json, not the adapter itself.

New public exports: ArchitectureInfo, SupportedArchitectures, ArtifactType from graphn (and graphn.custom_models).

CustomModelCreate.huggingface_model_id is now required on the generated attrs dataclass (was str | Unset). The server has returned 422 for omitted huggingface_model_id on every weight source since 0.1.3 (voltagepark/takao#1997) and the hand-curated client.custom_models.create resource raises ValidationError client-side for S3 imports, so this is the generated type catching up - callers using the keyword-only ergonomic API are unaffected.

Tests: 57 pass (43 existing + 14 new) covering both transports. ruff check clean. mypy is clean on every file this PR touches (pre-existing no-any-return errors in _transport.py and tts.py are on main and not regressions).

The auto-tag job's CHANGELOG check matches "## [0.1.6] - 2026-05-21" so PyPI publish fires automatically on merge.

kunwar-vp merged commit 653ea81 into main May 21, 2026
7 checks passed

kunwar-vp merged 1 commit into main from chore/release-0.1.6
