release: graphn 0.1.6 (LoRA support + custom-model update endpoint)#16

Merged
kunwar-vp merged 1 commit into main from chore/release-0.1.6
May 21, 2026
@kunwar-vp
Collaborator

Why now

Two spec-sync PRs (#12 on 2026-05-14, #15 on 2026-05-16) regenerated _generated/ on main but neither bumped pyproject.toml. The release workflow only fires when pyproject.toml changes, so the new control-plane surface — two new endpoints and the LoRA-adapter fields — has been sitting in main for a week with no PyPI release. This PR closes that gap and also wires the new surface through the hand-curated ergonomic layer so customers get typed accessors, not just model.model_extra["artifact_type"].

What's in 0.1.6

New high-level surface on client.custom_models (sync + async)

  • update(model_id, *, name=..., min_replicas=..., max_replicas=..., cooldown_seconds=..., extra=...): wraps PATCH /v1/{workspaceId}/custom-models/{modelId}. Mutates the live deployment in place, with no rolling restart and no downtime. An empty body is refused client-side with ValidationError(code="empty_update"), one round-trip earlier than the server's 422. The extra mapping lets callers PATCH future fields without waiting for an SDK release.
  • supported_architectures(): wraps GET /v1/{workspaceId}/custom-models/supported-architectures. Returns a typed SupportedArchitectures catalog where each ArchitectureInfo carries the capability tags (tool_calling, vision, image_input, video_input, streaming, json_mode) the architecture exposes. Intended for driving UI architecture/capability filters before calling validate().
  • create(..., base_model_id=...): wires up the LoRA-import hint. Required on weight_source=s3_* to classify the bundle as an adapter at create-time; omitting it routes the bundle through the base-model path and the deployment ends up in the failed state. Optional on weight_source=huggingface, where it overrides adapter_config.json::base_model_name_or_path from the upstream repo (useful when the recorded base id isn't a valid HF id, e.g. a local filesystem path used during training).
  • validate(..., model_size_gb=...): caller-supplied weight-size hint that lets the platform skip the HF head-bytes probe. Useful for very large models (405B-class) where the probe otherwise stalls the validate response.
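
As a rough illustration of the empty-body guard described for update(), the sketch below assembles a PATCH body and refuses an empty one before any request is sent. It does not import graphn: `build_update_body` and the local `ValidationError` class are hypothetical stand-ins, and using `None` to mean "not supplied" and snake_case wire keys are assumptions, not confirmed SDK behavior.

```python
from typing import Any, Mapping, Optional


class ValidationError(Exception):
    """Local stand-in for the SDK error; only the `code` attribute is modelled."""

    def __init__(self, message: str, *, code: str) -> None:
        super().__init__(message)
        self.code = code


def build_update_body(
    *,
    name: Optional[str] = None,
    min_replicas: Optional[int] = None,
    max_replicas: Optional[int] = None,
    cooldown_seconds: Optional[int] = None,
    extra: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Assemble a PATCH body from the supplied fields (hypothetical helper)."""
    named = {
        "name": name,
        "min_replicas": min_replicas,
        "max_replicas": max_replicas,
        "cooldown_seconds": cooldown_seconds,
    }
    # Assumption: None means "not supplied", so those keys are dropped.
    body = {key: value for key, value in named.items() if value is not None}
    # Assumption: `extra` keys are merged unchanged so future fields pass through.
    body.update(extra or {})
    if not body:
        # Fail locally instead of waiting for the server's 422.
        raise ValidationError("update() needs at least one field", code="empty_update")
    return body
```

With this shape, `build_update_body(min_replicas=2)` yields a one-key body, while calling it with no arguments raises the stand-in error before any network traffic.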

Typed LoRA fields on existing Pydantic models

  • CustomModel: artifact_type (Literal["base", "lora"] | None), base_model_id, lora_adapter_name, lora_rank. artifact_type is None on responses from control planes that predate the LoRA work — treat that as "base" for compatibility.
  • ValidateModelResponse: artifact_type, detected_base_model_id, lora_rank. When artifact_type == "lora", the existing architectures, num_params, estimated_memory_gb, and max_context_length fields describe the base model resolved from adapter_config.json, not the adapter itself.
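
The compatibility rule above (a missing artifact_type should be read as "base") could be applied by callers roughly as follows. `CustomModelView`, `effective_artifact_type`, and `is_lora` are hypothetical names for this sketch; the real CustomModel is a Pydantic model with more fields than shown here.

```python
from dataclasses import dataclass
from typing import Literal, Optional

ArtifactType = Literal["base", "lora"]


@dataclass
class CustomModelView:
    """Hypothetical stand-in carrying only the fields used in this sketch."""

    artifact_type: Optional[ArtifactType] = None
    base_model_id: Optional[str] = None
    lora_rank: Optional[int] = None


def effective_artifact_type(model: CustomModelView) -> ArtifactType:
    # Control planes that predate the LoRA work omit the field;
    # the PR description says to treat that case as "base".
    return model.artifact_type or "base"


def is_lora(model: CustomModelView) -> bool:
    return effective_artifact_type(model) == "lora"
```

Centralizing the fallback in one helper keeps the `None` check out of call sites that only care whether a model is an adapter.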

New public exports

ArchitectureInfo, SupportedArchitectures, ArtifactType from graphn (and graphn.custom_models).
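
To show how a capability-tagged catalog might drive a UI filter, here is a minimal sketch using local stand-in types. The class names `ArchInfo` and `Catalog`, the field names `name`, `capabilities`, and `architectures`, and the architecture names in the usage note are all assumptions; the PR only states that each ArchitectureInfo carries capability tags.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArchInfo:
    """Stand-in for an architecture entry; field names are assumed."""

    name: str
    capabilities: List[str] = field(default_factory=list)


@dataclass
class Catalog:
    """Stand-in for the catalog returned by supported_architectures()."""

    architectures: List[ArchInfo] = field(default_factory=list)


def with_capabilities(catalog: Catalog, *required: str) -> List[str]:
    """Return names of architectures exposing every required capability tag."""
    wanted = set(required)
    return [
        arch.name
        for arch in catalog.architectures
        if wanted.issubset(arch.capabilities)
    ]
```

For example, given a catalog holding a placeholder "ArchA" tagged tool_calling and streaming and a placeholder "ArchB" tagged vision, image_input, and streaming, `with_capabilities(catalog, "vision", "image_input")` would keep only "ArchB".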

Generated-layer change worth noting

CustomModelCreate.huggingface_model_id is now required on the generated attrs dataclass (was str | Unset). The server has rejected its absence with 422 on every weight_source since 0.1.3 (voltagepark/takao#1997) and the hand-curated client.custom_models.create already raised ValidationError client-side for S3 imports — so this is the generated type catching up, not a behavior change. Callers using the keyword-only ergonomic API are unaffected.

Verification

  • ruff check src tests: clean
  • pytest -q: 57 passed (43 existing + 14 new)
  • mypy src/graphn: no new errors in any file this PR touches (3 pre-existing no-any-return errors in _transport.py and tts.py are on main and not regressions)

New test coverage (14 cases)

| Scenario | Test |
| --- | --- |
| LoRA classification for S3 imports | test_create_s3_lora_passes_base_model_id |
| HF base-id override flow | test_create_huggingface_lora_override_passes_base_model_id |
| Typed LoRA accessors after get | test_get_returns_typed_lora_fields |
| Back-compat with legacy control planes (no artifact_type) | test_get_legacy_response_treats_artifact_type_as_none |
| LoRA fields on validate response | test_validate_returns_lora_fields |
| model_size_gb forwarded on validate | test_validate_forwards_model_size_gb |
| PATCH happy path + body shape | test_update_sends_patch_with_body |
| Empty PATCH refused client-side | test_update_rejects_empty_body |
| Update 404 mapped to NotFoundError | test_update_404_maps_to_not_found |
| Forward-compat extra mapping | test_update_extra_passes_through |
| supported_architectures returns typed catalog | test_supported_architectures_returns_typed |
| Async PATCH happy path | test_async_update_sends_patch_with_body |
| Async empty-PATCH guard | test_async_update_rejects_empty_body |
| Async supported_architectures | test_async_supported_architectures |

What this PR deliberately does NOT touch

  • examples/ — no example script demonstrates update / supported_architectures / LoRA imports today, and adding ones large enough to be useful would be a couple hundred lines on top of this diff. Happy to follow up with an examples/import_lora.py and examples/live_update.py in a separate PR.
  • README.md — Scope callout and scope table are not updated. Same reasoning: keeping this PR focused on the code surface + tests.

Release pipeline

The auto-tag job in .github/workflows/release.yml will:

  1. Read version = "0.1.6" from pyproject.toml.
  2. Match ## [0.1.6] in CHANGELOG.md ✓.
  3. Create and push v0.1.6.
  4. Trigger build + publish in the same workflow run, shipping to PyPI via OIDC trusted publishing.
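
The first three steps can be approximated locally before merging. The sketch below is not the workflow's actual implementation, which is not shown in this PR: `release_tag` is a hypothetical helper, and it assumes the version appears as a plain `version = "..."` line and that the changelog heading starts with `## [<version>]`.

```python
import re


def release_tag(pyproject_text: str, changelog_text: str) -> str:
    """Return the tag the auto-tag job would be expected to push (sketch)."""
    # Assumption: the version is declared on its own `version = "..."` line.
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    if match is None:
        raise ValueError("no version found in pyproject.toml")
    version = match.group(1)

    # Assumption: the changelog entry is a level-2 heading "## [<version>]".
    heading = re.compile(rf"^## \[{re.escape(version)}\]", re.MULTILINE)
    if not heading.search(changelog_text):
        raise ValueError(f"CHANGELOG.md has no entry for {version}")

    return f"v{version}"
```

Running this against the repository's pyproject.toml and CHANGELOG.md contents would surface a missing changelog entry before the workflow does.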

pip install graphn==0.1.6 should resolve ~1 minute after merge.

Two spec-sync PRs (#12 on 2026-05-14, #15 on 2026-05-16) landed
regenerated _generated/ on main but neither bumped pyproject, so
the new control-plane surface has been sitting in the source tree
unreleased on PyPI for a week. This PR closes the gap: bumps to
0.1.6, ships matching ergonomic wrappers + typed fields on the
hand-curated resource layer, and adds tests so the new surface is
covered, not just compiled.

New high-level surface on client.custom_models (sync + async):

- update(model_id, *, name=..., min_replicas=..., max_replicas=...,
  cooldown_seconds=..., extra=...) issues PATCH /v1/{ws}/custom-models/{id}.
  In-place mutation of the live deployment - no rolling restart, no
  downtime. Empty PATCH is refused client-side with
  ValidationError(code="empty_update") one round-trip earlier than
  the server's 422; an `extra` mapping lets callers PATCH future
  fields without an SDK release.
- supported_architectures() returns a typed SupportedArchitectures
  catalog from GET /v1/{ws}/custom-models/supported-architectures.
  Each ArchitectureInfo carries the capability tags (tool_calling,
  vision, image_input, video_input, streaming, json_mode) the
  architecture exposes. Intended for driving UI architecture/
  capability filters before calling validate().
- create(..., base_model_id=...) wires up the LoRA-import hint.
  Required on weight_source=s3_* to classify the bundle as an
  adapter at create-time; optional on weight_source=huggingface
  where it overrides adapter_config.json::base_model_name_or_path
  from the upstream repo (useful when the recorded base id isn't
  a valid HF id, e.g. a local filesystem path used during training).
- validate(..., model_size_gb=...) lets callers skip the HF
  head-bytes probe by supplying a weight-size hint, useful for
  very large models (405B-class) where the probe stalls validate.

Typed LoRA fields on the existing Pydantic types:

- CustomModel: artifact_type ("base"|"lora"|None), base_model_id,
  lora_adapter_name, lora_rank. artifact_type is None on responses
  from control planes that predate the LoRA work - treat that as
  "base" for compatibility.
- ValidateModelResponse: artifact_type (defaults to "base" on
  fresh responses, None on legacy), detected_base_model_id,
  lora_rank. When artifact_type == "lora", the architectures /
  num_params / estimated_memory_gb / max_context_length fields
  describe the base model resolved from adapter_config.json, not
  the adapter itself.

New public exports: ArchitectureInfo, SupportedArchitectures,
ArtifactType from graphn (and graphn.custom_models).

CustomModelCreate.huggingface_model_id is now required on the
generated attrs dataclass (was str | Unset). The server has
returned 422 for omitted huggingface_model_id on every weight
source since 0.1.3 (voltagepark/takao#1997) and the hand-curated
client.custom_models.create resource raises ValidationError
client-side for S3 imports, so this is the generated type catching
up - callers using the keyword-only ergonomic API are unaffected.

Tests: 57 pass (43 existing + 14 new) covering both transports.
ruff check clean. mypy is clean on every file this PR touches
(pre-existing no-any-return errors in _transport.py and tts.py are
on main and not regressions). The auto-tag job's CHANGELOG check
matches "## [0.1.6] - 2026-05-21" so PyPI publish fires automatically
on merge.
kunwar-vp merged commit 653ea81 into main on May 21, 2026
7 checks passed