release: graphn 0.1.6 (LoRA support + custom-model update endpoint) #16
Merged
Two spec-sync PRs (#12 on 2026-05-14, #15 on 2026-05-16) landed regenerated _generated/ on main but neither bumped pyproject, so the new control-plane surface has been sitting in the source tree unreleased on PyPI for a week. This PR closes the gap: bumps to 0.1.6, ships matching ergonomic wrappers + typed fields on the hand-curated resource layer, and adds tests so the new surface is covered, not just compiled.

New high-level surface on client.custom_models (sync + async):

- update(model_id, *, name=..., min_replicas=..., max_replicas=..., cooldown_seconds=..., extra=...) issues PATCH /v1/{ws}/custom-models/{id}. In-place mutation of the live deployment - no rolling restart, no downtime. Empty PATCH is refused client-side with ValidationError(code="empty_update") one round-trip earlier than the server's 422; an `extra` mapping lets callers PATCH future fields without an SDK release.
- supported_architectures() returns a typed SupportedArchitectures catalog from GET /v1/{ws}/custom-models/supported-architectures. Each ArchitectureInfo carries the capability tags (tool_calling, vision, image_input, video_input, streaming, json_mode) the architecture exposes. Intended for driving UI architecture/capability filters before calling validate().
- create(..., base_model_id=...) wires up the LoRA-import hint. Required on weight_source=s3_* to classify the bundle as an adapter at create-time; optional on weight_source=huggingface where it overrides adapter_config.json::base_model_name_or_path from the upstream repo (useful when the recorded base id isn't a valid HF id, e.g. a local filesystem path used during training).
- validate(..., model_size_gb=...) lets callers skip the HF head-bytes probe by supplying a weight-size hint, useful for very large models (405B-class) where the probe stalls validate.

Typed LoRA fields on the existing Pydantic types:

- CustomModel: artifact_type ("base"|"lora"|None), base_model_id, lora_adapter_name, lora_rank. artifact_type is None on responses from control planes that predate the LoRA work - treat that as "base" for compatibility.
- ValidateModelResponse: artifact_type (defaults to "base" on fresh responses, None on legacy), detected_base_model_id, lora_rank. When artifact_type == "lora", the architectures / num_params / estimated_memory_gb / max_context_length fields describe the base model resolved from adapter_config.json, not the adapter itself.

New public exports: ArchitectureInfo, SupportedArchitectures, ArtifactType from graphn (and graphn.custom_models).

CustomModelCreate.huggingface_model_id is now required on the generated attrs dataclass (was str | Unset). The server has returned 422 for omitted huggingface_model_id on every weight source since 0.1.3 (voltagepark/takao#1997) and the hand-curated client.custom_models.create resource raises ValidationError client-side for S3 imports, so this is the generated type catching up - callers using the keyword-only ergonomic API are unaffected.

Tests: 57 pass (43 existing + 14 new) covering both transports. ruff check clean. mypy is clean on every file this PR touches (pre-existing no-any-return errors in _transport.py and tts.py are on main and not regressions).

The auto-tag job's CHANGELOG check matches "## [0.1.6] - 2026-05-21" so PyPI publish fires automatically on merge.
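As a rough illustration of two behaviours described above, the client-side empty-PATCH guard on update() and the legacy artifact_type fallback, here is a minimal Python sketch. It does not reproduce graphn's implementation: `build_patch_body`, `EmptyUpdateError`, `effective_artifact_type`, and the rule that named arguments win over `extra` on a key collision are all assumptions made for illustration.

```python
from typing import Any, Dict, Mapping, Optional


class EmptyUpdateError(ValueError):
    """Hypothetical stand-in for the SDK's ValidationError(code="empty_update")."""

    code = "empty_update"


# Sentinel that distinguishes "argument not passed" from an explicit value.
_UNSET: Any = object()


def build_patch_body(
    *,
    name: Any = _UNSET,
    min_replicas: Any = _UNSET,
    max_replicas: Any = _UNSET,
    cooldown_seconds: Any = _UNSET,
    extra: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Collect only the fields the caller supplied into a PATCH body."""
    named = {
        "name": name,
        "min_replicas": min_replicas,
        "max_replicas": max_replicas,
        "cooldown_seconds": cooldown_seconds,
    }
    # Start from the pass-through mapping, then layer the named fields on top.
    # Assumption: named arguments take precedence if a key appears in both.
    body: Dict[str, Any] = dict(extra or {})
    body.update({k: v for k, v in named.items() if v is not _UNSET})
    if not body:
        # Refuse locally instead of spending a round-trip on a server-side 422.
        raise EmptyUpdateError("update() needs at least one field to change")
    return body


def effective_artifact_type(artifact_type: Optional[str]) -> str:
    """Treat a missing artifact_type (legacy control plane) as "base"."""
    return "base" if artifact_type is None else artifact_type
```

In this sketch a call such as `build_patch_body(min_replicas=2)` yields a one-field body, while `build_patch_body()` raises before any request would be sent.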
Why now
Two spec-sync PRs (#12 on 2026-05-14, #15 on 2026-05-16) regenerated `_generated/` on main but neither bumped `pyproject.toml`. The `release` workflow only fires when `pyproject.toml` changes, so the new control-plane surface (two new endpoints and the LoRA-adapter fields) has been sitting in main for a week with no PyPI release. This PR closes that gap and also wires the new surface through the hand-curated ergonomic layer so customers get typed accessors, not just `model.model_extra["artifact_type"]`.

What's in 0.1.6
New high-level surface on `client.custom_models` (sync + async)

- `update(model_id, *, name=..., min_replicas=..., max_replicas=..., cooldown_seconds=..., extra=...)`: wraps `PATCH /v1/{workspaceId}/custom-models/{modelId}`. In-place mutation of the live deployment, no rolling restart, no downtime. Empty body refused client-side with `ValidationError(code="empty_update")` one round-trip earlier than the server's 422. `extra` mapping lets callers PATCH future fields without waiting for an SDK release.
- `supported_architectures()`: wraps `GET /v1/{workspaceId}/custom-models/supported-architectures`. Returns a typed `SupportedArchitectures` catalog where each `ArchitectureInfo` carries the capability tags (`tool_calling`, `vision`, `image_input`, `video_input`, `streaming`, `json_mode`) the architecture exposes. Drives UI architecture/capability filters before calling `validate()`.
- `create(..., base_model_id=...)`: wires up the LoRA-import hint. Required on `weight_source=s3_*` to classify the bundle as an adapter at create-time (omitting it routes through the base path and the bundle deploys to `failed`). Optional on `weight_source=huggingface` where it overrides `adapter_config.json::base_model_name_or_path` from the upstream repo (useful when the recorded base id isn't a valid HF id, e.g. a local filesystem path used during training).
- `validate(..., model_size_gb=...)`: caller-supplied weight-size hint that lets the platform skip the HF head-bytes probe. Useful for very large models (405B-class) where the probe otherwise stalls the validate response.

Typed LoRA fields on existing Pydantic models
- `CustomModel`: `artifact_type` (`Literal["base", "lora"] | None`), `base_model_id`, `lora_adapter_name`, `lora_rank`. `artifact_type` is `None` on responses from control planes that predate the LoRA work; treat that as `"base"` for compatibility.
- `ValidateModelResponse`: `artifact_type`, `detected_base_model_id`, `lora_rank`. When `artifact_type == "lora"`, the existing `architectures`, `num_params`, `estimated_memory_gb`, and `max_context_length` fields describe the base model resolved from `adapter_config.json`, not the adapter itself.

New public exports
`ArchitectureInfo`, `SupportedArchitectures`, `ArtifactType` from `graphn` (and `graphn.custom_models`).

Generated-layer change worth noting
`CustomModelCreate.huggingface_model_id` is now required on the generated `attrs` dataclass (was `str | Unset`). The server has rejected its absence with 422 on every `weight_source` since 0.1.3 (voltagepark/takao#1997) and the hand-curated `client.custom_models.create` already raised `ValidationError` client-side for S3 imports, so this is the generated type catching up, not a behavior change. Callers using the keyword-only ergonomic API are unaffected.

Verification
- `ruff check src tests`: clean
- `pytest -q`: 57 passed (43 existing + 14 new)
- `mypy src/graphn`: no new errors in any file this PR touches (3 pre-existing `no-any-return` errors in `_transport.py` and `tts.py` are on main and not regressions)

New test coverage (14 cases)
- `test_create_s3_lora_passes_base_model_id`
- `test_create_huggingface_lora_override_passes_base_model_id`
- `test_get_returns_typed_lora_fields`
- `test_get_legacy_response_treats_artifact_type_as_none`
- `test_validate_returns_lora_fields`
- `test_validate_forwards_model_size_gb` (`model_size_gb` forwarded on validate)
- `test_update_sends_patch_with_body`
- `test_update_rejects_empty_body`
- `test_update_404_maps_to_not_found`
- `test_update_extra_passes_through`
- `test_supported_architectures_returns_typed` (`supported_architectures` returns typed catalog)
- `test_async_update_sends_patch_with_body`
- `test_async_update_rejects_empty_body`
- `test_async_supported_architectures`

What this PR deliberately does NOT touch
- `examples/`: no example script demonstrates `update` / `supported_architectures` / LoRA imports today, and adding ones large enough to be useful would be a couple hundred lines on top of this diff. Happy to follow up with an `examples/import_lora.py` and `examples/live_update.py` in a separate PR.
- `README.md`: Scope callout and scope table are not updated. Same reasoning: keeping this PR focused on the code surface + tests.

Release pipeline
The `auto-tag` job in `.github/workflows/release.yml` will:

- Read `version = "0.1.6"` from `pyproject.toml`.
- Find `## [0.1.6]` in `CHANGELOG.md` ✓.
- Tag `v0.1.6`.
- Run `build` + `publish` in the same workflow run, shipping to PyPI via OIDC trusted publishing.

`pip install graphn==0.1.6` should resolve ~1 minute after merge.
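The gating logic of the `auto-tag` job can be approximated locally before merging. The Python sketch below is not the contents of `release.yml`: the helper names are hypothetical, and it assumes the version sits on its own `version = "..."` line in `pyproject.toml` and that a changelog entry is any line starting with `## [<version>]`.

```python
import re
from typing import Optional


def read_version(pyproject_text: str) -> Optional[str]:
    """Return the first `version = "..."` value found at the start of a line."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, re.MULTILINE)
    return match.group(1) if match else None


def changelog_has_entry(changelog_text: str, version: str) -> bool:
    """True when some line starts with `## [<version>]`."""
    pattern = rf"^## \[{re.escape(version)}\]"
    return re.search(pattern, changelog_text, re.MULTILINE) is not None


def tag_for_release(pyproject_text: str, changelog_text: str) -> Optional[str]:
    """Return the tag name to create, or None if either check fails."""
    version = read_version(pyproject_text)
    if version is None or not changelog_has_entry(changelog_text, version):
        return None
    return f"v{version}"
```

Running `tag_for_release` over the two files' contents before opening a release PR would catch a version bump that has no matching changelog heading.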