Add model to serve command #50

Merged
InftyAI-Agent merged 2 commits into InftyAI:main from kerthcet:cleanup/serve-model on May 6, 2026

Conversation

kerthcet (Member) commented May 6, 2026

What this PR does / why we need it

Which issue(s) this PR fixes

Fixes #

Special notes for your reviewer

Does this PR introduce a user-facing change?


Signed-off-by: kerthcet <kerthcet@gmail.com>
Copilot AI review requested due to automatic review settings May 6, 2026 21:01
kerthcet (Member, Author) commented May 6, 2026

/lgtm
/kind cleanup

InftyAI-Agent added the needs-triage, needs-priority, do-not-merge/needs-kind, approved, lgtm, and cleanup labels and removed the do-not-merge/needs-kind label on May 6, 2026
Signed-off-by: kerthcet <kerthcet@gmail.com>
InftyAI-Agent removed the lgtm label on May 6, 2026

kerthcet (Member, Author) commented May 6, 2026

/lgtm

InftyAI-Agent added the lgtm label on May 6, 2026
InftyAI-Agent merged commit 82b9cb6 into InftyAI:main on May 6, 2026
17 checks passed
kerthcet deleted the cleanup/serve-model branch on May 6, 2026 21:05
Copilot AI left a comment

Pull request overview

This PR updates the serve CLI workflow to require an explicit model name and performs a preflight registry check before starting the API server. It also changes the /health response payload and updates README usage examples accordingly.

Changes:

  • Require puma serve <model> and pass the model name into the serve execution path.
  • Add a model-exists preflight check in the CLI before starting the server.
  • Remove version from the /health endpoint response and update documentation to match.
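
The first change above, a required positional `<model>`, can be illustrated with a hand-rolled parser. This is only a sketch: the PR's actual argument parser is not shown in this conversation, and `ServeArgs` and `parse_serve_args` are hypothetical names (only the `model` field mirrors `args.model` from the diff).

```rust
/// Hypothetical parsed form of `puma serve <model>`.
/// Only the `model` field name is taken from the PR (`args.model`).
#[derive(Debug, PartialEq)]
pub struct ServeArgs {
    pub model: String,
}

/// Parse the arguments that follow the `serve` subcommand.
/// Exactly one positional argument (the model name) is accepted.
pub fn parse_serve_args<I, S>(args: I) -> Result<ServeArgs, String>
where
    I: IntoIterator<Item = S>,
    S: Into<String>,
{
    let mut positional: Vec<String> = args.into_iter().map(Into::into).collect();
    match positional.len() {
        // No model given: this is the case the PR turns into an error.
        0 => Err("missing required argument: <model>".to_string()),
        1 => Ok(ServeArgs { model: positional.remove(0) }),
        n => Err(format!("expected exactly one <model>, got {} arguments", n)),
    }
}
```

Calling `parse_serve_args(vec!["test/serve-model"])` yields a `ServeArgs` with that model name, while an empty argument list returns an error instead of silently starting a server with no model.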

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/cli/serve.rs | Extends execute() to accept a model name and logs it on startup. |
| src/cli/commands.rs | Adds a required positional model arg to serve, checks registry presence, and adds related tests. |
| src/api/routes.rs | Updates /health response schema by removing version. |
| README.md | Updates serve usage and health output; adds note about /v1/models. |


Comment thread src/api/routes.rs
Comment on lines 59 to 69
 /// Health check response
 #[derive(Serialize)]
 struct HealthResponse {
     status: String,
-    version: String,
 }

 /// Health check endpoint
 async fn health_check() -> Json<HealthResponse> {
     Json(HealthResponse {
         status: "ok".to_string(),
-        version: env!("CARGO_PKG_VERSION").to_string(),
     })
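
To make the user-facing effect of this hunk concrete, here is a dependency-free sketch of the trimmed payload. It is an illustration only: the real handler serializes through `Serialize` and `Json` as in the diff, whereas `to_json` below is a hypothetical hand-written encoder used to avoid external crates.

```rust
/// Stand-in for the trimmed `HealthResponse`: only `status` remains.
struct HealthResponse {
    status: String,
}

impl HealthResponse {
    /// Hypothetical hand-written JSON encoding for this sketch.
    /// Assumes `status` contains no characters that need JSON escaping.
    fn to_json(&self) -> String {
        format!("{{\"status\":\"{}\"}}", self.status)
    }
}

/// Synchronous stand-in for the async `health_check` handler.
fn health_check() -> HealthResponse {
    HealthResponse {
        status: "ok".to_string(),
    }
}
```

In this sketch `health_check().to_json()` produces `{"status":"ok"}`, so any client that previously read a `version` field from `/health` would need to stop relying on it.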
Comment thread src/cli/serve.rs
Comment on lines 10 to 33
@@ -23,7 +27,7 @@ pub async fn execute(host: &str, port: u16) -> Result<(), Box<dyn std::error::Er
             .bright_blue()
             .bold()
     );
-    info!("Starting PUMA inference server");
+    info!("Starting PUMA to serve model: {}", model_name);

     // Initialize backend (MockEngine for now, replace with MLX later)
     let engine = Arc::new(MockEngine::new());
Comment thread src/cli/commands.rs
Comment on lines +227 to +243
// Verify model exists
let registry = ModelRegistry::new(None);
match registry.get_model(&args.model) {
    Ok(Some(_)) => {
        // Model exists, proceed
    }
    Ok(None) => {
        eprintln!("❌ Error: Model '{}' not found in registry", args.model);
        eprintln!("Run 'puma pull {}' to download it first", args.model);
        std::process::exit(1);
    }
    Err(e) => {
        eprintln!("❌ Error checking model: {}", e);
        std::process::exit(1);
    }
}
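
One possible variation on this preflight check, sketched under stated assumptions, is to return a `Result` instead of calling `std::process::exit` inline, so that the decision itself can be unit tested. `ModelLookup`, `InMemoryRegistry`, `PreflightError`, and `ensure_model_exists` are hypothetical names introduced for illustration; the real `ModelRegistry` API is only partially visible in this PR.

```rust
use std::collections::HashSet;
use std::fmt;

/// Hypothetical abstraction over the registry lookup.
pub trait ModelLookup {
    fn contains(&self, name: &str) -> Result<bool, String>;
}

/// In-memory lookup used only for this sketch.
pub struct InMemoryRegistry(pub HashSet<String>);

impl ModelLookup for InMemoryRegistry {
    fn contains(&self, name: &str) -> Result<bool, String> {
        Ok(self.0.contains(name))
    }
}

/// The two failure cases handled by the match in the diff.
#[derive(Debug, PartialEq)]
pub enum PreflightError {
    NotFound(String),
    Lookup(String),
}

impl fmt::Display for PreflightError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Message text follows the eprintln! calls in the diff.
            PreflightError::NotFound(m) => write!(
                f,
                "Model '{}' not found in registry\nRun 'puma pull {}' to download it first",
                m, m
            ),
            PreflightError::Lookup(e) => write!(f, "Error checking model: {}", e),
        }
    }
}

/// Returns Ok(()) when the model is registered; otherwise an error the
/// caller can print before choosing an exit code.
pub fn ensure_model_exists(
    registry: &dyn ModelLookup,
    model: &str,
) -> Result<(), PreflightError> {
    match registry.contains(model) {
        Ok(true) => Ok(()),
        Ok(false) => Err(PreflightError::NotFound(model.to_string())),
        Err(e) => Err(PreflightError::Lookup(e)),
    }
}
```

With this shape the command handler would print the error and exit in one place, and tests could assert on `PreflightError::NotFound` directly rather than only on the registry lookup.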

Comment thread src/cli/commands.rs
Comment on lines +416 to +439
#[test]
fn test_serve_with_existing_model() {
    let temp_dir = TempDir::new().unwrap();
    let registry = ModelRegistry::new(Some(temp_dir.path().to_path_buf()));

    let model = create_test_model("test/serve-model", "abc123");
    registry.register_model(model).unwrap();

    // Verify model exists (this is what serve command checks)
    let result = registry.get_model("test/serve-model");
    assert!(result.is_ok());
    assert!(result.unwrap().is_some());
}

#[test]
fn test_serve_with_nonexistent_model() {
    let temp_dir = TempDir::new().unwrap();
    let registry = ModelRegistry::new(Some(temp_dir.path().to_path_buf()));

    // Verify model doesn't exist
    let result = registry.get_model("nonexistent/model");
    assert!(result.is_ok());
    assert!(result.unwrap().is_none());
}
Comment thread README.md

#### List Models
```bash
# Returns the currently loaded model
```

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
  • lgtm: Looks good to me, indicates that a PR is ready to be merged.
  • needs-priority: Indicates a PR lacks a label and requires one.
  • needs-triage: Indicates an issue or PR lacks a label and requires one.

3 participants