feat(serve): add SageMaker GenAI inference benchmarking and recommendation by ZealSV · Pull Request #5874 · aws/sagemaker-python-sdk

ZealSV · 2026-05-19T17:38:39Z

Adds sagemaker.serve.ai_inference_recommender, a thin ergonomic layer
over sagemaker-core's AIBenchmarkJob, AIRecommendationJob, and
AIWorkloadConfig resources.

ModelBuilder gains a new entry point and extends two existing verbs:

    # Benchmark a deployed endpoint
    job = mb.start_benchmark(endpoint=ep, workload=Workload.synthetic(...))
    result = BenchmarkResult.from_job(job)

    # Recommendation flow extends optimize() and deploy()
    mb.optimize(workload=..., performance_target="throughput",
                instance_types=["ml.g6.12xlarge"])
    endpoint = mb.deploy(role=role)                          # top recommendation
    endpoint = mb.deploy(role=role, recommendation_index=2)  # alternative

print(result) and print(mb.recommendations[0]) render their data as
tables.
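
For reviewers who want a feel for the table rendering without running a job, here is a minimal sketch of how a result object might format itself as a plain-text table. MetricRow and render_table are hypothetical names, and the two metric rows are made-up illustration values, not output from a real benchmark; the actual BenchmarkResult layout may differ.

```python
from dataclasses import dataclass


@dataclass
class MetricRow:
    """Hypothetical stand-in for one benchmark metric; not the SDK's class."""
    name: str
    value: float
    unit: str


def render_table(rows):
    """Render rows as a plain-text table with left-aligned, padded columns."""
    header = ("metric", "value", "unit")
    cells = [header] + [(r.name, f"{r.value:.2f}", r.unit) for r in rows]
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(row[i]) for row in cells) for i in range(3)]
    lines = [
        "  ".join(c.ljust(w) for c, w in zip(row, widths)).rstrip()
        for row in cells
    ]
    # Separator line between the header and the data rows.
    lines.insert(1, "  ".join("-" * w for w in widths))
    return "\n".join(lines)


# Made-up values, for illustration only.
table = render_table([
    MetricRow("request_latency_p50", 412.5, "ms"),
    MetricRow("output_token_throughput", 1830.0, "tokens/s"),
])
print(table)
```

Returning this string from a class's `__str__` is what would make `print(obj)` show the table.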

Public surface added under sagemaker.serve:

* Workload -- typed factory; extras pass through **params, validated
  server-side.
* BenchmarkResult / BenchmarkMetrics / BenchmarkMetric -- parses the
  AIPerf output.tar.gz from S3.
* Secret -- opt-in helper for tokens >512 chars (Secrets Manager).
* BenchmarkJob, RecommendationJob -- re-exports without the AI prefix.
* FeatureGatedError, WorkloadValidationError -- typed exceptions.
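
As a rough illustration of the output.tar.gz parsing step, the sketch below reads a JSON member out of a gzip-compressed tar archive held in memory, using only the standard library. It assumes the archive contains a JSON file; the member name results/metrics.json and the payload layout are hypothetical, and the S3 download is left out entirely. This is not the BenchmarkResult implementation.

```python
import io
import json
import tarfile


def read_metrics_from_archive(archive_bytes, member_suffix=".json"):
    """Return the parsed JSON of the first regular file ending in member_suffix."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(member_suffix):
                # Members are parsed in place; nothing is extracted to disk.
                return json.load(tar.extractfile(member))
    raise FileNotFoundError(f"no member ending in {member_suffix!r} in archive")


def build_sample_archive(payload):
    """Build an in-memory tar.gz holding one JSON file, for demonstration only."""
    data = json.dumps(payload).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="results/metrics.json")  # hypothetical name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical payload shape; the real AIPerf output may look different.
sample = build_sample_archive({"request_latency": {"avg": 1.5, "unit": "ms"}})
metrics = read_metrics_from_archive(sample)
print(metrics)
```

Reading members through extractfile() avoids writing archive contents to the local filesystem, which sidesteps path-traversal concerns with untrusted archives.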

Pin-mode and workload-mode optimize() kwargs are mutually exclusive.
Recommendation deploy uses the ModelPackage path (auto-approves the
package the rec job publishes).
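
The mutual-exclusion rule could be enforced with a check along these lines. This is a sketch under assumptions: the workload-mode names come from the example above, while the pin-mode names (instance_type, quantization_config) and the check_optimize_kwargs helper are placeholders invented for illustration, not the actual optimize() signature.

```python
# Workload-mode names are taken from the PR's example call.
WORKLOAD_MODE_KWARGS = {"workload", "performance_target", "instance_types"}
# Pin-mode names are hypothetical placeholders.
PIN_MODE_KWARGS = {"instance_type", "quantization_config"}


def check_optimize_kwargs(**kwargs):
    """Return "workload" or "pin", rejecting calls that mix the two groups."""
    # Only arguments the caller actually set (non-None) count as supplied.
    supplied = {name for name, value in kwargs.items() if value is not None}
    pin = sorted(supplied & PIN_MODE_KWARGS)
    workload = sorted(supplied & WORKLOAD_MODE_KWARGS)
    if pin and workload:
        raise ValueError(
            f"pin-mode arguments {pin} cannot be combined with "
            f"workload-mode arguments {workload}"
        )
    return "workload" if workload else "pin"


mode = check_optimize_kwargs(workload=object(), performance_target="throughput")
print(mode)
```

Failing fast on the client side gives a clearer error than letting a mixed request reach the service.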

Includes 51 unit tests and 2 slow_test integration tests
(tests/integ/test_ai_inference_recommender_integration.py), verified
end-to-end against real AWS.

Rebased onto upstream to pick up #5860 (preserve falsy values in
sagemaker-core serialize), required so optimize_model=False reaches
the wire.
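
To show why falsy values matter here, the sketch below contrasts a truthiness filter with an explicit None check when building a request payload. It illustrates the general pitfall only; the function names are hypothetical, the instance_count and description fields are invented for the example, and this is not the sagemaker-core serializer or the actual change in #5860.

```python
def serialize_dropping_falsy(fields):
    """Truthiness filter: silently drops False, 0, "" and None alike."""
    return {name: value for name, value in fields.items() if value}


def serialize_preserving_falsy(fields):
    """Explicit None check: only unset (None) fields are omitted."""
    return {name: value for name, value in fields.items() if value is not None}


# Illustrative request; only optimize_model comes from the PR description.
request = {"optimize_model": False, "instance_count": 0, "description": None}

lossy = serialize_dropping_falsy(request)
kept = serialize_preserving_falsy(request)

print(lossy)  # optimize_model=False never reaches the wire
print(kept)   # False and 0 survive; None is omitted
```

With the truthiness filter, an explicit optimize_model=False is indistinguishable from an unset field, so the service would apply its default instead.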


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


ZealSV had a problem deploying to manual-approval May 19, 2026 17:38 — with GitHub Actions Error

ZealSV changed the title ~~feat(serve): add SageMaker GenAI inference benchmarking and recommend…~~ feat(serve): add SageMaker GenAI inference benchmarking and recommendation May 19, 2026

ZealSV force-pushed the feature/lumen-ai-inference-recommender branch from c0cfc77 to 747baeb Compare May 20, 2026 18:58

ZealSV had a problem deploying to manual-approval May 20, 2026 18:58 — with GitHub Actions Error

ZealSV force-pushed the feature/lumen-ai-inference-recommender branch from 747baeb to bb8c26a Compare May 20, 2026 20:34

ZealSV had a problem deploying to manual-approval May 20, 2026 20:35 — with GitHub Actions Error

ZealSV force-pushed the feature/lumen-ai-inference-recommender branch from bb8c26a to 1d9b769 Compare May 21, 2026 19:29

ZealSV had a problem deploying to manual-approval May 21, 2026 19:29 — with GitHub Actions Error

ZealSV force-pushed the feature/lumen-ai-inference-recommender branch from 1d9b769 to 31f40bd Compare May 21, 2026 19:32

ZealSV had a problem deploying to manual-approval May 21, 2026 19:32 — with GitHub Actions Error

Merge branch 'master' into feature/lumen-ai-inference-recommender

be8c609

ZealSV temporarily deployed to manual-approval May 21, 2026 19:34 — with GitHub Actions Inactive

ZealSV had a problem deploying to manual-approval May 21, 2026 19:34 — with GitHub Actions Error

ZealSV had a problem deploying to manual-approval May 22, 2026 22:48 — with GitHub Actions Error

Merge branch 'master' into feature/lumen-ai-inference-recommender

1b68ecf

ZealSV requested a deployment to manual-approval May 22, 2026 22:49 — with GitHub Actions Waiting

ZealSV deployed to manual-approval May 22, 2026 22:49 — with GitHub Actions Active

ZealSV wants to merge 4 commits into aws:master from ZealSV:feature/lumen-ai-inference-recommender
