Skip to content

fix: MLFlow E2E Example Notebook (5513)#5694

Draft
aviruthen wants to merge 1 commit intoaws:masterfrom
aviruthen:fix/mlflow-e2e-example-notebook-5513
Draft

fix: MLFlow E2E Example Notebook (5513)#5694
aviruthen wants to merge 1 commit intoaws:masterfrom
aviruthen:fix/mlflow-e2e-example-notebook-5513

Conversation

@aviruthen
Copy link
Copy Markdown
Collaborator

Description

The existing MLflow E2E notebook at v3-examples/ml-ops-examples/v3-mlflow-train-inference-e2e-example.ipynb has several issues: (1) Uses registered_model.latest_versions which was removed in MLflow 3.x - should use client.search_model_versions() instead; (2) Uses Session.boto_region_name as a class attribute without instantiation - should be Session().boto_region_name; (3) Step 7 uses raw boto3 sagemaker-runtime client for inference instead of the V3 core_endpoint.invoke() pattern used in other V3 example notebooks; (4) The notebook references mlflow==3.4.0 but uses deprecated MLflow 2.x API patterns incompatible with MLflow 3.x.

Related Issue

Related issue: 5513

Changes Made

  • v3-examples/ml-ops-examples/v3-mlflow-train-inference-e2e-example.ipynb
  • v3-examples/inference-examples/train-inference-e2e-example.ipynb

AI-Generated PR

This PR was automatically generated by the PySDK Issue Agent.

  • Confidence score: 85%
  • Classification: bug
  • SDK version target: V3

Merge Checklist

  • Changes are backward compatible
  • Commit message follows prefix: description format
  • Unit tests added/updated
  • Integration tests added (if applicable)
  • Documentation updated (if applicable)

Copy link
Copy Markdown
Member

@mufaddal-rohawala mufaddal-rohawala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Review

This PR fixes several real issues in example notebooks: MLflow 3.x API compatibility, incorrect Session class attribute access, and replacing raw boto3 calls with V3-native core_endpoint.invoke(). The changes are well-motivated and align with V3 SDK conventions. A few minor concerns around the Session() instantiation pattern and the core_endpoint.invoke() API usage.

"\n",
"# AWS Configuration\n",
"AWS_REGION = Session.boto_region_name\n",
"AWS_REGION = Session().boto_region_name\n",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While Session().boto_region_name is correct (it needs instantiation), this creates a throwaway Session object just to get the region. Consider using boto3.session.Session().region_name for a lighter-weight approach, or storing the Session() instance for reuse later in the notebook:

sagemaker_session = Session()
AWS_REGION = sagemaker_session.boto_region_name

This avoids creating multiple Session objects if it's used elsewhere in the notebook.

"\n",
"# AWS Configuration\n",
"AWS_REGION = Session.boto_region_name\n",
"AWS_REGION = Session().boto_region_name\n",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as the other notebook — consider storing the Session() instance in a variable for reuse rather than creating a throwaway object:

sagemaker_session = Session()
AWS_REGION = sagemaker_session.boto_region_name

"print(f\"Input: {test_data}\")\n",
"print(f\"Prediction: {prediction}\")"
]
},
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good change replacing raw boto3 sagemaker-runtime client with core_endpoint.invoke() — this aligns with the V3 tenet that all subpackages should use sagemaker-core rather than calling boto3 directly.

However, please verify the exact parameter names for core_endpoint.invoke(). The sagemaker-core Endpoint.invoke() method may use Body and ContentType (PascalCase matching the API model) rather than body and content_type (snake_case). Similarly, the response attribute might be Body rather than body. If the snake_case versions work, that's fine — just want to make sure this has been tested.

"\n",
"latest_version = registered_model.latest_versions[0]\n",
"# Search for the latest version of the registered model (MLflow 3.x compatible)\n",
"versions = client.search_model_versions(\n",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: The order_by parameter value 'version_number DESC' — please verify this is the correct field name for MLflow's search_model_versions. The MLflow documentation uses 'version_number DESC' in some examples but older versions may use different field names. Since this notebook targets mlflow==3.4.0, it would be good to confirm this works with that specific version.

" max_results=1\n",
")\n",
"\n",
"if not versions:\n",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good defensive check raising ValueError with a descriptive message including the model name. This follows the SDK convention of fail-fast validation with specific exceptions.

"from sagemaker.serve.mode.function_pointers import Mode\n",
"\n",
"# Cloud deployment to SageMaker endpoint\n",
"# Note: 'dependencies' parameter is deprecated. You may see a deprecation warning.\n",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says 'dependencies' parameter is deprecated and suggests configure_for_torchserve() — is this accurate? If so, should the example code itself be updated to use the non-deprecated approach rather than just adding a comment about it? Leaving deprecated code in an example notebook may confuse users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants