[AINode]: Integrate toto as a builtin forecasting model#17322
graceli02 wants to merge 8 commits into apache:master
Conversation
Integrate Datadog's Toto-Open-Base-1.0 into AINode's builtin model registry as an optional lazy dependency.
- Add TotoConfig (PretrainedConfig) with Toto architecture params
- Add TotoForPrediction wrapper loaded via ModelHubMixin.from_pretrained
- Add TotoPipeline (ForecastPipeline) with lazy toto-ts import and clear installation instructions if the package is missing
- Register 'toto' in BUILTIN_HF_TRANSFORMERS_MODEL_MAP
- Add 'toto' entry to AINodeTestUtils.BUILTIN_LTSM_MAP

toto-ts is optional: no changes to pyproject.toml or poetry.lock
Force-pushed from 17d4276 to 7740dae
- Add Apache 2.0 license header to __init__.py and pipeline_toto.py
- Fix pipeline_toto.py: replace broken local import with lazy toto-ts import via _import_toto() helper; fix merge conflict in model_info.py
CRZbulabula left a comment
Hi Grace, this is your first PR (pull request) for the Apache IoTDB repository; our community highly appreciates your contribution!
Next, let's talk about this PR. For time-series foundation model integration in AINode, we generally introduce the model's source code and then declare its open-source license in the LICENSE file, which you can find in the project root directory.
Although installing the released package and then invoking the corresponding forecaster seems more convenient, this is usually not feasible in system engineering projects. To elaborate, different Python projects often pin different versions of the same dependency, resulting in package conflicts. For instance, the transformers project changed its KV-cache component significantly from v4.4x to v4.5x, so models built on v4.4x cannot share the same dependency with models built on v4.5x.
To improve this PR, I suggest:
- To trace the entire model-forecast process, dive into the forecast example scripts of the Toto-1.0 model.
- Integrate the Toto model by introducing its source code.
- Package AINode, then verify your integration via SHOW MODELS and SELECT * FROM FORECAST.
In addition, you might encounter some problems in the model-loading phase. This is because the Toto-1.0 model is loaded through the ModelHubMixin interface, while the current version of AINode only accepts the PreTrainedModel format. Feel free to take on this challenge; we are integrating the PyTorchModelHubMixin interface, and the corresponding PR will be available for reference soon.
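The gap the reviewer describes, a ModelHubMixin-style loader versus a PreTrainedModel-style one, can be bridged with a thin wrapper whose `from_pretrained` delegates to the mixin loader. The sketch below uses stand-in stub classes (no real huggingface_hub or transformers imports) purely to illustrate the adapter shape; all names here are hypothetical:

```python
class HubMixinToto:
    """Stand-in for a model loaded via huggingface_hub's ModelHubMixin."""

    def __init__(self, weights):
        self.backbone = weights

    @classmethod
    def from_pretrained(cls, repo_id):
        # huggingface_hub resolves repo files and builds the module directly;
        # no transformers auto-class machinery is involved.
        return cls(weights=f"weights-from:{repo_id}")


class TotoForPredictionSketch:
    """Stand-in for a PreTrainedModel-style wrapper AINode's loader can call."""

    def __init__(self, inner):
        self._inner = inner

    @property
    def backbone(self):
        return self._inner.backbone

    @classmethod
    def from_pretrained(cls, repo_id, **kwargs):
        # Delegate to the hub-mixin loader, then wrap the result so a
        # PreTrainedModel-based loading path can consume it unchanged.
        return cls(HubMixinToto.from_pretrained(repo_id))
```

The wrapper exposes the same `from_pretrained` entry point the loader expects, while the mixin model does the actual weight resolution underneath.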
…/pipeline
- Fix build_binary.py: poetry lock → poetry install --no-root; remove
capture_output=True so errors are visible in CI
- Vendor toto source (DataDog/toto, Apache-2.0) into model/toto/:
model/{attention,backbone,distribution,embedding,feed_forward,
fusion,rope,scaler,transformer,toto,util}.py
data/util/dataset.py
inference/forecaster.py
Eliminates toto-ts pip dependency and all gluonts transitive deps.
gluonts replaced with pure PyTorch (TransformedDistribution/AffineTransform,
torch.nn.Module Scaler base).
- Rewrite modeling_toto.py: TotoForPrediction now inherits PreTrainedModel
(required by model_loader); backbone stored as self.model so safetensors
keys (model.*) map directly; custom from_pretrained applies
_map_state_dict_keys for SwiGLU remapping before loading weights.
- Rewrite pipeline_toto.py: import directly from local source;
TotoForecaster created lazily inside _get_forecaster() — not at __init__
time — fixing ImportError at pipeline construction in CI.
- pyproject.toml: add rotary-embedding-torch>=0.8.0 (only new dep)
- .gitignore: un-ignore toto data/ package (Python source, not data files)
- Add toto/NOTICE with Datadog attribution per Apache policy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Apache RAT flagged the standalone NOTICE file inside the toto Python package because the project's RAT config does not exclude plain NOTICE files. Moved the Datadog/toto attribution to the standard location (project root NOTICE) and removed the inner NOTICE file.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n_kwargs
- scaler.py: add short-name aliases ("per_variate", "per_variate_causal",
"per_variate_causal_patch") to scaler_types dict so that config.json
string values work without KeyError at backbone init time.
- backbone.py: recognise "per_variate_causal_patch" string in the
CausalPatchStdMeanScaler branch (alongside the legacy class-path string).
- configuration_toto.py: add output_distribution_kwargs parameter with
default {"k_components": 5} matching Datadog/Toto-Open-Base-1.0.
- modeling_toto.py: pass output_distribution_kwargs from config to
TotoBackbone so MixtureOfStudentTsOutput receives k_components.
Fixes: KeyError 'per_variate_causal' in scaler_types lookup.
Fixes: MixtureOfStudentTsOutput missing required positional arg k_components.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
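The alias fix described above, accepting both short config names and legacy class-path strings in one lookup, can be sketched as follows. The class and dict names are hypothetical stand-ins for the real scaler.py contents:

```python
class StdMeanScaler: ...
class CausalStdMeanScaler: ...
class CausalPatchStdMeanScaler: ...

# One table accepts both the short names found in config.json and the
# legacy fully-qualified class-path spelling, avoiding KeyError at init.
scaler_types = {
    "per_variate": StdMeanScaler,
    "per_variate_causal": CausalStdMeanScaler,
    "per_variate_causal_patch": CausalPatchStdMeanScaler,
    "model.scaler.CausalPatchStdMeanScaler": CausalPatchStdMeanScaler,
}

def resolve_scaler(name: str):
    try:
        return scaler_types[name]
    except KeyError as exc:
        raise KeyError(
            f"Unknown scaler type {name!r}; expected one of {sorted(scaler_types)}"
        ) from exc
```

Routing every lookup through one dict means a new alias only has to be added in one place, and unknown strings fail with an explicit message instead of a bare KeyError.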
Summary
This PR integrates Toto (Time Series Optimized Transformer for Observability) by Datadog as a built-in forecasting model in AINode, following the same pattern as existing built-in HuggingFace models (sundial, chronos2, moirai2, timer_xl).
Toto is installed as an optional dependency via pip install toto-ts and loaded lazily at runtime, so it does not affect AINode's startup time or core dependencies when not in use.

Changes
- toto/configuration_toto.py — TotoConfig (PretrainedConfig) defining Toto's architecture parameters
- toto/modeling_toto.py — TotoForPrediction wrapper around toto-ts's Toto class, bridging ModelHubMixin-based loading with AINode's load_model_from_transformers mechanism
- toto/pipeline_toto.py — TotoPipeline (ForecastPipeline) implementing preprocess/forecast/postprocess with lazy imports guarded by a clear install message
- toto/__init__.py — package init with Apache 2.0 license header
- model_info.py — registered toto in BUILTIN_HF_TRANSFORMERS_MODEL_MAP with repo_id="Datadog/Toto-Open-Base-1.0" and auto_map pointing to the above classes
- AINodeTestUtils.java — added toto to BUILTIN_LTSM_MAP with expected state "active"

Design Notes
Toto uses huggingface_hub.ModelHubMixin rather than transformers.PreTrainedModel. TotoForPrediction.from_pretrained() delegates directly to Toto.from_pretrained() from the toto-ts package, then exposes a .backbone property (TotoBackbone) consumed by TotoForecaster in the pipeline. This avoids wrapping Toto in a standard HuggingFace PreTrainedModel while remaining fully compatible with AINode's existing loading infrastructure.

Setup Required
The Cluster IT - 1C1D1A / AINode test verifies that all built-in models report state "active" in SHOW MODELS. Since toto is newly added, its weights (~605 MB) must be pre-downloaded to the runner cache before the test will pass reliably. Please run the following once on the self-hosted GPU runner before merging:
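The exact command was not preserved in this thread; a plausible pre-download step, assuming the huggingface_hub Python API is available on the runner, might look like:

```python
# Hypothetical pre-download script (run once on the self-hosted GPU runner).
# Populates the local HF cache so the Cluster IT test finds toto's weights
# instead of downloading ~605 MB mid-test.
from huggingface_hub import snapshot_download

local_path = snapshot_download(repo_id="Datadog/Toto-Open-Base-1.0")
print(f"Toto weights cached at: {local_path}")
```

This is a one-off cache-warming step; it performs a network download and is not something the test suite itself should run.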
No code changes are needed.
This PR has:
- for an unfamiliar reader.
- for code coverage.
Key changed/added classes (or packages if there are too many classes) in this PR