[AINode]: Integrate toto as a builtin forecasting model#17322
graceli02 wants to merge 8 commits into apache:master
Conversation
Integrate Datadog's Toto-Open-Base-1.0 into AINode's builtin model registry as an optional lazy dependency.
- Add TotoConfig (PretrainedConfig) with Toto architecture params
- Add TotoForPrediction wrapper loaded via ModelHubMixin.from_pretrained
- Add TotoPipeline (ForecastPipeline) with lazy toto-ts import and clear installation instructions if the package is missing
- Register 'toto' in BUILTIN_HF_TRANSFORMERS_MODEL_MAP
- Add 'toto' entry to AINodeTestUtils.BUILTIN_LTSM_MAP

toto-ts is optional: no changes to pyproject.toml or poetry.lock
Force-pushed from 17d4276 to 7740dae
- Add Apache 2.0 license header to __init__.py and pipeline_toto.py
- Fix pipeline_toto.py: replace broken local import with lazy toto-ts import via _import_toto() helper; fix merge conflict in model_info.py
CRZbulabula left a comment
Hi Grace, this is your first PR (pull request) for the Apache IoTDB repository; our community highly appreciates your contribution!
Next, let's talk about this PR. For time-series foundation model integration in AINode, we generally introduce the model's source code and then declare its open-source license in the LICENSE file, which you can find in the project root directory.
Although installing the released package and then invoking the corresponding forecaster seems more convenient, this is usually not feasible in system engineering projects. To elaborate, different Python projects often pin different versions of the same dependency, resulting in package conflicts. For instance, the transformers project changed its KV-cache component significantly from v4.4x to v4.5x, so models built on v4.4x cannot share the same dependency with models built on v4.5x.
To improve this PR, I suggest:
- To trace the entire model-forecast process, dive into the forecast example scripts of the Toto-1.0 model.
- Integrate the Toto model by introducing its source code.
- Package AINode, then verify your integration via SHOW MODELS and SELECT * FROM FORECAST.
In addition, you might encounter some problems in the model-loading phase. This is because the Toto-1.0 model is loaded through the ModelHubMixin interface, while the current version of AINode only accepts the PreTrainedModel format. Feel free to take on this challenge; we are integrating the PyTorchModelHubMixin interface, and the corresponding PR will be available for reference soon.
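The gap the reviewer describes, a ModelHubMixin-style loader versus a PreTrainedModel-style one, can be bridged with a thin wrapper whose `from_pretrained` delegates to the mixin loader. The sketch below uses stand-in stub classes (no real huggingface_hub or transformers imports) purely to illustrate the adapter shape; all names here are hypothetical:

```python
class HubMixinToto:
    """Stand-in for a model loaded via huggingface_hub's ModelHubMixin."""

    def __init__(self, weights):
        self.backbone = weights

    @classmethod
    def from_pretrained(cls, repo_id):
        # huggingface_hub resolves repo files and builds the module directly;
        # no transformers auto-class machinery is involved.
        return cls(weights=f"weights-from:{repo_id}")


class TotoForPredictionSketch:
    """Stand-in for a PreTrainedModel-style wrapper AINode's loader can call."""

    def __init__(self, inner):
        self._inner = inner

    @property
    def backbone(self):
        return self._inner.backbone

    @classmethod
    def from_pretrained(cls, repo_id, **kwargs):
        # Delegate to the hub-mixin loader, then wrap the result so a
        # PreTrainedModel-based loading path can consume it unchanged.
        return cls(HubMixinToto.from_pretrained(repo_id))
```

The wrapper exposes the same `from_pretrained` entry point the loader expects, while the mixin model does the actual weight resolution underneath.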
…/pipeline
- Fix build_binary.py: poetry lock → poetry install --no-root; remove
capture_output=True so errors are visible in CI
- Vendor toto source (DataDog/toto, Apache-2.0) into model/toto/:
model/{attention,backbone,distribution,embedding,feed_forward,
fusion,rope,scaler,transformer,toto,util}.py
data/util/dataset.py
inference/forecaster.py
Eliminates toto-ts pip dependency and all gluonts transitive deps.
gluonts replaced with pure PyTorch (TransformedDistribution/AffineTransform,
torch.nn.Module Scaler base).
- Rewrite modeling_toto.py: TotoForPrediction now inherits PreTrainedModel
(required by model_loader); backbone stored as self.model so safetensors
keys (model.*) map directly; custom from_pretrained applies
_map_state_dict_keys for SwiGLU remapping before loading weights.
- Rewrite pipeline_toto.py: import directly from local source;
TotoForecaster created lazily inside _get_forecaster() — not at __init__
time — fixing ImportError at pipeline construction in CI.
- pyproject.toml: add rotary-embedding-torch>=0.8.0 (only new dep)
- .gitignore: un-ignore toto data/ package (Python source, not data files)
- Add toto/NOTICE with Datadog attribution per Apache policy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Apache RAT flagged the standalone NOTICE file inside the toto Python package because the project's RAT config does not exclude plain NOTICE files. Moved the Datadog/toto attribution to the standard location (project root NOTICE) and removed the inner NOTICE file.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n_kwargs
- scaler.py: add short-name aliases ("per_variate", "per_variate_causal",
"per_variate_causal_patch") to scaler_types dict so that config.json
string values work without KeyError at backbone init time.
- backbone.py: recognise "per_variate_causal_patch" string in the
CausalPatchStdMeanScaler branch (alongside the legacy class-path string).
- configuration_toto.py: add output_distribution_kwargs parameter with
default {"k_components": 5} matching Datadog/Toto-Open-Base-1.0.
- modeling_toto.py: pass output_distribution_kwargs from config to
TotoBackbone so MixtureOfStudentTsOutput receives k_components.
Fixes: KeyError 'per_variate_causal' in scaler_types lookup.
Fixes: MixtureOfStudentTsOutput missing required positional arg k_components.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
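The alias fix described above, accepting both short config names and legacy class-path strings in one lookup, can be sketched as follows. The class and dict names are hypothetical stand-ins for the real scaler.py contents:

```python
class StdMeanScaler: ...
class CausalStdMeanScaler: ...
class CausalPatchStdMeanScaler: ...

# One table accepts both the short names found in config.json and the
# legacy fully-qualified class-path spelling, avoiding KeyError at init.
scaler_types = {
    "per_variate": StdMeanScaler,
    "per_variate_causal": CausalStdMeanScaler,
    "per_variate_causal_patch": CausalPatchStdMeanScaler,
    "model.scaler.CausalPatchStdMeanScaler": CausalPatchStdMeanScaler,
}

def resolve_scaler(name: str):
    try:
        return scaler_types[name]
    except KeyError as exc:
        raise KeyError(
            f"Unknown scaler type {name!r}; expected one of {sorted(scaler_types)}"
        ) from exc
```

Routing every lookup through one dict means a new alias only has to be added in one place, and unknown strings fail with an explicit message instead of a bare KeyError.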
Summary
This PR integrates Toto (Time Series Optimized Transformer for Observability) by Datadog as a built-in forecasting model in AINode, following the same pattern as existing built-in HuggingFace models (sundial, chronos2, moirai2, timer_xl).
Toto is installed as an optional dependency via pip install toto-ts and loaded lazily at runtime, so it does not affect AINode's startup time or core dependencies when not in use.

Changes
- toto/configuration_toto.py — TotoConfig (PretrainedConfig) defining Toto's architecture parameters
- toto/modeling_toto.py — TotoForPrediction wrapper around toto-ts's Toto class, bridging ModelHubMixin-based loading with AINode's load_model_from_transformers mechanism
- toto/pipeline_toto.py — TotoPipeline (ForecastPipeline) implementing preprocess/forecast/postprocess with lazy imports guarded by a clear install message
- toto/__init__.py — package init with Apache 2.0 license header
- model_info.py — registered toto in BUILTIN_HF_TRANSFORMERS_MODEL_MAP with repo_id="Datadog/Toto-Open-Base-1.0" and auto_map pointing to the above classes
- AINodeTestUtils.java — added toto to BUILTIN_LTSM_MAP with expected state "active"

Design Notes
Toto uses huggingface_hub.ModelHubMixin rather than transformers.PreTrainedModel. TotoForPrediction.from_pretrained() delegates directly to Toto.from_pretrained() from the toto-ts package, then exposes a .backbone property (TotoBackbone) consumed by TotoForecaster in the pipeline. This avoids wrapping Toto in a standard HuggingFace PreTrainedModel while remaining fully compatible with AINode's existing loading infrastructure.

Setup Required
The Cluster IT - 1C1D1A / AINode test verifies that all built-in models report state "active" in SHOW MODELS. Since toto is newly added, its weights (~605 MB) must be pre-downloaded to the runner cache before the test will pass reliably. Please run the following once on the self-hosted GPU runner before merging:
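The exact command was not preserved in this thread; a plausible pre-download step, assuming the huggingface_hub Python API is available on the runner, might look like:

```python
# Hypothetical pre-download script (run once on the self-hosted GPU runner).
# Populates the local HF cache so the Cluster IT test finds toto's weights
# instead of downloading ~605 MB mid-test.
from huggingface_hub import snapshot_download

local_path = snapshot_download(repo_id="Datadog/Toto-Open-Base-1.0")
print(f"Toto weights cached at: {local_path}")
```

This is a one-off cache-warming step; it performs a network download and is not something the test suite itself should run.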
No code changes are needed.
This PR has:
- for an unfamiliar reader.
- for code coverage.
Key changed/added classes (or packages if there are too many classes) in this PR