62 changes: 62 additions & 0 deletions CONTRIBUTING.md
@@ -72,3 +72,65 @@ Build the HTML version and host it locally:
```bash
make serve-docs
```

## Preparing documentation for a release

Use this checklist when cutting a new package release (for example `0.3.0`). Documentation is published to [deeppavlov.github.io/AutoIntent](https://deeppavlov.github.io/AutoIntent/) via GitHub Pages; a **published GitHub Release** triggers the multi-version build and deploy.

### Before opening a PR

1. **Align versions** across `pyproject.toml`, `docs/source/conf.py` (`release`), and `CHANGELOG.md`.
2. **Update prose** under `docs/source/` (`.rst` files).
3. **Update tutorials** in the repo-root `user_guides/` directory — not under `docs/source/user_guides/`, which is generated at build time and gitignored.
4. **Run doctests** (same as CI on PRs and pushes to `dev`):
   ```bash
   make test-docs
   ```
5. **Build HTML locally** and fix any errors:
   ```bash
   make docs
   ```
   Optional preview:
   ```bash
   make serve-docs
   ```
   If the build is stale or autoapi / tutorial links look wrong:
   ```bash
   make clean-docs
   make docs
   ```
6. **Match CI dependencies** when tutorials or API pages fail on missing imports:
   ```bash
   uv sync --group docs --extra catboost --extra peft --extra transformers --extra sentence-transformers --extra openai
   ```
   **Pandoc** is required for nbsphinx (CI installs it via `apt`).
7. **Regenerate the optimizer JSON schema** if `OptimizerConfig` or related Pydantic models changed:
   ```bash
   make schema
   ```
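
Step 1 (aligning versions) can be checked mechanically. The sketch below is not part of the repository; it assumes `pyproject.toml` has a literal `version = "..."` line, `docs/source/conf.py` has a literal `release = "..."` line, and the newest `CHANGELOG.md` entry is the first heading that contains a semver string. The function names are hypothetical.

```python
import re


def extract_versions(pyproject_text, conf_text, changelog_text):
    """Return the version string found in each file, or None when absent."""
    # Assumption: a literal `version = "X.Y.Z"` line (no dynamic versioning).
    pyproject = re.search(r'^version\s*=\s*["\']([^"\']+)["\']', pyproject_text, re.MULTILINE)
    # Assumption: conf.py assigns `release` a plain string literal.
    conf = re.search(r'^release\s*=\s*["\']([^"\']+)["\']', conf_text, re.MULTILINE)
    # Assumption: the first heading containing a semver is the newest entry.
    changelog = re.search(r"^#+\s*\[?v?(\d+\.\d+\.\d+)", changelog_text, re.MULTILINE)
    return {
        "pyproject.toml": pyproject.group(1) if pyproject else None,
        "docs/source/conf.py": conf.group(1) if conf else None,
        "CHANGELOG.md": changelog.group(1) if changelog else None,
    }


def versions_aligned(versions):
    """True when every file yielded a version and all of them are equal."""
    values = set(versions.values())
    return None not in values and len(values) == 1
```

To use it, read the three files with `pathlib.Path(...).read_text()`, pass the contents to `extract_versions`, and fail the check when `versions_aligned` returns `False`.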

### Do not commit

- `docs/build/`
- `docs/source/autoapi/`
- `docs/source/user_guides/` (symlinks and notebook run artifacts)
- `**/__pycache__/`

### Version switcher (`versions.json`)

`docs/_static/versions.json` is **auto-generated** on every Sphinx build from git tags matching `vX.Y.Z` (see `docs/source/docs_utils/versions_generator.py`). Do not hand-edit it for a release. Until the `vX.Y.Z` tag exists, local builds will still list the previous tag as stable — that is expected.
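
To illustrate why an untagged release does not show up as stable, here is a minimal sketch of selecting release tags. It is not the code in `versions_generator.py`; it only assumes that tags must match `vX.Y.Z` exactly and that the highest version is the one treated as stable. The function names are hypothetical.

```python
import re

# Exactly "v" followed by three dot-separated integers, e.g. v0.3.0.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def release_tags(tags):
    """Keep only tags of the form vX.Y.Z, sorted from oldest to newest."""
    matched = []
    for tag in tags:
        m = TAG_PATTERN.fullmatch(tag)
        if m:
            # Compare numerically so that v0.10.0 sorts after v0.3.0.
            matched.append((tuple(int(part) for part in m.groups()), tag))
    return [tag for _, tag in sorted(matched)]


def stable_tag(tags):
    """Return the highest vX.Y.Z tag, or None when no tag matches."""
    ordered = release_tags(tags)
    return ordered[-1] if ordered else None
```

Under these assumptions, a tag such as `0.3.0` or `v0.3.0rc1` is ignored, so the previous `vX.Y.Z` tag remains the stable one until the new tag is created.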

### Release day

1. Merge documentation and code changes into **`dev`** (CI runs `make test-docs` and `make docs`).
2. Create a git tag **`vX.Y.Z`** on the release commit (must match `v` + semver, for example `v0.3.0`).
3. **Publish a GitHub Release** for that tag. This triggers:
   - PyPI publish (`.github/workflows/release.yaml`)
   - Multi-version docs build and deploy (`.github/workflows/build-docs.yaml` → `make multi-version-docs` → GitHub Pages under `/versions/`).
4. Verify the live site: the version switcher shows the new release as **stable**, and `https://deeppavlov.github.io/AutoIntent/versions/vX.Y.Z/` loads.

To dry-run the multi-version build locally (requires full git history and tags):

```bash
make multi-version-docs
```
4 changes: 4 additions & 0 deletions docs/source/index.rst
@@ -65,6 +65,9 @@ In-Depth Learning
Reference
.........

:doc:`🌐 Inference servers <server>`
   Deploy a trained pipeline behind HTTP (FastAPI) or MCP (FastMCP): installation extras, environment variables, and how to run each server.

:doc:`🔧 API Reference <autoapi/autointent/index>`
   Complete technical documentation for all classes, methods, and functions. Essential reference for developers integrating AutoIntent into their applications.

@@ -80,4 +83,5 @@ Reference
   concepts
   user_guides
   learn/index
   server
   autoapi/autointent/index
123 changes: 123 additions & 0 deletions docs/source/server.rst
@@ -0,0 +1,123 @@
Inference servers
=================

AutoIntent can serve a **trained** pipeline behind two optional interfaces:

* **HTTP (FastAPI)** — a small REST API for ``predict`` and health checks. Use this when you integrate with services, gateways, or clients that speak HTTP/JSON.
* **MCP (FastMCP)** — a Model Context Protocol server with tools (``predict``, ``classes``, ``train_data``). Use this when an LLM host or IDE connects over MCP (stdio for local tools, or HTTP transport for remote access).

Both servers load assets from a directory on disk (the same folder produced when you optimize and dump a pipeline). They are **not** a replacement for training: you must fit or load a pipeline and write it to that directory first.

Installation
------------

Install AutoIntent with the extra that matches the server you need. For the HTTP server use ``fastapi``; for the MCP server use ``fastmcp``:

.. code-block:: bash

   pip install "autointent[fastapi]"

.. code-block:: bash

   pip install "autointent[fastmcp]"

The ``fastapi`` extra pulls in FastAPI, Uvicorn, and ``pydantic-settings``. The ``fastmcp`` extra pulls in FastMCP and ``pydantic-settings``.

.. note::

   If you use ``uv``, the project declares **incompatible optional extras** for ``codecarbon`` and ``fastmcp`` (see ``tool.uv.conflicts`` in ``pyproject.toml``). You cannot enable both in the same resolved environment; pick one or use separate virtual environments.

Prerequisites
-------------

* A directory containing a **saved optimized pipeline** (for example the project directory after ``context.dump()`` from optimization, or another path where ``Pipeline.load`` succeeds).
* For the **MCP** server only: a ``dataset.json`` file inside that same directory (the server loads training metadata and samples for the ``classes`` and ``train_data`` tools).
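
A pre-flight check for these prerequisites might look like the sketch below. It only verifies that the directory exists and, for MCP, that ``dataset.json`` is present; it does not confirm that ``Pipeline.load`` succeeds. The function name ``missing_prerequisites`` is hypothetical and not part of AutoIntent.

```python
from pathlib import Path


def missing_prerequisites(pipeline_dir, for_mcp=False):
    """Return a list of problems; an empty list means the basic checks passed."""
    root = Path(pipeline_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    dataset = root / "dataset.json"
    # Only the MCP server needs dataset.json (for the classes and train_data tools).
    if for_mcp and not dataset.is_file():
        problems.append(f"{dataset} is missing (required by the MCP server)")
    return problems
```

Running it against the value of ``AUTOINTENT_PATH`` before starting a server gives a clearer error than a failure during startup.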

Configuration (both servers)
----------------------------

Settings are defined with ``pydantic-settings`` and the prefix ``AUTOINTENT_``. Values can be set in the process environment or in a ``.env`` file in the current working directory.

**Shared**

``AUTOINTENT_PATH`` *(required)* — filesystem path to the pipeline directory (same meaning as the ``path`` field in code).

**HTTP server**

``AUTOINTENT_HOST`` — bind address (default ``127.0.0.1``).

``AUTOINTENT_PORT`` — listen port (default ``8013``).

**MCP server**

``AUTOINTENT_TRANSPORT`` — ``stdio`` (default) or ``http``.

``AUTOINTENT_HOST`` / ``AUTOINTENT_PORT`` — used when ``AUTOINTENT_TRANSPORT=http`` (defaults ``127.0.0.1`` and ``8012``).

Example ``.env``:

.. code-block:: text

   AUTOINTENT_PATH=/path/to/my_autointent_project
   # Optional HTTP overrides:
   # AUTOINTENT_HOST=0.0.0.0
   # AUTOINTENT_PORT=8013
   # Optional MCP over HTTP:
   # AUTOINTENT_TRANSPORT=http
   # AUTOINTENT_PORT=8012

Set these variables **before** starting the process (the HTTP app reads settings at import time).
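
The resolution rules above (prefix, required path, per-server default port) can be pictured with a standard-library stand-in. This is only an illustration: the real servers use ``pydantic-settings``, and this sketch neither reads ``.env`` files nor mirrors the actual settings classes. ``ServerSettings`` and ``load_settings`` are hypothetical names.

```python
import os
from dataclasses import dataclass

PREFIX = "AUTOINTENT_"


@dataclass(frozen=True)
class ServerSettings:
    path: str
    host: str
    port: int
    transport: str  # only meaningful for the MCP server


def load_settings(env=None, default_port=8013):
    """Resolve settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    if PREFIX + "PATH" not in env:
        raise ValueError("AUTOINTENT_PATH is required")
    transport = env.get(PREFIX + "TRANSPORT", "stdio")
    if transport not in ("stdio", "http"):
        raise ValueError(f"unsupported transport: {transport!r}")
    return ServerSettings(
        path=env[PREFIX + "PATH"],
        host=env.get(PREFIX + "HOST", "127.0.0.1"),
        # Environment values are strings, so the port is converted explicitly.
        port=int(env.get(PREFIX + "PORT", default_port)),
        transport=transport,
    )
```

Passing ``default_port=8013`` models the HTTP server and ``default_port=8012`` models the MCP server over HTTP transport.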

HTTP server (FastAPI)
---------------------

**Run with Uvicorn** (recommended; module path matches the FastAPI instance):

.. code-block:: bash

   uvicorn autointent.server.http:app --host 127.0.0.1 --port 8013

Adjust ``--host`` and ``--port`` to suit your deployment. ``AUTOINTENT_PATH`` must still point at the pipeline directory.

**Run via the module entrypoint** (uses ``AUTOINTENT_HOST`` and ``AUTOINTENT_PORT`` from settings):

.. code-block:: bash

   python -c "from autointent.server.http import main; main()"

Endpoints
.........

* ``GET /health`` — returns ``{"status": "healthy"}``.
* ``POST /predict`` — JSON body and response shaped like the Pydantic models below.

**Request** (``PredictRequest``): ``{"utterances": ["text one", "text two"]}``

**Response** (``PredictResponse``): ``{"predictions": [...]}`` — one prediction per input utterance.

Predictions follow the same convention as ``Pipeline.predict``:

* Single-label: each item is an integer class id, or ``null`` for out-of-scope.
* Multi-label: each item is a list of integer class ids, or ``null`` for out-of-scope.
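
A client for ``POST /predict`` can be sketched with the standard library alone. The request and response shapes follow the description above; the helper names and the default base URL (the documented default host and port) are assumptions for illustration.

```python
import json
from urllib.request import Request, urlopen

DEFAULT_BASE_URL = "http://127.0.0.1:8013"  # documented HTTP defaults


def build_predict_request(utterances, base_url=DEFAULT_BASE_URL):
    """Build the POST /predict request without sending it."""
    body = json.dumps({"utterances": list(utterances)}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_predictions(raw_body):
    """Extract the predictions list; JSON null becomes None (out-of-scope)."""
    return json.loads(raw_body)["predictions"]


def predict(utterances, base_url=DEFAULT_BASE_URL, timeout=10.0):
    """Send the request to a running server and return its predictions."""
    request = build_predict_request(utterances, base_url)
    with urlopen(request, timeout=timeout) as response:
        return parse_predictions(response.read())
```

With a server running, ``predict(["text one", "text two"])`` returns a list with one entry per utterance; callers should handle ``None`` entries as out-of-scope.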

MCP server (FastMCP)
--------------------

**Stdio (default)** — typical for MCP clients that spawn a subprocess:

.. code-block:: bash

   python -c "from autointent.server.mcp import main; main()"

With ``AUTOINTENT_TRANSPORT`` unset or ``stdio``, ``main()`` calls ``mcp.run()`` with stdio transport.

**HTTP transport** — set ``AUTOINTENT_TRANSPORT=http`` (and optionally host/port). ``main()`` then runs with ``transport="http"`` so clients can connect to the configured TCP port (default ``8012``).

Tools
.....

* ``predict`` — arguments: ``utterances: list[str]``. Returns ``predictions`` in the same format as the HTTP API.
* ``classes`` — pagination: ``page``, ``page_size``. Returns ``classes`` (list of ``Intent`` objects: ``id``, ``name``, ``tags``, regex fields, ``description``) and ``pagination_info``.
* ``train_data`` — pagination and optional ``class_filter`` (list of class ids). Returns ``samples`` (``id``, ``text``, ``label``) and ``pagination_info``.
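
The paging and filtering behind ``classes`` and ``train_data`` can be pictured roughly as follows. This is a guess at the general technique, not the server's implementation: 1-based pages, the keys inside ``pagination_info``, and integer single-label ``label`` values are all assumptions, and both function names are hypothetical.

```python
import math


def filter_samples(samples, class_filter=None):
    """Keep samples whose label is in class_filter; None keeps everything."""
    if class_filter is None:
        return list(samples)
    wanted = set(class_filter)
    return [sample for sample in samples if sample["label"] in wanted]


def paginate(items, page=1, page_size=10):
    """Return one page of items plus hypothetical pagination metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size  # assumption: pages are numbered from 1
    return {
        "items": items[start:start + page_size],
        "pagination_info": {
            "page": page,
            "page_size": page_size,
            "total_items": len(items),
            "total_pages": math.ceil(len(items) / page_size),
        },
    }
```

Filtering before paginating keeps the page counts consistent with the filtered result rather than with the full dataset.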

See the :doc:`API reference <autoapi/autointent/index>` for full type details.