From e3f314df5f1a561017726701d8a3ce6b94126044 Mon Sep 17 00:00:00 2001
From: voorhs <ilya_alekseev_2016@list.ru>
Date: Tue, 19 May 2026 20:33:39 +0300
Subject: [PATCH 1/2] docs: extend quickstart, fix links, add adversarial
 augmentation page

Co-authored-by: Cursor <cursoragent@cursor.com>
---
 .../augmentation_tutorials/adversarial.rst    | 47 +++++++++++++++++++
 docs/source/augmentation_tutorials/index.rst  |  1 +
 docs/source/concepts.rst                      |  3 ++
 docs/source/quickstart.rst                    |  9 ++--
 user_guides/basic_usage/03_automl.py          |  4 +-
 5 files changed, 60 insertions(+), 4 deletions(-)
 create mode 100644 docs/source/augmentation_tutorials/adversarial.rst
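
Note on the new adversarial page: the accept/refine loop it describes (the generator proposes a paraphrase, the critic labels it, rejected candidates are refined until accepted or retries run out) can be sketched as below. This is only an illustration under stated assumptions, not AutoIntent's implementation: ``refine_until_human``, ``toy_rewrite``, ``toy_is_human`` and ``max_retries`` are hypothetical names, and the toy functions stand in for the LLM calls.

```python
from typing import Callable, Optional


def refine_until_human(
    utterance: str,
    rewrite: Callable[[str], str],
    is_human: Callable[[str], bool],
    max_retries: int = 3,
) -> Optional[str]:
    """Return the first candidate the critic accepts, or None if retries run out."""
    candidate = rewrite(utterance)
    for _ in range(max_retries):
        if is_human(candidate):
            return candidate
        # Rejected as "generated": refine the rejected candidate and try again.
        candidate = rewrite(candidate)
    return None


# Hypothetical stand-ins for the LLM-backed generator and critic.
def toy_rewrite(text: str) -> str:
    return text.replace("I would like to", "can i").replace("kindly ", "")


def toy_is_human(text: str) -> bool:
    return "kindly" not in text and "I would like to" not in text
```

A candidate that never passes the critic is dropped (``None``) rather than appended to the split, which matches the "retries are exhausted" case in the tutorial text.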

diff --git a/docs/source/augmentation_tutorials/adversarial.rst b/docs/source/augmentation_tutorials/adversarial.rst
new file mode 100644
index 000000000..006d76771
--- /dev/null
+++ b/docs/source/augmentation_tutorials/adversarial.rst
@@ -0,0 +1,47 @@
+.. _adversarial_human_like_augmentation:
+
+Adversarial human-like augmentation
+====================================
+
+This tutorial covers :py:class:`autointent.generation.utterances.HumanUtteranceGenerator` together with :py:class:`autointent.generation.utterances.CriticHumanLike`. The generator proposes paraphrases of training utterances; the critic asks an LLM to label each candidate as ``human`` or ``generated``. Candidates classified as ``generated`` are rejected and refined in a loop until the critic accepts them (or retries are exhausted).
+
+.. warning::
+
+   This path is **experimental** and may hurt data quality if the critic or base model misjudges natural text. Start with small ``n_final_per_class`` values and inspect the outputs.
+
+How it fits together
+--------------------
+
+- **Generator** — :py:class:`autointent.generation.Generator` wraps your chat/structured-output API (OpenAI-compatible).
+- **CriticHumanLike** — builds a JSON-schema prompt so the LLM returns ``reasoning`` and ``label`` (``human`` \| ``generated``); :py:meth:`~autointent.generation.utterances.CriticHumanLike.is_human` returns whether the utterance passed.
+- **HumanUtteranceGenerator** — orchestrates rewrite attempts per intent; :py:meth:`~autointent.generation.utterances.HumanUtteranceGenerator.augment` can append accepted samples back into a chosen split (default: train).
+
+Installation
+------------
+
+Install the OpenAI-backed generator extra (the ``Generator`` wrapper loads the OpenAI client):
+
+.. code-block:: bash
+
+    pip install "autointent[openai]"
+
+Set ``OPENAI_API_KEY`` (and, optionally, a base URL) as required by your deployment. No separate DSPy extra is needed for this augmentation path.
+
+Minimal sketch
+--------------
+
+.. code-block:: python
+
+    from autointent import Dataset
+    from autointent.generation import Generator
+    from autointent.generation.utterances import CriticHumanLike, HumanUtteranceGenerator
+
+    dataset = Dataset.from_dict({...})  # your train split, with intent names if you use them in prompts
+
+    llm = Generator(model_name="gpt-4o-mini")
+    critic = CriticHumanLike(generator=llm)
+    augmenter = HumanUtteranceGenerator(generator=llm, critic=critic, async_mode=False)
+
+    new_samples = augmenter.augment(dataset, split_name="train", n_final_per_class=3)
+
+See the API reference for full argument lists (:py:class:`~autointent.generation.utterances.HumanUtteranceGenerator`, :py:class:`~autointent.generation.utterances.CriticHumanLike`).
diff --git a/docs/source/augmentation_tutorials/index.rst b/docs/source/augmentation_tutorials/index.rst
index 38280bda6..dcc3fea7c 100644
--- a/docs/source/augmentation_tutorials/index.rst
+++ b/docs/source/augmentation_tutorials/index.rst
@@ -8,4 +8,5 @@ Data augmentation tutorials
 
    balancer
    dspy_augmentation
+   adversarial
    intent_description
diff --git a/docs/source/concepts.rst b/docs/source/concepts.rst
index 2e4816a32..caa997343 100644
--- a/docs/source/concepts.rst
+++ b/docs/source/concepts.rst
@@ -85,6 +85,9 @@ A critical capability for production text classification systems, especially in
 **🔗 Integration with Multi-Label**
    OOS detection works seamlessly with multi-label scenarios, enabling detection of completely unknown inputs vs. partial matches to known classes.
 
+**🧭 Split handling**
+   When splits contain OOS samples (``label is None``), the data handler keeps scoring stages on in-domain rows only. In hold-out mode, if ``separation_ratio`` is not configured, it can duplicate each affected split into ``{split}_0`` (OOS removed, used for scoring) and ``{split}_1`` (full data, used for decision). In cross-validation, it similarly drops OOS samples from the training folds used for scoring. Before fitting, you can check whether your data supports splitting with :py:func:`autointent.context.data_handler.check_split_readiness`.
+
 .. _concepts-presets:
 
 Optimization Presets
diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
index 79cfc0227..9247a8b1b 100644
--- a/docs/source/quickstart.rst
+++ b/docs/source/quickstart.rst
@@ -22,7 +22,7 @@ Installation
 Basic Installation
 ..................
 
-AutoIntent is compatible with Python 3.10+. For core functionality:
+AutoIntent supports Python ``>=3.10,<3.15``. For core functionality:
 
 .. code-block:: bash
 
@@ -173,10 +173,12 @@ Available Presets
 .................
 
 - ``classic-light``: Fast training with traditional ML methods
+- ``classic-medium``: Medium-budget traditional ML search
 - ``classic-heavy``: Comprehensive search with traditional methods
 - ``nn-medium``: Classic neural network-based approaches (RNN, CNN)
 - ``nn-heavy``: Comprehensive neural network optimization
 - ``transformers-light``: Transformer models with limited search
+- ``transformers-heavy``: Deeper transformer search (more compute)
 - ``transformers-no-hpo``: Transformer models without hyperparameter optimization
 - ``zero-shot-llm``: Zero-shot classification using OpenAI models
 - ``zero-shot-encoders``: Zero-shot classification using transformer models
@@ -230,8 +232,9 @@ For more control, use individual components without AutoML:
 Available Modules
 .................
 
-- **Scoring**: :class:`autointent.modules.KNNScorer`, :class:`autointent.modules.BertScorer`, :class:`autointent.modules.SklearnScorer`, :class:`autointent.modules.CatBoostScorer`
-- **Decision**: :class:`autointent.modules.ArgmaxDecision`,  :class:`autointent.modules.TunableDecision`, :class:`autointent.modules.AdaptiveDecision`
+- **Embedding**: :class:`autointent.modules.RetrievalAimedEmbedding`, :class:`autointent.modules.LogregAimedEmbedding`
+- **Scoring**: :class:`autointent.modules.KNNScorer`, :class:`autointent.modules.RerankScorer`, :class:`autointent.modules.GCNScorer`, :class:`autointent.modules.MLKnnScorer`, :class:`autointent.modules.BertScorer`, :class:`autointent.modules.SklearnScorer`, :class:`autointent.modules.CatBoostScorer`, and description-based scorers such as :class:`autointent.modules.BiEncoderDescriptionScorer`, :class:`autointent.modules.CrossEncoderDescriptionScorer`, :class:`autointent.modules.LLMDescriptionScorer`
+- **Decision**: :class:`autointent.modules.ArgmaxDecision`, :class:`autointent.modules.ThresholdDecision`, :class:`autointent.modules.JinoosDecision`, :class:`autointent.modules.TunableDecision`, :class:`autointent.modules.AdaptiveDecision`
 
 See more at API reference  :doc:`Modules <autoapi/autointent/modules/index>`.
 
diff --git a/user_guides/basic_usage/03_automl.py b/user_guides/basic_usage/03_automl.py
index 909a5d4d2..29a3f5acc 100644
--- a/user_guides/basic_usage/03_automl.py
+++ b/user_guides/basic_usage/03_automl.py
@@ -52,6 +52,8 @@
 
 # %% [markdown]
 """
+The same preset can also be loaded as a typed %mddoclink(class,,OptimizationConfig) via ``OptimizationConfig.from_preset("classic-light")`` and passed to %mddoclink(method,Pipeline,from_optimization_config) when you want a validated configuration object instead of editing the raw dict from ``load_preset``.
+
 You can inspect the structure and default values of any preset:
 """
 
@@ -77,7 +79,7 @@
 
 # %% [markdown]
 """
-See tutorial %mddoclink(notebook,advanced.03_search_space_configuration) on how the search space is structured.
+See tutorial %mddoclink(notebook,advanced.03_automl) on how the search space is structured.
 """
 
 # %% [markdown]

From 6dbfe05e819b8ddcbd8656d3bf38e48fcfaf55ae Mon Sep 17 00:00:00 2001
From: voorhs <ilya_alekseev_2016@list.ru>
Date: Tue, 19 May 2026 23:33:20 +0300
Subject: [PATCH 2/2] revert quickstart page

---
 docs/source/quickstart.rst | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
index 9247a8b1b..79cfc0227 100644
--- a/docs/source/quickstart.rst
+++ b/docs/source/quickstart.rst
@@ -22,7 +22,7 @@ Installation
 Basic Installation
 ..................
 
-AutoIntent supports Python ``>=3.10,<3.15``. For core functionality:
+AutoIntent is compatible with Python 3.10+. For core functionality:
 
 .. code-block:: bash
 
@@ -173,12 +173,10 @@ Available Presets
 .................
 
 - ``classic-light``: Fast training with traditional ML methods
-- ``classic-medium``: Medium-budget traditional ML search
 - ``classic-heavy``: Comprehensive search with traditional methods
 - ``nn-medium``: Classic neural network-based approaches (RNN, CNN)
 - ``nn-heavy``: Comprehensive neural network optimization
 - ``transformers-light``: Transformer models with limited search
-- ``transformers-heavy``: Deeper transformer search (more compute)
 - ``transformers-no-hpo``: Transformer models without hyperparameter optimization
 - ``zero-shot-llm``: Zero-shot classification using OpenAI models
 - ``zero-shot-encoders``: Zero-shot classification using transformer models
@@ -232,9 +230,8 @@ For more control, use individual components without AutoML:
 Available Modules
 .................
 
-- **Embedding**: :class:`autointent.modules.RetrievalAimedEmbedding`, :class:`autointent.modules.LogregAimedEmbedding`
-- **Scoring**: :class:`autointent.modules.KNNScorer`, :class:`autointent.modules.RerankScorer`, :class:`autointent.modules.GCNScorer`, :class:`autointent.modules.MLKnnScorer`, :class:`autointent.modules.BertScorer`, :class:`autointent.modules.SklearnScorer`, :class:`autointent.modules.CatBoostScorer`, and description-based scorers such as :class:`autointent.modules.BiEncoderDescriptionScorer`, :class:`autointent.modules.CrossEncoderDescriptionScorer`, :class:`autointent.modules.LLMDescriptionScorer`
-- **Decision**: :class:`autointent.modules.ArgmaxDecision`, :class:`autointent.modules.ThresholdDecision`, :class:`autointent.modules.JinoosDecision`, :class:`autointent.modules.TunableDecision`, :class:`autointent.modules.AdaptiveDecision`
+- **Scoring**: :class:`autointent.modules.KNNScorer`, :class:`autointent.modules.BertScorer`, :class:`autointent.modules.SklearnScorer`, :class:`autointent.modules.CatBoostScorer`
+- **Decision**: :class:`autointent.modules.ArgmaxDecision`,  :class:`autointent.modules.TunableDecision`, :class:`autointent.modules.AdaptiveDecision`
 
 See more at API reference  :doc:`Modules <autoapi/autointent/modules/index>`.