deeppavlov · voorhs · May 19, 2026
diff --git a/docs/source/augmentation_tutorials/adversarial.rst b/docs/source/augmentation_tutorials/adversarial.rst
@@ -0,0 +1,47 @@
+.. _adversarial_human_like_augmentation:
+
+Adversarial human-like augmentation
+====================================
+
+This tutorial covers :py:class:`autointent.generation.utterances.HumanUtteranceGenerator` together with :py:class:`autointent.generation.utterances.CriticHumanLike`. The generator proposes paraphrases of training utterances; the critic asks an LLM to label each candidate as ``human`` or ``generated``. Candidates classified as ``generated`` are rejected and refined in a loop until the critic accepts them (or retries are exhausted).
+
+.. warning::
+
+   This path is **experimental** and may hurt data quality if the critic or base model misjudges natural text. Start with small ``n_final_per_class`` values and inspect the outputs.
+
+How it fits together
+--------------------
+
+- **Generator** — :py:class:`autointent.generation.Generator` wraps your chat/structured-output API (OpenAI-compatible).
+- **CriticHumanLike** — builds a JSON-schema prompt so the LLM returns ``reasoning`` and ``label`` (``human`` \| ``generated``); :py:meth:`~autointent.generation.utterances.CriticHumanLike.is_human` returns whether the utterance passed.
+- **HumanUtteranceGenerator** — orchestrates rewrite attempts per intent; :py:meth:`~autointent.generation.utterances.HumanUtteranceGenerator.augment` can append accepted samples back into a chosen split (default: train).
+
+Installation
+------------
+
+Install the OpenAI-backed generator extra (the ``Generator`` wrapper loads the OpenAI client):
+
+.. code-block:: bash
+
+    pip install "autointent[openai]"
+
+Set ``OPENAI_API_KEY`` (and, optionally, a base URL) as required by your deployment. No separate DSPy extra is needed for this augmentation path.
+
+Minimal sketch
+--------------
+
+.. code-block:: python
+
+    from autointent import Dataset
+    from autointent.generation import Generator
+    from autointent.generation.utterances import CriticHumanLike, HumanUtteranceGenerator
+
+    dataset = Dataset.from_dict({...})  # your train split, with intent names if you use them in prompts
+
+    llm = Generator(model_name="gpt-4o-mini")
+    critic = CriticHumanLike(generator=llm)
+    augmenter = HumanUtteranceGenerator(generator=llm, critic=critic, async_mode=False)
+
+    new_samples = augmenter.augment(dataset, split_name="train", n_final_per_class=3)
+
+See the API reference for full argument lists (:py:class:`~autointent.generation.utterances.HumanUtteranceGenerator`, :py:class:`~autointent.generation.utterances.CriticHumanLike`).
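
Note on the tutorial above: the reject-and-refine loop it describes can be illustrated without the library. The sketch below is not autointent's implementation. `parse_verdict`, `refine_until_accepted`, `toy_rewrite`, and `toy_critic` are hypothetical names, and the toy functions stand in for real LLM calls. It assumes only what the tutorial states: the critic returns JSON with ``reasoning`` and ``label`` (``human`` or ``generated``), and rejected candidates are rewritten until accepted or until retries run out.

```python
import json
from typing import Callable, Optional

ALLOWED_LABELS = {"human", "generated"}


def parse_verdict(raw: str) -> bool:
    """Return True when the critic's JSON reply labels the text as human."""
    payload = json.loads(raw)
    label = payload["label"]
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label == "human"


def refine_until_accepted(
    seed: str,
    rewrite: Callable[[str], str],
    critic: Callable[[str], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Rewrite the text until the critic accepts it or attempts run out."""
    candidate = seed
    for _ in range(max_attempts):
        candidate = rewrite(candidate)
        if parse_verdict(critic(candidate)):
            return candidate
    return None  # every attempt was judged "generated"


def toy_rewrite(text: str) -> str:
    """Hypothetical stand-in for the LLM paraphraser."""
    return text.replace("I would like to", "can I", 1)


def toy_critic(text: str) -> str:
    """Hypothetical stand-in for the LLM critic; flags one stiff phrase."""
    label = "generated" if "I would like to" in text else "human"
    return json.dumps({"reasoning": "toy rule", "label": label})


accepted = refine_until_accepted(
    "I would like to reset my password", toy_rewrite, toy_critic
)
print(accepted)
```

Returning ``None`` on exhaustion keeps rejected candidates out of the augmented split, which matches the warning in the tutorial about inspecting outputs before trusting them.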
diff --git a/docs/source/augmentation_tutorials/index.rst b/docs/source/augmentation_tutorials/index.rst
@@ -8,4 +8,5 @@ Data augmentation tutorials
 
    balancer
    dspy_augmentation
+   adversarial
    intent_description
diff --git a/docs/source/concepts.rst b/docs/source/concepts.rst
@@ -85,6 +85,9 @@ A critical capability for production text classification systems, especially in
 **🔗 Integration with Multi-Label**
    OOS detection works seamlessly with multi-label scenarios, enabling detection of completely unknown inputs vs. partial matches to known classes.
 
+**🧭 Split handling**
+   When splits contain OOS samples (``label is None``), the data handler restricts scoring stages to in-domain rows. In hold-out mode, if ``separation_ratio`` is not configured, it can duplicate each affected split into ``{split}_0`` (OOS removed, used for scoring) and ``{split}_1`` (full data, used for decision). In cross-validation, it similarly drops OOS samples from the training folds used for scoring. Before fitting, you can check whether your data supports splitting with :py:func:`autointent.context.data_handler.check_split_readiness`.
+
 .. _concepts-presets:
 
 Optimization Presets
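
Note on the "Split handling" paragraph in the hunk above: the ``{split}_0`` / ``{split}_1`` duplication can be pictured with plain dictionaries. This is a minimal sketch under stated assumptions, not the data handler's actual code. `separate_oos` is a hypothetical helper, and the sample layout (`utterance`, `label`) is assumed for illustration only.

```python
from typing import Dict, List, Optional, Union

Row = Dict[str, Union[str, Optional[int]]]


def separate_oos(splits: Dict[str, List[Row]]) -> Dict[str, List[Row]]:
    """Duplicate every split that contains OOS rows (label is None).

    `{name}_0` keeps in-domain rows only (for scoring);
    `{name}_1` keeps the full data (for decision).
    Splits without OOS rows are passed through unchanged.
    """
    result: Dict[str, List[Row]] = {}
    for name, rows in splits.items():
        if any(row["label"] is None for row in rows):
            result[f"{name}_0"] = [r for r in rows if r["label"] is not None]
            result[f"{name}_1"] = list(rows)
        else:
            result[name] = list(rows)
    return result


splits = {
    "train": [
        {"utterance": "reset my password", "label": 0},
        {"utterance": "what is the weather on Mars", "label": None},  # OOS
    ],
    "test": [{"utterance": "close my account", "label": 1}],
}
separated = separate_oos(splits)
print(sorted(separated))
```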

diff --git a/user_guides/basic_usage/03_automl.py b/user_guides/basic_usage/03_automl.py
@@ -52,6 +52,8 @@
 
 # %% [markdown]
 """
+The same preset can also be loaded as a typed %mddoclink(class,,OptimizationConfig) via ``OptimizationConfig.from_preset("classic-light")`` and passed to %mddoclink(method,Pipeline,from_optimization_config) when you want a validated configuration object instead of editing the raw dict from ``load_preset``.
+
 You can inspect the structure and default values of any preset:
 """
 
@@ -77,7 +79,7 @@
 
 # %% [markdown]
 """
-See tutorial %mddoclink(notebook,advanced.03_search_space_configuration) on how the search space is structured.
+See tutorial %mddoclink(notebook,advanced.03_automl) on how the search space is structured.
 """
 
 # %% [markdown]