[FEATURE] Extensible tag classification model discovery through Entry Points #463
Open
Roel Bollens (RoelBollens-TomTom) wants to merge 23 commits into dev
Seth Fitzsimmons (sethfitz)
requested changes
Mar 12, 2026
Collaborator
Victor Schappert (vcschapp)
left a comment
Left some comments, but I'm generally aligned and would merge once Roel Bollens (@RoelBollens-TomTom) and Seth Fitzsimmons (@mojodna) are jointly aligned on merging.
Left some thoughts on the AND/OR issue in the CLI, probably above there somewhere. 👆
Copilot started reviewing on behalf of
Roel Bollens (RoelBollens-TomTom)
April 14, 2026 16:22
Copilot started reviewing on behalf of
Roel Bollens (RoelBollens-TomTom)
April 14, 2026 18:13
Copilot started reviewing on behalf of
Roel Bollens (RoelBollens-TomTom)
April 14, 2026 18:48
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Co-authored-by: Seth Fitzsimmons <sethfitz@amazon.com> Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Co-authored-by: Seth Fitzsimmons <sethfitz@amazon.com> Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
… filtering logic
- Removes overture tag provider (was deferred)
- Simplified tags
- Reserved tags instead of reserved namespaces
- Fixes small issue introduced in earlier commit
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
… CLI commands Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
`filter_models` selects feature types from the registry through three
combinators applied to the same tag grammar (plain `feature`,
namespaced `system:extension`, or compound `overture:theme=buildings`):
    --tag      OR      defines scope (any-of)
    --filter   AND     narrows scope (all-of)
    --exclude  OR-NOT  subtracts (none-of)
    --type     OR      closed-list match on ModelKey.name (orthogonal)
    T = ⋃ tag predicates     (absent → U)
    F = ⋂ filter predicates  (absent → U)
    E = ⋃ exclude predicates (absent → ∅)
    result = (T ∩ F \ E) restricted to type_names if non-empty
The mental model is procedural: --tag widens, --filter narrows,
--exclude subtracts. Without --tag the scope is every registered
model. An empty selector imposes no filtering.
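The set algebra above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PR's `filter_models`: the function name `select_models`, the dict-shaped registry, and the sample tags are hypothetical, and each predicate is simplified to exact tag membership.

```python
def select_models(registry, tags=(), filters=(), excludes=(), type_names=()):
    """Return names matching (T ∩ F) minus E, restricted to type_names.

    `registry` maps a model name to its set of tags (hypothetical shape).
    """
    selected = []
    for name, model_tags in registry.items():
        in_t = not tags or any(t in model_tags for t in tags)  # absent -> U
        in_f = all(f in model_tags for f in filters)           # absent -> U
        in_e = any(e in model_tags for e in excludes)          # absent -> empty
        in_type = not type_names or name in type_names         # orthogonal
        if in_t and in_f and not in_e and in_type:
            selected.append(name)
    return selected


# Sample registry; the names and tag assignments are made up.
registry = {
    "building": {"feature", "overture", "overture:theme=buildings"},
    "segment": {"feature", "overture", "overture:theme=transportation"},
    "sketch": {"feature", "draft"},
}

# --tag widens (any-of), --exclude subtracts (none-of).
print(select_models(registry, tags=["overture", "draft"], excludes=["draft"]))
```

With no arguments every registered model is returned, which matches the statement that an empty selector imposes no filtering.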
A `TagSelector` value object carries the three tag predicates:
    class TagSelector:
        include_any: tuple[str, ...] = ()
        require_all: tuple[str, ...] = ()
        exclude_any: tuple[str, ...] = ()
Field names encode the combinator (any-of / all-of / none-of),
deliberately distinct from CLI flag names. Flags are user-facing
affordances; field names are implementation-facing and self-document
at the call site.
`type_names` lives on `filter_models` as a keyword, not on
`TagSelector`. It's a closed-list match on `ModelKey.name`, orthogonal
to the tag predicate algebra. Isolating it makes `TagSelector`'s
purpose statable in one sentence and confines a future fold-in of
`--type` to a kwarg deletion that doesn't disturb `TagSelector`.
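A sketch of how the value object and the separate `type_names` keyword might fit together. The three field names come from the commit message; everything else is an assumption made for illustration: the frozen dataclass, the `matches` helper, the name `filter_models_sketch`, and the dict-shaped `models` argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagSelector:
    """Immutable value object holding the three tag predicates."""

    include_any: tuple[str, ...] = ()  # any-of  (--tag)
    require_all: tuple[str, ...] = ()  # all-of  (--filter)
    exclude_any: tuple[str, ...] = ()  # none-of (--exclude)

    def matches(self, tags: frozenset[str]) -> bool:
        # Hypothetical helper: an empty tuple imposes no constraint.
        if self.include_any and not any(t in tags for t in self.include_any):
            return False
        if not all(t in tags for t in self.require_all):
            return False
        return not any(t in tags for t in self.exclude_any)


def filter_models_sketch(models, selector=TagSelector(), *, type_names=()):
    """Filter `models` (assumed shape: type name -> tags) by selector and name."""
    return [
        name
        for name, tags in models.items()
        if selector.matches(frozenset(tags))
        and (not type_names or name in type_names)  # kept outside TagSelector
    ]
```

Because `type_names` never touches `TagSelector`, removing the keyword later would leave the selector and its callers unchanged, which is the isolation the commit message argues for.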
User-facing help text frames flags as acting on feature types
("Include feature types with these tags — defines scope (OR;
repeatable)"). Internal API docstrings keep "models" since they
describe the Python class layer; "feature types" is the user-facing
vocabulary for entry-point-registered top-level types, distinct from
the Pydantic models used for nested fields.
Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
Use provider_key.name (always a string) instead of provider.__name__, which raises AttributeError when a provider is a callable instance without __name__ — masking the original error inside the except block. Add exc_info=True to preserve the traceback in the warning. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
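The failure mode described in this commit can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the PR's code: `BrokenProvider`, `run_provider`, and the logger name are hypothetical. It shows why a callable instance has no `__name__` and how logging the registry key with `exc_info=True` keeps the original traceback.

```python
import logging

logger = logging.getLogger("discovery_sketch")  # hypothetical logger name


class BrokenProvider:
    """Callable instance: unlike a function, it has no __name__ attribute."""

    def __call__(self, models):
        raise ValueError("bad model definition")


def run_provider(provider_name, provider, models):
    """Call one provider; log and return no tags if it raises."""
    try:
        return list(provider(models))
    except Exception:
        # provider_name is the registry key (always a str). Reading
        # provider.__name__ here would raise AttributeError for callable
        # instances and hide the error being handled.
        logger.warning("tag provider %r failed", provider_name, exc_info=True)
        return []


print(hasattr(BrokenProvider(), "__name__"))
print(run_provider("broken", BrokenProvider(), []))
```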
Replace unittest.TestCase classes with module-level pytest functions parametrized over the tag lists. Per-tag parametrization isolates failures to the offending input instead of stopping at the first assertion in a loop. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
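The conversion this commit describes can be illustrated as follows. It is only a sketch of the pattern: `is_valid_tag`, its regular expression, and the tag lists are hypothetical stand-ins for the code under test.

```python
import re

import pytest

# Hypothetical stand-in for the validator under test.
_TAG = re.compile(r"[a-z0-9][a-z0-9._-]*")


def is_valid_tag(tag: str) -> bool:
    return _TAG.fullmatch(tag) is not None


VALID_TAGS = ["feature", "draft", "my-tag", "v1.2"]
INVALID_TAGS = ["", "Feature", "-leading", "has space"]


# One test case per tag: a failure names the offending input instead of
# stopping at the first failed assertion inside a loop.
@pytest.mark.parametrize("tag", VALID_TAGS)
def test_valid_tag_accepted(tag):
    assert is_valid_tag(tag)


@pytest.mark.parametrize("tag", INVALID_TAGS)
def test_invalid_tag_rejected(tag):
    assert not is_valid_tag(tag)
```

Each parametrized input is reported as a separate test, so one bad tag does not mask the results for the others.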
Fixes D100 reported by pydocstyle / make docformat. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
Plain tags, namespaces, and predicates now share a single TAG_PART pattern: lowercase alphanumeric start followed by alphanumeric, hyphen, underscore, or dot. Values remain case-permissive. Drops the prior asymmetry where namespaces and predicates allowed dots but plain tags did not. Make generate_tags private (its sole caller is discover_models) and broaden TagProvider's return type to Iterable[str] so providers can yield, return lists, or return sets. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
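One possible reading of the shared pattern, written as a regular expression. The exact character classes are assumptions: the commit text does not say whether characters after the first must also be lowercase, nor which characters a value may contain. `TAG_RE` and `parse_tag` are hypothetical names.

```python
import re

# Assumed reading: lowercase alphanumeric start, then lowercase
# alphanumerics, hyphen, underscore, or dot.
TAG_PART = r"[a-z0-9][a-z0-9._-]*"

# [prefix:]key[=value]; the value is case-permissive (assumed here to be
# any non-empty run of characters other than ':' and '=').
TAG_RE = re.compile(
    rf"(?:(?P<prefix>{TAG_PART}):)?(?P<key>{TAG_PART})(?:=(?P<value>[^:=]+))?"
)


def parse_tag(tag):
    """Split a tag into (prefix, key, value); missing parts are None."""
    match = TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"invalid tag: {tag!r}")
    return match.group("prefix"), match.group("key"), match.group("value")


print(parse_tag("overture:theme=buildings"))
```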
The provider's first argument is the value loaded from an `overture.models` entry point. For discriminated-union features (e.g. `Segment`) that's `Annotated[Union[...], Field(...)]`, not `type[BaseModel]` -- the prior signature was a lie. Widen `TagProvider` and the in-tree providers to accept `Any` and document the boundary in `discovery/types.py`.
Strip `typing_util.collect_types` to the cases discovery actually meets today: `Annotated`, `Union`/`X | Y`, plain class. Drop the unreached `NewType` and `Literal` branches. Point at `overture-schema-codegen`'s `extraction/type_analyzer.py:analyze_type` as the more capable implementation, with consolidation across system, core, and cli flagged as future work.
`theme_provider` extracts the theme via `_theme_literal`, which asserts that `theme` is a single-value `str` `Literal[...]` and raises `TypeError` otherwise. `_generate_tags` catches and logs at WARNING, so third-party model-definition bugs surface visibly without crashing discovery.
Promote tag-rejection logging from DEBUG to WARNING so authorization failures (invalid tags, reserved tags, reserved namespaces) don't disappear silently in normal operation.
Convert filter tests from direct `_filter_tags` calls to a fake `TagProvider` driven through `_generate_tags`. Tests now exercise provider invocation and merge wiring, not just the filter, and decouple from the private filter name. Provider-behavior tests still call the providers directly.
Add discriminated-union coverage for both `feature_provider` and `theme_provider`, plus a `TypeError` case for a non-Literal `theme`.
Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
Add Discovery and Tagging sections to system's README, covering the overture.models / overture.tag_providers entry point groups, the tag format, provider contract, namespace and tag reservation, the built-in providers, and TagSelector-based filtering. Update core's README: replace the stale Discovery bullet (discovery has moved to system) with one describing the authority and theme tag providers core contributes. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
Tag providers now receive the concrete BaseModel subclasses for the entry point instead of the raw entry-point value. _generate_tags walks the model once via collect_types and passes the result to every provider, so providers can't forget to handle discriminated unions and the walk happens once per model rather than once per provider.
The TagProvider type alias drops Any in favor of Iterable[type[BaseModel]], honestly typing what providers receive. The first arg of _generate_tags is annotated Any to match the entry-point loader, which yields union expressions that aren't type[BaseModel].
All three registered providers (feature_provider, authority_provider, theme_provider) update to the new signature; unit tests pass concrete classes directly while union-handling tests move to the _generate_tags integration boundary, where the walk now lives.
Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
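A provider under the new signature might look like the sketch below. It is a guess at the shape rather than the PR's code: plain classes stand in for `BaseModel` subclasses, `theme_literal` approximates the `_theme_literal` check described in an earlier commit, and treating a missing `theme` annotation as an error is an assumption.

```python
from typing import Iterable, Literal, get_args, get_origin, get_type_hints


def theme_literal(model):
    """Return the single str value of the model's `theme: Literal[...]` hint."""
    hint = get_type_hints(model).get("theme")
    args = get_args(hint)
    if get_origin(hint) is not Literal or len(args) != 1 or not isinstance(args[0], str):
        raise TypeError(f"{model.__name__}.theme must be a single-value str Literal")
    return args[0]


def theme_provider(models: Iterable[type]) -> Iterable[str]:
    """Receive already-collected concrete classes; yield one tag per class."""
    for model in models:
        yield f"overture:theme={theme_literal(model)}"


class Building:  # stand-in for a BaseModel subclass
    theme: Literal["buildings"]


class Road:  # stand-in for a BaseModel subclass
    theme: Literal["transportation"]


print(list(theme_provider([Building, Road])))
```

Because the provider only ever sees concrete classes, it contains no union handling of its own.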
Per discussion in the coding sesh. Signed-off-by: Victor Schappert <schapper@amazon.com>
The module defines ModelKey and TagProviderKey -- key types, not domain models. Rename clarifies intent and avoids confusion with theme model modules elsewhere in the codebase. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Click 8.1 introduced typed decorator returns that preserve the TypeVar in `tag_selection_options`, so the lowest-direct mypy job no longer reports `Callable[..., Any]` reassignments. The 8.0 floor predated this and only affected lowest-direct. Signed-off-by: Seth Fitzsimmons <seth@mojodna.net>
…lain tag that were missed in the authority_provider removal Signed-off-by: Roel <75250264+RoelBollens-TomTom@users.noreply.github.com>
Extensible tag classification model discovery through Entry Points
This replaces the hardcoded model classification system with tag-based model classification and discovery through Entry Points. It is based on #440 by Seth and on several ad-hoc schema coding sessions in which Seth, Vic, Dana, Tristan and Roel participated.
Model discovery moved into `system`, eliminating assumptions about Overture in the process. The hardcoded `namespace` concept (`"overture"`, `"annex"`) and the `ModelKind` classifier are replaced with tags -- string labels derived by tag providers. Tags become the filtering, grouping, and classification mechanism for model discovery, driven by introspection and package metadata rather than central coordination.
`system` provides generic tag-based grouping without understanding what any particular tag means. Any package can register tag providers that classify models without special support in the discovery layer.
Purpose
Tags serve three roles:
- … (`--tag system:feature`, `--tag draft`)
These roles overlap -- a tag like `overture:theme=buildings` serves both filtering and taxonomy. The design accommodates this overlap through structured tags that encode both ownership and dimension.
Tag Format
Tags are strings following the pattern `[prefix:]key[=value]`:
- `overture`, `draft`, `feature`
- `system:extension` -- `:` separates ownership
- `overture:theme=buildings`
`:` signals ownership and enables prefix reservation (see Privileged Packages and Tag Reservation). `=` signals a dimension with a value (groupable via `--group-by`). One level of each -- no nested colons or multiple `=` signs.
Minimal launch set
- `feature` (was: `system:feature`)
- `overture:theme=<theme>` (`buildings`, `transportation`)
- `overture` (was: `overture:official`)
Reserved tags
Tags can be reserved either as simple tags or by namespace. These are the tags and namespaces that are currently reserved, each listed with its package:
- `feature` -- `overture-schema-system`
- `overture` -- `overture-schema-core`
- `overture:*` -- `overture-schema-core`
- `system:*` -- `overture-schema-system`
Extensions
Additional extensions and their accompanying tags will be introduced in a future PR. Extensions allow existing types to be augmented with new fields (columns).
CLI
The `list-types` command has been updated to support filtering and grouping by tags. Currently, it no longer displays the description or fully qualified class name. The `json-schema` and `validate` commands from the overture-schema CLI and the `generate` command from the overture-codegen CLI have been updated to filter on tags instead of by theme and type. Further changes can be introduced in a future update.
Examples
Deviations
Closes #512