fix(InlineModelResolver): prevent numbered duplicate models from multi-file OAS 3.1 specs #23856

Open
Shaun-3adesign wants to merge 2 commits into OpenAPITools:master from Shaun-3adesign:fix/inline-model-resolver-deduplicate-existing-components

Shaun-3adesign commented May 23, 2026

fix #23854

Problem

Multi-file OAS 3.1 specs produce numbered duplicate models (e.g.
DeletionRequest1, FlowSegmentPost1, ContainerMapping1) due to three
related bugs in InlineModelResolver:

  1. Pre-existing components/schemas entries were not seeded into the
    deduplication map before flattening, so identical inline schemas were
    re-registered under numbered names.
  2. OAS 3.1 allows $ref + sibling description (per JSON Schema 2020-12);
    the parser produces a Schema with an overridden description, making the
    content hash differ from the canonical registered schema.
  3. The Swagger Parser shares a single resolved Schema object across all
    usages of the same external file (e.g. uuid.json). Properties that
    carry a sibling description overwrite the shared object's description
    in-place, so two Schema instances from the same source file end up with
    entirely different serialised content depending on processing order.
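
The third bug can be illustrated with a toy model. The sketch below is not the swagger-parser code: `ToySchema`, `serialise()` and `resolveWithSibling()` are hypothetical names, and the string form stands in for whatever serialisation the resolver hashes. It only shows how an in-place overwrite on a shared instance makes the serialised content depend on which property was processed last.

```java
// Hypothetical stand-in for a parsed schema; NOT the swagger-parser Schema class.
class SharedSchemaDemo {
    static class ToySchema {
        String type;
        String description;

        ToySchema(String type, String description) {
            this.type = type;
            this.description = description;
        }

        // Stand-in for the content serialisation used as a deduplication key.
        String serialise() {
            return "{type=" + type + ", description=" + description + "}";
        }
    }

    // Simulates a resolver that hands the SAME instance to every property that
    // references the external file and applies the sibling description in place.
    static String resolveWithSibling(ToySchema shared, String siblingDescription) {
        shared.description = siblingDescription; // overwrites the shared object
        return shared.serialise();
    }

    public static void main(String[] args) {
        ToySchema uuid = new ToySchema("string", "A UUID");
        String first = resolveWithSibling(uuid, "Source identifier");
        String second = resolveWithSibling(uuid, "Flow identifier");
        // Same source schema, two different serialisations.
        System.out.println(first.equals(second)); // prints false
    }
}
```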

Fix

  • Seed generatedSignature from components/schemas at the start of flatten();
    after the second commit, only entries with an explicit title are seeded.
  • Add a structural deduplication map (generatedStructuralSignature) keyed on
    a description-free serialisation. Uses a Jackson ObjectMapper with a
    @JsonIgnoreProperties({"description"}) MixIn registered on Schema.class,
    which suppresses description recursively across the entire schema graph.
    The exact-hash path is tried first to avoid false-positive merges.
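
The lookup order described in the Fix list can be sketched as follows. This is a minimal illustration, not the code in this PR: the class `TwoTierDedup` and its methods are hypothetical, the keys are passed in as plain strings, and the "first registration wins" rule for the structural map is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-tier lookup: exact content key first, structural key as fallback.
class TwoTierDedup {
    private final Map<String, String> exact = new HashMap<>();      // full-content key -> model name
    private final Map<String, String> structural = new HashMap<>(); // description-free key -> model name

    void register(String fullKey, String structuralKey, String name) {
        exact.put(fullKey, name);
        // Assumption: the first schema registered under a structural key keeps it.
        structural.putIfAbsent(structuralKey, name);
    }

    // Returns the existing model name, or null when neither map has a match.
    String match(String fullKey, String structuralKey) {
        String name = exact.get(fullKey);
        return name != null ? name : structural.get(structuralKey);
    }

    public static void main(String[] args) {
        TwoTierDedup dedup = new TwoTierDedup();
        dedup.register("full-A", "struct-A", "DeletionRequest");
        // Same structure, different descriptions: the exact key misses, the structural key hits.
        System.out.println(dedup.match("full-A-other-description", "struct-A")); // prints DeletionRequest
    }
}
```

Trying the exact map first means a schema whose full content is already registered always resolves to its own name, and the structural map is only consulted when that lookup misses.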

Tests

Four regression tests added to InlineModelResolverTest, plus a set of
multi-file YAML fixtures under src/test/resources/3_0/inline-model-resolver-dedup/
that model the structure of the spec that triggered the bugs.


Summary by cubic

Stops InlineModelResolver from creating numbered duplicate models when flattening multi-file OpenAPI 3.1 specs. Keeps component schema names stable and avoids false-positive matches for untitled components.

  • Bug Fixes
    • Pre-populate the dedup map only with existing titled components/schemas, so canonical names are reused and numbered variants are prevented, without hijacking untitled inline schemas.
    • Add a structural dedup map that ignores all description fields (via a Jackson MixIn) as a fallback after the exact content hash, covering $ref + sibling descriptions and parser-shared/mutated schemas.
    • Add four regression tests and multi-file YAML fixtures to verify no *_1 duplicates and to protect untitled-schema naming and discriminator behavior.

Written for commit ff3cfc3. Summary will update on new commits.

…i-file OAS 3.1 specs

Three related bugs caused InlineModelResolver to generate numbered duplicate
models (e.g. DeletionRequest1, FlowSegmentPost1, ContainerMapping1) when
processing multi-file OpenAPI 3.1 specs:

1. Pre-existing components/schemas not seeded into deduplication map

   When the spec already declares schemas in components/schemas, they were
   not registered in generatedSignature before flattening began.  Any inline
   schema encountered during flattening that was structurally identical to a
   pre-declared component would be given a new numbered name instead of
   reusing the existing one.

   Fix: iterate over components.schemas at the start of flatten() and call
   addGenerated() for each entry.
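
   A toy illustration of why seeding matters, under stated assumptions: `SeedingDemo`, `seed()` and `flattenInline()` are hypothetical names, schemas are represented by their serialised strings, and the numeric suffix format is chosen only to mirror the names quoted above. It is not the resolver's actual naming logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of name assignment during flattening.
class SeedingDemo {
    final Map<String, String> components = new LinkedHashMap<>(); // model name -> serialised schema
    final Map<String, String> signatures = new LinkedHashMap<>(); // serialised schema -> model name

    // Copies every pre-declared component into the signature map before flattening.
    void seed() {
        for (Map.Entry<String, String> e : components.entrySet()) {
            signatures.putIfAbsent(e.getValue(), e.getKey());
        }
    }

    // Returns the model name used for an inline schema with the given suggested name.
    String flattenInline(String suggestedName, String serialised) {
        String existing = signatures.get(serialised);
        if (existing != null) {
            return existing; // reuse the already registered model
        }
        String name = suggestedName;
        int n = 0;
        while (components.containsKey(name)) {
            name = suggestedName + (++n); // name taken: append a number
        }
        components.put(name, serialised);
        signatures.put(serialised, name);
        return name;
    }

    public static void main(String[] args) {
        SeedingDemo unseeded = new SeedingDemo();
        unseeded.components.put("DeletionRequest", "{type=object}");
        System.out.println(unseeded.flattenInline("DeletionRequest", "{type=object}")); // prints DeletionRequest1

        SeedingDemo seeded = new SeedingDemo();
        seeded.components.put("DeletionRequest", "{type=object}");
        seeded.seed();
        System.out.println(seeded.flattenInline("DeletionRequest", "{type=object}")); // prints DeletionRequest
    }
}
```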

2. OAS 3.1 $ref + sibling description produces a different content hash

   OpenAPI 3.1 (following JSON Schema 2020-12) allows a $ref alongside a
   sibling description.  The Swagger Parser resolves this by producing a
   Schema object whose description is overridden by the property-level value.
   Two usages of the same $ref with different sibling descriptions produce
   Schema objects with different serialised content, causing the content-hash
   lookup to miss the already-registered schema.

   Fix: in matchGenerated(), after the exact-hash lookup fails, retry with
   the top-level description field set to null.  In addGenerated(), register
   a description-stripped key as well as the full-content key.

3. Swagger Parser shares and mutates resolved Schema objects across usages

   When setResolve(true) is used, the parser shares a single resolved Schema
   object for each external file (e.g. uuid.json) across every property that
   references it.  Each property that carries a sibling description overwrites
   the shared object's description field in-place.  This means two Schema
   instances for the same source file can end up with completely different
   property-level descriptions (and thus different content hashes) depending
   on processing order.  The top-level description strip from fix 2 is
   insufficient because the mutation affects nested property descriptions.

   Fix: add a second deduplication map (generatedStructuralSignature) keyed
   on a description-free serialisation produced by a dedicated Jackson
   ObjectMapper (structuralMapper) that has a MixIn registered on Schema.class
   with @JsonIgnoreProperties({"description"}).  Because Jackson applies the
   MixIn to the registered class and all subtypes, this recursively suppresses
   description at every level of the schema graph (properties, allOf/anyOf/oneOf
   items, array items, additionalProperties, etc.).  The exact-hash path is
   tried first so schemas that are genuinely structurally distinct but happen
   to share a title are never incorrectly merged.

   The structural hash subsumes the top-level description strip from fix 2.
   That mechanism is therefore retained only in addGenerated() (to keep both
   hash maps populated), and the lookup no longer needs the temporary
   setDescription(null) pattern.
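
   The difference between the top-level strip of fix 2 and the recursive strip of fix 3 can be shown without Jackson. The sketch below is a stand-in under stated assumptions: schemas are modelled as nested `Map`s rather than `Schema` objects, `StructuralKeyDemo` and its methods are hypothetical, and only nested maps are traversed (no lists). Unlike the MixIn, which ignores the `description` field of `Schema`, this toy would also drop a property that happens to be named `description`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy comparison of top-level versus recursive description stripping.
class StructuralKeyDemo {
    // Removes only the top-level "description" (the approach of fix 2).
    static Map<String, Object> stripTopLevel(Map<String, Object> schema) {
        Map<String, Object> copy = new TreeMap<>(schema);
        copy.remove("description");
        return copy;
    }

    // Removes "description" at every nesting level (the effect the MixIn is described as having).
    @SuppressWarnings("unchecked")
    static Map<String, Object> stripRecursively(Map<String, Object> schema) {
        Map<String, Object> copy = new TreeMap<>();
        for (Map.Entry<String, Object> e : schema.entrySet()) {
            if ("description".equals(e.getKey())) {
                continue;
            }
            Object v = e.getValue();
            copy.put(e.getKey(), v instanceof Map ? stripRecursively((Map<String, Object>) v) : v);
        }
        return copy;
    }

    // Builds an object schema with one string property "id".
    static Map<String, Object> schema(String topDescription, String idDescription) {
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("type", "string");
        id.put("description", idDescription);
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("id", id);
        Map<String, Object> s = new LinkedHashMap<>();
        s.put("type", "object");
        s.put("description", topDescription);
        s.put("properties", props);
        return s;
    }

    public static void main(String[] args) {
        Map<String, Object> a = schema("A flow", "Source identifier");
        Map<String, Object> b = schema("A source", "Flow identifier");
        System.out.println(stripTopLevel(a).equals(stripTopLevel(b)));       // prints false
        System.out.println(stripRecursively(a).equals(stripRecursively(b))); // prints true
    }
}
```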

Also adds:
  - Four regression tests in InlineModelResolverTest covering all three cases:
    programmatic deduplication against a pre-existing component; multi-file
    deduplication via external $ref chains; programmatic deduplication when
    property descriptions have been mutated; multi-file deduplication when
    multiple paths reference the same external schema file.
  - Test fixture files under src/test/resources/3_0/inline-model-resolver-dedup/
    modelling the BBC TAMS multi-file spec structure that triggered the bugs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 10 files


… false-positive deduplication

The pre-populate step (added in the previous commit) seeded all
components/schemas entries into the deduplication maps before flattening.
This caused a regression: anonymous inline schemas (with no explicit title)
that happened to be structurally identical to a pre-existing untitled
component were incorrectly matched to the component and reused its name
instead of getting a generated name based on their location in the spec.

Concretely:
- ObjectWithOptionalAndRequiredProps (untitled component) was matched by
  the inline response schema, producing "ObjectWithOptionalAndRequiredProps"
  instead of the expected "objectWithOptionalAndRequiredProps_request".
- AppleReqDisc (untitled component with seeds/fruitType properties) was
  matched by the structurally-identical inline anyOf schema inside
  FruitInlineDisc, breaking the discriminator mapped-model hierarchy.

Fix: only pre-populate components that carry an explicit title field.
A schema identified solely by its YAML key in components/schemas has no
inherent identity that transcends its position — two anonymous schemas
with the same structure can be intentionally distinct (separate request
and response bodies that happen to share the same properties). A titled
schema (e.g. title: "Container Mapping" in container-mapping.yaml)
represents a named type defined in its own file; any inline occurrence
of that schema should reuse the canonical name.
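
The title filter described above can be sketched as follows. The names are hypothetical (`TitledSeedDemo`, `Component`, `seedTitled()`), schemas are again reduced to serialised strings, and treating only a `null` title as "untitled" is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical filter: only titled components are seeded into the deduplication map.
class TitledSeedDemo {
    static class Component {
        final String key;        // YAML key under components/schemas
        final String title;      // explicit title, or null when absent
        final String serialised; // stand-in for the content used as a dedup key

        Component(String key, String title, String serialised) {
            this.key = key;
            this.title = title;
            this.serialised = serialised;
        }
    }

    // Returns a signature map containing only components that carry a title.
    static Map<String, String> seedTitled(List<Component> components) {
        Map<String, String> signatures = new LinkedHashMap<>();
        for (Component c : components) {
            if (c.title != null) {
                signatures.putIfAbsent(c.serialised, c.key);
            }
        }
        return signatures;
    }

    public static void main(String[] args) {
        List<Component> components = Arrays.asList(
                new Component("ContainerMapping", "Container Mapping", "schema-1"),
                new Component("AppleReqDisc", null, "schema-2"));
        // Only the titled component is available for reuse by inline schemas.
        System.out.println(seedTitled(components)); // prints {schema-1=ContainerMapping}
    }
}
```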

This restores the correct behaviour for all four regression tests added
in the previous commit while fixing the discriminator and model-naming
regressions introduced by the unconditional pre-populate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

[BUG][rust-axum][python-fastapi] Duplicate model generated when same external schema is referenced via allOf chains across multiple files
