Skip to content

API: Fix AllowMissing bug and Add Unit tests for StructProjection class#15984

Open
Govindarajan-D wants to merge 12 commits intoapache:mainfrom
Govindarajan-D:test-struct-projection
Open

API: Fix AllowMissing bug and Add Unit tests for StructProjection class#15984
Govindarajan-D wants to merge 12 commits intoapache:mainfrom
Govindarajan-D:test-struct-projection

Conversation

@Govindarajan-D
Copy link
Copy Markdown
Contributor

@Govindarajan-D Govindarajan-D commented Apr 15, 2026

Added Unit tests that improves the reliability of StructProjection class.

  1. Subset projection from a flat schema
  2. Nested projection across multiple levels
  3. Optional field projection (schema evolution)
  4. Map projection with nested value and Primitive Types
  5. List projection with nested value and Primitive Types
  6. Map and List Partial Projection
  7. And other small functions

Code coverage from 0% to 100% (Based on Jacoco report)

Fixes #16123

@github-actions github-actions Bot added the API label Apr 15, 2026
Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java Outdated
Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java Outdated
Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java
@pvary
Copy link
Copy Markdown
Contributor

pvary commented Apr 16, 2026

Thanks @Govindarajan-D for the PR. Left some minor formatting comments

@Govindarajan-D
Copy link
Copy Markdown
Contributor Author

@pvary Resolved the mentioned formatting corrections. Thanks much!

Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java Outdated
Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java Outdated
Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java Outdated

public class TestStructProjection {
@Test
public void testSubsetProjection() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's omit test as a prefix in new test names as they don't add any value


Schema projectedSchema = new Schema(required(30, "value", Types.IntegerType.get()));

StructProjection projectedStructure = StructProjection.create(dataSchema, projectedSchema);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
StructProjection projectedStructure = StructProjection.create(dataSchema, projectedSchema);
StructProjection projection = StructProjection.create(dataSchema, projectedSchema);

projectedStructure.wrap(row);

assertThat(projectedStructure.size()).isEqualTo(1);
assertThat(projectedStructure.get(0, Integer.class)).isEqualTo(42);
Copy link
Copy Markdown
Contributor

@nastra nastra Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tests only the happy path. Let's also add something where we try to access a position that isn't valid and such. Another test method could also call set() and verify that the expected exception is thrown

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably also check projectedFieldIds() in all of the tests


@Test
public void testMapWithNestedValueFullMatch() {
StructType coordinateStruct =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
StructType coordinateStruct =
StructType coordinatesStruct =

Copy link
Copy Markdown
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more things that would be good to cover as well:

  • create(Schema, Set<Integer>) / copyFor(StructLike) are currently not tested
  • Missing required field (without allowMissing) should throw IllegalArgumentException
  • Partial map-value/list-element projection should throw IAE and we should most likely test this

@Govindarajan-D
Copy link
Copy Markdown
Contributor Author

Will add more tests as planned

@Govindarajan-D Govindarajan-D requested review from nastra and pvary April 26, 2026 09:35
@Govindarajan-D
Copy link
Copy Markdown
Contributor Author

@nastra @pvary Added more tests covering map and list with partial projection and testing few other small functions. The code coverage is at 100% as per JaCoCo report.

Comment thread api/src/main/java/org/apache/iceberg/util/StructProjection.java
@Govindarajan-D
Copy link
Copy Markdown
Contributor Author

Fixes #16123

@Govindarajan-D Govindarajan-D changed the title API: Add Unit tests for StructProjection class API: Fix AllowMissing bug and Unit tests for StructProjection class Apr 26, 2026
@Govindarajan-D Govindarajan-D changed the title API: Fix AllowMissing bug and Unit tests for StructProjection class API: Fix AllowMissing bug and Add Unit tests for StructProjection class Apr 26, 2026

public class TestStructProjection {
@Test
public void testProjectSubsetOfSchemaFields() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's omit test as a prefix in new test names as they don't add any value

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nastra Removed the test prefix

Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java
@Govindarajan-D Govindarajan-D requested a review from pvary April 30, 2026 14:26
Comment on lines +250 to +259
Schema projectedSchema =
new Schema(
required(
2,
"address",
Types.MapType.ofRequired(3, 4, Types.StringType.get(), coordinatesStruct)),
required(
5,
"metadata",
Types.MapType.ofRequired(6, 7, Types.StringType.get(), Types.StringType.get())));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In many cases the projected schema could be calculated by TypeUtil.select/selectNot.

For example:

Schema projectedSchema = TypeUtil.select(dataSchema, Sets.newHashSet(2, 5));

or

Schema projectedSchema = TypeUtil.selectNot(dataSchema, Sets.newHashSet(1));

Please check other places too?
I think the selectNot is actually quite intuitive in projection scenarios

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we do have tests that are not exact subset of dataSchema right. So I would have to use a mix of select/selectNot with normal definition which would not be uniform. Let me know what you think.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most of the test either the schema, or the projected schema could be generated by a single select or selectNot from the other. select is easier if we want only a few columns, selectNot is easier if we just want to remove a few columns.

Just try it in a few places and see if you like it, and based on the results work on the other ones

Schema dataSchema =
new Schema(
required(2, "data", Types.MapType.ofRequired(3, 4, keyStruct, Types.StringType.get())));
Schema projectedSchema =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between the dataSchema and the projectedSchema?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no difference. This is for testing the nested structure, so re-used the same structure with nested Map.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can do a:

StructProjection projection = StructProjection.create(dataSchema, dataSchema);

Comment thread api/src/test/java/org/apache/iceberg/util/TestStructProjection.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

API: StructProjection.createAllowMissing does not propagate allowMissing to nested struct sub-projections

3 participants