API: Fix AllowMissing bug and Add Unit tests for StructProjection class#15984
API: Fix AllowMissing bug and Add Unit tests for StructProjection class#15984Govindarajan-D wants to merge 12 commits intoapache:mainfrom
Conversation
|
Thanks @Govindarajan-D for the PR. Left some minor formatting comments |
|
@pvary Resolved the mentioned formatting corrections. Thanks much! |
|
|
||
| public class TestStructProjection { | ||
| @Test | ||
| public void testSubsetProjection() { |
There was a problem hiding this comment.
let's omit test as a prefix in new test names as they don't add any value
|
|
||
| Schema projectedSchema = new Schema(required(30, "value", Types.IntegerType.get())); | ||
|
|
||
| StructProjection projectedStructure = StructProjection.create(dataSchema, projectedSchema); |
There was a problem hiding this comment.
| StructProjection projectedStructure = StructProjection.create(dataSchema, projectedSchema); | |
| StructProjection projection = StructProjection.create(dataSchema, projectedSchema); |
| projectedStructure.wrap(row); | ||
|
|
||
| assertThat(projectedStructure.size()).isEqualTo(1); | ||
| assertThat(projectedStructure.get(0, Integer.class)).isEqualTo(42); |
There was a problem hiding this comment.
this tests only the happy path. Let's also add something where we try to access a position that isn't valid and such. Another test method could also call set() and verify that the expected exception is thrown
There was a problem hiding this comment.
we should probably also check projectedFieldIds() in all of the tests
|
|
||
| @Test | ||
| public void testMapWithNestedValueFullMatch() { | ||
| StructType coordinateStruct = |
There was a problem hiding this comment.
| StructType coordinateStruct = | |
| StructType coordinatesStruct = |
nastra
left a comment
There was a problem hiding this comment.
A few more things that would be good to cover as well:
create(Schema, Set<Integer>)/copyFor(StructLike)are currently not tested- Missing required field (without allowMissing) should throw IllegalArgumentException
- Partial map-value/list-element projection should throw IAE and we should most likely test this
|
Will add more tests as planned |
…s, and missing fields
|
Fixes #16123 |
|
|
||
| public class TestStructProjection { | ||
| @Test | ||
| public void testProjectSubsetOfSchemaFields() { |
There was a problem hiding this comment.
let's omit test as a prefix in new test names as they don't add any value
| Schema projectedSchema = | ||
| new Schema( | ||
| required( | ||
| 2, | ||
| "address", | ||
| Types.MapType.ofRequired(3, 4, Types.StringType.get(), coordinatesStruct)), | ||
| required( | ||
| 5, | ||
| "metadata", | ||
| Types.MapType.ofRequired(6, 7, Types.StringType.get(), Types.StringType.get()))); |
There was a problem hiding this comment.
In many cases the projected schema could be calculated by TypeUtil.select/selectNot.
For example:
Schema projectedSchema = TypeUtil.select(dataSchema, Sets.newHashSet(2, 5));
or
Schema projectedSchema = TypeUtil.selectNot(dataSchema, Sets.newHashSet(1));
Please check other places too?
I think the selectNot is actually quite intuitive in projection scenarios
There was a problem hiding this comment.
But we do have tests that are not exact subset of dataSchema right. So I would have to use a mix of select/selectNot with normal definition which would not be uniform. Let me know what you think.
There was a problem hiding this comment.
In most of the test either the schema, or the projected schema could be generated by a single select or selectNot from the other. select is easier if we want only a few columns, selectNot is easier if we just want to remove a few columns.
Just try it in a few places and see if you like it, and based on the results work on the other ones
| Schema dataSchema = | ||
| new Schema( | ||
| required(2, "data", Types.MapType.ofRequired(3, 4, keyStruct, Types.StringType.get()))); | ||
| Schema projectedSchema = |
There was a problem hiding this comment.
What is the difference between the dataSchema and the projectedSchema?
There was a problem hiding this comment.
There is no difference. This is for testing the nested structure, so re-used the same structure with nested Map.
There was a problem hiding this comment.
So we can do a:
StructProjection projection = StructProjection.create(dataSchema, dataSchema);
Added Unit tests that improves the reliability of StructProjection class.
Code coverage from 0% to 100% (Based on Jacoco report)
Fixes #16123