Conversation
Add ExportIcebergConfig to the Export oneof in transactions.proto with fields: catalog_uri, namespace, table_name, catalog_properties (warehouse, token, credential), schema, prefix, target_file_size_bytes, and compression. Grammar rules, parsers, printers, and tests are included for all three SDKs (Python, Go, Julia). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… into mjs-add-iceberg-export
…ferent mechanisms there but also keep it separate from general properties for privacy/security reasons
There was a problem hiding this comment.
@reviewers, I'm not sure if this file is supposed to be generated...
| # or appear in hashed structures (e.g. Dict keys, caches). | ||
| Base.:(==)(a::IcebergConfig, b::IcebergConfig) = | ||
| a.catalog_uri == b.catalog_uri && a.scope == b.scope && a.properties == b.properties | ||
| Base.hash(a::IcebergConfig, h::UInt) = hash(a.properties, hash(a.scope, hash(a.catalog_uri, h))) |
There was a problem hiding this comment.
@reviewers: I omitted auth_properties from hash and ==, thinking they shouldn't define object's identity. But this then influences that tests don't take this into account either. Not sure if this is fine, nor required.
… grammar.y itself
…ction is needed because GNF contain key that is not part of the exported iceberg
mjschleich
left a comment
There was a problem hiding this comment.
Generally this looks good to me, just noticed two small issues
| deconstruct if builtin.has_proto_field($$, 'csv_config'): | ||
| $3: transactions.ExportCSVConfig = $$.csv_config |
There was a problem hiding this comment.
| deconstruct if builtin.has_proto_field($$, 'csv_config'): | |
| $3: transactions.ExportCSVConfig = $$.csv_config | |
| deconstruct: $3: transactions.ExportCSVConfig = $$.csv_config |
I don't think this is needed? 🤔
There was a problem hiding this comment.
Seems needed, this pattern is used e.g. in :
data
: edb
construct: $$ = logic.Data(edb=$1)
deconstruct if builtin.has_proto_field($$, 'edb'):
$1: logic.EDB = $$.edb
| betree_relation
construct: $$ = logic.Data(betree_relation=$1)
deconstruct if builtin.has_proto_field($$, 'betree_relation'):
$1: logic.BeTreeRelation = $$.betree_relation
| csv_data
construct: $$ = logic.Data(csv_data=$1)
deconstruct if builtin.has_proto_field($$, 'csv_data'):
$1: logic.CSVData = $$.csv_data
| iceberg_data
construct: $$ = logic.Data(iceberg_data=$1)
deconstruct if builtin.has_proto_field($$, 'iceberg_data'):
$1: logic.IcebergData = $$.iceberg_data
And claude says:
The pretty printer needs to know which variant is active. Without guards, when
printing an iceberg export, it first tries the CSV alternative — $$.csv_config
returns an empty/default ExportCSVConfig, which then fails downstream when trying to
print its sub-fields.
The has_proto_field guard checks which oneof variant is actually set, so only the
matching alternative runs. This is the same pattern used elsewhere for oneofs (e.g.,
Data, Value, Term).
| ( | ||
| source_table_def | ||
| 0x585c5a1121bd88784c1378aa8837195c) | ||
| ( |
There was a problem hiding this comment.
I don't like how this is pretty-printed. Claude suggests that with the current pretty printer the only fix is to extract these into non-terminals, but I'm not sure if this is the right way...
Summary
ExportIcebergConfigmessage totransactions.protowith fields:locator,config(warehouse/token/credential),schema,prefix,target_file_size_bytes, andcompressionIcebergDatanode, reusingGNFColumns fromCSVData.Exportoneof to support Iceberg alongside CSVTest plan
make protobuf— regenerated bindings for Python, Go, Juliamake parsers/make printers— regenerated from grammarmake update-bins/make update-snapshots— test fixtures updatedmake test— all tests pass (Python 686, Julia 31509, Go ok)