Summary
The audit review of PR #1453 detected a defect in Iceberg sort-key expression generation.
When iceberg_partition_timezone is set and the sort metadata uses a temporal Iceberg transform (day, month, year, hour), getSortingKeyDescriptionFromMetadata() appends the timezone as a raw token instead of a SQL string literal.
Simplified example of generated expression:
toRelativeDayNum(column, UTC)
Expected:
toRelativeDayNum(column, 'UTC')
Without quoting, the timezone token is interpreted by the SQL parser as an identifier instead of a string literal, which can break KeyDescription::parse(...).
Affected area
- src/Storages/ObjectStorage/DataLakes/Iceberg/Utils.cpp
- getSortingKeyDescriptionFromMetadata()
Impact
Medium (correctness/reliability in planning path):
- Queries reading Iceberg tables with transformed sort-order metadata may fail during sort-key parsing/planning when timezone override is enabled.
- Affects read/planning path for Iceberg metadata consumers using transformed sort keys.
Code evidence
full_argument = clickhouse_transform_name->transform_name + "(";
if (clickhouse_transform_name->argument)
{
    full_argument += std::to_string(*clickhouse_transform_name->argument) + ", ";
}
full_argument += column_name;
if (clickhouse_transform_name->time_zone)
    full_argument += ", " + *clickhouse_transform_name->time_zone;  // timezone appended without quotes
full_argument += ")";
// ... (intervening code omitted)
order_by_str.pop_back();
return KeyDescription::parse(order_by_str, column_description, local_context, true);
Reproduction sketch
- Configure iceberg_partition_timezone='UTC'.
- Read an Iceberg table whose sort metadata uses a temporal transform (e.g. day(ts), month(ts)).
- Execute a query that triggers Iceberg sort-key planning.
- Observe a parse/planning failure caused by the unquoted timezone argument in the generated expression.
Expected behavior
Timezone argument should be serialized as a SQL string literal (properly quoted/escaped), or generated via AST without raw string concatenation.
Suggested fix direction
- Quote/escape the timezone before appending it to full_argument, or
- Build the expression as an AST instead of concatenating SQL strings.
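As a rough illustration of the first option, a minimal standalone sketch could look like the following. The names quoteSqlString, TransformSketch, and buildExpressionQuoted are hypothetical and made up for this example; they are not ClickHouse APIs. The sketch also assumes that backslash-escaping of single quotes and backslashes is acceptable to the parser. Inside the codebase, an existing quoting helper or AST construction would likely be preferable to a hand-rolled function.

```cpp
#include <optional>
#include <string>

// Hypothetical helper (not a ClickHouse API): renders `value` as a
// single-quoted SQL string literal, backslash-escaping any backslash
// or single quote found inside it.
std::string quoteSqlString(const std::string & value)
{
    std::string out = "'";
    for (char c : value)
    {
        if (c == '\\' || c == '\'')
            out += '\\';
        out += c;
    }
    out += '\'';
    return out;
}

// Hypothetical stand-in for the transform descriptor; it models only the
// three fields read by the snippet under "Code evidence".
struct TransformSketch
{
    std::string transform_name;
    std::optional<long> argument;
    std::optional<std::string> time_zone;
};

// Same concatenation as in the reported snippet, except that the timezone
// is passed through quoteSqlString() before being appended.
std::string buildExpressionQuoted(const TransformSketch & transform, const std::string & column_name)
{
    std::string full_argument = transform.transform_name + "(";
    if (transform.argument)
        full_argument += std::to_string(*transform.argument) + ", ";
    full_argument += column_name;
    if (transform.time_zone)
        full_argument += ", " + quoteSqlString(*transform.time_zone);
    full_argument += ")";
    return full_argument;
}
```

With this sketch, a transform named toRelativeDayNum with timezone UTC on column ts renders as toRelativeDayNum(ts, 'UTC'), matching the expected expression in the summary.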
Suggested regression test
Add a test that reads an Iceberg table with a transformed sort order while iceberg_partition_timezone is non-empty, and assert that sort-key description parsing/planning succeeds.