Replace ANY/ALL CASE planning with array_has/min/max desugaring#22102

Open
cetra3 wants to merge 1 commit into apache:main from pydantic:parquet_pruning_for_any
@cetra3 cetra3 commented May 11, 2026

Which issue does this PR close?

Rationale for this change

This partially reverts the changes in PR #21743 but keeps the cardinality check when desugaring to array_min and array_max.

This aligns more closely with the outputs of the existing DataFusion functions, rather than going down the path of implementing full PostgreSQL NULL semantics.
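For reference, the PostgreSQL-style NULL semantics mentioned above can be sketched roughly as follows. This is a plain Python model rather than DataFusion or PostgreSQL code: `pg_any` is a hypothetical helper name, and `None` stands in for SQL NULL.

```python
import operator

def pg_any(needle, op, haystack):
    """Model of `needle op ANY(haystack)` with three-valued logic.

    None stands for SQL NULL. Hypothetical helper for illustration only.
    """
    if haystack is None:
        return None  # NULL array: the result is NULL
    saw_null = False
    for item in haystack:
        if needle is None or item is None:
            saw_null = True  # a comparison involving NULL yields NULL
        elif op(needle, item):
            return True  # any true comparison makes the whole result true
    # No comparison was true: NULL if any comparison was NULL, else false
    return None if saw_null else False

print(pg_any(5, operator.gt, [1, None]))  # True
print(pg_any(0, operator.gt, [1, None]))  # None
print(pg_any(5, operator.gt, []))         # False
```

The second call is the interesting one: no element satisfies the comparison, but a NULL element is present, so the result is NULL rather than false.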

What changes are included in this PR?

Adjusts how we desugar quantified comparisons such as > ANY. Rather than building a full CASE chain, we use a simplified form that checks the cardinality first and combines that check with array_min/array_max.

For example:

SELECT * FROM t WHERE col > ANY([1, 2, 3])

Desugars to:

cardinality([1, 2, 3]) > 0 AND col > array_min([1, 2, 3])

Which gets simplified to:

col > 1
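The desugared expression can be modelled in plain Python to see how it behaves on edge cases. This is only a sketch under stated assumptions, not the planner code: it assumes array_min skips NULL elements and returns NULL for an empty or all-NULL array, and that cardinality returns the element count. All function names below are illustrative stand-ins, and `None` represents SQL NULL.

```python
def sql_and(a, b):
    # Three-valued AND: false dominates, then NULL, then true
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def sql_gt(a, b):
    # Comparison with NULL yields NULL
    if a is None or b is None:
        return None
    return a > b

def cardinality(arr):
    # Assumption: element count, NULL for a NULL array
    return None if arr is None else len(arr)

def array_min(arr):
    # Assumption: NULL elements are skipped; NULL if nothing remains
    if arr is None:
        return None
    values = [v for v in arr if v is not None]
    return min(values) if values else None

def gt_any_desugared(col, arr):
    # cardinality(arr) > 0 AND col > array_min(arr)
    return sql_and(sql_gt(cardinality(arr), 0), sql_gt(col, array_min(arr)))

print(gt_any_desugared(2, [1, 2, 3]))     # True
print(gt_any_desugared(5, []))            # False
print(gt_any_desugared(0, [1, None]))     # False
print(gt_any_desugared(5, [None, None]))  # None
```

Under these assumptions the cardinality check is what makes an empty array yield false instead of NULL, and a haystack that mixes NULL with non-matching values yields false where strict PostgreSQL semantics would yield NULL.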

Are these changes tested?

Yes, they are tested.

Are there any user-facing changes?

Yes, there are some changes to the output of some queries.

However, these changes were not shipped as part of 53.1.0 and are only on main.

alamb commented May 11, 2026

FYI @buraksenn @berkaysynnada and @Jefffrey

Perhaps you can help review this PR as you helped review #21743

@Jefffrey Jefffrey left a comment

So do we have a comprehensive view of how empty haystacks/null haystacks/haystacks containing nulls/null needles look with any/all and all supported operators with this PR?

I've lost track a bit of how the behaviour has evolved over the PRs:

So I want to ensure we have a clear understanding of the final behaviour we're agreeing on, since this PR seems to restore the = ANY behaviour to what it previously was and, hopefully, to align the other operators (and ALL) with similar behaviour?

----
NULL

# Mixed NULL + non-NULL (non-NULL elements satisfy, but NULLs present → NULL)
We'd need to adjust the comment on these test cases.

@@ -145,7 +145,7 @@
query B
select 5 <> ALL(make_array(NULL::INT, NULL::INT));
This test case has changed behaviour, so it should be moved out from under the parent comment (All-NULL arrays: returns NULL).

Labels

sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

PR #21743 disables Parquet pruning for = ANY([literals])

3 participants