Describe the bug
The sqllogictest push_down_filter_regression.slt (added in #22150) is flaky in CI. The DynamicFilter content asserted by the EXPLAIN ANALYZE queries on agg_dyn_single is not deterministic, contrary to what the test's own comment claims.
A recent CI run failed with:
[SQL] EXPLAIN ANALYZE SELECT MIN(a), MAX(a) FROM agg_dyn_single;
[Diff] (-expected|+actual)
- predicate=DynamicFilter [ a@0 < 1 OR a@0 > 8 ], pruning_predicate=... a_min@0 < 1 ...
+ predicate=DynamicFilter [ a@0 < 3 OR a@0 > 8 ], pruning_predicate=... a_min@0 < 3 ...
at datafusion/sqllogictest/test_files/push_down_filter_regression.slt:330
Root cause
The test data is split across two files:
file_0 → (5), (1) — partial min = 1 (the global minimum)
file_1 → (3), (8) — partial min = 3, partial max = 8
The comment above the queries states:
Pruning metrics here are subject to a parallel-execution race (the order in which Partial aggregates publish filter updates vs. when the scan reads each partition), so the filter content is deterministic but the pruning counts are not.
That assumption is incorrect. The dynamic filter threshold tightens as each AggregateExec(mode=Partial) publishes its running min/max. EXPLAIN ANALYZE captures a snapshot of the filter's state. The same race the comment acknowledges for the pruning counts also affects the filter content: if the snapshot is taken after file_1 has published its partial min (3) but before file_0 publishes the global min (1), the filter reads a < 3 instead of the final a < 1. The MAX side (> 8) was already final in that snapshot only because file_1 also holds the global maximum (8).
So the filter content is an intermediate value of a converging filter, and which value is observed depends on partition scheduling — exactly the non-determinism the comment attributes only to the counts.
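To make the ordering dependence concrete, here is a minimal Python sketch of the race. It is a simplified model, not DataFusion code: it assumes each partial aggregate publishes once, with its file-level min/max, and that the filter keeps the running min and max of everything published so far. The names FILES and filter_snapshots are hypothetical.

```python
from itertools import permutations

# Per-file values taken from the issue; the keys are only labels.
FILES = {"file_0": [5, 1], "file_1": [3, 8]}


def filter_snapshots(publish_order):
    """Return the filter text observable after each publish, in order.

    Simplified model (an assumption): one publish per file, carrying that
    file's min and max; the filter holds the running min/max seen so far.
    """
    cur_min = None
    cur_max = None
    snapshots = []
    for name in publish_order:
        values = FILES[name]
        lo, hi = min(values), max(values)
        cur_min = lo if cur_min is None else min(cur_min, lo)
        cur_max = hi if cur_max is None else max(cur_max, hi)
        snapshots.append(f"a@0 < {cur_min} OR a@0 > {cur_max}")
    return snapshots


if __name__ == "__main__":
    # Enumerate both publish orders and show every intermediate filter state.
    for order in permutations(FILES):
        print(" -> ".join(order), filter_snapshots(order))
```

Under this model, both orderings end at a@0 < 1 OR a@0 > 8, but the ordering where file_1 publishes first passes through a@0 < 3 OR a@0 > 8, which is the value seen in the failing CI run.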
To Reproduce
Hard to reproduce deterministically because it is a thread-scheduling race; it surfaces intermittently in CI. The failing assertions are the agg_dyn_single EXPLAIN ANALYZE queries in datafusion/sqllogictest/test_files/push_down_filter_regression.slt (around line 330).
Expected behavior
The test should be stable across runs and not depend on the order in which partial aggregates publish their filter updates.
Additional context
Possible directions (open to maintainer preference):
- Assert only on the shape of the dynamic filter (e.g. that a DynamicFilter is present with the right column/structure) rather than its converged threshold value.
- Force a single partition / deterministic scan order for these specific queries so the filter is guaranteed to be fully converged at snapshot time.
- Use data where every file shares the same per-file min/max so any intermediate snapshot equals the final value.
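As an illustration of the first direction, the sketch below shows what comparing only the shape of the filter could look like. It is plain Python, not a sqllogictest feature: the helper filter_shape and both regular expressions are hypothetical, and the sample strings are the expected/actual lines from the diff above.

```python
import re

# Matches the bracketed body of a DynamicFilter in an EXPLAIN ANALYZE line.
FILTER_RE = re.compile(r"DynamicFilter \[ (?P<body>[^\]]*) \]")
# Matches an integer literal that directly follows "< " or "> ".
LITERAL_RE = re.compile(r"(?<=[<>] )-?\d+")


def filter_shape(plan_line):
    """Return the filter body with thresholds replaced by '?', or None."""
    match = FILTER_RE.search(plan_line)
    if match is None:
        return None
    return LITERAL_RE.sub("?", match.group("body"))


expected = "predicate=DynamicFilter [ a@0 < 1 OR a@0 > 8 ], pruning_predicate=... a_min@0 < 1 ..."
actual = "predicate=DynamicFilter [ a@0 < 3 OR a@0 > 8 ], pruning_predicate=... a_min@0 < 3 ..."

# Both lines reduce to the same shape even though the thresholds differ.
print(filter_shape(expected))
print(filter_shape(actual))
```

A comparison like this would still catch a missing filter or a filter on the wrong column, while ignoring which intermediate threshold the snapshot happened to capture.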
Introduced in #22150. Happy to open a PR once there's agreement on the preferred approach.