Is your feature request related to a problem or challenge?
A follow-up to #22583 / #22612.
Several recent optimizer perf PRs have added per-rule fast-paths to dodge wasted work on plans that don't match the rule's shape:
These are useful local wins but @neilconway and @alamb's comments on #22583 point at the deeper issue: the ownership-based TreeNode::rewrite traversal copies the Arc spine on every pass, even when the rule doesn't change anything. A per-rule "did anything here change?" scan is a stop-gap; the real fix is to express tree rewrites in a way where untouched subtrees aren't reallocated to begin with.
@adriangb already laid groundwork in #22298 with a private map_children_mut helper in datafusion/optimizer/src/optimizer.rs that mutates Arc<LogicalPlan> children in place via Arc::make_mut (zero-cost when refcount == 1). That helper currently:
- Is only invoked from
rewrite_plan_in_place, which itself is gated on !plan_has_subqueries(plan) — so the in-place path only kicks in for the simpler subquery-free case
- Doesn't recurse into the subquery plans referenced from
Expr::ScalarSubquery / InSubquery / Exists / SetComparison
- Is
pub(crate) to the optimizer crate, not reachable from individual rules
Describe the solution you'd like
Generalize the in-place rewriting infrastructure so that all logical optimizer rules can opt into it, regardless of ApplyOrder:
-
Extend map_children_mut to also walk subquery plans. Add a sibling map_subquery_plans_mut (or fold it into a single map_children_and_subqueries_mut) so the in-place path works on plans that contain IN (SELECT ...) / EXISTS (...) etc.
-
Expose an in-place traversal entry point usable from rules with ApplyOrder::None. Today EliminateCrossJoin and a couple of other rules drive their own recursion via rewrite_children (which uses ownership-based map_children + map_uncorrelated_subqueries). Give them an in-place equivalent so they can recurse without forcing Arc copies when nothing changes.
-
Decide on the API shape. Two candidates:
- Add an
OptimizerRule::rewrite_in_place(&mut LogicalPlan, ...) -> Result<bool> method alongside the existing by-value rewrite. Rules can opt in; framework falls back to ownership-based otherwise.
- Or keep the by-value
rewrite contract and have the framework do std::mem::take + write-back transparently (similar to how rewrite_plan_in_place already bridges in optimizer.rs:511).
-
Once the framework is in place, remove the per-rule fast-paths. plan_has_subqueries and plan_has_joins become redundant when "no-op rewrites" are already free.
Describe alternatives you've considered
-
Keep adding per-rule fast-paths. Easy, contained, and works — but every new rule pays the same cost and we accumulate plan_has_X helpers. Doesn't scale.
-
Tackle EliminateCrossJoin specifically with in-place rewrite. Mixed approach: ownership-based for the join restructuring (flatten_join_inputs + find_inner_join), in-place for the non-join recursion. Doable (~300 LOC) but it's a one-off — doesn't help any other rule.
Additional context
Is your feature request related to a problem or challenge?
A follow-up to #22583 / #22612.
Several recent optimizer perf PRs have added per-rule fast-paths to dodge wasted work on plans that don't match the rule's shape:
plan_has_subqueriesgate aroundrewrite_with_subqueriesplan_has_joinsgate at the top ofEliminateCrossJoin::rewriteThese are useful local wins but @neilconway and @alamb's comments on #22583 point at the deeper issue: the ownership-based
TreeNode::rewritetraversal copies the Arc spine on every pass, even when the rule doesn't change anything. A per-rule "did anything here change?" scan is a stop-gap; the real fix is to express tree rewrites in a way where untouched subtrees aren't reallocated to begin with.@adriangb already laid groundwork in #22298 with a private
map_children_muthelper indatafusion/optimizer/src/optimizer.rsthat mutatesArc<LogicalPlan>children in place viaArc::make_mut(zero-cost when refcount == 1). That helper currently:rewrite_plan_in_place, which itself is gated on!plan_has_subqueries(plan)— so the in-place path only kicks in for the simpler subquery-free caseExpr::ScalarSubquery/InSubquery/Exists/SetComparisonpub(crate)to the optimizer crate, not reachable from individual rulesDescribe the solution you'd like
Generalize the in-place rewriting infrastructure so that all logical optimizer rules can opt into it, regardless of
ApplyOrder:Extend
map_children_mutto also walk subquery plans. Add a siblingmap_subquery_plans_mut(or fold it into a singlemap_children_and_subqueries_mut) so the in-place path works on plans that containIN (SELECT ...)/EXISTS (...)etc.Expose an in-place traversal entry point usable from rules with
ApplyOrder::None. TodayEliminateCrossJoinand a couple of other rules drive their own recursion viarewrite_children(which uses ownership-basedmap_children+map_uncorrelated_subqueries). Give them an in-place equivalent so they can recurse without forcing Arc copies when nothing changes.Decide on the API shape. Two candidates:
OptimizerRule::rewrite_in_place(&mut LogicalPlan, ...) -> Result<bool>method alongside the existing by-valuerewrite. Rules can opt in; framework falls back to ownership-based otherwise.rewritecontract and have the framework dostd::mem::take+ write-back transparently (similar to howrewrite_plan_in_placealready bridges in optimizer.rs:511).Once the framework is in place, remove the per-rule fast-paths.
plan_has_subqueriesandplan_has_joinsbecome redundant when "no-op rewrites" are already free.Describe alternatives you've considered
Keep adding per-rule fast-paths. Easy, contained, and works — but every new rule pays the same cost and we accumulate
plan_has_Xhelpers. Doesn't scale.Tackle
EliminateCrossJoinspecifically with in-place rewrite. Mixed approach: ownership-based for the join restructuring (flatten_join_inputs+find_inner_join), in-place for the non-join recursion. Doable (~300 LOC) but it's a one-off — doesn't help any other rule.Additional context
map_children_mutimplementation lives indatafusion/optimizer/src/optimizer.rs:391rewrite_plan_in_placeis atdatafusion/optimizer/src/optimizer.rs:498