Skip to content

Add support for PostgreSQL's ORDER BY ... USING <operator> clause#2246

Open
LucaCappelletti94 wants to merge 3 commits intoapache:mainfrom
LucaCappelletti94:postgres-regression-2
Open

Add support for PostgreSQL's ORDER BY ... USING <operator> clause#2246
LucaCappelletti94 wants to merge 3 commits intoapache:mainfrom
LucaCappelletti94:postgres-regression-2

Conversation

@LucaCappelletti94
Copy link
Contributor

This PR adds parser and AST support for PostgreSQL USING operators in ORDER BY expressions, including:

  • top-level ORDER BY
  • aggregate argument list ORDER BY (e.g. inside aggfns(...) / string_agg(...) style clauses)

Problem

sqlparser-rs previously rejected valid PostgreSQL statements containing:

  • ORDER BY <expr> USING <operator>

with errors like:

  • Expected: ), found: using

This blocked parsing of several PostgreSQL regression-style aggregate statements.

Root Cause

parse_order_by_expr_inner only supported:

  • ASC | DESC
  • NULLS FIRST | NULLS LAST

and had no branch to parse USING <operator>.

Dialect Scope Decision

ORDER BY ... USING <operator> is intentionally enabled for PostgreSQL only.

Rationale:

  • PostgreSQL explicitly documents ORDER BY expression [ USING operator ] ... in SELECT.
  • Other dialects supported by this repository generally document ORDER BY with expression + ASC|DESC (+ null ordering), but not USING <operator>.

References:

Complete Examples

PostgreSQL: accepted

SELECT a FROM t ORDER BY a USING <;

SELECT a FROM t ORDER BY a USING OPERATOR(pg_catalog.<) NULLS LAST;

SELECT aggfns(DISTINCT a, a, c ORDER BY c USING ~<~, a)
FROM (VALUES (1,3,'foo'),(0,NULL,NULL),(2,2,'bar'),(3,1,'baz')) v(a,b,c),
     generate_series(1,2) i;

CREATE OR REPLACE VIEW agg_view1 AS
SELECT aggfns(a,b,c ORDER BY c USING ~<~)
FROM (VALUES (1,3,'foo'),(0,NULL,NULL),(2,2,'bar'),(3,1,'baz')) v(a,b,c);

PostgreSQL: rejected (invalid syntax)

SELECT a FROM t ORDER BY a USING ;
-- Error: Expected an ordering operator after USING

SELECT a FROM t ORDER BY a USING OPERATOR();
-- Error: Expected an operator name

SELECT a FROM t ORDER BY a USING < DESC;
-- Error: ASC/DESC cannot be used together with USING in ORDER BY

@LucaCappelletti94 LucaCappelletti94 marked this pull request as ready for review February 26, 2026 19:55
}
}

fn is_valid_order_by_using_operator_symbol(symbol: &str) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this code necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, parse_order_by_using_operator now uses dialect.is_custom_operator_part(...) directly. No separate symbol-check helper anymore. I wasn't aware there was a dialect helper method.


let options = self.parse_order_by_options()?;
let using_operator = if !with_operator_class
&& dialect_of!(self is PostgreSqlDialect)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we need to dialect guard this, I think we can introduce a dialect method instead of the macro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added supports_order_by_using_operator() on Dialect (default false), enabled in PostgreSqlDialect, and parser now checks that method instead of dialect_of!. I double checked that no other dialect supports it at this time.

Comment on lines +4992 to +4995
Token::Word(w)
if w.quote_style.is_none() && terminal_keywords.contains(&w.keyword) =>
{
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these diffs required for this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was nightly clippy fixes. Removed from git history.

);

// `USING` in ORDER BY is PostgreSQL-specific and should not parse in GenericDialect.
let generic = TestedDialects::new(vec![Box::new(GenericDialect {})]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note, per the dialect comment, we'd want to avoid hardcoding dialects in tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced the GenericDialect negative case with a local wrapper dialect that behaves like Postgres but returns false for supports_order_by_using_operator().

@LucaCappelletti94
Copy link
Contributor Author

I’ve applied the requested changes and force-pushed the branch, so to remove irrelevant clippy changes from the commits.

  • Replaced the hardcoded parser guard (dialect_of!(...PostgreSqlDialect)) with a dialect capability method: Dialect::supports_order_by_using_operator(), enabled in PostgreSqlDialect only. I have also spent a bit of time checking yet again if any other dialects had it, but so far no other dialects support it.
  • Removed the dedicated is_valid_order_by_using_operator_symbol helper and now validate USING operators through existing dialect behavior (dialect.is_custom_operator_part(...)), so we don’t duplicate operator-character rules in parser code.
  • Trimmed the PR scope/history to a single focused commit and removed unrelated nightly clippy code smell fixes - I hadn't realized it was a nightly only rant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants