@@ -144,6 +144,8 @@ private ConditionToken translateMatch2(RexNode node) {
      return ConditionToken.unary(fieldNames.get(inputRef.getIndex()), "istrue");
    case NOT:
      return translateUnary("isfalse", (RexCall) node);
    case LIKE:
      return translateBinary("like", null, (RexCall) node);
    default:
      throw new UnsupportedOperationException("Unsupported operator " + node);
    }
@@ -1109,4 +1109,88 @@ static void initializeArrowState(@TempDir Path sharedTempDir)
        .returns(result)
        .explainContains(plan);
  }

  /** Test case for
   * <a href="https://issues.apache.org/jira/browse/CALCITE-7472">[CALCITE-7472]
   * Arrow adapter should support like operator push down</a>. */
  @Test void testArrowProjectFieldsWithLikePrefixFilter() {
Contributor

Have these Arrow plans been validated using third party tools?

Member Author

@xuzifu666 xuzifu666 Apr 16, 2026

No third-party tools have been used for verification so far. My view is that the test data is in the standard Apache Arrow IPC file format and the test results are correct, which may indirectly show that the plan is correct. In addition, I checked the Gandiva function registry, which shows that LIKE is a registered function and can therefore be pushed down (see https://github.com/apache/arrow/blob/main/cpp/src/gandiva/function_registry_string.cc#L210).

Of course, these are just my thoughts; I don't know if you agree. @mihaibudiu

Contributor

I am not asking for a test that calls third-party tools; I just want to know that you are confident that these programs are correct for the target.

Member Author

This is primarily based on my understanding. From the perspective of the plan generation logic, I see no issues so far, and the results have been verified based on actual Arrow data testing, which also shows no problems.

In my view, the significance of this feature is that it lets the Arrow adapter push down more of the functions that business applications need, as long as Gandiva supports them. This is similar in principle to how the Spark engine supports pushing down built-in functions, which benefits us greatly in production, and it significantly improves the execution efficiency of Arrow adapter operators.

Of course, if there are any issues or areas that might cause performance regressions, please feel free to point them out.

Contributor

> the results have been verified based on actual Arrow data testing, which also shows no problems.

This was my question: have the results been verified?

Member Author

Yes, they were tested with actual Arrow data.

    String sql = "select \"stringField\"\n"
        + "from arrowdatatype\n"
        + "where \"stringField\" like '1%'";
    String plan = "PLAN=ArrowToEnumerableConverter\n"
        + "  ArrowProject(stringField=[$3])\n"
        + "    ArrowFilter(condition=[LIKE($3, '1%')])\n"
        + "      ArrowTableScan(table=[[ARROW, ARROWDATATYPE]], "
        + "fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]])\n\n";
    String result = "stringField=1\nstringField=10\n";

    CalciteAssert.that()
        .with(arrow)
        .query(sql)
        .limit(2)
        .returns(result)
        .explainContains(plan);
  }

  /** Test case for
   * <a href="https://issues.apache.org/jira/browse/CALCITE-7472">[CALCITE-7472]
   * Arrow adapter should support like operator push down</a>. */
  @Test void testArrowProjectFieldsWithLikeSuffixFilter() {
    String sql = "select \"stringField\"\n"
        + "from arrowdatatype\n"
        + "where \"stringField\" like '%5'";
    String plan = "PLAN=ArrowToEnumerableConverter\n"
        + "  ArrowProject(stringField=[$3])\n"
        + "    ArrowFilter(condition=[LIKE($3, '%5')])\n"
        + "      ArrowTableScan(table=[[ARROW, ARROWDATATYPE]], "
        + "fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]])\n\n";

    CalciteAssert.that()
        .with(arrow)
        .query(sql)
        .returnsCount(5) // 5, 15, 25, 35, 45
        .explainContains(plan);
  }

  /** Test case for
   * <a href="https://issues.apache.org/jira/browse/CALCITE-7472">[CALCITE-7472]
   * Arrow adapter should support like operator push down</a>. */
  @Test void testArrowProjectFieldsWithLikeContainsFilter() {
    String sql = "select \"stringField\"\n"
        + "from arrowdatatype\n"
        + "where \"stringField\" like '%2%'";
    String plan = "PLAN=ArrowToEnumerableConverter\n"
        + "  ArrowProject(stringField=[$3])\n"
        + "    ArrowFilter(condition=[LIKE($3, '%2%')])\n"
        + "      ArrowTableScan(table=[[ARROW, ARROWDATATYPE]],"
        + " fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]])\n\n";

    CalciteAssert.that()
        .with(arrow)
        .query(sql)
        .returnsCount(14) // 2, 12, 20-29, 32, 42
        .explainContains(plan);
  }

  /** Test case for
   * <a href="https://issues.apache.org/jira/browse/CALCITE-7472">[CALCITE-7472]
   * Arrow adapter should support like operator push down</a>. */
  @Test void testArrowProjectFieldsWithLikeSingleCharFilter() {
    String sql = "select \"stringField\"\n"
        + "from arrowdatatype\n"
        + "where \"stringField\" like '1_'";
    String plan = "PLAN=ArrowToEnumerableConverter\n"
        + "  ArrowProject(stringField=[$3])\n"
        + "    ArrowFilter(condition=[LIKE($3, '1_')])\n"
        + "      ArrowTableScan(table=[[ARROW, ARROWDATATYPE]], "
        + "fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]])\n\n";
    String result = "stringField=10\nstringField=11\n";

    CalciteAssert.that()
        .with(arrow)
        .query(sql)
        .limit(2)
        .returns(result)
        .explainContains(plan);
  }
}
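
The expected rows and counts in the four LIKE tests can be cross-checked without going through Arrow or Gandiva at all. The sketch below is a hypothetical standalone helper (`LikeReference` is not part of Calcite or this PR) that translates a SQL LIKE pattern without an ESCAPE clause into a Java regex and evaluates it over the values "0" through "49". That value range is an assumption inferred from the inline comments in the tests (`5, 15, 25, 35, 45` and `2, 12, 20-29, 32, 42`), not something read from the actual test file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical reference matcher for SQL LIKE; not part of Calcite or the PR. */
public final class LikeReference {
  private LikeReference() {}

  /** Converts a SQL LIKE pattern (no ESCAPE clause) into a Java regex. */
  static Pattern toRegex(String like) {
    StringBuilder sb = new StringBuilder();
    for (char c : like.toCharArray()) {
      if (c == '%') {
        sb.append(".*");            // % matches any sequence, including empty
      } else if (c == '_') {
        sb.append('.');             // _ matches exactly one character
      } else {
        sb.append(Pattern.quote(String.valueOf(c))); // literal character
      }
    }
    return Pattern.compile(sb.toString(), Pattern.DOTALL);
  }

  /** Returns the values "0".."49" (ASSUMED contents of stringField) matching the pattern. */
  static List<String> matches(String like) {
    Pattern p = toRegex(like);
    List<String> out = new ArrayList<>();
    for (int i = 0; i < 50; i++) {
      String s = Integer.toString(i);
      if (p.matcher(s).matches()) { // whole-string match, as LIKE requires
        out.add(s);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println("1%  -> " + matches("1%"));
    System.out.println("%5  -> " + matches("%5"));
    System.out.println("%2% -> " + matches("%2%"));
    System.out.println("1_  -> " + matches("1_"));
  }
}
```

Under the stated assumption about the data, the sizes of the `%5` and `%2%` results correspond to the `returnsCount` arguments, and the first two entries of the `1%` and `1_` results correspond to the rows expected after `limit(2)`, provided the scan returns rows in ascending order.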