Skip to content

Fix filter source variable not found bug in left join to semi join optimizer#25111

Merged
feilong-liu merged 1 commit into
prestodb:masterfrom
feilong-liu:fix_lj_semi
May 14, 2025
Merged

Fix filter source variable not found bug in left join to semi join optimizer#25111
feilong-liu merged 1 commit into
prestodb:masterfrom
feilong-liu:fix_lj_semi

Conversation

@feilong-liu

@feilong-liu feilong-liu commented May 14, 2025

Copy link
Copy Markdown
Contributor

Description

Fix a bug introduced in #24884

An example to trigger the bug is

WITH t1 AS (
    SELECT * FROM (
        VALUES
            (1, 2),
            (2, 3),
            (1, 0)
    ) t(k1, k2)),
t2 AS (
    SELECT * FROM (
        VALUES
            (1, 2),
            (2, 3)
    ) t(k1, k2))
SELECT t1.k1, t1.k2
FROM t1
LEFT JOIN t2
    ON t1.k2 = t2.k2 AND t2.k1 > 1
WHERE
    t2.k2 IS NULL

which will return error like:

Query 20250514_161545_00066_b5vpi failed: Filtering source does not contain filtering join symbol
java.lang.IllegalArgumentException: Filtering source does not contain filtering join symbol
	at com.facebook.presto.common.Utils.checkArgument(Utils.java:61)
	at com.facebook.presto.spi.plan.SemiJoinNode.<init>(SemiJoinNode.java:89)
	at com.facebook.presto.spi.plan.SemiJoinNode.<init>(SemiJoinNode.java:60)
	at com.facebook.presto.sql.planner.iterative.rule.LeftJoinNullFilterToSemiJoin.apply(LeftJoinNullFilterToSemiJoin.java:175)
	at com.facebook.presto.sql.planner.iterative.rule.LeftJoinNullFilterToSemiJoin.apply(LeftJoinNullFilterToSemiJoin.java:86)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.transform(IterativeOptimizer.java:237)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreNode(IterativeOptimizer.java:189)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:151)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:281)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:153)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:281)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:153)
	at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.optimize(IterativeOptimizer.java:138)
	at com.facebook.presto.sql.Optimizer.validateAndOptimizePlan(Optimizer.java:112)
	at com.facebook.presto.execution.SqlQueryExecution.lambda$doCreateLogicalPlanAndOptimize$5(SqlQueryExecution.java:589)
	at com.facebook.presto.common.RuntimeStats.recordWallAndCpuTime(RuntimeStats.java:158)
	at com.facebook.presto.execution.SqlQueryExecution.doCreateLogicalPlanAndOptimize(SqlQueryExecution.java:587)
	at com.facebook.presto.common.RuntimeStats.recordWallAndCpuTime(RuntimeStats.java:158)
	at com.facebook.presto.execution.SqlQueryExecution.createLogicalPlanAndOptimize(SqlQueryExecution.java:558)
	at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:486)
	at com.facebook.presto.$gen.Presto_null__testversion____20250514_160949_4.run(Unknown Source)
	at com.facebook.presto.execution.SqlQueryManager.createQuery(SqlQueryManager.java:320)
	at com.facebook.presto.dispatcher.LocalDispatchQuery.lambda$startExecution$8(LocalDispatchQuery.java:210)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)

In this case, the right child of the left join will have more than one output variables (although only one variable will be used in the left join, which will be the join key), we cannot assume that the first output variable is the join key.

Motivation and Context

Fix a bug

Impact

Bug fix

Test Plan

Add unit test

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix a bug in left join to semi join optimizer which leads to filter source variable not found error

@feilong-liu feilong-liu requested a review from ZacBlanco May 14, 2025 16:37
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label May 14, 2025
assertNotEquals(computeActual("EXPLAIN(TYPE DISTRIBUTED) " + sql).getOnlyValue().toString().indexOf("Aggregate"), -1);

// filter in right child of join
sql = "with t1 as (select * from (values (1, 2), (2, 3), (1, 0)) t(k1, k2)), t2 as (select * from (values (1, 2), (2, 3)) t(k1, k2)) select t1.k1, t1.k2 from t1 left join t2 on t1.k2=t2.k2 and t2.k1 > 1 where t2.k2 is null";

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Query to trigger the bug

@kaikalur kaikalur left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hantangwangd hantangwangd left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch.

@feilong-liu feilong-liu merged commit edfebc2 into prestodb:master May 14, 2025
97 checks passed
@feilong-liu feilong-liu deleted the fix_lj_semi branch May 14, 2025 18:00
@ZacBlanco ZacBlanco mentioned this pull request May 29, 2025
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:Meta PR from Meta

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants