[SPARK-55043][SQL] Fix time travel with subquery containing table references #53811

Closed
cloud-fan wants to merge 2 commits into apache:master from cloud-fan:udf

@cloud-fan
Contributor

What changes were proposed in this pull request?

This PR fixes an issue where TIMESTAMP AS OF (subquery) fails when the subquery references a table.

Before this fix, queries like:

SELECT * FROM t TIMESTAMP AS OF (SELECT MIN(ts) FROM t)

would fail with:

assertion failed: No plan for SubqueryAlias testcat.t

The fix changes EvalSubqueriesForTimeTravel to wrap the scalar subquery in a Project over OneRowRelation and execute it through the normal query execution path (sessionState.executePlan), which properly handles table references including V2 tables.
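
As a rough illustration of the approach described above (a sketch, not the merged code): the last four lines of the method mirror the snippet under review in this PR, while the object name, the method name, the alias name `result`, and the way `wrappedPlan` is built are assumptions made for illustration. The `SparkSession` import path is also an assumption; it differs between Spark versions.

```scala
// Assumption: on master the session class exposing sessionState lives in the
// `classic` package; on older releases it is org.apache.spark.sql.SparkSession.
import org.apache.spark.sql.classic.SparkSession
import org.apache.spark.sql.catalyst.expressions.{Alias, Literal, ScalarSubquery}
import org.apache.spark.sql.catalyst.plans.logical.{OneRowRelation, Project}

// Hypothetical helper object; the real logic lives in EvalSubqueriesForTimeTravel.
object TimeTravelSubquerySketch {

  // Evaluates a scalar subquery eagerly and returns its value as a Literal.
  def evalScalarSubquery(s: ScalarSubquery): Literal = {
    // Wrap the subquery expression in a one-row projection so that it goes
    // through the regular analysis/optimization/planning pipeline.
    val wrappedPlan = Project(Seq(Alias(s, "result")()), OneRowRelation())

    val spark = SparkSession.active
    val qe = spark.sessionState.executePlan(wrappedPlan)

    // The wrapped plan yields exactly one row with one column.
    val result = qe.executedPlan.executeCollect().head.get(0, s.dataType)
    Literal(result, s.dataType)
  }
}
```

Because the subquery is planned as part of an ordinary query, any table relations inside it are handled by the same planner strategies as a top-level `SELECT`.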

Why are the changes needed?

The EvalSubqueriesForTimeTravel analyzer rule was directly calling QueryExecution.prepareExecutedPlan on the subquery's inner plan, which failed to properly plan V2 table relations.

Does this PR introduce any user-facing change?

Yes. Users can now use subqueries with table references in TIMESTAMP AS OF expressions.

How was this patch tested?

Added a new test case in DataSourceV2SQLSuite that verifies time travel with a subquery containing a table reference.

Was this patch authored or co-authored using generative AI tooling?

Yes.

@github-actions

github-actions bot commented Jan 15, 2026

JIRA Issue Information

=== Bug SPARK-55043 ===
Summary: Fix time travel with subquery containing table references
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the SQL label Jan 15, 2026
@cloud-fan cloud-fan changed the title from "[SPARK-XXXXX][SQL] Fix time travel with subquery containing table references" to "[SPARK-55043][SQL] Fix time travel with subquery containing table references" Jan 15, 2026
@cloud-fan
Contributor Author

cc @gengliangwang @yaooqinn

val spark = SparkSession.active
val qe = spark.sessionState.executePlan(wrappedPlan)
val result = qe.executedPlan.executeCollect().head.get(0, s.dataType)
Literal(result, s.dataType)
Member

In case of NULL, what behavior are we expecting for time travel?

Contributor Author

A null value will error out in TimeTravelSpec.create, before we call the v2 catalog APIs.

sql(s"INSERT INTO $t3 VALUES (6)")
sql(s"INSERT INTO $t4 VALUES (7)")
sql(s"INSERT INTO $t4 VALUES (8)")
sql(s"INSERT INTO t VALUES ('2019-01-29 00:37:58')")
Member

How about we add a test here for the NULL case, if we don't have one yet?

val t4 = s"testcat.t$ts2"

- withTable(t3, t4) {
+ withTable(t3, t4, "t") {
Member

The "t" really stands out here; shall we predefine it?

val spark = SparkSession.active
val qe = spark.sessionState.executePlan(wrappedPlan)
val result = qe.executedPlan.executeCollect().head.get(0, s.dataType)
Literal(result, s.dataType)
Member

Is implicit casting allowed here? E.g.

SELECT * FROM t TIMESTAMP AS OF (SELECT MIN(date_type_col) from t)

Contributor Author

Cast is explicitly handled in TimeTravelSpec.create, not relying on the type coercion framework.

We also have a test for it: the time travel test in DataSourceV2SQLSuite.scala uses a string as the timestamp.

@cloud-fan
Contributor Author

Thanks for the review, merging to master!

@cloud-fan cloud-fan closed this in 67f6a3f Jan 19, 2026