
Commit c74663e

merging master (apache#11)
* [SPARK-33641][SQL][DOC][FOLLOW-UP] Add migration guide for CHAR VARCHAR types

  ### What changes were proposed in this pull request?
  Add a migration guide entry for the CHAR/VARCHAR types.

  ### Why are the changes needed?
  For migration.

  ### Does this PR introduce _any_ user-facing change?
  Documentation change only.

  ### How was this patch tested?
  Passing CI.

  Closes apache#30654 from yaooqinn/SPARK-33641-F.

  Authored-by: Kent Yao <[email protected]>
  Signed-off-by: Wenchen Fan <[email protected]>

* [SPARK-33669] Wrong error message from YARN application state monitor when sc.stop in yarn client mode

  ### What changes were proposed in this pull request?
  This change makes `InterruptedIOException` be treated like `InterruptedException` when closing `YarnClientSchedulerBackend`, so that it no longer logs the misleading "YARN application has exited unexpectedly" error.

  ### Why are the changes needed?
  In YARN client mode, stopping `YarnClientSchedulerBackend` first interrupts the YARN application monitor thread. `MonitorThread.run()` catches `InterruptedException` in order to respond gracefully to the stop request, but `client.monitorApplication` can instead throw `InterruptedIOException` while a Hadoop RPC call is in flight. In that case the monitor thread does not know it was interrupted: it reports a failed YARN application and logs "Failed to contact YARN for application xxxxx; YARN application has exited unexpectedly with state xxxxx" at error level, which confuses users.

  ### Does this PR introduce _any_ user-facing change?
  Yes.

  ### How was this patch tested?
  It is a very simple patch; a dedicated test seems unnecessary.

  Closes apache#30617 from sqlwindspeaker/yarn-client-interrupt-monitor.

  Authored-by: suqilong <[email protected]>
  Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>

* [SPARK-33655][SQL] Improve performance of processing FETCH_PRIOR

  ### What changes were proposed in this pull request?
  Currently, when a client sends FETCH_PRIOR to Thriftserver, Thriftserver re-iterates the result from the start position. Because Thriftserver caches a query result in an array when the THRIFTSERVER_INCREMENTAL_COLLECT feature is off, FETCH_PRIOR can be implemented without re-iterating the result. A trait `FetchIterator` is added to separate the implementations for an iterator and an array. `FetchIterator` also supports moving the cursor to an absolute position, which will be useful for implementing FETCH_RELATIVE and FETCH_ABSOLUTE. (A sketch of the idea follows this message.)

  ### Why are the changes needed?
  For better Thriftserver performance.

  ### Does this PR introduce _any_ user-facing change?
  No.

  ### How was this patch tested?
  FetchIteratorSuite.

  Closes apache#30600 from Dooyoung-Hwang/refactor_with_fetch_iterator.

  Authored-by: Dooyoung Hwang <[email protected]>
  Signed-off-by: HyukjinKwon <[email protected]>

* [SPARK-33719][DOC] Add make_date/make_timestamp/make_interval into the doc of ANSI Compliance

  ### What changes were proposed in this pull request?
  Add make_date/make_timestamp/make_interval to the ANSI Compliance documentation.

  ### Why are the changes needed?
  So that users know these functions throw runtime exceptions under ANSI mode if the result is not valid.

  ### Does this PR introduce _any_ user-facing change?
  No.

  ### How was this patch tested?
  Built the doc and checked it in a browser:
  ![image](https://user-images.githubusercontent.com/1097932/101608930-34a79e80-39bb-11eb-9294-9d9b8c3f6faa.png)

  Closes apache#30683 from gengliangwang/improveDoc.

  Authored-by: Gengliang Wang <[email protected]>
  Signed-off-by: HyukjinKwon <[email protected]>

* [SPARK-33071][SPARK-33536][SQL][FOLLOW-UP] Rename deniedMetadataKeys to nonInheritableMetadataKeys in Alias

  ### What changes were proposed in this pull request?
  This PR is a follow-up of apache#30488. It renames `Alias.deniedMetadataKeys` to `Alias.nonInheritableMetadataKeys` to make the name less confusing.

  ### Why are the changes needed?
  To make the code easier to maintain and read.

  ### Does this PR introduce _any_ user-facing change?
  No. This is rather a code cleanup.

  ### How was this patch tested?
  Ran the unit tests written in the previous PR manually. Jenkins and GitHub Actions in this PR should also test them.

  Closes apache#30682 from HyukjinKwon/SPARK-33071-SPARK-33536.

  Authored-by: HyukjinKwon <[email protected]>
  Signed-off-by: HyukjinKwon <[email protected]>

* [SPARK-33722][SQL] Handle DELETE in ReplaceNullWithFalseInPredicate

  ### What changes were proposed in this pull request?
  This PR adds `DeleteFromTable` to the supported plans in `ReplaceNullWithFalseInPredicate`.

  ### Why are the changes needed?
  This change allows Spark to optimize delete conditions the same way it optimizes filters.

  ### Does this PR introduce _any_ user-facing change?
  No.

  ### How was this patch tested?
  This PR extends the existing test cases to also cover `DeleteFromTable`.

  Closes apache#30688 from aokolnychyi/spark-33722.

  Authored-by: Anton Okolnychyi <[email protected]>
  Signed-off-by: Dongjoon Hyun <[email protected]>

Co-authored-by: Kent Yao <[email protected]>
Co-authored-by: suqilong <[email protected]>
Co-authored-by: Dooyoung Hwang <[email protected]>
Co-authored-by: Gengliang Wang <[email protected]>
Co-authored-by: HyukjinKwon <[email protected]>
Co-authored-by: Anton Okolnychyi <[email protected]>
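The SPARK-33655 change touches Thriftserver files that are not shown in the diffs below. As a hedged illustration of the idea it describes, here is a minimal array-backed fetch cursor; the class name and method signatures are invented for this sketch and are not Spark's actual `FetchIterator`:

```scala
// Sketch only: when results are cached in an array, FETCH_PRIOR can move the
// cursor backwards in O(1) instead of re-iterating from the start position.
class ArrayFetchCursor[A](rows: Array[A]) {
  private var pos = 0

  // FETCH_NEXT: return up to n rows and advance the cursor.
  def fetchNext(n: Int): Seq[A] = {
    val out = rows.slice(pos, pos + n).toSeq
    pos = math.min(pos + n, rows.length)
    out
  }

  // FETCH_PRIOR: step the cursor back n rows and return that block directly.
  def fetchPrior(n: Int): Seq[A] = {
    pos = math.max(pos - n, 0)
    rows.slice(pos, math.min(pos + n, rows.length)).toSeq
  }

  // Absolute repositioning, which FETCH_ABSOLUTE/FETCH_RELATIVE could build on.
  def fetchAbsolute(p: Int): Unit = pos = math.max(0, math.min(p, rows.length))
}
```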
1 parent e77424b commit c74663e

File tree: 13 files changed, +317 −79 lines

docs/sql-migration-guide.md

Lines changed: 2 additions & 0 deletions

@@ -58,6 +58,8 @@ license: |
 
 - In Spark 3.1, creating or altering a view will capture runtime SQL configs and store them as view properties. These configs will be applied during the parsing and analysis phases of the view resolution. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.useCurrentConfigsForView` to `true`.
 
+- Since Spark 3.1, CHAR/CHARACTER and VARCHAR types are supported in the table schema. Table scan/insertion will respect the char/varchar semantic. If char/varchar is used in places other than table schema, an exception will be thrown (CAST is an exception that simply treats char/varchar as string like before). To restore the behavior before Spark 3.1, which treats them as STRING types and ignores a length parameter, e.g. `CHAR(4)`, you can set `spark.sql.legacy.charVarcharAsString` to `true`.
 
 ## Upgrading from Spark SQL 3.0 to 3.0.1
 
 - In Spark 3.0, JSON datasource and JSON function `schema_of_json` infer TimestampType from string values if they match to the pattern defined by the JSON option `timestampFormat`. Since version 3.0.1, the timestamp type inference is disabled by default. Set the JSON option `inferTimestamp` to `true` to enable such type inference.
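To make the new migration entry concrete, a small hedged sketch follows; the table name, file format, and local session setup are assumptions for illustration and are not part of this commit:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative session; any Spark 3.1+ session would do.
val spark = SparkSession.builder().master("local[*]").appName("char-demo").getOrCreate()

// Spark 3.1 semantic: CHAR(4) pads/validates values on scan and insertion.
spark.sql("CREATE TABLE t (c CHAR(4)) USING parquet")
spark.sql("INSERT INTO t VALUES ('ab')")
spark.sql("SELECT length(c) FROM t").show()  // 4: 'ab' is read back padded to 'ab  '

// Restore the pre-3.1 behavior (CHAR/VARCHAR treated as plain STRING) for
// schemas parsed after this point:
spark.conf.set("spark.sql.legacy.charVarcharAsString", "true")
```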

docs/sql-ref-ansi-compliance.md

Lines changed: 10 additions & 6 deletions

@@ -144,14 +144,18 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode (`spark.sql.ansi.enabled=true`).
 - `size`: This function returns null for null input.
-- `element_at`: This function throws `ArrayIndexOutOfBoundsException` if using invalid indices.
-- `element_at`: This function throws `NoSuchElementException` if key does not exist in map.
+- `element_at`:
+  - This function throws `ArrayIndexOutOfBoundsException` if using invalid indices.
+  - This function throws `NoSuchElementException` if key does not exist in map.
 - `elt`: This function throws `ArrayIndexOutOfBoundsException` if using invalid indices.
 - `parse_url`: This function throws `IllegalArgumentException` if an input string is not a valid url.
-- `to_date` This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
-- `to_timestamp` This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
-- `unix_timestamp` This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
-- `to_unix_timestamp` This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
+- `to_date`: This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
+- `to_timestamp`: This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
+- `unix_timestamp`: This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
+- `to_unix_timestamp`: This function should fail with an exception if the input string can't be parsed, or the pattern string is invalid.
+- `make_date`: This function should fail with an exception if the result date is invalid.
+- `make_timestamp`: This function should fail with an exception if the result timestamp is invalid.
+- `make_interval`: This function should fail with an exception if the result interval is invalid.
 
 ### SQL Operators
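To illustrate the newly documented behavior, a hedged sketch assuming an active `SparkSession` named `spark`; the exact exception class raised is version-dependent:

```scala
// ANSI mode on: invalid date parts raise a runtime exception.
spark.conf.set("spark.sql.ansi.enabled", "true")

// Valid parts return a DATE as usual.
spark.sql("SELECT make_date(2020, 12, 9)").show()  // 2020-12-09

// Month 13 is invalid: under ANSI mode this fails at runtime
// (likely a java.time.DateTimeException) instead of returning NULL.
spark.sql("SELECT make_date(2020, 13, 1)").show()
```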

resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

Lines changed: 1 addition & 1 deletion

@@ -1069,7 +1069,7 @@ private[spark] class Client(
         logError(s"Application $appId not found.")
         cleanupStagingDir()
         return YarnAppReport(YarnApplicationState.KILLED, FinalApplicationStatus.KILLED, None)
-      case NonFatal(e) =>
+      case NonFatal(e) if !e.isInstanceOf[InterruptedIOException] =>
         val msg = s"Failed to contact YARN for application $appId."
         logError(msg, e)
         // Don't necessarily clean up staging dir because status is unknown
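The hunk above relies on a guard inside the `NonFatal` extractor pattern. A self-contained sketch of that idiom, with illustrative names that are not Spark's:

```scala
import java.io.InterruptedIOException
import scala.util.control.NonFatal

// Handle every non-fatal error except InterruptedIOException, which should
// propagate so the caller can treat it as a shutdown signal rather than a
// "failed to contact YARN" style error.
def pollStatus(fetch: () => String): String =
  try {
    fetch()
  } catch {
    case NonFatal(e) if !e.isInstanceOf[InterruptedIOException] =>
      println(s"Failed to contact the cluster manager: $e")
      "UNKNOWN"
    // InterruptedIOException (and fatal errors) fall through to the caller.
  }
```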

resource-managers/yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala

Lines changed: 4 additions & 1 deletion

@@ -17,6 +17,8 @@
 
 package org.apache.spark.scheduler.cluster
 
+import java.io.InterruptedIOException
+
 import scala.collection.mutable.ArrayBuffer
 
 import org.apache.hadoop.yarn.api.records.YarnApplicationState
@@ -121,7 +123,8 @@ private[spark] class YarnClientSchedulerBackend(
         allowInterrupt = false
         sc.stop()
       } catch {
-        case e: InterruptedException => logInfo("Interrupting monitor thread")
+        case _: InterruptedException | _: InterruptedIOException =>
+          logInfo("Interrupting monitor thread")
       }
     }
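A compact sketch of why both exception types must be caught together; the `MonitorThread` and `poll` names here are illustrative, not Spark's actual classes:

```scala
import java.io.InterruptedIOException

// A monitor loop that treats both interrupt signals the same way.
// Thread.interrupt() surfaces as InterruptedException from blocking JVM calls,
// but an in-flight Hadoop RPC call surfaces the interrupt as
// InterruptedIOException instead -- both mean "we are shutting down".
class MonitorThread(poll: () => Unit) extends Thread {
  override def run(): Unit =
    try {
      while (!isInterrupted) poll()
    } catch {
      case _: InterruptedException | _: InterruptedIOException =>
        println("Interrupting monitor thread")  // graceful stop, not an error
    }
}
```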

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AliasHelper.scala

Lines changed: 1 addition & 1 deletion

@@ -90,7 +90,7 @@ trait AliasHelper {
           exprId = a.exprId,
           qualifier = a.qualifier,
           explicitMetadata = Some(a.metadata),
-          deniedMetadataKeys = a.deniedMetadataKeys)
+          nonInheritableMetadataKeys = a.nonInheritableMetadataKeys)
       case a: MultiAlias =>
         a.copy(child = trimAliases(a.child))
       case other => trimAliases(other)

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala

Lines changed: 11 additions & 7 deletions

@@ -143,14 +143,14 @@ abstract class Attribute extends LeafExpression with NamedExpression with NullIn
  * fully qualified way. Consider the examples tableName.name, subQueryAlias.name.
  * tableName and subQueryAlias are possible qualifiers.
  * @param explicitMetadata Explicit metadata associated with this alias that overwrites child's.
- * @param deniedMetadataKeys Keys of metadata entries that are supposed to be removed when
- *                           inheriting the metadata from the child.
+ * @param nonInheritableMetadataKeys Keys of metadata entries that are supposed to be removed when
+ *                                   inheriting the metadata from the child.
  */
 case class Alias(child: Expression, name: String)(
     val exprId: ExprId = NamedExpression.newExprId,
     val qualifier: Seq[String] = Seq.empty,
     val explicitMetadata: Option[Metadata] = None,
-    val deniedMetadataKeys: Seq[String] = Seq.empty)
+    val nonInheritableMetadataKeys: Seq[String] = Seq.empty)
   extends UnaryExpression with NamedExpression {
 
   // Alias(Generator, xx) need to be transformed into Generate(generator, ...)
@@ -172,7 +172,7 @@ case class Alias(child: Expression, name: String)(
     child match {
       case named: NamedExpression =>
         val builder = new MetadataBuilder().withMetadata(named.metadata)
-        deniedMetadataKeys.foreach(builder.remove)
+        nonInheritableMetadataKeys.foreach(builder.remove)
         builder.build()
 
       case _ => Metadata.empty
@@ -181,7 +181,10 @@ case class Alias(child: Expression, name: String)(
   }
 
   def newInstance(): NamedExpression =
-    Alias(child, name)(qualifier = qualifier, explicitMetadata = explicitMetadata)
+    Alias(child, name)(
+      qualifier = qualifier,
+      explicitMetadata = explicitMetadata,
+      nonInheritableMetadataKeys = nonInheritableMetadataKeys)
 
   override def toAttribute: Attribute = {
     if (resolved) {
@@ -201,7 +204,7 @@ case class Alias(child: Expression, name: String)(
   override def toString: String = s"$child AS $name#${exprId.id}$typeSuffix$delaySuffix"
 
   override protected final def otherCopyArgs: Seq[AnyRef] = {
-    exprId :: qualifier :: explicitMetadata :: deniedMetadataKeys :: Nil
+    exprId :: qualifier :: explicitMetadata :: nonInheritableMetadataKeys :: Nil
   }
 
   override def hashCode(): Int = {
@@ -212,7 +215,8 @@ case class Alias(child: Expression, name: String)(
   override def equals(other: Any): Boolean = other match {
     case a: Alias =>
       name == a.name && exprId == a.exprId && child == a.child && qualifier == a.qualifier &&
-        explicitMetadata == a.explicitMetadata && deniedMetadataKeys == a.deniedMetadataKeys
+        explicitMetadata == a.explicitMetadata &&
+        nonInheritableMetadataKeys == a.nonInheritableMetadataKeys
     case _ => false
   }
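The renamed field matters only through how `Alias` builds its metadata. A toy sketch of that inheritance rule, using a plain `Map` in place of Spark's `Metadata`/`MetadataBuilder` and an invented `Named` type:

```scala
// When copying metadata from a child, drop the entries whose keys must not
// be inherited -- the same shape as nonInheritableMetadataKeys.foreach(builder.remove).
case class Named(name: String, metadata: Map[String, String])

def aliasMetadata(child: Named, nonInheritableMetadataKeys: Seq[String]): Map[String, String] =
  nonInheritableMetadataKeys.foldLeft(child.metadata)((m, key) => m - key)

val child = Named("a", Map("__dataset_id" -> "1", "comment" -> "user column"))
// Only "comment" survives; the dataset-id marker is not inherited by the alias.
assert(aliasMetadata(child, Seq("__dataset_id")) == Map("comment" -> "user column"))
```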

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicate.scala

Lines changed: 2 additions & 1 deletion

@@ -20,7 +20,7 @@ package org.apache.spark.sql.catalyst.optimizer
 import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, CaseWhen, Expression, If}
 import org.apache.spark.sql.catalyst.expressions.{LambdaFunction, Literal, MapFilter, Or}
 import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
-import org.apache.spark.sql.catalyst.plans.logical.{Filter, Join, LogicalPlan}
+import org.apache.spark.sql.catalyst.plans.logical.{DeleteFromTable, Filter, Join, LogicalPlan}
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.types.BooleanType
 import org.apache.spark.util.Utils
@@ -53,6 +53,7 @@ object ReplaceNullWithFalseInPredicate extends Rule[LogicalPlan] {
   def apply(plan: LogicalPlan): LogicalPlan = plan transform {
     case f @ Filter(cond, _) => f.copy(condition = replaceNullWithFalse(cond))
     case j @ Join(_, _, _, Some(cond), _) => j.copy(condition = Some(replaceNullWithFalse(cond)))
+    case d @ DeleteFromTable(_, Some(cond)) => d.copy(condition = Some(replaceNullWithFalse(cond)))
     case p: LogicalPlan => p transformExpressions {
       case i @ If(pred, _, _) => i.copy(predicate = replaceNullWithFalse(pred))
       case cw @ CaseWhen(branches, _) =>
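A toy model of what `replaceNullWithFalse` does, and why applying it to `DeleteFromTable` conditions is safe: like `Filter` and `Join`, a DELETE only acts on rows where its condition evaluates to `true`, so `null` and `false` are interchangeable at the top of the condition. The expression tree below is invented for illustration; it is not Spark's:

```scala
// Minimal sketch of null-to-false replacement in a predicate position.
sealed trait Expr
case object NullLit extends Expr
case object FalseLit extends Expr
case class And(l: Expr, r: Expr) extends Expr
case class Or(l: Expr, r: Expr) extends Expr
case class Attr(name: String) extends Expr

// In a context where only `true` matters (Filter, Join, and now
// DeleteFromTable conditions), a null literal can become false.
def replaceNullWithFalse(e: Expr): Expr = e match {
  case NullLit    => FalseLit
  case And(l, r)  => And(replaceNullWithFalse(l), replaceNullWithFalse(r))
  case Or(l, r)   => Or(replaceNullWithFalse(l), replaceNullWithFalse(r))
  case other      => other
}

// DELETE FROM t WHERE b AND NULL  ==>  WHERE b AND FALSE (a no-op delete).
assert(replaceNullWithFalse(And(Attr("b"), NullLit)) == And(Attr("b"), FalseLit))
```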

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ReplaceNullWithFalseInPredicateSuite.scala

Lines changed: 22 additions & 1 deletion

@@ -24,7 +24,7 @@ import org.apache.spark.sql.catalyst.dsl.plans._
 import org.apache.spark.sql.catalyst.expressions.{And, ArrayExists, ArrayFilter, ArrayTransform, CaseWhen, Expression, GreaterThan, If, LambdaFunction, Literal, MapFilter, NamedExpression, Or, UnresolvedNamedLambdaVariable}
 import org.apache.spark.sql.catalyst.expressions.Literal.{FalseLiteral, TrueLiteral}
 import org.apache.spark.sql.catalyst.plans.{Inner, PlanTest}
-import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
+import org.apache.spark.sql.catalyst.plans.logical.{DeleteFromTable, LocalRelation, LogicalPlan}
 import org.apache.spark.sql.catalyst.rules.RuleExecutor
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.{BooleanType, IntegerType}
@@ -48,6 +48,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
   test("replace null inside filter and join conditions") {
     testFilter(originalCond = Literal(null, BooleanType), expectedCond = FalseLiteral)
     testJoin(originalCond = Literal(null, BooleanType), expectedCond = FalseLiteral)
+    testDelete(originalCond = Literal(null, BooleanType), expectedCond = FalseLiteral)
   }
 
   test("Not expected type - replaceNullWithFalse") {
@@ -64,6 +65,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       Literal(null, BooleanType))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace nulls in nested expressions in branches of If") {
@@ -73,6 +75,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       UnresolvedAttribute("b") && Literal(null, BooleanType))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace null in elseValue of CaseWhen") {
@@ -83,6 +86,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
     val expectedCond = CaseWhen(branches, FalseLiteral)
     testFilter(originalCond, expectedCond)
     testJoin(originalCond, expectedCond)
+    testDelete(originalCond, expectedCond)
   }
 
   test("replace null in branch values of CaseWhen") {
@@ -92,6 +96,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
     val originalCond = CaseWhen(branches, Literal(null))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace null in branches of If inside CaseWhen") {
@@ -108,6 +113,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
 
     testFilter(originalCond, expectedCond)
     testJoin(originalCond, expectedCond)
+    testDelete(originalCond, expectedCond)
   }
 
   test("replace null in complex CaseWhen expressions") {
@@ -127,19 +133,22 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
 
     testFilter(originalCond, expectedCond)
     testJoin(originalCond, expectedCond)
+    testDelete(originalCond, expectedCond)
   }
 
   test("replace null in Or") {
     val originalCond = Or(UnresolvedAttribute("b"), Literal(null))
     val expectedCond = UnresolvedAttribute("b")
     testFilter(originalCond, expectedCond)
     testJoin(originalCond, expectedCond)
+    testDelete(originalCond, expectedCond)
   }
 
   test("replace null in And") {
     val originalCond = And(UnresolvedAttribute("b"), Literal(null))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace nulls in nested And/Or expressions") {
@@ -148,6 +157,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       Or(Literal(null), And(Literal(null), And(UnresolvedAttribute("b"), Literal(null)))))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace null in And inside branches of If") {
@@ -157,6 +167,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       And(UnresolvedAttribute("b"), Literal(null, BooleanType)))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace null in branches of If inside And") {
@@ -168,6 +179,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
         And(FalseLiteral, UnresolvedAttribute("b"))))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace null in branches of If inside another If") {
@@ -177,13 +189,15 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       Literal(null))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("replace null in CaseWhen inside another CaseWhen") {
     val nestedCaseWhen = CaseWhen(Seq(UnresolvedAttribute("b") -> FalseLiteral), Literal(null))
     val originalCond = CaseWhen(Seq(nestedCaseWhen -> TrueLiteral), Literal(null))
     testFilter(originalCond, expectedCond = FalseLiteral)
     testJoin(originalCond, expectedCond = FalseLiteral)
+    testDelete(originalCond, expectedCond = FalseLiteral)
   }
 
   test("inability to replace null in non-boolean branches of If") {
@@ -196,6 +210,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       FalseLiteral)
     testFilter(originalCond = condition, expectedCond = condition)
     testJoin(originalCond = condition, expectedCond = condition)
+    testDelete(originalCond = condition, expectedCond = condition)
   }
 
   test("inability to replace null in non-boolean values of CaseWhen") {
@@ -210,6 +225,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
     val condition = CaseWhen(branches)
     testFilter(originalCond = condition, expectedCond = condition)
    testJoin(originalCond = condition, expectedCond = condition)
+    testDelete(originalCond = condition, expectedCond = condition)
   }
 
   test("inability to replace null in non-boolean branches of If inside another If") {
@@ -222,6 +238,7 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
       FalseLiteral)
     testFilter(originalCond = condition, expectedCond = condition)
     testJoin(originalCond = condition, expectedCond = condition)
+    testDelete(originalCond = condition, expectedCond = condition)
   }
 
   test("replace null in If used as a join condition") {
@@ -353,6 +370,10 @@ class ReplaceNullWithFalseInPredicateSuite extends PlanTest {
     test((rel, exp) => rel.select(exp), originalExpr, expectedExpr)
   }
 
+  private def testDelete(originalCond: Expression, expectedCond: Expression): Unit = {
+    test((rel, expr) => DeleteFromTable(rel, Some(expr)), originalCond, expectedCond)
+  }
+
   private def testHigherOrderFunc(
       argument: Expression,
       createExpr: (Expression, Expression) => Expression,

sql/core/src/main/scala/org/apache/spark/sql/Column.scala

Lines changed: 5 additions & 4 deletions

@@ -1164,10 +1164,11 @@ class Column(val expr: Expression) extends Logging {
    * @since 2.0.0
    */
   def name(alias: String): Column = withExpr {
-    // SPARK-33536: The Alias is no longer a column reference after converting to an attribute.
-    // These denied metadata keys are used to strip the column reference related metadata for
-    // the Alias. So it won't be caught as a column reference in DetectAmbiguousSelfJoin.
-    Alias(expr, alias)(deniedMetadataKeys = Seq(Dataset.DATASET_ID_KEY, Dataset.COL_POS_KEY))
+    // SPARK-33536: an alias is no longer a column reference. Therefore,
+    // we should not inherit the column reference related metadata in an alias
+    // so that it is not caught as a column reference in DetectAmbiguousSelfJoin.
+    Alias(expr, alias)(
+      nonInheritableMetadataKeys = Seq(Dataset.DATASET_ID_KEY, Dataset.COL_POS_KEY))
   }
 
   /**
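A short usage sketch of the public API this hunk touches, assuming an active `SparkSession` named `spark`:

```scala
// Column.name creates an Alias under the hood; after this change the alias no
// longer inherits the internal dataset-id / column-position metadata, so
// DetectAmbiguousSelfJoin won't mistake the renamed column for a raw reference.
val df = spark.range(3).toDF("id")
df.select(df.col("id").name("renamed_id")).printSchema()
// root
//  |-- renamed_id: long (nullable = false)
```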

sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala

Lines changed: 3 additions & 3 deletions

@@ -577,8 +577,8 @@ class ColumnarAlias(child: ColumnarExpression, name: String)(
     override val exprId: ExprId = NamedExpression.newExprId,
     override val qualifier: Seq[String] = Seq.empty,
     override val explicitMetadata: Option[Metadata] = None,
-    override val deniedMetadataKeys: Seq[String] = Seq.empty)
-  extends Alias(child, name)(exprId, qualifier, explicitMetadata, deniedMetadataKeys)
+    override val nonInheritableMetadataKeys: Seq[String] = Seq.empty)
+  extends Alias(child, name)(exprId, qualifier, explicitMetadata, nonInheritableMetadataKeys)
   with ColumnarExpression {
 
   override def columnarEval(batch: ColumnarBatch): Any = child.columnarEval(batch)
@@ -715,7 +715,7 @@ case class PreRuleReplaceAddWithBrokenVersion() extends Rule[SparkPlan] {
   def replaceWithColumnarExpression(exp: Expression): ColumnarExpression = exp match {
     case a: Alias =>
       new ColumnarAlias(replaceWithColumnarExpression(a.child),
-        a.name)(a.exprId, a.qualifier, a.explicitMetadata, a.deniedMetadataKeys)
+        a.name)(a.exprId, a.qualifier, a.explicitMetadata, a.nonInheritableMetadataKeys)
     case att: AttributeReference =>
       new ColumnarAttributeReference(att.name, att.dataType, att.nullable,
         att.metadata)(att.exprId, att.qualifier)
