
Commit d7b97a1

[SPARK-31166][SQL] UNION map<null, null> and other maps should not fail
### What changes were proposed in this pull request?

After #27542, `map()` returns `map<null, null>` instead of `map<string, string>`. However, this breaks queries that union `map()` with other maps. The reason is that the `TypeCoercion` rules and `Cast` treat casting a null-type map key to another type as illegal because it makes the key nullable. The cast is actually legal: a map whose key type is `NullType` must be empty. This PR fixes it.

### Why are the changes needed?

To avoid breaking queries.

### Does this PR introduce any user-facing change?

Yes. Some queries that work in 2.x now work in 3.0 as well.

### How was this patch tested?

New test.

Closes #27926 from cloud-fan/bug.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent 93088f7 commit d7b97a1

3 files changed: +13 -3 lines changed


sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala

Lines changed: 1 addition & 1 deletion
@@ -158,7 +158,7 @@ object TypeCoercion {
       }
     case (MapType(kt1, vt1, valueContainsNull1), MapType(kt2, vt2, valueContainsNull2)) =>
       findTypeFunc(kt1, kt2)
-        .filter { kt => !Cast.forceNullable(kt1, kt) && !Cast.forceNullable(kt2, kt) }
+        .filter { kt => Cast.canCastMapKeyNullSafe(kt1, kt) && Cast.canCastMapKeyNullSafe(kt2, kt) }
         .flatMap { kt =>
           findTypeFunc(vt1, vt2).map { vt =>
             MapType(kt, vt, valueContainsNull1 || valueContainsNull2 ||

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala

Lines changed: 6 additions & 2 deletions
@@ -77,8 +77,7 @@ object Cast {
       resolvableNullability(fn || forceNullable(fromType, toType), tn)

     case (MapType(fromKey, fromValue, fn), MapType(toKey, toValue, tn)) =>
-      canCast(fromKey, toKey) &&
-        (!forceNullable(fromKey, toKey)) &&
+      canCast(fromKey, toKey) && canCastMapKeyNullSafe(fromKey, toKey) &&
         canCast(fromValue, toValue) &&
         resolvableNullability(fn || forceNullable(fromValue, toValue), tn)

@@ -98,6 +97,11 @@ object Cast {
     case _ => false
   }

+  def canCastMapKeyNullSafe(fromType: DataType, toType: DataType): Boolean = {
+    // If the original map key type is NullType, it's OK as the map must be empty.
+    fromType == NullType || !forceNullable(fromType, toType)
+  }
+
   /**
    * Return true if we need to use the `timeZone` information casting `from` type to `to` type.
    * The patterns matched reflect the current implementation in the Cast node.
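
To see why the new helper changes the outcome of a union, here is a minimal, self-contained sketch that does not depend on Spark. The `DataType` objects, `findType`, and the `forceNullable` rules below are simplified stand-ins assumed for illustration, not Spark's real implementations. Only `canCastMapKeyNullSafe` and the two `filter` predicates follow the shape shown in the diffs above.

```scala
// Toy stand-ins for Spark's data types (hypothetical, for illustration only).
sealed trait DataType
case object NullType extends DataType
case object IntegerType extends DataType
case object StringType extends DataType

object MapKeyCoercionSketch {
  // Assumed simplification: a cast from NullType is reported as forcing
  // nullability, an identity cast is not, and nothing else is modeled.
  def forceNullable(from: DataType, to: DataType): Boolean = (from, to) match {
    case (NullType, _)    => true
    case (f, t) if f == t => false
    case _                => false
  }

  // Same shape as the helper added in Cast.scala: a NullType key is always
  // acceptable because a map with that key type must be empty.
  def canCastMapKeyNullSafe(from: DataType, to: DataType): Boolean =
    from == NullType || !forceNullable(from, to)

  // Hypothetical widening rule: equal types unify, NullType widens to anything.
  def findType(a: DataType, b: DataType): Option[DataType] = (a, b) match {
    case (x, y) if x == y => Some(x)
    case (NullType, t)    => Some(t)
    case (t, NullType)    => Some(t)
    case _                => None
  }

  // Key-type resolution with the predicate used before this commit.
  def oldKeyType(k1: DataType, k2: DataType): Option[DataType] =
    findType(k1, k2).filter(kt => !forceNullable(k1, kt) && !forceNullable(k2, kt))

  // Key-type resolution with the predicate introduced by this commit.
  def newKeyType(k1: DataType, k2: DataType): Option[DataType] =
    findType(k1, k2).filter { kt =>
      canCastMapKeyNullSafe(k1, kt) && canCastMapKeyNullSafe(k2, kt)
    }
}
```

Under these assumptions, `oldKeyType(NullType, IntegerType)` finds no common key type, which corresponds to the failing union, while `newKeyType(NullType, IntegerType)` resolves to `IntegerType`.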

sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala

Lines changed: 6 additions & 0 deletions
@@ -3487,6 +3487,12 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark
       }
     }
   }
+
+  test("SPARK-31166: UNION map<null, null> and other maps should not fail") {
+    checkAnswer(
+      sql("(SELECT map()) UNION ALL (SELECT map(1, 2))"),
+      Seq(Row(Map[Int, Int]()), Row(Map(1 -> 2))))
+  }
 }


 case class Foo(bar: Option[String])
