[SPARK-12258] [SQL] passing null into ScalaUDF #10259
Conversation
test this please |
Thank you @davies! I guess we might still have a bug in the code. As long as any input variable is null, the return result is null. For example:
test("SPARK-12258 UDF and Null value") {
  hiveContext.runSqlHive("CREATE TABLE test (ti TINYINT, si SMALLINT, i INT, bi BIGINT, " +
    "bo BOOLEAN, f FLOAT, d DOUBLE, s STRING, bin BINARY, t TIMESTAMP, da DATE)" +
    "STORED AS TEXTFILE")
  hiveContext.runSqlHive("INSERT INTO TABLE test VALUES(Null, Null, 3, Null, Null, " +
    "Null, Null, Null, Null, Null, Null)")
  hiveContext.udf.register("typeNullCheck",
    (ti: Byte, si: Short, i: Int, bi: Long, bo: Boolean, f: Float, d: Double, s: String,
      bin: Array[Byte], t: Timestamp, da: Date) =>
      (ti, si, i, bi, bo, f, d, s, bin, t, da))
  checkAnswer(
    sql("SELECT typeNullCheck(ti, si, i, bi, bo, f, d, s, bin, t, da) FROM test"),
    Row(null, null, 3, null, null, null, null, null, null, null, null))
}
This is caused by our Analyzer rule. Below is the physical plan:
|
Does |
Yeah. Below is the result of sql("SELECT * FROM test").show();
|
Was the analyzer rule introduced by #9770? |
Yeah. That is my understanding.
|
@gatorsmile Because your UDF uses primitive types, we have no chance to pass |
OK, I see your point. This is a possible workaround. It works well when the input values of primitive types are not null, even if the input values of the Date and Timestamp columns are null. I am just concerned about how users would know this. We might see more related JIRAs in the future. Is it documented? Or is there any way to avoid it? |
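To make the primitive-versus-null point above concrete, here is a minimal plain-Scala sketch with no Spark dependency. The names `PrimitiveNullSketch`, `addOnePrimitive`, `addOneBoxed`, and `nullGuarded` are hypothetical and exist only for illustration; `nullGuarded` is an assumption about the general shape of a null check placed in front of a primitive-typed function, not Spark's actual analyzer rule.

```scala
object PrimitiveNullSketch {
  // Primitive parameter: Scala's Int maps to a JVM int, which has no null value.
  val addOnePrimitive: Int => Int = i => i + 1

  // Boxed parameter: java.lang.Integer is a reference type, so the function
  // itself can observe null and decide what to return.
  val addOneBoxed: java.lang.Integer => java.lang.Integer =
    i => if (i == null) null else Integer.valueOf(i.intValue + 1)

  // Hypothetical guard (not Spark code): check for null before calling a
  // primitive-typed function, and return null instead of invoking it.
  def nullGuarded(f: Int => Int)(input: Any): Any =
    if (input == null) null else f(input.asInstanceOf[Int])

  def main(args: Array[String]): Unit = {
    val missing: Any = null
    // Unboxing null to Int does not fail; it silently yields the default 0.
    println(missing.asInstanceOf[Int])
    // With the guard, a null input yields null and the function is not called.
    println(nullGuarded(addOnePrimitive)(missing))
    println(nullGuarded(addOnePrimitive)(3))
    // The boxed variant receives the null directly.
    println(addOneBoxed(null))
  }
}
```

Without some guard, a null column value would reach a primitive parameter as a default value such as 0, which is why a function declared with boxed parameter types is the only variant that can see the null itself.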
I tried use @davies how about we update our doc (the |
LGTM pending jenkins. |
LGTM |
Test build #47552 has finished for PR 10259 at commit
|
Check nullability and passing them into ScalaUDF.
Closes #10249
Author: Davies Liu <[email protected]>
Closes #10259 from davies/udf_null.
(cherry picked from commit b1b4ee7)
Signed-off-by: Yin Huai <[email protected]>
Sorry, but after this I am now seeing codeGen errors. Like this:
|
Hi @markhamstra, can you share your test code so that we can reproduce it? Thanks! |
@markhamstra Does #10266 fix it? |
@davies No -- see the other PR. |
This is a follow-up PR for #10259
Author: Davies Liu <[email protected]>
Closes #10266 from davies/null_udf2.
(cherry picked from commit c119a34)
Signed-off-by: Davies Liu <[email protected]>
This is a follow-up PR for #10259
Author: Davies Liu <[email protected]>
Closes #10266 from davies/null_udf2.
Check nullability and passing them into ScalaUDF.
Closes #10249