[SPARK-4573] [SQL] Add SettableStructObjectInspector support in "wrap" function #3429
Conversation
Test build #23782 has started for PR 3429 at commit
Test build #23782 has finished for PR 3429 at commit
Test FAILed.
Test build #23826 has started for PR 3429 at commit
Test build #23829 has started for PR 3429 at commit
Test build #23826 has finished for PR 3429 at commit
Test FAILed.
Test build #23829 has finished for PR 3429 at commit
Test FAILed.
Test build #23844 has started for PR 3429 at commit
Test build #23844 has finished for PR 3429 at commit
Test FAILed.
Test build #23859 has started for PR 3429 at commit
Test build #23859 has finished for PR 3429 at commit
Test PASSed.
Thanks for working on this and adding a bunch of tests. All of this is getting pretty complicated, so I think it would be good if you could add some more explanation to the Scala doc of wrap/unwrap that discusses what types of object inspectors exist and which will be returned for what type of expression.
checkValues(d, unwrap(wrap(null, toInspector(Literal(d, dt))), toInspector(Literal(d, dt))))
}

test("wrap / unwrap #6") {
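The test above exercises a roundtrip property: wrapping a Catalyst value into the representation a Hive object inspector expects, then unwrapping it again, should yield an equal value. The following is a minimal self-contained sketch of that contract; the `ToyInspector` types here are invented stand-ins for illustration, not Hive's real `ObjectInspector` API.

```scala
// Invented stand-ins for Hive's ObjectInspector machinery (sketch only).
sealed trait ToyInspector
case object JavaIntInspector extends ToyInspector     // expects a plain java.lang.Integer
case object WritableIntInspector extends ToyInspector // expects a writable-style wrapper

final case class IntWritable(var value: Int)

// wrap: Catalyst value -> whatever representation the inspector expects.
def wrap(value: Any, oi: ToyInspector): Any = oi match {
  case JavaIntInspector     => value // already a plain Java object
  case WritableIntInspector => IntWritable(value.asInstanceOf[Int])
}

// unwrap: inspector-specific representation -> Catalyst value.
def unwrap(data: Any, oi: ToyInspector): Any = oi match {
  case JavaIntInspector     => data
  case WritableIntInspector => data.asInstanceOf[IntWritable].value
}

// The property the tests check: unwrap(wrap(v, oi), oi) == v for every inspector.
def roundtrip(v: Int, oi: ToyInspector): Any = unwrap(wrap(v, oi), oi)
```

The real code has many more cases (one per primitive type, plus list, map, and struct inspectors), but each must satisfy this same roundtrip invariant.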
Test build #24046 has started for PR 3429 at commit
Test build #24046 has finished for PR 3429 at commit
Test PASSed.
Test build #24084 has started for PR 3429 at commit
Test build #24084 has finished for PR 3429 at commit
Test PASSed.
@marmbrus I've updated the Scala doc a little bit. I know it's quite complicated now, but we need to make things right (to integrate with Hive UDFs seamlessly), since bugs like #2802 depend on this. I will make another PR to refactor / improve the performance described at https://github.com/apache/spark/pull/3429/files#diff-f88c3e731fcb17b1323b778807c35b38R167 once this PR is merged.
/cc @liancheng can you help review this? I did a quick pass and this seems reasonable. @chenghao-intel is this ready to merge if @liancheng approves?
* Array[Byte]
* java.sql.Date
* java.sql.Timestamp
* Complicated Types =>
"Complicated" => "Complex"
In general this LGTM except for some minor styling comments, thanks!
Test build #24590 has started for PR 3429 at commit
Thank you @liancheng, I've updated the code per the feedback. @marmbrus I think this PR is ready to be merged once Jenkins agrees too.
Test build #24590 has finished for PR 3429 at commit
Test PASSed.
Thanks! Merged to master.
case x: ByteObjectInspector if x.preferWritable() => x.get(data)
case x: HiveDecimalObjectInspector => HiveShim.toCatalystDecimal(x, data)
case x: BinaryObjectInspector if x.preferWritable() =>
  x.getPrimitiveWritableObject(data).copyBytes()
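The `preferWritable()` guards above decide which accessor path `unwrap` takes: inspectors that prefer writables hand back lazily deserialized writable objects, while the others hand back plain Java objects. A self-contained toy of that branching (class and method names here are invented for the sketch; the real inspectors live in Hive's serde2 package):

```scala
// Invented stand-in for a writable-wrapped byte (sketch only).
final case class ToyByteWritable(value: Byte)

// Invented stand-in for Hive's ByteObjectInspector.
final class ToyByteInspector(val preferWritable: Boolean) {
  // Writable path: read the primitive straight out of the writable
  // wrapper without materializing a boxed java.lang.Byte.
  def get(data: Any): Byte = data.asInstanceOf[ToyByteWritable].value
  // Plain-Java path: the data already is a boxed byte.
  def getPrimitiveJavaObject(data: Any): Byte = data.asInstanceOf[Byte]
}

// Mirrors the shape of the match cases above: branch on preferWritable
// to pick the accessor appropriate for the data's actual representation.
def unwrapByte(data: Any, x: ToyByteInspector): Byte =
  if (x.preferWritable) x.get(data)
  else x.getPrimitiveJavaObject(data)
```

Getting this branch wrong (e.g. calling the plain-Java accessor on writable-backed data) is exactly the class of bug the extra match cases in this PR guard against.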
It looks like this PR may have caused a build break for the hadoop1.0 profile:
[warn] Note: Recompile with -Xlint:unchecked for details.
[info] Compiling 21 Scala sources and 1 Java source to /home/jenkins/workspace/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE/hadoop1.0/label/centos/sql/hive/target/scala-2.10/classes...
[error] /home/jenkins/workspace/Spark-Master-SBT/AMPLAB_JENKINS_BUILD_PROFILE/hadoop1.0/label/centos/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala:300: value copyBytes is not a member of org.apache.hadoop.io.BytesWritable
[error] x.getPrimitiveWritableObject(data).copyBytes()
[error] ^
[error] one error found
[error] (hive/compile:compile) Compilation failed
[error] Total time: 117 s, completed Dec 18, 2014 8:51:43 PM
[error] Got a return code of 1 on line 155 of the run-tests script.
Build step 'Execute shell' marked build as failure
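The failure is that `BytesWritable.copyBytes()` does not exist in the older Hadoop 1.x API. A portable way to get the same result, and roughly the shape a hotfix could take, is to copy the valid prefix manually via `getBytes`/`getLength`, which exist in both API generations. Sketched below against a minimal stand-in class rather than the real `org.apache.hadoop.io.BytesWritable`:

```scala
import java.util.Arrays

// Minimal stand-in for org.apache.hadoop.io.BytesWritable: like the real
// class, the array returned by getBytes may be longer than the valid data.
final class FakeBytesWritable(data: Array[Byte]) {
  private val backing = Arrays.copyOf(data, data.length + 8) // padded buffer
  def getBytes: Array[Byte] = backing
  def getLength: Int = data.length
}

// copyBytes() equivalent that also compiles against Hadoop 1.x:
// copy exactly getLength bytes out of the (possibly padded) buffer.
def copyBytesCompat(bw: FakeBytesWritable): Array[Byte] =
  Arrays.copyOf(bw.getBytes, bw.getLength)
```

Returning `getBytes` directly would be wrong here, since the backing buffer can contain stale padding past `getLength`; the explicit copy is what `copyBytes()` does internally on newer Hadoop versions.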
Since this didn't break the pull request builder and it's nighttime now (so we're probably not merging tons of stuff), I'm going to hold off on reverting this for a little bit to see if we can come up with a quick hotfix. Otherwise, I'll revert this commit when I get up tomorrow.
Thank you @JoshRosen , I've created #3742 for this hot fix.
I've merged that PR, so the build should be fixed. Thanks!
Since #3429 has been merged, the bug of wrapping to Writable for HiveGenericUDF is resolved, and we can safely remove the foldable check in `HiveGenericUdf.eval` that was discussed in #2802. Author: Cheng Hao <[email protected]> Closes #3745 from chenghao-intel/generic_udf and squashes the following commits: 622ad03 [Cheng Hao] Remove the unnecessary code change in Generic UDF
A Hive UDAF may create a customized object constructed by a SettableStructObjectInspector; supporting this is critical when integrating Hive UDAFs with the refactored UDAF interface.
There is a performance issue in wrap/unwrap now that more match cases have been added; that will be addressed in another PR.
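The core of the change is that `wrap` must also handle settable struct inspectors: rather than assuming a struct is a plain `java.util.List` of field values, it asks the inspector to create its own struct object and then sets each field through the inspector. A self-contained toy of that shape follows; the trait and classes are invented for illustration, though Hive's real `SettableStructObjectInspector` exposes a similar `create`/`setStructFieldData` contract.

```scala
import scala.collection.mutable

// Invented miniature of the settable-struct-inspector contract.
trait ToySettableStructOI {
  def create(): AnyRef                                  // inspector-owned struct object
  def setField(struct: AnyRef, name: String, v: Any): AnyRef
}

// One concrete inspector whose "struct" happens to be a mutable map.
object MapBackedStructOI extends ToySettableStructOI {
  def create(): AnyRef = mutable.Map.empty[String, Any]
  def setField(struct: AnyRef, name: String, v: Any): AnyRef = {
    struct.asInstanceOf[mutable.Map[String, Any]].update(name, v)
    struct
  }
}

// wrap for structs: let the settable inspector build and populate its own
// object instead of handing it a representation it may not understand.
def wrapStruct(fields: Seq[(String, Any)], oi: ToySettableStructOI): AnyRef =
  fields.foldLeft(oi.create()) { case (s, (name, v)) => oi.setField(s, name, v) }
```

This is why the UDAF case matters: a UDAF's aggregation buffer is an inspector-defined object, so only the inspector itself knows how to construct and mutate it.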