-
Notifications
You must be signed in to change notification settings - Fork 28.7k
[SPARK-4358][SQL] Let BigDecimal do checking type compatibility #3208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Can one of the admins verify this patch? |
When parsing NumericLiteral, using more specified numeric types including Byte, Short that may improve memory efficiency slightly. |
case v if bigIntValue.isValidShort => v.toShortExact | ||
case v if bigIntValue.isValidInt => v.toIntExact | ||
case v if bigIntValue.isValidLong => v.toLongExact | ||
case v => v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 this seems like a cleaner way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, sorry, I realize I initially said this was a good idea. Thinking about it further though I'm not sure if this is actually something we want to do. The memory benefits of picking the smallest possible number representation don't really seem to outweigh the added complexity of having to deal with bytes everywhere all of a sudden.
Are there any other SQL systems that do this?
To be clear I am in favor of using BigDecimal's isValidX
instead of our hand coded checking for int/long/bigdecimal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea if any other SQL systems do like this. The only extra complexity it adds is the type casting in GetItem
and Substring
as I see. But I can not tell if the memory benefits are worth doing this. It may depend on how often the type casting is happened. If the time complexity issue is a big concern here, I can remove the byte and short types. Please let me know your suggestion. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recently I have debugged few bugs in Presto database. So I tried to look at how Presto treats the integer literal. I found that it just uses Long to represent all number types. Although it does not mean all SQL systems doing this, I think that it can be a reference here.
So I will remove the byte and short types in this PR and let this PR can be merged then. Thanks.
Thanks for working on this! |
@@ -460,6 +460,20 @@ trait HiveTypeCoercion { | |||
// Skip nodes who's children have not been resolved yet. | |||
case e if !e.childrenResolved => e | |||
|
|||
case g @ GetItem(c, o @ IntegralType()) if o.dataType != IntegerType => | |||
GetItem(c, Cast(o, IntegerType)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indentation is off (2 spaces from case
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. It is fixed.
Hi @marmbrus Is this ok to be merged? |
Thanks! Merged to master. |
Remove hardcoding max and min values for types. Let BigDecimal do checking type compatibility. Author: Liang-Chi Hsieh <[email protected]> Closes #3208 from viirya/more_numericLit and squashes the following commits: e9834b4 [Liang-Chi Hsieh] Remove byte and short types for number literal. 1bd1825 [Liang-Chi Hsieh] Fix Indentation and make the modification clearer. cf1a997 [Liang-Chi Hsieh] Modified for comment to add a rule of analysis that adds a cast. 91fe489 [Liang-Chi Hsieh] add Byte and Short. 1bdc69d [Liang-Chi Hsieh] Let BigDecimal do checking type compatibility. (cherry picked from commit b57365a) Signed-off-by: Michael Armbrust <[email protected]>
Remove hardcoding max and min values for types. Let BigDecimal do checking type compatibility.