
[SPARK-4358][SQL] Let BigDecimal do checking type compatibility #3208

Closed
wants to merge 5 commits into from

Conversation

viirya
Member

@viirya viirya commented Nov 11, 2014

Remove the hardcoded max and min values for numeric types, and let BigDecimal check type compatibility instead.

@AmplabJenkins

Can one of the admins verify this patch?

@viirya viirya changed the title [SQL][Minor] Let BigDecimal do checking type compatibility [SPARK-4358][SQL] Let BigDecimal do checking type compatibility and use more specified numeric types Nov 12, 2014
@viirya
Member Author

viirya commented Nov 12, 2014

When parsing a NumericLiteral, use more specific numeric types, including Byte and Short, which may improve memory efficiency slightly.

case v if bigIntValue.isValidShort => v.toShortExact
case v if bigIntValue.isValidInt => v.toIntExact
case v if bigIntValue.isValidLong => v.toLongExact
case v => v
Contributor

+1 this seems like a cleaner way to do this.

Contributor

Okay, sorry, I realize I initially said this was a good idea. Thinking about it further, though, I'm not sure this is actually something we want to do. The memory benefits of picking the smallest possible number representation don't really seem to outweigh the added complexity of suddenly having to deal with bytes everywhere.

Are there any other SQL systems that do this?

To be clear, I am in favor of using BigDecimal's isValidX methods instead of our hand-coded checks for int/long/bigdecimal.
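For readers following the thread, here is a minimal sketch of the trade-off being discussed: manual bounds comparisons versus the `isValidInt` and `isValidLong` predicates on `scala.math.BigDecimal`. The object name `RangeChecks` and the helpers `fitsIntByHand` and `fitsLongByHand` are hypothetical and written only for comparison; they are not the code this PR removes.

```scala
object RangeChecks {
  // Hand-coded bounds checks, written only for comparison (not Spark's code).
  def fitsIntByHand(v: BigDecimal): Boolean =
    v.isWhole && v >= BigDecimal(Int.MinValue) && v <= BigDecimal(Int.MaxValue)

  def fitsLongByHand(v: BigDecimal): Boolean =
    v.isWhole && v >= BigDecimal(Long.MinValue) && v <= BigDecimal(Long.MaxValue)

  def main(args: Array[String]): Unit = {
    val samples = Seq(
      "0", "2147483647", "2147483648",
      "-9223372036854775808", "9223372036854775808", "1.5"
    ).map(s => BigDecimal(s))

    for (v <- samples) {
      // The built-in predicates should agree with the manual bounds checks.
      assert(fitsIntByHand(v) == v.isValidInt)
      assert(fitsLongByHand(v) == v.isValidLong)
      println(s"$v: int=${v.isValidInt}, long=${v.isValidLong}")
    }
  }
}
```

The built-in predicates keep the bounds in one place (the standard library), which is the cleanup the reviewer supports regardless of the Byte/Short question.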

Member Author

I have no idea whether any other SQL systems do this. As far as I can see, the only extra complexity it adds is the type casting in GetItem and Substring. But I cannot tell whether the memory benefits are worth it; that may depend on how often the type casting happens. If the time cost of that casting is a big concern here, I can remove the byte and short types. Please let me know your suggestion. Thanks.

Member Author

Recently I debugged a few bugs in the Presto database, so I looked at how Presto treats integer literals. I found that it just uses Long to represent all number types. Although that does not mean all SQL systems do this, I think it can serve as a reference here.

So I will remove the byte and short types from this PR so that it can be merged. Thanks.
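A rough sketch of what the narrowing looks like once Byte and Short are dropped, assuming an integral literal arrives as text. The `Literal` ADT, the `narrow` helper, and the object name `LiteralNarrowing` are hypothetical illustrations, not Spark's parser code; only `isValidInt`, `isValidLong`, `toIntExact`, and `toLongExact` come from `scala.math.BigDecimal`.

```scala
object LiteralNarrowing {
  // Hypothetical result type for illustration; Spark uses its own Literal class.
  sealed trait Literal
  case class IntLit(value: Int) extends Literal
  case class LongLit(value: Long) extends Literal
  case class DecimalLit(value: BigDecimal) extends Literal

  // Pick Int when the value fits, then Long, otherwise keep the BigDecimal.
  def narrow(text: String): Literal = {
    val v = BigDecimal(text)
    if (v.isValidInt) IntLit(v.toIntExact)
    else if (v.isValidLong) LongLit(v.toLongExact)
    else DecimalLit(v)
  }

  def main(args: Array[String]): Unit = {
    Seq("42", "2147483648", "9223372036854775808").foreach { s =>
      println(s"$s -> ${narrow(s)}")
    }
  }
}
```

With only three target types, downstream expressions never see Byte or Short literals, so no extra casts are needed for them.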

@marmbrus
Contributor

Thanks for working on this!

@@ -460,6 +460,20 @@ trait HiveTypeCoercion {
// Skip nodes who's children have not been resolved yet.
case e if !e.childrenResolved => e

case g @ GetItem(c, o @ IntegralType()) if o.dataType != IntegerType =>
GetItem(c, Cast(o, IntegerType))
Contributor

Indentation is off (2 spaces from case).

Member Author

Thanks. It is fixed.
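To illustrate the kind of rule in the quoted hunk, here is a self-contained toy version: any integral ordinal that is not already an Int gets wrapped in a cast. Everything here (the `CoercionSketch` object, the simplified `DataType` and `Expr` hierarchies, and `coerceOrdinal`) is a stand-in invented for this sketch; the names mirror Catalyst's but the definitions are not Spark's.

```scala
object CoercionSketch {
  // Simplified stand-ins for Catalyst's data types.
  sealed trait DataType
  case object ByteType extends DataType
  case object ShortType extends DataType
  case object IntegerType extends DataType
  case object LongType extends DataType
  case object StringType extends DataType
  case class ArrayType(element: DataType) extends DataType

  val integral: Set[DataType] = Set(ByteType, ShortType, IntegerType, LongType)

  // Simplified stand-ins for Catalyst's expressions.
  sealed trait Expr { def dataType: DataType }
  case class Attr(name: String, dataType: DataType) extends Expr
  case class Cast(child: Expr, dataType: DataType) extends Expr
  case class GetItem(child: Expr, ordinal: Expr) extends Expr {
    def dataType: DataType = child.dataType match {
      case ArrayType(element) => element
      case other => other
    }
  }

  // Wrap a non-Int integral ordinal in a cast to IntegerType; leave the rest alone.
  def coerceOrdinal(e: Expr): Expr = e match {
    case GetItem(c, o) if integral(o.dataType) && o.dataType != IntegerType =>
      GetItem(c, Cast(o, IntegerType))
    case other => other
  }
}
```

A rule like this is the extra complexity discussed earlier in the thread: once literals can be Byte or Short, every expression expecting an Int index needs a cast inserted.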

@viirya viirya changed the title [SPARK-4358][SQL] Let BigDecimal do checking type compatibility and use more specified numeric types [SPARK-4358][SQL] Let BigDecimal do checking type compatibility Nov 24, 2014
@viirya
Member Author

viirya commented Nov 27, 2014

Hi @marmbrus, is this OK to be merged?

@marmbrus
Contributor

marmbrus commented Dec 1, 2014

Thanks! Merged to master.

@asfgit asfgit closed this in b57365a Dec 1, 2014
asfgit pushed a commit that referenced this pull request Dec 1, 2014
Remove hardcoding max and min values for types. Let BigDecimal do checking type compatibility.

Author: Liang-Chi Hsieh

Closes #3208 from viirya/more_numericLit and squashes the following commits:

e9834b4 [Liang-Chi Hsieh] Remove byte and short types for number literal.
1bd1825 [Liang-Chi Hsieh] Fix Indentation and make the modification clearer.
cf1a997 [Liang-Chi Hsieh] Modified for comment to add a rule of analysis that adds a cast.
91fe489 [Liang-Chi Hsieh] add Byte and Short.
1bdc69d [Liang-Chi Hsieh] Let BigDecimal do checking type compatibility.

(cherry picked from commit b57365a)
Signed-off-by: Michael Armbrust
@viirya viirya deleted the more_numericLit branch December 27, 2023 18:16
3 participants