[SPARK-8148] Do not use FloatType in partition column inference. #6692

rxin · 2015-06-07T19:50:40Z

Use DoubleType instead to be more stable and robust.
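
To illustrate the robustness concern, here is a minimal sketch that is not part of the patch and does not use Spark. It assumes a hypothetical partition value `0.1` and emulates single-precision parsing with Python's `struct` module; the helper name `to_float32` is made up for this example.

```python
import struct


def to_float32(value: str) -> float:
    """Parse a string, then round it to IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


raw = "0.1"  # hypothetical partition directory value, e.g. ratio=0.1

as_double = float(raw)      # what a DoubleType column would hold
as_float = to_float32(raw)  # what a FloatType column would hold

# The double compares equal to the literal a user would write in a filter;
# the single-precision value does not once it is widened back to a double.
print(as_double == 0.1)
print(as_float == 0.1)
print(repr(as_float))

# Single precision also cannot represent every integer above 2**24.
print(to_float32("16777217") == 16777217.0)
```

A value inferred at single precision no longer matches the decimal text in the directory name when compared against a double literal, which is the kind of surprise that inferring DoubleType avoids.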

rxin · 2015-06-07T19:50:46Z

rxin · 2015-06-07T19:56:55Z

The other thing we should consider, although I'm less sure about, is whether we should skip IntegerType and go straight to LongType.
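
A rough sketch of what that choice would mean, assuming inference tries the narrowest type first and falls back to wider ones. This is not Spark's actual implementation; the function name `infer_partition_type` and the `skip_integer` flag are hypothetical, and the type names are returned as plain strings.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1


def infer_partition_type(raw: str, skip_integer: bool = False) -> str:
    """Return the name of the narrowest type that can hold `raw`.

    With skip_integer=True, every integral value is reported as LongType.
    """
    try:
        n = int(raw)
    except ValueError:
        n = None

    if n is not None:
        if not skip_integer and INT_MIN <= n <= INT_MAX:
            return "IntegerType"
        if LONG_MIN <= n <= LONG_MAX:
            return "LongType"

    # Not an integer (or too large for a long): try a double, never a float.
    try:
        float(raw)
        return "DoubleType"
    except ValueError:
        return "StringType"


print(infer_partition_type("2015"))                     # IntegerType
print(infer_partition_type("2015", skip_integer=True))  # LongType
print(infer_partition_type("3000000000"))               # LongType
print(infer_partition_type("1.5"))                      # DoubleType
```

One plausible motivation (an assumption, not stated in this thread) is that with `skip_integer=True` the inferred type of an integral column stays the same when later partition values exceed the 32-bit range.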

SparkQA · 2015-06-07T21:36:46Z

Test build #34396 has finished for PR 6692 at commit f88b4e2.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-06-07T23:35:39Z

Test build #34398 has finished for PR 6692 at commit 0ac8c31.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class GeneratedExpressionCode(var code: Code, var isNull: Term, var primitive: Term)
- class CodeGenContext
- case class Pow(left: Expression, right: Expression)
- case class Rint(child: Expression) extends UnaryMathExpression(math.rint, "ROUND")
- case class ToDegrees(child: Expression) extends UnaryMathExpression(math.toDegrees, "DEGREES")
- case class ToRadians(child: Expression) extends UnaryMathExpression(math.toRadians, "RADIANS")

liancheng · 2015-06-08T05:32:18Z

I'm worried that skipping FloatType (and possibly IntegerType) might break existing user code, because the partition column data types would change. This is especially a concern when the inferred schema is persisted in places like Parquet file metadata and the metastore.

rxin · 2015-06-08T06:10:18Z

If it is persisted in the metastore, then inference is no longer used, is it?

And why would partition columns be stored in Parquet file metadata?

liancheng · 2015-06-08T10:22:55Z

Had an offline discussion with @rxin. There can be rare corner cases where compatibility issues may arise. However, no longer using FloatType in this case can eliminate these corner cases. So we would like to have this and add a note in the release notes.

SparkQA · 2015-06-08T10:30:04Z

Test build #34419 has finished for PR 6692 at commit 4fd761f.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-06-08T20:01:32Z

Test build #34457 has finished for PR 6692 at commit 6742ecc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-06-08T21:02:22Z

Test build #891 timed out for PR 6692 at commit 4fd761f after a configured wait of 175m.

Use DoubleType instead to be more stable and robust.

Author: Reynold Xin <[email protected]>

Closes apache#6692 from rxin/SPARK-8148 and squashes the following commits:

6742ecc [Reynold Xin] [SPARK-8148] Do not use FloatType in partition column inference.

rxin mentioned this pull request Jun 7, 2015

[SPARK-8117] [SQL] Push codegen implementation into each Expression #6690

Closed

rxin force-pushed the SPARK-8148 branch from f88b4e2 to 0ac8c31 Compare June 7, 2015 21:14

rxin force-pushed the SPARK-8148 branch from 0ac8c31 to 4fd761f Compare June 8, 2015 08:03

[SPARK-8148] Do not use FloatType in partition column inference.

6742ecc

Use DoubleType instead to be more stable and robust.

rxin force-pushed the SPARK-8148 branch from 4fd761f to 6742ecc Compare June 8, 2015 18:16

asfgit closed this in 5185389 Jun 8, 2015
