
Commit e32c0f6

viirya authored and rxin committed
[SPARK-7299][SQL] Set precision and scale for Decimal according to JDBC metadata instead of returned BigDecimal
JIRA: https://issues.apache.org/jira/browse/SPARK-7299

When connecting to an Oracle DB through JDBC, the precision and scale of the `BigDecimal` object returned by `ResultSet.getBigDecimal` do not always match the table schema reported by `ResultSetMetaData.getPrecision` and `ResultSetMetaData.getScale`. So if you insert a value like `19999` into a column of type `NUMBER(12, 2)`, you get back a `BigDecimal` object with scale 0, while the DataFrame schema has the correct type `DecimalType(12, 2)`. As a result, after saving the DataFrame to a Parquet file and reading it back, you get the wrong result `199.99`.

Because this is reported to be a problem specific to JDBC connections with Oracle DB, it might be difficult to add a test case for it. But according to the user's test on JIRA, this change solves the problem.

Author: Liang-Chi Hsieh <[email protected]>

Closes apache#5833 from viirya/jdbc_decimal_precision and squashes the following commits:

69bc2b5 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into jdbc_decimal_precision
928f864 [Liang-Chi Hsieh] Add comments.
5f9da94 [Liang-Chi Hsieh] Set up Decimal's precision and scale according to table schema instead of returned BigDecimal.
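To see concretely why the driver-reported scale cannot be trusted, here is a minimal sketch in plain Scala (only `java.math.BigDecimal`, no Spark required; the values mirror the `NUMBER(12, 2)` example above):

```scala
import java.math.BigDecimal

// What the Oracle JDBC driver is reported to return for an inserted 19999:
// a BigDecimal with scale 0, even though the column is NUMBER(12, 2) and
// ResultSetMetaData reports precision = 12, scale = 2.
val fromDriver = new BigDecimal("19999") // unscaled value 19999, scale 0

// If the unscaled digits are later reinterpreted under the schema's scale of
// 2 (which is what happens once the row round-trips through Parquet), the
// decimal point shifts and the value is silently corrupted:
val misread = new BigDecimal(fromDriver.unscaledValue, 2)
println(misread) // 199.99 -- the wrong result from the JIRA report

// Rescaling against the metadata first preserves the value:
val rescaled = fromDriver.setScale(2) // unscaled value 1999900, scale 2
println(new BigDecimal(rescaled.unscaledValue, 2)) // 19999.00
```

This is the repair the patch applies, except that it uses Spark's own `Decimal` type, as shown in the diff below.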
1 parent 775e6f9 · commit e32c0f6

File tree

1 file changed (+19, -4 lines)


sql/core/src/main/scala/org/apache/spark/sql/jdbc/JDBCRDD.scala

Lines changed: 19 additions & 4 deletions
@@ -300,7 +300,7 @@ private[sql] class JDBCRDD(
   abstract class JDBCConversion
   case object BooleanConversion extends JDBCConversion
   case object DateConversion extends JDBCConversion
-  case object DecimalConversion extends JDBCConversion
+  case class DecimalConversion(precisionInfo: Option[(Int, Int)]) extends JDBCConversion
   case object DoubleConversion extends JDBCConversion
   case object FloatConversion extends JDBCConversion
   case object IntegerConversion extends JDBCConversion
@@ -317,8 +317,8 @@ private[sql] class JDBCRDD(
     schema.fields.map(sf => sf.dataType match {
       case BooleanType => BooleanConversion
       case DateType => DateConversion
-      case DecimalType.Unlimited => DecimalConversion
-      case DecimalType.Fixed(d) => DecimalConversion
+      case DecimalType.Unlimited => DecimalConversion(None)
+      case DecimalType.Fixed(d) => DecimalConversion(Some(d))
       case DoubleType => DoubleConversion
       case FloatType => FloatConversion
       case IntegerType => IntegerConversion
@@ -375,7 +375,22 @@ private[sql] class JDBCRDD(
         } else {
           mutableRow.update(i, null)
         }
-      case DecimalConversion =>
+      // When connecting to Oracle DB through JDBC, the precision and scale of the BigDecimal
+      // object returned by ResultSet.getBigDecimal do not always match the table schema
+      // reported by ResultSetMetaData.getPrecision and ResultSetMetaData.getScale.
+      // If a value like 19999 is inserted into a column of NUMBER(12, 2) type, you get back
+      // a BigDecimal object with scale 0, while the DataFrame schema has the correct type
+      // DecimalType(12, 2). Thus, after saving the DataFrame into a Parquet file and then
+      // retrieving it, you will get the wrong result 199.99.
+      // So the precision and scale of the Decimal must be set based on the JDBC metadata.
+      case DecimalConversion(Some((p, s))) =>
+        val decimalVal = rs.getBigDecimal(pos)
+        if (decimalVal == null) {
+          mutableRow.update(i, null)
+        } else {
+          mutableRow.update(i, Decimal(decimalVal, p, s))
+        }
+      case DecimalConversion(None) =>
         val decimalVal = rs.getBigDecimal(pos)
         if (decimalVal == null) {
           mutableRow.update(i, null)
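For reference, a rough sketch of what the new `DecimalConversion(Some((p, s)))` branch achieves (the hard-coded `BigDecimal` is a hypothetical stand-in for what `rs.getBigDecimal(pos)` would return; assumes Spark's `sql` module on the classpath):

```scala
import org.apache.spark.sql.types.Decimal

// Stand-ins for the JDBC values: the driver returns 19999 at scale 0,
// while ResultSetMetaData reports precision 12 and scale 2.
val fromJdbc = new java.math.BigDecimal("19999")
val (p, s) = (12, 2)

// Decimal(value, precision, scale) rescales the BigDecimal to the declared
// schema before the row is stored, so the unscaled digits can no longer be
// misinterpreted downstream.
val fixed = Decimal(fromJdbc, p, s)
println(fixed) // 19999.00
```

The `DecimalConversion(None)` branch keeps the old behavior for `DecimalType.Unlimited`, where there is no declared precision or scale to enforce.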
