[SPARK-8501] [SQL] Avoids reading schema from empty ORC files #7199

liancheng · 2015-07-02T22:00:55Z

ORC writes an empty schema (struct<>) to ORC files containing zero rows. This is fine for Hive, since the table schema is managed by the metastore. It causes trouble when reading raw ORC files via Spark SQL, because we have to discover the schema from the files themselves.

Note that the ORC data source always avoids writing empty ORC files, but reading Hive tables that contain empty part-files is still problematic.
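
To illustrate the idea, here is a minimal sketch of skipping empty part-files during schema discovery. It is not the code in this patch: the function name `discover_schema`, the part-file names, and the assumption that each file's schema is already available as a type string are all hypothetical, chosen only to keep the example self-contained.

```python
from typing import Iterable, Optional, Tuple

# The schema string ORC records for a file that contains zero rows.
EMPTY_ORC_SCHEMA = "struct<>"


def discover_schema(part_files: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the schema of the first part-file whose schema is not empty.

    `part_files` yields hypothetical (path, schema_string) pairs. A real
    reader would obtain the schema string from each ORC file's footer.
    Returns None when every file is empty, so the caller can decide how
    to handle a table without any discoverable schema.
    """
    for _path, schema in part_files:
        # Ignore whitespace so "struct< >" is also treated as empty.
        if schema.replace(" ", "") != EMPTY_ORC_SCHEMA:
            return schema
    return None


# Hypothetical part-files: the first one is empty and must be skipped.
files = [
    ("part-00000", "struct<>"),
    ("part-00001", "struct<id:int,name:string>"),
]
print(discover_schema(files))
```

Returning None rather than the empty schema keeps "no usable file found" distinct from a genuine schema, which is the distinction the empty part-files blur.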

liancheng · 2015-07-02T22:04:47Z

cc @yhuai @zhzhan

SparkQA · 2015-07-02T23:44:59Z

Test build #36437 has finished for PR 7199 at commit a290221.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-07-03T03:01:23Z

Test build #36456 has finished for PR 7199 at commit ad5b0ae.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2015-07-03T03:20:42Z

Test build #36459 has finished for PR 7199 at commit bb8cd95.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

liancheng · 2015-07-03T04:29:47Z

Merging to master. This PR is backported to branch-1.4 by #7200.

…rt to 1.4)

This PR backports #7199 to branch-1.4

Author: Cheng Lian <[email protected]>

Closes #7200 from liancheng/spark-8501-for-1.4 and squashes the following commits:

725e9e3 [Cheng Lian] Addresses comments
0fa25af [Cheng Lian] Avoids reading schema from empty ORC files

Avoids reading schema from empty ORC files

a290221

liancheng force-pushed the spark-8501 branch from c3a4623 to a290221 Compare July 2, 2015 22:04

liancheng mentioned this pull request Jul 2, 2015

[SPARK-8501] [SQL] Avoids reading schema from empty ORC files (backport to 1.4) #7200

Closed

Addresses comments

bb8cd95

liancheng force-pushed the spark-8501 branch from ad5b0ae to bb8cd95 Compare July 3, 2015 01:51

asfgit closed this in 20a4d7d Jul 3, 2015

liancheng deleted the spark-8501 branch September 27, 2016 14:25
