Support insert into bucketed unpartitioned Hive table#1127

Merged
kokosing merged 1 commit into
trinodb:masterfrom
kokosing:origin/master/141_insert_into_unpartitioned_bucketed_table
Aug 19, 2019

@kokosing

Member

Support insert into bucketed unpartitioned Hive table

@cla-bot cla-bot Bot added the cla-signed label Jul 15, 2019
@kokosing

Member Author

@electrum @findepi I naively disabled the state check and ran some manual tests, and it seems to be working. What am I missing? What else should I test?

presto:default> show create table b;
         Create Table
-------------------------------
 CREATE TABLE hive.default.b (
    i integer
 )
 WITH (
    bucket_count = 5,
    bucketed_by = ARRAY['i'],
    format = 'ORC',
    sorted_by = ARRAY[]
 )
(1 row)

Query 20190715_140514_00026_vgg79, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

presto:default> select count(*) from b;
 _col0
-------
    19
(1 row)

Query 20190715_140517_00027_vgg79, FINISHED, 1 node
Splits: 47 total, 47 done (100.00%)
0:01 [19 rows, 4.9KB] [32 rows/s, 8.38KB/s]

presto:default> insert into b select * from b;
INSERT: 19 rows

Query 20190715_140525_00028_vgg79, FINISHED, 1 node
Splits: 48 total, 48 done (100.00%)
0:02 [19 rows, 4.9KB] [9 rows/s, 2.45KB/s]

presto:default> select count(*) from b;
 _col0
-------
    38
(1 row)

Query 20190715_140529_00029_vgg79, FINISHED, 1 node
Splits: 52 total, 52 done (100.00%)
0:01 [38 rows, 6.19KB] [45 rows/s, 7.39KB/s]
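
For context on why inserts into a bucketed table need care: each row has to end up in the file for its bucket. The sketch below is an illustration only, not code from this PR. It assumes Hive's original bucketing rule, where an integer column hashes to its own value and the bucket is `(hash & Integer.MAX_VALUE) % bucket_count`; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the HiveWriterFactory code from this PR.
class BucketSketch {
    // Assumed rule: an INTEGER column hashes to its own value, and the
    // bucket is the non-negative hash modulo the bucket count.
    static int bucketFor(int value, int bucketCount) {
        return (value & Integer.MAX_VALUE) % bucketCount;
    }

    // Count how many of the given values fall into each bucket.
    static Map<Integer, Integer> rowsPerBucket(int[] values, int bucketCount) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int value : values) {
            counts.merge(bucketFor(value, bucketCount), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // bucket_count = 5 matches the table definition shown above;
        // the sample values are made up for the illustration.
        System.out.println(rowsPerBucket(new int[] {0, 1, 5, 7, 12}, 5));
    }
}
```

Under this assumption, re-inserting the same rows (as in `insert into b select * from b`) maps every row to the bucket it was read from, so the row count doubles while the bucket layout stays consistent.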

@kokosing kokosing added the WIP label Jul 15, 2019
@kokosing

Member Author

Fixes #1088

@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from b52f12d to 168b347 Compare July 15, 2019 18:48
@sopel39

sopel39 commented Jul 16, 2019

Member

There is another, "similar" check:

                if (bucketNumber.isPresent()) {
                    throw new PrestoException(HIVE_PARTITION_READ_ONLY, "Cannot insert into existing partition of bucketed Hive table: " + partitionName.get());
                }

Comment thread presto-hive/src/main/java/io/prestosql/plugin/hive/HiveWriterFactory.java Outdated
@kokosing

Member Author

There is another, "similar" check:

               if (bucketNumber.isPresent()) {
                   throw new PrestoException(HIVE_PARTITION_READ_ONLY, "Cannot insert into existing partition of bucketed Hive table: " + partitionName.get());
               }

Yes, this is about partitioned and bucketed tables (a bit out of scope for this PR, but strongly related).
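
To make the difference between the two cases concrete, here is a minimal sketch. It is not the code from HiveWriterFactory: the class, the methods, and the exception type are hypothetical stand-ins. It assumes the behavior discussed in this thread, namely that an insert into an unpartitioned bucketed table is allowed, while an insert into an existing partition of a bucketed table is still rejected.

```java
import java.util.Optional;
import java.util.OptionalInt;

// Hypothetical stand-in for the checks discussed above; names are illustrative.
class InsertGuardSketch {
    static void checkInsert(Optional<String> partitionName, OptionalInt bucketNumber, boolean partitionExists) {
        if (!partitionName.isPresent()) {
            // Unpartitioned table: bucketed or not, the insert proceeds.
            // This is the case the PR is about.
            return;
        }
        if (partitionExists && bucketNumber.isPresent()) {
            // Mirrors the quoted check for partitioned and bucketed tables.
            throw new IllegalStateException(
                    "Cannot insert into existing partition of bucketed Hive table: " + partitionName.get());
        }
    }

    // Convenience wrapper that reports the outcome as a boolean.
    static boolean isAllowed(Optional<String> partitionName, OptionalInt bucketNumber, boolean partitionExists) {
        try {
            checkInsert(partitionName, bucketNumber, partitionExists);
            return true;
        }
        catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The partition name below is made up for the illustration.
        System.out.println(isAllowed(Optional.empty(), OptionalInt.of(3), false));
        System.out.println(isAllowed(Optional.of("ds=2019-07-15"), OptionalInt.of(3), true));
    }
}
```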

@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from 168b347 to d87b06c Compare July 16, 2019 12:06
@kokosing

Member Author

@electrum @dain do you have any comments on this? If not, I would like to continue with it (adding more tests and also handling partitioned tables).

@dain dain self-requested a review July 23, 2019 23:28
@kokosing kokosing requested a review from electrum July 24, 2019 08:05
@dain

dain commented Jul 27, 2019

Member

@kokosing I don't have any comments, but I'd like to see @electrum reply.

@dain dain removed their request for review July 27, 2019 22:00
@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from d87b06c to fd02a49 Compare August 6, 2019 08:11
@kokosing kokosing removed the WIP label Aug 6, 2019
@kokosing

kokosing commented Aug 6, 2019

Member Author

@sopel39, @electrum I added more tests, would you like to review?

@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from fd02a49 to bfcb4e3 Compare August 16, 2019 09:54

@sopel39 sopel39 left a comment

Member

I think @electrum also wanted to review it

Member

remove

@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from bfcb4e3 to c1efa23 Compare August 16, 2019 10:47

Member

This should be test_empty_bucketed_table

Member

test_bucketed_table

Member

Nit: space between table name and (

@kokosing

Member Author
[INFO] --- maven-enforcer-plugin:3.0.0-M2:enforce (default) @ presto-hive-hadoop2 ---
[INFO] Ignoring requireUpperBoundDeps in com.google.guava:guava
[INFO] Ignoring requireUpperBoundDeps in org.codehaus.plexus:plexus-utils
[WARNING] Rule 5: org.apache.maven.plugins.enforcer.EnforceBytecodeVersion failed with message:
IOException while reading /home/travis/.m2/repository/software/amazon/ion/ion-java/1.0.2/ion-java-1.0.2.jar

I get the above error in Travis with HIVE_TESTS=true and the command /mvnw -B -pl presto-hive-hadoop2 test -P test-hive-hadoop2 -Dhive.hadoop2.timeZone=UTC -DHADOOP_USER_NAME=hive -Dhive.hadoop2.metastoreHost=localhost -Dhive.hadoop2.metastorePort=9083 -Dhive.hadoop2.databaseName=default -Dhive.hadoop2.metastoreHost=hadoop-master -Dhive.hadoop2.timeZone=Asia/Kathmandu -Dhive.metastore.thrift.client.socks-proxy=127.0.0.1:1180 -Dhadoop-master-ip=172.18.0.2. The same command passes just fine locally.

@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from 4ccc228 to c1efa23 Compare August 19, 2019 14:39
@kokosing kokosing force-pushed the origin/master/141_insert_into_unpartitioned_bucketed_table branch from c1efa23 to 1fb44cb Compare August 19, 2019 14:46
@kokosing kokosing closed this Aug 19, 2019
@kokosing kokosing deleted the origin/master/141_insert_into_unpartitioned_bucketed_table branch August 19, 2019 14:47
@kokosing kokosing merged commit 1fb44cb into trinodb:master Aug 19, 2019
@kokosing kokosing mentioned this pull request Aug 19, 2019
7 tasks
@kokosing kokosing added this to the 318 milestone Aug 19, 2019
4 participants