SHS-NG M4.3: Port StorageTab to the new backend. #9

vanzin · 2017-04-17T20:12:05Z

This required adding information about StreamBlockId to the UI store,
which is not available yet via the API. So an internal type was added
until there's a need to expose that information in the API.

The UI only lists RDDs that have cached partitions, and that information
wasn't being correctly captured in UIListener, so that's also fixed,
along with some minor (internal) API adjustments so that the UI can
get the correct data.


## What changes were proposed in this pull request?

This PR aims to optimize GroupExpressions by removing repeating expressions. `RemoveRepetitionFromGroupExpressions` is added.

**Before**

```scala
scala> sql("select a+1 from values 1,2 T(a) group by a+1, 1+a, A+1, 1+A").explain()
== Physical Plan ==
WholeStageCodegen
:  +- TungstenAggregate(key=[(a#0 + 1)#6,(1 + a#0)#7,(A#0 + 1)#8,(1 + A#0)#9], functions=[], output=[(a + 1)#5])
:     +- INPUT
+- Exchange hashpartitioning((a#0 + 1)#6, (1 + a#0)#7, (A#0 + 1)#8, (1 + A#0)#9, 200), None
   +- WholeStageCodegen
      :  +- TungstenAggregate(key=[(a#0 + 1) AS (a#0 + 1)#6,(1 + a#0) AS (1 + a#0)#7,(A#0 + 1) AS (A#0 + 1)#8,(1 + A#0) AS (1 + A#0)#9], functions=[], output=[(a#0 + 1)#6,(1 + a#0)#7,(A#0 + 1)#8,(1 + A#0)#9])
      :     +- INPUT
      +- LocalTableScan [a#0], [[1],[2]]
```

**After**

```scala
scala> sql("select a+1 from values 1,2 T(a) group by a+1, 1+a, A+1, 1+A").explain()
== Physical Plan ==
WholeStageCodegen
:  +- TungstenAggregate(key=[(a#0 + 1)#6], functions=[], output=[(a + 1)#5])
:     +- INPUT
+- Exchange hashpartitioning((a#0 + 1)#6, 200), None
   +- WholeStageCodegen
      :  +- TungstenAggregate(key=[(a#0 + 1) AS (a#0 + 1)#6], functions=[], output=[(a#0 + 1)#6])
      :     +- INPUT
      +- LocalTableScan [a#0], [[1],[2]]
```

## How was this patch tested?

Pass the Jenkins tests (with a new testcase)

Author: Dongjoon Hyun <[email protected]>

Closes apache#12590 from dongjoon-hyun/SPARK-14830.

(cherry picked from commit 6e63201)
Signed-off-by: Michael Armbrust <[email protected]>
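The deduplication of grouping expressions described above can be illustrated with a small standalone sketch. This is not the code of `RemoveRepetitionFromGroupExpressions`: the `Expr`, `Attr`, `Lit`, `Add` types and the `DedupGrouping` object are hypothetical stand-ins for Catalyst expressions, and the canonical form used here (lowercased attribute names, operands of `+` put in a fixed order) is an assumption chosen so that `a+1`, `1+a`, `A+1`, and `1+A` compare equal.

```scala
// Hypothetical toy expression tree; not Spark's Catalyst classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Lit(value: Int) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr

object DedupGrouping {
  // Assumed canonical form: case-insensitive attribute names and
  // operands of a commutative Add placed in a fixed (string) order.
  def canonical(e: Expr): Expr = e match {
    case Attr(n) => Attr(n.toLowerCase)
    case l: Lit  => l
    case Add(l, r) =>
      val cl = canonical(l)
      val cr = canonical(r)
      if (cl.toString <= cr.toString) Add(cl, cr) else Add(cr, cl)
  }

  // Keep the first expression seen for each canonical form,
  // preserving the original order of the grouping list.
  def dedup(exprs: Seq[Expr]): Seq[Expr] = {
    val seen = scala.collection.mutable.LinkedHashMap.empty[Expr, Expr]
    exprs.foreach(e => seen.getOrElseUpdate(canonical(e), e))
    seen.values.toSeq
  }
}
```

Under these assumptions, passing the four grouping expressions from the example query to `DedupGrouping.dedup` leaves only the first one, which mirrors the single aggregate key in the "After" plan.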

### Why are the changes needed?

`EnsureRequirements` adds a `ShuffleExchangeExec` (RangePartitioning) after Sort if a `RoundRobinPartitioning` is behind it. This causes 2 shuffles, and the number of partitions in the final stage is not the number specified by `RoundRobinPartitioning`.

**Example SQL**

```
SELECT /*+ REPARTITION(5) */ * FROM test ORDER BY a
```

**BEFORE**

```
== Physical Plan ==
*(1) Sort [a#0 ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(a#0 ASC NULLS FIRST, 200), true, [id=#11]
   +- Exchange RoundRobinPartitioning(5), false, [id=#9]
      +- Scan hive default.test [a#0, b#1], HiveTableRelation `default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [a#0, b#1]
```

**AFTER**

```
== Physical Plan ==
*(1) Sort [a#0 ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(a#0 ASC NULLS FIRST, 5), true, [id=#11]
   +- Scan hive default.test [a#0, b#1], HiveTableRelation `default`.`test`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [a#0, b#1]
```

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Run the test suites and add a new test for this.

Closes apache#26946 from stczwd/RoundRobinPartitioning.

Lead-authored-by: lijunqing <[email protected]>
Co-authored-by: stczwd <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
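The plan change shown in the BEFORE/AFTER output above can be sketched on a toy plan tree. This is only an illustration under stated assumptions, not the actual `EnsureRequirements` code: the `Plan`, `Partitioning`, and `EnsureSortDistribution` names are hypothetical, and the default of 200 partitions is taken from the BEFORE plan.

```scala
// Hypothetical toy model of partitionings and physical plan nodes.
sealed trait Partitioning
final case class RoundRobin(numPartitions: Int) extends Partitioning
final case class RangeBy(column: String, numPartitions: Int) extends Partitioning

sealed trait Plan
final case class Scan(table: String) extends Plan
final case class Exchange(partitioning: Partitioning, child: Plan) extends Plan
final case class Sort(column: String, child: Plan) extends Plan

object EnsureSortDistribution {
  // Assumed default shuffle partition count (200 in the BEFORE plan).
  val DefaultShufflePartitions = 200

  def apply(plan: Plan): Plan = plan match {
    // A round-robin repartition directly under the sort is replaced by a
    // range exchange with the same partition count: one shuffle, not two.
    case Sort(col, Exchange(RoundRobin(n), child)) =>
      Sort(col, Exchange(RangeBy(col, n), child))
    // The child is already range-partitioned on the sort column: keep it.
    case s @ Sort(col, Exchange(RangeBy(c, _), _)) if c == col => s
    // Otherwise add a range exchange with the default partition count.
    case Sort(col, child) =>
      Sort(col, Exchange(RangeBy(col, DefaultShufflePartitions), child))
    case other => other
  }
}
```

Applied to a sort over a `RoundRobin(5)` exchange, the sketch yields a single range exchange with 5 partitions, matching the shape of the AFTER plan.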

vanzin force-pushed the shs-ng/M4.3 branch from 933d094 to d41f35e Compare April 17, 2017 21:40

vanzin force-pushed the shs-ng/M4.2 branch from 0a2ca45 to 19218ad Compare April 17, 2017 21:40

vanzin force-pushed the shs-ng/M4.3 branch from d41f35e to 3a23a38 Compare April 25, 2017 17:43

vanzin force-pushed the shs-ng/M4.2 branch from 19218ad to e75a98d Compare April 25, 2017 17:43

vanzin force-pushed the shs-ng/M4.3 branch from 3a23a38 to 17d6887 Compare April 26, 2017 18:11

vanzin force-pushed the shs-ng/M4.2 branch from e75a98d to d6318f8 Compare April 26, 2017 18:11

vanzin force-pushed the shs-ng/M4.3 branch from 17d6887 to 001c71a Compare April 26, 2017 23:57

vanzin force-pushed the shs-ng/M4.2 branch from d6318f8 to 947e9d5 Compare April 26, 2017 23:58

vanzin force-pushed the shs-ng/M4.3 branch from 001c71a to 409a201 Compare April 27, 2017 18:14

vanzin force-pushed the shs-ng/M4.2 branch from 947e9d5 to 2521cae Compare April 27, 2017 18:14

vanzin force-pushed the shs-ng/M4.3 branch from 409a201 to 207df90 Compare April 27, 2017 21:31

vanzin force-pushed the shs-ng/M4.2 branch from 2521cae to 58c5d00 Compare April 27, 2017 21:31

vanzin force-pushed the shs-ng/M4.3 branch from 207df90 to 69749b9 Compare April 28, 2017 15:08

vanzin force-pushed the shs-ng/M4.2 branch from 58c5d00 to 9dd2f98 Compare April 28, 2017 15:08

vanzin force-pushed the shs-ng/M4.3 branch from 69749b9 to 2dd88f6 Compare April 28, 2017 21:35

vanzin force-pushed the shs-ng/M4.2 branch from 9dd2f98 to 52e2641 Compare April 28, 2017 21:35

vanzin force-pushed the shs-ng/M4.3 branch from 2dd88f6 to 6d8e1b2 Compare May 1, 2017 22:58

vanzin force-pushed the shs-ng/M4.2 branch from 52e2641 to 5505c83 Compare May 1, 2017 22:58

vanzin force-pushed the shs-ng/M4.3 branch from 6d8e1b2 to dcacfc7 Compare May 5, 2017 21:19

vanzin force-pushed the shs-ng/M4.2 branch from 5505c83 to 974eb38 Compare May 5, 2017 21:19

vanzin force-pushed the shs-ng/M4.3 branch from dcacfc7 to ceb9c6b Compare May 5, 2017 22:57

vanzin force-pushed the shs-ng/M4.2 branch from 974eb38 to 4fece2f Compare May 5, 2017 22:57

vanzin force-pushed the shs-ng/M4.3 branch from ceb9c6b to 57627d0 Compare May 8, 2017 17:25

vanzin force-pushed the shs-ng/M4.2 branch from 4fece2f to b3701f6 Compare May 8, 2017 17:25

vanzin force-pushed the shs-ng/M4.3 branch from 57627d0 to 405294b Compare May 9, 2017 01:08

vanzin force-pushed the shs-ng/M4.2 branch from b3701f6 to 433d1ec Compare May 9, 2017 01:09

vanzin force-pushed the shs-ng/M4.3 branch from 405294b to ed48cd6 Compare May 15, 2017 20:44

vanzin force-pushed the shs-ng/M4.2 branch from 433d1ec to f7d6a74 Compare May 15, 2017 20:44

vanzin force-pushed the shs-ng/M4.3 branch from ed48cd6 to 4df8af1 Compare May 26, 2017 18:53

vanzin force-pushed the shs-ng/M4.2 branch from f7d6a74 to 4526ffe Compare May 26, 2017 18:53

vanzin force-pushed the shs-ng/M4.3 branch from 4df8af1 to c5a17fd Compare May 30, 2017 23:03

vanzin force-pushed the shs-ng/M4.2 branch from 4526ffe to d66024c Compare May 30, 2017 23:04

vanzin closed this May 30, 2017
