SPARK-1181. 'mvn test' fails out of the box since sbt assembly does not necessarily exist #77

Closed
srowen wants to merge 1 commit

Conversation

@srowen (Member) commented Mar 4, 2014

The test suite requires that "sbt assembly" has been run in order for some tests (like DriverSuite) to pass. The tests themselves say as much.

This means that a "mvn test" from a fresh clone fails.

There's a pretty simple fix: have Maven's test-compile phase invoke "sbt assembly". I suppose the only downside is re-invoking "sbt assembly" each time tests are run.
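A rough sketch of what that binding could look like (not necessarily the exact change in this PR; the exec-maven-plugin and the sbt/sbt launcher path are assumptions about how the call would be wired up):

```
<!-- Sketch only: bind "sbt assembly" to Maven's test-compile phase. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>sbt-assembly</id>
      <phase>test-compile</phase>
      <goals>
        <goal>exec</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <!-- Assumes the in-tree sbt launcher script; adjust the path as needed. -->
    <executable>sbt/sbt</executable>
    <arguments>
      <argument>assembly</argument>
    </arguments>
  </configuration>
</plugin>
```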

I'm open to ideas about how to set this up more intelligently but it would be a generally good thing if the Maven build's tests passed out of the box.

@markhamstra (Contributor)

The standard Maven build procedure should be to run `mvn -DskipTests package` first (which builds the assembly) and then `mvn test`. The "Building Spark with Maven" page should be updated to clearly explain that procedure.
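For a fresh clone, that workflow would look roughly like this:

```
# Build everything, including the assembly, but skip tests for now.
mvn -DskipTests package
# Then run the test suite against the assembly built above.
mvn test
```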

Requiring Maven users to invoke "sbt assembly" not only forces them to download SBT itself, but also ends up duplicating artifacts in .ivy2 and .m2.

@AmplabJenkins: Merged build triggered.

@AmplabJenkins: Merged build started.

@JoshRosen (Contributor)

In Maven, you can run tests that depend on packages/assemblies during Maven's integration-test phase, which automatically runs after the Maven package phase. I'm not sure what the SBT equivalent of integration-test is.

This approach would require us to move the integration tests into a separate test directory or package.
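As a minimal sketch of that idea, using the stock maven-failsafe-plugin (which by convention picks up *IT test classes and runs them during the integration-test phase, after package); this shows the generic Maven mechanism only, not anything Spark's build actually configures:

```
<!-- Sketch: run assembly-dependent tests after the package phase. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```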

@AmplabJenkins: Merged build finished.

@AmplabJenkins: One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12997/

@srowen (Member, Author) commented Mar 4, 2014

OK, that works: package and then test. In the canonical Maven lifecycle, packaging comes after test, so the test phase would not depend on packaging. In practice this is at worst a fine workaround.

I agree that making these into integration tests is likely the right answer. I will close this, however, as I think that's for another day; other, bigger build changes might make this point irrelevant.

@srowen srowen closed this Mar 4, 2014
@srowen srowen deleted the SPARK-1181 branch March 23, 2014 13:11
Igosuki pushed a commit to Adikteev/spark that referenced this pull request Jul 31, 2018
ashangit pushed a commit to ashangit/spark that referenced this pull request Dec 11, 2018
[SPARK-24917] make chunk size configurable
clems4ever pushed a commit to clems4ever/spark that referenced this pull request Feb 11, 2019
[SPARK-24917] make chunk size configurable
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
hn5092 added a commit to hn5092/spark that referenced this pull request Nov 10, 2019
hn5092 added a commit to hn5092/spark that referenced this pull request Nov 19, 2019
cloud-fan pushed a commit that referenced this pull request Aug 11, 2020
### What changes were proposed in this pull request?
This PR added a physical rule to remove redundant project nodes. A `ProjectExec` is redundant when
1. It has the same output attributes and order as its child's output when ordering of these attributes is required.
2. It has the same output attributes as its child's output when attribute output ordering is not required.

For example:
After Filter:
```
== Physical Plan ==
*(1) Project [a#14L, b#15L, c#16, key#17]
+- *(1) Filter (isnotnull(a#14L) AND (a#14L > 5))
   +- *(1) ColumnarToRow
      +- FileScan parquet [a#14L,b#15L,c#16,key#17]
```
The `Project a#14L, b#15L, c#16, key#17` is redundant because its output is exactly the same as filter's output.

Before Aggregate:
```
== Physical Plan ==
*(2) HashAggregate(keys=[key#17], functions=[sum(a#14L), last(b#15L, false)], output=[sum_a#39L, key#17, last_b#41L])
+- Exchange hashpartitioning(key#17, 5), true, [id=#77]
   +- *(1) HashAggregate(keys=[key#17], functions=[partial_sum(a#14L), partial_last(b#15L, false)], output=[key#17, sum#49L, last#50L, valueSet#51])
      +- *(1) Project [key#17, a#14L, b#15L]
         +- *(1) Filter (isnotnull(a#14L) AND (a#14L > 100))
            +- *(1) ColumnarToRow
               +- FileScan parquet [a#14L,b#15L,key#17]
```
The `Project key#17, a#14L, b#15L` is redundant because hash aggregate doesn't require its child plan's output to be in a specific order.
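For illustration, the core of such a rule can be sketched in a few lines of Scala; the object name and the redundancy check below are simplifications (not the actual rule added by this commit), and a real rule also has to honor the ordering distinction described above:

```
// Simplified sketch of a physical rule that drops a ProjectExec whose
// project list is exactly its child's output. The real rule additionally
// handles the case where the parent does not care about attribute ordering.
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.{ProjectExec, SparkPlan}

object RemoveRedundantProjectsSketch extends Rule[SparkPlan] {
  override def apply(plan: SparkPlan): SparkPlan = plan.transformUp {
    case ProjectExec(projectList, child) if projectList == child.output =>
      child
  }
}
```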

### Why are the changes needed?

It removes unnecessary query nodes and makes query plan cleaner.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests

Closes #29031 from allisonwang-db/remove-project.

Lead-authored-by: allisonwang-db <[email protected]>
Co-authored-by: allisonwang-db <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>