Skip to content

Improved build configuration #480

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 0 additions & 14 deletions bagel/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project Bagel</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're doing, to lift these dependencies up into the parent, but they will then be applied to all modules. Is that desirable -- which modules now have this dependency that didn't before, or are there none?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exists in almost all modules should be referred to the parent.

<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
22 changes: 0 additions & 22 deletions core/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -30,19 +30,6 @@
<packaging>jar</packaging>
<name>Spark Project Core</name>
<url>http://spark.apache.org/</url>
<!-- SPARK-1121: Adds an explicit dependency on Avro to work around a Hadoop 0.23.X issue -->
<profiles>
<profile>
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
Expand Down Expand Up @@ -147,15 +134,6 @@
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
<version>3.2.6</version>
<!-- see also exclusion for lift-json; this is necessary since it depends on
scala-library and scalap 2.10.0, but we use 2.10.4, and only override
scala-library -->
<exclusions>
<exclusion>
<groupId>org.scala-lang</groupId>
<artifactId>scalap</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>colt</groupId>
Expand Down
8 changes: 7 additions & 1 deletion docs/building-with-maven.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,10 @@ For Apache Hadoop versions 1.x, Cloudera CDH MRv1, and other Hadoop versions wit
# Cloudera CDH 4.2.0 with MapReduce v1
$ mvn -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests clean package

For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions with YARN, you should enable the "yarn-alpha" or "yarn" profile and set the "hadoop.version", "yarn.version" property:
# Apache Hadoop 0.23.x
$ mvn -Phadoop-0.23 -Dhadoop.version=0.23.7 -DskipTests clean package

For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions with YARN, you can enable the "yarn-alpha" or "yarn" profile and set the "hadoop.version", "yarn.version" property:

# Apache Hadoop 2.0.5-alpha
$ mvn -Pyarn-alpha -Dhadoop.version=2.0.5-alpha -Dyarn.version=2.0.5-alpha -DskipTests clean package
Expand All @@ -50,6 +53,9 @@ For Apache Hadoop 2.x, 0.23.x, Cloudera CDH MRv2, and other Hadoop versions with
# Apache Hadoop 2.2.X ( e.g. 2.2.0 as below ) and newer
$ mvn -Pyarn -Dhadoop.version=2.2.0 -Dyarn.version=2.2.0 -DskipTests clean package

# Apache Hadoop 0.23.x
$ mvn -Pyarn-alpha -Phadoop-0.23 -Dhadoop.version=0.23.7 -Dyarn.version=0.23.7 -DskipTests clean package

## Spark Tests in Maven ##

Tests are run by default via the [ScalaTest Maven plugin](http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin). Some of the require Spark to be packaged first, so always run `mvn package` with `-DskipTests` the first time. You can then run the tests with `mvn -Dhadoop.version=... test`.
Expand Down
18 changes: 4 additions & 14 deletions examples/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project Examples</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down Expand Up @@ -124,6 +110,10 @@
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
<exclusion>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a good idea but what's the motivation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jar is very big, there are 12m.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Spark doesn't use JRuby directly and I can't imagine how it uses it indirectly. Then again, this is just in the examples module where there is a load of dependency anyway, and which people don't depend on directly. So it doesn't really hurt to not manually prune this stuff. In core this would be very important. IMHO.

<groupId>org.jruby</groupId>
<artifactId>jruby-complete</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
Expand Down
14 changes: 0 additions & 14 deletions external/flume/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project External Flume</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
14 changes: 0 additions & 14 deletions external/kafka/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project External Kafka</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
14 changes: 0 additions & 14 deletions external/mqtt/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project External MQTT</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
14 changes: 0 additions & 14 deletions external/twitter/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project External Twitter</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
14 changes: 0 additions & 14 deletions external/zeromq/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project External ZeroMQ</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
14 changes: 0 additions & 14 deletions graphx/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project GraphX</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
14 changes: 0 additions & 14 deletions mllib/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -31,20 +31,6 @@
<name>Spark Project ML Library</name>
<url>http://spark.apache.org/</url>

<profiles>
<profile>
<!-- SPARK-1121: SPARK-1121: Adds an explicit dependency on Avro to work around
a Hadoop 0.23.X issue -->
<id>yarn-alpha</id>
<dependencies>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
</dependencies>
</profile>
</profiles>

<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
Expand Down
Loading