
Commit c67ce81

Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406

2 parents: 78515ec + 047ff57

9,244 files changed (+107,728 −65,610 lines)


.gitattributes

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+*.bat text eol=crlf
+*.cmd text eol=crlf
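
These two attributes pin Windows batch and command scripts to CRLF line endings regardless of a contributor's core.autocrlf setting. As a quick sanity check, standard git can report the attributes resolved for a given path (the path below is just an illustration):

    git check-attr text eol -- bin/compute-classpath.cmd

which, with these rules in place, should print:

    bin/compute-classpath.cmd: text: set
    bin/compute-classpath.cmd: eol: crlf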

.rat-excludes

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,6 @@
 target
 .gitignore
+.gitattributes
 .project
 .classpath
 .mima-excludes
@@ -43,11 +44,13 @@ SparkImports.scala
 SparkJLineCompletion.scala
 SparkJLineReader.scala
 SparkMemberHandlers.scala
+SparkReplReporter.scala
 sbt
 sbt-launch-lib.bash
 plugins.sbt
 work
 .*\.q
+.*\.qv
 golden
 test.out/*
 .*iml
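
.rat-excludes lists paths that the Apache RAT (Release Audit Tool) license-header check should skip; the new entries cover the .gitattributes file itself, a SparkReplReporter.scala source, and Hive's .qv golden query files. In Spark's dev tooling of this era the check was typically run through a wrapper script (hedged; invoke RAT however your checkout wires it up):

    ./dev/check-license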

LICENSE

Lines changed: 20 additions & 13 deletions
@@ -712,18 +712,6 @@ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
 EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-========================================================================
-For colt:
-========================================================================
-
-Copyright (c) 1999 CERN - European Organization for Nuclear Research.
-Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. CERN makes no representations about the suitability of this software for any purpose. It is provided "as is" without expressed or implied warranty.
-
-Packages hep.aida.*
-
-Written by Pavel Binko, Dino Ferrero Merlino, Wolfgang Hoschek, Tony Johnson, Andreas Pfeiffer, and others. Check the FreeHEP home page for more info. Permission to use and/or redistribute this work is granted under the terms of the LGPL License, with the exception that any usage related to military applications is expressly forbidden. The software and documentation made available under the terms of this license are provided with no warranty.
-
-
 ========================================================================
 For SnapTree:
 ========================================================================
@@ -766,7 +754,7 @@ SUCH DAMAGE.
 
 
 ========================================================================
-For Timsort (core/src/main/java/org/apache/spark/util/collection/Sorter.java):
+For Timsort (core/src/main/java/org/apache/spark/util/collection/TimSort.java):
 ========================================================================
 Copyright (C) 2008 The Android Open Source Project
 
@@ -783,6 +771,25 @@ See the License for the specific language governing permissions and
 limitations under the License.
 
 
+========================================================================
+For LimitedInputStream
+(network/common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java):
+========================================================================
+Copyright (C) 2007 The Guava Authors
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+
+
 ========================================================================
 BSD-style licenses
 ========================================================================

README.md

Lines changed: 4 additions & 3 deletions
@@ -13,7 +13,8 @@ and Spark Streaming for stream processing.
 ## Online Documentation
 
 You can find the latest Spark documentation, including a programming
-guide, on the [project web page](http://spark.apache.org/documentation.html).
+guide, on the [project web page](http://spark.apache.org/documentation.html)
+and [project wiki](https://cwiki.apache.org/confluence/display/SPARK).
 This README file only contains basic setup instructions.
 
 ## Building Spark
@@ -25,7 +26,7 @@ To build Spark and its example programs, run:
 
 (You do not need to do this if you downloaded a pre-built package.)
 More detailed documentation is available from the project site, at
-["Building Spark"](http://spark.apache.org/docs/latest/building-spark.html).
+["Building Spark with Maven"](http://spark.apache.org/docs/latest/building-with-maven.html).
 
 ## Interactive Scala Shell
 
@@ -84,7 +85,7 @@ storage systems. Because the protocols have changed in different versions of
 Hadoop, you must build Spark against the same version that your cluster runs.
 
 Please refer to the build documentation at
-["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version)
+["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
 for detailed guidance on building for a particular distribution of Hadoop, including
 building for particular Hive and Hive Thriftserver distributions. See also
 ["Third Party Hadoop Distributions"](http://spark.apache.org/docs/latest/hadoop-third-party-distributions.html)

assembly/pom.xml

Lines changed: 10 additions & 5 deletions
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent</artifactId>
-    <version>1.2.0-SNAPSHOT</version>
+    <version>1.3.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -66,22 +66,22 @@
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
-      <artifactId>spark-repl_${scala.binary.version}</artifactId>
+      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
-      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
+      <artifactId>spark-graphx_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
     </dependency>
     <dependency>
      <groupId>org.apache.spark</groupId>
-      <artifactId>spark-graphx_${scala.binary.version}</artifactId>
+      <artifactId>spark-sql_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
     </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
-      <artifactId>spark-sql_${scala.binary.version}</artifactId>
+      <artifactId>spark-repl_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
     </dependency>
   </dependencies>
@@ -197,6 +197,11 @@
         <artifactId>spark-hive_${scala.binary.version}</artifactId>
         <version>${project.version}</version>
       </dependency>
+    </dependencies>
+  </profile>
+  <profile>
+    <id>hive-thriftserver</id>
+    <dependencies>
       <dependency>
         <groupId>org.apache.spark</groupId>
         <artifactId>spark-hive-thriftserver_${scala.binary.version}</artifactId>

bagel/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent</artifactId>
-    <version>1.2.0-SNAPSHOT</version>
+    <version>1.3.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

bin/compute-classpath.cmd

Lines changed: 117 additions & 117 deletions
@@ -1,117 +1,117 @@

Every one of the file's 117 lines is marked as changed, most likely because its line endings were normalized to CRLF under the new *.cmd rule in .gitattributes. The only textual change is to the SPARK_CONF_DIR test:

-if "x%SPARK_CONF_DIR%"!="x" (
+if not "x%SPARK_CONF_DIR%"=="x" (

(the old form used !=, which cmd's if does not support; the corrected line negates an == comparison instead). The resulting file in full:

@echo off

rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements.  See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License.  You may obtain a copy of the License at
rem
rem    http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

rem This script computes Spark's classpath and prints it to stdout; it's used by both the "run"
rem script and the ExecutorRunner in standalone cluster mode.

rem If we're called from spark-class2.cmd, it already set enabledelayedexpansion and setting
rem it here would stop us from affecting its copy of the CLASSPATH variable; otherwise we
rem need to set it here because we use !datanucleus_jars! below.
if "%DONT_PRINT_CLASSPATH%"=="1" goto skip_delayed_expansion
setlocal enabledelayedexpansion
:skip_delayed_expansion

set SCALA_VERSION=2.10

rem Figure out where the Spark framework is installed
set FWDIR=%~dp0..\

rem Load environment variables from conf\spark-env.cmd, if it exists
if exist "%FWDIR%conf\spark-env.cmd" call "%FWDIR%conf\spark-env.cmd"

rem Build up classpath
set CLASSPATH=%SPARK_CLASSPATH%;%SPARK_SUBMIT_CLASSPATH%

if not "x%SPARK_CONF_DIR%"=="x" (
  set CLASSPATH=%CLASSPATH%;%SPARK_CONF_DIR%
) else (
  set CLASSPATH=%CLASSPATH%;%FWDIR%conf
)

if exist "%FWDIR%RELEASE" (
  for %%d in ("%FWDIR%lib\spark-assembly*.jar") do (
    set ASSEMBLY_JAR=%%d
  )
) else (
  for %%d in ("%FWDIR%assembly\target\scala-%SCALA_VERSION%\spark-assembly*hadoop*.jar") do (
    set ASSEMBLY_JAR=%%d
  )
)

set CLASSPATH=%CLASSPATH%;%ASSEMBLY_JAR%

rem When Hive support is needed, Datanucleus jars must be included on the classpath.
rem Datanucleus jars do not work if only included in the uber jar as plugin.xml metadata is lost.
rem Both sbt and maven will populate "lib_managed/jars/" with the datanucleus jars when Spark is
rem built with Hive, so look for them there.
if exist "%FWDIR%RELEASE" (
  set datanucleus_dir=%FWDIR%lib
) else (
  set datanucleus_dir=%FWDIR%lib_managed\jars
)
set "datanucleus_jars="
for %%d in ("%datanucleus_dir%\datanucleus-*.jar") do (
  set datanucleus_jars=!datanucleus_jars!;%%d
)
set CLASSPATH=%CLASSPATH%;%datanucleus_jars%

set SPARK_CLASSES=%FWDIR%core\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%repl\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%mllib\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%bagel\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%graphx\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%streaming\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%tools\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%sql\catalyst\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%sql\core\target\scala-%SCALA_VERSION%\classes
set SPARK_CLASSES=%SPARK_CLASSES%;%FWDIR%sql\hive\target\scala-%SCALA_VERSION%\classes

set SPARK_TEST_CLASSES=%FWDIR%core\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%repl\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%mllib\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%bagel\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%graphx\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%streaming\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%sql\catalyst\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%sql\core\target\scala-%SCALA_VERSION%\test-classes
set SPARK_TEST_CLASSES=%SPARK_TEST_CLASSES%;%FWDIR%sql\hive\target\scala-%SCALA_VERSION%\test-classes

if "x%SPARK_TESTING%"=="x1" (
  rem Add test clases to path - note, add SPARK_CLASSES and SPARK_TEST_CLASSES before CLASSPATH
  rem so that local compilation takes precedence over assembled jar
  set CLASSPATH=%SPARK_CLASSES%;%SPARK_TEST_CLASSES%;%CLASSPATH%
)

rem Add hadoop conf dir - else FileSystem.*, etc fail
rem Note, this assumes that there is either a HADOOP_CONF_DIR or YARN_CONF_DIR which hosts
rem the configurtion files.
if "x%HADOOP_CONF_DIR%"=="x" goto no_hadoop_conf_dir
set CLASSPATH=%CLASSPATH%;%HADOOP_CONF_DIR%
:no_hadoop_conf_dir

if "x%YARN_CONF_DIR%"=="x" goto no_yarn_conf_dir
set CLASSPATH=%CLASSPATH%;%YARN_CONF_DIR%
:no_yarn_conf_dir

rem A bit of a hack to allow calling this script within run2.cmd without seeing output
if "%DONT_PRINT_CLASSPATH%"=="1" goto exit

echo %CLASSPATH%

:exit
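
A note on the enabledelayedexpansion handling at the top of the script: inside a parenthesized block, cmd expands %var% once when the block is parsed, so accumulating a variable across for iterations requires the !var! form, exactly as the datanucleus_jars loop does. A minimal standalone sketch of the same pattern (hypothetical jar names):

    @echo off
    setlocal enabledelayedexpansion
    set "jars="
    for %%d in (a.jar b.jar c.jar) do (
      rem !jars! is re-read each iteration; %jars% would be frozen at its pre-loop value
      set jars=!jars!;%%d
    )
    echo %jars%

Running this prints ;a.jar;b.jar;c.jar, whereas with plain %jars% inside the loop only the last jar would survive.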
