SKIPME Spark 1.4.1 #64

Merged (40 commits) on Jul 13, 2015
Commits
3f1e4ef
fix string order for non-ascii character
Jul 2, 2015
de08024
[SPARK-8501] [SQL] Avoids reading schema from empty ORC files (backpo…
liancheng Jul 3, 2015
f142867
[SPARK-8776] Increase the default MaxPermSize
yhuai Jul 3, 2015
ff76b33
[SPARK-8803] handle special characters in elements in crosstab
brkyvz Jul 3, 2015
07b95c7
Preparing Spark release v1.4.1-rc2
pwendell Jul 3, 2015
e990561
Preparing development version 1.4.2-SNAPSHOT
pwendell Jul 3, 2015
73e57cd
Merge branch 'branch-1.4' of github.com:apache/spark into csd-1.4
markhamstra Jul 6, 2015
4d81383
[SPARK-8463][SQL] Use DriverRegistry to load jdbc driver at writing path
viirya Jul 7, 2015
947b845
[SPARK-8819] Fix build for maven 3.3.x
Jul 7, 2015
997444c
Revert "[SPARK-8781] Fix variables in published pom.xml are not resol…
Jul 7, 2015
f8aab7a
Preparing Spark release v1.4.1-rc3
pwendell Jul 7, 2015
5c080c2
Preparing development version 1.4.2-SNAPSHOT
pwendell Jul 7, 2015
397bafd
[HOTFIX] Rename release-profile to release
pwendell Jul 7, 2015
3e8ae38
Preparing Spark release v1.4.1-rc3
pwendell Jul 7, 2015
bf8b47d
Preparing development version 1.4.2-SNAPSHOT
pwendell Jul 7, 2015
83a621a
[SPARK-8821] [EC2] Switched to binary mode for file reading
reactormonk Jul 7, 2015
d3d5f2a
[SPARK-8868] SqlSerializer2 can go into infinite loop when row consis…
yhuai Jul 8, 2015
ec94b6d
Merge branch 'branch-1.4' of github.com:apache/spark into csd-1.4
markhamstra Jul 8, 2015
de49916
[SPARK-8894] [SPARKR] [DOC] Example code errors in SparkR documentation.
Jul 8, 2015
e91d87e
[SPARK-8657] [YARN] Fail to upload resource to viewfs
litao-buptsse Jul 8, 2015
e4313db
[SPARK-8657] [YARN] [HOTFIX] Fail to upload resource to viewfs
litao-buptsse Jul 8, 2015
898b073
[HOTFIX] Fix style error introduced in e4313db38e81f6288f1704c22e17d0…
JoshRosen Jul 8, 2015
5127863
[SPARK-8900] [SPARKR] Fix sparkPackages in init documentation
shivaram Jul 8, 2015
4df0f1b
[SPARK-8909][Documentation] Change the scala example in sql-programmi…
Jul 8, 2015
3f6e6e0
[SPARK-8903] Fix bug in cherry-pick of SPARK-8803
JoshRosen Jul 8, 2015
df76349
[SPARK-8902] Correctly print hostname in error
darabos Jul 8, 2015
dbaa5c2
Preparing Spark release v1.4.1-rc4
pwendell Jul 8, 2015
5bc19a1
Preparing development version 1.4.2-SNAPSHOT
pwendell Jul 8, 2015
2fb2ef0
[SPARK-8927] [DOCS] Format wrong for some config descriptions
jonalter Jul 9, 2015
12c1c36
[SPARK-8910] Fix MiMa flaky due to port contention issue
Jul 9, 2015
c04f0a5
[SPARK-8937] [TEST] A setting `spark.unsafe.exceptionOnMemoryLeak ` i…
sarutak Jul 9, 2015
2376ce8
Merge branch 'branch-1.4' of github.com:apache/spark into csd-1.4
markhamstra Jul 9, 2015
dfc9971
[SPARK-7419] [STREAMING] [TESTS] Fix CheckpointSuite.recovery with fi…
zsxwing Jul 9, 2015
990f434
[SPARK-2017] [UI] Stage page hangs with many tasks
Jul 9, 2015
2f2f9da
[SPARK-8865] [STREAMING] FIX BUG: check key in kafka params
guowei2 Jul 9, 2015
bef0591
[DOCS] Added important updateStateByKey details
mvogiatzis Jul 10, 2015
756beda
Merge branch 'branch-1.4' of github.com:apache/spark into csd-1.4
markhamstra Jul 10, 2015
898e5f7
[SPARK-8990] [SQL] SPARK-8990 DataFrameReader.parquet() should respec…
liancheng Jul 11, 2015
66b7467
Merge branch 'branch-1.4' of github.com:apache/spark into csd-1.4
markhamstra Jul 11, 2015
5819266
bumped version for 1.4.1
markhamstra Jul 13, 2015
2 changes: 1 addition & 1 deletion assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion bagel/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion core/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

10 changes: 8 additions & 2 deletions core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
@@ -210,10 +210,16 @@ private[spark] object JettyUtils extends Logging {
conf: SparkConf,
serverName: String = ""): ServerInfo = {

val collection = new ContextHandlerCollection
collection.setHandlers(handlers.toArray)
addFilters(handlers, conf)

val collection = new ContextHandlerCollection
val gzipHandlers = handlers.map { h =>
val gzipHandler = new GzipHandler
gzipHandler.setHandler(h)
gzipHandler
}
collection.setHandlers(gzipHandlers.toArray)

// Bind to the given port, or throw a java.net.BindException if the port is occupied
def connect(currentPort: Int): (Server, Int) = {
val server = new Server(new InetSocketAddress(hostName, currentPort))
4 changes: 2 additions & 2 deletions dev/create-release/create-release.sh
@@ -118,13 +118,13 @@ if [[ ! "$@" =~ --skip-publish ]]; then

rm -rf $SPARK_REPO

build/mvn -DskipTests -Pyarn -Phive \
build/mvn -DskipTests -Pyarn -Phive -Prelease\
-Phive-thriftserver -Phadoop-2.2 -Pspark-ganglia-lgpl -Pkinesis-asl \
clean install

./dev/change-version-to-2.11.sh

build/mvn -DskipTests -Pyarn -Phive \
build/mvn -DskipTests -Pyarn -Phive -Prelease\
-Dscala-2.11 -Phadoop-2.2 -Pspark-ganglia-lgpl -Pkinesis-asl \
clean install

4 changes: 2 additions & 2 deletions docs/configuration.md
@@ -1007,9 +1007,9 @@ Apart from these, the following properties are also available, and may be useful
<tr>
<td><code>spark.rpc.numRetries</code></td>
<td>3</td>
<td>
Number of times to retry before an RPC task gives up.
An RPC task will run at most times of this number.
<td>
</td>
</tr>
<tr>
@@ -1029,8 +1029,8 @@ Apart from these, the following properties are also available, and may be useful
<tr>
<td><code>spark.rpc.lookupTimeout</code></td>
<td>120s</td>
Duration for an RPC remote endpoint lookup operation to wait before timing out.
<td>
Duration for an RPC remote endpoint lookup operation to wait before timing out.
</td>
</tr>
</table>
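
As a side note on the corrected rows, these are ordinary Spark configuration keys; a hedged sketch of setting them programmatically would look like the following (the values shown simply repeat the documented defaults):

{% highlight scala %}
import org.apache.spark.SparkConf

// Illustrative only; these values mirror the defaults listed in the table above.
val conf = new SparkConf()
  .set("spark.rpc.numRetries", "3")
  .set("spark.rpc.lookupTimeout", "120s")
{% endhighlight %}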
4 changes: 2 additions & 2 deletions docs/sparkr.md
@@ -68,7 +68,7 @@ you can specify the packages with the `packages` argument.

<div data-lang="r" markdown="1">
{% highlight r %}
sc <- sparkR.init(packages="com.databricks:spark-csv_2.11:1.0.3")
sc <- sparkR.init(sparkPackages="com.databricks:spark-csv_2.11:1.0.3")
sqlContext <- sparkRSQL.init(sc)
{% endhighlight %}
</div>
@@ -116,7 +116,7 @@ sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
sql(hiveContext, "LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")

# Queries can be expressed in HiveQL.
results <- hiveContext.sql("FROM src SELECT key, value")
results <- sql(hiveContext, "FROM src SELECT key, value")

# results is now a DataFrame
head(results)
4 changes: 2 additions & 2 deletions docs/sql-programming-guide.md
@@ -828,7 +828,7 @@ using this syntax.

{% highlight scala %}
val df = sqlContext.read.format("json").load("examples/src/main/resources/people.json")
df.select("name", "age").write.format("json").save("namesAndAges.json")
df.select("name", "age").write.format("parquet").save("namesAndAges.parquet")
{% endhighlight %}

</div>
@@ -1518,7 +1518,7 @@ sql(sqlContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
sql(sqlContext, "LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")

# Queries can be expressed in HiveQL.
results = sqlContext.sql("FROM src SELECT key, value").collect()
results <- collect(sql(sqlContext, "FROM src SELECT key, value"))

{% endhighlight %}

2 changes: 2 additions & 0 deletions docs/streaming-programming-guide.md
@@ -854,6 +854,8 @@ it with new information. To use this, you will have to do two steps.
1. Define the state update function - Specify with a function how to update the state using the
previous state and the new values from an input stream.

In every batch, Spark will apply the state update function for all existing keys, regardless of whether they have new data in a batch or not. If the update function returns `None` then the key-value pair will be eliminated.

Let's illustrate this with an example. Say you want to maintain a running count of each word
seen in a text data stream. Here, the running count is the state and it is an integer. We
define the update function as:
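
A minimal sketch of such an update function follows (a hedged illustration of the guide's running word-count scenario; `newValues` and `runningCount` are just illustrative parameter names):

{% highlight scala %}
def updateFunction(newValues: Seq[Int], runningCount: Option[Int]): Option[Int] = {
  // Add the counts from the current batch to the previous running count (0 if none yet).
  val newCount = runningCount.getOrElse(0) + newValues.sum
  // Returning None here instead would remove the key-value pair from the state.
  Some(newCount)
}
{% endhighlight %}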
6 changes: 3 additions & 3 deletions ec2/spark_ec2.py
@@ -127,7 +127,7 @@ def setup_external_libs(libs):
)
with open(tgz_file_path, "wb") as tgz_file:
tgz_file.write(download_stream.read())
with open(tgz_file_path) as tar:
with open(tgz_file_path, "rb") as tar:
if hashlib.md5(tar.read()).hexdigest() != lib["md5"]:
print("ERROR: Got wrong md5sum for {lib}.".format(lib=lib["name"]), file=stderr)
sys.exit(1)
@@ -1111,8 +1111,8 @@ def ssh(host, opts, command):
# If this was an ssh failure, provide the user with hints.
if e.returncode == 255:
raise UsageError(
"Failed to SSH to remote host {0}.\n" +
"Please check that you have provided the correct --identity-file and " +
"Failed to SSH to remote host {0}.\n"
"Please check that you have provided the correct --identity-file and "
"--key-pair parameters and try again.".format(host))
else:
raise e
2 changes: 1 addition & 1 deletion examples/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion external/flume/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion external/kafka-assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion external/kafka/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

@@ -402,7 +402,7 @@ object KafkaCluster {
}

Seq("zookeeper.connect", "group.id").foreach { s =>
if (!props.contains(s)) {
if (!props.containsKey(s)) {
props.setProperty(s, "")
}
}
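
A note on the one-word fix above: `java.util.Properties` extends `Hashtable`, whose `contains` method tests stored values rather than keys, so the old guard never matched. A small hedged sketch illustrating the difference (the property contents here are hypothetical):

{% highlight scala %}
import java.util.Properties

val props = new Properties()
props.setProperty("group.id", "test-group")

// Hashtable#contains checks values, so looking up a key this way is always false here:
props.contains("group.id")    // false: "group.id" is a key, not a stored value
// containsKey performs the check the guard actually needs:
props.containsKey("group.id") // true
{% endhighlight %}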
2 changes: 1 addition & 1 deletion external/mqtt/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion external/twitter/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion external/zeromq/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

7 changes: 7 additions & 0 deletions extras/kinesis-asl/pom.xml
@@ -40,6 +40,13 @@
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
@@ -26,23 +26,18 @@ import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionIn
import com.amazonaws.services.kinesis.clientlibrary.types.ShutdownReason
import com.amazonaws.services.kinesis.model.Record
import org.mockito.Mockito._
// scalastyle:off
// To avoid introducing a dependency on Spark core tests, simply use scalatest's FunSuite
// here instead of our own SparkFunSuite. Introducing the dependency has caused problems
// in the past (SPARK-8781) that are complicated by bugs in the maven shade plugin (MSHADE-148).
import org.scalatest.{BeforeAndAfter, FunSuite, Matchers}
import org.scalatest.{BeforeAndAfter, Matchers}
import org.scalatest.mock.MockitoSugar

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Milliseconds, Seconds, StreamingContext}
import org.apache.spark.streaming.{Milliseconds, Seconds, StreamingContext, TestSuiteBase}
import org.apache.spark.util.{Clock, ManualClock, Utils}

/**
* Suite of Kinesis streaming receiver tests focusing mostly on the KinesisRecordProcessor
*/
class KinesisReceiverSuite extends FunSuite with Matchers with BeforeAndAfter
with MockitoSugar {
// scalastyle:on
class KinesisReceiverSuite extends TestSuiteBase with Matchers with BeforeAndAfter
with MockitoSugar {

val app = "TestKinesisReceiver"
val stream = "mySparkStream"
@@ -62,23 +57,24 @@ class KinesisReceiverSuite extends FunSuite with Matchers with BeforeAndAfter
var checkpointStateMock: KinesisCheckpointState = _
var currentClockMock: Clock = _

before {
override def beforeFunction(): Unit = {
receiverMock = mock[KinesisReceiver]
checkpointerMock = mock[IRecordProcessorCheckpointer]
checkpointClockMock = mock[ManualClock]
checkpointStateMock = mock[KinesisCheckpointState]
currentClockMock = mock[Clock]
}

after {
override def afterFunction(): Unit = {
super.afterFunction()
// Since this suite was originally written using EasyMock, add this to preserve the old
// mocking semantics (see SPARK-5735 for more details)
verifyNoMoreInteractions(receiverMock, checkpointerMock, checkpointClockMock,
checkpointStateMock, currentClockMock)
}

test("KinesisUtils API") {
val ssc = new StreamingContext("local[2]", getClass.getSimpleName, Seconds(1))
val ssc = new StreamingContext(master, framework, batchDuration)
// Tests the API, does not actually test data receiving
val kinesisStream1 = KinesisUtils.createStream(ssc, "mySparkStream",
"https://kinesis.us-west-2.amazonaws.com", Seconds(2),
2 changes: 1 addition & 1 deletion graphx/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion launcher/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -133,7 +133,7 @@ void addPermGenSizeOpt(List<String> cmd) {
}
}

cmd.add("-XX:MaxPermSize=128m");
cmd.add("-XX:MaxPermSize=256m");
}

void addOptionString(List<String> cmd, String options) {
@@ -194,7 +194,7 @@ private void testCmdBuilder(boolean isDriver) throws Exception {
if (isDriver) {
assertEquals("-XX:MaxPermSize=256m", arg);
} else {
assertEquals("-XX:MaxPermSize=128m", arg);
assertEquals("-XX:MaxPermSize=256m", arg);
}
}
}
2 changes: 1 addition & 1 deletion mllib/pom.xml
@@ -21,7 +21,7 @@
Expand Up @@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion network/common/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion network/shuffle/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>

2 changes: 1 addition & 1 deletion network/yarn/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.4.0-csd-5-SNAPSHOT</version>
<version>1.4.1-csd-1-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
