update #12

Merged: 62 commits, merged on Dec 5, 2014

Commits

2b233f5
Documentation: add description for repartitionAndSortWithinPartitions
msiddalingaiah Dec 1, 2014
5db8dca
[SPARK-4258][SQL][DOC] Documents spark.sql.parquet.filterPushdown
liancheng Dec 1, 2014
bafee67
[SQL] add @group tab in limit() and count()
Dec 1, 2014
b57365a
[SPARK-4358][SQL] Let BigDecimal do checking type compatibility
viirya Dec 1, 2014
6a9ff19
[SPARK-4650][SQL] Supporting multi column support in countDistinct fu…
ravipesala Dec 1, 2014
bc35381
[SPARK-4658][SQL] Code documentation issue in DDL of datasource API
ravipesala Dec 1, 2014
7b79957
[SQL] Minor fix for doc and comment
scwf Dec 1, 2014
5edbcbf
[SQL][DOC] Date type in SQL programming guide
adrian-wang Dec 1, 2014
4df60a8
[SPARK-4529] [SQL] support view with column alias
adrian-wang Dec 2, 2014
d3e02dd
[SPARK-4268][SQL] Use #::: to get benefit from Stream in SqlLexical.a…
zsxwing Dec 2, 2014
b0a46d8
MAINTENANCE: Automated closing of pull requests.
pwendell Dec 2, 2014
64f3175
[SPARK-4611][MLlib] Implement the efficient vector norm
Dec 2, 2014
6dfe38a
[SPARK-4397][Core] Cleanup 'import SparkContext._' in core
zsxwing Dec 2, 2014
d9a148b
[SPARK-4686] Link to allowed master URLs is broken
kayousterhout Dec 2, 2014
b1f8fe3
Indent license header properly for interfaces.scala.
rxin Dec 2, 2014
e75e04f
[SPARK-4536][SQL] Add sqrt and abs to Spark SQL DSL
sarutak Dec 2, 2014
69b6fed
[SPARK-4663][sql]add finally to avoid resource leak
baishuo Dec 2, 2014
1066427
[SPARK-4676][SQL] JavaSchemaRDD.schema may throw NullType MatchError …
YanTangZhai Dec 2, 2014
f6df609
[SPARK-4593][SQL] Return null when denominator is 0
adrian-wang Dec 2, 2014
1f5ddf1
[SPARK-4670] [SQL] wrong symbol for bitwise not
adrian-wang Dec 2, 2014
3ae0cda
[SPARK-4695][SQL] Get result using executeCollect
scwf Dec 2, 2014
2d4f6e7
Minor nit style cleanup in GraphX.
rxin Dec 2, 2014
5da21f0
[Release] Translate unknown author names automatically
Dec 3, 2014
fc0a147
[SPARK-4672][GraphX]Perform checkpoint() on PartitionsRDD to shorten …
Dec 3, 2014
17c162f
[SPARK-4672][GraphX]Non-transient PartitionsRDDs will lead to StackOv…
Dec 3, 2014
77be8b9
[SPARK-4672][Core]Checkpoint() should clear f to shorten the serializ…
Dec 3, 2014
8af551f
[SPARK-4397][Core] Change the 'since' value of '@deprecated' to '1.3.0'
zsxwing Dec 3, 2014
4ac2151
[SPARK-4710] [mllib] Eliminate MLlib compilation warnings
jkbradley Dec 3, 2014
7fc49ed
[SPARK-4708][MLLib] Make k-mean runs two/three times faster with dens…
Dec 3, 2014
d005429
[SPARK-4717][MLlib] Optimize BLAS library to avoid de-reference multi…
Dec 3, 2014
a975dc3
SPARK-2624 add datanucleus jars to the container in yarn-cluster
Dec 3, 2014
96786e3
[SPARK-4701] Typo in sbt/sbt
tsudukim Dec 3, 2014
edd3cd4
[SPARK-4715][Core] Make sure tryToAcquire won't return a negative value
zsxwing Dec 3, 2014
692f493
[SPARK-4642] Add description about spark.yarn.queue to running-on-YAR…
tsudukim Dec 3, 2014
90ec643
[HOT FIX] [YARN] Check whether `/lib` exists before listing its files
Dec 3, 2014
513ef82
[SPARK-4552][SQL] Avoid exception when reading empty parquet data thr…
marmbrus Dec 3, 2014
96b2785
[SPARK-4498][core] Don't transition ExecutorInfo to RUNNING until Dri…
markhamstra Dec 3, 2014
1826372
[SPARK-4085] Propagate FetchFailedException when Spark fails to read …
rxin Dec 4, 2014
27ab0b8
[SPARK-4711] [mllib] [docs] Programming guide advice on choosing opti…
jkbradley Dec 4, 2014
657a888
[SPARK-4580] [SPARK-4610] [mllib] [docs] Documentation for tree ensem…
jkbradley Dec 4, 2014
a4dfb4e
[Release] Correctly translate contributors name in release notes
Dec 4, 2014
3cdae03
MAINTENANCE: Automated closing of pull requests.
pwendell Dec 4, 2014
ed88db4
[SQL] remove unnecessary import
Dec 4, 2014
c3ad486
[SPARK-4719][API] Consolidate various narrow dep RDD classes with Map…
rxin Dec 4, 2014
20bfea4
[SPARK-4685] Include all spark.ml and spark.mllib packages in JavaDoc…
Lewuathe Dec 4, 2014
c6c7165
[SQL] Minor: Avoid calling Seq#size in a loop
aarondav Dec 4, 2014
529439b
[docs] Fix outdated comment in tuning guide
jkbradley Dec 4, 2014
469a6e5
[SPARK-4575] [mllib] [docs] spark.ml pipelines doc + bug fixes
jkbradley Dec 4, 2014
7e758d7
[FIX][DOC] Fix broken links in ml-guide.md
mengxr Dec 4, 2014
28c7aca
[SPARK-4683][SQL] Add a beeline.cmd to run on Windows
liancheng Dec 4, 2014
8106b1e
[SPARK-4253] Ignore spark.driver.host in yarn-cluster and standalone-…
WangTaoTheTonic Dec 4, 2014
8dae26f
[HOTFIX] Fixing two issues with the release script.
pwendell Dec 4, 2014
794f3ae
[SPARK-4745] Fix get_existing_cluster() function with multiple securi…
alexdebrie Dec 4, 2014
743a889
[SPARK-4459] Change groupBy type parameter from K to U
Dec 4, 2014
ab8177d
[SPARK-4652][DOCS] Add docs about spark-git-repo option
Lewuathe Dec 4, 2014
ed92b47
[SPARK-4397] Move object RDD to the front of RDD.scala.
rxin Dec 5, 2014
ddfc09c
[SPARK-4421] Wrong link in spark-standalone.html
tsudukim Dec 5, 2014
15cf3b0
Fix typo in Spark SQL docs.
andyk Dec 5, 2014
ca37903
[SPARK-4464] Description about configuration options need to be modif…
tsudukim Dec 5, 2014
87437df
Revert "[HOT FIX] [YARN] Check whether `/lib` exists before listing i…
Dec 5, 2014
fd85253
Revert "SPARK-2624 add datanucleus jars to the container in yarn-clus…
Dec 5, 2014
f5801e8
[SPARK-4753][SQL] Use catalyst for partition pruning in newParquet.
marmbrus Dec 5, 2014

Files changed

3 changes: 3 additions & 0 deletions .gitignore
@@ -5,6 +5,7 @@
*.ipr
*.iml
*.iws
*.pyc
.idea/
.idea_modules/
sbt/*.jar
@@ -49,6 +50,8 @@ dependency-reduced-pom.xml
checkpoint
derby.log
dist/
dev/create-release/*txt
dev/create-release/*new
spark-*-bin-*.tgz
unit-tests.log
/lib/
21 changes: 21 additions & 0 deletions bin/beeline.cmd
@@ -0,0 +1,21 @@
@echo off

rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem

set SPARK_HOME=%~dp0..
cmd /V /E /C %SPARK_HOME%\bin\spark-class.cmd org.apache.hive.beeline.BeeLine %*
44 changes: 22 additions & 22 deletions core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -1630,28 +1630,28 @@ object SparkContext extends Logging {
// following ones.

@deprecated("Replaced by implicit objects in AccumulatorParam. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
object DoubleAccumulatorParam extends AccumulatorParam[Double] {
def addInPlace(t1: Double, t2: Double): Double = t1 + t2
def zero(initialValue: Double) = 0.0
}

@deprecated("Replaced by implicit objects in AccumulatorParam. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
object IntAccumulatorParam extends AccumulatorParam[Int] {
def addInPlace(t1: Int, t2: Int): Int = t1 + t2
def zero(initialValue: Int) = 0
}

@deprecated("Replaced by implicit objects in AccumulatorParam. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
object LongAccumulatorParam extends AccumulatorParam[Long] {
def addInPlace(t1: Long, t2: Long) = t1 + t2
def zero(initialValue: Long) = 0L
}

@deprecated("Replaced by implicit objects in AccumulatorParam. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
object FloatAccumulatorParam extends AccumulatorParam[Float] {
def addInPlace(t1: Float, t2: Float) = t1 + t2
def zero(initialValue: Float) = 0f
@@ -1662,34 +1662,34 @@ object SparkContext extends Logging {
// and just call the corresponding functions in `object RDD`.

@deprecated("Replaced by implicit functions in the RDD companion object. This is " +
"kept here only for backward compatibility.", "1.2.0")
"kept here only for backward compatibility.", "1.3.0")
def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
(implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null) = {
RDD.rddToPairRDDFunctions(rdd)
}

@deprecated("Replaced by implicit functions in the RDD companion object. This is " +
"kept here only for backward compatibility.", "1.2.0")
"kept here only for backward compatibility.", "1.3.0")
def rddToAsyncRDDActions[T: ClassTag](rdd: RDD[T]) = RDD.rddToAsyncRDDActions(rdd)

@deprecated("Replaced by implicit functions in the RDD companion object. This is " +
"kept here only for backward compatibility.", "1.2.0")
"kept here only for backward compatibility.", "1.3.0")
def rddToSequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable: ClassTag](
rdd: RDD[(K, V)]) =
RDD.rddToSequenceFileRDDFunctions(rdd)

@deprecated("Replaced by implicit functions in the RDD companion object. This is " +
"kept here only for backward compatibility.", "1.2.0")
"kept here only for backward compatibility.", "1.3.0")
def rddToOrderedRDDFunctions[K : Ordering : ClassTag, V: ClassTag](
rdd: RDD[(K, V)]) =
RDD.rddToOrderedRDDFunctions(rdd)

@deprecated("Replaced by implicit functions in the RDD companion object. This is " +
"kept here only for backward compatibility.", "1.2.0")
"kept here only for backward compatibility.", "1.3.0")
def doubleRDDToDoubleRDDFunctions(rdd: RDD[Double]) = RDD.doubleRDDToDoubleRDDFunctions(rdd)

@deprecated("Replaced by implicit functions in the RDD companion object. This is " +
"kept here only for backward compatibility.", "1.2.0")
"kept here only for backward compatibility.", "1.3.0")
def numericRDDToDoubleRDDFunctions[T](rdd: RDD[T])(implicit num: Numeric[T]) =
RDD.numericRDDToDoubleRDDFunctions(rdd)

@@ -1722,43 +1722,43 @@ object SparkContext extends Logging {
// and just call the corresponding functions in `object WritableConverter`.

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def intWritableConverter(): WritableConverter[Int] =
WritableConverter.intWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def longWritableConverter(): WritableConverter[Long] =
WritableConverter.longWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def doubleWritableConverter(): WritableConverter[Double] =
WritableConverter.doubleWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def floatWritableConverter(): WritableConverter[Float] =
WritableConverter.floatWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def booleanWritableConverter(): WritableConverter[Boolean] =
WritableConverter.booleanWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def bytesWritableConverter(): WritableConverter[Array[Byte]] =
WritableConverter.bytesWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
"backward compatibility.", "1.3.0")
def stringWritableConverter(): WritableConverter[String] =
WritableConverter.stringWritableConverter()

@deprecated("Replaced by implicit functions in WritableConverter. This is kept here only for " +
"backward compatibility.", "1.2.0")
def writableWritableConverter[T <: Writable]() =
"backward compatibility.", "1.3.0")
def writableWritableConverter[T <: Writable](): WritableConverter[T] =
WritableConverter.writableWritableConverter()

/**
@@ -2017,15 +2017,15 @@ object WritableConverter {
simpleWritableConverter[Boolean, BooleanWritable](_.get)

implicit def bytesWritableConverter(): WritableConverter[Array[Byte]] = {
simpleWritableConverter[Array[Byte], BytesWritable](bw =>
simpleWritableConverter[Array[Byte], BytesWritable] { bw =>
// getBytes method returns array which is longer then data to be returned
Arrays.copyOfRange(bw.getBytes, 0, bw.getLength)
)
}
}

implicit def stringWritableConverter(): WritableConverter[String] =
simpleWritableConverter[String, Text](_.toString)

implicit def writableWritableConverter[T <: Writable]() =
implicit def writableWritableConverter[T <: Writable](): WritableConverter[T] =
new WritableConverter[T](_.runtimeClass.asInstanceOf[Class[T]], _.asInstanceOf[T])
}
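
The changes above mostly bump the @deprecated since-version of these compatibility helpers from 1.2.0 to 1.3.0 (SPARK-4397). As a hedged illustration of the replacement style the deprecation messages point to, here is a minimal sketch assuming a build with this patch applied; the object name, master URL, and app name are made up for the example:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AccumulatorSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("accumulator-sketch"))

    // The AccumulatorParam[Int] instance is resolved implicitly from the
    // AccumulatorParam companion object, so the deprecated
    // SparkContext.IntAccumulatorParam is never referenced.
    val total = sc.accumulator(0)
    sc.parallelize(1 to 100).foreach(total += _)
    println(total.value)  // 5050

    sc.stop()
  }
}
```
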
18 changes: 10 additions & 8 deletions core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala
@@ -28,7 +28,6 @@ import com.google.common.base.Optional
import org.apache.hadoop.io.compress.CompressionCodec

import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.annotation.Experimental
import org.apache.spark.api.java.JavaPairRDD._
import org.apache.spark.api.java.JavaSparkContext.fakeClassTag
@@ -212,8 +211,9 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
* Return an RDD of grouped elements. Each group consists of a key and a sequence of elements
* mapping to that key.
*/
def groupBy[K](f: JFunction[T, K]): JavaPairRDD[K, JIterable[T]] = {
implicit val ctagK: ClassTag[K] = fakeClassTag
def groupBy[U](f: JFunction[T, U]): JavaPairRDD[U, JIterable[T]] = {
// The type parameter is U instead of K in order to work around a compiler bug; see SPARK-4459
implicit val ctagK: ClassTag[U] = fakeClassTag
implicit val ctagV: ClassTag[JList[T]] = fakeClassTag
JavaPairRDD.fromRDD(groupByResultToJava(rdd.groupBy(f)(fakeClassTag)))
}
@@ -222,10 +222,11 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
* Return an RDD of grouped elements. Each group consists of a key and a sequence of elements
* mapping to that key.
*/
def groupBy[K](f: JFunction[T, K], numPartitions: Int): JavaPairRDD[K, JIterable[T]] = {
implicit val ctagK: ClassTag[K] = fakeClassTag
def groupBy[U](f: JFunction[T, U], numPartitions: Int): JavaPairRDD[U, JIterable[T]] = {
// The type parameter is U instead of K in order to work around a compiler bug; see SPARK-4459
implicit val ctagK: ClassTag[U] = fakeClassTag
implicit val ctagV: ClassTag[JList[T]] = fakeClassTag
JavaPairRDD.fromRDD(groupByResultToJava(rdd.groupBy(f, numPartitions)(fakeClassTag[K])))
JavaPairRDD.fromRDD(groupByResultToJava(rdd.groupBy(f, numPartitions)(fakeClassTag[U])))
}

/**
@@ -459,8 +460,9 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
/**
* Creates tuples of the elements in this RDD by applying `f`.
*/
def keyBy[K](f: JFunction[T, K]): JavaPairRDD[K, T] = {
implicit val ctag: ClassTag[K] = fakeClassTag
def keyBy[U](f: JFunction[T, U]): JavaPairRDD[U, T] = {
// The type parameter is U instead of K in order to work around a compiler bug; see SPARK-4459
implicit val ctag: ClassTag[U] = fakeClassTag
JavaPairRDD.fromRDD(rdd.keyBy(f))
}

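The groupBy/keyBy hunks above only rename the method type parameter from K to U to work around a javac inference bug (SPARK-4459); behaviour and call sites are unchanged. A small sketch, driving the Java API from Scala purely to exercise the new signature (names and values are illustrative, assuming a build containing this commit):

```scala
import java.util.Arrays

import org.apache.spark.SparkContext
import org.apache.spark.api.java.JavaSparkContext
import org.apache.spark.api.java.function.{Function => JFunction}

object GroupBySketch {
  def main(args: Array[String]): Unit = {
    val jsc = new JavaSparkContext(new SparkContext("local[2]", "groupby-sketch"))
    val nums = jsc.parallelize(Arrays.asList(Int.box(1), Int.box(2), Int.box(3), Int.box(4), Int.box(5)))

    // groupBy[U] now takes U where it previously took K; inference and results are identical.
    val grouped = nums.groupBy(new JFunction[Integer, java.lang.Boolean] {
      override def call(x: Integer): java.lang.Boolean = x % 2 == 0
    })
    println(grouped.countByKey())  // roughly {false=3, true=2}

    jsc.stop()
  }
}
```
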
@@ -34,7 +34,6 @@ import org.apache.hadoop.io.compress.CompressionCodec
import org.apache.hadoop.mapred.{InputFormat, OutputFormat, JobConf}
import org.apache.hadoop.mapreduce.{InputFormat => NewInputFormat, OutputFormat => NewOutputFormat}
import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.api.java.{JavaSparkContext, JavaPairRDD, JavaRDD}
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD
5 changes: 5 additions & 0 deletions core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -281,6 +281,11 @@ object SparkSubmit {
sysProps.getOrElseUpdate(k, v)
}

// Ignore invalid spark.driver.host in cluster modes.
if (deployMode == CLUSTER) {
sysProps -= ("spark.driver.host")
}

// Resolve paths in certain spark properties
val pathConfigs = Seq(
"spark.jars",

@@ -134,6 +134,7 @@ private[spark] class AppClient(
val fullId = appId + "/" + id
logInfo("Executor added: %s on %s (%s) with %d cores".format(fullId, workerId, hostPort,
cores))
master ! ExecutorStateChanged(appId, id, ExecutorState.RUNNING, None, None)
listener.executorAdded(fullId, workerId, hostPort, cores, memory)

case ExecutorUpdated(id, state, message, exitStatus) =>

@@ -144,8 +144,6 @@ private[spark] class ExecutorRunner(
Files.write(header, stderr, UTF_8)
stderrAppender = FileAppender(process.getErrorStream, stderr, conf)

state = ExecutorState.RUNNING
worker ! ExecutorStateChanged(appId, execId, state, None, None)
// Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
// or with nonzero exit code
val exitCode = process.waitFor()
4 changes: 2 additions & 2 deletions core/src/main/scala/org/apache/spark/package.scala
@@ -27,8 +27,8 @@ package org.apache
* contains operations available only on RDDs of Doubles; and
* [[org.apache.spark.rdd.SequenceFileRDDFunctions]] contains operations available on RDDs that can
* be saved as SequenceFiles. These operations are automatically available on any RDD of the right
* type (e.g. RDD[(Int, Int)] through implicit conversions when you
* `import org.apache.spark.SparkContext._`.
* type (e.g. RDD[(Int, Int)] through implicit conversions except `saveAsSequenceFile`. You need to
* `import org.apache.spark.SparkContext._` to make `saveAsSequenceFile` work.
*
* Java programmers should reference the [[org.apache.spark.api.java]] package
* for Spark programming APIs in Java.
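
The reworded scaladoc above tracks SPARK-4397: the implicit conversions now live on the RDD companion object, so pair-RDD operations resolve without the old import, while saveAsSequenceFile is the documented exception. A hedged sketch under that assumption (object name, master URL, and output path are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ImplicitsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("implicits-sketch"))

    // reduceByKey is supplied by PairRDDFunctions via the implicits on `object RDD`,
    // so no `import org.apache.spark.SparkContext._` is needed here.
    val counts = sc.parallelize(Seq("a", "b", "a")).map(w => (w, 1)).reduceByKey(_ + _)
    counts.collect().foreach(println)

    // saveAsSequenceFile still requires the import at this point:
    //   import org.apache.spark.SparkContext._
    //   counts.saveAsSequenceFile("/tmp/word-counts-seq")

    sc.stop()
  }
}
```
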
@@ -27,7 +27,6 @@ import org.apache.spark.{ComplexFutureAction, FutureAction, Logging}

/**
* A set of asynchronous RDD actions available through an implicit conversion.
* Import `org.apache.spark.SparkContext._` at the top of your program to use these functions.
*/
class AsyncRDDActions[T: ClassTag](self: RDD[T]) extends Serializable with Logging {

12 changes: 6 additions & 6 deletions core/src/main/scala/org/apache/spark/rdd/BinaryFileRDD.scala
@@ -24,12 +24,12 @@ import org.apache.spark.input.StreamFileInputFormat
import org.apache.spark.{ Partition, SparkContext }

private[spark] class BinaryFileRDD[T](
sc: SparkContext,
inputFormatClass: Class[_ <: StreamFileInputFormat[T]],
keyClass: Class[String],
valueClass: Class[T],
@transient conf: Configuration,
minPartitions: Int)
sc: SparkContext,
inputFormatClass: Class[_ <: StreamFileInputFormat[T]],
keyClass: Class[String],
valueClass: Class[T],
@transient conf: Configuration,
minPartitions: Int)
extends NewHadoopRDD[String, T](sc, inputFormatClass, keyClass, valueClass, conf) {

override def getPartitions: Array[Partition] = {

@@ -27,7 +27,6 @@ import org.apache.spark.util.StatCounter

/**
* Extra functions available on RDDs of Doubles through an implicit conversion.
* Import `org.apache.spark.SparkContext._` at the top of your program to use these functions.
*/
class DoubleRDDFunctions(self: RDD[Double]) extends Logging with Serializable {
/** Add up the elements in this RDD. */
35 changes: 0 additions & 35 deletions core/src/main/scala/org/apache/spark/rdd/FilteredRDD.scala

This file was deleted.

34 changes: 0 additions & 34 deletions core/src/main/scala/org/apache/spark/rdd/FlatMappedRDD.scala

This file was deleted.
