Commit f0a10e1

Author: Davies Liu

address comments from @shivaram
1 parent 3ef7cf3 commit f0a10e1

5 files changed: +15 -11 lines

R/pkg/R/RDD.R

Lines changed: 1 addition & 1 deletion
@@ -238,7 +238,7 @@ setMethod("cache",
 #' @aliases persist,RDD-method
 setMethod("persist",
           signature(x = "RDD", newLevel = "character"),
-          function(x, newLevel) {
+          function(x, newLevel = "MEMORY_ONLY") {
             callJMethod(getJRDD(x), "persist", getStorageLevel(newLevel))
             x@env$isCached <- TRUE
             x
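
The `persist` method shown above takes a storage-level name that `getStorageLevel` resolves for the JVM; the change adds `"MEMORY_ONLY"`, the level `cache` requests, as the documented default. A minimal usage sketch with explicit levels (hypothetical session; `sparkR.init` and `parallelize` are assumed to be reachable, which may require the private RDD API):

{% highlight r %}
library(SparkR)

# Hypothetical local session for illustration only.
sc  <- sparkR.init(master = "local[2]")
rdd <- parallelize(sc, 1:100, 2L)

# Explicit storage level, resolved by getStorageLevel():
rdd <- persist(rdd, "MEMORY_AND_DISK")

# "MEMORY_ONLY" -- the default named in this commit -- is the same level
# that cache() requests:
cachedRdd <- cache(parallelize(sc, 1:100, 2L))
{% endhighlight %}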

R/pkg/R/pairRDD.R

Lines changed: 1 addition & 1 deletion
@@ -327,7 +327,7 @@ setMethod("reduceByKey",
               convertEnvsToList(keys, vals)
             }
             locallyReduced <- lapplyPartition(x, reduceVals)
-            shuffled <- partitionBy(locallyReduced, as.integer(numPartitions))
+            shuffled <- partitionBy(locallyReduced, numToInt(numPartitions))
             lapplyPartition(shuffled, reduceVals)
           })
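
`numToInt` is a small helper in the SparkR package; compared with a bare `as.integer`, the intent appears to be to validate the coercion of a numeric partition count rather than truncate it silently. A rough sketch of that idea (the name `numToIntSketch` is illustrative, not the real implementation):

{% highlight r %}
# Illustrative sketch only -- not the actual SparkR numToInt().
# Coerce a numeric to integer, warning when the value is not integral
# instead of silently truncating the way as.integer(2.7) would.
numToIntSketch <- function(num) {
  if (as.integer(num) != num) {
    warning("coercing a non-integral numeric to integer")
  }
  as.integer(num)
}

numToIntSketch(2)    # 2L
numToIntSketch(2.7)  # 2L, with a warning
{% endhighlight %}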

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ Example applications are also provided in Python. For example,
 
     ./bin/spark-submit examples/src/main/python/pi.py 10
 
-Spark also provides an experimental R API since 1.4 (only RDD and DtaFrame APIs included).
+Spark also provides an experimental R API since 1.4 (only RDD and DataFrames APIs included).
 To run Spark interactively in a R interpreter, use `bin/sparkR`:
 
     ./bin/sparkR --master local[2]

docs/programming-guide.md

Lines changed: 1 addition & 1 deletion
@@ -867,7 +867,7 @@ There are three recommended ways to do this:
 For example, to pass a longer function, consider the code below:
 
 {% highlight r %}
-"""MyScript.py"""
+"""MyScript.R"""
 myFunc <- function(s) {
   words = strsplit(s, " ")[[1]]
   length(words)
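
The guide's snippet defines a named function so it can be passed to a transformation by name; a short, hypothetical usage of that pattern (the RDD `textFile` and the `map` call below are illustrative additions, not part of the diff):

{% highlight r %}
# Hypothetical usage of the named-function pattern from the guide;
# `textFile` stands in for any RDD of text lines.
myFunc <- function(s) {
  words = strsplit(s, " ")[[1]]
  length(words)
}

wordsPerLine <- map(textFile, myFunc)
{% endhighlight %}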

docs/quick-start.md

Lines changed: 11 additions & 7 deletions
@@ -225,7 +225,11 @@ For example, we'll define a `mymax` function to make this code easier to underst
 One common data flow pattern is MapReduce, as popularized by Hadoop. Spark can implement MapReduce flows easily:
 
 {% highlight r %}
-> wordCounts <- reduceByKey(map(flatMap(textFile, function(line) strsplit(line, " ")[[1]]), function(word) list(word, 1)), "+", 2)
+> wordCounts <- reduceByKey(
+    map(
+      flatMap(textFile, function(line) strsplit(line, " ")[[1]]),
+      function(word) list(word, 1)),
+    "+", 2)
 {% endhighlight %}
 
 Here, we combined the [`flatMap`](programming-guide.html#transformations), [`map`](programming-guide.html#transformations) and [`reduceByKey`](programming-guide.html#transformations) transformations to compute the per-word counts in the file as an RDD of (string, numeric) pairs. To collect the word counts in our shell, we can use the [`collect`](programming-guide.html#actions) action:
@@ -256,10 +260,10 @@ scala> linesWithSpark.cache()
 res7: spark.RDD[String] = spark.FilteredRDD@17e51082
 
 scala> linesWithSpark.count()
-res8: Long = 15
+res8: Long = 19
 
 scala> linesWithSpark.count()
-res9: Long = 15
+res9: Long = 19
 {% endhighlight %}
 
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is
@@ -274,10 +278,10 @@ a cluster, as described in the [programming guide](programming-guide.html#initia
 >>> linesWithSpark.cache()
 
 >>> linesWithSpark.count()
-15
+19
 
 >>> linesWithSpark.count()
-15
+19
 {% endhighlight %}
 
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is
@@ -292,10 +296,10 @@ a cluster, as described in the [programming guide](programming-guide.html#initia
 > cache(linesWithSpark)
 
 > count(linesWithSpark)
-[1] 15
+[1] 19
 
 > count(linesWithSpark)
-[1] 15
+[1] 19
 {% endhighlight %}
 
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is
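
The reformatted `wordCounts` one-liner in the first quick-start hunk chains `flatMap`, `map`, and `reduceByKey`; written step by step, the same pipeline looks like the sketch below (hypothetical shell session, assuming `textFile` is the RDD of README lines used throughout the quick start):

{% highlight r %}
# Hypothetical step-by-step equivalent of the wordCounts one-liner above;
# assumes `textFile` is the RDD of text lines from the quick start.
words <- flatMap(textFile, function(line) strsplit(line, " ")[[1]])
pairs <- map(words, function(word) list(word, 1))
wordCounts <- reduceByKey(pairs, "+", 2)

# Bring the (word, count) pairs back to the driver with the collect action:
output <- collect(wordCounts)
{% endhighlight %}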
