
Commit 3ef7cf3

Author: Davies Liu
use + instead of function(a,b) a+b
1 parent: 2f10a77

File tree

2 files changed

+5
-5
lines changed


docs/programming-guide.md

Lines changed: 4 additions & 4 deletions
@@ -363,7 +363,7 @@ data <- c(1, 2, 3, 4, 5)
 distData <- parallelize(sc, data)
 {% endhighlight %}

-Once created, the distributed dataset (`distData`) can be operated on in parallel. For example, we can call `reduce(distData, function(a, b) {a + b})` to add up the elements of the list.
+Once created, the distributed dataset (`distData`) can be operated on in parallel. For example, we can call `reduce(distData, "+")` to add up the elements of the list.
 We describe operations on distributed datasets later on.

 </div>
@@ -551,7 +551,7 @@ Text file RDDs can be created using `textFile` method. This method takes an URI
 distFile <- textFile(sc, "data.txt")
 {% endhighlight %}

-Once created, `distFile` can be acted on by dataset operations. For example, we can add up the sizes of all the lines using the `map` and `reduce` operations as follows: `reduce(map(distFile, length), function(a, b) {a + b})`.
+Once created, `distFile` can be acted on by dataset operations. For example, we can add up the sizes of all the lines using the `map` and `reduce` operations as follows: `reduce(map(distFile, length), "+")`.

 Some notes on reading files with Spark:

@@ -667,7 +667,7 @@ To illustrate RDD basics, consider the simple program below:
 {% highlight r %}
 lines <- textFile(sc, "data.txt")
 lineLengths <- map(lines, length)
-totalLength <- reduce(lineLengths, function(a, b) {a + b})
+totalLength <- reduce(lineLengths, "+")
 {% endhighlight %}

 The first line defines a base RDD from an external file. This dataset is not loaded in memory or
@@ -1070,7 +1070,7 @@ many times each line of text occurs in a file:
 {% highlight r %}
 lines <- textFile(sc, "data.txt")
 pairs <- map(lines, function(s) list(s, 1))
-counts <- reduceByKey(pairs, function(a, b){a + b})
+counts <- reduceByKey(pairs, "+")
 {% endhighlight %}

 We could also use `sortByKey(counts)`, for example, to sort the pairs alphabetically, and finally

docs/quick-start.md

Lines changed: 1 addition & 1 deletion
@@ -225,7 +225,7 @@ For example, we'll define a `mymax` function to make this code easier to understand
 One common data flow pattern is MapReduce, as popularized by Hadoop. Spark can implement MapReduce flows easily:

 {% highlight r %}
-> wordCounts <- reduceByKey(map(flatMap(textFile, function(line) strsplit(line, " ")[[1]]), function(word) list(word, 1)), function(a, b) {a + b}, 2)
+> wordCounts <- reduceByKey(map(flatMap(textFile, function(line) strsplit(line, " ")[[1]]), function(word) list(word, 1)), "+", 2)
 {% endhighlight %}

 Here, we combined the [`flatMap`](programming-guide.html#transformations), [`map`](programming-guide.html#transformations) and [`reduceByKey`](programming-guide.html#transformations) transformations to compute the per-word counts in the file as an RDD of (string, numeric) pairs. To collect the word counts in our shell, we can use the [`collect`](programming-guide.html#actions) action:
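The change in every hunk above is the same: pass the operator by name instead of spelling out a two-argument anonymous function (in R, `"+"` names the built-in addition function, so a reducer can look it up and call it directly). As a rough analogue in Python's standard library (not SparkR; `functools.reduce` and `operator.add` stand in for SparkR's `reduce` and R's `"+"`):

```python
from functools import reduce
import operator

line_lengths = [9, 4, 12]

# Anonymous-function form, analogous to function(a, b) {a + b} in R:
total_lambda = reduce(lambda a, b: a + b, line_lengths)

# Named-operator form, analogous to passing "+" in this commit:
total_op = reduce(operator.add, line_lengths)

print(total_lambda, total_op)  # both print 25
```

Both forms compute the same sum; the named-operator form is simply shorter and avoids constructing a throwaway closure.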

0 commit comments
