[SPARK-6806] [SparkR] [Docs] Fill in SparkR examples in programming guide #5442
Conversation
sqlCtx -> sqlContext

You can check the docs by:

    $ cd docs
    $ SKIP_SCALADOC=1 jekyll serve

cc @shivaram
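(Concretely, the rename means doc examples that created the SQL context as `sqlCtx` now call it `sqlContext`. A minimal sketch, assuming `sparkRSQL.init` as the SparkR SQL-context constructor of this era:)

```r
# Before the rename, examples wrote: sqlCtx <- sparkRSQL.init(sc)
sqlContext <- sparkRSQL.init(sc)
```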
Test build #29975 has started for PR 5442 at commit
Test build #29979 has started for PR 5442 at commit
@cafreeman -- If you get a chance, could you take a look at this too?
Test build #29975 has finished for PR 5442 at commit
Test PASSed.
Test build #29979 has finished for PR 5442 at commit
Test PASSed.
context connects to using the `--master` argument. You can also add dependencies
(e.g. Spark Packages) to your shell session by supplying a comma-separated list of Maven coordinates
to the `--packages` argument. Any additional repositories where dependencies might exist (e.g. Sonatype)
can be passed to the `--repositories` argument. For example, to run `bin/pyspark` on exactly four cores, use:
This should refer to SparkR instead of PySpark.
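For instance, the corrected sentence would presumably end with something like `./bin/sparkR --master local[4]` instead of `bin/pyspark` (a guess that the SparkR shell script is `bin/sparkR`; the `--packages` and `--repositories` flags described above would apply to it the same way).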
Left some comments inline, but most of them seem like minor details leftover from translating the …
@@ -54,6 +54,15 @@ Example applications are also provided in Python. For example,

    ./bin/spark-submit examples/src/main/python/pi.py 10

Spark also provides an R API. To run Spark interactively in an R interpreter, use
I think here (or somewhere close by) we should say that SparkR is an experimental component in <SPARK_VERSION>, and that only the RDD and DataFrame APIs have been implemented in SparkR.
Test FAILed.
Test build #660 has started for PR 5442 at commit
Test build #30052 has started for PR 5442 at commit
Test build #660 has finished for PR 5442 at commit
Test build #30052 has finished for PR 5442 at commit
Test PASSed.
@@ -327,7 +327,7 @@ setMethod("reduceByKey",
        convertEnvsToList(keys, vals)
      }
      locallyReduced <- lapplyPartition(x, reduceVals)
-     shuffled <- partitionBy(locallyReduced, numPartitions)
+     shuffled <- partitionBy(locallyReduced, as.integer(numPartitions))
We should use `numToInt` from utils.R here.
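A sketch of the suggested change, assuming `numToInt` is SparkR's internal helper in utils.R for coercing a numeric value to an integer:

```r
# numToInt is assumed to behave like as.integer(), but with a
# warning when the value is not integer-like.
shuffled <- partitionBy(locallyReduced, numToInt(numPartitions))
```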
Test PASSed.
@@ -491,6 +573,37 @@ for teenName in teenNames.collect():

</div>

<div data-lang="r" markdown="1">
This is not applicable right now
done
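For context, the new `<div data-lang="r" markdown="1">` block presumably carries the SparkR version of the teen-names example above. A minimal sketch of the kind of snippet it holds, assuming the SparkR DataFrame API of this release (`sql` runs a query through `sqlContext`; `head` pulls a few rows back to the driver):

```r
# Query a temp table registered from a DataFrame; the result is
# again a DataFrame, which we inspect locally with head().
teenagers <- sql(sqlContext, "SELECT name FROM people WHERE age >= 13 AND age <= 19")
head(teenagers)
```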
Merged build triggered.
Merged build started.
Test build #32998 has started for PR 5442 at commit
Test build #32998 has finished for PR 5442 at commit
Merged build finished. Test FAILed.
Test FAILed.
Test build #821 has started for PR 5442 at commit
Test build #821 has finished for PR 5442 at commit
Test build #825 has started for PR 5442 at commit
Test build #825 has finished for PR 5442 at commit
Merged build triggered.
Merged build started.
Test build #33103 has started for PR 5442 at commit
Test build #33103 has finished for PR 5442 at commit
Merged build finished. Test FAILed.
Test FAILed.
@shivaram Is this ready to go?
Test build #850 has started for PR 5442 at commit
Test build #850 has finished for PR 5442 at commit
{% highlight r %}
df <- laodDF(sqlContext, source="jdbc", url="jdbc:postgresql:dbserver", dbtable="schema.tablename")
Minor typo: this should be `loadDF`.
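With the typo fixed, the call reads:

```r
df <- loadDF(sqlContext, source="jdbc", url="jdbc:postgresql:dbserver", dbtable="schema.tablename")
```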
@davies Sorry for the delay in looking at this. I think this change looks pretty good -- I found a minor typo that we can fix up during merge. I think it might be better to actually create a new page for SparkR rather than append it to the DataFrames page -- but I'll do that in a follow-up PR. LGTM
[SPARK-6806] [SparkR] [Docs] Fill in SparkR examples in programming guide

sqlCtx -> sqlContext

You can check the docs by:

```
$ cd docs
$ SKIP_SCALADOC=1 jekyll serve
```

cc shivaram

Author: Davies Liu <[email protected]>

Closes #5442 from davies/r_docs and squashes the following commits:

7a12ec6 [Davies Liu] remove rdd in R docs
8496b26 [Davies Liu] remove the docs related to RDD
e23b9d6 [Davies Liu] delete R docs for RDD API
222e4ff [Davies Liu] Merge branch 'master' into r_docs
89684ce [Davies Liu] Merge branch 'r_docs' of github.com:davies/spark into r_docs
f0a10e1 [Davies Liu] address comments from @shivaram
f61de71 [Davies Liu] Update pairRDD.R
3ef7cf3 [Davies Liu] use + instead of function(a,b) a+b
2f10a77 [Davies Liu] address comments from @cafreeman
9c2a062 [Davies Liu] mention R api together with Python API
23f751a [Davies Liu] Fill in SparkR examples in programming guide

(cherry picked from commit 7af3818)
Signed-off-by: Shivaram Venkataraman <[email protected]>