[SPARK-6806] [SparkR] [Docs] Fill in SparkR examples in programming guide #5442


Closed
wants to merge 11 commits into from

Conversation

davies
Contributor

@davies davies commented Apr 9, 2015

sqlCtx -> sqlContext

You can check the docs by:

```
$ cd docs
$ SKIP_SCALADOC=1 jekyll serve
```

cc @shivaram

@SparkQA

SparkQA commented Apr 9, 2015

Test build #29975 has started for PR 5442 at commit 23f751a.

@SparkQA

SparkQA commented Apr 9, 2015

Test build #29979 has started for PR 5442 at commit 9c2a062.

@shivaram
Contributor

shivaram commented Apr 9, 2015

@cafreeman -- If you get a chance, could you take a look at this too?

@SparkQA

SparkQA commented Apr 9, 2015

Test build #29975 has finished for PR 5442 at commit 23f751a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29975/
Test PASSed.

@SparkQA

SparkQA commented Apr 10, 2015

Test build #29979 has finished for PR 5442 at commit 9c2a062.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/29979/
Test PASSed.

context connects to using the `--master` argument. You can also add dependencies
(e.g. Spark Packages) to your shell session by supplying a comma-separated list of Maven coordinates
to the `--packages` argument. Any additional repositories where dependencies might exist (e.g. Sonatype)
can be passed to the `--repositories` argument. For example, to run `bin/pyspark` on exactly four cores, use:


This should refer to SparkR instead of PySpark.
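For reference, the SparkR equivalent of that example would look something like the following. This is a sketch only, assuming the `bin/sparkR` launcher accepts the same `--master` flag as the other interactive shells:

```
$ ./bin/sparkR --master local[4]
```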

@cafreeman

Left some comments inline, but most of them seem like minor details leftover from translating the PySpark docs. Overall I think this is looking really good.

@@ -54,6 +54,15 @@ Example applications are also provided in Python. For example,

./bin/spark-submit examples/src/main/python/pi.py 10

Spark also provides an R API. To run Spark interactively in an R interpreter, use
Contributor


I think here (or somewhere close by) we should say that SparkR is an experimental component in <SPARK_VERSION> and that only the RDD API and DataFrame APIs have been implemented in SparkR.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30047/
Test FAILed.

@SparkQA

SparkQA commented Apr 10, 2015

Test build #660 has started for PR 5442 at commit 2f10a77.

@SparkQA

SparkQA commented Apr 10, 2015

Test build #30052 has started for PR 5442 at commit 3ef7cf3.

@SparkQA

SparkQA commented Apr 10, 2015

Test build #660 has finished for PR 5442 at commit 2f10a77.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@SparkQA

SparkQA commented Apr 10, 2015

Test build #30052 has finished for PR 5442 at commit 3ef7cf3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30052/
Test PASSed.

@@ -327,7 +327,7 @@ setMethod("reduceByKey",
convertEnvsToList(keys, vals)
}
locallyReduced <- lapplyPartition(x, reduceVals)
shuffled <- partitionBy(locallyReduced, numPartitions)
shuffled <- partitionBy(locallyReduced, as.integer(numPartitions))
Contributor


We should use numToInt from utils.R here
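The suggested change would look roughly like this. A sketch only, assuming `numToInt` in SparkR's `utils.R` coerces a numeric partition count to an integer:

```r
# hypothetical sketch; numToInt replaces the bare as.integer() call
shuffled <- partitionBy(locallyReduced, numToInt(numPartitions))
```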

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32931/
Test PASSed.

@@ -491,6 +573,37 @@ for teenName in teenNames.collect():

</div>

<div data-lang="r" markdown="1">

Contributor


This is not applicable right now

Contributor Author


done

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 18, 2015

Test build #32998 has started for PR 5442 at commit 8496b26.

@SparkQA

SparkQA commented May 18, 2015

Test build #32998 has finished for PR 5442 at commit 8496b26.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32998/
Test FAILed.

@SparkQA

SparkQA commented May 18, 2015

Test build #821 has started for PR 5442 at commit 8496b26.

@SparkQA

SparkQA commented May 18, 2015

Test build #821 has finished for PR 5442 at commit 8496b26.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 19, 2015

Test build #825 has started for PR 5442 at commit 8496b26.

@SparkQA

SparkQA commented May 19, 2015

Test build #825 has finished for PR 5442 at commit 8496b26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 19, 2015

Test build #33103 has started for PR 5442 at commit 7a12ec6.

@SparkQA

SparkQA commented May 19, 2015

Test build #33103 has finished for PR 5442 at commit 7a12ec6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/33103/
Test FAILed.

@davies
Contributor Author

davies commented May 21, 2015

@shivaram Is this ready to go?

@SparkQA

SparkQA commented May 21, 2015

Test build #850 has started for PR 5442 at commit 7a12ec6.

@SparkQA

SparkQA commented May 22, 2015

Test build #850 has finished for PR 5442 at commit 7a12ec6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


{% highlight r %}

df <- laodDF(sqlContext, source="jdbc", url="jdbc:postgresql:dbserver", dbtable="schema.tablename")
Contributor


Minor typo: This should be loadDF
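With the typo corrected, the call from the quoted diff would read:

```r
df <- loadDF(sqlContext, source = "jdbc", url = "jdbc:postgresql:dbserver", dbtable = "schema.tablename")
```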

@shivaram
Contributor

@davies Sorry for the delay in looking at this. I think this change looks pretty good -- I found a minor typo that we can fix up during merge.

I think it might be better to actually create a new page for SparkR rather than append it to the DataFrames page -- but I'll do this in a follow-up PR.

LGTM

asfgit pushed a commit that referenced this pull request May 23, 2015
…uide

sqlCtx -> sqlContext

You can check the docs by:

```
$ cd docs
$ SKIP_SCALADOC=1 jekyll serve
```
cc shivaram

Author: Davies Liu <[email protected]>

Closes #5442 from davies/r_docs and squashes the following commits:

7a12ec6 [Davies Liu] remove rdd in R docs
8496b26 [Davies Liu] remove the docs related to RDD
e23b9d6 [Davies Liu] delete R docs for RDD API
222e4ff [Davies Liu] Merge branch 'master' into r_docs
89684ce [Davies Liu] Merge branch 'r_docs' of github.com:davies/spark into r_docs
f0a10e1 [Davies Liu] address comments from @shivaram
f61de71 [Davies Liu] Update pairRDD.R
3ef7cf3 [Davies Liu] use + instead of function(a,b) a+b
2f10a77 [Davies Liu] address comments from @cafreeman
9c2a062 [Davies Liu] mention R api together with Python API
23f751a [Davies Liu] Fill in SparkR examples in programming guide

(cherry picked from commit 7af3818)
Signed-off-by: Shivaram Venkataraman <[email protected]>
@asfgit asfgit closed this in 7af3818 May 23, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
5 participants