[SPARK-4047] - Generate runtime warnings for example implementation of PageRank #2894

varadharajan · 2014-10-22T16:43:56Z

Based on SPARK-2434, this PR generates runtime warnings for example implementations (Python, Scala) of PageRank.
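For readers unfamiliar with the pattern, a runtime warning of this kind can be as simple as a message written to standard error when the example starts. The following is a minimal sketch under stated assumptions: the helper name `show_warning`, its parameters, and the exact wording are hypothetical and are not taken from the patch itself.

```python
import sys


def show_warning(algorithm, replacement, stream=sys.stderr):
    """Write a notice that this example is a naive implementation.

    `algorithm` and `replacement` are free-form strings. The wording is
    illustrative only and is not necessarily the text merged in this PR.
    """
    message = (
        "WARN: This is a naive implementation of {0} and is given as an example!\n"
        "Please refer to {1} for a more conventional use."
    ).format(algorithm, replacement)
    # Warnings go to stderr so they do not mix with the example's results.
    print(message, file=stream)
    return message


if __name__ == "__main__":
    show_warning("PageRank", "the PageRank implementation provided by GraphX")
```

Calling the helper once at the top of an example's entry point keeps the notice visible without changing what the example computes.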

AmplabJenkins · 2014-10-22T16:47:10Z

Can one of the admins verify this patch?

srowen · 2014-10-24T14:28:23Z

Are there other examples that should have the same warning? I think there are many more than this.

varadharajan · 2014-10-24T15:59:41Z

Here is a list of Scala examples that I think are similar to, or naive implementations of, algorithms from MLlib or GraphX.

LocalALS
LocalFileLR
LocalKMeans
LocalLR
SparkALS
SparkHdfsLR
SparkKMeans
SparkLR
SparkPageRank (*)
SparkTachyonHdfsLR (*)

Python examples:

ALS
kmeans
logistic_regression
pagerank (*)

Java examples:

JavaHdfsLR (*)
JavaPageRank (*)

(*) - Examples with missing warnings. I've updated the JIRA with these details and also added warnings for them.

I've also corrected the class names referenced by the existing LR examples. They were pointing to org.apache.spark.mllib.classification.LogisticRegression instead of org.apache.spark.mllib.classification.LogisticRegressionModel.

I've excluded examples that compute transitive closures on graphs because I was not able to find corresponding implementations in GraphX. Please let me know if I'm missing something.

davies · 2014-10-30T00:15:24Z

Jenkins, ok to test.

SparkQA · 2014-10-30T00:19:39Z

Test build #22498 has started for PR 2894 at commit 252f595.

This patch merges cleanly.

SparkQA · 2014-10-30T01:28:17Z

Test build #22498 has finished for PR 2894 at commit 252f595.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-10-30T01:28:20Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22498/
Test PASSed.

davies · 2014-11-03T22:44:33Z

LGTM, thanks!

varadharajan · 2014-11-04T03:45:12Z

Thanks :)

JoshRosen · 2014-11-07T22:17:06Z

Since this is MLlib related, @mengxr or @jkbradley, could one of you do the final sign-off + commit on this? Thanks!

jkbradley · 2014-11-07T22:45:06Z

@varadharajan Thanks for adding the warnings! My main comment is that LogisticRegressionModel is a model, rather than an algorithm. Users would really want the algorithm which they can run to produce the model. Could you instead direct users to the algorithms: LogisticRegressionWithSGD and LogisticRegressionWithLBFGS? (It is awkward that there are 2 algorithms to direct users towards, but it is hard to get around that.)
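The model-versus-algorithm distinction raised here can be illustrated with a toy, pure-Python sketch. The classes `ToyLogisticRegressionWithSGD` and `ToyLogisticRegressionModel` below are hypothetical stand-ins invented for illustration, not the MLlib API: the algorithm class is what a user runs on data, and the model class is only the result it returns.

```python
import math


class ToyLogisticRegressionModel(object):
    """The *model*: fitted parameters that can only make predictions."""

    def __init__(self, weights, intercept):
        self.weights = weights
        self.intercept = intercept

    def predict(self, features):
        margin = self.intercept + sum(w * x for w, x in zip(self.weights, features))
        return 1 if margin > 0 else 0


class ToyLogisticRegressionWithSGD(object):
    """The *algorithm*: consumes labeled data and produces a model."""

    @classmethod
    def train(cls, data, iterations=200, step=0.5):
        # data is a list of (label, features) pairs with label in {0, 1}.
        dim = len(data[0][1])
        weights = [0.0] * dim
        intercept = 0.0
        for _ in range(iterations):
            for label, features in data:
                margin = intercept + sum(w * x for w, x in zip(weights, features))
                prob = 1.0 / (1.0 + math.exp(-margin))
                error = prob - label  # gradient of the log loss w.r.t. the margin
                weights = [w - step * error * x for w, x in zip(weights, features)]
                intercept -= step * error
        return ToyLogisticRegressionModel(weights, intercept)


if __name__ == "__main__":
    points = [(0, [-2.0]), (0, [-1.0]), (1, [1.0]), (1, [2.0])]
    model = ToyLogisticRegressionWithSGD.train(points)
    print(model.predict([-1.5]), model.predict([1.5]))
```

A warning that names only the model class leaves users without an entry point, since a model cannot be obtained without first running a training algorithm.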

varadharajan · 2014-11-08T14:18:52Z

@jkbradley Makes sense. I've updated the warnings; please let me know if the wording can be improved. Also, I just noticed that the PySpark classification module does not have an LR-LBFGS implementation. I'll probably create a new issue and work on it.

SparkQA · 2014-11-08T14:19:55Z

Test build #23102 has started for PR 2894 at commit 5f9406b.

This patch merges cleanly.

varadharajan · 2014-11-08T14:23:18Z

Also, I think it would help users if we documented, in the LR section of the MLlib guide, which algorithm should be preferred in which scenarios.

SparkQA · 2014-11-08T15:44:52Z

Test build #23102 has finished for PR 2894 at commit 5f9406b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-08T15:44:55Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23102/
Test PASSed.

jkbradley · 2014-11-09T18:25:04Z

@varadharajan Good suggestion about documenting algs for LR; I'll make a note to do that for the upcoming release. Thank you for the PR!

LGTM

varadharajan · 2014-11-10T07:32:23Z

@jkbradley Thanks :)

mengxr · 2014-11-10T22:35:22Z

Merged into master and branch-1.2. Thanks! (We should find some time to clean up the really old examples.)

…f PageRank

Based on SPARK-2434, this PR generates runtime warnings for example implementations (Python, Scala) of PageRank.

Author: Varadharajan Mukundan <[email protected]>

Closes #2894 from varadharajan/SPARK-4047 and squashes the following commits:

5f9406b [Varadharajan Mukundan] [SPARK-4047] - Point users to LogisticRegressionWithSGD and LogisticRegressionWithLBFGS instead of LogisticRegressionModel
252f595 [Varadharajan Mukundan] a. Generate runtime warnings for
05a018b [Varadharajan Mukundan] Fix PageRank implementation's package reference
5c2bf54 [Varadharajan Mukundan] [SPARK-4047] - Generate runtime warnings for example implementation of PageRank

(cherry picked from commit 974d334)
Signed-off-by: Xiangrui Meng <[email protected]>

varadharajan added 2 commits October 22, 2014 22:17

5c2bf54 [SPARK-4047] - Generate runtime warnings for example implementation of PageRank

05a018b Fix PageRank implementation's package reference

252f595 a. Generate runtime warnings for 1. JavaHdfsLR 2. JavaPageRank 3. SparkTachyonHdfsLR b. Renamed references of org.apache.spark.mllib.classification.LogisticRegression to org.apache.spark.mllib.classification.LogisticRegressionModel

5f9406b [SPARK-4047] - Point users to LogisticRegressionWithSGD and LogisticRegressionWithLBFGS instead of LogisticRegressionModel

asfgit closed this in 974d334 Nov 10, 2014
