[SPARK-6226][MLLIB] add save/load in PySpark's KMeansModel #5049

Closed
wants to merge 2 commits into from

Conversation

mengxr
Contributor

@mengxr mengxr commented Mar 16, 2015

Use _py2java and _java2py to convert the Python model to and from the Java model. @yinxusen

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28659 has started for PR 5049 at commit b10b911.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28659 has finished for PR 5049 at commit b10b911.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class KMeansModel(Saveable, Loader):

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28659/
Test FAILed.

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28665 has started for PR 5049 at commit 570ba81.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28665 has finished for PR 5049 at commit 570ba81.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class KMeansModel(Saveable, Loader):

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28665/
Test FAILed.

@mengxr
Contributor Author

mengxr commented Mar 16, 2015

test this please

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28672 has started for PR 5049 at commit 570ba81.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Mar 16, 2015

Test build #28672 has finished for PR 5049 at commit 570ba81.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class KMeansModel(Saveable, Loader):

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28672/
Test PASSed.

elif isinstance(obj, list) and (obj or isinstance(obj[0], JavaObject)):
    obj = ListConverter().convert(obj, sc._gateway._gateway_client)
elif isinstance(obj, list):
    obj = ListConverter().convert([_py2java(sc, x) for x in obj], sc._gateway._gateway_client)
Contributor

@mengxr Let me check my understanding: when this code encounters an Array[ndarray], this line translates the ndarrays one by one. For each ndarray, the else branch is triggered, PickleSerializer serializes the ndarray, and the helper function in the JVM turns it back into a Vector.

Contributor Author

Correct. Except that we pass in vectors because we call _convert_to_vector in KMeansModel.save.
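
For readers following this thread, the element-by-element conversion can be sketched in plain Python. This is only an illustration, not the code in this patch: `FakeJavaObject`, `toy_jvm_unpickle`, and `toy_py2java` are hypothetical stand-ins, and the standard `pickle` module is assumed in place of PickleSerializer and the JVM-side helper.

```python
import pickle


class FakeJavaObject:
    """Hypothetical stand-in for a py4j JavaObject (not a real pyspark class)."""

    def __init__(self, payload):
        self.payload = payload


def toy_jvm_unpickle(data):
    """Pretend JVM-side helper: rebuilds a value from pickled bytes."""
    return FakeJavaObject(pickle.loads(data))


def toy_py2java(obj):
    """Toy analogue of the branches discussed above (assumed, simplified)."""
    if isinstance(obj, FakeJavaObject):
        return obj  # already converted, pass through unchanged
    if isinstance(obj, list):
        # The list branch: convert each element recursively, one by one.
        return [toy_py2java(x) for x in obj]
    if isinstance(obj, (bool, int, float, str)):
        return obj  # primitives need no conversion in this sketch
    # The else branch: serialize on the Python side, rebuild on the "JVM" side.
    return toy_jvm_unpickle(pickle.dumps(obj))


# Tuples play the role of vectors (cluster centers) here.
centers = [(0.0, 0.0), (1.0, 1.0)]
converted = toy_py2java(centers)
```

Each element of `converted` is wrapped separately, which mirrors the per-element recursion in the second `elif` branch; already-converted objects and primitives are left untouched.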

@yinxusen
Contributor

@mengxr Don't we need an extra unit test? Is the doctest good enough?

@mengxr
Contributor Author

mengxr commented Mar 17, 2015

Not necessary. Doctests serve as both examples and unit tests.
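
As a rough illustration of that point, the sketch below shows a docstring whose examples double as tests when run through Python's standard `doctest` module. The function `nearest_center` and the wrapper `run_doctests` are hypothetical and are not part of this patch.

```python
import doctest


def nearest_center(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance).

    >>> nearest_center([0.0, 0.0], [[1.0, 1.0], [5.0, 5.0]])
    0
    >>> nearest_center([4.0, 6.0], [[1.0, 1.0], [5.0, 5.0]])
    1
    """
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    return dists.index(min(dists))


def run_doctests():
    """Run the docstring examples above and return doctest's (failed, attempted) summary."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    globs = {"nearest_center": nearest_center}
    for test in finder.find(nearest_center, "nearest_center", globs=globs):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests()
```

The same docstring is what a reader sees as usage documentation, so a failing example shows up both as a broken test and as a stale example.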

@mengxr
Contributor Author

mengxr commented Mar 17, 2015

Merged into master.

4 participants