Skip to content

Conversation

mengxr
Copy link
Contributor

@mengxr mengxr commented Nov 26, 2014

Before we have a full picture of the operators we want to add, it might be safer to hide Matrix.transposeMultiply in 1.2.0. Another update we want to change is Matrix.randn and Matrix.rand, both of which should take a Random implementation. Otherwise, it is very likely to produce inconsistent RDDs. I also added some unit tests for matrix factory methods. All APIs are new in 1.2, so there is no incompatible changes.

@brkyvz

@SparkQA
Copy link

SparkQA commented Nov 26, 2014

Test build #23870 has started for PR 3468 at commit 6bfd8a4.

  • This patch merges cleanly.

val mat = Matrices.eye(2).asInstanceOf[DenseMatrix]
assert(mat.numCols === 2)
assert(mat.numCols === 2)
assert(mat.values.toSeq === Seq(1.0, 0.0, 0.0, 1.0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small, but:
mat.values.toSeq === Seq(1.0, 0.0, 0.0, 1.0) -> mat.values === Array(1.0, 0.0, 0.0, 1.0)
is there any reason the other wouldn't work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.equals only check the reference.

scala> Array(1.0, 2.0) == Array(1.0, 2.0)
res0: Boolean = false

@brkyvz
Copy link
Contributor

brkyvz commented Nov 26, 2014

Looks good to me! Just made one comment, no biggie though, it's fine as is (but if you decide to change it, there are 4 exact copies of it).
One comment/question about using Random: What would be the problem if we just pass a seed?
Since users wouldn't be able to use XORShiftRandom, which has better performance than
java.util.Random, wouldn't passing the seed be enough to solve the inconsistency issue?

@SparkQA
Copy link

SparkQA commented Nov 26, 2014

Test build #23870 has finished for PR 3468 at commit 6bfd8a4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23870/
Test FAILed.

@mengxr
Copy link
Contributor Author

mengxr commented Nov 26, 2014

@brkyvz XORShiftRandom implements Random. So users can use it directly. Using seed will create problems when we want to generate a sequence of matrices, e.g., generate a random block matrices, because randn and rand do not return the next seed. If we using a predefined sequence of seeds, the generated sequence won't have good quality.

@SparkQA
Copy link

SparkQA commented Nov 26, 2014

Test build #23884 has started for PR 3468 at commit 3b0e4e2.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 26, 2014

Test build #23884 has finished for PR 3468 at commit 3b0e4e2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23884/
Test PASSed.

@mengxr
Copy link
Contributor Author

mengxr commented Nov 26, 2014

@brkyvz Thanks! I've merged this into master. Will submit another PR for branch-1.2.

@asfgit asfgit closed this in 561d31d Nov 26, 2014
asfgit pushed a commit that referenced this pull request Nov 26, 2014
…ices

This is #3468 for branch-1.2, same content except mima excludes.

Author: Xiangrui Meng <[email protected]>

Closes #3482 from mengxr/SPARK-4614-1.2 and squashes the following commits:

ea4f08d [Xiangrui Meng] hide transposeMultiply; add rng to rand and randn; add unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants