@@ -296,70 +296,6 @@ backed by an RDD of its entries.
The underlying RDDs of a distributed matrix must be deterministic, because we cache the matrix size.
In general the use of non-deterministic RDDs can lead to errors.

- ### BlockMatrix
-
- A `BlockMatrix` is a distributed matrix backed by an RDD of `MatrixBlock`s, where a `MatrixBlock` is
- a tuple of `((Int, Int), Matrix)`, where the `(Int, Int)` is the index of the block, and `Matrix` is
- the sub-matrix at the given index with size `rowsPerBlock` x `colsPerBlock`.
- `BlockMatrix` supports methods such as `add` and `multiply` with another `BlockMatrix`.
- `BlockMatrix` also has a helper function `validate` which can be used to check whether the
- `BlockMatrix` is set up properly.
-
- <div class="codetabs">
- <div data-lang="scala" markdown="1">
-
- A [`BlockMatrix`](api/scala/index.html#org.apache.spark.mllib.linalg.distributed.BlockMatrix) can be
- most easily created from an `IndexedRowMatrix` or `CoordinateMatrix` by calling `toBlockMatrix`.
- `toBlockMatrix` creates blocks of size 1024 x 1024 by default.
- Users may change the block size by supplying the values through `toBlockMatrix(rowsPerBlock, colsPerBlock)`.
-
- {% highlight scala %}
- import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}
-
- val entries: RDD[MatrixEntry] = ... // an RDD of (i, j, v) matrix entries
- // Create a CoordinateMatrix from an RDD[MatrixEntry].
- val coordMat: CoordinateMatrix = new CoordinateMatrix(entries)
- // Transform the CoordinateMatrix to a BlockMatrix
- val matA: BlockMatrix = coordMat.toBlockMatrix().cache()
-
- // Validate whether the BlockMatrix is set up properly. Throws an Exception when it is not valid.
- // Nothing happens if it is valid.
- matA.validate()
-
- // Calculate A^T A.
- val ata = matA.transpose.multiply(matA)
- {% endhighlight %}
- </div>
-
- <div data-lang="java" markdown="1">
-
- A [`BlockMatrix`](api/java/org/apache/spark/mllib/linalg/distributed/BlockMatrix.html) can be
- most easily created from an `IndexedRowMatrix` or `CoordinateMatrix` by calling `toBlockMatrix`.
- `toBlockMatrix` creates blocks of size 1024 x 1024 by default.
- Users may change the block size by supplying the values through `toBlockMatrix(rowsPerBlock, colsPerBlock)`.
-
- {% highlight java %}
- import org.apache.spark.api.java.JavaRDD;
- import org.apache.spark.mllib.linalg.distributed.BlockMatrix;
- import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix;
- import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix;
-
- JavaRDD<MatrixEntry> entries = ... // a JavaRDD of (i, j, v) Matrix Entries
- // Create a CoordinateMatrix from a JavaRDD<MatrixEntry>.
- CoordinateMatrix coordMat = new CoordinateMatrix(entries.rdd());
- // Transform the CoordinateMatrix to a BlockMatrix
- BlockMatrix matA = coordMat.toBlockMatrix().cache();
-
- // Validate whether the BlockMatrix is set up properly. Throws an Exception when it is not valid.
- // Nothing happens if it is valid.
- matA.validate();
-
- // Calculate A^T A.
- BlockMatrix ata = matA.transpose().multiply(matA);
- {% endhighlight %}
- </div>
- </div>
-
### RowMatrix

A `RowMatrix` is a row-oriented distributed matrix without meaningful row indices, backed by an RDD
@@ -530,3 +466,67 @@ IndexedRowMatrix indexedRowMatrix = mat.toIndexedRowMatrix();
{% endhighlight %}
</div>
</div>
+
+ ### BlockMatrix
+
+ A `BlockMatrix` is a distributed matrix backed by an RDD of `MatrixBlock`s, where a `MatrixBlock` is
+ a tuple of `((Int, Int), Matrix)`, where the `(Int, Int)` is the index of the block, and `Matrix` is
+ the sub-matrix at the given index with size `rowsPerBlock` x `colsPerBlock`.
+ `BlockMatrix` supports methods such as `add` and `multiply` with another `BlockMatrix`.
+ `BlockMatrix` also has a helper function `validate` which can be used to check whether the
+ `BlockMatrix` is set up properly.
+
+ <div class="codetabs">
+ <div data-lang="scala" markdown="1">
+
+ A [`BlockMatrix`](api/scala/index.html#org.apache.spark.mllib.linalg.distributed.BlockMatrix) can be
+ most easily created from an `IndexedRowMatrix` or `CoordinateMatrix` by calling `toBlockMatrix`.
+ `toBlockMatrix` creates blocks of size 1024 x 1024 by default.
+ Users may change the block size by supplying the values through `toBlockMatrix(rowsPerBlock, colsPerBlock)`.
+
+ {% highlight scala %}
+ import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}
+
+ val entries: RDD[MatrixEntry] = ... // an RDD of (i, j, v) matrix entries
+ // Create a CoordinateMatrix from an RDD[MatrixEntry].
+ val coordMat: CoordinateMatrix = new CoordinateMatrix(entries)
+ // Transform the CoordinateMatrix to a BlockMatrix
+ val matA: BlockMatrix = coordMat.toBlockMatrix().cache()
+
+ // Validate whether the BlockMatrix is set up properly. Throws an Exception when it is not valid.
+ // Nothing happens if it is valid.
+ matA.validate()
+
+ // Calculate A^T A.
+ val ata = matA.transpose.multiply(matA)
+ {% endhighlight %}
+ </div>
+
+ <div data-lang="java" markdown="1">
+
+ A [`BlockMatrix`](api/java/org/apache/spark/mllib/linalg/distributed/BlockMatrix.html) can be
+ most easily created from an `IndexedRowMatrix` or `CoordinateMatrix` by calling `toBlockMatrix`.
+ `toBlockMatrix` creates blocks of size 1024 x 1024 by default.
+ Users may change the block size by supplying the values through `toBlockMatrix(rowsPerBlock, colsPerBlock)`.
+
+ {% highlight java %}
+ import org.apache.spark.api.java.JavaRDD;
+ import org.apache.spark.mllib.linalg.distributed.BlockMatrix;
+ import org.apache.spark.mllib.linalg.distributed.CoordinateMatrix;
+ import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix;
+
+ JavaRDD<MatrixEntry> entries = ... // a JavaRDD of (i, j, v) Matrix Entries
+ // Create a CoordinateMatrix from a JavaRDD<MatrixEntry>.
+ CoordinateMatrix coordMat = new CoordinateMatrix(entries.rdd());
+ // Transform the CoordinateMatrix to a BlockMatrix
+ BlockMatrix matA = coordMat.toBlockMatrix().cache();
+
+ // Validate whether the BlockMatrix is set up properly. Throws an Exception when it is not valid.
+ // Nothing happens if it is valid.
+ matA.validate();
+
+ // Calculate A^T A.
+ BlockMatrix ata = matA.transpose().multiply(matA);
+ {% endhighlight %}
+ </div>
+ </div>
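
Note on the block layout described in the moved section: each block is keyed by an `(Int, Int)` index and holds a sub-matrix of size `rowsPerBlock` x `colsPerBlock`. The following is a minimal plain-Java sketch of that indexing, not Spark's implementation. The class `BlockIndexSketch` and its methods are hypothetical names introduced here for illustration, and the sketch assumes the usual integer-division mapping from a global entry position `(i, j)` to a block index plus an offset inside that block.

```java
public class BlockIndexSketch {
    /** Index of the block (along one dimension) that contains global index i. */
    public static int blockIndex(long i, int perBlock) {
        return (int) (i / perBlock);
    }

    /** Offset of global index i inside its block (along one dimension). */
    public static int offsetInBlock(long i, int perBlock) {
        return (int) (i % perBlock);
    }

    /** Number of blocks needed to cover n rows or columns (ceiling division). */
    public static int numBlocks(long n, int perBlock) {
        return (int) ((n + perBlock - 1) / perBlock);
    }

    public static void main(String[] args) {
        // Assumed block size, matching the 1024 x 1024 default mentioned in the docs.
        int rowsPerBlock = 1024;
        int colsPerBlock = 1024;

        // A hypothetical matrix entry at global position (2500, 100).
        long i = 2500;
        long j = 100;

        // Block index: (2, 0); offset inside that block: (452, 100).
        System.out.println("block index = ("
            + blockIndex(i, rowsPerBlock) + ", " + blockIndex(j, colsPerBlock) + ")");
        System.out.println("offset in block = ("
            + offsetInBlock(i, rowsPerBlock) + ", " + offsetInBlock(j, colsPerBlock) + ")");

        // A 3000-row matrix needs 3 block rows of at most 1024 rows each.
        System.out.println("block rows for 3000 rows = " + numBlocks(3000, rowsPerBlock));
    }
}
```

Under this assumption, a matrix whose dimensions are not multiples of the block size would end with a partially filled last block row or column, which is one reason a consistency check such as `validate` is useful after building a `BlockMatrix` by hand.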