, and distributed operations are executed as MapReduce (M/R) passes on
Hadoop. The usage is as follows:
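// the input path must already contain a previously created SequenceFile!
// the constructor takes org.apache.hadoop.fs.Path arguments, so the strings are wrapped in Path objects
DistributedRowMatrix m = new DistributedRowMatrix(new Path("path/to/vector/sequenceFile"), new Path("tmp/path"), 10000000, 250000);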
m.configure(new JobConf());
// now if we want to multiply a vector by this matrix, its dimension must equal the row dimension of this
// matrix. If we want to timesSquared() a vector by this matrix, its dimension must equal the column dimension
// of the matrix.
Vector v = new DenseVector(250000);
// the following operation is executed as an M/R pass on Hadoop.
Vector w = m.timesSquared(v);
Constructor Summary
DistributedRowMatrix(org.apache.hadoop.fs.Path inputPathString,
org.apache.hadoop.fs.Path outputTmpPathString,
int numRows,
int numCols)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DistributedRowMatrix
public DistributedRowMatrix(org.apache.hadoop.fs.Path inputPathString,
org.apache.hadoop.fs.Path outputTmpPathString,
int numRows,
int numCols)
configure
public void configure(org.apache.hadoop.mapred.JobConf conf)
- Specified by:
configure
in interface org.apache.hadoop.mapred.JobConfigurable
getRowPath
public org.apache.hadoop.fs.Path getRowPath()
getOutputTempPath
public org.apache.hadoop.fs.Path getOutputTempPath()
setOutputTempPathString
public void setOutputTempPathString(java.lang.String outPathString)
iterateAll
public java.util.Iterator<MatrixSlice> iterateAll()
- Specified by:
iterateAll
in interface VectorIterable
numSlices
public int numSlices()
- Specified by:
numSlices
in interface VectorIterable
numRows
public int numRows()
- Specified by:
numRows
in interface VectorIterable
numCols
public int numCols()
- Specified by:
numCols
in interface VectorIterable
times
public DistributedRowMatrix times(DistributedRowMatrix other)
throws java.io.IOException
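- Computes the matrix product this.transpose().times(other), that is, the transpose of this matrix multiplied by other. Both matrices must therefore have the same number of rows.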
- Parameters:
other
- a DistributedRowMatrix
- Returns:
- a DistributedRowMatrix containing the product
- Throws:
java.io.IOException
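Because times(DistributedRowMatrix) yields this.transpose().times(other) rather than a plain product, a small in-memory illustration may help. The sketch below is not the Mahout implementation and uses no Mahout or Hadoop classes; the class TransposeTimesSketch and its method transposeTimes are hypothetical names, and plain double[][] arrays stand in for the distributed rows. It relies on the identity that the transposed product equals the sum, over all row indices, of the outer product of row r of the first matrix with row r of the second, which is why both matrices need the same number of rows.

```java
import java.util.Arrays;

// Hypothetical in-memory illustration; not part of Mahout.
final class TransposeTimesSketch {

    /** Returns a^T * b for dense row-major matrices that have the same number of rows. */
    static double[][] transposeTimes(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "row counts differ: " + a.length + " vs " + b.length);
        }
        int aCols = a.length == 0 ? 0 : a[0].length;
        int bCols = b.length == 0 ? 0 : b[0].length;
        double[][] product = new double[aCols][bCols];
        // Each pair of corresponding rows contributes one outer product to the result.
        for (int r = 0; r < a.length; r++) {
            for (int i = 0; i < aCols; i++) {
                for (int j = 0; j < bCols; j++) {
                    product[i][j] += a[r][i] * b[r][j];
                }
            }
        }
        return product;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}, {5, 6}}; // 3 x 2
        double[][] b = {{1, 0}, {0, 1}, {1, 1}}; // 3 x 2
        // The result is 2 x 2: columns of a by columns of b.
        System.out.println(Arrays.deepToString(transposeTimes(a, b)));
    }
}
```

The result has as many rows as the first matrix has columns and as many columns as the second matrix has columns. Processing one pair of rows at a time is what makes this formulation a natural fit for row-partitioned data.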
transpose
public DistributedRowMatrix transpose()
throws java.io.IOException
- Throws:
java.io.IOException
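As a rough illustration of what transposing a row-partitioned matrix involves, the following sketch regroups cells by column index entirely in memory. It is an assumption-laden toy, not the job Mahout runs: RowTransposeSketch and its transpose method are hypothetical names, and nested maps stand in for the sparse row vectors stored in the SequenceFile. The idea is that every cell (i, j, value) of the input is re-emitted under key j, and collecting all cells with the same j produces row j of the transpose.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration; not part of Mahout.
final class RowTransposeSketch {

    /** Maps rowIndex -> (columnIndex -> value) to columnIndex -> (rowIndex -> value). */
    static Map<Integer, Map<Integer, Double>> transpose(Map<Integer, Map<Integer, Double>> rows) {
        Map<Integer, Map<Integer, Double>> transposed = new TreeMap<>();
        for (Map.Entry<Integer, Map<Integer, Double>> row : rows.entrySet()) {
            int i = row.getKey();
            for (Map.Entry<Integer, Double> cell : row.getValue().entrySet()) {
                int j = cell.getKey();
                // Re-key the cell by its column index; grouping by j builds row j of the transpose.
                transposed.computeIfAbsent(j, k -> new TreeMap<>()).put(i, cell.getValue());
            }
        }
        return transposed;
    }

    public static void main(String[] args) {
        Map<Integer, Map<Integer, Double>> rows = new TreeMap<>();
        rows.computeIfAbsent(0, k -> new TreeMap<>()).put(1, 2.0);
        rows.computeIfAbsent(1, k -> new TreeMap<>()).put(0, 3.0);
        rows.computeIfAbsent(1, k -> new TreeMap<>()).put(1, 4.0);
        System.out.println(transpose(rows));
    }
}
```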
times
public Vector times(Vector v)
- Specified by:
times
in interface VectorIterable
timesSquared
public Vector timesSquared(Vector v)
- Specified by:
timesSquared
in interface VectorIterable
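To make the result of timesSquared concrete, here is a minimal in-memory sketch, assuming the method produces this.transpose().times(this.times(v)). It does not use Mahout or Hadoop; TimesSquaredSketch and its timesSquared method are hypothetical names, and a double[][] stands in for the distributed rows. The sketch accumulates, for each row, the row scaled by its dot product with v, so the result is obtained in a single pass over the rows without forming the transpose or the squared matrix.

```java
import java.util.Arrays;

// Hypothetical in-memory illustration; not part of Mahout.
final class TimesSquaredSketch {

    /** Returns A^T (A v), where each element of rows is one row of A. */
    static double[] timesSquared(double[][] rows, double[] v) {
        int numCols = v.length;
        double[] w = new double[numCols];
        for (double[] row : rows) {
            if (row.length != numCols) {
                throw new IllegalArgumentException(
                    "vector dimension " + numCols + " must equal column dimension " + row.length);
            }
            // Dot product of this row with v.
            double dot = 0.0;
            for (int j = 0; j < numCols; j++) {
                dot += row[j] * v[j];
            }
            // Add the row, scaled by that dot product, to the running sum.
            for (int j = 0; j < numCols; j++) {
                w[j] += dot * row[j];
            }
        }
        return w;
    }

    public static void main(String[] args) {
        double[][] rows = {{1, 2}, {3, 4}};
        double[] v = {1, 1};
        System.out.println(Arrays.toString(timesSquared(rows, v)));
    }
}
```

Since each row contributes independently, the per-row work can be done in parallel and the partial sums added afterwards, which matches the shape of a single M/R pass.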
iterator
public java.util.Iterator<MatrixSlice> iterator()
- Specified by:
iterator
in interface java.lang.Iterable<MatrixSlice>
Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.