org.apache.mahout.utils.vectors.common
Class PartialVectorMerger
java.lang.Object
org.apache.mahout.utils.vectors.common.PartialVectorMerger
public final class PartialVectorMerger
- extends java.lang.Object
This class groups a set of input vectors. The SequenceFile input should have a WritableComparable
key containing the document ID and a VectorWritable value containing the term frequency vector.
This class also normalizes the vectors.
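The normalization mentioned above can be pictured with a small, self-contained sketch. It assumes that the normalization power p denotes the usual p-norm, so each entry is divided by (sum of |x|^p)^(1/p). The class NormalizationSketch, its normalize method, and the SKIP sentinel are hypothetical names used only for illustration. They are not part of Mahout, and the actual value of NO_NORMALIZING is not stated on this page. The sketch handles only p > 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; this is not Mahout code. */
class NormalizationSketch {

    // Hypothetical sentinel meaning "leave the vector unchanged".
    // The real value of PartialVectorMerger.NO_NORMALIZING is not given here.
    static final float SKIP = -1.0f;

    /**
     * Returns a copy of a sparse vector (term index -> weight) divided by
     * its p-norm. Assumes p > 0, or p == SKIP to disable normalization.
     */
    static Map<Integer, Double> normalize(Map<Integer, Double> vector, float p) {
        Map<Integer, Double> result = new HashMap<>(vector);
        if (p == SKIP) {
            return result; // normalization disabled
        }
        if (p <= 0) {
            throw new IllegalArgumentException("this sketch only handles p > 0");
        }
        // p-norm: (sum of |x_i|^p)^(1/p)
        double sum = 0.0;
        for (double value : vector.values()) {
            sum += Math.pow(Math.abs(value), p);
        }
        double norm = Math.pow(sum, 1.0 / p);
        if (norm == 0.0) {
            return result; // an all-zero vector cannot be scaled
        }
        for (Map.Entry<Integer, Double> entry : result.entrySet()) {
            entry.setValue(entry.getValue() / norm);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Double> termFrequencies = new HashMap<>();
        termFrequencies.put(0, 3.0);
        termFrequencies.put(7, 4.0);
        System.out.println(normalize(termFrequencies, 2.0f)); // scaled by the 2-norm
        System.out.println(normalize(termFrequencies, SKIP)); // left unchanged
    }
}
```

A plain map stands in for the sparse vector here so that the sketch runs without Hadoop or Mahout on the classpath.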
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
NO_NORMALIZING
public static final float NO_NORMALIZING
- See Also:
- Constant Field Values
NORMALIZATION_POWER
public static final java.lang.String NORMALIZATION_POWER
- See Also:
- Constant Field Values
DIMENSION
public static final java.lang.String DIMENSION
- See Also:
- Constant Field Values
SEQUENTIAL_ACCESS
public static final java.lang.String SEQUENTIAL_ACCESS
- See Also:
- Constant Field Values
mergePartialVectors
public static void mergePartialVectors(java.util.List<org.apache.hadoop.fs.Path> partialVectorPaths,
java.lang.String output,
float normPower,
int dimension,
boolean sequentialAccess)
throws java.io.IOException
- Merge all the partial RandomAccessSparseVectors into the complete document RandomAccessSparseVector.
- Parameters:
partialVectorPaths - input directory of the vectors in SequenceFile format
output - output directory where the partial vectors have to be created
normPower - The normalization value. Must be greater than or equal to 0 or equal to NO_NORMALIZING
- Throws:
java.io.IOException
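As a rough illustration of what merging partial vectors means, the sketch below combines several partial results, each mapping a document ID to a sparse term vector, into one vector per document. It assumes that weights sharing a term index are summed. This is an assumption made for the example, not a description of how mergePartialVectors is implemented. PartialMergeSketch and merge are hypothetical names, and plain in-memory maps stand in for the SequenceFile input.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; this is not Mahout code. */
class PartialMergeSketch {

    /**
     * Merges partial results (document ID -> sparse vector) into a single
     * vector per document. Weights for the same term index are summed,
     * which is an assumption made for this sketch.
     */
    static Map<String, Map<Integer, Double>> merge(
            List<Map<String, Map<Integer, Double>>> partials) {
        Map<String, Map<Integer, Double>> merged = new TreeMap<>();
        for (Map<String, Map<Integer, Double>> partial : partials) {
            for (Map.Entry<String, Map<Integer, Double>> doc : partial.entrySet()) {
                // Group by document ID, creating the target vector on first sight.
                Map<Integer, Double> target =
                        merged.computeIfAbsent(doc.getKey(), id -> new TreeMap<>());
                for (Map.Entry<Integer, Double> term : doc.getValue().entrySet()) {
                    target.merge(term.getKey(), term.getValue(), Double::sum);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Integer, Double> firstChunk = new TreeMap<>();
        firstChunk.put(0, 2.0);
        Map<String, Map<Integer, Double>> partial1 = new TreeMap<>();
        partial1.put("doc1", firstChunk);

        Map<Integer, Double> secondChunk = new TreeMap<>();
        secondChunk.put(5, 1.0);
        Map<String, Map<Integer, Double>> partial2 = new TreeMap<>();
        partial2.put("doc1", secondChunk);

        // Both partial vectors for doc1 end up in one combined vector.
        System.out.println(merge(Arrays.asList(partial1, partial2)));
    }
}
```

Grouping by document ID mirrors the class description, where the key of each input record identifies the document that a partial vector belongs to.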
Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.