Package org.apache.mahout.vectorizer

Interface Summary
Weight  
 

Class Summary
DefaultAnalyzer A subclass of the Lucene StandardAnalyzer that provides a no-argument constructor.
DictionaryVectorizer This class converts a set of input documents in the sequence file format to vectors.
DocumentProcessor This class converts a set of input documents in the sequence file format to StringTuples. The SequenceFile input should have a Text key containing the unique document identifier and a Text value containing the whole document.
SparseVectorsFromSequenceFiles Converts a given set of sequence files into SparseVectors.
TF A Weight (org.apache.mahout.utils.vectors.Weight) based on term frequency only.
TFIDF  
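
The summary above lists TF and TFIDF without spelling out how a weight is computed. The following is a minimal, self-contained sketch of the general idea, not this package's implementation. The names WeightSketch, TermWeight, and termFrequencies are hypothetical, and the TF-IDF formula shown (tf * ln(numDocs / df)) is the textbook form, assumed here for illustration only; the library's own classes may use a different formula and a different method signature.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: every name in this class is hypothetical and is not part of Mahout.
class WeightSketch {

    /** Hypothetical stand-in for a term-weighting strategy. */
    interface TermWeight {
        // tf: occurrences of the term in one document
        // df: number of documents containing the term (assumed >= 1)
        // numDocs: total number of documents in the collection
        double calculate(int tf, int df, int numDocs);
    }

    /** Term frequency only: the weight is the raw count. */
    static final TermWeight TF = (tf, df, numDocs) -> tf;

    /** Textbook TF-IDF (assumed formula): tf * ln(numDocs / df). */
    static final TermWeight TFIDF =
            (tf, df, numDocs) -> tf * Math.log((double) numDocs / df);

    /** Counts how often each token occurs in one tokenized document. */
    static Map<String, Integer> termFrequencies(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A tokenized document, comparable in spirit to a tuple of strings.
        List<String> doc = Arrays.asList("mahout", "vector", "mahout");
        int tf = termFrequencies(doc).get("mahout");

        // Assume "mahout" appears in 1 of 4 documents.
        System.out.println("TF weight:     " + TF.calculate(tf, 1, 4));
        System.out.println("TF-IDF weight: " + TFIDF.calculate(tf, 1, 4));
    }
}
```

Under the assumed formula, a term that occurs in every document receives a TF-IDF weight of zero, while the TF weight ignores document frequency entirely.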
 



Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.