Package org.apache.mahout.text

Class Summary
SequenceFilesFromDirectory Converts a directory of text documents into SequenceFiles of a specified chunk size.
SequenceFilesFromDirectory.ChunkedWriter  
WikipediaMapper Maps over Wikipedia XML input and outputs every document that has a category listed in the input category file.
WikipediaToSequenceFile Creates and runs the Wikipedia Dataset Creator.
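
The chunking idea behind SequenceFilesFromDirectory can be illustrated with a minimal, self-contained sketch. This is not Mahout's implementation and uses no Hadoop or Mahout classes: the class name ChunkingSketch, the method chunk, and the parameter maxChunkBytes are hypothetical, and the rule used here (start a new chunk when adding the next document would exceed the limit) is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: groups (document name, document text) pairs
// into in-memory "chunks" whose combined key and value size stays within a limit.
public class ChunkingSketch {

    // Returns the documents split into chunks of at most maxChunkBytes each.
    // A single document larger than the limit still gets a chunk of its own.
    static List<Map<String, String>> chunk(Map<String, String> docs, int maxChunkBytes) {
        List<Map<String, String>> chunks = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        int currentBytes = 0;

        for (Map.Entry<String, String> doc : docs.entrySet()) {
            int size = doc.getKey().getBytes(StandardCharsets.UTF_8).length
                     + doc.getValue().getBytes(StandardCharsets.UTF_8).length;

            // Assumed rollover rule: close the current chunk before it would overflow.
            if (!current.isEmpty() && currentBytes + size > maxChunkBytes) {
                chunks.add(current);
                current = new LinkedHashMap<>();
                currentBytes = 0;
            }
            current.put(doc.getKey(), doc.getValue());
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            chunks.add(current); // flush the last, partially filled chunk
        }
        return chunks;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1.txt", "first document text");
        docs.put("doc2.txt", "second document text");
        docs.put("doc3.txt", "third document text");

        List<Map<String, String>> chunks = chunk(docs, 64);
        for (int i = 0; i < chunks.size(); i++) {
            System.out.println("chunk-" + i + ": " + chunks.get(i).keySet());
        }
    }
}
```

In a real job, each chunk would correspond to one output file of key/value records rather than an in-memory map; the map is used here only to keep the sketch runnable without external libraries.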
 



Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.