Package org.apache.mahout.classifier.bayes

Class Summary
PrepareTwentyNewsgroups Prepare the 20 Newsgroups files for training using the BayesFileFormatter.
WikipediaDatasetCreatorDriver Create and run the Wikipedia Dataset Creator.
WikipediaDatasetCreatorMapper Maps over Wikipedia XML and outputs every document that has a category listed in the input category file.
WikipediaDatasetCreatorReducer Can also be used as a local Combiner
WikipediaXmlSplitter The Bayes example package provides some helper classes for training the Naive Bayes classifier on the Twenty Newsgroups data.
XmlInputFormat Reads records that are delimited by a specific begin/end tag.
XmlInputFormat.XmlRecordReader Reads through a given XML document and outputs XML blocks as records, where each block is delimited by the configured start tag and end tag.
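
The idea behind XmlInputFormat and its record reader, treating everything between a begin tag and an end tag as one record, can be illustrated without Hadoop. The following is a minimal sketch on an in-memory string, not Mahout's implementation: the class name TagDelimitedRecords, the method extract, and the sample input are hypothetical and exist only for this illustration. The real classes operate on input splits of a file rather than on a String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of org.apache.mahout.classifier.bayes.
final class TagDelimitedRecords {

    // Returns every block of text that starts with startTag and ends with
    // the next endTag, tags included. Text outside such blocks is skipped.
    static List<String> extract(String text, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int begin = text.indexOf(startTag, from);
            if (begin < 0) {
                break; // no further start tag
            }
            int end = text.indexOf(endTag, begin + startTag.length());
            if (end < 0) {
                break; // unterminated block: drop it
            }
            int stop = end + endTag.length();
            records.add(text.substring(begin, stop));
            from = stop; // continue scanning after this record
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample input shaped loosely like a Wikipedia dump.
        String xml = "<mediawiki><page><title>A</title></page>\n"
                   + "<page><title>B</title></page></mediawiki>";
        for (String record : extract(xml, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

Each string returned by extract corresponds to what a mapper such as WikipediaDatasetCreatorMapper would receive as one input record, assuming the tags are set to the page delimiters of the dump.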
 



Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.