java.lang.Object
  org.apache.mahout.classifier.BayesFileFormatter
public final class BayesFileFormatter
Flattens a file into a format that can be read by the Bayes M/R job.
The output holds one document per line: the first token is the label, followed by a tab, and the rest of the line is the document's terms.
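The line layout described above can be illustrated with a small sketch. `BayesLineSketch` and its methods are hypothetical names used only for illustration and are not part of Mahout; the single-space separator between terms is an assumption, since the description only fixes the tab after the label.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper illustrating the line layout; not part of Mahout. */
final class BayesLineSketch {

    private BayesLineSketch() {}

    /** Builds "label<TAB>term1 term2 ..." for one document. */
    static String toLine(String label, List<String> terms) {
        // Assumption: terms are joined with a single space.
        return label + '\t' + String.join(" ", terms);
    }

    /** Returns the label, i.e. everything before the first tab. */
    static String labelOf(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in line");
        }
        return line.substring(0, tab);
    }

    /** Returns the terms, i.e. the space-separated tokens after the first tab. */
    static List<String> termsOf(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in line");
        }
        String rest = line.substring(tab + 1).trim();
        if (rest.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(rest.split(" +"));
    }
}
```

Under these assumptions, a document labelled `sports` with the terms `team`, `wins`, `game` becomes the label, one tab, and the three terms separated by spaces.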
Method Summary | |
---|---|
static void | collapse(String label, org.apache.lucene.analysis.Analyzer analyzer, File inputDir, Charset charset, File outputFile): Collapse all the files in the inputDir into a single file in the proper Bayes format, 1 document per line |
static void | format(String label, org.apache.lucene.analysis.Analyzer analyzer, File input, Charset charset, File outDir): Write the input files to the outdir, one output file per input file |
static void | main(String[] args): Run the FileFormatter |
static String[] | readerToDocument(org.apache.lucene.analysis.Analyzer analyzer, Reader reader): Convert a Reader to a vector |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Method Detail |
---|
public static void collapse(String label, org.apache.lucene.analysis.Analyzer analyzer, File inputDir, Charset charset, File outputFile) throws IOException
Parameters:
label - The label
analyzer - The analyzer to use
inputDir - The input directory
charset - The charset of the input files
outputFile - The file to collapse to
Throws:
IOException
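The behaviour described for collapse can be sketched with the standard library alone. This is not the Mahout implementation: `CollapseSketch` is a hypothetical class, the Lucene Analyzer is replaced by a stand-in that lowercases and splits on whitespace, and sorting the input files is an assumption made only so the example is deterministic.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical stand-in for collapse(); not part of Mahout. */
final class CollapseSketch {

    private CollapseSketch() {}

    /** Writes one "label<TAB>terms" line to outputFile per regular file in inputDir. */
    static void collapse(String label, File inputDir, Charset charset, File outputFile)
            throws IOException {
        File[] files = inputDir.listFiles(File::isFile);
        if (files == null) {
            throw new IOException("Not a readable directory: " + inputDir);
        }
        Arrays.sort(files); // assumption: fixed order, only for reproducibility
        try (BufferedWriter out = Files.newBufferedWriter(outputFile.toPath(), charset)) {
            for (File file : files) {
                String text = new String(Files.readAllBytes(file.toPath()), charset);
                // Stand-in for the Analyzer: lowercase, then split on whitespace.
                String[] terms = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
                out.write(label);
                out.write('\t');
                out.write(String.join(" ", terms));
                out.newLine();
            }
        }
    }
}
```

Every input file contributes exactly one output line, so a directory of N files yields an N-line file in which each line carries the same label.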
public static void format(String label, org.apache.lucene.analysis.Analyzer analyzer, File input, Charset charset, File outDir) throws IOException
Parameters:
label - The label of the file
analyzer - The analyzer to use
input - The input file or directory. May not be null
charset - The character set of the input files
outDir - The output directory. Files will be written there with the same name as the input file
Throws:
IOException
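The one-output-file-per-input-file mapping can be sketched as follows. `FormatSketch` is a hypothetical class, not the Mahout implementation; the whitespace tokenizer again stands in for the Analyzer, and creating the output directory when it is missing is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical stand-in for format(); not part of Mahout. */
final class FormatSketch {

    private FormatSketch() {}

    /** Writes outDir/<input name> for each input file, one Bayes-format line each. */
    static void format(String label, File input, Charset charset, File outDir)
            throws IOException {
        if (input == null) {
            throw new IllegalArgumentException("input may not be null");
        }
        // A directory contributes its regular files; a plain file contributes itself.
        File[] inputs = input.isDirectory() ? input.listFiles(File::isFile) : new File[] {input};
        if (inputs == null) {
            throw new IOException("Cannot list " + input);
        }
        Files.createDirectories(outDir.toPath()); // assumption: create outDir if absent
        for (File in : inputs) {
            String text = new String(Files.readAllBytes(in.toPath()), charset);
            // Stand-in for the Analyzer: lowercase, then split on whitespace.
            String terms = String.join(" ", text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
            Path target = outDir.toPath().resolve(in.getName()); // same name as the input
            String line = label + '\t' + terms + System.lineSeparator();
            Files.write(target, line.getBytes(charset));
        }
    }
}
```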
public static String[] readerToDocument(org.apache.lucene.analysis.Analyzer analyzer, Reader reader) throws IOException
Parameters:
analyzer - The Analyzer to use
reader - The reader to feed to the Analyzer
Throws:
IOException
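The Reader-to-String[] conversion can be illustrated without Lucene. In this sketch, `ReaderTokensSketch` is a hypothetical class and the lowercase-and-split step is only a stand-in for whatever tokens the supplied Analyzer would actually produce.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for readerToDocument(); not part of Mahout. */
final class ReaderTokensSketch {

    private ReaderTokensSketch() {}

    /** Reads all text from the reader and returns its tokens in order. */
    static String[] readerToTokens(Reader reader) throws IOException {
        List<String> tokens = new ArrayList<>();
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
            // Stand-in for the Analyzer: lowercase, then split on whitespace.
            for (String token : line.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    tokens.add(token);
                }
            }
        }
        return tokens.toArray(new String[0]);
    }
}
```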
public static void main(String[] args) throws Exception
Parameters:
args - The input args. Run with -h to see the help
Throws:
ClassNotFoundException - if the Analyzer can't be found
IllegalAccessException - if the Analyzer can't be constructed
InstantiationException - if the Analyzer can't be constructed
IOException - if the files can't be dealt with properly
Exception
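The documented exceptions suggest that main resolves the Analyzer from a class name by reflection; the page does not say so explicitly, so treat this as an assumption. A minimal sketch of that pattern, using a hypothetical `ReflectiveLoadSketch` class and the modern `getDeclaredConstructor().newInstance()` call, which may differ from what the tool actually does:

```java
/** Hypothetical illustration of loading a class by name; not part of Mahout. */
final class ReflectiveLoadSketch {

    private ReflectiveLoadSketch() {}

    /**
     * Loads className, checks that it is a subtype of type, and calls its
     * no-argument constructor. ReflectiveOperationException is the common
     * superclass of ClassNotFoundException, IllegalAccessException and
     * InstantiationException.
     */
    static <T> T newInstance(String className, Class<T> type)
            throws ReflectiveOperationException {
        Class<? extends T> cls = Class.forName(className).asSubclass(type);
        return cls.getDeclaredConstructor().newInstance();
    }
}
```

For example, `ReflectiveLoadSketch.newInstance("java.util.ArrayList", java.util.List.class)` returns a new `ArrayList`, while an unknown class name fails with `ClassNotFoundException`.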