org.apache.mahout.classifier.bayes
Class XmlInputFormat
java.lang.Object
org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
org.apache.hadoop.mapred.TextInputFormat
org.apache.mahout.classifier.bayes.XmlInputFormat
- All Implemented Interfaces:
- org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.JobConfigurable
public class XmlInputFormat
- extends org.apache.hadoop.mapred.TextInputFormat
Reads records that are delimited by a specific begin/end tag.
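As a rough illustration of what "delimited by a begin/end tag" means, the sketch below scans a byte stream for a start tag, then copies bytes until the matching end tag and emits the block as one record. It uses only the Java standard library and is not the Mahout implementation: the class name `TagDelimitedRecords` and the methods `readUntilMatch` and `extract` are hypothetical names chosen for this example, and split boundaries, offsets, and Hadoop types are deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Mahout XmlRecordReader. */
public class TagDelimitedRecords {

    /**
     * Consumes bytes until {@code match} has been seen in full or the stream ends.
     * When {@code out} is non-null, every consumed byte is also copied into it.
     * The matching is deliberately simple: it is adequate for tags such as
     * "<page>" whose first character does not recur inside the tag.
     */
    static boolean readUntilMatch(InputStream in, byte[] match, ByteArrayOutputStream out)
            throws IOException {
        int matched = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (out != null) {
                out.write(b);
            }
            if (b == match[matched]) {
                matched++;
                if (matched == match.length) {
                    return true;
                }
            } else {
                // Restart the match; the current byte may itself begin the tag.
                matched = (b == match[0]) ? 1 : 0;
            }
        }
        return false;
    }

    /** Returns every block that starts with startTag and ends with endTag. */
    static List<String> extract(InputStream in, String startTag, String endTag)
            throws IOException {
        byte[] start = startTag.getBytes(StandardCharsets.UTF_8);
        byte[] end = endTag.getBytes(StandardCharsets.UTF_8);
        List<String> records = new ArrayList<>();
        // Skip everything up to the next start tag, then buffer through the end tag.
        while (readUntilMatch(in, start, null)) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            buffer.write(start, 0, start.length);
            if (!readUntilMatch(in, end, buffer)) {
                break; // unterminated block at end of input: drop it
            }
            records.add(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<doc><page>a</page>\n<junk/><page>b<x/></page></doc>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (String record : extract(in, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

In a real job the two tags would presumably be supplied through the job configuration under START_TAG_KEY and END_TAG_KEY rather than passed as method arguments; that wiring is an assumption here, since this page does not show the constant values or how they are read.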
Nested Class Summary |
static class |
XmlInputFormat.XmlRecordReader
Record reader that scans an XML document and emits each block enclosed by the configured start tag
and end tag as one record |
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat |
LOG |
Method Summary |
org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> |
getRecordReader(org.apache.hadoop.mapred.InputSplit inputSplit,
org.apache.hadoop.mapred.JobConf jobConf,
org.apache.hadoop.mapred.Reporter reporter)
|
Methods inherited from class org.apache.hadoop.mapred.TextInputFormat |
configure, isSplitable |
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat |
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
START_TAG_KEY
public static final java.lang.String START_TAG_KEY
- See Also:
- Constant Field Values
END_TAG_KEY
public static final java.lang.String END_TAG_KEY
- See Also:
- Constant Field Values
XmlInputFormat
public XmlInputFormat()
getRecordReader
public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(org.apache.hadoop.mapred.InputSplit inputSplit,
org.apache.hadoop.mapred.JobConf jobConf,
org.apache.hadoop.mapred.Reporter reporter)
throws java.io.IOException
- Specified by:
getRecordReader
in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
- Overrides:
getRecordReader
in class org.apache.hadoop.mapred.TextInputFormat
- Throws:
java.io.IOException
Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.