org.apache.hadoop.chukwa.inputtools
Class ChukwaInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<K,V>
      extended by org.apache.hadoop.mapred.SequenceFileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
          extended by org.apache.hadoop.chukwa.inputtools.ChukwaInputFormat
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>

public class ChukwaInputFormat
extends org.apache.hadoop.mapred.SequenceFileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>

An InputFormat for processing log files in Chukwa. It is designed as a nearly drop-in replacement for Hadoop's default TextInputFormat, so that existing code can be ported to Chukwa with minimal modification. The optional configuration property chukwa.inputfilter.datatype filters the input by datatype. If the need arises, this mechanism could be extended to filter on other fields as well.
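The following is a minimal sketch of how a job written against the old org.apache.hadoop.mapred API might switch to this input format. It assumes the Hadoop and Chukwa jars are on the classpath. The class name ChukwaInputExample, the job name, and the datatype value "SysLog" are hypothetical placeholders, and how the property value is matched against a record's datatype (literal or pattern) is not stated on this page, so treat the value as an assumption.

```java
import org.apache.hadoop.chukwa.inputtools.ChukwaInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;

// Hypothetical driver class; not part of Chukwa.
public class ChukwaInputExample {
  public static void main(String[] args) throws Exception {
    JobConf job = new JobConf(ChukwaInputExample.class);
    job.setJobName("chukwa-input-example");

    // The one-line change from a TextInputFormat job: keys stay
    // LongWritable and values stay Text, so mappers need no new types.
    job.setInputFormat(ChukwaInputFormat.class);

    // Optional: restrict input to a single datatype.
    // "SysLog" is a placeholder; use a datatype present in your data.
    job.set("chukwa.inputfilter.datatype", "SysLog");

    // Map-only job that passes each (key, value) pair through unchanged.
    job.setMapperClass(IdentityMapper.class);
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);

    // args[0]: directory of Chukwa sequence files; args[1]: output directory.
    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    JobClient.runJob(job);
  }
}
```

Leaving chukwa.inputfilter.datatype unset should read the input without a datatype filter, since the property is described as optional.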


Nested Class Summary
static class ChukwaInputFormat.ChukwaRecordReader
           
 
Field Summary
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
LOG
 
Constructor Summary
ChukwaInputFormat()
           
 
Method Summary
 org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)
           
 
Methods inherited from class org.apache.hadoop.mapred.SequenceFileInputFormat
listStatus
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChukwaInputFormat

public ChukwaInputFormat()
Method Detail

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(
        org.apache.hadoop.mapred.InputSplit split,
        org.apache.hadoop.mapred.JobConf job,
        org.apache.hadoop.mapred.Reporter reporter)
    throws IOException
Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Overrides:
getRecordReader in class org.apache.hadoop.mapred.SequenceFileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws:
IOException
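
Because the returned reader yields LongWritable keys and Text values, a mapper written for TextInputFormat input should compile against it unchanged. The sketch below is one such mapper. The class name LineLengthMapper is hypothetical, and this page does not say what the key represents for Chukwa input, so the example ignores it.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical mapper; not part of Chukwa.
public class LineLengthMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  // Reused across calls to avoid allocating one object per record.
  private final LongWritable length = new LongWritable();

  @Override
  public void map(LongWritable key, Text value,
                  OutputCollector<Text, LongWritable> output,
                  Reporter reporter) throws IOException {
    // Emit each record together with its length in bytes.
    length.set(value.getLength());
    output.collect(value, length);
  }
}
```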


Copyright © The Apache Software Foundation