org.apache.hadoop.mapred
Class MultiFileInputFormat<K extends WritableComparable,V extends Writable>

java.lang.Object
  extended by org.apache.hadoop.mapred.FileInputFormat<K,V>
      extended by org.apache.hadoop.mapred.MultiFileInputFormat<K,V>
All Implemented Interfaces:
InputFormat<K,V>

public abstract class MultiFileInputFormat<K extends WritableComparable,V extends Writable>
extends FileInputFormat<K,V>

An abstract InputFormat that returns MultiFileSplits from its getSplits(JobConf, int) method. Splits are constructed from the files under the input paths, and the returned splits have nearly equal total content length.
Subclasses implement getRecordReader(InputSplit, JobConf, Reporter) to construct RecordReaders for MultiFileSplits.

See Also:
MultiFileSplit

Field Summary
 
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
LOG
 
Constructor Summary
MultiFileInputFormat()
           
 
Method Summary
abstract  RecordReader<K,V> getRecordReader(InputSplit split, JobConf job, Reporter reporter)
          Get the RecordReader for the given InputSplit.
 InputSplit[] getSplits(JobConf job, int numSplits)
          Splits files returned by FileInputFormat.listPaths(JobConf) when they're too big.
 
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
isSplitable, listPaths, setMinSplitSize, validateInput
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MultiFileInputFormat

public MultiFileInputFormat()
Method Detail

getSplits

public InputSplit[] getSplits(JobConf job,
                              int numSplits)
                       throws IOException
Description copied from class: FileInputFormat
Splits files returned by FileInputFormat.listPaths(JobConf) when they're too big.

Specified by:
getSplits in interface InputFormat<K extends WritableComparable,V extends Writable>
Overrides:
getSplits in class FileInputFormat<K extends WritableComparable,V extends Writable>
Parameters:
job - job configuration.
numSplits - the desired number of splits, a hint.
Returns:
an array of InputSplits for the job.
Throws:
IOException
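
The class description states that each returned split holds nearly equal content length. The sketch below illustrates one way such a grouping could be computed from a list of file lengths. It is plain Java and does not reproduce the Hadoop implementation: the class name EqualLengthSplitSketch, the method group, and the greedy rule it uses are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical grouping rule, not the Hadoop getSplits code. */
final class EqualLengthSplitSketch {

    /**
     * Groups consecutive files into at most numSplits groups whose total
     * lengths are close to each other. Each group holds indices into lengths.
     */
    static List<List<Integer>> group(long[] lengths, int numSplits) {
        List<List<Integer>> groups = new ArrayList<>();
        if (lengths.length == 0 || numSplits <= 0) {
            return groups;
        }
        long remaining = 0;
        for (long len : lengths) {
            remaining += len;
        }
        int next = 0;
        for (int s = 0; s < numSplits && next < lengths.length; s++) {
            int splitsLeft = numSplits - s;
            // Share the bytes not yet assigned evenly among the splits still to be built.
            long target = (remaining + splitsLeft - 1) / splitsLeft;
            List<Integer> group = new ArrayList<>();
            long size = 0;
            // Always take one file, then keep adding while the target is not exceeded.
            // The last split takes every file that is left.
            do {
                size += lengths[next];
                group.add(next);
                next++;
            } while (next < lengths.length
                    && (splitsLeft == 1 || size + lengths[next] <= target));
            remaining -= size;
            groups.add(group);
        }
        return groups;
    }

    public static void main(String[] args) {
        long[] lengths = {40, 30, 30, 50, 50};
        System.out.println(group(lengths, 2));
    }
}
```

As with the numSplits parameter above, the requested count is treated as a hint: when there are fewer files than requested splits, the sketch returns fewer groups.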

getRecordReader

public abstract RecordReader<K,V> getRecordReader(InputSplit split,
                                                  JobConf job,
                                                  Reporter reporter)
                                           throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.

It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.

Specified by:
getRecordReader in interface InputFormat<K extends WritableComparable,V extends Writable>
Specified by:
getRecordReader in class FileInputFormat<K extends WritableComparable,V extends Writable>
Parameters:
split - the InputSplit
job - the job that this split belongs to
reporter - facility for reporting progress and status to the framework
Returns:
a RecordReader
Throws:
IOException
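
A reader for a multi-file split has to present one continuous stream of records even though the records live in several files. The sketch below shows that idea in plain Java, with one line as one record. It is a hypothetical analogue and not Hadoop API: MultiFileLineReaderSketch, next, and demo are names invented for this example, and a real subclass would implement RecordReader and obtain its paths from the MultiFileSplit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain-Java analogue of a reader over a multi-file split. */
final class MultiFileLineReaderSketch implements AutoCloseable {
    private final List<Path> paths;
    private int index = 0;
    private BufferedReader current;

    MultiFileLineReaderSketch(List<Path> paths) {
        this.paths = paths;
    }

    /** Returns the next line, or null once every file has been read. */
    String next() throws IOException {
        while (true) {
            if (current == null) {
                if (index >= paths.size()) {
                    return null;
                }
                // Open the next file of the split lazily.
                current = Files.newBufferedReader(paths.get(index++), StandardCharsets.UTF_8);
            }
            String line = current.readLine();
            if (line != null) {
                return line;
            }
            // End of this file: close it and move on to the next one.
            current.close();
            current = null;
        }
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }

    /** Writes two temporary files and reads them back as one record stream. */
    static List<String> demo() {
        try {
            Path first = Files.createTempFile("part-a", ".txt");
            Path second = Files.createTempFile("part-b", ".txt");
            List<String> out = new ArrayList<>();
            try (MultiFileLineReaderSketch reader =
                    new MultiFileLineReaderSketch(List.of(first, second))) {
                Files.write(first, List.of("a", "b"));
                Files.write(second, List.of("c"));
                String line;
                while ((line = reader.next()) != null) {
                    out.add(line);
                }
            } finally {
                Files.deleteIfExists(first);
                Files.deleteIfExists(second);
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because each file is read to its end before the next is opened, no record straddles a file boundary in this sketch.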


Copyright © 2006 The Apache Software Foundation