org.apache.hadoop.mapred
Class InputFormatBase

java.lang.Object
  extended by org.apache.hadoop.mapred.InputFormatBase
All Implemented Interfaces:
InputFormat
Direct Known Subclasses:
SequenceFileInputFormat, TextInputFormat

public abstract class InputFormatBase
extends Object
implements InputFormat

A base class for InputFormat.


Field Summary
static Logger LOG
           
 
Constructor Summary
InputFormatBase()
           
 
Method Summary
abstract  RecordReader getRecordReader(FileSystem fs, FileSplit split, JobConf job, Reporter reporter)
          Construct a RecordReader for a FileSplit.
 FileSplit[] getSplits(FileSystem fs, JobConf job, int numSplits)
          Splits files returned by listFiles(FileSystem, JobConf) when they're too big.
protected  File[] listFiles(FileSystem fs, JobConf job)
          List input directories.
protected  void setMinSplitSize(long minSplitSize)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG
Constructor Detail

InputFormatBase

public InputFormatBase()
Method Detail

setMinSplitSize

protected void setMinSplitSize(long minSplitSize)

getRecordReader

public abstract RecordReader getRecordReader(FileSystem fs,
                                             FileSplit split,
                                             JobConf job,
                                             Reporter reporter)
                                      throws IOException
Description copied from interface: InputFormat
Construct a RecordReader for a FileSplit.

Specified by:
getRecordReader in interface InputFormat
Parameters:
fs - the FileSystem
split - the FileSplit
job - the job that this split belongs to
Returns:
a RecordReader
Throws:
IOException

listFiles

protected File[] listFiles(FileSystem fs,
                           JobConf job)
                    throws IOException
Lists the input directories. Subclasses may override this method to, for example, select only files matching a regular expression. If the property mapred.input.subdir is set, it names a subdirectory that is appended to every input directory specified by job; each resulting directory that the given fs also lists is added to the returned array of File.

Parameters:
fs - the FileSystem used to list the input directories
job - the job whose input directories are listed
Returns:
a non-empty array of File objects
Throws:
IOException - if no input files are found
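
The description above mentions overriding listFiles to keep only files matching a regular expression. The following is a minimal sketch of that filtering step, written without any Hadoop dependency so that it runs on its own. It assumes File here means java.io.File; the class RegexFileFilterSketch and the method selectMatching are hypothetical names, not part of this API. An overriding subclass could apply a helper like this to the array returned by the inherited listFiles.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of org.apache.hadoop.mapred.
public class RegexFileFilterSketch {

    /** Returns the files whose names fully match the pattern, in their original order. */
    public static File[] selectMatching(File[] files, Pattern pattern) {
        List<File> kept = new ArrayList<>();
        for (File f : files) {
            // Match against the last path component only, not the whole path.
            if (pattern.matcher(f.getName()).matches()) {
                kept.add(f);
            }
        }
        return kept.toArray(new File[0]);
    }

    public static void main(String[] args) {
        // Illustrative file names; nothing is read from disk.
        File[] listed = {
            new File("in/part-00000"),
            new File("in/_logs"),
            new File("in/part-00001")
        };
        for (File f : selectMatching(listed, Pattern.compile("part-\\d+"))) {
            System.out.println(f.getName());
        }
    }
}
```

Because listFiles is documented to throw IOException when nothing is listed, an override that filters this way would probably want to throw as well when the filtered array comes back empty.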

getSplits

public FileSplit[] getSplits(FileSystem fs,
                             JobConf job,
                             int numSplits)
                      throws IOException
Splits files returned by listFiles(FileSystem, JobConf) when they're too big.

Specified by:
getSplits in interface InputFormat
Parameters:
fs - the filesystem containing the files to be split
job - the job whose input files are to be split
numSplits - the desired number of splits
Returns:
the splits
Throws:
IOException
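
This page does not say how getSplits decides that a file is too big, so the following is only an illustrative sketch under stated assumptions, not the actual implementation. It assumes the target split size is the total input size divided by numSplits, never smaller than the value passed to setMinSplitSize, and that each file is cut into consecutive byte ranges of at most that size. SplitSketch, Range, targetSize, and splitFile are hypothetical names; Range stands in for FileSplit so that the example runs without Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the InputFormatBase implementation.
public class SplitSketch {

    /** A byte range within one file; a stand-in for FileSplit. */
    public static final class Range {
        public final long start;
        public final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Assumed rule: total bytes / desired splits, but never below minSplitSize (or 1). */
    public static long targetSize(long totalBytes, int numSplits, long minSplitSize) {
        long goal = totalBytes / Math.max(1, numSplits);
        return Math.max(Math.max(goal, minSplitSize), 1L);
    }

    /** Cuts a file of fileLength bytes into consecutive ranges of at most splitSize bytes. */
    public static List<Range> splitFile(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        long offset = 0;
        while (offset < fileLength) {
            long len = Math.min(splitSize, fileLength - offset);
            ranges.add(new Range(offset, len));
            offset += len;
        }
        return ranges;
    }

    public static void main(String[] args) {
        long[] fileLengths = {1000L, 250L}; // illustrative sizes in bytes
        long total = 0;
        for (long len : fileLengths) {
            total += len;
        }
        long size = targetSize(total, 5, 100L);
        for (int i = 0; i < fileLengths.length; i++) {
            for (Range r : splitFile(fileLengths[i], size)) {
                System.out.println("file " + i + ": start=" + r.start + " length=" + r.length);
            }
        }
    }
}
```

Under these assumptions a file no longer than the target size yields a single range, which matches the description that files are split only when they are too big.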


Copyright © 2006 The Apache Software Foundation