org.apache.hadoop.mapred
Class InputFormatBase
java.lang.Object
org.apache.hadoop.mapred.InputFormatBase
- All Implemented Interfaces:
- InputFormat
- Direct Known Subclasses:
- SequenceFileInputFormat, TextInputFormat
- public abstract class InputFormatBase
- extends Object
- implements InputFormat
A base class for InputFormat.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
LOG
public static final Logger LOG
InputFormatBase
public InputFormatBase()
setMinSplitSize
protected void setMinSplitSize(long minSplitSize)
getRecordReader
public abstract RecordReader getRecordReader(FileSystem fs,
FileSplit split,
JobConf job,
Reporter reporter)
throws IOException
- Description copied from interface:
InputFormat
- Construct a RecordReader for a FileSplit.
- Specified by:
getRecordReader
in interface InputFormat
- Parameters:
fs
- the FileSystem
split
- the FileSplit
job
- the job that this split belongs to
- Returns:
- a RecordReader
- Throws:
IOException
listFiles
protected File[] listFiles(FileSystem fs,
JobConf job)
throws IOException
- Lists the input directories.
Subclasses may override this method to, for example, select only files
matching a regular expression.
If the property mapred.input.subdir is set, it names a subdirectory that
is appended to all input directories specified by job; where the given fs
can list the resulting paths, their entries are added to the returned
array of File.
- Parameters:
fs
- the FileSystem used to list the input directories
job
- the job whose input directories are listed
- Returns:
- array of File objects, never zero length.
- Throws:
IOException
- if no input files are found.
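The note above about overriding listFiles to select files by regular expression can be illustrated with a small standalone sketch. It deliberately avoids the Hadoop classes so that it runs on its own: the class name ListFilesFilterSketch and the helper filterByName are hypothetical, not part of this API. A subclass might apply something similar to the array returned by the inherited listFiles, keeping the documented contract that the result is never zero length.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, standalone illustration; not part of org.apache.hadoop.mapred.
public class ListFilesFilterSketch {

    /**
     * Keeps only the files whose names fully match the pattern.
     * Throws IOException when nothing matches, mirroring the documented
     * "never zero length" contract of listFiles.
     */
    static File[] filterByName(File[] files, Pattern pattern) throws IOException {
        List<File> kept = new ArrayList<>();
        for (File f : files) {
            if (pattern.matcher(f.getName()).matches()) {
                kept.add(f);
            }
        }
        if (kept.isEmpty()) {
            throw new IOException("No input files match " + pattern);
        }
        return kept.toArray(new File[0]);
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample listing; a real override would start from super.listFiles(fs, job).
        File[] listed = {
            new File("in/part-00000"),
            new File("in/_logs"),
            new File("in/part-00001")
        };
        for (File f : filterByName(listed, Pattern.compile("part-\\d+"))) {
            System.out.println(f.getName());
        }
    }
}
```

Running main prints part-00000 and part-00001; the _logs entry is dropped because its name does not match the pattern.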
getSplits
public FileSplit[] getSplits(FileSystem fs,
JobConf job,
int numSplits)
throws IOException
- Splits files returned by listFiles(FileSystem,JobConf) when
they're too big.
- Specified by:
getSplits
in interface InputFormat
- Parameters:
fs
- the filesystem containing the files to be split
job
- the job whose input files are to be split
numSplits
- the desired number of splits
- Returns:
- the splits
- Throws:
IOException
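How a file that is "too big" might be cut into pieces can be sketched as follows. This is only an illustration under stated assumptions, not the actual implementation of getSplits: it assumes the target split size is the total input size divided by numSplits, raised to the value passed to setMinSplitSize when that is larger. The class SplitSketch and its Range type are hypothetical stand-ins for FileSplit, and details such as filesystem block size are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, standalone illustration; not the real getSplits logic.
public class SplitSketch {

    /** A byte range within one file; a stand-in for FileSplit. */
    static final class Range {
        final long start;
        final long length;
        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Cuts one file of fileLength bytes into ranges of at most the target size.
     * Assumed target: max(minSplitSize, totalSize / numSplits), at least 1 byte.
     */
    static List<Range> split(long fileLength, long totalSize, int numSplits, long minSplitSize) {
        long target = Math.max(minSplitSize, totalSize / Math.max(1, numSplits));
        target = Math.max(1, target);
        List<Range> ranges = new ArrayList<>();
        long offset = 0;
        // Emit full-size ranges while more than one target's worth remains.
        while (fileLength - offset > target) {
            ranges.add(new Range(offset, target));
            offset += target;
        }
        // The remainder (if any) becomes the final, possibly shorter, range.
        if (fileLength - offset > 0) {
            ranges.add(new Range(offset, fileLength - offset));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // One 1000-byte file, 4 desired splits, minimum split size of 300 bytes.
        for (Range r : split(1000, 1000, 4, 300)) {
            System.out.println(r.start + "+" + r.length);
        }
    }
}
```

With these inputs the minimum split size (300) exceeds 1000 / 4 = 250, so the sketch prints 0+300, 300+300, 600+300 and 900+100. A file smaller than the target size is returned as a single range.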
Copyright © 2006 The Apache Software Foundation