An input data format. Input files are stored in a NutchFileSystem.
The processing of an input file may be split across multiple machines.
Files are processed as sequences of records, read through a RecordReader implementation. Files must thus be split on record boundaries.
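The record-boundary requirement can be illustrated with a minimal, self-contained sketch. It assumes records are newline-terminated lines held in a byte array; the class `RecordBoundarySplitter`, its `Range` type, and the `split` method are hypothetical names invented for illustration and are not part of the Nutch API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Nutch API. */
final class RecordBoundarySplitter {

    /** A byte range [start, start + length) of the input (assumed analogue of a split). */
    static final class Range {
        final int start;
        final int length;

        Range(int start, int length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Cuts data into at most numSplits ranges. Each range ends just after a
     * '\n' (or at the end of the data), so no record straddles two ranges.
     */
    static List<Range> split(byte[] data, int numSplits) {
        List<Range> ranges = new ArrayList<>();
        if (data.length == 0 || numSplits < 1) {
            return ranges;
        }
        // Target size per range, rounded up so the count never exceeds numSplits.
        int target = (data.length + numSplits - 1) / numSplits;
        int start = 0;
        while (start < data.length) {
            int end = Math.min(data.length, start + target);
            // Extend the range until it ends right after a record terminator.
            while (end < data.length && data[end - 1] != '\n') {
                end++;
            }
            ranges.add(new Range(start, end - start));
            start = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        byte[] data = "a\nbb\nccc\ndddd\n".getBytes(StandardCharsets.UTF_8);
        for (Range r : split(data, 2)) {
            System.out.println("start=" + r.start + " length=" + r.length);
        }
    }
}
```

Because every range is extended to the next record terminator, the requested number of splits acts as an upper bound rather than an exact count, which is in keeping with numSplits being described as the desired number of splits.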
Nested Class Summary
static interface | InputFormat.Split | A section of an input file.
Method Summary
RecordReader | getRecordReader(InputFormat.Split split) | Constructs a RecordReader for an InputFormat.Split.
InputFormat.Split[] | getSplits(NutchFileSystem fs, File[] files, int numSplits) | Splits a set of input files.
Method Detail

public InputFormat.Split[] getSplits(NutchFileSystem fs, File[] files, int numSplits) throws IOException
Splits a set of input files.
Parameters:
fs - the filesystem containing the files to be split
files - the input files to split
numSplits - the desired number of splits
Throws:
IOException

public RecordReader getRecordReader(InputFormat.Split split) throws IOException
Constructs a RecordReader for an InputFormat.Split.
Parameters:
split - the split
Returns:
a RecordReader
Throws:
IOException
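
The two methods are meant to be used together: first obtain the splits, then construct one reader per split and consume its records. The sketch below shows that flow with in-memory stand-ins. Everything in it is an assumption made for illustration: `Split`, `Reader`, `Format`, `InMemoryFormat`, and `readAll` are hypothetical names, the single `next()` method is invented because this page does not list the methods of RecordReader, and the `IOException` declared by the real methods is omitted since nothing here touches a filesystem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical driver sketch; none of these types are the Nutch classes. */
final class SplitDriverSketch {

    /** Stand-in for InputFormat.Split: records [from, to) of one input. */
    static final class Split {
        final List<String> records;
        final int from;
        final int to;

        Split(List<String> records, int from, int to) {
            this.records = records;
            this.from = from;
            this.to = to;
        }
    }

    /** Stand-in for RecordReader; next() is an assumed method. */
    interface Reader {
        /** Returns the next record, or null once the split is exhausted. */
        String next();
    }

    /** Stand-in mirroring the two-method shape of InputFormat. */
    interface Format {
        Split[] getSplits(List<String> records, int numSplits);

        Reader getRecordReader(Split split);
    }

    /** Splits a list of records into contiguous, roughly equal runs. */
    static final class InMemoryFormat implements Format {
        @Override
        public Split[] getSplits(List<String> records, int numSplits) {
            int n = Math.max(1, numSplits);
            int perSplit = (records.size() + n - 1) / n; // ceiling division
            List<Split> out = new ArrayList<>();
            for (int from = 0; from < records.size(); from += perSplit) {
                int to = Math.min(records.size(), from + perSplit);
                out.add(new Split(records, from, to));
            }
            return out.toArray(new Split[0]);
        }

        @Override
        public Reader getRecordReader(final Split split) {
            return new Reader() {
                private int pos = split.from;

                @Override
                public String next() {
                    return pos < split.to ? split.records.get(pos++) : null;
                }
            };
        }
    }

    /** Reads every record of every split, in split order. */
    static List<String> readAll(Format format, List<String> records, int numSplits) {
        List<String> seen = new ArrayList<>();
        for (Split split : format.getSplits(records, numSplits)) {
            Reader reader = format.getRecordReader(split);
            for (String r = reader.next(); r != null; r = reader.next()) {
                seen.add(r);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        System.out.println(readAll(new InMemoryFormat(), records, 2));
    }
}
```

In a distributed setting each split would presumably be handed to a different machine rather than read in a single loop; the sequential loop here only demonstrates that the splits together cover every record exactly once.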