net.nutch.mapReduce
Class TextInputFormat
java.lang.Object
net.nutch.mapReduce.TextInputFormat
- All Implemented Interfaces:
- InputFormat
- public class TextInputFormat
- extends Object
- implements InputFormat
An InputFormat for plain text files. Files are broken into lines.
Either a linefeed or a carriage return signals the end of a line. Keys are
positions in the file, and values are the lines of text.
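The key/value layout described above can be illustrated with a minimal, self-contained sketch. It is not the Nutch implementation: the class `LineRecords` and its `Record` type are hypothetical names, the key is assumed to be the byte offset of the start of each line, and a carriage return immediately followed by a linefeed is assumed to count as a single terminator, which this page does not confirm.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of net.nutch.mapReduce.
class LineRecords {
    /** One record: offset of the line start (key) and the line text (value). */
    static final class Record {
        public final long position;
        public final String line;

        Record(long position, String line) {
            this.position = position;
            this.line = line;
        }
    }

    /** Breaks the data into lines, ending a line at either '\n' or '\r'. */
    static List<Record> read(byte[] data) {
        List<Record> records = new ArrayList<>();
        int start = 0;
        int i = 0;
        while (i < data.length) {
            byte b = data[i];
            if (b == '\n' || b == '\r') {
                records.add(new Record(start,
                        new String(data, start, i - start, StandardCharsets.UTF_8)));
                // Assumption: "\r\n" is consumed as one terminator.
                if (b == '\r' && i + 1 < data.length && data[i + 1] == '\n') {
                    i++;
                }
                i++;
                start = i;
            } else {
                i++;
            }
        }
        // A final line without a terminator is still a record.
        if (start < data.length) {
            records.add(new Record(start,
                    new String(data, start, data.length - start, StandardCharsets.UTF_8)));
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbeta\rgamma".getBytes(StandardCharsets.UTF_8);
        for (Record r : read(data)) {
            System.out.println(r.position + "\t" + r.line);
        }
        // Prints:
        // 0    alpha
        // 6    beta
        // 11   gamma
    }
}
```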
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TextInputFormat
public TextInputFormat()
getSplits
public InputFormat.Split[] getSplits(NutchFileSystem fs,
File[] files,
int numSplits)
throws IOException
- Description copied from interface:
InputFormat
- Splits a set of input files. One split is created per map task.
- Specified by:
getSplits
in interface InputFormat
- Parameters:
fs
- the filesystem containing the files to be split
files
- the input files to split
numSplits
- the desired number of splits
- Returns:
- the splits
- Throws:
IOException
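As a rough illustration of what "one split per map task" can mean, the following sketch divides a set of files into byte ranges of roughly equal size. It is an assumption-laden sketch, not the algorithm Nutch uses: `SplitSketch`, `Split`, and `computeSplits` are hypothetical names, and file names with sizes stand in for the `NutchFileSystem` and `File[]` arguments of the real method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of net.nutch.mapReduce.
class SplitSketch {
    /** Hypothetical stand-in for InputFormat.Split: a byte range of one file. */
    static final class Split {
        final String file;
        final long start;
        final long length;

        Split(String file, long start, long length) {
            this.file = file;
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts each file into ranges of at most ceil(totalBytes / numSplits) bytes. */
    static List<Split> computeSplits(String[] files, long[] sizes, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be at least 1");
        }
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        // Target bytes per split, rounded up and never below 1.
        long target = Math.max(1, (total + numSplits - 1) / numSplits);
        List<Split> splits = new ArrayList<>();
        for (int f = 0; f < files.length; f++) {
            for (long start = 0; start < sizes[f]; start += target) {
                long length = Math.min(target, sizes[f] - start);
                splits.add(new Split(files[f], start, length));
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Split> splits = computeSplits(
                new String[] {"a.txt", "b.txt"}, new long[] {100, 50}, 3);
        for (Split s : splits) {
            System.out.println(s.file + " start=" + s.start + " length=" + s.length);
        }
        // Prints:
        // a.txt start=0 length=50
        // a.txt start=50 length=50
        // b.txt start=0 length=50
    }
}
```

Because ranges in this sketch never cross file boundaries, it can return more than `numSplits` ranges; the parameter is described above only as the desired number of splits.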
getRecordReader
public RecordReader getRecordReader(InputFormat.Split s)
throws IOException
- Description copied from interface:
InputFormat
- Construct a RecordReader for an InputFormat.Split.
- Specified by:
getRecordReader
in interface InputFormat
- Parameters:
s
- the split
- Returns:
- a RecordReader
- Throws:
IOException
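A reader for line-oriented text has to cope with a split boundary that falls in the middle of a line. The sketch below shows one common convention, under the assumption that a split is a byte range: a reader whose range does not start at offset 0 skips forward past the first line terminator at or after the byte before its start, and every reader finishes the last line it started even if that line runs past the end of its range. This is not taken from the Nutch source; `SplitReaderSketch`, `skipLine`, and `readSplit` are hypothetical names, and records are returned as "position, tab, line" strings for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of net.nutch.mapReduce.
class SplitReaderSketch {
    /** Returns the index just past the terminator of the line containing pos. */
    static int skipLine(byte[] data, int pos) {
        while (pos < data.length) {
            byte b = data[pos++];
            if (b == '\n') {
                return pos;
            }
            if (b == '\r') {
                // Assumption: "\r\n" is consumed as one terminator.
                if (pos < data.length && data[pos] == '\n') {
                    pos++;
                }
                return pos;
            }
        }
        return pos;
    }

    /** Reads every line that begins inside the range [start, start + length). */
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = Math.min(data.length, start + length);
        // Back up one byte so a range that begins exactly on a line start keeps that line.
        int pos = (start == 0) ? 0 : skipLine(data, start - 1);
        List<String> records = new ArrayList<>();
        while (pos < end) {
            int next = skipLine(data, pos);
            int textEnd = next;
            // Drop the trailing terminator bytes from the value.
            while (textEnd > pos && (data[textEnd - 1] == '\n' || data[textEnd - 1] == '\r')) {
                textEnd--;
            }
            records.add(pos + "\t" + new String(data, pos, textEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "one\ntwo\nthree\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 6)); // lines starting at offsets 0 and 4
        System.out.println(readSplit(data, 6, 8)); // line starting at offset 8
    }
}
```

With this convention, each line is produced by exactly one reader: the one whose range contains the line's first byte.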
Copyright © 2005 The Nutch Organization.