net.nutch.mapReduce
Class TextInputFormat

java.lang.Object
  extended by net.nutch.mapReduce.TextInputFormat
All Implemented Interfaces:
InputFormat

public class TextInputFormat
extends Object
implements InputFormat

An InputFormat for plain text files. Files are broken into lines; either a linefeed or a carriage return signals the end of a line. Each key is a position in the file, and each value is the line of text at that position.
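
The key/value layout described above can be illustrated with a small standalone sketch. It does not use the Nutch classes: LineRecordSketch, Record, and readRecords are hypothetical names introduced only for this example. It assumes that the key is the offset at which a line starts and that each linefeed or carriage return ends a line on its own, so a CR LF pair would yield an empty record between the two characters. The offsets are character offsets, which match byte offsets only for single-byte text.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the TextInputFormat implementation.
public class LineRecordSketch {

    /** One record: the offset where a line starts (key) and its text (value). */
    public static final class Record {
        public final long position;
        public final String line;

        Record(long position, String line) {
            this.position = position;
            this.line = line;
        }
    }

    /** Breaks text into records, ending a line at each '\n' or '\r'. */
    public static List<Record> readRecords(String text) {
        List<Record> records = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n' || c == '\r') {
                // The terminator itself is not part of the value.
                records.add(new Record(start, text.substring(start, i)));
                start = i + 1;
            }
        }
        // Emit a final line that has no terminator, if there is one.
        if (start < text.length()) {
            records.add(new Record(start, text.substring(start)));
        }
        return records;
    }

    public static void main(String[] args) {
        for (Record r : readRecords("alpha\nbeta\rgamma")) {
            System.out.println(r.position + "\t" + r.line);
        }
    }
}
```

For the input in main, the keys are 0, 6, and 11, paired with the values alpha, beta, and gamma.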


Nested Class Summary
 
Nested classes inherited from class net.nutch.mapReduce.InputFormat
InputFormat.Split
 
Constructor Summary
TextInputFormat()
           
 
Method Summary
 RecordReader getRecordReader(InputFormat.Split s)
          Construct a RecordReader for a Split.
 InputFormat.Split[] getSplits(NutchFileSystem fs, File[] files, int numSplits)
          Splits a set of input files.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextInputFormat

public TextInputFormat()
Method Detail

getSplits

public InputFormat.Split[] getSplits(NutchFileSystem fs,
                                     File[] files,
                                     int numSplits)
                              throws IOException
Description copied from interface: InputFormat
Splits a set of input files. One split is created per map task.

Specified by:
getSplits in interface InputFormat
Parameters:
fs - the filesystem containing the files to be split
files - the input files to split
numSplits - the desired number of splits
Returns:
the splits
Throws:
IOException
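
This page does not say how getSplits divides its input. As a rough sketch of one plausible approach, the standalone example below cuts a single file of a given length into numSplits contiguous byte ranges of nearly equal size. SplitSketch, Range, and splitRange are hypothetical names and are not part of the Nutch API; the actual class may split differently, for example by treating numSplits only as a hint.

```java
// Illustrative sketch only; not the getSplits implementation.
public class SplitSketch {

    /** A contiguous byte range within one file. */
    public static final class Range {
        public final long start;
        public final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Divides fileLength bytes into numSplits ranges whose sizes differ by at most one. */
    public static Range[] splitRange(long fileLength, int numSplits) {
        if (fileLength < 0 || numSplits < 1) {
            throw new IllegalArgumentException("need fileLength >= 0 and numSplits >= 1");
        }
        Range[] ranges = new Range[numSplits];
        long base = fileLength / numSplits;
        long remainder = fileLength % numSplits;
        long start = 0;
        for (int i = 0; i < numSplits; i++) {
            // The first 'remainder' ranges each take one extra byte.
            long length = base + (i < remainder ? 1 : 0);
            ranges[i] = new Range(start, length);
            start += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : splitRange(10, 3)) {
            System.out.println(r.start + "+" + r.length);
        }
    }
}
```

With a 10-byte file and three splits, the ranges start at 0, 4, and 7 with lengths 4, 3, and 3.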

getRecordReader

public RecordReader getRecordReader(InputFormat.Split s)
                             throws IOException
Description copied from interface: InputFormat
Construct a RecordReader for an InputFormat.Split.

Specified by:
getRecordReader in interface InputFormat
Parameters:
s - the split
Returns:
a RecordReader
Throws:
IOException


Copyright © 2005 The Nutch Organization.