org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class PigInputFormat

java.lang.Object
  extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,Tuple>, org.apache.hadoop.mapred.JobConfigurable

public class PigInputFormat
extends Object
implements org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,Tuple>, org.apache.hadoop.mapred.JobConfigurable


Field Summary
static org.apache.commons.logging.Log LOG
           
static org.apache.hadoop.mapred.JobConf sJob
           
 
Constructor Summary
PigInputFormat()
           
 
Method Summary
 void configure(org.apache.hadoop.mapred.JobConf conf)
           
static SliceWrapper getActiveSplit()
           
 org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,Tuple> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)
           
 org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits)
          Creates input splits: one per input, with a slice per DFS block of the input file.
protected  boolean isSplitable(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path filename)
          Is the given filename splittable? Usually true, but if the file is stream compressed, it will not be.
protected  org.apache.hadoop.fs.Path[] listPaths(org.apache.hadoop.mapred.JobConf job)
          List input directories.
 void validateInput(org.apache.hadoop.mapred.JobConf job)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

sJob

public static org.apache.hadoop.mapred.JobConf sJob

Constructor Detail

PigInputFormat

public PigInputFormat()

Method Detail

isSplitable

protected boolean isSplitable(org.apache.hadoop.fs.FileSystem fs,
                              org.apache.hadoop.fs.Path filename)
Is the given filename splittable? Usually true, but if the file is stream compressed, it will not be. FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.

Parameters:
fs - the file system that the file is on
filename - the file name to check
Returns:
is this file splittable?
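
For example, a subclass can force whole-file processing by overriding this method to always return false. The class below is a hypothetical illustration, not part of Pig:

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical subclass that reports every input file as non-splittable,
    // so each file is handed to a single mapper in its entirety.
    public class WholeFilePigInputFormat extends PigInputFormat {
        @Override
        protected boolean isSplitable(FileSystem fs, Path filename) {
            return false;
        }
    }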

listPaths

protected org.apache.hadoop.fs.Path[] listPaths(org.apache.hadoop.mapred.JobConf job)
                                         throws IOException
List input directories. Subclasses may override to, e.g., select only files matching a regular expression.

Parameters:
job - the job to list input paths for
Returns:
array of Path objects
Throws:
IOException - if no input paths are found.
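
For example, a subclass could narrow the input set to files whose names match a regular expression. The class below is a hypothetical sketch; the class name and pattern are illustrative only:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    // Hypothetical subclass that keeps only input files matching a pattern.
    public class FilteredPigInputFormat extends PigInputFormat {
        @Override
        protected Path[] listPaths(JobConf job) throws IOException {
            List<Path> matched = new ArrayList<Path>();
            for (Path p : super.listPaths(job)) {
                if (p.getName().matches("part-.*")) {   // illustrative pattern
                    matched.add(p);
                }
            }
            if (matched.isEmpty()) {
                throw new IOException("No input paths matched the filter");
            }
            return matched.toArray(new Path[matched.size()]);
        }
    }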

validateInput

public void validateInput(org.apache.hadoop.mapred.JobConf job)
                   throws IOException
Throws:
IOException

getSplits

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                       int numSplits)
                                                throws IOException
Creates input splits: one per input, with a slice per DFS block of the input file. Configures each PigSlice and returns the list of PigSlices as an array.

Specified by:
getSplits in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,Tuple>
Throws:
IOException
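
The snippet below is a minimal sketch of how the MapReduce framework drives this method together with getRecordReader: splits are produced once, then each split is read through a RecordReader. It assumes the JobConf has already been prepared by Pig's job-setup code (input locations and slicers configured); the driver class and method names are illustrative, not part of Pig or Hadoop.

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.pig.data.Tuple;

    // Hypothetical driver showing the InputFormat contract end to end.
    public class PigInputFormatSketch {
        public static void readAll(JobConf job) throws Exception {
            PigInputFormat inputFormat = new PigInputFormat();
            inputFormat.configure(job);
            inputFormat.validateInput(job);

            // numSplits is only a hint; actual slicing follows DFS blocks.
            InputSplit[] splits = inputFormat.getSplits(job, 1);
            for (InputSplit split : splits) {
                RecordReader<Text, Tuple> reader =
                        inputFormat.getRecordReader(split, job, Reporter.NULL);
                Text key = reader.createKey();
                Tuple value = reader.createValue();
                while (reader.next(key, value)) {
                    // process one (key, tuple) record
                }
                reader.close();
            }
        }
    }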

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,Tuple> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                                                                                              org.apache.hadoop.mapred.JobConf job,
                                                                                              org.apache.hadoop.mapred.Reporter reporter)
                                                                                       throws IOException
Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,Tuple>
Throws:
IOException

configure

public void configure(org.apache.hadoop.mapred.JobConf conf)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable

getActiveSplit

public static SliceWrapper getActiveSplit()

