org.apache.hadoop.hive.ql.io
Class BucketizedHiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>

java.lang.Object
  extended by org.apache.hadoop.hive.ql.io.HiveInputFormat<K,V>
      extended by org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat<K,V>
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<K,V>, org.apache.hadoop.mapred.JobConfigurable

public class BucketizedHiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
extends HiveInputFormat<K,V>

BucketizedHiveInputFormat serves a similar function to HiveInputFormat, but its getSplits() always groups the splits from one input file into a single wrapper split. It is useful for applications that require each input file to be processed in its entirety by one mapper.
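
The grouping idea described above can be illustrated with a minimal, Hadoop-free sketch. It is not the class's actual implementation: `SplitGroupingSketch`, `Chunk`, and `groupByFile` are hypothetical names introduced only for this example, and the file paths are made up. The sketch assumes each ordinary split is a byte range within one file, and shows how such ranges could be collected so that every file ends up in exactly one group, which is the role the wrapper split plays.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitGroupingSketch {

    /** Hypothetical stand-in for a file split: a byte range within one file. */
    static final class Chunk {
        final String path;
        final long start;
        final long length;

        Chunk(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Groups chunks by file path, preserving first-seen file order.
     * Each map entry stands in for one wrapper split, so one file
     * never spans more than one group (and hence one mapper).
     */
    static Map<String, List<Chunk>> groupByFile(List<Chunk> chunks) {
        Map<String, List<Chunk>> wrappers = new LinkedHashMap<>();
        for (Chunk c : chunks) {
            wrappers.computeIfAbsent(c.path, p -> new ArrayList<>()).add(c);
        }
        return wrappers;
    }

    public static void main(String[] args) {
        // Illustrative input: two chunks of one file, one chunk of another.
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk("/warehouse/t/bucket_0", 0L, 128L));
        chunks.add(new Chunk("/warehouse/t/bucket_0", 128L, 64L));
        chunks.add(new Chunk("/warehouse/t/bucket_1", 0L, 100L));

        Map<String, List<Chunk>> wrappers = groupByFile(chunks);
        for (Map.Entry<String, List<Chunk>> e : wrappers.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().size() + " chunk(s)");
        }
    }
}
```

In this sketch the number of groups equals the number of distinct input files, regardless of how many chunks each file was divided into.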


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.hive.ql.io.HiveInputFormat
HiveInputFormat.HiveInputSplit
 
Field Summary
static org.apache.commons.logging.Log LOG
           
 
Fields inherited from class org.apache.hadoop.hive.ql.io.HiveInputFormat
inputFormats, pathToPartitionInfo
 
Constructor Summary
BucketizedHiveInputFormat()
           
 
Method Summary
 org.apache.hadoop.mapred.RecordReader getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)
           
 org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits)
           
protected  org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.fs.Path path)
           
 
Methods inherited from class org.apache.hadoop.hive.ql.io.HiveInputFormat
configure, getPartitionDescFromPath, init, initColumnsNeeded, validateInput
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG
Constructor Detail

BucketizedHiveInputFormat

public BucketizedHiveInputFormat()
Method Detail

getRecordReader

public org.apache.hadoop.mapred.RecordReader getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                                                             org.apache.hadoop.mapred.JobConf job,
                                                             org.apache.hadoop.mapred.Reporter reporter)
                                                      throws IOException
Specified by:
getRecordReader in interface org.apache.hadoop.mapred.InputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Overrides:
getRecordReader in class HiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Throws:
IOException

listStatus

protected org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.mapred.JobConf job,
                                                       org.apache.hadoop.fs.Path path)
                                                throws IOException
Throws:
IOException

getSplits

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                       int numSplits)
                                                throws IOException
Specified by:
getSplits in interface org.apache.hadoop.mapred.InputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Overrides:
getSplits in class HiveInputFormat<K extends org.apache.hadoop.io.WritableComparable,V extends org.apache.hadoop.io.Writable>
Throws:
IOException


Copyright © 2010 The Apache Software Foundation