org.apache.hcatalog.rcfile
Class RCFileInputDriver

java.lang.Object
  extended by org.apache.hcatalog.mapreduce.HCatInputStorageDriver
      extended by org.apache.hcatalog.rcfile.RCFileInputDriver

public class RCFileInputDriver
extends HCatInputStorageDriver


Constructor Summary
RCFileInputDriver()
           
 
Method Summary
 HCatRecord convertToHCatRecord(org.apache.hadoop.io.WritableComparable ignored, org.apache.hadoop.io.Writable bytesRefArray)
          Converts the underlying value into an HCatRecord, the format HCatInputFormat uses to produce the required value type.
 org.apache.hadoop.mapreduce.InputFormat<? extends org.apache.hadoop.io.WritableComparable,? extends org.apache.hadoop.io.Writable> getInputFormat(java.util.Properties hcatProperties)
          Returns the InputFormat to use with this Storage Driver.
 void initialize(org.apache.hadoop.mapreduce.JobContext context, java.util.Properties hcatProperties)
           
 void setInputPath(org.apache.hadoop.mapreduce.JobContext jobContext, java.lang.String location)
          Set the data location for the input.
 void setOriginalSchema(org.apache.hadoop.mapreduce.JobContext jobContext, HCatSchema dataSchema)
          Set the schema of the data as originally published in HCat.
 void setOutputSchema(org.apache.hadoop.mapreduce.JobContext jobContext, HCatSchema desiredSchema)
          Set the consolidated schema for the HCatRecord data returned by the storage driver.
 void setPartitionValues(org.apache.hadoop.mapreduce.JobContext jobContext, java.util.Map<java.lang.String,java.lang.String> partitionValues)
          Sets the partition key values for the current partition.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RCFileInputDriver

public RCFileInputDriver()
Method Detail

getInputFormat

public org.apache.hadoop.mapreduce.InputFormat<? extends org.apache.hadoop.io.WritableComparable,? extends org.apache.hadoop.io.Writable> getInputFormat(java.util.Properties hcatProperties)
Description copied from class: HCatInputStorageDriver
Returns the InputFormat to use with this Storage Driver.

Specified by:
getInputFormat in class HCatInputStorageDriver
Parameters:
hcatProperties - the properties containing parameters required for initialization of InputFormat
Returns:
the InputFormat instance
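
The framework first configures the driver and then asks it for the InputFormat. The sketch below illustrates one plausible call sequence with a toy stand-in class; all names here are hypothetical and only the method names mirror this API, since the page does not specify the exact order the framework uses.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an input storage driver, recording the order in which
// its configuration methods are invoked. Hypothetical illustration only.
class ToyInputDriver {
    final List<String> calls = new ArrayList<>();

    void initialize()         { calls.add("initialize"); }
    void setInputPath()       { calls.add("setInputPath"); }
    void setOriginalSchema()  { calls.add("setOriginalSchema"); }
    void setOutputSchema()    { calls.add("setOutputSchema"); }
    void setPartitionValues() { calls.add("setPartitionValues"); }
    void getInputFormat()     { calls.add("getInputFormat"); }
}

public class Lifecycle {
    public static void main(String[] args) {
        ToyInputDriver d = new ToyInputDriver();
        // Configure the driver first, then ask it for the InputFormat.
        d.initialize();
        d.setInputPath();
        d.setOriginalSchema();
        d.setOutputSchema();
        d.setPartitionValues();
        d.getInputFormat();
        System.out.println(String.join(",", d.calls));
    }
}
```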

setInputPath

public void setInputPath(org.apache.hadoop.mapreduce.JobContext jobContext,
                         java.lang.String location)
                  throws java.io.IOException
Description copied from class: HCatInputStorageDriver
Set the data location for the input. The default implementation works for FileInputFormat-based input formats; override this for other input formats.

Overrides:
setInputPath in class HCatInputStorageDriver
Parameters:
jobContext - the job context object
location - the data location
Throws:
java.io.IOException - Signals that an I/O exception has occurred.

setOriginalSchema

public void setOriginalSchema(org.apache.hadoop.mapreduce.JobContext jobContext,
                              HCatSchema dataSchema)
                       throws java.io.IOException
Description copied from class: HCatInputStorageDriver
Set the schema of the data as originally published in HCat. The storage driver may validate that this matches the schema it already has (as Zebra does), or use it to build an HCatRecord matching the output schema.

Specified by:
setOriginalSchema in class HCatInputStorageDriver
Parameters:
jobContext - the job context object
dataSchema - the schema published in HCat for this data
Throws:
java.io.IOException - Signals that an I/O exception has occurred.

setOutputSchema

public void setOutputSchema(org.apache.hadoop.mapreduce.JobContext jobContext,
                            HCatSchema desiredSchema)
                     throws java.io.IOException
Description copied from class: HCatInputStorageDriver
Set the consolidated schema for the HCatRecord data returned by the storage driver. All tuples returned by the RecordReader should have this schema. Nulls should be inserted for columns not present in the data.

Specified by:
setOutputSchema in class HCatInputStorageDriver
Parameters:
jobContext - the job context object
desiredSchema - the schema to use as the consolidated schema
Throws:
java.io.IOException - Signals that an I/O exception has occurred.
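
The null-padding contract above can be illustrated with plain Java collections. The helper below is hypothetical (not part of the HCatalog API): given a data row and a wider consolidated schema, columns absent from the data come back as null.

```java
import java.util.*;

public class SchemaPad {
    // Hypothetical helper illustrating the setOutputSchema contract:
    // every returned tuple matches the consolidated schema, with null
    // inserted for columns not present in the underlying data.
    static List<Object> padToSchema(Map<String, Object> row, List<String> outputSchema) {
        List<Object> tuple = new ArrayList<>();
        for (String column : outputSchema) {
            tuple.add(row.get(column)); // null when the column is missing
        }
        return tuple;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("id", 7);
        row.put("name", "x");
        // The consolidated schema has a column ("ts") the data lacks.
        List<Object> tuple = padToSchema(row, Arrays.asList("id", "name", "ts"));
        System.out.println(tuple); // [7, x, null]
    }
}
```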

setPartitionValues

public void setPartitionValues(org.apache.hadoop.mapreduce.JobContext jobContext,
                               java.util.Map<java.lang.String,java.lang.String> partitionValues)
                        throws java.io.IOException
Description copied from class: HCatInputStorageDriver
Sets the partition key values for the current partition. These are passed to the storage driver so that it can add the partition key values to the output HCatRecord when they are not present on disk.

Specified by:
setPartitionValues in class HCatInputStorageDriver
Parameters:
jobContext - the job context object
partitionValues - a map with the partition key name as key and the partition value as value
Throws:
java.io.IOException - Signals that an I/O exception has occurred.
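
Since partition key columns are typically not stored in the data files themselves, the driver merges the supplied key/value map into each output record. The sketch below models that with plain Java maps; the helper name is hypothetical and not part of the HCatalog API.

```java
import java.util.*;

public class PartitionInject {
    // Hypothetical sketch of the setPartitionValues contract: partition
    // key columns are absent from the on-disk data, so the driver adds
    // them to each output record from the supplied key -> value map.
    static Map<String, String> withPartitionValues(Map<String, String> record,
                                                   Map<String, String> partitionValues) {
        Map<String, String> out = new LinkedHashMap<>(record);
        out.putAll(partitionValues); // add keys absent from the on-disk data
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "42");
        Map<String, String> parts = Collections.singletonMap("dt", "2011-06-01");
        System.out.println(withPartitionValues(record, parts)); // {id=42, dt=2011-06-01}
    }
}
```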

convertToHCatRecord

public HCatRecord convertToHCatRecord(org.apache.hadoop.io.WritableComparable ignored,
                                      org.apache.hadoop.io.Writable bytesRefArray)
                               throws java.io.IOException
Description copied from class: HCatInputStorageDriver
Converts the underlying value into the HCatRecord format that HCatInputFormat uses to produce the required value type. Implementers of a StorageDriver should override this method to convert their value type into an HCatRecord. A default implementation is provided for StorageDriver implementations built on an underlying InputFormat that already uses HCatRecord as its tuple.

Specified by:
convertToHCatRecord in class HCatInputStorageDriver
Parameters:
ignored - the key, which is not used
bytesRefArray - the underlying value to convert to HCatRecord
Throws:
java.io.IOException
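
The conversion step can be sketched with plain Java using toy types (not the real RCFile value class or HCatRecord): the driver decodes the InputFormat's raw value into a positional tuple of fields.

```java
import java.nio.charset.StandardCharsets;
import java.util.*;

public class ToyConvert {
    // Toy stand-in for convertToHCatRecord: decode an array of raw byte
    // fields (roughly the shape of an RCFile row value) into a tuple of
    // String fields. A real driver would use the table schema to choose
    // the type of each column rather than treating everything as text.
    static List<String> convert(byte[][] rawFields) {
        List<String> tuple = new ArrayList<>(rawFields.length);
        for (byte[] field : rawFields) {
            tuple.add(new String(field, StandardCharsets.UTF_8));
        }
        return tuple;
    }

    public static void main(String[] args) {
        byte[][] raw = { "7".getBytes(StandardCharsets.UTF_8),
                         "hello".getBytes(StandardCharsets.UTF_8) };
        System.out.println(convert(raw)); // [7, hello]
    }
}
```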

initialize

public void initialize(org.apache.hadoop.mapreduce.JobContext context,
                       java.util.Properties hcatProperties)
                throws java.io.IOException
Overrides:
initialize in class HCatInputStorageDriver
Throws:
java.io.IOException