org.apache.hadoop.hive.shims
Interface HadoopShims


public interface HadoopShims

In order to be compatible with multiple versions of Hadoop, all parts of the Hadoop interface that are not cross-version compatible are encapsulated in an implementation of this interface. Users should use the ShimLoader class as a factory to obtain an implementation of HadoopShims corresponding to the version of Hadoop currently on the classpath.
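For example, a minimal sketch of obtaining and using the shim (ShimLoader.getHadoopShims() is the factory method referred to above):

  import org.apache.hadoop.hive.shims.HadoopShims;
  import org.apache.hadoop.hive.shims.ShimLoader;

  public class ShimExample {
    public static void main(String[] args) {
      // ShimLoader selects the HadoopShims implementation that matches
      // the Hadoop version found on the classpath.
      HadoopShims shims = ShimLoader.getHadoopShims();
      // All version-dependent calls then go through the shim.
      System.out.println("Uses JobShell: " + shims.usesJobShell());
    }
  }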


Nested Class Summary
static interface HadoopShims.CombineFileInputFormatShim<K,V>
          CombineFileInputFormatShim.
static interface HadoopShims.InputSplitShim
          InputSplitShim.
static interface HadoopShims.MiniDFSShim
          Shim around the functions in MiniDFSCluster that Hive uses.
 
Method Summary
 int compareText(org.apache.hadoop.io.Text a, org.apache.hadoop.io.Text b)
          We define this function here to make the code compatible between Hadoop 0.17 and Hadoop 0.20.
 int createHadoopArchive(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path parentDir, org.apache.hadoop.fs.Path destDir, String archiveName)
           
 boolean fileSystemDeleteOnExit(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
          Calls fs.deleteOnExit(path) if such a function exists.
 long getAccessTime(org.apache.hadoop.fs.FileStatus file)
          Return the last access time of the given file.
 HadoopShims.CombineFileInputFormatShim getCombineFileInputFormat()
           
 String getInputFormatClassName()
           
 HadoopShims.MiniDFSShim getMiniDfs(org.apache.hadoop.conf.Configuration conf, int numDataNodes, boolean format, String[] racks)
          Returns a shim to wrap MiniDFSCluster.
 String[] getTaskJobIDs(org.apache.hadoop.mapred.TaskCompletionEvent t)
          getTaskJobIDs returns an array of String with two elements.
 void inputFormatValidateInput(org.apache.hadoop.mapred.InputFormat fmt, org.apache.hadoop.mapred.JobConf conf)
          Calls fmt.validateInput(conf) if such a function exists.
 boolean isJobPreparing(org.apache.hadoop.mapred.RunningJob job)
          Return true if the job has not switched to the RUNNING state yet and is still in the PREP state.
 void setFloatConf(org.apache.hadoop.conf.Configuration conf, String varName, float val)
          Wrapper for Configuration.setFloat, which was not introduced until 0.20.
 void setNullOutputFormat(org.apache.hadoop.mapred.JobConf conf)
          Hive uses side-effect files exclusively for its output.
 void setTmpFiles(String prop, String files)
          If JobClient.getCommandLineConfig exists, sets the given property/value pair in that Configuration object.
 boolean usesJobShell()
          Return true if the current version of Hadoop uses the JobShell for command line interpretation.
 

Method Detail

usesJobShell

boolean usesJobShell()
Return true if the current version of Hadoop uses the JobShell for command line interpretation.


isJobPreparing

boolean isJobPreparing(org.apache.hadoop.mapred.RunningJob job)
                       throws IOException
Return true if the job has not switched to the RUNNING state yet and is still in the PREP state.

Throws:
IOException
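For example, a caller might poll this method after submitting a job (a sketch; jobClient and jobConf are assumed to exist, and the enclosing method is assumed to declare IOException and InterruptedException):

  HadoopShims shims = ShimLoader.getHadoopShims();
  org.apache.hadoop.mapred.RunningJob job = jobClient.submitJob(jobConf);
  // Wait until the job leaves the PREP state.
  while (shims.isJobPreparing(job)) {
    Thread.sleep(1000);
  }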

fileSystemDeleteOnExit

boolean fileSystemDeleteOnExit(org.apache.hadoop.fs.FileSystem fs,
                               org.apache.hadoop.fs.Path path)
                               throws IOException
Calls fs.deleteOnExit(path) if such a function exists.

Returns:
true if the call was successful
Throws:
IOException

inputFormatValidateInput

void inputFormatValidateInput(org.apache.hadoop.mapred.InputFormat fmt,
                              org.apache.hadoop.mapred.JobConf conf)
                              throws IOException
Calls fmt.validateInput(conf) if such a function exists.

Throws:
IOException

setTmpFiles

void setTmpFiles(String prop,
                 String files)
If JobClient.getCommandLineConfig exists, sets the given property/value pair in that Configuration object. This applies to Hadoop 0.17 through 0.19.
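For illustration (a sketch; the property name "tmpfiles" and the file list are example values only):

  HadoopShims shims = ShimLoader.getHadoopShims();
  // On Hadoop 0.17 through 0.19 this sets the pair in JobClient's
  // command-line Configuration; on other versions the call has no effect.
  shims.setTmpFiles("tmpfiles", "/tmp/a.txt,/tmp/b.txt");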


getAccessTime

long getAccessTime(org.apache.hadoop.fs.FileStatus file)
Return the last access time of the given file.

Parameters:
file -
Returns:
last access time. -1 if not supported.
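For example (a sketch; fs and path are an assumed existing FileSystem and Path, and the enclosing method is assumed to declare IOException):

  org.apache.hadoop.fs.FileStatus status = fs.getFileStatus(path);
  // -1 indicates that this Hadoop version does not track access times.
  long accessTime = ShimLoader.getHadoopShims().getAccessTime(status);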

getMiniDfs

HadoopShims.MiniDFSShim getMiniDfs(org.apache.hadoop.conf.Configuration conf,
                                   int numDataNodes,
                                   boolean format,
                                   String[] racks)
                                   throws IOException
Returns a shim to wrap MiniDFSCluster. This is necessary because MiniDFSCluster was moved from the org.apache.hadoop.dfs package to org.apache.hadoop.hdfs.

Throws:
IOException
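For example, a test might start a two-data-node mini cluster through the shim (a sketch; conf is an assumed existing Configuration):

  HadoopShims.MiniDFSShim dfs =
      ShimLoader.getHadoopShims().getMiniDfs(conf, 2, true, null);
  // 2 data nodes, format the cluster on startup, default rack assignment (null).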

compareText

int compareText(org.apache.hadoop.io.Text a,
                org.apache.hadoop.io.Text b)
We define this function here to make the code compatible between Hadoop 0.17 and Hadoop 0.20. A Hive binary that compiled Text.compareTo(Text) against Hadoop 0.20 will not work with Hadoop 0.17, because in Hadoop 0.20 Text.compareTo(Text) is implemented in org.apache.hadoop.io.BinaryComparable; the Java compiler therefore references that class, which is not available in Hadoop 0.17.
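For example, comparing two Text values through the shim avoids compiling in a reference to BinaryComparable (a sketch):

  org.apache.hadoop.io.Text a = new org.apache.hadoop.io.Text("alpha");
  org.apache.hadoop.io.Text b = new org.apache.hadoop.io.Text("beta");
  // Same result as a.compareTo(b), but portable across Hadoop 0.17 and 0.20.
  int cmp = ShimLoader.getHadoopShims().compareText(a, b);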


getCombineFileInputFormat

HadoopShims.CombineFileInputFormatShim getCombineFileInputFormat()

getInputFormatClassName

String getInputFormatClassName()

setFloatConf

void setFloatConf(org.apache.hadoop.conf.Configuration conf,
                  String varName,
                  float val)
Wrapper for Configuration.setFloat, which was not introduced until 0.20.
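For example (a sketch; the variable name is illustrative only):

  org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
  // Sets the float value in a way that also works on versions
  // of Hadoop that predate Configuration.setFloat.
  ShimLoader.getHadoopShims().setFloatConf(conf, "example.sample.fraction", 0.1f);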


getTaskJobIDs

String[] getTaskJobIDs(org.apache.hadoop.mapred.TaskCompletionEvent t)
getTaskJobIDs returns an array of String with two elements. The first element is a string representing the task id and the second is a string representing the job id. This is necessary because TaskID and TaskAttemptID are not supported in Hadoop 0.17.
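For example (a sketch; job is an assumed RunningJob, and the enclosing method is assumed to declare IOException):

  org.apache.hadoop.mapred.TaskCompletionEvent[] events = job.getTaskCompletionEvents(0);
  for (org.apache.hadoop.mapred.TaskCompletionEvent event : events) {
    String[] ids = ShimLoader.getHadoopShims().getTaskJobIDs(event);
    String taskId = ids[0];  // first element: task id
    String jobId = ids[1];   // second element: job id
  }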


createHadoopArchive

int createHadoopArchive(org.apache.hadoop.conf.Configuration conf,
                        org.apache.hadoop.fs.Path parentDir,
                        org.apache.hadoop.fs.Path destDir,
                        String archiveName)
                        throws Exception
Throws:
Exception

setNullOutputFormat

void setNullOutputFormat(org.apache.hadoop.mapred.JobConf conf)
Hive uses side-effect files exclusively for its output. It also manages the setup/cleanup/commit of output from the Hive client, so it does not need support for the same inside the MapReduce framework. This routine sets the appropriate output format and any options needed to bypass setup/cleanup/commit support in the MR framework.
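For example (a sketch; in practice Hive applies this to the JobConf it is about to submit):

  org.apache.hadoop.mapred.JobConf jobConf = new org.apache.hadoop.mapred.JobConf();
  // Route the job's nominal output to a null output format; Hive's own
  // side-effect files and client-side commit logic handle the real output.
  ShimLoader.getHadoopShims().setNullOutputFormat(jobConf);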



Copyright © 2010 The Apache Software Foundation