org.apache.hadoop.hbase.mapreduce
Class MultithreadedTableMapper<K2,V2>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,KEYOUT,VALUEOUT>
      extended by org.apache.hadoop.hbase.mapreduce.TableMapper<K2,V2>
          extended by org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper<K2,V2>

public class MultithreadedTableMapper<K2,V2>
extends TableMapper<K2,V2>

Multithreaded implementation of org.apache.hadoop.hbase.mapreduce.TableMapper.

It can be used in place of a single-threaded TableMapper to improve throughput when the map operation is not CPU-bound.

Mapper implementations run by this class must be thread-safe.
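To illustrate the thread-safety requirement, here is a minimal sketch that uses only java.util.concurrent. It is not the HBase implementation: SharedMapperSketch, its map method, and countWithPool are hypothetical stand-ins for a mapper whose state is touched by several pool threads at once. It assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a mapper whose state is shared across pool threads. */
final class SharedMapperSketch {

    // Shared state must be thread-safe: an AtomicLong rather than a plain long field.
    private final AtomicLong processed = new AtomicLong();

    // Called concurrently from several pool threads, much like a map() call.
    void map(String record) {
        if (!record.isEmpty()) {
            processed.incrementAndGet();
        }
    }

    /** Runs map() over all records on a fixed-size pool; returns the number of non-empty records. */
    static long countWithPool(List<String> records, int threads) {
        SharedMapperSketch mapper = new SharedMapperSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (String record : records) {
                pool.execute(() -> mapper.map(record));
            }
        } finally {
            // Stop accepting new tasks; tasks already submitted still run.
            pool.shutdown();
        }
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("map tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for map tasks", e);
        }
        return mapper.processed.get();
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("row1", "", "row2", "row3");
        System.out.println(countWithPool(records, 10)); // prints 3
    }
}
```

Replacing the AtomicLong with a plain long field incremented via ++ would allow lost updates under contention, which is the kind of bug the thread-safety requirement guards against.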

The MapReduce job must be configured with the mapper to run via setMapperClass(org.apache.hadoop.mapreduce.Job, java.lang.Class) and with the number of threads the thread pool can use via setNumberOfThreads(org.apache.hadoop.mapreduce.Job, int). The default is 10 threads.
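A job configured this way might look like the following sketch. The class names MultithreadedScanJob and RowLengthMapper, the table name "my_table", the job name, and the thread count of 20 are all placeholders chosen for illustration. The sketch assumes Hadoop 2.x (for Job.getInstance) and that the HBase and Hadoop client libraries are on the classpath; it needs a reachable cluster to actually run.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class MultithreadedScanJob {

    /** Hypothetical mapper; it keeps no mutable fields, so concurrent use is safe. */
    public static class RowLengthMapper extends TableMapper<Text, LongWritable> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            // Emit the row key and its cell count; fresh objects per call avoid shared state.
            context.write(new Text(row.copyBytes()), new LongWritable(value.size()));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "multithreaded-scan"); // placeholder job name
        job.setJarByClass(MultithreadedScanJob.class);

        // Register MultithreadedTableMapper as the job's mapper.
        // "my_table" is a placeholder table name.
        TableMapReduceUtil.initTableMapperJob(
                "my_table", new Scan(), MultithreadedTableMapper.class,
                Text.class, LongWritable.class, job);

        // Tell the multithreaded wrapper which mapper to run and how many threads to use.
        MultithreadedTableMapper.setMapperClass(job, RowLengthMapper.class);
        MultithreadedTableMapper.setNumberOfThreads(job, 20); // overrides the default of 10

        // Map-only job that discards its output, to keep the sketch short.
        job.setNumberOfReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that MultithreadedTableMapper itself is passed to initTableMapperJob, while the application's own mapper is registered separately through setMapperClass.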


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
 
Field Summary
static String MAPPER_CLASS
           
static String NUMBER_OF_THREADS
           
 
Constructor Summary
MultithreadedTableMapper()
           
 
Method Summary
static <K2,V2> Class<org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> getMapperClass(org.apache.hadoop.mapreduce.JobContext job)
          Get the application's mapper class.
static int getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext job)
          The number of threads in the thread pool that will run the map function.
 void run(org.apache.hadoop.mapreduce.Mapper.Context context)
          Run the application's maps using a thread pool.
static <K2,V2> void setMapperClass(org.apache.hadoop.mapreduce.Job job, Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> cls)
          Set the application's mapper class.
static void setNumberOfThreads(org.apache.hadoop.mapreduce.Job job, int threads)
          Set the number of threads in the pool for running maps.
 
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
cleanup, map, setup
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NUMBER_OF_THREADS

public static final String NUMBER_OF_THREADS
See Also:
Constant Field Values

MAPPER_CLASS

public static final String MAPPER_CLASS
See Also:
Constant Field Values
Constructor Detail

MultithreadedTableMapper

public MultithreadedTableMapper()
Method Detail

getNumberOfThreads

public static int getNumberOfThreads(org.apache.hadoop.mapreduce.JobContext job)
The number of threads in the thread pool that will run the map function.

Parameters:
job - the job
Returns:
the number of threads

setNumberOfThreads

public static void setNumberOfThreads(org.apache.hadoop.mapreduce.Job job,
                                      int threads)
Set the number of threads in the pool for running maps.

Parameters:
job - the job to modify
threads - the new number of threads

getMapperClass

public static <K2,V2> Class<org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> getMapperClass(org.apache.hadoop.mapreduce.JobContext job)
Get the application's mapper class.

Type Parameters:
K2 - the map's output key type
V2 - the map's output value type
Parameters:
job - the job
Returns:
the mapper class to run

setMapperClass

public static <K2,V2> void setMapperClass(org.apache.hadoop.mapreduce.Job job,
                                          Class<? extends org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>> cls)
Set the application's mapper class.

Type Parameters:
K2 - the map output key type
V2 - the map output value type
Parameters:
job - the job to modify
cls - the class to use as the mapper

run

public void run(org.apache.hadoop.mapreduce.Mapper.Context context)
         throws IOException,
                InterruptedException
Run the application's maps using a thread pool.

Overrides:
run in class org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,K2,V2>
Throws:
IOException
InterruptedException


Copyright © 2007–2015 The Apache Software Foundation. All rights reserved.