org.apache.hadoop.hbase.mapred
Class GroupingTableMap

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by org.apache.hadoop.hbase.mapred.TableMap
          extended by org.apache.hadoop.hbase.mapred.GroupingTableMap
All Implemented Interfaces:
Closeable, JobConfigurable, Mapper

public class GroupingTableMap
extends TableMap

Extracts the grouping columns from each input record.


Field Summary
static String GROUP_COLUMNS
          JobConf parameter that specifies the columns used to produce the key passed to collect in the map phase.
 
Constructor Summary
GroupingTableMap()
          Default constructor.
 
Method Summary
 void configure(JobConf job)
          Default implementation that does nothing.
protected  Text createGroupKey(byte[][] vals)
          Create a key by concatenating multiple column values.
protected  byte[][] extractKeyValues(KeyedDataArrayWritable r)
          Extract column values from the current record.
static void initJob(String table, String columns, String groupColumns, Class<? extends TableMap> mapper, JobConf job)
          Use this before submitting a TableMap job.
 void map(HStoreKey key, KeyedDataArrayWritable value, TableOutputCollector output, Reporter reporter)
          Extract the grouping columns from value to construct a new key.
 
Methods inherited from class org.apache.hadoop.hbase.mapred.TableMap
initJob, map
 
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.io.Closeable
close
 

Field Detail

GROUP_COLUMNS

public static final String GROUP_COLUMNS
JobConf parameter that specifies the columns used to produce the key passed to collect in the map phase.

See Also:
Constant Field Values
Constructor Detail

GroupingTableMap

public GroupingTableMap()
Default constructor.

Method Detail

initJob

public static void initJob(String table,
                           String columns,
                           String groupColumns,
                           Class<? extends TableMap> mapper,
                           JobConf job)
Use this before submitting a TableMap job. It will appropriately set up the JobConf.

Parameters:
table - table to be processed
columns - space-separated list of columns to fetch
groupColumns - space-separated list of columns used to form the key passed to collect
mapper - map class
job - job configuration object
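
The columns and groupColumns arguments are both space-separated lists. The sketch below only illustrates that format with a hypothetical helper, parseColumns, which is not part of this class. It also assumes that every grouping column should appear among the fetched columns, since map skips records in which a grouping column is missing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ColumnListSketch {
    // Hypothetical helper (not part of GroupingTableMap): split a
    // space-separated column list such as "info:name info:city".
    static List<String> parseColumns(String spaceSeparated) {
        String trimmed = spaceSeparated.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // Column names here are made up for illustration.
        List<String> columns = parseColumns("info:name info:city info:age");
        List<String> groupColumns = parseColumns("info:city");

        // Assumption: a grouping column that is never fetched can never be
        // found in the record, so check the lists before configuring a job.
        if (!columns.containsAll(groupColumns)) {
            throw new IllegalArgumentException("groupColumns must be a subset of columns");
        }
        System.out.println("fetch: " + columns + ", group by: " + groupColumns);
    }
}
```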

configure

public void configure(JobConf job)
Description copied from class: MapReduceBase
Default implementation that does nothing.

Specified by:
configure in interface JobConfigurable
Overrides:
configure in class TableMap
Parameters:
job - the configuration

map

public void map(HStoreKey key,
                KeyedDataArrayWritable value,
                TableOutputCollector output,
                Reporter reporter)
         throws IOException
Extract the grouping columns from value to construct a new key. Pass the new key and value to reduce. If any of the grouping columns are not found in the value, the record is skipped.

Specified by:
map in class TableMap
Throws:
IOException
See Also:
TableMap.map(org.apache.hadoop.hbase.HStoreKey, org.apache.hadoop.hbase.io.KeyedDataArrayWritable, org.apache.hadoop.hbase.mapred.TableOutputCollector, org.apache.hadoop.mapred.Reporter)

extractKeyValues

protected byte[][] extractKeyValues(KeyedDataArrayWritable r)
Extract column values from the current record. This method returns null if any of the columns are not found. Override this method if you want to deal with nulls differently.

Parameters:
r - the current record
Returns:
array of byte values
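
The null-on-missing contract described above can be sketched as follows. This is an illustration only: a plain Map stands in for KeyedDataArrayWritable, and the groupColumns parameter is a hypothetical substitute for whatever the class reads from GROUP_COLUMNS.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class ExtractKeyValuesSketch {
    // Stand-in for the real method: 'record' maps column name to cell value
    // and is NOT the real KeyedDataArrayWritable type.
    static byte[][] extractKeyValues(Map<String, byte[]> record, String[] groupColumns) {
        byte[][] values = new byte[groupColumns.length][];
        for (int i = 0; i < groupColumns.length; i++) {
            byte[] value = record.get(groupColumns[i]);
            if (value == null) {
                return null; // a grouping column is missing: signal "skip this record"
            }
            values[i] = value;
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, byte[]> record = new HashMap<>();
        record.put("info:city", "Oslo".getBytes(StandardCharsets.UTF_8));

        byte[][] found = extractKeyValues(record, new String[] {"info:city"});
        byte[][] missing = extractKeyValues(record, new String[] {"info:city", "info:zip"});

        System.out.println("found is null: " + (found == null));
        System.out.println("missing is null: " + (missing == null));
    }
}
```

An override that treats nulls differently could, for example, substitute an empty byte array for a missing column instead of returning null.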

createGroupKey

protected Text createGroupKey(byte[][] vals)
Create a key by concatenating multiple column values. Override this method to produce a different kind of key.

Parameters:
vals - the column values to concatenate
Returns:
key generated by concatenating multiple column values
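
A minimal sketch of key concatenation, under stated assumptions: the values are UTF-8 text, a single space separates them, and a String stands in for Text. The page above does not specify the separator or the encoding, so treat both as illustrative choices rather than the class's actual behavior.

```java
import java.nio.charset.StandardCharsets;

class CreateGroupKeySketch {
    // Stand-in for the real method: returns String instead of Text.
    // Separator (a single space) and UTF-8 decoding are assumptions.
    static String createGroupKey(byte[][] vals) {
        if (vals == null) {
            return null; // nothing to group on
        }
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < vals.length; i++) {
            if (i > 0) {
                key.append(' ');
            }
            key.append(new String(vals[i], StandardCharsets.UTF_8));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        byte[][] vals = {
            "Oslo".getBytes(StandardCharsets.UTF_8),
            "NO".getBytes(StandardCharsets.UTF_8)
        };
        System.out.println(createGroupKey(vals));
    }
}
```

A subclass that needs a different key type, such as a hash of the values, would replace the body of this method while leaving the extraction step unchanged.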


Copyright © 2006 The Apache Software Foundation