org.apache.hadoop.hbase.mapred
Class GroupingTableMap

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by org.apache.hadoop.hbase.mapred.TableMap<Text,MapWritable>
          extended by org.apache.hadoop.hbase.mapred.GroupingTableMap
All Implemented Interfaces:
Closeable, JobConfigurable, Mapper<HStoreKey,MapWritable,Text,MapWritable>

public class GroupingTableMap
extends TableMap<Text,MapWritable>

Extracts the grouping columns from each input record and uses them to build the output key.


Field Summary
static String GROUP_COLUMNS
          JobConf parameter that specifies the columns used to produce the key passed to collect in the map phase.
protected  Text[] m_columns
           
 
Constructor Summary
GroupingTableMap()
           
 
Method Summary
 void configure(JobConf job)
          Default implementation that does nothing.
protected  Text createGroupKey(byte[][] vals)
          Create a key by concatenating multiple column values.
protected  byte[][] extractKeyValues(MapWritable r)
          Extract column values from the current record.
static void initJob(String table, String columns, String groupColumns, Class<? extends TableMap> mapper, JobConf job)
          Use this before submitting a TableMap job.
 void map(HStoreKey key, MapWritable value, OutputCollector<Text,MapWritable> output, Reporter reporter)
          Extract the grouping columns from value to construct a new key.
 
Methods inherited from class org.apache.hadoop.hbase.mapred.TableMap
initJob
 
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.io.Closeable
close
 

Field Detail

GROUP_COLUMNS

public static final String GROUP_COLUMNS
JobConf parameter that specifies the columns used to produce the key passed to collect in the map phase.

See Also:
Constant Field Values

m_columns

protected Text[] m_columns

Constructor Detail

GroupingTableMap

public GroupingTableMap()

Method Detail

initJob

public static void initJob(String table,
                           String columns,
                           String groupColumns,
                           Class<? extends TableMap> mapper,
                           JobConf job)
Use this before submitting a TableMap job. It will appropriately set up the JobConf.

Parameters:
table - table to be processed
columns - space-separated list of columns to fetch
groupColumns - space-separated list of columns used to form the key passed to collect
mapper - map class
job - job configuration object
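
Both column arguments are space-separated strings, and map skips any record that lacks a grouping column. A grouping column that is not also fetched would therefore presumably cause every record to be skipped. The sketch below is a plain-Java pre-flight check that could be run on the two strings before calling initJob. It is not part of the HBase API: the class ColumnListCheck, its methods, and the column names are hypothetical, and it assumes each entry names a column exactly.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HBase: checks initJob-style column arguments.
final class ColumnListCheck {

    // Split a space-separated column list, tolerating extra whitespace.
    static List<String> parse(String spaceSeparated) {
        List<String> names = new ArrayList<>();
        for (String token : spaceSeparated.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                names.add(token);
            }
        }
        return names;
    }

    // Return the grouping columns that do not appear in the fetched column list.
    static List<String> missingGroupColumns(String columns, String groupColumns) {
        Set<String> fetched = new LinkedHashSet<>(parse(columns));
        List<String> missing = new ArrayList<>();
        for (String name : parse(groupColumns)) {
            if (!fetched.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // "info:site" is used for grouping but never fetched, so it is reported.
        System.out.println(missingGroupColumns("info:name info:dept", "info:dept info:site"));
    }
}
```

An empty result means every grouping column is also in the fetch list.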

configure

public void configure(JobConf job)
Default implementation that does nothing.

Specified by:
configure in interface JobConfigurable
Overrides:
configure in class MapReduceBase
Parameters:
job - the configuration

map

public void map(HStoreKey key,
                MapWritable value,
                OutputCollector<Text,MapWritable> output,
                Reporter reporter)
         throws IOException
Extract the grouping columns from value to construct a new key. Pass the new key and value to reduce. If any of the grouping columns are not found in the value, the record is skipped.

Specified by:
map in interface Mapper<HStoreKey,MapWritable,Text,MapWritable>
Specified by:
map in class TableMap<Text,MapWritable>
Parameters:
key - the input key.
value - the input value.
output - collects mapped keys and values.
reporter - facility to report progress.
Throws:
IOException
See Also:
TableMap.map(org.apache.hadoop.hbase.HStoreKey, org.apache.hadoop.io.MapWritable, org.apache.hadoop.mapred.OutputCollector, org.apache.hadoop.mapred.Reporter)
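
The skip-or-emit behaviour described for map can be modelled without Hadoop or HBase on the classpath. In this sketch a Map<String, byte[]> stands in for MapWritable, a String stands in for Text, and a returned list stands in for the OutputCollector. The single space used to join values is an assumption, since this page does not document the separator. GroupingSketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java model of the documented map() behaviour; not the HBase implementation.
final class GroupingSketch {

    // Build the group key for one record, or return null if a grouping column is absent.
    static String groupKey(Map<String, byte[]> record, String[] groupColumns) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < groupColumns.length; i++) {
            byte[] value = record.get(groupColumns[i]);
            if (value == null) {
                return null; // missing grouping column: the caller skips this record
            }
            if (i > 0) {
                key.append(' '); // assumed separator
            }
            key.append(new String(value, StandardCharsets.UTF_8));
        }
        return key.toString();
    }

    // Emit one (groupKey, record) pair per record that has every grouping column.
    static List<Map.Entry<String, Map<String, byte[]>>> mapAll(
            List<Map<String, byte[]>> records, String[] groupColumns) {
        List<Map.Entry<String, Map<String, byte[]>>> output = new ArrayList<>();
        for (Map<String, byte[]> record : records) {
            String key = groupKey(record, groupColumns);
            if (key != null) {
                output.add(new AbstractMap.SimpleImmutableEntry<>(key, record));
            }
        }
        return output;
    }
}
```

The record itself is passed through unchanged as the value; only the key is new.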

extractKeyValues

protected byte[][] extractKeyValues(MapWritable r)
Extract column values from the current record. This method returns null if any of the columns are not found. Override this method if you want to deal with nulls differently.

Parameters:
r - the current record
Returns:
array of column values, one byte array per column, or null if any column is not found
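
One way to "deal with nulls differently" is to substitute an empty value for a missing column, so that incomplete records are still grouped instead of being dropped. The sketch below models this in plain Java, with a Map<String, byte[]> standing in for MapWritable. StrictExtractor mirrors the documented default; LenientExtractor is a hypothetical override. Neither class is part of HBase.

```java
import java.util.Map;

// Plain-Java model of the documented default: give up when any column is missing.
class StrictExtractor {
    protected final String[] columns;

    StrictExtractor(String... columns) {
        this.columns = columns;
    }

    protected byte[][] extractKeyValues(Map<String, byte[]> r) {
        byte[][] values = new byte[columns.length][];
        for (int i = 0; i < columns.length; i++) {
            values[i] = r.get(columns[i]);
            if (values[i] == null) {
                return null; // one missing column invalidates the whole record
            }
        }
        return values;
    }
}

// Hypothetical override: treat a missing column as an empty value.
class LenientExtractor extends StrictExtractor {
    private static final byte[] EMPTY = new byte[0];

    LenientExtractor(String... columns) {
        super(columns);
    }

    @Override
    protected byte[][] extractKeyValues(Map<String, byte[]> r) {
        byte[][] values = new byte[columns.length][];
        for (int i = 0; i < columns.length; i++) {
            byte[] v = r.get(columns[i]);
            values[i] = (v != null) ? v : EMPTY;
        }
        return values;
    }
}
```

With the lenient variant, records that lack a grouping column end up sharing keys with records whose value for that column is genuinely empty, which may or may not be acceptable for a given job.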

createGroupKey

protected Text createGroupKey(byte[][] vals)
Create a key by concatenating multiple column values. Override this function in order to produce different types of keys.

Parameters:
vals - the column values to concatenate
Returns:
key generated by concatenating multiple column values
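
A reason to override createGroupKey is that simple concatenation can make distinct value tuples collide. The plain-Java sketch below assumes the default joins UTF-8 decoded values with a single space (the separator is not documented on this page) and shows a hypothetical override that joins with a tab instead. String stands in for Text, and both class names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

// Plain-Java model of createGroupKey; the space separator is an assumption.
class SpaceJoinedKey {
    protected String createGroupKey(byte[][] vals) {
        return join(vals, " ");
    }

    // Decode each value as UTF-8 and join with the given separator; null in, null out.
    static String join(byte[][] vals, String separator) {
        if (vals == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < vals.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(new String(vals[i], StandardCharsets.UTF_8));
        }
        return sb.toString();
    }
}

// Hypothetical override: a tab separator keeps values containing spaces distinct.
class TabJoinedKey extends SpaceJoinedKey {
    @Override
    protected String createGroupKey(byte[][] vals) {
        return join(vals, "\t");
    }
}
```

Under the space-joined model, the tuples ("a b", "c") and ("a", "b c") both produce the key "a b c"; the tab-joined variant keeps them apart as long as the values themselves contain no tabs.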


Copyright © 2006 The Apache Software Foundation