org.apache.mahout.df.mapreduce.partial
Class Step1Mapper
java.lang.Object
org.apache.hadoop.mapreduce.Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
org.apache.mahout.df.mapreduce.MapredMapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
org.apache.mahout.df.mapreduce.partial.Step1Mapper
public class Step1Mapper
- extends MapredMapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
First step of the Partial Data Builder. Builds the trees using the data available in the InputSplit.
Predicts the out-of-bag (oob) classes for each tree in its growing partition (input split).
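The out-of-bag idea mentioned above can be illustrated with a minimal sketch. It assumes the usual bagging scheme, where each tree is grown on a bootstrap sample drawn with replacement from its partition, and the instances that were never drawn form that tree's out-of-bag set. The class OobSketch and its methods are hypothetical names used only for illustration; they are not part of Mahout, and this page does not confirm how Step1Mapper samples its data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class, not part of Mahout.
final class OobSketch {
  private OobSketch() {}

  /** Draws n instances with replacement and flags every index drawn at least once. */
  static boolean[] bootstrap(int n, Random rng) {
    boolean[] inBag = new boolean[n];
    for (int i = 0; i < n; i++) {
      inBag[rng.nextInt(n)] = true;
    }
    return inBag;
  }

  /** Collects the indices that were never drawn: the out-of-bag instances. */
  static List<Integer> oobIndices(boolean[] inBag) {
    List<Integer> oob = new ArrayList<>();
    for (int i = 0; i < inBag.length; i++) {
      if (!inBag[i]) {
        oob.add(i);
      }
    }
    return oob;
  }

  public static void main(String[] args) {
    // Assumed partition size and seed, chosen only for the demonstration.
    boolean[] inBag = bootstrap(20, new Random(1L));
    System.out.println("oob instances: " + oobIndices(inBag));
  }
}
```

In this scheme a tree would be asked to classify only its out-of-bag instances, since those are the ones it did not see while growing.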
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
Method Summary
protected void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
protected void configure(java.lang.Long seed, int partition, int numMapTasks, int numTrees)
    Useful when testing.
int getFirstTreeId()
protected void map(org.apache.hadoop.io.LongWritable key, org.apache.hadoop.io.Text value, org.apache.hadoop.mapreduce.Mapper.Context context)
static int nbTrees(int numMaps, int numTrees, int partition)
    Compute the number of trees for a given partition.
protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
run
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Step1Mapper
public Step1Mapper()
getFirstTreeId
public int getFirstTreeId()
setup
protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
throws java.io.IOException,
java.lang.InterruptedException
- Overrides:
setup
in class MapredMapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
- Throws:
java.io.IOException
java.lang.InterruptedException
configure
protected void configure(java.lang.Long seed,
int partition,
int numMapTasks,
int numTrees)
- Useful when testing
- Parameters:
seed -
partition - current mapper inputSplit partition
numMapTasks - number of running map tasks
numTrees - total number of trees in the forest
nbTrees
public static int nbTrees(int numMaps,
int numTrees,
int partition)
- Compute the number of trees for a given partition. The first partition (0) may be assigned more trees than the
other partitions because of the remainder.
- Parameters:
numMaps - total number of maps (partitions)
numTrees - total number of trees to build
partition - partition to compute the number of trees for
- Returns:
the number of trees for the given partition
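The tree allocation described for nbTrees can be sketched as follows. This is a minimal illustration under two assumptions: every partition gets the integer quotient numTrees / numMaps, and partition 0 additionally takes whatever is left over. The class PartitionTrees and the helper firstTreeId are hypothetical. In particular, firstTreeId only shows one plausible way to derive a starting tree ID (by numbering trees consecutively across partitions) and is not confirmed by this page to be how getFirstTreeId() works.

```java
// Hypothetical illustration class, not part of Mahout.
final class PartitionTrees {
  private PartitionTrees() {}

  /** Trees assigned to a partition; partition 0 also absorbs the remainder (assumed rule). */
  public static int nbTrees(int numMaps, int numTrees, int partition) {
    int base = numTrees / numMaps;
    if (partition == 0) {
      return base + numTrees % numMaps;
    }
    return base;
  }

  /** Hypothetical helper: first tree ID if trees are numbered consecutively by partition. */
  public static int firstTreeId(int numMaps, int numTrees, int partition) {
    int id = 0;
    for (int p = 0; p < partition; p++) {
      id += nbTrees(numMaps, numTrees, p);
    }
    return id;
  }

  public static void main(String[] args) {
    int numMaps = 3;
    int numTrees = 10;
    for (int p = 0; p < numMaps; p++) {
      System.out.println("partition " + p + ": "
          + nbTrees(numMaps, numTrees, p) + " trees, first id "
          + firstTreeId(numMaps, numTrees, p));
    }
  }
}
```

With 10 trees over 3 partitions, this rule gives 4 trees to partition 0 and 3 to each of the others, so the per-partition counts always sum to numTrees.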
map
protected void map(org.apache.hadoop.io.LongWritable key,
org.apache.hadoop.io.Text value,
org.apache.hadoop.mapreduce.Mapper.Context context)
throws java.io.IOException,
java.lang.InterruptedException
- Overrides:
map
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
- Throws:
java.io.IOException
java.lang.InterruptedException
cleanup
protected void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
throws java.io.IOException,
java.lang.InterruptedException
- Overrides:
cleanup
in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
- Throws:
java.io.IOException
java.lang.InterruptedException
Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.