org.apache.mahout.df.mapreduce.partial
Class Step2Mapper

java.lang.Object
  extended by org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
      extended by org.apache.mahout.df.mapreduce.partial.Step2Mapper

public class Step2Mapper
extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>

Second step of PartialBuilder. Using the trees built in the first step, computes the out-of-bag (oob) predictions of each tree, except the trees built from this mapper's own partition, on all instances of the partition.
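
The description above can be illustrated with a minimal sketch. It is not the Mahout implementation: `Step2Sketch`, `BuiltTree`, and `predictPartition` are hypothetical names, instances are simplified to `double[]`, and a tree is reduced to a function returning a predicted label. It only shows the selection rule, where trees grown on the mapper's own partition are skipped and every other tree classifies every instance of the partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

class Step2Sketch {

    /** Hypothetical stand-in for a tree returned by the first step. */
    static final class BuiltTree {
        final int partition;                       // partition whose data grew the tree
        final ToIntFunction<double[]> classifier;  // maps an instance to a predicted label

        BuiltTree(int partition, ToIntFunction<double[]> classifier) {
            this.partition = partition;
            this.classifier = classifier;
        }
    }

    /**
     * Returns one array of predictions per concerned tree, in tree order.
     * A tree is concerned when it was NOT grown on the given partition.
     */
    static List<int[]> predictPartition(int partition,
                                        List<BuiltTree> trees,
                                        List<double[]> instances) {
        List<int[]> predictions = new ArrayList<>();
        for (BuiltTree tree : trees) {
            if (tree.partition == partition) {
                continue; // skip trees grown on this partition's own data
            }
            int[] perInstance = new int[instances.size()];
            for (int i = 0; i < instances.size(); i++) {
                perInstance[i] = tree.classifier.applyAsInt(instances.get(i));
            }
            predictions.add(perInstance);
        }
        return predictions;
    }
}
```

Because the trees of other partitions never saw this partition's data during training, each instance of the partition is out-of-bag for all of them, which is why no per-instance bagging check appears in the sketch.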


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
 
Constructor Summary
Step2Mapper()
           
 
Method Summary
protected  void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
           
 void configure(int partition, Dataset dataset, TreeID[] keys, Node[] trees, int numInstances)
          Useful for testing.
protected  void map(org.apache.hadoop.io.LongWritable key, org.apache.hadoop.io.Text value, org.apache.hadoop.mapreduce.Mapper.Context context)
           
static int nbConcerned(int numMaps, int numTrees, int partition)
          Computes the number of trees that need to classify the instances of this mapper's partition.
protected  void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.Mapper
run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Step2Mapper

public Step2Mapper()

Method Detail

setup

protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
              throws java.io.IOException,
                     java.lang.InterruptedException
Overrides:
setup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
java.io.IOException
java.lang.InterruptedException

nbConcerned

public static int nbConcerned(int numMaps,
                              int numTrees,
                              int partition)
Computes the number of trees that need to classify the instances of this mapper's partition.

Parameters:
numMaps - total number of map tasks
numTrees - total number of trees in the forest
partition - mapper's partition
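
As a rough illustration of the count described above, the sketch below subtracts the trees built by the mapper's own partition from the total. How the first step distributes trees across partitions is not stated on this page, so the sketch assumes an even split with the remainder assigned to partition 0. The class `NbConcernedSketch`, the helper `treesBuiltBy`, and the range check are all hypothetical and may differ from the actual method.

```java
class NbConcernedSketch {

    // Hypothetical helper: number of trees built by one partition, ASSUMING an
    // even split of numTrees over numMaps with the remainder given to partition 0.
    static int treesBuiltBy(int numMaps, int numTrees, int partition) {
        int perPartition = numTrees / numMaps;
        if (partition == 0) {
            perPartition += numTrees % numMaps;
        }
        return perPartition;
    }

    // Trees that must classify a partition = all trees minus the partition's own.
    static int nbConcerned(int numMaps, int numTrees, int partition) {
        if (partition < 0 || partition >= numMaps) {
            throw new IllegalArgumentException("partition out of range: " + partition);
        }
        return numTrees - treesBuiltBy(numMaps, numTrees, partition);
    }

    public static void main(String[] args) {
        // 10 trees over 3 map tasks: partition 0 builds 4 trees, the others 3 each.
        System.out.println(nbConcerned(3, 10, 0)); // prints 6
        System.out.println(nbConcerned(3, 10, 1)); // prints 7
    }
}
```

Under this assumption the counts for all partitions sum to `numTrees * (numMaps - 1)`, since each tree classifies every partition except the one it was built from.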

configure

public void configure(int partition,
                      Dataset dataset,
                      TreeID[] keys,
                      Node[] trees,
                      int numInstances)
Useful for testing. Configures the mapper without using a JobConf.
TODO: the partitions of the keys are not needed; the tree ids should suffice.

Parameters:
partition - mapper's partition
dataset -
keys - keys returned by the first step
trees - trees returned by the first step
numInstances - number of instances in the mapper's partition

map

protected void map(org.apache.hadoop.io.LongWritable key,
                   org.apache.hadoop.io.Text value,
                   org.apache.hadoop.mapreduce.Mapper.Context context)
            throws java.io.IOException,
                   java.lang.InterruptedException
Overrides:
map in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
java.io.IOException
java.lang.InterruptedException

cleanup

protected void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
                throws java.io.IOException,
                       java.lang.InterruptedException
Overrides:
cleanup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
java.io.IOException
java.lang.InterruptedException


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.