org.apache.mahout.df.mapred.partial
Class Step1Mapper

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by org.apache.mahout.df.mapred.MapredMapper
          extended by org.apache.mahout.df.mapred.partial.Step1Mapper
All Implemented Interfaces:
java.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>

public class Step1Mapper
extends MapredMapper
implements org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>

First step of the Partial Data Builder. Builds the trees using the data available in the InputSplit, then predicts the out-of-bag (oob) classes for each tree within its own growing partition (input split).
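
As an illustration of the out-of-bag idea mentioned above, the following is a minimal sketch of bootstrap sampling in general, not code taken from Step1Mapper. The class OobSketch and its methods bootstrapMask and countOob are hypothetical names introduced only for this example. Instances that are never drawn into a tree's bootstrap sample are the ones that tree can be evaluated on.

```java
import java.util.Random;

// Hypothetical illustration of in-bag / out-of-bag bookkeeping for one tree.
final class OobSketch {

    // Draws n indices with replacement from [0, n) and marks each drawn index as in-bag.
    static boolean[] bootstrapMask(int n, Random rng) {
        boolean[] inBag = new boolean[n];
        for (int i = 0; i < n; i++) {
            inBag[rng.nextInt(n)] = true;
        }
        return inBag;
    }

    // Counts the instances that were never drawn, i.e. the out-of-bag instances.
    static int countOob(boolean[] inBag) {
        int oob = 0;
        for (boolean b : inBag) {
            if (!b) {
                oob++;
            }
        }
        return oob;
    }

    public static void main(String[] args) {
        boolean[] inBag = bootstrapMask(100, new Random(42L));
        System.out.println("oob instances: " + countOob(inBag));
    }
}
```

In a partial builder, the instances of the mask would be the rows of a single input split, so each tree's oob predictions only cover data from its own partition.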


Constructor Summary
Step1Mapper()
           
 
Method Summary
 void close()
           
 void configure(org.apache.hadoop.mapred.JobConf job)
           
protected  void configure(java.lang.Long seed, int partition, int numMapTasks, int numTrees)
          Configures the mapper directly, without a JobConf; useful when testing.
 int getFirstTreeId()
           
 void map(org.apache.hadoop.io.LongWritable key, org.apache.hadoop.io.Text value, org.apache.hadoop.mapred.OutputCollector<TreeID,MapredOutput> output, org.apache.hadoop.mapred.Reporter reporter)
           
static int nbTrees(int numMaps, int numTrees, int partition)
          Compute the number of trees for a given partition.
 
Methods inherited from class org.apache.mahout.df.mapred.MapredMapper
configure, getDataset, getTreeBuilder, isNoOutput, isOobEstimate
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Step1Mapper

public Step1Mapper()
Method Detail

getFirstTreeId

public int getFirstTreeId()

configure

public void configure(org.apache.hadoop.mapred.JobConf job)
Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
Overrides:
configure in class MapredMapper

configure

protected void configure(java.lang.Long seed,
                         int partition,
                         int numMapTasks,
                         int numTrees)
Configures the mapper directly, without a JobConf; useful when testing.

Parameters:
seed - seed for the random number generator
partition - current mapper inputSplit partition
numMapTasks - number of running map tasks
numTrees - total number of trees in the forest
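
This page does not describe how getFirstTreeId() relates to these parameters. One plausible reading, stated here purely as an assumption, is that a mapper's first tree ID is the number of trees assigned to all lower-numbered partitions. The sketch below illustrates that reading; FirstTreeIdSketch and firstTreeId are hypothetical names, and the nbTrees helper is a local stand-in written from the documented behaviour of Step1Mapper.nbTrees, not the library code.

```java
// Hypothetical sketch: deriving a per-partition starting tree ID.
final class FirstTreeIdSketch {

    // Local stand-in: trees are split evenly, partition 0 takes the remainder.
    static int nbTrees(int numMaps, int numTrees, int partition) {
        int perPartition = numTrees / numMaps;
        if (partition == 0) {
            perPartition += numTrees % numMaps;
        }
        return perPartition;
    }

    // Assumed rule: sum the tree counts of every partition before this one.
    static int firstTreeId(int numMaps, int numTrees, int partition) {
        int id = 0;
        for (int p = 0; p < partition; p++) {
            id += nbTrees(numMaps, numTrees, p);
        }
        return id;
    }

    public static void main(String[] args) {
        int numMaps = 3;
        int numTrees = 10;
        for (int p = 0; p < numMaps; p++) {
            System.out.println("partition " + p + " starts at tree "
                + firstTreeId(numMaps, numTrees, p));
        }
    }
}
```

Under this assumption, tree IDs would be contiguous and unique across mappers, which a test can check by configuring several mappers with different partition values.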

nbTrees

public static int nbTrees(int numMaps,
                          int numTrees,
                          int partition)
Computes the number of trees for a given partition. The first partition (0) may be assigned more trees than the other partitions because it also receives the remainder of the division.

Parameters:
numMaps - total number of maps (partitions)
numTrees - total number of trees to build
partition - partition to compute the number of trees for
Returns:
the number of trees to build in the given partition
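
The distribution described for nbTrees can be sketched as follows. This is a minimal illustration written from the description on this page (even split, remainder to partition 0), not a copy of the library source; the class name NbTreesSketch is hypothetical.

```java
// Hypothetical sketch of the documented tree distribution.
final class NbTreesSketch {

    // Each partition gets numTrees / numMaps trees; partition 0 also takes the remainder.
    static int nbTrees(int numMaps, int numTrees, int partition) {
        int perPartition = numTrees / numMaps;
        if (partition == 0) {
            perPartition += numTrees % numMaps;
        }
        return perPartition;
    }

    public static void main(String[] args) {
        int numMaps = 3;
        int numTrees = 10;
        for (int p = 0; p < numMaps; p++) {
            System.out.println("partition " + p + ": "
                + nbTrees(numMaps, numTrees, p) + " trees");
        }
    }
}
```

With 10 trees and 3 maps, this sketch assigns 4 trees to partition 0 and 3 trees to each of partitions 1 and 2, so the per-partition counts always sum to numTrees.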

map

public void map(org.apache.hadoop.io.LongWritable key,
                org.apache.hadoop.io.Text value,
                org.apache.hadoop.mapred.OutputCollector<TreeID,MapredOutput> output,
                org.apache.hadoop.mapred.Reporter reporter)
         throws java.io.IOException
Specified by:
map in interface org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,TreeID,MapredOutput>
Throws:
java.io.IOException

close

public void close()
           throws java.io.IOException
Specified by:
close in interface java.io.Closeable
Overrides:
close in class org.apache.hadoop.mapred.MapReduceBase
Throws:
java.io.IOException


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.