org.apache.mahout.df.mapreduce.partial
Class Step0Job

java.lang.Object
  extended by org.apache.mahout.df.mapreduce.partial.Step0Job

public class Step0Job
extends java.lang.Object

Preparation step of the partial MapReduce builder. Computes per-partition statistics that the builder uses in later steps.


Nested Class Summary
protected static class Step0Job.Step0Mapper
          Outputs the first key and the size of the partition
static class Step0Job.Step0Output
          Output of the step 0 mappers
 
Constructor Summary
Step0Job(org.apache.hadoop.fs.Path base, org.apache.hadoop.fs.Path dataPath, org.apache.hadoop.fs.Path datasetPath)
           
 
Method Summary
protected  Step0Job.Step0Output[] parseOutput(org.apache.hadoop.mapreduce.JobContext job)
          Extracts the output and processes it
protected static Step0Job.Step0Output[] processOutput(java.util.List<java.lang.Integer> keys, java.util.List<Step0Job.Step0Output> values)
          Replaces the first id for each partition in Hadoop's order
 Step0Job.Step0Output[] run(org.apache.hadoop.conf.Configuration conf)
          Computes the partitions' info in Hadoop's order
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Step0Job

public Step0Job(org.apache.hadoop.fs.Path base,
                org.apache.hadoop.fs.Path dataPath,
                org.apache.hadoop.fs.Path datasetPath)
Parameters:
base - base directory
dataPath - data used in the first step
datasetPath -
Method Detail

run

public Step0Job.Step0Output[] run(org.apache.hadoop.conf.Configuration conf)
                           throws java.io.IOException,
                                  java.lang.ClassNotFoundException,
                                  java.lang.InterruptedException
Computes the partitions' info in Hadoop's order

Parameters:
conf - configuration
Returns:
partitions' info in Hadoop's order
Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.InterruptedException

parseOutput

protected Step0Job.Step0Output[] parseOutput(org.apache.hadoop.mapreduce.JobContext job)
                                      throws java.io.IOException
Extracts the output and processes it

Returns:
info for each partition in Hadoop's order
Throws:
java.io.IOException

processOutput

protected static Step0Job.Step0Output[] processOutput(java.util.List<java.lang.Integer> keys,
                                                      java.util.List<Step0Job.Step0Output> values)
Replaces the first id for each partition in Hadoop's order

Parameters:
keys -
values -
Returns:
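
The parameter and return descriptions above are empty, so the following is only a minimal sketch of what "replaces the first id for each partition in Hadoop's order" could look like. It assumes that each mapper reports the first key it saw and the size of its partition, that sorting by first key recovers file order, and that keys holds the Hadoop partition index of each value. The Step0Sketch and PartitionInfo names and their fields are hypothetical stand-ins, not Mahout's actual classes.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class Step0Sketch {

  /** Hypothetical stand-in for Step0Job.Step0Output. */
  static final class PartitionInfo {
    final long firstKey; // first key seen by the mapper (assumed to reflect file order)
    final int size;      // number of instances in the partition
    int firstId;         // id of the partition's first instance, filled in below

    PartitionInfo(long firstKey, int size) {
      this.firstKey = firstKey;
      this.size = size;
    }
  }

  /**
   * Assigns a first instance id to every partition and returns the
   * partitions indexed by their Hadoop partition number.
   */
  static PartitionInfo[] processOutput(List<Integer> keys, List<PartitionInfo> values) {
    if (keys.size() != values.size()) {
      throw new IllegalArgumentException("keys and values must have the same size");
    }
    int n = values.size();

    // Sort a copy of the references by first key to recover file order.
    PartitionInfo[] fileOrder = values.toArray(new PartitionInfo[0]);
    Arrays.sort(fileOrder, Comparator.comparingLong(p -> p.firstKey));

    // The first id of a partition is the total size of the partitions before it.
    int nextId = 0;
    for (PartitionInfo p : fileOrder) {
      p.firstId = nextId;
      nextId += p.size;
    }

    // Place each partition at the index Hadoop assigned to it.
    PartitionInfo[] hadoopOrder = new PartitionInfo[n];
    for (int i = 0; i < n; i++) {
      hadoopOrder[keys.get(i)] = values.get(i);
    }
    return hadoopOrder;
  }
}
```

With keys [2, 0, 1] and partitions (firstKey 0, size 10), (firstKey 500, size 7), (firstKey 200, size 5), this sketch assigns first ids 0, 15 and 10 respectively, and returns them at indices 2, 0 and 1.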


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.