org.apache.mahout.clustering.dirichlet
Class DirichletDriver

java.lang.Object
  extended by org.apache.mahout.clustering.dirichlet.DirichletDriver

public class DirichletDriver
extends java.lang.Object


Field Summary
static java.lang.String ALPHA_0_KEY
           
static java.lang.String MODEL_FACTORY_KEY
           
static java.lang.String MODEL_PROTOTYPE_KEY
           
static java.lang.String NUM_CLUSTERS_KEY
           
static java.lang.String PROTOTYPE_SIZE_KEY
           
static java.lang.String STATE_IN_KEY
           
 
Method Summary
static DirichletState<VectorWritable> createState(java.lang.String modelFactory, java.lang.String modelPrototype, int prototypeSize, int numModels, double alpha_0)
          Creates a DirichletState object from the given arguments.
static void main(java.lang.String[] args)
           
static void runClustering(java.lang.String input, java.lang.String stateIn, java.lang.String output)
          Runs the clustering job using the supplied arguments.
static void runIteration(java.lang.String input, java.lang.String stateIn, java.lang.String stateOut, java.lang.String modelFactory, java.lang.String modelPrototype, int prototypeSize, int numClusters, double alpha_0, int numReducers)
          Runs one iteration of the job using the supplied arguments.
static void runJob(java.lang.String input, java.lang.String output, java.lang.String modelFactory, int numClusters, int maxIterations, double alpha_0, int numReducers)
          Deprecated. This overload presumes 2-d, dense vector model prototypes.
static void runJob(java.lang.String input, java.lang.String output, java.lang.String modelFactory, java.lang.String modelPrototype, int prototypeSize, int numClusters, int maxIterations, double alpha_0, int numReducers)
          Runs the job using the supplied arguments.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STATE_IN_KEY

public static final java.lang.String STATE_IN_KEY
See Also:
Constant Field Values

MODEL_FACTORY_KEY

public static final java.lang.String MODEL_FACTORY_KEY
See Also:
Constant Field Values

MODEL_PROTOTYPE_KEY

public static final java.lang.String MODEL_PROTOTYPE_KEY
See Also:
Constant Field Values

PROTOTYPE_SIZE_KEY

public static final java.lang.String PROTOTYPE_SIZE_KEY
See Also:
Constant Field Values

NUM_CLUSTERS_KEY

public static final java.lang.String NUM_CLUSTERS_KEY
See Also:
Constant Field Values

ALPHA_0_KEY

public static final java.lang.String ALPHA_0_KEY
See Also:
Constant Field Values
Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

runJob

@Deprecated
public static void runJob(java.lang.String input,
                                     java.lang.String output,
                                     java.lang.String modelFactory,
                                     int numClusters,
                                     int maxIterations,
                                     double alpha_0,
                                     int numReducers)
                   throws java.lang.ClassNotFoundException,
                          java.lang.InstantiationException,
                          java.lang.IllegalAccessException,
                          java.io.IOException,
                          java.lang.SecurityException,
                          java.lang.NoSuchMethodException,
                          java.lang.reflect.InvocationTargetException
Deprecated. This overload presumes 2-d, dense vector model prototypes.

Runs the job using the supplied arguments.

Parameters:
input - the directory pathname for input points
output - the directory pathname for output points
modelFactory - the String ModelDistribution class name to use
numClusters - the number of models
maxIterations - the maximum number of iterations
alpha_0 - the alpha_0 value for the DirichletDistribution
numReducers - the number of Reducers desired
Throws:
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
java.io.IOException
java.lang.SecurityException
java.lang.NoSuchMethodException
java.lang.reflect.InvocationTargetException

runJob

public static void runJob(java.lang.String input,
                          java.lang.String output,
                          java.lang.String modelFactory,
                          java.lang.String modelPrototype,
                          int prototypeSize,
                          int numClusters,
                          int maxIterations,
                          double alpha_0,
                          int numReducers)
                   throws java.lang.ClassNotFoundException,
                          java.lang.InstantiationException,
                          java.lang.IllegalAccessException,
                          java.io.IOException,
                          java.lang.SecurityException,
                          java.lang.NoSuchMethodException,
                          java.lang.reflect.InvocationTargetException
Runs the job using the supplied arguments.

Parameters:
input - the directory pathname for input points
output - the directory pathname for output points
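modelFactory - the String ModelDistribution class name to use
modelPrototype - the String class name of the Vector used to initialize the model factory
prototypeSize - the number of dimensions of the model prototype vector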
numClusters - the number of models
maxIterations - the maximum number of iterations
alpha_0 - the alpha_0 value for the DirichletDistribution
numReducers - the number of Reducers desired
Throws:
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
java.io.IOException
java.lang.SecurityException
java.lang.NoSuchMethodException
java.lang.reflect.InvocationTargetException
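
The parameter list does not spell out how maxIterations relates to runIteration and runClustering. The following is a minimal sketch, assuming that runJob chains iterations by feeding each iteration's output state to the next iteration as its input state, and that the final state is what a clustering pass would read. The class IterationPlan, the state-N directory naming, and the helper names are hypothetical and chosen only for illustration; the sketch computes path names and does not call Mahout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Mahout. It only computes path names.
final class IterationPlan {

  // One step of the plan: which state directory is read and which is written.
  static final class Step {
    final String stateIn;
    final String stateOut;

    Step(String stateIn, String stateOut) {
      this.stateIn = stateIn;
      this.stateOut = stateOut;
    }
  }

  // Assumed layout: <output>/state-0 holds the initial state, and
  // iteration i reads state-i and writes state-(i+1).
  static List<Step> plan(String output, int maxIterations) {
    List<Step> steps = new ArrayList<>();
    String stateIn = output + "/state-0";
    for (int i = 0; i < maxIterations; i++) {
      String stateOut = output + "/state-" + (i + 1);
      steps.add(new Step(stateIn, stateOut));
      stateIn = stateOut; // the next iteration starts from this output
    }
    return steps;
  }

  // The state a final clustering pass would read: the last stateOut,
  // or the initial state when no iterations run.
  static String finalState(String output, int maxIterations) {
    return output + "/state-" + Math.max(0, maxIterations);
  }

  public static void main(String[] args) {
    for (Step s : plan("output", 3)) {
      System.out.println(s.stateIn + " -> " + s.stateOut);
    }
    System.out.println("cluster from " + finalState("output", 3));
  }
}
```

Under this assumption, each Step would map to one runIteration(input, stateIn, stateOut, ...) call, and the value returned by finalState would be passed as the stateIn argument of runClustering.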

createState

public static DirichletState<VectorWritable> createState(java.lang.String modelFactory,
                                                         java.lang.String modelPrototype,
                                                         int prototypeSize,
                                                         int numModels,
                                                         double alpha_0)
                                                  throws java.lang.ClassNotFoundException,
                                                         java.lang.InstantiationException,
                                                         java.lang.IllegalAccessException,
                                                         java.lang.SecurityException,
                                                         java.lang.NoSuchMethodException,
                                                         java.lang.IllegalArgumentException,
                                                         java.lang.reflect.InvocationTargetException
Creates a DirichletState object from the given arguments. Note that the modelFactory is presumed to be a subclass of VectorModelDistribution that can be initialized with a concrete Vector prototype.

Parameters:
modelFactory - a String which is the class name of the model factory
modelPrototype - a String which is the class name of the Vector used to initialize the factory
prototypeSize - an int number of dimensions of the model prototype vector
numModels - an int number of models to be created
alpha_0 - the double alpha_0 argument to the algorithm
Returns:
an initialized DirichletState
Throws:
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
java.lang.SecurityException
java.lang.NoSuchMethodException
java.lang.IllegalArgumentException
java.lang.reflect.InvocationTargetException
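
The checked exceptions in the throws clause are the ones raised by Java reflection, which suggests that createState loads both classes by name. The following is a minimal sketch of that pattern using only the JDK, assuming the prototype class has a public constructor taking an int size and the factory class has a public constructor taking the prototype. FakeVector, FakeFactory, and ReflectiveFactoryDemo are hypothetical stand-ins, not Mahout classes, and the real constructor signatures may differ.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

// Hypothetical stand-in for a Vector implementation with a size constructor.
class FakeVector {
  final int size;

  public FakeVector(int size) {
    this.size = size;
  }
}

// Hypothetical stand-in for a model factory initialized with a prototype.
class FakeFactory {
  final FakeVector prototype;

  public FakeFactory(FakeVector prototype) {
    this.prototype = prototype;
  }
}

final class ReflectiveFactoryDemo {

  // Loads the prototype class by name and builds it with the given size, then
  // loads the factory class by name and passes the prototype to its constructor.
  static Object create(String factoryName, String prototypeName, int prototypeSize)
      throws ClassNotFoundException, InstantiationException, IllegalAccessException,
             NoSuchMethodException, InvocationTargetException {
    Class<?> prototypeClass = Class.forName(prototypeName);
    Constructor<?> sizeCtor = prototypeClass.getConstructor(int.class);
    Object prototype = sizeCtor.newInstance(prototypeSize);

    Class<?> factoryClass = Class.forName(factoryName);
    Constructor<?> factoryCtor = factoryClass.getConstructor(prototypeClass);
    return factoryCtor.newInstance(prototype);
  }

  public static void main(String[] args) throws Exception {
    FakeFactory factory = (FakeFactory) create(
        FakeFactory.class.getName(), FakeVector.class.getName(), 2);
    System.out.println("prototype size = " + factory.prototype.size);
  }
}
```

In this sketch a misspelled class name surfaces as ClassNotFoundException, and a class without the expected constructor surfaces as NoSuchMethodException. SecurityException and IllegalArgumentException are unchecked, so the sketch does not need to declare them.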

runIteration

public static void runIteration(java.lang.String input,
                                java.lang.String stateIn,
                                java.lang.String stateOut,
                                java.lang.String modelFactory,
                                java.lang.String modelPrototype,
                                int prototypeSize,
                                int numClusters,
                                double alpha_0,
                                int numReducers)
Runs one iteration of the job using the supplied arguments.

Parameters:
input - the directory pathname for input points
stateIn - the directory pathname for input state
stateOut - the directory pathname for output state
modelFactory - the class name of the model factory class
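modelPrototype - the class name of the Vector used to initialize the model factory
prototypeSize - the number of dimensions of the model prototype vector
numClusters - the number of clusters
alpha_0 - the alpha_0 value for the DirichletDistribution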
numReducers - the number of Reducers desired

runClustering

public static void runClustering(java.lang.String input,
                                 java.lang.String stateIn,
                                 java.lang.String output)
Runs the clustering job using the supplied arguments.

Parameters:
input - the directory pathname for input points
stateIn - the directory pathname for input state
output - the directory pathname for output points


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.