org.apache.mahout.clustering.kmeans
Class KMeansDriver
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.mahout.common.AbstractJob
org.apache.mahout.clustering.kmeans.KMeansDriver
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
public class KMeansDriver
- extends AbstractJob
Method Summary
static org.apache.hadoop.fs.Path
buildClusters(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
int maxIterations,
java.lang.String delta,
boolean runSequential)
Iterate over the input vectors to produce cluster directories for each iteration.
static void
clusterData(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
java.lang.String convergenceDelta,
boolean runSequential)
Run the job using supplied arguments.
static void
main(java.lang.String[] args)
static void
run(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
double convergenceDelta,
int maxIterations,
boolean runClustering,
boolean runSequential)
Iterate over the input vectors to produce clusters and, if requested, use the
results of the final iteration to cluster the input vectors.
static void
run(org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
double convergenceDelta,
int maxIterations,
boolean runClustering,
boolean runSequential)
Iterate over the input vectors to produce clusters and, if requested, use the
results of the final iteration to cluster the input vectors.
int
run(java.lang.String[] args)
Methods inherited from class org.apache.mahout.common.AbstractJob
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, getInputPath, getOption, getOutputPath, hasOption, keyFor, maybePut, parseArguments, parseDirectories, prepareJob, shouldRunNextPhase
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
KMeansDriver
public KMeansDriver()
main
public static void main(java.lang.String[] args)
throws java.lang.Exception
- Throws:
java.lang.Exception
run
public int run(java.lang.String[] args)
throws java.lang.Exception
- Throws:
java.lang.Exception
run
public static void run(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
double convergenceDelta,
int maxIterations,
boolean runClustering,
boolean runSequential)
throws java.io.IOException,
java.lang.InterruptedException,
java.lang.ClassNotFoundException,
java.lang.InstantiationException,
java.lang.IllegalAccessException
- Iterate over the input vectors to produce clusters and, if requested, use the
results of the final iteration to cluster the input vectors.
- Parameters:
conf - the Configuration to use
input - the directory pathname for input points
clustersIn - the directory pathname for initial & computed clusters
output - the directory pathname for output points
measure - the DistanceMeasure to use
convergenceDelta - the convergence delta value
maxIterations - the maximum number of iterations
runClustering - true if points are to be clustered after iterations are completed
runSequential - if true, execute the sequential algorithm
- Throws:
java.io.IOException
java.lang.InterruptedException
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
run
public static void run(org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
double convergenceDelta,
int maxIterations,
boolean runClustering,
boolean runSequential)
throws java.io.IOException,
java.lang.InterruptedException,
java.lang.ClassNotFoundException,
java.lang.InstantiationException,
java.lang.IllegalAccessException
- Iterate over the input vectors to produce clusters and, if requested, use the
results of the final iteration to cluster the input vectors.
- Parameters:
input - the directory pathname for input points
clustersIn - the directory pathname for initial & computed clusters
output - the directory pathname for output points
measure - the DistanceMeasure to use
convergenceDelta - the convergence delta value
maxIterations - the maximum number of iterations
runClustering - true if points are to be clustered after iterations are completed
runSequential - if true, execute the sequential algorithm
- Throws:
java.io.IOException
java.lang.InterruptedException
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
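The control flow described for both run overloads (build the clusters, then cluster the input only if runClustering is true) can be illustrated with a small sketch. This is not Mahout's code: the class name RunSketch and the functional parameters are hypothetical stand-ins for the buildClusters and clusterData steps, and the sketch returns the final clusters only to make the flow observable, whereas the documented run methods return void.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Sketch only: mirrors the control flow described for run(), not Mahout's code.
final class RunSketch {
    private RunSketch() {}

    // buildClusters: hypothetical stand-in for the iteration step, yields the final clusters.
    // clusterData:   hypothetical stand-in for the step that assigns input points to clusters.
    static <C> C run(Supplier<C> buildClusters, Consumer<C> clusterData, boolean runClustering) {
        C finalClusters = buildClusters.get();   // always iterate to produce clusters
        if (runClustering) {
            clusterData.accept(finalClusters);   // optional: cluster the input points
        }
        return finalClusters;
    }
}
```

With runClustering set to false, only the cluster-building step runs and the points themselves are never assigned.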
buildClusters
public static org.apache.hadoop.fs.Path buildClusters(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
int maxIterations,
java.lang.String delta,
boolean runSequential)
throws java.io.IOException,
java.lang.InterruptedException,
java.lang.ClassNotFoundException,
java.lang.InstantiationException,
java.lang.IllegalAccessException
- Iterate over the input vectors to produce cluster directories for each iteration
- Parameters:
conf - the Configuration to use
input - the directory pathname for input points
clustersIn - the directory pathname for initial & computed clusters
output - the directory pathname for output points
measure - the DistanceMeasure to use
maxIterations - the maximum number of iterations
delta - the convergence delta value
runSequential - if true, execute the sequential algorithm
- Returns:
- the Path of the final clusters directory
- Throws:
java.io.IOException
java.lang.InterruptedException
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
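To make the roles of maxIterations and the convergence delta concrete, here is a minimal in-memory k-means iteration in plain Java. It is a sketch under stated assumptions, not Mahout's implementation: arrays replace the input and cluster directories, Euclidean distance stands in for the DistanceMeasure argument, the class name BuildClustersSketch is hypothetical, and convergence is assumed to mean that no center moved farther than the delta in the last pass.

```java
// Sketch only: plain-Java k-means iteration, not Mahout's implementation.
final class BuildClustersSketch {
    private BuildClustersSketch() {}

    // Euclidean distance; stands in for the DistanceMeasure argument.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Index of the center closest to the point.
    static int nearest(double[] point, double[][] centers) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (distance(point, centers[c]) < distance(point, centers[best])) {
                best = c;
            }
        }
        return best;
    }

    // Updates centers in place and returns the number of iterations executed.
    static int buildClusters(double[][] points, double[][] centers,
                             double convergenceDelta, int maxIterations) {
        int k = centers.length;
        int dim = centers[0].length;
        int iteration = 0;
        boolean converged = false;
        while (!converged && iteration < maxIterations) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment pass: add each point to the running sum of its nearest center.
            for (double[] p : points) {
                int c = nearest(p, centers);
                counts[c]++;
                for (int i = 0; i < dim; i++) {
                    sums[c][i] += p[i];
                }
            }
            // Update pass: move each center to the mean of its points.
            converged = true;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                double[] updated = new double[dim];
                for (int i = 0; i < dim; i++) {
                    updated[i] = sums[c][i] / counts[c];
                }
                if (distance(updated, centers[c]) > convergenceDelta) {
                    converged = false; // this center still moved more than the delta
                }
                centers[c] = updated;
            }
            iteration++;
        }
        return iteration;
    }
}
```

The loop stops at whichever comes first: every center moving no more than convergenceDelta, or maxIterations passes.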
clusterData
public static void clusterData(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path input,
org.apache.hadoop.fs.Path clustersIn,
org.apache.hadoop.fs.Path output,
DistanceMeasure measure,
java.lang.String convergenceDelta,
boolean runSequential)
throws java.io.IOException,
java.lang.InterruptedException,
java.lang.ClassNotFoundException,
java.lang.InstantiationException,
java.lang.IllegalAccessException
- Run the job using supplied arguments
- Parameters:
conf - the Configuration to use
input - the directory pathname for input points
clustersIn - the directory pathname for input clusters
output - the directory pathname for output points
measure - the DistanceMeasure to use
convergenceDelta - the convergence delta value
runSequential - if true, execute the sequential algorithm
- Throws:
java.io.IOException
java.lang.InterruptedException
java.lang.ClassNotFoundException
java.lang.InstantiationException
java.lang.IllegalAccessException
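As a rough illustration of what clustering the input points against a fixed set of clusters involves, the following sketch assigns each point to its nearest center. It is an assumption-laden stand-in rather than Mahout's job: arrays replace the input, clustersIn, and output directories, squared Euclidean distance stands in for the DistanceMeasure argument, and the class name ClusterDataSketch is hypothetical.

```java
// Sketch only: nearest-center assignment in plain Java, not Mahout's implementation.
final class ClusterDataSketch {
    private ClusterDataSketch() {}

    // Squared Euclidean distance; stands in for the DistanceMeasure argument.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns, for each point, the index of the closest center.
    static int[] clusterData(double[][] points, double[][] centers) {
        int[] assignments = new int[points.length];
        for (int p = 0; p < points.length; p++) {
            int best = 0;
            double bestDistance = squaredDistance(points[p], centers[0]);
            for (int c = 1; c < centers.length; c++) {
                double d = squaredDistance(points[p], centers[c]);
                if (d < bestDistance) {
                    best = c;
                    bestDistance = d;
                }
            }
            assignments[p] = best;
        }
        return assignments;
    }
}
```

The centers are read-only here: this step labels points and does not move any cluster.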
Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.