org.apache.mahout.clustering.iterator
Class ClusterIterator
java.lang.Object
org.apache.mahout.clustering.iterator.ClusterIterator
public class ClusterIterator
- extends Object
A clustering iterator that operates on a set of Vector data and a prior ClusterClassifier that has been
initialized with a set of models. The implementation is algorithm-neutral: it works for any iterative clustering
algorithm (currently k-means, fuzzy k-means and Dirichlet) that processes all of the input vectors in each iteration.
The ClusteringPolicy configured on the cluster classifier selects the desired clustering algorithm.
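The idea of an algorithm-neutral loop driven by a pluggable policy can be illustrated with a minimal plain-Java sketch. It is not Mahout code and makes several assumptions: 1-D points stand in for Vector, an array of centres stands in for the classifier's models, and the names PolicyIterationSketch, Policy, HARD and SOFT are hypothetical. SOFT is a simplified inverse-square-distance weighting, not the fuzzy k-means membership formula.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of a policy-driven clustering loop; not Mahout code. */
final class PolicyIterationSketch {

  /** Hypothetical stand-in for a clustering policy: turns distances into training weights. */
  interface Policy {
    double[] select(double[] distances);
  }

  /** Hard assignment (k-means style): weight 1 for the nearest centre, 0 elsewhere. */
  static final Policy HARD = distances -> {
    int best = 0;
    for (int k = 1; k < distances.length; k++) {
      if (distances[k] < distances[best]) {
        best = k;
      }
    }
    double[] weights = new double[distances.length];
    weights[best] = 1.0;
    return weights;
  };

  /** Soft assignment (simplified): weights proportional to inverse squared distance. */
  static final Policy SOFT = distances -> {
    double[] weights = new double[distances.length];
    double total = 0.0;
    for (int k = 0; k < distances.length; k++) {
      weights[k] = 1.0 / (distances[k] * distances[k] + 1e-9);
      total += weights[k];
    }
    for (int k = 0; k < weights.length; k++) {
      weights[k] /= total;
    }
    return weights;
  };

  /** Runs numIterations full passes over data, starting from the prior centres. */
  static double[] iterate(List<Double> data, double[] prior, Policy policy, int numIterations) {
    double[] centres = prior.clone();
    for (int iteration = 0; iteration < numIterations; iteration++) {
      double[] weightedSum = new double[centres.length];
      double[] totalWeight = new double[centres.length];
      for (double x : data) { // every input point is visited in every iteration
        double[] distances = new double[centres.length];
        for (int k = 0; k < centres.length; k++) {
          distances[k] = Math.abs(x - centres[k]);
        }
        double[] weights = policy.select(distances);
        for (int k = 0; k < centres.length; k++) {
          weightedSum[k] += weights[k] * x;
          totalWeight[k] += weights[k];
        }
      }
      for (int k = 0; k < centres.length; k++) { // recompute each centre after the pass
        if (totalWeight[k] > 0.0) {
          centres[k] = weightedSum[k] / totalWeight[k];
        }
      }
    }
    return centres;
  }

  public static void main(String[] args) {
    List<Double> data = Arrays.asList(1.0, 2.0, 3.0, 10.0, 11.0, 12.0);
    double[] prior = {0.0, 5.0};
    System.out.println(Arrays.toString(iterate(data, prior, HARD, 5)));
    System.out.println(Arrays.toString(iterate(data, prior, SOFT, 5)));
  }
}
```

Only the Policy argument changes between the two calls in main; the loop itself is identical, which is the sense in which such an iterator can be algorithm-neutral. With the sample data, the HARD policy settles on centres 2.0 and 11.0.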
Method Summary
ClusterClassifier iterate(Iterable<Vector> data, ClusterClassifier classifier, int numIterations)
- Iterate over data using a prior-trained ClusterClassifier, for a number of iterations
void iterateMR(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path inPath, org.apache.hadoop.fs.Path priorPath, org.apache.hadoop.fs.Path outPath, int numIterations)
- Iterate over data using a prior-trained ClusterClassifier, for a number of iterations using a mapreduce implementation
void iterateSeq(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path inPath, org.apache.hadoop.fs.Path priorPath, org.apache.hadoop.fs.Path outPath, int numIterations)
- Iterate over data using a prior-trained ClusterClassifier, for a number of iterations using a sequential implementation
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PRIOR_PATH_KEY
public static final String PRIOR_PATH_KEY
- See Also:
- Constant Field Values
ClusterIterator
public ClusterIterator()
iterate
public ClusterClassifier iterate(Iterable<Vector> data,
ClusterClassifier classifier,
int numIterations)
- Iterate over data using a prior-trained ClusterClassifier, for a number of iterations
- Parameters:
data
- an Iterable<Vector> of input vectors
classifier
- a prior ClusterClassifier, configured with the ClusteringPolicy to use
numIterations
- the int number of iterations to perform
- Returns:
- the posterior ClusterClassifier
iterateSeq
public void iterateSeq(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path inPath,
org.apache.hadoop.fs.Path priorPath,
org.apache.hadoop.fs.Path outPath,
int numIterations)
throws IOException
- Iterate over data using a prior-trained ClusterClassifier, for a number of iterations using a sequential
implementation
- Parameters:
conf
- the Configuration
inPath
- a Path to the input VectorWritables
priorPath
- a Path to the prior classifier
outPath
- a Path to the output directory
numIterations
- the int number of iterations to perform
- Throws:
IOException
iterateMR
public void iterateMR(org.apache.hadoop.conf.Configuration conf,
org.apache.hadoop.fs.Path inPath,
org.apache.hadoop.fs.Path priorPath,
org.apache.hadoop.fs.Path outPath,
int numIterations)
throws IOException,
InterruptedException,
ClassNotFoundException
- Iterate over data using a prior-trained ClusterClassifier, for a number of iterations using a mapreduce
implementation
- Parameters:
conf
- the Configuration
inPath
- a Path to the input VectorWritables
priorPath
- a Path to the prior classifier
outPath
- a Path to the output directory
numIterations
- the int number of iterations to perform
- Throws:
IOException
InterruptedException
ClassNotFoundException
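The difference between a sequential pass and a mapreduce pass can be sketched in plain Java. This is not Mahout or Hadoop code; it assumes that a mapreduce iteration gathers per-cluster statistics independently on each input split and then merges them, and it uses hard nearest-centre assignment on 1-D points. The names SplitMergeSketch, partialStats, merge, sequentialPass and splitPass are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of one clustering pass, done in one loop and as split-then-merge. */
final class SplitMergeSketch {

  /** "Map" side: per-cluster totals for one split; row 0 holds sums, row 1 holds counts. */
  static double[][] partialStats(List<Double> split, double[] centres) {
    double[][] stats = new double[2][centres.length];
    for (double x : split) {
      int nearest = 0;
      for (int k = 1; k < centres.length; k++) {
        if (Math.abs(x - centres[k]) < Math.abs(x - centres[nearest])) {
          nearest = k;
        }
      }
      stats[0][nearest] += x;
      stats[1][nearest] += 1.0;
    }
    return stats;
  }

  /** "Reduce" side: adds up the partial statistics and recomputes the centres. */
  static double[] merge(List<double[][]> partials, double[] prior) {
    double[] posterior = prior.clone();
    for (int k = 0; k < prior.length; k++) {
      double sum = 0.0;
      double count = 0.0;
      for (double[][] stats : partials) {
        sum += stats[0][k];
        count += stats[1][k];
      }
      if (count > 0.0) {
        posterior[k] = sum / count;
      }
    }
    return posterior;
  }

  /** One pass with all data handled as a single split (sequential flavour). */
  static double[] sequentialPass(List<Double> data, double[] prior) {
    List<double[][]> single = new ArrayList<>();
    single.add(partialStats(data, prior));
    return merge(single, prior);
  }

  /** One pass with the data cut into splits that are processed independently. */
  static double[] splitPass(List<List<Double>> splits, double[] prior) {
    List<double[][]> partials = new ArrayList<>();
    for (List<Double> split : splits) {
      partials.add(partialStats(split, prior)); // one call per split, order does not matter
    }
    return merge(partials, prior);
  }

  public static void main(String[] args) {
    double[] prior = {0.0, 5.0};
    List<Double> all = Arrays.asList(1.0, 2.0, 3.0, 10.0, 11.0, 12.0);
    List<List<Double>> splits = Arrays.asList(
        Arrays.asList(1.0, 10.0, 3.0),
        Arrays.asList(2.0, 11.0, 12.0));
    System.out.println(Arrays.toString(sequentialPass(all, prior)));
    System.out.println(Arrays.toString(splitPass(splits, prior)));
  }
}
```

Both calls in main produce the centres 1.5 and 9.0 for the sample data: because sums and counts are additive, how the input is divided among splits does not change the result of a pass.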
Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.