org.apache.mahout.clustering.kmeans
Class Cluster

java.lang.Object
  extended by org.apache.mahout.clustering.ClusterBase
      extended by org.apache.mahout.clustering.kmeans.Cluster
All Implemented Interfaces:
org.apache.hadoop.io.Writable, Printable

public class Cluster
extends ClusterBase


Constructor Summary
Cluster()
          For (de)serialization as a Writable
Cluster(java.lang.String clusterId)
          Construct a new cluster with the given id as identifier
Cluster(Vector center)
          Construct a new cluster with the given point as its center
Cluster(Vector center, int clusterId)
          Construct a new cluster with the given point as its center and the given id
 
Method Summary
 void addPoint(Vector point)
          Add the point to the cluster
 void addPoints(int count, Vector delta)
          Add the given number of points to the cluster
 java.lang.String asFormatString()
           
 Vector computeCentroid()
          Compute the centroid by averaging the pointTotals
 boolean computeConvergence(DistanceMeasure measure, double convergenceDelta)
          Returns whether the cluster has converged by comparing its center and centroid.
static Cluster decodeCluster(java.lang.String formattedString)
          Decodes and returns a Cluster from the formattedString.
static java.lang.String formatCluster(Cluster cluster)
          Format the cluster for output
 java.lang.String getIdentifier()
           
 double getStd()
           
 boolean isConverged()
           
 void readFields(java.io.DataInput in)
          Reads in only the id
 void recomputeCenter()
          Compute the centroid and set the center to it.
 java.lang.String toString()
           
 void write(java.io.DataOutput out)
          Writes out only the id
 
Methods inherited from class org.apache.mahout.clustering.ClusterBase
asFormatString, asJsonString, formatVector, getCenter, getId, getNumPoints, getPointTotal, setCenter, setId, setNumPoints, setPointTotal
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Cluster

public Cluster(Vector center)
Construct a new cluster with the given point as its center

Parameters:
center - the center point

Cluster

public Cluster()
For (de)serialization as a Writable


Cluster

public Cluster(Vector center,
               int clusterId)
Construct a new cluster with the given point as its center and the given id

Parameters:
center - the center point
clusterId - the id of the new cluster

Cluster

public Cluster(java.lang.String clusterId)
Construct a new cluster with the given id as identifier

Method Detail

formatCluster

public static java.lang.String formatCluster(Cluster cluster)
Format the cluster for output

Parameters:
cluster - the Cluster
Returns:
the String representation of the Cluster

asFormatString

public java.lang.String asFormatString()
Specified by:
asFormatString in class ClusterBase

decodeCluster

public static Cluster decodeCluster(java.lang.String formattedString)
Decodes and returns a Cluster from the formattedString.

Parameters:
formattedString - a String produced by formatCluster
Returns:
a decoded Cluster, not null
Throws:
java.lang.IllegalArgumentException - when the string is wrongly formatted

write

public void write(java.io.DataOutput out)
           throws java.io.IOException
Description copied from class: ClusterBase
Writes out only the id

Specified by:
write in interface org.apache.hadoop.io.Writable
Overrides:
write in class ClusterBase
Parameters:
out - The DataOutput
Throws:
java.io.IOException

readFields

public void readFields(java.io.DataInput in)
                throws java.io.IOException
Description copied from class: ClusterBase
Reads in only the id

Specified by:
readFields in interface org.apache.hadoop.io.Writable
Overrides:
readFields in class ClusterBase
Throws:
java.io.IOException
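
The write/readFields pair follows the usual Writable pattern: an instance is created with the no-argument constructor and then populated from the stream. The following is a minimal sketch of that round trip, assuming (as the descriptions copied from ClusterBase state) that only the id is serialized, and assuming the id is an int. The class IdOnlyRecordSketch is hypothetical and uses only java.io, so it does not depend on Hadoop or Mahout and may differ from what Cluster actually writes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that mirrors the shape of a Writable; not Mahout's Cluster.
class IdOnlyRecordSketch {
    private int id;

    // No-argument constructor, needed so an empty instance can be filled by readFields.
    IdOnlyRecordSketch() {
    }

    IdOnlyRecordSketch(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    // Assumption: the id is the only field written.
    void write(DataOutput out) throws IOException {
        out.writeInt(id);
    }

    // Reads back exactly what write produced, in the same order.
    void readFields(DataInput in) throws IOException {
        id = in.readInt();
    }

    // Serializes to an in-memory buffer and deserializes into a fresh instance.
    static IdOnlyRecordSketch roundTrip(IdOnlyRecordSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.write(out);
            }
            IdOnlyRecordSketch copy = new IdOnlyRecordSketch();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        IdOnlyRecordSketch copy = roundTrip(new IdOnlyRecordSketch(7));
        System.out.println(copy.getId()); // prints 7
    }
}
```

Under these assumptions, any state other than the id (such as the center or the point totals) would not survive the round trip.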

computeCentroid

public Vector computeCentroid()
Compute the centroid by averaging the pointTotals

Specified by:
computeCentroid in class ClusterBase
Returns:
the new centroid
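
The averaging step can be illustrated without any Mahout dependency. The following is a minimal sketch, assuming the cluster keeps a running vector sum (the point total) and a point count, and that the centroid is the sum divided component-wise by the count. The class CentroidSketch and its double[] vectors are hypothetical stand-ins for the library's Vector type, not its actual implementation.

```java
import java.util.Arrays;

// Hypothetical stand-in: plain arrays instead of Mahout's Vector.
class CentroidSketch {

    // Divides each component of the running sum by the number of points.
    static double[] centroid(double[] pointTotal, int numPoints) {
        if (numPoints <= 0) {
            throw new IllegalArgumentException("numPoints must be positive");
        }
        double[] result = new double[pointTotal.length];
        for (int i = 0; i < pointTotal.length; i++) {
            result[i] = pointTotal[i] / numPoints;
        }
        return result;
    }

    public static void main(String[] args) {
        // The sum of (1, 2) and (3, 6) is (4, 8); two points were added.
        double[] c = centroid(new double[] {4.0, 8.0}, 2);
        System.out.println(Arrays.toString(c)); // prints [2.0, 4.0]
    }
}
```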

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

getIdentifier

public java.lang.String getIdentifier()
Specified by:
getIdentifier in class ClusterBase

addPoint

public void addPoint(Vector point)
Add the point to the cluster

Parameters:
point - a point to add

addPoints

public void addPoints(int count,
                      Vector delta)
Add the given number of points to the cluster

Parameters:
count - the number of points in the delta
delta - a vector holding the combined total of the points to add
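
The relationship between addPoint and addPoints can be sketched as follows. This is an illustration only, assuming that delta carries the vector sum of count points and that adding a single point is equivalent to adding a delta with a count of one. The class PointAccumulatorSketch is hypothetical and uses double[] in place of Vector.

```java
// Hypothetical accumulator using plain arrays; not Mahout's Cluster or Vector.
class PointAccumulatorSketch {
    private final double[] pointTotal; // running vector sum of all added points
    private int numPoints;             // how many points the sum represents

    PointAccumulatorSketch(int dimensions) {
        this.pointTotal = new double[dimensions];
    }

    // Adds a single point: the count goes up by one.
    void addPoint(double[] point) {
        addPoints(1, point);
    }

    // Adds several points at once; delta is assumed to be their vector sum.
    void addPoints(int count, double[] delta) {
        if (delta.length != pointTotal.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        numPoints += count;
        for (int i = 0; i < delta.length; i++) {
            pointTotal[i] += delta[i];
        }
    }

    int getNumPoints() {
        return numPoints;
    }

    // Returns a copy so callers cannot modify the internal sum.
    double[] getPointTotal() {
        return pointTotal.clone();
    }

    public static void main(String[] args) {
        PointAccumulatorSketch acc = new PointAccumulatorSketch(2);
        acc.addPoint(new double[] {1.0, 2.0});
        acc.addPoints(2, new double[] {4.0, 6.0}); // e.g. (1, 1) + (3, 5)
        System.out.println(acc.getNumPoints()); // prints 3
    }
}
```

Accepting a pre-summed delta lets partial totals computed elsewhere be merged in a single call, which is presumably why the method takes a count alongside the vector.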

recomputeCenter

public void recomputeCenter()
Compute the centroid and set the center to it.


computeConvergence

public boolean computeConvergence(DistanceMeasure measure,
                                  double convergenceDelta)
Returns whether the cluster has converged by comparing its center and centroid.

Parameters:
measure - The distance measure to use for cluster-point comparisons.
convergenceDelta - the convergence delta to use for stopping.
Returns:
true if the cluster has converged
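
A convergence test of this kind can be sketched as a distance comparison. The example below is a minimal illustration, assuming the cluster counts as converged when the distance between its current center and its newly computed centroid is at most convergenceDelta. The class ConvergenceSketch, the Euclidean measure, and the use of ToDoubleBiFunction in place of DistanceMeasure are all assumptions made for the sketch; the exact comparison used by the library is not stated in this page.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical stand-in; plain arrays and a functional interface replace
// Mahout's Vector and DistanceMeasure.
class ConvergenceSketch {

    // Euclidean distance, used here as one possible distance measure.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Assumption: converged means the center moved by no more than the delta.
    static boolean isConverged(double[] center,
                               double[] centroid,
                               ToDoubleBiFunction<double[], double[]> measure,
                               double convergenceDelta) {
        return measure.applyAsDouble(center, centroid) <= convergenceDelta;
    }

    public static void main(String[] args) {
        double[] center = {0.0, 0.0};
        double[] centroid = {3.0, 4.0}; // Euclidean distance from center is 5.0
        System.out.println(
            isConverged(center, centroid, ConvergenceSketch::euclidean, 10.0)); // prints true
        System.out.println(
            isConverged(center, centroid, ConvergenceSketch::euclidean, 1.0));  // prints false
    }
}
```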

isConverged

public boolean isConverged()

getStd

public double getStd()
Returns:
the standard deviation (std)


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.