org.apache.mahout.clustering
Class AbstractCluster

java.lang.Object
  extended by org.apache.mahout.clustering.AbstractCluster
All Implemented Interfaces:
org.apache.hadoop.io.Writable, Cluster, Model<VectorWritable>, Parametered
Direct Known Subclasses:
DistanceMeasureCluster, GaussianCluster

public abstract class AbstractCluster
extends Object
implements Cluster


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.mahout.common.parameters.Parametered
Parametered.ParameteredGeneralizations
 
Field Summary
 
Fields inherited from interface org.apache.mahout.clustering.Cluster
CLUSTERED_POINTS_DIR, CLUSTERS_DIR, FINAL_ITERATION_SUFFIX, INITIAL_CLUSTERS_DIR
 
Fields inherited from interface org.apache.mahout.common.parameters.Parametered
log
 
Constructor Summary
protected AbstractCluster()
           
protected AbstractCluster(Vector point, int id2)
           
protected AbstractCluster(Vector center2, Vector radius2, int id2)
           
 
Method Summary
 String asFormatString(String[] bindings)
          Produce a custom, human-friendly, printable representation of the Cluster.
 Vector computeCentroid()
          Compute the centroid by averaging the pointTotals
 void computeParameters()
          Compute a new set of posterior parameters based upon the Observations that have been observed since this model's creation
 void configure(org.apache.hadoop.conf.Configuration job)
           
 long count()
          Return the number of observations that have been observed by this model
 void createParameters(String prefix, org.apache.hadoop.conf.Configuration jobConf)
          EXPERT: consumers should never have to call this method.
static String formatVector(Vector v, String[] bindings)
          Return a human-readable formatted string representation of the vector, not intended to be complete nor usable as an input/output representation
 Vector getCenter()
          Get the "center" of the Cluster as a Vector
 int getId()
          Get the id of the Cluster
abstract  String getIdentifier()
           
 long getNumPoints()
          Get the number of points observed by this cluster
 ClusterObservations getObservations()
           
 Collection<Parameter<?>> getParameters()
           
 Vector getRadius()
          Get the "radius" of the Cluster as a Vector.
protected  double getS0()
           
protected  Vector getS1()
           
protected  Vector getS2()
           
 boolean isConverged()
           
 void observe(ClusterObservations observations)
           
 void observe(Model<VectorWritable> x)
          Observe the given model, retaining information about its observations
 void observe(Vector x)
           
 void observe(Vector x, double weight)
           
 void observe(VectorWritable x)
          Observe the given observation, retaining information about it
 void observe(VectorWritable x, double weight)
          Observe the given observation, retaining information about it
 void readFields(DataInput in)
           
protected  void setCenter(Vector center)
           
protected  void setId(int id)
           
protected  void setNumPoints(long l)
           
protected  void setRadius(Vector radius)
           
protected  void setS0(double s0)
           
protected  void setS1(Vector s1)
           
protected  void setS2(Vector s2)
           
 void write(DataOutput out)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.mahout.clustering.Model
pdf, sampleFromPosterior
 

Constructor Detail

AbstractCluster

protected AbstractCluster()

AbstractCluster

protected AbstractCluster(Vector point,
                          int id2)

AbstractCluster

protected AbstractCluster(Vector center2,
                          Vector radius2,
                          int id2)
Method Detail

configure

public void configure(org.apache.hadoop.conf.Configuration job)
Specified by:
configure in interface Parametered

getParameters

public Collection<Parameter<?>> getParameters()
Specified by:
getParameters in interface Parametered

createParameters

public void createParameters(String prefix,
                             org.apache.hadoop.conf.Configuration jobConf)
Description copied from interface: Parametered
EXPERT: consumers should never have to call this method. It would be visible only to Parametered.ParameteredGeneralizations (as a "friend") if Java supported it. Calling this method should create a new list of parameters and is called

Specified by:
createParameters in interface Parametered
Parameters:
prefix - ends with a dot if not empty.
jobConf - configuration used for retrieving values
See Also:
invoking method, invoking method

setId

protected void setId(int id)
Parameters:
id - the id to set

setNumPoints

protected void setNumPoints(long l)
Parameters:
l - the numPoints to set

setCenter

protected void setCenter(Vector center)
Parameters:
center - the center to set

setRadius

protected void setRadius(Vector radius)
Parameters:
radius - the radius to set

getS0

protected double getS0()
Returns:
the s0

getS1

protected Vector getS1()
Returns:
the s1

getS2

protected Vector getS2()
Returns:
the s2

observe

public void observe(Model<VectorWritable> x)
Description copied from interface: Model
Observe the given model, retaining information about its observations

Specified by:
observe in interface Model<VectorWritable>
Parameters:
x - a Model<VectorWritable>

observe

public void observe(ClusterObservations observations)

observe

public void observe(VectorWritable x)
Description copied from interface: Model
Observe the given observation, retaining information about it

Specified by:
observe in interface Model<VectorWritable>
Parameters:
x - an Observation from the posterior

observe

public void observe(VectorWritable x,
                    double weight)
Description copied from interface: Model
Observe the given observation, retaining information about it

Specified by:
observe in interface Model<VectorWritable>
Parameters:
x - an Observation from the posterior
weight - a double weighting factor

observe

public void observe(Vector x,
                    double weight)

observe

public void observe(Vector x)
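
The observe overloads above are listed without descriptions. As a rough illustration of what a weighted observation could accumulate, the sketch below keeps three running sums. It is not Mahout code: the class RunningSums is hypothetical, plain double[] arrays stand in for Vector, and the meanings given to s0, s1 and s2 (total weight, weighted sum, weighted sum of squares) are assumptions that this page does not confirm.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a cluster's running sums; not part of Mahout. */
final class RunningSums {
    double s0;          // assumed meaning: sum of the observation weights
    final double[] s1;  // assumed meaning: weighted sum of observations, per dimension
    final double[] s2;  // assumed meaning: weighted sum of squared observations, per dimension

    RunningSums(int dimensions) {
        s1 = new double[dimensions];
        s2 = new double[dimensions];
    }

    /** Unweighted form: treated here as an observation with weight 1.0. */
    void observe(double[] x) {
        observe(x, 1.0);
    }

    /** Weighted form: every contribution of x is scaled by the weight. */
    void observe(double[] x, double weight) {
        if (x.length != s1.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        s0 += weight;
        for (int i = 0; i < x.length; i++) {
            s1[i] += weight * x[i];
            s2[i] += weight * x[i] * x[i];
        }
    }

    public static void main(String[] args) {
        RunningSums sums = new RunningSums(2);
        sums.observe(new double[] {1.0, 2.0});
        sums.observe(new double[] {3.0, 4.0}, 2.0);
        System.out.println(sums.s0 + " " + Arrays.toString(sums.s1) + " " + Arrays.toString(sums.s2));
    }
}
```

Keeping only sums means an observation can be folded in without storing the points themselves, which is presumably why the class exposes them through getS0(), getS1() and getS2().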

getNumPoints

public long getNumPoints()
Description copied from interface: Cluster
Get the number of points observed by this cluster

Specified by:
getNumPoints in interface Cluster
Returns:
the number of points, as a long

getObservations

public ClusterObservations getObservations()

computeParameters

public void computeParameters()
Description copied from interface: Model
Compute a new set of posterior parameters based upon the Observations that have been observed since this model's creation

Specified by:
computeParameters in interface Model<VectorWritable>

readFields

public void readFields(DataInput in)
                throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException

write

public void write(DataOutput out)
           throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException
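
readFields and write follow the Hadoop Writable contract: write emits the object's state to a DataOutput, and readFields restores it from a DataInput in the same order. The sketch below shows that round trip using only java.io. The class ClusterStateSketch, its fields and the field order are hypothetical; the actual wire format used by AbstractCluster is not documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical cluster state with Writable-style methods; not the Mahout format. */
final class ClusterStateSketch {
    int id;
    long numPoints;
    double[] center = new double[0];
    double[] radius = new double[0];

    /** Emit the fields in a fixed order. */
    void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeLong(numPoints);
        writeVector(out, center);
        writeVector(out, radius);
    }

    /** Read the fields back in exactly the order write() emitted them. */
    void readFields(DataInput in) throws IOException {
        id = in.readInt();
        numPoints = in.readLong();
        center = readVector(in);
        radius = readVector(in);
    }

    private static void writeVector(DataOutput out, double[] v) throws IOException {
        out.writeInt(v.length); // length prefix so the reader knows how much to consume
        for (double d : v) {
            out.writeDouble(d);
        }
    }

    private static double[] readVector(DataInput in) throws IOException {
        double[] v = new double[in.readInt()];
        for (int i = 0; i < v.length; i++) {
            v[i] = in.readDouble();
        }
        return v;
    }

    /** Serialize to an in-memory buffer and deserialize into a fresh instance. */
    static ClusterStateSketch roundTrip(ClusterStateSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            ClusterStateSketch copy = new ClusterStateSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClusterStateSketch state = new ClusterStateSketch();
        state.id = 7;
        state.numPoints = 42L;
        state.center = new double[] {1.5, -2.0};
        state.radius = new double[] {0.5, 0.25};
        ClusterStateSketch copy = roundTrip(state);
        System.out.println(copy.id + " " + copy.numPoints);
    }
}
```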

asFormatString

public String asFormatString(String[] bindings)
Description copied from interface: Cluster
Produce a custom, human-friendly, printable representation of the Cluster.

Specified by:
asFormatString in interface Cluster
Parameters:
bindings - an optional String[] containing labels used to format the primary Vector/s of this implementation.
Returns:
a String

getIdentifier

public abstract String getIdentifier()

getCenter

public Vector getCenter()
Description copied from interface: Cluster
Get the "center" of the Cluster as a Vector

Specified by:
getCenter in interface Cluster
Returns:
a Vector

getId

public int getId()
Description copied from interface: Cluster
Get the id of the Cluster

Specified by:
getId in interface Cluster
Returns:
a unique integer

getRadius

public Vector getRadius()
Description copied from interface: Cluster
Get the "radius" of the Cluster as a Vector. Usually the radius is the standard deviation expressed as a Vector of size equal to the center. Some clusters may return zero values if not appropriate.

Specified by:
getRadius in interface Cluster
Returns:
a Vector
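
To make "the standard deviation expressed as a Vector" concrete, the sketch below derives a per-dimension standard deviation from three running sums. It assumes s0 is the total weight, s1 the per-dimension sum and s2 the per-dimension sum of squares; this page does not confirm those meanings. The class RadiusSketch and the use of double[] in place of Vector are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper: per-dimension standard deviation from running sums. */
final class RadiusSketch {
    /**
     * Returns sqrt(s2/s0 - (s1/s0)^2) per dimension, or all zeros when there
     * are too few observations for a spread to be meaningful.
     */
    static double[] radius(double s0, double[] s1, double[] s2) {
        double[] r = new double[s1.length];
        if (s0 <= 1.0) {
            return r; // zero radius: at most one observation
        }
        for (int i = 0; i < s1.length; i++) {
            double variance = (s2[i] * s0 - s1[i] * s1[i]) / (s0 * s0);
            // Clamp tiny negative values caused by floating-point rounding.
            r[i] = Math.sqrt(Math.max(0.0, variance));
        }
        return r;
    }

    public static void main(String[] args) {
        // Two points, (1, 5) and (3, 5): sums are s0 = 2, s1 = (4, 10), s2 = (10, 50).
        double[] r = radius(2.0, new double[] {4.0, 10.0}, new double[] {10.0, 50.0});
        System.out.println(Arrays.toString(r));
    }
}
```

The zero result for a single observation mirrors the note above that some clusters may return zero values where a radius is not appropriate.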

computeCentroid

public Vector computeCentroid()
Compute the centroid by averaging the pointTotals

Returns:
the new centroid
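
"Averaging the pointTotals" can be read as dividing the accumulated per-dimension totals by the accumulated weight. The sketch below shows that reading. The class CentroidSketch is hypothetical, double[] stands in for Vector, and falling back to the current center when nothing has been observed is an assumption, not documented behavior.

```java
import java.util.Arrays;

/** Hypothetical helper: centroid as the weighted mean of the observed points. */
final class CentroidSketch {
    static double[] centroid(double totalWeight, double[] pointTotals, double[] currentCenter) {
        if (totalWeight == 0.0) {
            // Assumed fallback: with no observations, keep the existing center.
            return currentCenter.clone();
        }
        double[] c = new double[pointTotals.length];
        for (int i = 0; i < pointTotals.length; i++) {
            c[i] = pointTotals[i] / totalWeight;
        }
        return c;
    }

    public static void main(String[] args) {
        // Points (1, 5) and (3, 5) with unit weights give totals (4, 10) and weight 2.
        double[] c = centroid(2.0, new double[] {4.0, 10.0}, new double[] {0.0, 0.0});
        System.out.println(Arrays.toString(c));
    }
}
```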

formatVector

public static String formatVector(Vector v,
                                  String[] bindings)
Return a human-readable formatted string representation of the vector, not intended to be complete nor usable as an input/output representation
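
As an illustration of how bindings might label vector elements, the sketch below prefixes each value with its label when one is available. The output layout, the three-decimal precision and the class FormatVectorSketch are all hypothetical; the exact string produced by AbstractCluster.formatVector is not specified on this page.

```java
import java.util.Locale;

/** Hypothetical formatter: labels each element with its binding, if present. */
final class FormatVectorSketch {
    static String formatVector(double[] v, String[] bindings) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < v.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            // Bindings are optional: skip the label when none is supplied for this index.
            if (bindings != null && i < bindings.length && bindings[i] != null) {
                sb.append(bindings[i]).append(':');
            }
            // Locale.ROOT keeps the decimal separator stable across machines.
            sb.append(String.format(Locale.ROOT, "%.3f", v[i]));
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(formatVector(new double[] {1.0, 2.5}, new String[] {"x", "y"}));
        System.out.println(formatVector(new double[] {1.0, 2.5}, null));
    }
}
```

With the inputs in main, this sketch prints [x:1.000, y:2.500] and then [1.000, 2.500].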


count

public long count()
Description copied from interface: Model
Return the number of observations that have been observed by this model

Specified by:
count in interface Model<VectorWritable>
Returns:
the number of observations, as a long

isConverged

public boolean isConverged()
Specified by:
isConverged in interface Cluster
Returns:
true if the receiver has converged, or false if that has no meaning for the implementation

setS0

protected void setS0(double s0)

setS1

protected void setS1(Vector s1)

setS2

protected void setS2(Vector s2)


Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.