org.apache.mahout.math.hadoop.similarity.vector
Interface DistributedVectorSimilarity

All Known Implementing Classes:
AbstractDistributedVectorSimilarity, DistributedCooccurrenceVectorSimilarity, DistributedEuclideanDistanceVectorSimilarity, DistributedLoglikelihoodVectorSimilarity, DistributedPearsonCorrelationVectorSimilarity, DistributedTanimotoCoefficientVectorSimilarity, DistributedUncenteredCosineVectorSimilarity, DistributedUncenteredZeroAssumingCosineVectorSimilarity

public interface DistributedVectorSimilarity

A measure for the pairwise similarity of two rows of a matrix that is suitable for computing that similarity in a distributed way. It works in two steps:
- first, weight() is called for each of the row vectors
- later, similarity() is called with the previously computed weights as parameters
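
The two-step protocol described above can be illustrated with a small single-machine sketch. Everything in it is an assumption made for illustration and not Mahout's code: rows are plain double[] arrays instead of Mahout Vector instances, CooccurrenceSketch is a hypothetical stand-in for Cooccurrence, and cosine similarity is just one possible measure that fits the weight-then-similarity shape.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoStepSketch {

    // Hypothetical stand-in for Cooccurrence: the values of both rows in one shared column.
    static final class CooccurrenceSketch {
        final int column;
        final double valueA;
        final double valueB;

        CooccurrenceSketch(int column, double valueA, double valueB) {
            this.column = column;
            this.valueA = valueA;
            this.valueB = valueB;
        }
    }

    // Hypothetical simplification of the interface: rows are double[] rather than Vector.
    interface VectorSimilaritySketch {
        double weight(double[] row);

        double similarity(int rowA, int rowB, Iterable<CooccurrenceSketch> cooccurrences,
                          double weightOfVectorA, double weightOfVectorB, int numberOfColumns);
    }

    // Cosine similarity: the weight is the Euclidean length of a row,
    // the similarity is the dot product divided by the product of both weights.
    static final class CosineSketch implements VectorSimilaritySketch {
        @Override
        public double weight(double[] row) {
            double sumOfSquares = 0.0;
            for (double x : row) {
                sumOfSquares += x * x;
            }
            return Math.sqrt(sumOfSquares);
        }

        @Override
        public double similarity(int rowA, int rowB, Iterable<CooccurrenceSketch> cooccurrences,
                                 double weightOfVectorA, double weightOfVectorB, int numberOfColumns) {
            double dot = 0.0;
            for (CooccurrenceSketch c : cooccurrences) {
                dot += c.valueA * c.valueB;
            }
            return dot / (weightOfVectorA * weightOfVectorB);
        }
    }

    static double pairwise(VectorSimilaritySketch measure, double[][] matrix, int rowA, int rowB) {
        // Step 1: compute one weight per row, independently of any other row.
        double[] weights = new double[matrix.length];
        for (int i = 0; i < matrix.length; i++) {
            weights[i] = measure.weight(matrix[i]);
        }

        // Step 2: gather the columns where both rows are non-zero and combine
        // them with the weights computed in step 1.
        int numberOfColumns = matrix[rowA].length;
        List<CooccurrenceSketch> cooccurrences = new ArrayList<>();
        for (int col = 0; col < numberOfColumns; col++) {
            double a = matrix[rowA][col];
            double b = matrix[rowB][col];
            if (a != 0.0 && b != 0.0) {
                cooccurrences.add(new CooccurrenceSketch(col, a, b));
            }
        }
        return measure.similarity(rowA, rowB, cooccurrences,
                weights[rowA], weights[rowB], numberOfColumns);
    }

    public static void main(String[] args) {
        double[][] matrix = { {1.0, 0.0, 2.0}, {2.0, 3.0, 0.0} };
        // Only column 0 is shared, so the result is 2 / sqrt(65), roughly 0.248.
        System.out.println(pairwise(new CosineSketch(), matrix, 0, 1));
    }
}
```

The split matters because step 1 needs only a single row and step 2 needs only the shared columns plus two numbers, so neither step has to hold both full row vectors at once.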


Method Summary
 double similarity(int rowA, int rowB, java.lang.Iterable<Cooccurrence> cooccurrences, double weightOfVectorA, double weightOfVectorB, int numberOfColumns)
          compute the similarity of a pair of row vectors
 double weight(Vector v)
          compute the weight (e.g. length) of a vector
 

Method Detail

weight

double weight(Vector v)
compute the weight (e.g. length) of a vector
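
What counts as a "weight" depends on the similarity measure. The sketch below shows two plausible choices on a plain double[] row: the Euclidean length and the number of non-zero entries. The class and method names are hypothetical, and the sketch does not claim to reproduce what any particular implementing class returns.

```java
final class WeightSketch {

    // Euclidean length: a natural weight for measures that normalize a dot product.
    static double euclideanLength(double[] row) {
        double sumOfSquares = 0.0;
        for (double x : row) {
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Number of non-zero entries: a natural weight for measures based on set overlap.
    static double nonZeroCount(double[] row) {
        int count = 0;
        for (double x : row) {
            if (x != 0.0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[] row = {3.0, 0.0, 4.0};
        System.out.println(euclideanLength(row)); // prints 5.0
        System.out.println(nonZeroCount(row));    // prints 2.0
    }
}
```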


similarity

double similarity(int rowA,
                  int rowB,
                  java.lang.Iterable<Cooccurrence> cooccurrences,
                  double weightOfVectorA,
                  double weightOfVectorB,
                  int numberOfColumns)
compute the similarity of a pair of row vectors

Parameters:
rowA - offset of the first row
rowB - offset of the second row
cooccurrences - all column entries where both vectors have a non-zero entry
weightOfVectorA - the result of weight(Vector) for the first row vector
weightOfVectorB - the result of weight(Vector) for the second row vector
numberOfColumns - the number of columns of the matrix
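
To show how these parameters can be combined, here is a minimal sketch of a Tanimoto-style coefficient. It assumes each weight is the number of non-zero entries of its row, and it represents every cooccurrence by its column index only (a hypothetical simplification of Cooccurrence). It illustrates the parameter contract and is not the code of any implementing class.

```java
import java.util.Arrays;
import java.util.List;

final class TanimotoSketch {

    // rowA, rowB and numberOfColumns are accepted to mirror the documented
    // signature, but this particular measure does not need them.
    static double similarity(int rowA, int rowB, Iterable<Integer> cooccurringColumns,
                             double weightOfVectorA, double weightOfVectorB, int numberOfColumns) {
        // Size of the intersection: columns where both rows are non-zero.
        int shared = 0;
        for (Integer ignored : cooccurringColumns) {
            shared++;
        }
        if (shared == 0) {
            return 0.0;
        }
        // Size of the union: |A| + |B| - |A and B|, using the precomputed weights.
        return shared / (weightOfVectorA + weightOfVectorB - shared);
    }

    public static void main(String[] args) {
        // Row 0 is non-zero in columns {0, 2, 3}, row 1 in columns {2, 3, 4, 5}.
        List<Integer> shared = Arrays.asList(2, 3);
        double weightA = 3.0; // non-zero entries in row 0
        double weightB = 4.0; // non-zero entries in row 1
        System.out.println(similarity(0, 1, shared, weightA, weightB, 6)); // prints 0.4
    }
}
```

The union size is recovered from the two weights and the cooccurrence count alone, which is why the method can work without access to the full row vectors.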


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.