Package org.apache.mahout.math.hadoop.similarity

Class Summary
Cooccurrence a pair of entries in the same column of a row vector where each of the entries' values is != NaN
RowSimilarityJob Runs a completely distributed computation of the pairwise similarity of the row vectors of a DistributedRowMatrix as a series of MapReduce jobs.
RowSimilarityJob.CooccurrencesMapper maps all pairs of weighted entries of a column vector
RowSimilarityJob.EntriesToVectorsReducer collects all DistributedRowMatrix.MatrixEntryWritable for each column and creates a VectorWritable
RowSimilarityJob.RowWeightMapper applies DistributedVectorSimilarity.weight(Vector) to each row of the input matrix
RowSimilarityJob.SimilarityReducer computes the pairwise similarities
RowSimilarityJob.WeightedOccurrencesPerColumnReducer collects all WeightedOccurrences for a column and writes them to a WeightedOccurrenceArray
SimilarityMatrixEntryKey used as key for the RowSimilarityJob.EntriesToVectorsReducer to collect all rows similar to the specified row; ensures, via secondary sort, that the similarity matrix entries for a row are seen in descending order of their similarity value
SimilarityMatrixEntryKey.SimilarityMatrixEntryKeyComparator  
SimilarityMatrixEntryKey.SimilarityMatrixEntryKeyGroupingComparator  
WeightedRowPair a pair of row vectors that has at least one entry != NaN in the same column together with the precomputed weights of the row vectors
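
The classes above suggest a three-phase flow: weight each row, pair up entries that share a column, then compute and order the pairwise similarities. The following is a minimal single-machine sketch of that flow in plain Java, not the Mahout implementation. The class name RowSimilaritySketch and its methods are hypothetical, NaN is assumed to mark a missing entry, and cosine similarity with the Euclidean norm as the row weight is an assumed choice of measure.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory illustration; none of these names come from Mahout.
public class RowSimilaritySketch {

    // Phase 1: one weight per row. Assumption: Euclidean norm over non-NaN entries.
    static double weight(double[] row) {
        double sumOfSquares = 0.0;
        for (double v : row) {
            if (!Double.isNaN(v)) {
                sumOfSquares += v * v;
            }
        }
        return Math.sqrt(sumOfSquares);
    }

    // Returns, for each row index, the rows it cooccurs with, most similar first.
    static Map<Integer, List<Map.Entry<Integer, Double>>> similarRows(double[][] matrix) {
        int numRows = matrix.length;
        int numCols = numRows == 0 ? 0 : matrix[0].length;

        double[] weights = new double[numRows];
        for (int r = 0; r < numRows; r++) {
            weights[r] = weight(matrix[r]);
        }

        // Phase 2: per column, pair up all rows with a non-NaN entry in that
        // column and accumulate the product of the two entries per row pair.
        Map<Long, Double> dotProducts = new HashMap<>();
        for (int c = 0; c < numCols; c++) {
            for (int a = 0; a < numRows; a++) {
                if (Double.isNaN(matrix[a][c])) continue;
                for (int b = a + 1; b < numRows; b++) {
                    if (Double.isNaN(matrix[b][c])) continue;
                    long pairKey = (long) a * numRows + b;
                    dotProducts.merge(pairKey, matrix[a][c] * matrix[b][c], Double::sum);
                }
            }
        }

        // Phase 3: turn each accumulated pair into a similarity value and
        // record it for both rows of the pair.
        Map<Integer, List<Map.Entry<Integer, Double>>> result = new TreeMap<>();
        for (Map.Entry<Long, Double> e : dotProducts.entrySet()) {
            int a = (int) (e.getKey() / numRows);
            int b = (int) (e.getKey() % numRows);
            double denominator = weights[a] * weights[b];
            if (denominator == 0.0) continue; // skip all-zero rows
            double similarity = e.getValue() / denominator;
            result.computeIfAbsent(a, k -> new ArrayList<>()).add(new SimpleEntry<>(b, similarity));
            result.computeIfAbsent(b, k -> new ArrayList<>()).add(new SimpleEntry<>(a, similarity));
        }

        // Order each row's entries by descending similarity, the ordering the
        // secondary sort is described as providing in the distributed job.
        for (List<Map.Entry<Integer, Double>> entries : result.values()) {
            entries.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());
        }
        return result;
    }
}
```

Calling RowSimilaritySketch.similarRows on a small double[][] returns a map from row index to its similar rows. Rows that share no non-NaN column with any other row do not appear in the result, which mirrors the cooccurrence requirement stated for WeightedRowPair.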
 

Enum Summary
SimilarityType  
 



Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.