org.apache.mahout.classifier.bayes.algorithm
Class BayesAlgorithm

java.lang.Object
  extended by org.apache.mahout.classifier.bayes.algorithm.BayesAlgorithm
All Implemented Interfaces:
Algorithm

public class BayesAlgorithm
extends java.lang.Object
implements Algorithm

Class implementing the Naive Bayes Classifier Algorithm


Constructor Summary
BayesAlgorithm()
           
 
Method Summary
 ClassifierResult classifyDocument(java.lang.String[] document, Datastore datastore, java.lang.String defaultCategory)
          Classify the document and return the Result
 ClassifierResult[] classifyDocument(java.lang.String[] document, Datastore datastore, java.lang.String defaultCategory, int numResults)
          Classify the document and return the top numResults
 double documentWeight(Datastore datastore, java.lang.String label, java.lang.String[] document)
          Calculate the document weight as the dot product of document vector and the corresponding weight vector of a particular class
 double featureWeight(Datastore datastore, java.lang.String label, java.lang.String feature)
          Get the weighted probability of the feature.
 java.util.Collection<java.lang.String> getLabels(Datastore datastore)
          Returns the labels in the given Model
 void initialize(Datastore datastore)
          Initializes the data store and verifies the data in it.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

BayesAlgorithm

public BayesAlgorithm()
Method Detail

classifyDocument

public ClassifierResult classifyDocument(java.lang.String[] document,
                                         Datastore datastore,
                                         java.lang.String defaultCategory)
                                  throws InvalidDatastoreException
Description copied from interface: Algorithm
Classify the document and return the Result

Specified by:
classifyDocument in interface Algorithm
Parameters:
document - The document to classify
datastore - The Datastore (InMemory, HBase)
defaultCategory - The default category to assign. Ties are broken by comparing the category
Returns:
The ClassifierResult for the document.
Throws:
InvalidDatastoreException

classifyDocument

public ClassifierResult[] classifyDocument(java.lang.String[] document,
                                           Datastore datastore,
                                           java.lang.String defaultCategory,
                                           int numResults)
                                    throws InvalidDatastoreException
Description copied from interface: Algorithm
Classify the document and return the top numResults

Specified by:
classifyDocument in interface Algorithm
Parameters:
document - The document to classify
datastore - The Datastore (InMemory, HBase)
defaultCategory - The default category to assign
numResults - The maximum number of results to return, ranked by score. Ties are broken by comparing the category
Returns:
An array of at most numResults ClassifierResults.
Throws:
InvalidDatastoreException
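
Example (illustrative sketch, not the Mahout implementation): the ranking described for numResults can be pictured as sorting per-label scores, breaking ties by comparing the category, and keeping at most numResults entries. The names TopResultsSketch, Scored and top are hypothetical. Two points are assumptions that this page does not state: that a higher score ranks first, and that the default category is returned alone with a score of 0.0 when no label has been scored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; none of these names belong to the Mahout API.
class TopResultsSketch {

    // A (label, score) pair standing in for a classifier result.
    static final class Scored {
        final String label;
        final double score;

        Scored(String label, double score) {
            this.label = label;
            this.score = score;
        }
    }

    // Returns at most numResults entries, best score first.
    static List<Scored> top(Map<String, Double> scores, String defaultCategory, int numResults) {
        List<Scored> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            ranked.add(new Scored(e.getKey(), e.getValue()));
        }
        // Assumption: with nothing scored, fall back to the default category.
        if (ranked.isEmpty()) {
            ranked.add(new Scored(defaultCategory, 0.0));
        }
        ranked.sort((a, b) -> {
            // Assumption: higher score ranks first.
            int byScore = Double.compare(b.score, a.score);
            // Ties are broken by comparing the category names.
            return byScore != 0 ? byScore : a.label.compareTo(b.label);
        });
        int limit = Math.min(Math.max(numResults, 0), ranked.size());
        return new ArrayList<>(ranked.subList(0, limit));
    }
}
```

With scores sports=1.5, tech=2.0 and art=2.0 and numResults=2, the sketch keeps art and tech, in that order, because the two tied labels are ordered by name.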

featureWeight

public double featureWeight(Datastore datastore,
                            java.lang.String label,
                            java.lang.String feature)
                     throws InvalidDatastoreException
Description copied from interface: Algorithm
Get the weighted probability of the feature.

Specified by:
featureWeight in interface Algorithm
Parameters:
datastore - The Datastore (InMemory, HBase)
label - The label of the feature
feature - The feature to calculate the probability for
Returns:
The weighted probability
Throws:
InvalidDatastoreException

initialize

public void initialize(Datastore datastore)
                throws InvalidDatastoreException
Description copied from interface: Algorithm
Initializes the data store and verifies the data in it.

Specified by:
initialize in interface Algorithm
Throws:
InvalidDatastoreException

documentWeight

public double documentWeight(Datastore datastore,
                             java.lang.String label,
                             java.lang.String[] document)
                      throws InvalidDatastoreException
Description copied from interface: Algorithm
Calculate the document weight as the dot product of document vector and the corresponding weight vector of a particular class

Specified by:
documentWeight in interface Algorithm
Parameters:
datastore - The Datastore (InMemory, HBase)
label - The label to calculate the probability of
document - The document
Returns:
The document weight for the given label
Throws:
InvalidDatastoreException
See Also:
Algorithm.featureWeight(Datastore, String, String)
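
Example (illustrative sketch, not the Mahout implementation): the dot product described for documentWeight can be pictured as counting how often each token occurs in the document and summing count times per-feature weight for one label. The class DocumentWeightSketch and the in-memory labelWeights map are hypothetical stand-ins for the Datastore lookup done through featureWeight. Treating a feature with no stored weight as 0.0 is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; none of these names belong to the Mahout API.
class DocumentWeightSketch {

    // Builds the document vector: token -> number of occurrences.
    static Map<String, Integer> termFrequencies(String[] document) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : document) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Dot product of the document vector with one label's weight vector.
    static double documentWeight(Map<String, Double> labelWeights, String[] document) {
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : termFrequencies(document).entrySet()) {
            // Assumption: a feature without a stored weight contributes 0.0.
            double weight = labelWeights.getOrDefault(e.getKey(), 0.0);
            sum += e.getValue() * weight;
        }
        return sum;
    }
}
```

For the weights spam=2.0 and offer=0.5 and the document {"spam", "offer", "spam", "hello"}, the sum is 2 * 2.0 + 1 * 0.5 + 1 * 0.0 = 4.5.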

getLabels

public java.util.Collection<java.lang.String> getLabels(Datastore datastore)
                                                 throws InvalidDatastoreException
Description copied from interface: Algorithm
Returns the labels in the given Model

Specified by:
getLabels in interface Algorithm
Parameters:
datastore - The Datastore (InMemory, HBase)
Returns:
Collection of labels
Throws:
InvalidDatastoreException


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.