org.apache.mahout.math.stats
Class LogLikelihood

java.lang.Object
  extended by org.apache.mahout.math.stats.LogLikelihood

public final class LogLikelihood
extends java.lang.Object

Utility methods for working with log-likelihood.


Method Summary
static double entropy(int... elements)
          Calculate the unnormalized Shannon entropy.
static double logLikelihoodRatio(int k11, int k12, int k21, int k22)
          Calculate the raw log-likelihood ratio for two events, call them A and B.
static double rootLogLikelihoodRatio(int k11, int k12, int k21, int k22)
          Calculate the root log-likelihood ratio for two events.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

entropy

public static double entropy(int... elements)
Calculate the unnormalized Shannon entropy. This is -sum_i x_i log(x_i / N) = -N sum_i (x_i / N) log(x_i / N), where N = sum_i x_i. If the x's sum to 1, then this is the same as the normal (normalized) expression. Leaving this unnormalized makes working with counts and computing the LLR easier.

Returns:
The entropy value for the elements
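
The formula above can be sketched in a few lines of Java. This is an illustrative reading of the documented expression, not the library's source: the class name EntropySketch is hypothetical, and treating a zero count as contributing nothing (0 log 0 = 0) and rejecting negative counts are assumptions.

```java
final class EntropySketch {
    private EntropySketch() {}

    /** Unnormalized Shannon entropy of counts: -sum_i x_i * log(x_i / N), N = sum_i x_i. */
    static double entropy(int... elements) {
        long total = 0;
        for (int x : elements) {
            if (x < 0) {
                // Assumption: negative counts are treated as invalid input.
                throw new IllegalArgumentException("counts must be non-negative: " + x);
            }
            total += x;
        }
        double result = 0.0;
        for (int x : elements) {
            if (x > 0) { // Assumption: a zero count contributes 0 (0 * log 0 := 0).
                result -= x * Math.log((double) x / total);
            }
        }
        return result;
    }
}
```

Under these assumptions, entropy(1, 1) evaluates to 2 ln 2, which is N = 2 times the normalized entropy ln 2 of a fair coin.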

logLikelihoodRatio

public static double logLikelihoodRatio(int k11,
                                        int k12,
                                        int k21,
                                        int k22)
Calculate the raw log-likelihood ratio for two events, A and B. Their counts form a 2x2 contingency table:

                    Event A                    Everything but A
Event B             A and B together (k_11)    B, but not A (k_12)
Everything but B    A without B (k_21)         Neither A nor B (k_22)

Parameters:
k11 - The number of times the two events occurred together
k12 - The number of times the second event occurred WITHOUT the first event
k21 - The number of times the first event occurred WITHOUT the second event
k22 - The number of times something else occurred (i.e. was neither of these events)
Returns:
The raw log-likelihood ratio

Credit to http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html for the table and the descriptions.
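
One common way to compute this statistic from the four counts is the G-test form 2 * (H(rows) + H(columns) - H(cells)), where H is the unnormalized entropy described for entropy(int...). The sketch below follows that form under stated assumptions; it is not taken from the library's source, the class name LlrSketch is hypothetical, and clamping the result at zero to absorb floating-point round-off is an added choice.

```java
final class LlrSketch {
    private LlrSketch() {}

    /** Unnormalized entropy of counts: -sum_i x_i * log(x_i / N); zero counts contribute 0. */
    static double entropy(long... counts) {
        long total = 0;
        for (long x : counts) {
            total += x;
        }
        double result = 0.0;
        for (long x : counts) {
            if (x > 0) {
                result -= x * Math.log((double) x / total);
            }
        }
        return result;
    }

    /** G-test style log-likelihood ratio for a 2x2 table of counts. */
    static double logLikelihoodRatio(int k11, int k12, int k21, int k22) {
        // Row and column marginals are summed as long to avoid int overflow.
        double rowEntropy = entropy((long) k11 + k12, (long) k21 + k22);
        double columnEntropy = entropy((long) k11 + k21, (long) k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Assumption: clamp tiny negative values caused by round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }
}
```

With this sketch, a table whose rows are proportional, such as (10, 10, 10, 10), scores 0 because A and B look independent, while a perfectly associated table such as (10, 0, 0, 10) scores 40 ln 2.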


rootLogLikelihoodRatio

public static double rootLogLikelihoodRatio(int k11,
                                            int k12,
                                            int k21,
                                            int k22)
Calculate the root log-likelihood ratio for two events. See logLikelihoodRatio(int, int, int, int).

Parameters:
k11 - The number of times the two events occurred together
k12 - The number of times the second event occurred WITHOUT the first event
k21 - The number of times the first event occurred WITHOUT the second event
k22 - The number of times something else occurred (i.e. was neither of these events)
Returns:
The root log-likelihood ratio

See discussion of raw vs. root LLR at http://www.lucidimagination.com/search/document/6dc8709e65a7ced1/llr_scoring_question
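
A root LLR is commonly formed by taking the square root of the raw LLR and attaching a sign that shows the direction of the association. The sketch below assumes that convention, which this page does not spell out: the result is negated when A is less frequent alongside B than without B, that is, when k11 / (k11 + k12) < k21 / (k21 + k22). The class name RootLlrSketch and its helper methods are hypothetical.

```java
final class RootLlrSketch {
    private RootLlrSketch() {}

    /** Unnormalized entropy of counts: -sum_i x_i * log(x_i / N); zero counts contribute 0. */
    static double entropy(long... counts) {
        long total = 0;
        for (long x : counts) {
            total += x;
        }
        double result = 0.0;
        for (long x : counts) {
            if (x > 0) {
                result -= x * Math.log((double) x / total);
            }
        }
        return result;
    }

    /** Raw LLR in G-test form, clamped at zero against round-off. */
    static double logLikelihoodRatio(int k11, int k12, int k21, int k22) {
        double rowEntropy = entropy((long) k11 + k12, (long) k21 + k22);
        double columnEntropy = entropy((long) k11 + k21, (long) k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    /** Signed square root of the raw LLR (sign convention is an assumption). */
    static double rootLogLikelihoodRatio(int k11, int k12, int k21, int k22) {
        double root = Math.sqrt(logLikelihoodRatio(k11, k12, k21, k22));
        double rateWithB = (double) k11 / ((long) k11 + k12);    // share of A among B
        double rateWithoutB = (double) k21 / ((long) k21 + k22); // share of A among not-B
        // An empty row gives NaN here; the comparison is then false and the sign stays positive.
        return rateWithB < rateWithoutB ? -root : root;
    }
}
```

Under this convention, (10, 0, 0, 10) yields a positive score and (0, 10, 10, 0) yields the same magnitude with a negative sign, so positively and negatively associated pairs can be told apart.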



Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.