org.apache.mahout.cf.taste.eval
Interface RecommenderEvaluator

All Known Implementing Classes:
AverageAbsoluteDifferenceRecommenderEvaluator, RMSRecommenderEvaluator

public interface RecommenderEvaluator

Implementations of this interface evaluate the quality of a Recommender's recommendations.


Method Summary
 double evaluate(RecommenderBuilder recommenderBuilder, DataModelBuilder dataModelBuilder, DataModel dataModel, double trainingPercentage, double evaluationPercentage)
           Evaluates the quality of a Recommender's recommendations.
 float getMaxPreference()
           
 float getMinPreference()
           
 void setMaxPreference(float maxPreference)
          Sets the maximum preference value that is possible in the current problem domain being evaluated.
 void setMinPreference(float minPreference)
          Sets the minimum preference value that is possible in the current problem domain being evaluated.
 

Method Detail

evaluate

double evaluate(RecommenderBuilder recommenderBuilder,
                DataModelBuilder dataModelBuilder,
                DataModel dataModel,
                double trainingPercentage,
                double evaluationPercentage)
                throws TasteException

Evaluates the quality of a Recommender's recommendations. The range of values that may be returned depends on the implementation, but lower values must mean better recommendations, with 0 being the lowest / best possible evaluation, meaning a perfect match. This method does not accept a Recommender directly, but rather a RecommenderBuilder which can build the Recommender to test on top of a given DataModel.
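
As an illustration of a score where lower is better and 0 is a perfect match, here is a minimal sketch of an average-absolute-difference metric. The class and method names are hypothetical, and this is not Mahout's implementation; it only shows the kind of value evaluate may return.

```java
/** Hypothetical illustration only; not part of Mahout. */
final class AverageAbsoluteDifferenceSketch {

    /**
     * Returns the mean of |estimated - actual| over paired values.
     * The result is never negative, and 0.0 means every estimate matched.
     */
    static double score(double[] estimated, double[] actual) {
        if (estimated.length != actual.length || estimated.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double total = 0.0;
        for (int i = 0; i < estimated.length; i++) {
            total += Math.abs(estimated[i] - actual[i]);
        }
        return total / estimated.length;
    }

    public static void main(String[] args) {
        double[] actual = {4.0, 3.0, 5.0};
        // Identical estimates: a perfect match.
        System.out.println(score(new double[] {4.0, 3.0, 5.0}, actual));
        // Two of three estimates are off by one star: a worse (higher) score.
        System.out.println(score(new double[] {3.0, 3.0, 4.0}, actual));
    }
}
```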

Implementations take a certain percentage of the preferences supplied by the given DataModel as "training data". This is typically most of the data, such as 90%. The training data is used to produce recommendations, and the rest of the data is compared against estimated preference values to see how closely the Recommender's predicted preferences match the user's real preferences. Specifically, for each user, this percentage of the user's ratings is used to produce recommendations, and the user's remaining preferences are compared against the Recommender's estimates for them.

For large datasets, it may be desirable to only evaluate based on a small percentage of the data. evaluationPercentage controls how many of the DataModel's users are used in evaluation.

To be clear, trainingPercentage and evaluationPercentage are not related. They do not need to add up to 1.0, for example.
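
The independence of the two parameters can be sketched as follows. This is a hypothetical, simplified model (plain maps of user ID to preference values, random selection, and users that are not selected being left out entirely are all assumptions), not the code used by the implementing classes. evaluationPercentage decides which users take part; trainingPercentage decides how each selected user's preferences are divided.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical illustration only; not part of Mahout. */
final class SplitSketch {

    /** Training and held-out preferences for the users selected for evaluation. */
    static final class Split {
        final Map<Long, List<Float>> training = new LinkedHashMap<>();
        final Map<Long, List<Float>> test = new LinkedHashMap<>();
    }

    static Split split(Map<Long, List<Float>> prefsByUser,
                       double trainingPercentage,
                       double evaluationPercentage,
                       Random random) {
        Split result = new Split();
        for (Map.Entry<Long, List<Float>> entry : prefsByUser.entrySet()) {
            // evaluationPercentage: does this user take part at all?
            if (random.nextDouble() >= evaluationPercentage) {
                continue;
            }
            List<Float> train = new ArrayList<>();
            List<Float> held = new ArrayList<>();
            for (Float pref : entry.getValue()) {
                // trainingPercentage: where does each of this user's preferences go?
                if (random.nextDouble() < trainingPercentage) {
                    train.add(pref);
                } else {
                    held.add(pref);
                }
            }
            result.training.put(entry.getKey(), train);
            result.test.put(entry.getKey(), held);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, List<Float>> data = new LinkedHashMap<>();
        data.put(1L, Arrays.asList(1f, 2f, 3f, 4f));
        data.put(2L, Arrays.asList(5f, 4f, 3f));
        // 90% of each selected user's preferences for training, half of the users evaluated.
        Split s = split(data, 0.9, 0.5, new Random(42));
        System.out.println("users evaluated: " + s.training.size());
    }
}
```

Under this model, a call with trainingPercentage 0.9 and evaluationPercentage 0.5 is as valid as one with 0.9 and 1.0, because the two values act on different things.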

Parameters:
recommenderBuilder - object that can build a Recommender to test
dataModelBuilder - DataModelBuilder to use; if null, a default DataModel implementation will be used
dataModel - dataset to test on
trainingPercentage - percentage of each user's preferences to use to produce recommendations; the rest are compared to estimated preference values to evaluate Recommender performance
evaluationPercentage - percentage of users to use in evaluation
Returns:
a "score" representing how well the Recommender's estimated preferences match real values; lower scores mean a better match and 0 is a perfect match
Throws:
TasteException - if an error occurs while accessing the DataModel

getMaxPreference

float getMaxPreference()

setMaxPreference

void setMaxPreference(float maxPreference)
Sets the maximum preference value that is possible in the current problem domain being evaluated. For example, if the domain is movie ratings on a scale of 1 to 5, this should be set to 5. While a Recommender may estimate a preference value above 5.0, it isn't "fair" to consider that the system is actually suggesting an impossible rating of, say, 5.4 stars. In practice the application would cap this estimate to 5.0. Since RecommenderEvaluators evaluate the difference between estimated and actual values, this at least prevents this effect from unfairly penalizing a Recommender.

See Also:
setMinPreference(float)

getMinPreference

float getMinPreference()

setMinPreference

void setMinPreference(float minPreference)
Sets the minimum preference value that is possible in the current problem domain being evaluated.
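
The effect of the two bounds can be sketched as a simple clamp applied to an estimate before it is compared with the actual value. The class and method below are hypothetical and only illustrate the idea with the 1-to-5 movie rating example; they are not taken from the implementing classes.

```java
/** Hypothetical illustration only; not part of Mahout. */
final class PreferenceCapSketch {

    /** Clamps an estimated preference into [minPreference, maxPreference]. */
    static float cap(float estimate, float minPreference, float maxPreference) {
        if (estimate > maxPreference) {
            return maxPreference;
        }
        if (estimate < minPreference) {
            return minPreference;
        }
        return estimate;
    }

    public static void main(String[] args) {
        float actual = 5.0f;
        float rawEstimate = 5.4f; // an impossible rating on a 1-to-5 scale
        float capped = cap(rawEstimate, 1.0f, 5.0f);
        // Without the cap the Recommender is penalized for the overshoot;
        // with it, the estimate counts as the 5.0 the application would show.
        System.out.println("uncapped error: " + Math.abs(rawEstimate - actual));
        System.out.println("capped error:   " + Math.abs(capped - actual));
    }
}
```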



Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.