org.apache.mahout.cf.taste.hadoop.item
Class RecommenderJob
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.mahout.common.AbstractJob
org.apache.mahout.cf.taste.hadoop.item.RecommenderJob
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
public final class RecommenderJob
extends AbstractJob
Runs a completely distributed recommender job as a series of mapreduces.
Preferences in the input file should look like userID,itemID[,preferencevalue]
Preference value is optional to accommodate applications that have no notion of a preference value (that is, the user
simply expresses a preference for an item, but no degree of preference).
The preference value is assumed to be parseable as a double. The user IDs and item IDs are parsed as longs.
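For illustration only (the user IDs, item IDs, and values below are hypothetical, not taken from this documentation), a preference file might contain lines such as:

123,456,3.5
123,789,1.0
234,456

The last line omits the preference value, which is permitted when the application records only that the user expressed a preference for the item.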
Command line arguments specific to this class are:
- -Dmapred.input.dir=(path): Directory containing one or more text files with the preference data
- -Dmapred.output.dir=(path): output path where recommender output should go
- --similarityClassname (classname): Name of distributed similarity class to instantiate, or a predefined similarity from SimilarityType
- --usersFile (path): only compute recommendations for user IDs contained in this file (optional)
- --itemsFile (path): only include item IDs from this file in the recommendations (optional)
- --filterFile (path): file containing comma-separated userID,itemID pairs; used to exclude that item from the recommendations for that user (optional)
- --numRecommendations (integer): Number of recommendations to compute per user (10)
- --booleanData (boolean): Treat input data as having no pref values (false)
- --maxPrefsPerUser (integer): Maximum number of preferences considered per user in the final recommendation phase (10)
- --maxSimilaritiesPerItem (integer): Maximum number of similarities considered per item (100)
- --maxCooccurrencesPerItem (integer): Maximum number of cooccurrences considered per item (100)
General command line options are documented in AbstractJob.
Note that because of how Hadoop parses arguments, all "-D" arguments must appear before all other
arguments.
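As a non-authoritative sketch, the job can also be launched programmatically through Hadoop's ToolRunner, which applies the -D options to the job Configuration before handing the remaining arguments to run(String[]). The launcher class, paths, and option values below are placeholders, not part of this API:

import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;

public class RecommenderJobLauncher {
  public static void main(String[] args) throws Exception {
    // Arguments mirror the command line options documented above;
    // the input/output paths and similarity name are placeholders.
    String[] jobArgs = {
        "-Dmapred.input.dir=/path/to/preferences",
        "-Dmapred.output.dir=/path/to/output",
        "--similarityClassname", "SIMILARITY_COOCCURRENCE",
        "--numRecommendations", "10"
    };
    // ToolRunner parses the -D options into the Configuration and then
    // invokes RecommenderJob.run(String[]) with the remaining arguments.
    int exitCode = ToolRunner.run(new RecommenderJob(), jobArgs);
    System.exit(exitCode);
  }
}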
Method Summary |
static void main(java.lang.String[] args) |
int run(java.lang.String[] args) |
Methods inherited from class org.apache.mahout.common.AbstractJob |
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, getInputPath, getOption, getOutputPath, hasOption, keyFor, maybePut, parseArguments, parseDirectories, prepareJob, shouldRunNextPhase |
Methods inherited from class org.apache.hadoop.conf.Configured |
getConf, setConf |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.hadoop.conf.Configurable |
getConf, setConf |
BOOLEAN_DATA
public static final java.lang.String BOOLEAN_DATA
- See Also:
- Constant Field Values
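As a hedged sketch, code around the job can use this constant as a configuration key; the assumption that the job stores the boolean-data flag under this key is not documented on this page:

import org.apache.hadoop.conf.Configuration;
import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;

public final class BooleanDataCheck {
  private BooleanDataCheck() {}

  // Returns whether the configuration marks the input as boolean
  // (i.e. preference values absent). That the flag is stored under the
  // BOOLEAN_DATA key is an assumption for illustration only.
  public static boolean isBooleanData(Configuration conf) {
    return conf.getBoolean(RecommenderJob.BOOLEAN_DATA, false);
  }
}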
RecommenderJob
public RecommenderJob()
run
public int run(java.lang.String[] args)
throws java.io.IOException,
java.lang.ClassNotFoundException,
java.lang.InterruptedException
- Throws:
java.io.IOException
java.lang.ClassNotFoundException
java.lang.InterruptedException
main
public static void main(java.lang.String[] args)
throws java.lang.Exception
- Throws:
java.lang.Exception
Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.