org.apache.mahout.cf.taste.hadoop.similarity.item
Class ItemSimilarityJob

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.mahout.common.AbstractJob
          extended by org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public final class ItemSimilarityJob
extends AbstractJob

Distributed precomputation of item-item similarities for item-based collaborative filtering.

Preferences in the input file should look like userID,itemID[,preferencevalue]

Preference value is optional to accommodate applications that have no notion of a preference value (that is, the user simply expresses a preference for an item, but no degree of preference).

The preference value is assumed to be parseable as a double. The user IDs and item IDs are parsed as longs.
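
The input format described above can be sketched as a small parser. This is only an illustration of the userID,itemID[,preferencevalue] layout, not the code the job itself uses; the class PreferenceLineSketch and its ParsedPreference holder are hypothetical names that do not exist in Mahout.

```java
import java.util.OptionalDouble;

// Hypothetical helper for illustration only; it is not part of Mahout.
class PreferenceLineSketch {

    /** One parsed line of the form userID,itemID[,preferencevalue]. */
    static final class ParsedPreference {
        final long userID;
        final long itemID;
        final OptionalDouble value; // empty when the line carries no preference value

        ParsedPreference(long userID, long itemID, OptionalDouble value) {
            this.userID = userID;
            this.itemID = itemID;
            this.value = value;
        }
    }

    /** Parses the IDs as longs and the optional third field as a double. */
    static ParsedPreference parse(String line) {
        String[] tokens = line.trim().split(",");
        if (tokens.length < 2 || tokens.length > 3) {
            throw new IllegalArgumentException(
                "Expected userID,itemID[,preferencevalue]: " + line);
        }
        long userID = Long.parseLong(tokens[0].trim());
        long itemID = Long.parseLong(tokens[1].trim());
        OptionalDouble value = tokens.length == 3
            ? OptionalDouble.of(Double.parseDouble(tokens[2].trim()))
            : OptionalDouble.empty();
        return new ParsedPreference(userID, itemID, value);
    }

    public static void main(String[] args) {
        ParsedPreference rated = parse("42,1001,3.5");
        ParsedPreference unrated = parse("42,1002");
        System.out.println(rated.userID + " " + rated.itemID + " " + rated.value);
        System.out.println(unrated.userID + " " + unrated.itemID + " " + unrated.value);
    }
}
```

A line without a third field yields an empty value here, which corresponds to the case where the user expresses a preference but no degree of preference.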

Command line arguments specific to this class are:

  1. -Dmapred.input.dir=(path): Directory containing one or more text files with the preference data
  2. -Dmapred.output.dir=(path): Output path where the similarity data should be written
  3. --similarityClassname (classname): Name of the distributed similarity class to instantiate, or a predefined similarity from SimilarityType
  4. --maxSimilaritiesPerItem (integer): Maximum number of similarities considered per item (default: 100)
  5. --maxCooccurrencesPerItem (integer): Maximum number of cooccurrences considered per item (default: 100)
  6. --booleanData (boolean): Treat input data as having no preference values (default: false)

General command line options are documented in AbstractJob.

Note that because of how Hadoop parses arguments, all "-D" arguments must appear before all other arguments.
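
The ordering rule can be illustrated by assembling the argument array so that every "-D" token comes first. This is a minimal sketch under stated assumptions: ItemSimilarityArgsSketch and genericOptionsFirst are hypothetical names, the paths are placeholders, and com.example.MySimilarity stands in for whatever similarity class or SimilarityType name is actually used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of Mahout or Hadoop.
class ItemSimilarityArgsSketch {

    /** Moves every "-Dkey=value" token ahead of the rest, keeping relative order. */
    static String[] genericOptionsFirst(List<String> args) {
        List<String> generic = new ArrayList<>();
        List<String> jobSpecific = new ArrayList<>();
        for (String arg : args) {
            if (arg.startsWith("-D")) {
                generic.add(arg);
            } else {
                jobSpecific.add(arg);
            }
        }
        generic.addAll(jobSpecific);
        return generic.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        String[] args = genericOptionsFirst(Arrays.asList(
            "--similarityClassname", "com.example.MySimilarity", // placeholder class name
            "--maxSimilaritiesPerItem", "50",
            "-Dmapred.input.dir=/tmp/prefs",                     // placeholder path
            "-Dmapred.output.dir=/tmp/similarities"));           // placeholder path
        System.out.println(String.join(" ", args));
        // With Mahout and Hadoop on the classpath, the array could then be passed to
        // org.apache.hadoop.util.ToolRunner.run(new ItemSimilarityJob(), args).
    }
}
```

The split relies on each "-D" option being a single key=value token, as in the option list above, so option/value pairs such as --maxSimilaritiesPerItem 50 stay adjacent.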


Constructor Summary
ItemSimilarityJob()
           
 
Method Summary
static void main(java.lang.String[] args)
           
 int run(java.lang.String[] args)
           
 
Methods inherited from class org.apache.mahout.common.AbstractJob
addFlag, addInputOption, addOption, addOption, addOption, addOption, addOutputOption, getInputPath, getOption, getOutputPath, hasOption, keyFor, maybePut, parseArguments, parseDirectories, prepareJob, shouldRunNextPhase
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Constructor Detail

ItemSimilarityJob

public ItemSimilarityJob()

Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

run

public int run(java.lang.String[] args)
        throws java.lang.Exception
Throws:
java.lang.Exception


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.