org.apache.mahout.cf.taste.model
Interface IDMigrator

All Known Implementing Classes:
AbstractIDMigrator, AbstractJDBCIDMigrator, MemoryIDMigrator, MySQLJDBCIDMigrator

public interface IDMigrator

Mahout 0.2 changed the framework to operate only in terms of numeric (long) ID values for users and items. This is not compatible with applications that used other key types, most commonly String. Implementations of this interface provide support for mapping Strings to longs and vice versa, in order to provide a smoother migration path for applications that must still use strings as IDs.

The mapping from strings to 64-bit numeric values is fixed here, to provide a standard implementation that is portable, that is, easily reproducible outside the framework. See toLongID(String).

Because this mapping is deterministically computable, it does not need to be stored. Instead, the job of implementations is to store the reverse mapping, from long back to String. There are infinitely many strings but only a finite number (2^64) of longs, so it is possible for two strings to map to the same value. Implementations do not treat this as an error; they retain only the most recent mapping, overwriting any previous one. The probability of a collision in a 64-bit space is quite small, but not zero. However, in the context of a collaborative filtering problem, the consequence of a collision is minor: at worst, one user might receive another user's recommendations.
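To put a rough number on "quite small", the standard birthday approximation can be evaluated directly. This is only a sketch under a stated assumption: it treats the long IDs as uniformly distributed over all 2^64 values, which is an idealization of MD5 output. The class and method names are hypothetical and not part of Mahout.

```java
// Hypothetical helper, not part of Mahout.
final class CollisionEstimate {

  // Birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * 2^64)),
  // assuming IDs are spread uniformly over all 2^64 long values.
  static double collisionProbability(long distinctIds) {
    double n = (double) distinctIds;
    double exponent = n * (n - 1.0) / (2.0 * Math.pow(2.0, 64));
    // -expm1(-x) computes 1 - exp(-x) accurately for tiny x.
    return -Math.expm1(-exponent);
  }

  public static void main(String[] args) {
    for (long n : new long[] {1_000_000L, 100_000_000L, 1_000_000_000L}) {
      System.out.printf("%,d IDs -> p ~ %.3g%n", n, collisionProbability(n));
    }
  }
}
```

Under that assumption, the chance of any collision among one million distinct string IDs is on the order of 10^-8, and it stays at a few percent even for one billion IDs.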

Since:
0.2

Method Summary
 void initialize(java.lang.Iterable<java.lang.String> stringIDs)
          Make the mapping aware of the given string IDs.
 void storeMapping(long longID, java.lang.String stringID)
          Stores the reverse long-to-String mapping in some kind of backing store.
 long toLongID(java.lang.String stringID)
          Returns the top 8 bytes of the MD5 hash of the given String's UTF-8 encoding as a long.
 java.lang.String toStringID(long longID)
          Returns the string ID most recently associated with the given long ID, or null if none is known.
 

Method Detail

toLongID

long toLongID(java.lang.String stringID)
Returns:
the top 8 bytes of the MD5 hash of the bytes of the given String's UTF-8 encoding as a long.
Throws:
TasteException - if an error occurs while storing the mapping
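
Because the mapping is meant to be reproducible outside the framework, it can be sketched without Mahout on the classpath. The following is a minimal illustration, not the library's own code: the class name Md5IdSketch is hypothetical, and it assumes that "top 8 bytes" means the first 8 bytes of the digest read in big-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-alone sketch; not part of Mahout.
final class Md5IdSketch {

  // Assumption: "top 8 bytes" = first 8 bytes of the MD5 digest, big-endian.
  static long toLongID(String stringID) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
      // ByteBuffer is big-endian by default; getLong() consumes bytes 0..7.
      return ByteBuffer.wrap(digest).getLong();
    } catch (NoSuchAlgorithmException e) {
      // Not expected on a standard JDK, which ships an MD5 provider.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // The same input always yields the same long, so nothing is stored here.
    System.out.println(Long.toHexString(toLongID("user-123")));
  }
}
```

Any system that applies the same steps (UTF-8 encode, MD5, take the leading 8 bytes) should arrive at the same long ID for a given string, provided the byte-order assumption above matches the implementation in use.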

toStringID

java.lang.String toStringID(long longID)
                            throws TasteException
Returns:
the string ID most recently associated with the given long ID, or null if no mapping exists
Throws:
TasteException - if an error occurs while retrieving the mapping

storeMapping

void storeMapping(long longID,
                  java.lang.String stringID)
                  throws TasteException
Stores the reverse long-to-String mapping in some kind of backing store. Note that this must be called directly (or indirectly through initialize(Iterable)) for every String that might be encountered in the application, or else the mapping will not be known.

Parameters:
longID - long ID
stringID - string ID that maps to/from that long ID
Throws:
TasteException - if an error occurs while saving the mapping

initialize

void initialize(java.lang.Iterable<java.lang.String> stringIDs)
                throws TasteException
Make the mapping aware of the given string IDs. This must be called before the implementation is used; otherwise it will not know the reverse long-to-String mappings for those IDs.

Throws:
TasteException - if an error occurs while storing the mappings
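
How the four methods fit together can be sketched with a plain in-memory map. This is an illustration only: InMemoryMigratorSketch is a hypothetical class that mirrors the method names of this interface but does not implement it (so it runs without Mahout), it is not thread-safe, and its hash assumes the first 8 bytes of the MD5 digest read big-endian.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; it mirrors the four method names of
// IDMigrator but does not implement the real interface.
final class InMemoryMigratorSketch {

  // Reverse mapping (long -> String); the forward mapping is recomputed.
  private final Map<Long, String> longToString = new HashMap<>();

  // Assumption: first 8 bytes of the MD5 digest, read big-endian.
  long toLongID(String stringID) {
    try {
      byte[] digest = MessageDigest.getInstance("MD5")
          .digest(stringID.getBytes(StandardCharsets.UTF_8));
      return ByteBuffer.wrap(digest).getLong();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // A later call for the same long ID overwrites the earlier string.
  void storeMapping(long longID, String stringID) {
    longToString.put(longID, stringID);
  }

  // Returns null when no mapping has been stored for longID.
  String toStringID(long longID) {
    return longToString.get(longID);
  }

  // Stores the reverse mapping for every given string ID.
  void initialize(Iterable<String> stringIDs) {
    for (String stringID : stringIDs) {
      storeMapping(toLongID(stringID), stringID);
    }
  }

  public static void main(String[] args) {
    InMemoryMigratorSketch migrator = new InMemoryMigratorSketch();
    migrator.initialize(Arrays.asList("alice", "bob"));
    long id = migrator.toLongID("alice");
    System.out.println(migrator.toStringID(id)); // alice
    // "carol" was never passed to initialize or storeMapping.
    System.out.println(migrator.toStringID(migrator.toLongID("carol")));
  }
}
```

The second lookup illustrates why every string that may be encountered has to be registered first: the forward hash can always be computed, but the reverse lookup only knows what has been stored.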


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.