org.apache.mahout.clustering.minhash
Class LastfmDataConverter

java.lang.Object
  extended by org.apache.mahout.clustering.minhash.LastfmDataConverter

public final class LastfmDataConverter
extends Object


Method Summary
static Map<String,List<Integer>> convertToItemFeatures(String inputFile, org.apache.mahout.clustering.minhash.LastfmDataConverter.Lastfm dataSet)
          Reads the Last.fm dataset and constructs a Map of (item, features).
static void main(String[] args)
          Command-line entry point for the converter.
static boolean writeToSequenceFile(Map<String,List<Integer>> itemFeaturesMap, org.apache.hadoop.fs.Path outputPath)
          Converts each record in the (item, features) map into Mahout vector format and writes it to a SequenceFile for MinHash clustering.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

convertToItemFeatures

public static Map<String,List<Integer>> convertToItemFeatures(String inputFile,
                                                              org.apache.mahout.clustering.minhash.LastfmDataConverter.Lastfm dataSet)
                                                       throws IOException
Reads the Last.fm dataset and constructs a Map of (item, features). For the 360K Users dataset, Item=Artist and Feature=User. For the 1K Users dataset, Item=User and Feature=Artist.

Parameters:
inputFile - Last.fm dataset file on the local file system.
dataSet - Type of dataset: 360K Users or 1K Users.
Returns:
the Map of (item, features) built from the dataset
Throws:
IOException
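
The (item, features) mapping described for convertToItemFeatures can be sketched with the standard library alone. This is a minimal illustration, not Mahout's implementation: the class name ItemFeaturesSketch, the method toItemFeatures, the tab-separated record layout, and the column-index parameters itemCol and featureCol are all assumptions made for the example, as is the choice to give each distinct feature string a sequential integer ID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not part of Mahout.
final class ItemFeaturesSketch {

  private ItemFeaturesSketch() { }

  // Builds a map of item -> feature IDs from tab-separated records.
  // itemCol and featureCol are assumed column positions, chosen by the caller.
  static Map<String, List<Integer>> toItemFeatures(List<String> lines, int itemCol, int featureCol) {
    Map<String, Integer> featureIds = new HashMap<>();
    Map<String, List<Integer>> itemFeatures = new LinkedHashMap<>();
    for (String line : lines) {
      String[] fields = line.split("\t");
      if (fields.length <= Math.max(itemCol, featureCol)) {
        continue; // skip records that lack the needed columns
      }
      String item = fields[itemCol];
      String feature = fields[featureCol];
      // Assign the next sequential ID the first time a feature is seen.
      Integer id = featureIds.get(feature);
      if (id == null) {
        id = featureIds.size();
        featureIds.put(feature, id);
      }
      List<Integer> features = itemFeatures.computeIfAbsent(item, k -> new ArrayList<>());
      if (!features.contains(id)) {
        features.add(id); // keep each feature at most once per item
      }
    }
    return itemFeatures;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("u1\tartistA", "u2\tartistA", "u1\tartistB");
    // Item = artist (column 1), feature = user (column 0).
    System.out.println(toItemFeatures(lines, 1, 0)); // prints {artistA=[0, 1], artistB=[0]}
  }
}
```

Swapping the two column indices gives the other orientation (Item=User, Feature=Artist) from the same records.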

writeToSequenceFile

public static boolean writeToSequenceFile(Map<String,List<Integer>> itemFeaturesMap,
                                          org.apache.hadoop.fs.Path outputPath)
                                   throws IOException
Converts each record in the (item, features) map into Mahout vector format and writes it to a SequenceFile for MinHash clustering.

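Parameters:
itemFeaturesMap - Map of (item, features) to convert.
outputPath - Path of the SequenceFile to write.
Throws: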
IOException
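
The per-record conversion can be illustrated without Hadoop or Mahout on the classpath. The sketch below is a stand-in under stated assumptions: it uses a plain double[] in place of a Mahout vector and assumes each feature ID is stored as one element value, in list order. The class FeatureVectorSketch and the method toVector are hypothetical names, and the step of writing each (item, vector) pair to the SequenceFile at outputPath is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of Mahout.
final class FeatureVectorSketch {

  private FeatureVectorSketch() { }

  // Copies the feature IDs of one item into a numeric array, one ID per element.
  // Assumption: a double[] stands in for the Mahout vector type.
  static double[] toVector(List<Integer> features) {
    double[] vector = new double[features.size()];
    for (int i = 0; i < vector.length; i++) {
      vector[i] = features.get(i);
    }
    return vector;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(toVector(Arrays.asList(3, 7, 42)))); // prints [3.0, 7.0, 42.0]
  }
}
```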

main

public static void main(String[] args)
                 throws Exception
Command-line entry point for the converter.

Throws:
Exception


Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.