java.lang.Object
  org.apache.mahout.fpm.pfpgrowth.PFPGrowth
public final class PFPGrowth
Parallel FP Growth Driver Class. Runs each stage of PFPGrowth as described in the paper http://infolab.stanford.edu/~echang/recsys08-69.pdf
Field Summary | |
---|---|
static String | ENCODING |
static String | F_LIST |
static String | FILE_PATTERN |
static String | FPGROWTH |
static String | FREQUENT_PATTERNS |
static String | G_LIST |
static String | INPUT |
static String | MAX_HEAPSIZE |
static String | MIN_SUPPORT |
static String | NUM_GROUPS |
static String | OUTPUT |
static String | PARALLEL_COUNTING |
static String | PFP_PARAMETERS |
static String | SORTED_OUTPUT |
static String | SPLIT_PATTERN |
static Pattern | SPLITTER |
Method Summary | |
---|---|
static List<Pair<String,Long>> | deserializeList(Parameters params, String key, org.apache.hadoop.conf.Configuration conf) - Generates the fList from its serialized string representation. |
static Map<String,Long> | deserializeMap(Parameters params, String key, org.apache.hadoop.conf.Configuration conf) - Generates the gList map (the group ID mapping of the frequent features) from its serialized representation. |
static List<Pair<String,Long>> | readFList(Parameters params) - Reads the feature frequency list that is built at the end of the parallel counting job. |
static List<Pair<String,TopKStringPatterns>> | readFrequentPattern(Parameters params) - Reads the frequent patterns generated from text. |
static void | runPFPGrowth(Parameters params) |
static void | startAggregating(Parameters params) - Runs the aggregation job, which aggregates the different top K patterns and groups each pattern by the features present in it, and so calculates the final top K frequent patterns for each feature. |
static void | startGroupingItems(Parameters params) - Groups the given features into g groups, as defined by the numGroups parameter in params. |
static void | startParallelCounting(Parameters params) - Counts the frequencies of the various features in parallel using Map/Reduce. |
static void | startParallelFPGrowth(Parameters params) - Runs the Parallel FPGrowth Map/Reduce job to calculate the top K features of group-dependent shards. |
static void | startTransactionSorting(Parameters params) - Runs the transaction sorting Map/Reduce job. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String ENCODING
public static final String F_LIST
public static final String G_LIST
public static final String NUM_GROUPS
public static final String OUTPUT
public static final String MIN_SUPPORT
public static final String MAX_HEAPSIZE
public static final String INPUT
public static final String PFP_PARAMETERS
public static final String FILE_PATTERN
public static final String FPGROWTH
public static final String FREQUENT_PATTERNS
public static final String PARALLEL_COUNTING
public static final String SORTED_OUTPUT
public static final String SPLIT_PATTERN
public static final Pattern SPLITTER
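SPLITTER is a precompiled `java.util.regex.Pattern`, presumably used to break an input transaction line into its items. This page does not show the regular expression itself, so the following is only a minimal sketch: `ASSUMED_SPLITTER` and the `SplitterSketch` class are hypothetical, and the delimiter set (comma, tab, or pipe with optional surrounding whitespace) is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SplitterSketch {
  // Assumption: a stand-in delimiter pattern. The real SPLITTER regex is not shown on this page.
  static final Pattern ASSUMED_SPLITTER = Pattern.compile("\\s*[,\\t|]\\s*");

  /** Splits one transaction line into its items using the assumed pattern. */
  static List<String> items(String line) {
    return Arrays.asList(ASSUMED_SPLITTER.split(line.trim()));
  }

  public static void main(String[] args) {
    System.out.println(items("milk, bread\teggs | butter"));
  }
}
```

Compiling the pattern once into a static field, as the SPLITTER field suggests, avoids recompiling the regex for every input line.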
Method Detail |
---|
public static List<Pair<String,Long>> deserializeList(Parameters params, String key, org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static Map<String,Long> deserializeMap(Parameters params, String key, org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static List<Pair<String,Long>> readFList(Parameters params)
public static List<Pair<String,TopKStringPatterns>> readFrequentPattern(Parameters params) throws IOException
Throws: IOException
public static void runPFPGrowth(Parameters params) throws IOException, InterruptedException, ClassNotFoundException
Parameters: params - should contain the input and output locations as string values; the additional parameters include minSupport (3), maxHeapSize (50), and numGroups (1000)
Throws: IOException, InterruptedException, ClassNotFoundException
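The parameter note for runPFPGrowth can be illustrated with a small, self-contained sketch. It uses a plain `Map<String, String>` as a hypothetical stand-in for Mahout's `Parameters` class, assumes the key strings `"input"`, `"output"`, `"minSupport"`, `"maxHeapSize"`, and `"numGroups"` (the actual values of the INPUT, OUTPUT, and related constants are not shown on this page), and treats the numbers in parentheses as fallback values, which is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class ParamsSketch {
  /**
   * Returns a copy of params with the three optional settings filled in when absent.
   * Assumption: key names and fallback values are illustrative, not taken from Mahout.
   */
  static Map<String, String> withDefaults(Map<String, String> params) {
    if (!params.containsKey("input") || !params.containsKey("output")) {
      throw new IllegalArgumentException("input and output locations are required");
    }
    Map<String, String> filled = new HashMap<>(params);
    filled.putIfAbsent("minSupport", "3");
    filled.putIfAbsent("maxHeapSize", "50");
    filled.putIfAbsent("numGroups", "1000");
    return filled;
  }

  public static void main(String[] args) {
    Map<String, String> params = new HashMap<>();
    params.put("input", "/data/transactions");   // hypothetical path
    params.put("output", "/data/patterns");      // hypothetical path
    System.out.println(withDefaults(params));
  }
}
```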
public static void startAggregating(Parameters params) throws IOException, InterruptedException, ClassNotFoundException
Throws: IOException, InterruptedException, ClassNotFoundException
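The aggregation step, which groups patterns by the features they contain and keeps the top K per feature, can be sketched in memory as follows. This is not Mahout's Map/Reduce implementation: `AggregationSketch` and `FoundPattern` are hypothetical names, and ranking patterns purely by support count is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class AggregationSketch {
  /** Hypothetical stand-in for a frequent pattern: its items plus a support count. */
  static final class FoundPattern {
    final List<String> items;
    final long support;
    FoundPattern(List<String> items, long support) {
      this.items = items;
      this.support = support;
    }
  }

  /** For every feature, keeps the k highest-support patterns that contain it. */
  static Map<String, List<FoundPattern>> topK(List<FoundPattern> patterns, int k) {
    Map<String, PriorityQueue<FoundPattern>> heaps = new HashMap<>();
    for (FoundPattern p : patterns) {
      for (String item : p.items) {
        // Min-heap on support, so the weakest pattern sits at the head.
        PriorityQueue<FoundPattern> heap = heaps.computeIfAbsent(item,
            key -> new PriorityQueue<>(Comparator.comparingLong((FoundPattern q) -> q.support)));
        heap.add(p);
        if (heap.size() > k) {
          heap.poll(); // evict the lowest-support pattern
        }
      }
    }
    Map<String, List<FoundPattern>> result = new HashMap<>();
    for (Map.Entry<String, PriorityQueue<FoundPattern>> e : heaps.entrySet()) {
      List<FoundPattern> best = new ArrayList<>(e.getValue());
      best.sort(Comparator.comparingLong((FoundPattern q) -> q.support).reversed());
      result.put(e.getKey(), best);
    }
    return result;
  }

  public static void main(String[] args) {
    List<FoundPattern> patterns = Arrays.asList(
        new FoundPattern(Arrays.asList("a", "b"), 5),
        new FoundPattern(Arrays.asList("a"), 7),
        new FoundPattern(Arrays.asList("a", "c"), 2));
    for (FoundPattern p : topK(patterns, 2).get("a")) {
      System.out.println(p.items + " support=" + p.support);
    }
  }
}
```

A bounded min-heap per feature keeps memory proportional to K rather than to the number of patterns seen for that feature.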
public static void startGroupingItems(Parameters params) throws IOException
Parameters: params
Throws: IOException
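As a rough illustration of grouping features into numGroups groups, the sketch below assigns group IDs in contiguous blocks over an ordered feature list and returns a feature-to-group map shaped like the gList. The block-wise assignment and the `GroupingSketch` class are assumptions for illustration; this page does not describe the scheme Mahout actually uses.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupingSketch {
  /** Maps each feature to a group id in [0, numGroups), filling groups in contiguous blocks. */
  static Map<String, Long> gList(List<String> features, int numGroups) {
    if (numGroups <= 0) {
      throw new IllegalArgumentException("numGroups must be positive");
    }
    // Ceiling of features.size() / numGroups, clamped to at least 1 for an empty list.
    int perGroup = Math.max(1, (features.size() + numGroups - 1) / numGroups);
    Map<String, Long> groups = new LinkedHashMap<>();
    for (int i = 0; i < features.size(); i++) {
      groups.put(features.get(i), (long) (i / perGroup));
    }
    return groups;
  }

  public static void main(String[] args) {
    System.out.println(gList(Arrays.asList("a", "b", "c", "d", "e"), 2));
  }
}
```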
public static void startParallelCounting(Parameters params) throws IOException, InterruptedException, ClassNotFoundException
Throws: IOException, InterruptedException, ClassNotFoundException
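The idea behind parallel counting can be shown with a single-process sketch that counts feature frequencies with a parallel stream and keeps only features meeting a minimum support, ordered by descending frequency like an fList. It does not use Hadoop, the `CountingSketch` class is hypothetical, and counting each feature at most once per transaction is a choice made here, not something this page states.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class CountingSketch {
  /** Counts features across transactions; keeps those with count >= minSupport, most frequent first. */
  static List<Map.Entry<String, Long>> fList(List<List<String>> transactions, long minSupport) {
    // "Map" phase: emit each distinct feature per transaction. "Reduce" phase: count per feature.
    Map<String, Long> counts = transactions.parallelStream()
        .flatMap(t -> t.stream().distinct())
        .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    List<Map.Entry<String, Long>> result = new ArrayList<>();
    for (Map.Entry<String, Long> e : counts.entrySet()) {
      if (e.getValue() >= minSupport) {
        result.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue()));
      }
    }
    // Stable sort: ties keep the alphabetical order inherited from the TreeMap.
    result.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
    return result;
  }

  public static void main(String[] args) {
    List<List<String>> transactions = Arrays.asList(
        Arrays.asList("a", "b"),
        Arrays.asList("a", "c"),
        Arrays.asList("a", "b", "d"));
    System.out.println(fList(transactions, 2));
  }
}
```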
public static void startTransactionSorting(Parameters params) throws IOException, InterruptedException, ClassNotFoundException
Throws: IOException, InterruptedException, ClassNotFoundException
public static void startParallelFPGrowth(Parameters params) throws IOException, InterruptedException, ClassNotFoundException
Throws: IOException, InterruptedException, ClassNotFoundException