A C D E F G H I J K L M N O P R S T V W

A

addDateFormat(Integer, DateFormat) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
addDateFormat(Integer, DateFormat) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
addLabel(String, Integer) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
addLabel(String, Integer) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
addNominal(String, String, int) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
addNominal(String, String, int) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
addType(Integer, ARFFType) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
addType(Integer, ARFFType) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
ANALYZER_CLASS - Static variable in class org.apache.mahout.utils.vectors.text.DocumentProcessor
 
ARFF_COMMENT - Static variable in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
ARFF_SPARSE - Static variable in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
ARFFModel - Interface in org.apache.mahout.utils.vectors.arff
An interface for representing an ARFFModel.
ARFFType - Enum in org.apache.mahout.utils.vectors.arff
 
ARFFVectorIterable - Class in org.apache.mahout.utils.vectors.arff
Read in ARFF (http://www.cs.waikato.ac.nz/~ml/weka/arff.html) and create Vectors

Attribute type handling: Numeric -> as is; Nominal -> ordinal(value), i.e.

ARFFVectorIterable(File, ARFFModel) - Constructor for class org.apache.mahout.utils.vectors.arff.ARFFVectorIterable
 
ARFFVectorIterable(File, Charset, ARFFModel) - Constructor for class org.apache.mahout.utils.vectors.arff.ARFFVectorIterable
 
ARFFVectorIterable(String, ARFFModel) - Constructor for class org.apache.mahout.utils.vectors.arff.ARFFVectorIterable
 
ARFFVectorIterable(Reader, ARFFModel) - Constructor for class org.apache.mahout.utils.vectors.arff.ARFFVectorIterable
 
ATTRIBUTE - Static variable in interface org.apache.mahout.utils.vectors.arff.ARFFModel
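
The ARFFVectorIterable entry above notes that numeric attributes are taken as is and nominal attributes become an ordinal. The following is a minimal sketch of that idea, not Mahout's implementation. The class NominalEncoder and its methods are hypothetical, and assigning ordinals in order of first appearance is an assumption made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Mahout API.
class NominalEncoder {
  // Remembers which ordinal was handed out for each nominal label.
  private final Map<String, Integer> ordinals = new LinkedHashMap<>();

  // Returns the ordinal for a label, assigning the next free one on first sight
  // (assumption: first-appearance order).
  int ordinalOf(String label) {
    Integer existing = ordinals.get(label);
    if (existing != null) {
      return existing;
    }
    int next = ordinals.size();
    ordinals.put(label, next);
    return next;
  }

  // Numeric attributes are parsed as is; anything else is treated as nominal.
  double encode(String raw, boolean numeric) {
    String value = raw.trim();
    return numeric ? Double.parseDouble(value) : ordinalOf(value);
  }

  public static void main(String[] args) {
    NominalEncoder enc = new NominalEncoder();
    System.out.println(enc.encode("3.5", true));    // numeric, unchanged
    System.out.println(enc.encode("sunny", false)); // first nominal label
    System.out.println(enc.encode("rainy", false)); // second nominal label
    System.out.println(enc.encode("sunny", false)); // same ordinal as before
  }
}
```

A repeated label always maps back to the ordinal it received the first time, so the resulting vector entries are stable across rows.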
 

C

CachedTermInfo - Class in org.apache.mahout.utils.vectors.lucene
Caches TermEntries from a single field.
CachedTermInfo(IndexReader, String, int, int) - Constructor for class org.apache.mahout.utils.vectors.lucene.CachedTermInfo
 
calculate(int, int, int, int) - Method in class org.apache.mahout.utils.vectors.TF
 
calculate(int, int, int, int) - Method in class org.apache.mahout.utils.vectors.TFIDF
 
calculate(int, int, int, int) - Method in interface org.apache.mahout.utils.vectors.Weight
Experimental
CHARSET - Static variable in class org.apache.mahout.utils.vectors.text.DocumentProcessor
 
cloneBenchmark() - Method in class org.apache.mahout.benchmark.VectorBenchmarks
 
close() - Method in class org.apache.mahout.utils.vectors.io.JWriterTermInfoWriter
Does NOT close the underlying writer
close() - Method in class org.apache.mahout.utils.vectors.io.JWriterVectorWriter
 
close() - Method in class org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter
 
close() - Method in interface org.apache.mahout.utils.vectors.io.TermInfoWriter
 
close() - Method in interface org.apache.mahout.utils.vectors.io.VectorWriter
Close any internally held resources.
ClusterDumper - Class in org.apache.mahout.utils.clustering
 
ClusterDumper(String, String) - Constructor for class org.apache.mahout.utils.clustering.ClusterDumper
 
ClusterLabels - Class in org.apache.mahout.utils.vectors.lucene
Get labels for the cluster using Log Likelihood Ratio (LLR).
ClusterLabels(String, String, String, String, int, int) - Constructor for class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
CollocCombiner - Class in org.apache.mahout.utils.nlp.collocations.llr
Combiner for pass1 of the CollocationDriver.
CollocCombiner() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner
 
CollocDriver - Class in org.apache.mahout.utils.nlp.collocations.llr
Driver for LLR Collocation discovery mapreduce job
CollocMapper - Class in org.apache.mahout.utils.nlp.collocations.llr
Pass 1 of the Collocation discovery job, which generates ngrams and emits ngrams and their component n-1grams.
CollocMapper() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper
 
CollocMapper.Count - Enum in org.apache.mahout.utils.nlp.collocations.llr
 
CollocMapper.IteratorTokenStream - Class in org.apache.mahout.utils.nlp.collocations.llr
Used to emit tokens from an input string array in the style of TokenStream
CollocMapper.IteratorTokenStream(Iterator<String>) - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper.IteratorTokenStream
 
CollocReducer - Class in org.apache.mahout.utils.nlp.collocations.llr
Reducer for Pass 1 of the collocation identification job.
CollocReducer() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
 
CollocReducer.Skipped - Enum in org.apache.mahout.utils.nlp.collocations.llr
 
compare(WritableComparable, WritableComparable) - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKeyGroupComparator
 
computeNGramsPruneByLLR(long, String, boolean, float, int) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
pass2: perform the LLR calculation
configure(JobConf) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper
 
configure(JobConf) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
 
configure(JobConf) - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKeyPartitioner
 
configure(JobConf) - Method in class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer
 
configure(JobConf) - Method in class org.apache.mahout.utils.vectors.common.PartialVectorMergeReducer
 
configure(JobConf) - Method in class org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
 
configure(JobConf) - Method in class org.apache.mahout.utils.vectors.text.term.TermCountReducer
 
configure(JobConf) - Method in class org.apache.mahout.utils.vectors.text.term.TFPartialVectorReducer
 
configure(JobConf) - Method in class org.apache.mahout.utils.vectors.tfidf.TFIDFPartialVectorReducer
 
createBenchmark() - Method in class org.apache.mahout.benchmark.VectorBenchmarks
 
createTermFrequencyVectors(String, String, int, int, float, int, int, boolean) - Static method in class org.apache.mahout.utils.vectors.text.DictionaryVectorizer
Create Term Frequency (Tf) Vectors from the input set of documents in SequenceFile format.

D

DATA - Static variable in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
decodeType(byte[], int) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
DEFAULT_DATE_FORMAT - Static variable in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
DEFAULT_EMIT_UNIGRAMS - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
DEFAULT_MAX_LABELS - Static variable in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
DEFAULT_MAX_NGRAM_SIZE - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
DEFAULT_MAX_SHINGLE_SIZE - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper
 
DEFAULT_MIN_IDS - Static variable in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
DEFAULT_MIN_LLR - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer
 
DEFAULT_MIN_SUPPORT - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
 
DEFAULT_MIN_SUPPORT - Static variable in class org.apache.mahout.utils.vectors.text.DictionaryVectorizer
 
DEFAULT_OUTPUT_DIRECTORY - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
DEFAULT_PASS1_NUM_REDUCE_TASKS - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
DictionaryVectorizer - Class in org.apache.mahout.utils.vectors.text
This class converts a set of input documents in the sequence file format to vectors.
DIMENSION - Static variable in class org.apache.mahout.utils.vectors.common.PartialVectorMerger
 
distanceMeasureBenchmark(DistanceMeasure) - Method in class org.apache.mahout.benchmark.VectorBenchmarks
 
docFreq - Variable in class org.apache.mahout.utils.vectors.TermEntry
 
DOCUMENT_VECTOR_OUTPUT_FOLDER - Static variable in class org.apache.mahout.utils.vectors.text.DictionaryVectorizer
 
DocumentProcessor - Class in org.apache.mahout.utils.vectors.text
This class converts a set of input documents in the sequence file format of StringTuples. The SequenceFile input should have a Text key containing the unique document identifier and a Text value containing the whole document.
dotBenchmark() - Method in class org.apache.mahout.benchmark.VectorBenchmarks
 
Driver - Class in org.apache.mahout.utils.vectors.arff
 
Driver - Class in org.apache.mahout.utils.vectors.lucene
 

E

EMIT_UNIGRAMS - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
encodeType(Gram.Type, byte[], int) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 

F

FEATURE_COUNT - Static variable in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
 

G

generateAllGrams(String, String, int, int, float, int) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
Generate all ngrams for the DictionaryVectorizer job
generateCollocations(String, String, boolean, int, int, int) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
pass1: generate collocations, ngrams
getAllEntries() - Method in class org.apache.mahout.utils.vectors.lucene.CachedTermInfo
 
getAllEntries() - Method in interface org.apache.mahout.utils.vectors.TermInfo
 
getARFFType(Integer) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getARFFType(Integer) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
getBytes() - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
getBytes() - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
getClusterIdToPoints() - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
getClusterLabels(String, List<String>) - Method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
Get the list of labels, sorted by best score.
getDateFormat(Integer) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getDateFormat(Integer) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
getDateMap() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
Map of Date formatters used
getFrequency() - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
getIdField() - Method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
getIndicator() - Method in enum org.apache.mahout.utils.vectors.arff.ARFFType
 
getLabel(String) - Method in enum org.apache.mahout.utils.vectors.arff.ARFFType
 
getLabelBindings() - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
The vector attributes (labels in Mahout speak)
getLabelBindings() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
The vector attributes (labels in Mahout speak), unmodifiable
getLabelIndex(String) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getLabelIndex(String) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
getLabels() - Method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
getLabelSize() - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getLabelSize() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
getLength() - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
getLength() - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
getModel() - Method in class org.apache.mahout.utils.vectors.arff.ARFFVectorIterable
Returns info about the ARFF content that was parsed.
getNominalMap() - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getNominalMap() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
Map nominals to ids.
getNominalValue(String, String) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getNominalValue(String, String) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
getNumTopFeatures() - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
getOutput() - Method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
getOutputFile() - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
getPartition(GramKey, Gram, int) - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKeyPartitioner
 
getPath(String, int) - Static method in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
 
getPrimaryLength() - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
getPrimaryString() - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
getRelation() - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getRelation() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
getString() - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
getSubString() - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
getTermDictionary() - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
getTermEntry(String, String) - Method in class org.apache.mahout.utils.vectors.lucene.CachedTermInfo
 
getTermEntry(String, String) - Method in interface org.apache.mahout.utils.vectors.TermInfo
 
getType() - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
getType() - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
getTypeMap() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
The map of types encountered
getValue(String, int) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getValue(String, int) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
Convert a piece of String data at a specific spot into a value
getVector() - Method in class org.apache.mahout.utils.vectors.lucene.TFDFMapper
 
getVector() - Method in class org.apache.mahout.utils.vectors.lucene.VectorMapper
Can be called after the TermVector has been mapped
getWordCount() - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
The count of the number of words seen
getWordCount() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
The count of the number of words seen
getWords() - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
getWords() - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
Immutable map of words to the long id used for those words
getWriter() - Method in class org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter
 
Gram - Class in org.apache.mahout.utils.nlp.collocations.llr
Writable for holding data generated from the collocation discovery jobs.
Gram() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
Gram(Gram) - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.Gram
Copy constructor
Gram(String, Gram.Type) - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.Gram
Create a gram with a frequency of 1
Gram(String, int, Gram.Type) - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.Gram
Create a gram with the specified frequency.
Gram.Type - Enum in org.apache.mahout.utils.nlp.collocations.llr
 
GramKey - Class in org.apache.mahout.utils.nlp.collocations.llr
A GramKey, based on the identity fields of Gram (type, string) plus a byte[] used for secondary ordering
GramKey() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
GramKey(Gram, byte[]) - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.GramKey
create a GramKey based on the specified Gram and order
GramKeyGroupComparator - Class in org.apache.mahout.utils.nlp.collocations.llr
Group GramKeys based on their Gram, ignoring the secondary sort key, so that all keys with the same Gram are sent to the same call of the reduce method, sorted in natural order (for GramKeys).
GramKeyGroupComparator() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.GramKeyGroupComparator
 
GramKeyPartitioner - Class in org.apache.mahout.utils.nlp.collocations.llr
Partition GramKeys based on their Gram, ignoring the secondary sort key so that all GramKeys with the same gram are sent to the same partition.
GramKeyPartitioner() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.GramKeyPartitioner
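
The GramKeyGroupComparator and GramKeyPartitioner entries above describe partitioning and grouping on the gram alone while ignoring the secondary sort key. The sketch below illustrates that general secondary-sort pattern with plain byte arrays. It is an assumption-laden illustration, not the Mahout code: CompositeKeyDemo, partitionFor and groupCompare are hypothetical names, and the hash function is chosen only for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo, not part of the Mahout API.
class CompositeKeyDemo {
  // Partition on the primary part only; the secondary bytes are deliberately
  // ignored so keys that differ only in the secondary part share a partition.
  static int partitionFor(byte[] primary, byte[] secondary, int numPartitions) {
    int hash = Arrays.hashCode(primary);
    return (hash & Integer.MAX_VALUE) % numPartitions;
  }

  // Grouping comparison: unsigned lexicographic order over the primary part,
  // so all keys with the same primary part compare as equal.
  static int groupCompare(byte[] primaryA, byte[] primaryB) {
    int n = Math.min(primaryA.length, primaryB.length);
    for (int i = 0; i < n; i++) {
      int diff = (primaryA[i] & 0xff) - (primaryB[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return primaryA.length - primaryB.length;
  }

  public static void main(String[] args) {
    byte[] gram = "new york".getBytes(StandardCharsets.UTF_8);
    byte[] orderA = {0};
    byte[] orderB = {1};
    // Both calls print the same partition number.
    System.out.println(partitionFor(gram, orderA, 4));
    System.out.println(partitionFor(gram, orderB, 4));
    System.out.println(groupCompare(gram, gram)); // prints 0
  }
}
```

Keeping the secondary bytes out of both the partition function and the grouping comparison is what lets a single reduce call see every value for one gram, while the full key still controls the order in which those values arrive.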
 

H

hasNext() - Method in class org.apache.mahout.utils.vectors.SequenceFileVectorIterable.SeqFileIterator
 

I

incrementalCreateBenchmark() - Method in class org.apache.mahout.benchmark.VectorBenchmarks
 
incrementFrequency(int) - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
incrementToken() - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper.IteratorTokenStream
 
isIgnoringOffsets() - Method in class org.apache.mahout.utils.vectors.lucene.TFDFMapper
 
isIgnoringPositions() - Method in class org.apache.mahout.utils.vectors.lucene.TFDFMapper
 
iterator() - Method in class org.apache.mahout.utils.vectors.arff.ARFFVectorIterable
 
iterator() - Method in class org.apache.mahout.utils.vectors.lucene.LuceneIterable
 
iterator() - Method in class org.apache.mahout.utils.vectors.SequenceFileVectorIterable
 

J

JWriterTermInfoWriter - Class in org.apache.mahout.utils.vectors.io
Write the TermInfo out to a Writer
JWriterTermInfoWriter(Writer, String, String) - Constructor for class org.apache.mahout.utils.vectors.io.JWriterTermInfoWriter
 
JWriterVectorWriter - Class in org.apache.mahout.utils.vectors.io
 
JWriterVectorWriter(Writer) - Constructor for class org.apache.mahout.utils.vectors.io.JWriterVectorWriter
 

K

key() - Method in class org.apache.mahout.utils.vectors.SequenceFileVectorIterable.SeqFileIterator
Only valid when SequenceFileVectorIterable.SeqFileIterator.next() is also valid

L

LDAPrintTopics - Class in org.apache.mahout.clustering.lda
Class to print out the top K words for each topic.
LLRReducer - Class in org.apache.mahout.utils.nlp.collocations.llr
Reducer for pass 2 of the collocation discovery job.
LLRReducer() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer
 
LLRReducer.ConcreteLLCallback - Class in org.apache.mahout.utils.nlp.collocations.llr
concrete implementation delegates to LogLikelihood class
LLRReducer.ConcreteLLCallback() - Constructor for class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer.ConcreteLLCallback
 
LLRReducer.LLCallback - Interface in org.apache.mahout.utils.nlp.collocations.llr
provide interface so the input to the llr calculation can be captured for validation in unit testing
LLRReducer.Skipped - Enum in org.apache.mahout.utils.nlp.collocations.llr
Counter to track why a particular entry was skipped
loadTermDictionary(File) - Static method in class org.apache.mahout.utils.vectors.VectorHelper
Read in a dictionary file.
loadTermDictionary(Configuration, FileSystem, String) - Static method in class org.apache.mahout.utils.vectors.VectorHelper
Read a dictionary in SequenceFile generated by DictionaryVectorizer
loadTermDictionary(InputStream) - Static method in class org.apache.mahout.utils.vectors.VectorHelper
Read in a dictionary file.
logLikelihoodRatio(int, int, int, int) - Method in class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer.ConcreteLLCallback
 
logLikelihoodRatio(int, int, int, int) - Method in interface org.apache.mahout.utils.nlp.collocations.llr.LLRReducer.LLCallback
 
LuceneIterable - Class in org.apache.mahout.utils.vectors.lucene
A LuceneIterable is an Iterable<Vector> that uses a Lucene index as the source for creating the Vector.
LuceneIterable(IndexReader, String, String, VectorMapper) - Constructor for class org.apache.mahout.utils.vectors.lucene.LuceneIterable
 
LuceneIterable(IndexReader, String, String, VectorMapper, double) - Constructor for class org.apache.mahout.utils.vectors.lucene.LuceneIterable
Produce a LuceneIterable that can create the Vector plus normalize it.
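
The LuceneIterable constructor above takes an extra double and is described as normalizing the vector. Assuming that value is the power of a p-norm (an assumption suggested by the NORMALIZATION_POWER constant elsewhere in this index, not confirmed here), the normalization step can be sketched as follows. PNormDemo and normalize are hypothetical names and do not belong to Mahout.

```java
// Hypothetical demo, not part of the Mahout API.
class PNormDemo {
  // Divides every component by the p-norm (sum of |x|^p, raised to 1/p).
  // An all-zero vector is returned as zeros to avoid dividing by zero.
  static double[] normalize(double[] v, double power) {
    double sum = 0.0;
    for (double x : v) {
      sum += Math.pow(Math.abs(x), power);
    }
    double norm = Math.pow(sum, 1.0 / power);
    double[] out = new double[v.length];
    if (norm == 0.0) {
      return out;
    }
    for (int i = 0; i < v.length; i++) {
      out[i] = v[i] / norm;
    }
    return out;
  }

  public static void main(String[] args) {
    double[] unit = normalize(new double[] {3.0, 4.0}, 2.0);
    System.out.println(unit[0] + " " + unit[1]);
  }
}
```

With a power of 2 this yields a unit-length vector in the Euclidean sense; with a power of 1 the absolute values of the components sum to 1.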

M

main(String[]) - Static method in class org.apache.mahout.benchmark.VectorBenchmarks
 
main(String[]) - Static method in class org.apache.mahout.clustering.lda.LDAPrintTopics
 
main(String[]) - Static method in class org.apache.mahout.text.SparseVectorsFromSequenceFiles
 
main(String[]) - Static method in class org.apache.mahout.text.TextParagraphSplittingJob
 
main(String[]) - Static method in class org.apache.mahout.utils.clustering.ClusterDumper
 
main(String[]) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
main(String[]) - Static method in class org.apache.mahout.utils.SequenceFileDumper
 
main(String[]) - Static method in class org.apache.mahout.utils.vectors.arff.Driver
 
main(String[]) - Static method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
main(String[]) - Static method in class org.apache.mahout.utils.vectors.lucene.Driver
 
main(String[]) - Static method in class org.apache.mahout.utils.vectors.RowIdJob
 
main(String[]) - Static method in class org.apache.mahout.utils.vectors.VectorDumper
 
map(Text, Text, OutputCollector<Text, Text>, Reporter) - Method in class org.apache.mahout.text.TextParagraphSplittingJob.SplitMap
 
map(Text, StringTuple, OutputCollector<GramKey, Gram>, Reporter) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper
Collocation finder: pass 1 map phase.
map(String, int, TermVectorOffsetInfo[], int[]) - Method in class org.apache.mahout.utils.vectors.lucene.TFDFMapper
 
map(Text, Text, OutputCollector<Text, StringTuple>, Reporter) - Method in class org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
 
map(Text, StringTuple, OutputCollector<Text, LongWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.text.term.TermCountMapper
 
map(WritableComparable<?>, VectorWritable, OutputCollector<IntWritable, LongWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.text.term.TermDocumentCountMapper
 
MapBackedARFFModel - Class in org.apache.mahout.utils.vectors.arff
Holds ARFF information in Map.
MapBackedARFFModel() - Constructor for class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
MapBackedARFFModel(Map<String, Long>, long, Map<String, Map<String, Integer>>) - Constructor for class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
MAX_DF_PERCENTAGE - Static variable in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
 
MAX_NGRAMS - Static variable in class org.apache.mahout.utils.vectors.text.DictionaryVectorizer
 
MAX_SHINGLE_SIZE - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocMapper
 
mergePartialVectors(List<Path>, String, float, int, boolean) - Static method in class org.apache.mahout.utils.vectors.common.PartialVectorMerger
Merge all the partial RandomAccessSparseVectors into the complete Document RandomAccessSparseVector
MIN_DF - Static variable in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
 
MIN_LLR - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer
 
MIN_SUPPORT - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
 
MIN_SUPPORT - Static variable in class org.apache.mahout.utils.vectors.text.DictionaryVectorizer
 

N

next() - Method in class org.apache.mahout.utils.vectors.SequenceFileVectorIterable.SeqFileIterator
 
NGRAM_OUTPUT_DIRECTORY - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
NGRAM_TOTAL - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer
 
NO_NORMALIZING - Static variable in class org.apache.mahout.utils.vectors.common.PartialVectorMerger
 
NO_NORMALIZING - Static variable in class org.apache.mahout.utils.vectors.lucene.LuceneIterable
 
NORMALIZATION_POWER - Static variable in class org.apache.mahout.utils.vectors.common.PartialVectorMerger
 

O

org.apache.mahout.benchmark - package org.apache.mahout.benchmark
 
org.apache.mahout.clustering.lda - package org.apache.mahout.clustering.lda
 
org.apache.mahout.text - package org.apache.mahout.text
 
org.apache.mahout.utils - package org.apache.mahout.utils
 
org.apache.mahout.utils.clustering - package org.apache.mahout.utils.clustering
 
org.apache.mahout.utils.nlp.collocations.llr - package org.apache.mahout.utils.nlp.collocations.llr
 
org.apache.mahout.utils.vectors - package org.apache.mahout.utils.vectors
 
org.apache.mahout.utils.vectors.arff - package org.apache.mahout.utils.vectors.arff
 
org.apache.mahout.utils.vectors.common - package org.apache.mahout.utils.vectors.common
 
org.apache.mahout.utils.vectors.io - package org.apache.mahout.utils.vectors.io
 
org.apache.mahout.utils.vectors.lucene - package org.apache.mahout.utils.vectors.lucene
 
org.apache.mahout.utils.vectors.text - package org.apache.mahout.utils.vectors.text
 
org.apache.mahout.utils.vectors.text.document - package org.apache.mahout.utils.vectors.text.document
 
org.apache.mahout.utils.vectors.text.term - package org.apache.mahout.utils.vectors.text.term
 
org.apache.mahout.utils.vectors.tfidf - package org.apache.mahout.utils.vectors.tfidf
 

P

PartialVectorMerger - Class in org.apache.mahout.utils.vectors.common
This class groups a set of input vectors.
PartialVectorMergeReducer - Class in org.apache.mahout.utils.vectors.common
Merges partial vectors into a full sparse vector
PartialVectorMergeReducer() - Constructor for class org.apache.mahout.utils.vectors.common.PartialVectorMergeReducer
 
printClusters() - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
processDate(String, int) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
processNominal(String, String) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
processNumeric(String) - Static method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
processString(String) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
Process a String
processSubgram(GramKey, Iterator<Gram>, OutputCollector<Gram, Gram>, Reporter) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
Sum frequencies for subgram, ngrams and deliver ngram, subgram pairs to the collector.
processTfIdf(String, String, int, int, int, float, boolean) - Static method in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
Create Term Frequency-Inverse Document Frequency (Tf-Idf) Vectors from the input set of vectors in SequenceFile format.
processUnigram(GramKey, Iterator<Gram>, OutputCollector<Gram, Gram>, Reporter) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
Sum frequencies for unigrams and deliver to the collector
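
The processTfIdf entry above creates Tf-Idf vectors from term-frequency vectors. As a rough illustration of the weighting itself, the sketch below uses the common textbook form tf * ln(numDocs / df). This index does not state which variant Mahout applies, so the formula, the class TfIdfDemo and the parameter names are all assumptions.

```java
// Hypothetical demo, not part of the Mahout API.
class TfIdfDemo {
  // Textbook tf-idf: term frequency times the natural log of
  // (number of documents / document frequency). Returns 0 for empty counts.
  static double tfIdf(int tf, int df, int numDocs) {
    if (tf <= 0 || df <= 0 || numDocs <= 0) {
      return 0.0;
    }
    return tf * Math.log((double) numDocs / df);
  }

  public static void main(String[] args) {
    // A term that occurs in every document carries no weight.
    System.out.println(tfIdf(3, 10, 10));
    // A rare term is weighted up.
    System.out.println(tfIdf(2, 1, 100));
  }
}
```

Production implementations often smooth or dampen this formula (for example by adding 1 to the document frequency), so the exact numbers should not be expected to match any particular library.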

R

readFields(DataInput) - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
readFields(DataInput) - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
reduce(GramKey, Iterator<Gram>, OutputCollector<GramKey, Gram>, Reporter) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner
 
reduce(GramKey, Iterator<Gram>, OutputCollector<Gram, Gram>, Reporter) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocReducer
collocation finder: pass 1 reduce phase:

given input from the mapper,

reduce(Gram, Iterator<Gram>, OutputCollector<Text, DoubleWritable>, Reporter) - Method in class org.apache.mahout.utils.nlp.collocations.llr.LLRReducer
Perform LLR calculation. Input is: k:ngram:ngramFreq, v:(h_|t_)subgram:subgramfreq, N = ngram total. Each ngram will have 2 subgrams, a head and a tail, referred to as A and B respectively below.
reduce(WritableComparable<?>, Iterator<VectorWritable>, OutputCollector<WritableComparable<?>, VectorWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.common.PartialVectorMergeReducer
 
reduce(Text, Iterator<LongWritable>, OutputCollector<Text, LongWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.text.term.TermCountReducer
 
reduce(IntWritable, Iterator<LongWritable>, OutputCollector<IntWritable, LongWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.text.term.TermDocumentCountReducer
 
reduce(Text, Iterator<StringTuple>, OutputCollector<Text, VectorWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.text.term.TFPartialVectorReducer
 
reduce(WritableComparable<?>, Iterator<VectorWritable>, OutputCollector<WritableComparable<?>, VectorWritable>, Reporter) - Method in class org.apache.mahout.utils.vectors.tfidf.TFIDFPartialVectorReducer
 
RELATION - Static variable in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
remove() - Method in class org.apache.mahout.utils.vectors.SequenceFileVectorIterable.SeqFileIterator
 
RowIdJob - Class in org.apache.mahout.utils.vectors
 
RowIdJob() - Constructor for class org.apache.mahout.utils.vectors.RowIdJob
 
run(String[]) - Method in class org.apache.mahout.text.TextParagraphSplittingJob
 
run(String[]) - Method in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
run(String[]) - Method in class org.apache.mahout.utils.vectors.RowIdJob
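
The LLRReducer reduce entry above takes an ngram frequency, the frequencies of its head (A) and tail (B) subgrams, and the ngram total N. One way to turn those four counts into a log-likelihood ratio is the standard 2x2 contingency-table G statistic, sketched below. How the table cells are derived from the counts is an assumption, and LlrDemo with its methods is hypothetical rather than the Mahout implementation.

```java
// Hypothetical demo, not part of the Mahout API.
class LlrDemo {
  // One cell's contribution: k * ln(k * n / (row * col)), taken as 0 when k == 0.
  static double term(long k, long row, long col, long n) {
    if (k == 0) {
      return 0.0;
    }
    return k * Math.log((double) k * n / ((double) row * col));
  }

  // G statistic for a 2x2 table: 2 * sum over cells of observed * ln(observed / expected).
  static double llr(long k11, long k12, long k21, long k22) {
    long n = k11 + k12 + k21 + k22;
    long row1 = k11 + k12;
    long row2 = k21 + k22;
    long col1 = k11 + k21;
    long col2 = k12 + k22;
    double sum = term(k11, row1, col1, n) + term(k12, row1, col2, n)
        + term(k21, row2, col1, n) + term(k22, row2, col2, n);
    return 2.0 * sum;
  }

  // Assumed mapping from the reducer's inputs to table cells:
  // k11 = A and B together, k12 = A without B, k21 = B without A, k22 = neither.
  static double llrFromCounts(long ngramFreq, long headFreq, long tailFreq, long total) {
    long k11 = ngramFreq;
    long k12 = headFreq - ngramFreq;
    long k21 = tailFreq - ngramFreq;
    long k22 = total - (headFreq + tailFreq - ngramFreq);
    return llr(k11, k12, k21, k22);
  }

  public static void main(String[] args) {
    // Head and tail always occur together: strong association.
    System.out.println(llrFromCounts(10, 10, 10, 20));
    // Perfectly independent table: score of 0.
    System.out.println(llr(1, 1, 1, 1));
  }
}
```

A score near zero means the head and tail co-occur about as often as chance predicts; larger scores indicate a stronger collocation, which is what a minimum-LLR threshold would filter on.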
 

S

SequenceFileDumper - Class in org.apache.mahout.utils
 
SequenceFileTokenizerMapper - Class in org.apache.mahout.utils.vectors.text.document
Tokenizes a text document and outputs tokens in a StringTuple
SequenceFileTokenizerMapper() - Constructor for class org.apache.mahout.utils.vectors.text.document.SequenceFileTokenizerMapper
 
SequenceFileVectorIterable - Class in org.apache.mahout.utils.vectors
Reads in a file containing Vectors.
SequenceFileVectorIterable(SequenceFile.Reader) - Constructor for class org.apache.mahout.utils.vectors.SequenceFileVectorIterable
 
SequenceFileVectorIterable(SequenceFile.Reader, boolean) - Constructor for class org.apache.mahout.utils.vectors.SequenceFileVectorIterable
 
SequenceFileVectorIterable.SeqFileIterator - Class in org.apache.mahout.utils.vectors
 
SequenceFileVectorWriter - Class in org.apache.mahout.utils.vectors.io
Closes the writer when done
SequenceFileVectorWriter(SequenceFile.Writer) - Constructor for class org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter
 
SEQUENTIAL_ACCESS - Static variable in class org.apache.mahout.utils.vectors.common.PartialVectorMerger
 
set(Gram, byte[]) - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
set the gram held by this key
setExpectations(String, int, boolean, boolean) - Method in class org.apache.mahout.utils.vectors.lucene.TFDFMapper
 
setFrequency(int) - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
setIdField(String) - Method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
setNumTopFeatures(int) - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
setOffsets(Configuration, int, int) - Static method in class org.apache.mahout.utils.nlp.collocations.llr.GramKeyPartitioner
 
setOutput(String) - Method in class org.apache.mahout.utils.vectors.lucene.ClusterLabels
 
setOutputFile(String) - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
setRelation(String) - Method in interface org.apache.mahout.utils.vectors.arff.ARFFModel
 
setRelation(String) - Method in class org.apache.mahout.utils.vectors.arff.MapBackedARFFModel
 
setSubString(int) - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
setTermDictionary(String, String) - Method in class org.apache.mahout.utils.clustering.ClusterDumper
 
SparseVectorsFromSequenceFiles - Class in org.apache.mahout.text
Converts a given set of sequence files into SparseVectors
SUBGRAM_OUTPUT_DIRECTORY - Static variable in class org.apache.mahout.utils.nlp.collocations.llr.CollocDriver
 
summarize() - Method in class org.apache.mahout.benchmark.VectorBenchmarks
 

T

term - Variable in class org.apache.mahout.utils.vectors.TermEntry
 
TermCountMapper - Class in org.apache.mahout.utils.vectors.text.term
TextVectorizer Term Count Mapper.
TermCountMapper() - Constructor for class org.apache.mahout.utils.vectors.text.term.TermCountMapper
 
TermCountReducer - Class in org.apache.mahout.utils.vectors.text.term
Can also be used as a local Combiner.
TermCountReducer() - Constructor for class org.apache.mahout.utils.vectors.text.term.TermCountReducer
 
TermDocumentCountMapper - Class in org.apache.mahout.utils.vectors.text.term
TextVectorizer Document Frequency Count Mapper.
TermDocumentCountMapper() - Constructor for class org.apache.mahout.utils.vectors.text.term.TermDocumentCountMapper
 
TermDocumentCountReducer - Class in org.apache.mahout.utils.vectors.text.term
Can also be used as a local Combiner.
TermDocumentCountReducer() - Constructor for class org.apache.mahout.utils.vectors.text.term.TermDocumentCountReducer
 
TermEntry - Class in org.apache.mahout.utils.vectors
 
TermEntry(String, int, int) - Constructor for class org.apache.mahout.utils.vectors.TermEntry
 
termIdx - Variable in class org.apache.mahout.utils.vectors.TermEntry
 
TermInfo - Interface in org.apache.mahout.utils.vectors
 
TermInfoWriter - Interface in org.apache.mahout.utils.vectors.io
 
TextParagraphSplittingJob - Class in org.apache.mahout.text
 
TextParagraphSplittingJob() - Constructor for class org.apache.mahout.text.TextParagraphSplittingJob
 
TextParagraphSplittingJob.SplitMap - Class in org.apache.mahout.text
 
TextParagraphSplittingJob.SplitMap() - Constructor for class org.apache.mahout.text.TextParagraphSplittingJob.SplitMap
 
TF - Class in org.apache.mahout.utils.vectors
Weight based on term frequency only
TF() - Constructor for class org.apache.mahout.utils.vectors.TF
 
TFDFMapper - Class in org.apache.mahout.utils.vectors.lucene
Not thread-safe
TFDFMapper(IndexReader, Weight, TermInfo) - Constructor for class org.apache.mahout.utils.vectors.lucene.TFDFMapper
 
TFIDF - Class in org.apache.mahout.utils.vectors
 
TFIDF() - Constructor for class org.apache.mahout.utils.vectors.TFIDF
 
TFIDF(Similarity) - Constructor for class org.apache.mahout.utils.vectors.TFIDF
 
TFIDF_OUTPUT_FOLDER - Static variable in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
 
TFIDFConverter - Class in org.apache.mahout.utils.vectors.tfidf
This class converts a set of input vectors with term frequencies to TfIdf vectors.
TFIDFPartialVectorReducer - Class in org.apache.mahout.utils.vectors.tfidf
Converts a document into a sparse vector
TFIDFPartialVectorReducer() - Constructor for class org.apache.mahout.utils.vectors.tfidf.TFIDFPartialVectorReducer
 
TFPartialVectorReducer - Class in org.apache.mahout.utils.vectors.text.term
Converts a document into a sparse vector
TFPartialVectorReducer() - Constructor for class org.apache.mahout.utils.vectors.text.term.TFPartialVectorReducer
 
TOKENIZED_DOCUMENT_OUTPUT_FOLDER - Static variable in class org.apache.mahout.utils.vectors.text.DocumentProcessor
 
tokenizeDocuments(String, Class<? extends Analyzer>, String) - Static method in class org.apache.mahout.utils.vectors.text.DocumentProcessor
Convert the input documents into token arrays using StringTuple. The input documents must be in SequenceFile format.
topWordsForTopics(String, Configuration, List<String>, int) - Static method in class org.apache.mahout.clustering.lda.LDAPrintTopics
 
toString() - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
toString() - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
totalTerms(String) - Method in class org.apache.mahout.utils.vectors.lucene.CachedTermInfo
 
totalTerms(String) - Method in interface org.apache.mahout.utils.vectors.TermInfo
 

V

valueOf(String) - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.CollocMapper.Count
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.CollocReducer.Skipped
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.Gram.Type
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.LLRReducer.Skipped
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum org.apache.mahout.utils.vectors.arff.ARFFType
Returns the enum constant of this type with the specified name.
values() - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.CollocMapper.Count
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.CollocReducer.Skipped
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.Gram.Type
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.mahout.utils.nlp.collocations.llr.LLRReducer.Skipped
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum org.apache.mahout.utils.vectors.arff.ARFFType
Returns an array containing the constants of this enum type, in the order they are declared.
VECTOR_COUNT - Static variable in class org.apache.mahout.utils.vectors.tfidf.TFIDFConverter
 
VectorBenchmarks - Class in org.apache.mahout.benchmark
 
VectorBenchmarks(int, int, int, int, int) - Constructor for class org.apache.mahout.benchmark.VectorBenchmarks
 
VectorDumper - Class in org.apache.mahout.utils.vectors
Can read in a SequenceFile of Vectors and dump out the results using Vector.asFormatString() to either the console or to a file.
VectorHelper - Class in org.apache.mahout.utils.vectors
 
VectorMapper - Class in org.apache.mahout.utils.vectors.lucene
Not thread-safe
VectorMapper() - Constructor for class org.apache.mahout.utils.vectors.lucene.VectorMapper
 
vectorToString(Vector, String[]) - Static method in class org.apache.mahout.utils.vectors.VectorHelper
Create a String from a vector, filling in each value with the appropriate term from a dictionary in which the ith entry is the term for the ith vector cell.
VectorWriter - Interface in org.apache.mahout.utils.vectors.io
 

W

Weight - Interface in org.apache.mahout.utils.vectors
 
write(DataOutput) - Method in class org.apache.mahout.utils.nlp.collocations.llr.Gram
 
write(DataOutput) - Method in class org.apache.mahout.utils.nlp.collocations.llr.GramKey
 
write(TermInfo) - Method in class org.apache.mahout.utils.vectors.io.JWriterTermInfoWriter
 
write(Iterable<Vector>) - Method in class org.apache.mahout.utils.vectors.io.JWriterVectorWriter
 
write(Iterable<Vector>, long) - Method in class org.apache.mahout.utils.vectors.io.JWriterVectorWriter
 
write(Iterable<Vector>, long) - Method in class org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter
 
write(Iterable<Vector>) - Method in class org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter
 
write(TermInfo) - Method in interface org.apache.mahout.utils.vectors.io.TermInfoWriter
 
write(Iterable<Vector>) - Method in interface org.apache.mahout.utils.vectors.io.VectorWriter
Write all values in the Iterable to the output
write(Iterable<Vector>, long) - Method in interface org.apache.mahout.utils.vectors.io.VectorWriter
Write the first maxDocs to the output.

Copyright © 2008-2010 The Apache Software Foundation. All Rights Reserved.