org.apache.mahout.utils.vectors.lucene
Class LuceneIterator
java.lang.Object
com.google.common.collect.UnmodifiableIterator<T>
com.google.common.collect.AbstractIterator<Vector>
org.apache.mahout.utils.vectors.lucene.LuceneIterator
- All Implemented Interfaces:
- Iterator<Vector>
public final class LuceneIterator
- extends com.google.common.collect.AbstractIterator<Vector>
An Iterator over Vectors that uses a Lucene index as the source for creating the Vectors. The field used to create the vectors currently must have term vectors stored for it.
Constructor Summary
LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
String idField,
String field,
VectorMapper mapper,
double normPower)
Produce a LuceneIterator that can create the Vector and normalize it.
LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
String idField,
String field,
VectorMapper mapper,
double normPower,
double maxPercentErrorDocs)
Methods inherited from class com.google.common.collect.AbstractIterator
endOfData, hasNext, next, peek
Methods inherited from class com.google.common.collect.UnmodifiableIterator
remove
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
LuceneIterator
public LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
String idField,
String field,
VectorMapper mapper,
double normPower)
throws IOException
- Produce a LuceneIterator that can create the Vector and normalize it.
- Parameters:
indexReader - IndexReader to read the documents from.
idField - field containing the id. May be null.
field - field to use for the Vector.
mapper - VectorMapper for creating Vectors from Lucene's TermVectors.
normPower - the normalization value. Must be nonnegative, or LuceneIterable.NO_NORMALIZING.
- Throws:
IOException
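The normPower parameter can be read as the p in a p-norm. The following is a minimal sketch of that idea in plain Java, not Mahout's implementation: the class PNormSketch, its methods, the sentinel value -1.0, and the handling of p = 0 and p = infinity are all assumptions made for illustration.

```java
final class PNormSketch {
    // Hypothetical sentinel standing in for LuceneIterable.NO_NORMALIZING;
    // the real constant's value is not stated on this page.
    static final double NO_NORMALIZING = -1.0;

    /** Returns a copy of values scaled to unit p-norm, or an unscaled copy. */
    static double[] normalize(double[] values, double normPower) {
        double[] result = values.clone();
        if (normPower == NO_NORMALIZING) {
            return result; // caller asked for raw weights
        }
        if (Double.isNaN(normPower) || normPower < 0.0) {
            throw new IllegalArgumentException("normPower must be nonnegative: " + normPower);
        }
        double norm = pNorm(values, normPower);
        if (norm == 0.0) {
            return result; // all-zero vector: nothing to scale
        }
        for (int i = 0; i < result.length; i++) {
            result[i] /= norm;
        }
        return result;
    }

    /** Computes the p-norm; p = 0 counts non-zero entries (an assumed convention). */
    static double pNorm(double[] values, double p) {
        if (Double.isInfinite(p)) {
            double max = 0.0;
            for (double v : values) {
                max = Math.max(max, Math.abs(v));
            }
            return max;
        }
        if (p == 0.0) {
            int nonZero = 0;
            for (double v : values) {
                if (v != 0.0) {
                    nonZero++;
                }
            }
            return nonZero;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.pow(Math.abs(v), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    public static void main(String[] args) {
        // Scale the term weights (3, 4) to unit Euclidean length.
        System.out.println(java.util.Arrays.toString(normalize(new double[] {3.0, 4.0}, 2.0)));
    }
}
```

With normPower = 2 each vector is scaled to unit Euclidean length, and with normPower = 1 the weights sum to one. Passing the sentinel leaves the weights untouched.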
LuceneIterator
public LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
String idField,
String field,
VectorMapper mapper,
double normPower,
double maxPercentErrorDocs)
throws IOException
- Parameters:
maxPercentErrorDocs - the largest fraction of documents without a term frequency vector that will be tolerated. Must be in [0,1].
- Throws:
IOException
- See Also:
LuceneIterator(IndexReader, String, String, VectorMapper, double)
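One way to picture the maxPercentErrorDocs tolerance is as a budget of documents that may be skipped before iteration fails. The sketch below is an illustration under stated assumptions, not the Mahout source: the class ErrorDocToleranceSketch is hypothetical, a null entry stands in for a document with no term frequency vector, and the rounding of the budget and the choice of IllegalStateException are guesses.

```java
import java.util.ArrayList;
import java.util.List;

final class ErrorDocToleranceSketch {
    /**
     * Returns the usable documents, skipping null entries (documents assumed
     * to lack a term vector) until the tolerated fraction is exceeded.
     */
    static List<String> collectUsable(List<String> docs, double maxPercentErrorDocs) {
        if (maxPercentErrorDocs < 0.0 || maxPercentErrorDocs > 1.0) {
            throw new IllegalArgumentException("maxPercentErrorDocs must be in [0,1]");
        }
        // Assumed rounding: truncate the fractional budget toward zero.
        int maxErrorDocs = (int) (maxPercentErrorDocs * docs.size());
        int errorDocs = 0;
        List<String> usable = new ArrayList<>();
        for (String doc : docs) {
            if (doc == null) {
                errorDocs++;
                if (errorDocs > maxErrorDocs) {
                    throw new IllegalStateException(
                        "Too many documents without a term vector: " + errorDocs);
                }
                continue; // within budget: skip this document
            }
            usable.add(doc);
        }
        return usable;
    }
}
```

Under these assumptions a value of 0.0 fails on the first document without a term vector, while 1.0 never fails and simply skips such documents.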
computeNext
protected Vector computeNext()
- Specified by:
computeNext in class com.google.common.collect.AbstractIterator<Vector>
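Under Guava's AbstractIterator contract, computeNext either returns the next element or calls endOfData() to signal exhaustion, and the base class derives hasNext and next from it. The sketch below shows that pattern in plain Java so it runs without Guava: SimpleAbstractIterator is a simplified stand-in for the Guava class, and SkippingIterator is a hypothetical subclass that skips null entries, loosely mirroring an iterator that passes over unusable documents.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified stand-in for Guava's AbstractIterator (an assumption for this sketch).
abstract class SimpleAbstractIterator<T> implements Iterator<T> {
    private T next;
    private boolean ready;
    private boolean done;

    /** Returns the next element, or the result of endOfData() when exhausted. */
    protected abstract T computeNext();

    /** Marks the iteration as finished; the returned null is ignored. */
    protected final T endOfData() {
        done = true;
        return null;
    }

    @Override
    public final boolean hasNext() {
        if (done) {
            return false;
        }
        if (!ready) {
            next = computeNext();
            if (done) {
                return false;
            }
            ready = true;
        }
        return true;
    }

    @Override
    public final T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T result = next;
        next = null;
        return result;
    }

    @Override
    public final void remove() {
        throw new UnsupportedOperationException(); // iteration is read-only
    }
}

// Hypothetical subclass: yields non-null entries and skips the rest.
final class SkippingIterator extends SimpleAbstractIterator<String> {
    private final String[] docs;
    private int index;

    SkippingIterator(String[] docs) {
        this.docs = docs.clone();
    }

    @Override
    protected String computeNext() {
        while (index < docs.length) {
            String doc = docs[index++];
            if (doc != null) {
                return doc;
            }
        }
        return endOfData();
    }
}
```

Because remove is unsupported, callers can only read from such an iterator, which matches the remove method inherited from UnmodifiableIterator.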
Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.