org.apache.mahout.utils.vectors.lucene
Class LuceneIterator

java.lang.Object
  extended by com.google.common.collect.UnmodifiableIterator<T>
      extended by com.google.common.collect.AbstractIterator<Vector>
          extended by org.apache.mahout.utils.vectors.lucene.LuceneIterator
All Implemented Interfaces:
Iterator<Vector>

public final class LuceneIterator
extends com.google.common.collect.AbstractIterator<Vector>

An Iterator over Vectors that uses a Lucene index as the source for creating the Vectors. The field used to create the vectors currently must have term vectors stored for it.


Constructor Summary
LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, VectorMapper mapper, double normPower)
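          Produce a LuceneIterator that creates a Vector for each document and can normalize it.
LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, VectorMapper mapper, double normPower, double maxPercentErrorDocs)
          Same as the five-argument constructor, but also limits the fraction of documents that may lack a term frequency vector.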
 
Method Summary
protected  Vector computeNext()
          Returns the next Vector, or signals the end of the data by calling endOfData().
 
Methods inherited from class com.google.common.collect.AbstractIterator
endOfData, hasNext, next, peek
 
Methods inherited from class com.google.common.collect.UnmodifiableIterator
remove
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LuceneIterator

public LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
                      String idField,
                      String field,
                      VectorMapper mapper,
                      double normPower)
               throws IOException
Produce a LuceneIterator that creates a Vector for each document and can normalize it.

Parameters:
indexReader - IndexReader to read the documents from.
idField - field containing the id. May be null.
field - field to use for the Vector. It must have term vectors stored.
mapper - VectorMapper for creating Vectors from Lucene's TermVectors.
normPower - the normalization power. Must be nonnegative, or LuceneIterable.NO_NORMALIZING to skip normalization.
Throws:
IOException
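
To make the normPower parameter concrete, here is a minimal sketch of power-norm normalization on a plain double array. It assumes normPower is the p of an L_p norm; the class name NormPowerSketch and the sentinel value -1.0 are hypothetical stand-ins, not Mahout code, and this page does not state the real value of LuceneIterable.NO_NORMALIZING. The sketch leaves out the p = 0 edge case.

```java
import java.util.Arrays;

/** Illustrative stand-in for power-norm normalization; not Mahout code. */
final class NormPowerSketch {

  /** Hypothetical sentinel meaning "do not normalize"; the real constant's value is an assumption. */
  static final double NO_NORMALIZING = -1.0;

  /** Returns a copy of values divided by its L_p norm, where p = normPower. */
  static double[] normalize(double[] values, double normPower) {
    if (normPower == NO_NORMALIZING) {
      return values.clone();
    }
    // This sketch only covers positive powers (including infinity).
    if (!(normPower > 0.0)) {
      throw new IllegalArgumentException("normPower must be positive in this sketch: " + normPower);
    }
    double norm = 0.0;
    if (Double.isInfinite(normPower)) {
      // Infinity norm: largest absolute component.
      for (double v : values) {
        norm = Math.max(norm, Math.abs(v));
      }
    } else {
      double sum = 0.0;
      for (double v : values) {
        sum += Math.pow(Math.abs(v), normPower);
      }
      norm = Math.pow(sum, 1.0 / normPower);
    }
    if (norm == 0.0) {
      // An all-zero vector has nothing to scale.
      return values.clone();
    }
    double[] result = new double[values.length];
    for (int i = 0; i < values.length; i++) {
      result[i] = values[i] / norm;
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(normalize(new double[] {3.0, 4.0}, 2.0)));
    System.out.println(Arrays.toString(normalize(new double[] {3.0, 4.0}, NO_NORMALIZING)));
  }
}
```

With normPower set to 2 the result has unit Euclidean length; with the sentinel the values come back unchanged.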

LuceneIterator

public LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
                      String idField,
                      String field,
                      VectorMapper mapper,
                      double normPower,
                      double maxPercentErrorDocs)
               throws IOException
Parameters:
maxPercentErrorDocs - the largest fraction of documents without a term frequency vector that will be tolerated. Must be in [0,1].
Throws:
IOException
See Also:
LuceneIterator(IndexReader, String, String, VectorMapper, double)
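
One way to read maxPercentErrorDocs is as a budget on skipped documents. The sketch below models documents as arrays, with null standing for a document without a term frequency vector, and fails once the number of such documents exceeds the allowed fraction of the total. The class name, the exception type, and the strict "greater than" comparison are all assumptions made for illustration; the actual LuceneIterator may differ on each point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in for an error-document budget; not Mahout code. */
final class ErrorDocToleranceSketch {

  /**
   * Collects the non-null entries of docs. A null entry stands for a document
   * without a term frequency vector. Throws once more than
   * maxPercentErrorDocs * docs.size() entries have been skipped.
   */
  static List<double[]> collect(List<double[]> docs, double maxPercentErrorDocs) {
    if (!(maxPercentErrorDocs >= 0.0 && maxPercentErrorDocs <= 1.0)) {
      throw new IllegalArgumentException("maxPercentErrorDocs must be in [0,1]: " + maxPercentErrorDocs);
    }
    double maxErrorDocs = maxPercentErrorDocs * docs.size();
    int errorDocs = 0;
    List<double[]> result = new ArrayList<>();
    for (double[] doc : docs) {
      if (doc == null) {
        errorDocs++;
        // Assumption: the budget is exceeded only when strictly above the limit.
        if (errorDocs > maxErrorDocs) {
          throw new IllegalStateException(
              "Too many documents without a term vector: " + errorDocs + " of " + docs.size());
        }
        continue; // skip this document and move on
      }
      result.add(doc);
    }
    return result;
  }

  public static void main(String[] args) {
    List<double[]> docs = Arrays.asList(new double[] {1.0}, null, new double[] {2.0}, null);
    // Half of the documents may be skipped, so this call succeeds.
    System.out.println(collect(docs, 0.5).size());
  }
}
```

With a value of 0 the first document lacking a term vector aborts the iteration, while a value of 1 tolerates any number of them.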

Method Detail

computeNext

protected Vector computeNext()
Specified by:
computeNext in class com.google.common.collect.AbstractIterator<Vector>
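
computeNext is the single hook that AbstractIterator subclasses implement: it returns the next element, or calls endOfData() once the source is exhausted. The sketch below mirrors that contract using only the standard library. ComputeNextIterator and NonNullArrayIterator are hypothetical names, not Guava or Mahout classes, and skipping null entries is only meant to suggest how an iterator could pass over documents that lack a term vector.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Minimal stand-in for the computeNext/endOfData contract; not Guava's AbstractIterator. */
abstract class ComputeNextIterator<T> implements Iterator<T> {
  private T next;
  private boolean ready;
  private boolean done;

  /** Returns the next element, or the result of endOfData() when exhausted. */
  protected abstract T computeNext();

  /** Marks the iteration as finished; the returned value is ignored. */
  protected final T endOfData() {
    done = true;
    return null;
  }

  @Override
  public final boolean hasNext() {
    if (done) {
      return false;
    }
    if (!ready) {
      next = computeNext();
      ready = !done; // computeNext may have called endOfData()
    }
    return ready;
  }

  @Override
  public final T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    ready = false;
    T result = next;
    next = null;
    return result;
  }
  // remove() is inherited from Iterator and throws UnsupportedOperationException.
}

/** Iterates over an array and skips null entries. */
final class NonNullArrayIterator extends ComputeNextIterator<String> {
  private final String[] items;
  private int position;

  NonNullArrayIterator(String[] items) {
    this.items = items;
  }

  @Override
  protected String computeNext() {
    while (position < items.length) {
      String item = items[position++];
      if (item != null) {
        return item;
      }
    }
    return endOfData();
  }

  public static void main(String[] args) {
    Iterator<String> it = new NonNullArrayIterator(new String[] {"a", null, "b"});
    while (it.hasNext()) {
      System.out.println(it.next());
    }
  }
}
```

Callers only ever see hasNext() and next(); the skipping logic stays inside computeNext.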


Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.