java.lang.Object
  org.apache.lucene.index.LiveIndexWriterConfig
public class LiveIndexWriterConfig
Holds all the configuration used by IndexWriter, with a few setters for
settings that can be changed "live" on an IndexWriter instance.
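As a rough illustration of what a "live" setting means, the sketch below models a configuration object whose setters can be called while other threads read the values. LiveSettingsSketch, its fields, and its default values are hypothetical and are not Lucene's implementation. The sketch only mirrors two traits visible on this page: the fields are declared volatile, and the setters return the configuration object so calls can be chained.

```java
// Hypothetical model of a "live" configuration object; not Lucene code.
final class LiveSettingsSketch {
    // volatile so that a value written by one thread is visible to readers in other threads
    private volatile double ramBufferSizeMB = 16.0; // assumed default, for illustration only
    private volatile int maxBufferedDocs = -1;      // -1 stands for "disabled" in this sketch

    // Each setter returns this, so several settings can be changed in one chained expression.
    LiveSettingsSketch setRamBufferSizeMB(double ramBufferSizeMB) {
        this.ramBufferSizeMB = ramBufferSizeMB;
        return this;
    }

    LiveSettingsSketch setMaxBufferedDocs(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
        return this;
    }

    double getRamBufferSizeMB() {
        return ramBufferSizeMB;
    }

    int getMaxBufferedDocs() {
        return maxBufferedDocs;
    }

    public static void main(String[] args) throws InterruptedException {
        LiveSettingsSketch config = new LiveSettingsSketch();
        // A second thread changes the settings after the object has been handed out.
        Thread tuner = new Thread(() -> config.setRamBufferSizeMB(64.0).setMaxBufferedDocs(1000));
        tuner.start();
        tuner.join();
        System.out.println(config.getRamBufferSizeMB() + " MB, " + config.getMaxBufferedDocs() + " docs");
    }
}
```

Returning the object from each setter is a design choice that keeps tuning code compact; it says nothing about when a changed value takes effect, which the individual setters below document separately.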
Field Summary | |
---|---|
protected Codec | codec |
protected IndexCommit | commit |
protected IndexDeletionPolicy | delPolicy |
protected org.apache.lucene.index.FlushPolicy | flushPolicy |
protected org.apache.lucene.index.DocumentsWriterPerThreadPool | indexerThreadPool |
protected org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain | indexingChain |
protected InfoStream | infoStream |
protected Version | matchVersion |
protected MergePolicy | mergePolicy |
protected MergeScheduler | mergeScheduler |
protected IndexWriterConfig.OpenMode | openMode |
protected int | perThreadHardLimitMB |
protected boolean | readerPooling |
protected Similarity | similarity |
protected long | writeLockTimeout |
Method Summary | | |
---|---|---|
Analyzer | getAnalyzer() | Returns the default analyzer to use for indexing documents. |
Codec | getCodec() | Returns the current Codec. |
IndexCommit | getIndexCommit() | Returns the IndexCommit as specified in IndexWriterConfig.setIndexCommit(IndexCommit) or the default, null, which specifies opening the latest index commit point. |
IndexDeletionPolicy | getIndexDeletionPolicy() | Returns the IndexDeletionPolicy specified in IndexWriterConfig.setIndexDeletionPolicy(IndexDeletionPolicy) or the default KeepOnlyLastCommitDeletionPolicy. |
InfoStream | getInfoStream() | |
int | getMaxBufferedDeleteTerms() | Returns the number of buffered deleted terms that will trigger a flush of all buffered deletes if enabled. |
int | getMaxBufferedDocs() | Returns the number of buffered added documents that will trigger a flush if enabled. |
int | getMaxThreadStates() | Returns the max number of simultaneous threads that may be indexing documents at once in IndexWriter. |
IndexWriter.IndexReaderWarmer | getMergedSegmentWarmer() | Returns the current merged segment warmer. |
MergePolicy | getMergePolicy() | Returns the current MergePolicy in use by this writer. |
MergeScheduler | getMergeScheduler() | Returns the MergeScheduler that was set by IndexWriterConfig.setMergeScheduler(MergeScheduler). |
IndexWriterConfig.OpenMode | getOpenMode() | Returns the IndexWriterConfig.OpenMode set by IndexWriterConfig.setOpenMode(OpenMode). |
double | getRAMBufferSizeMB() | Returns the value set by setRAMBufferSizeMB(double) if enabled. |
int | getRAMPerThreadHardLimitMB() | Returns the max amount of memory each DocumentsWriterPerThread can consume until forcefully flushed. |
boolean | getReaderPooling() | Returns true if IndexWriter should pool readers even if DirectoryReader.open(IndexWriter, boolean) has not been called. |
int | getReaderTermsIndexDivisor() | |
Similarity | getSimilarity() | Expert: returns the Similarity implementation used by this IndexWriter. |
int | getTermIndexInterval() | Returns the interval between indexed terms. |
long | getWriteLockTimeout() | Returns allowed timeout when acquiring the write lock. |
LiveIndexWriterConfig | setMaxBufferedDeleteTerms(int maxBufferedDeleteTerms) | Determines the minimal number of delete terms required before the buffered in-memory delete terms and queries are applied and flushed. |
LiveIndexWriterConfig | setMaxBufferedDocs(int maxBufferedDocs) | Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment. |
LiveIndexWriterConfig | setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer mergeSegmentWarmer) | Set the merged segment warmer. |
LiveIndexWriterConfig | setRAMBufferSizeMB(double ramBufferSizeMB) | Determines the amount of RAM that may be used for buffering added documents and deletions before they are flushed to the Directory. |
LiveIndexWriterConfig | setReaderTermsIndexDivisor(int divisor) | Sets the termsIndexDivisor passed to any readers that IndexWriter opens, for example when applying deletes or creating a near-real-time reader in DirectoryReader.open(IndexWriter, boolean). |
LiveIndexWriterConfig | setTermIndexInterval(int interval) | Expert: set the interval between indexed terms. |
String | toString() | |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected volatile IndexDeletionPolicy delPolicy
protected volatile IndexCommit commit
protected volatile IndexWriterConfig.OpenMode openMode
protected volatile Similarity similarity
protected volatile MergeScheduler mergeScheduler
protected volatile long writeLockTimeout
protected volatile org.apache.lucene.index.DocumentsWriterPerThread.IndexingChain indexingChain
protected volatile Codec codec
protected volatile InfoStream infoStream
protected volatile MergePolicy mergePolicy
protected volatile org.apache.lucene.index.DocumentsWriterPerThreadPool indexerThreadPool
protected volatile boolean readerPooling
protected volatile org.apache.lucene.index.FlushPolicy flushPolicy
protected volatile int perThreadHardLimitMB
protected final Version matchVersion
Method Detail |
---|
public Analyzer getAnalyzer()
Returns the default analyzer to use for indexing documents.
public LiveIndexWriterConfig setTermIndexInterval(int interval)
Expert: set the interval between indexed terms.
This parameter determines the amount of computation required per query term, regardless of the number of documents that contain that term. In particular, it is the maximum number of other terms that must be scanned before a term is located and its frequency and position information may be processed. In a large index with user-entered query terms, query processing time is likely to be dominated not by term lookup but rather by the processing of frequency and positional data. In a small index or when many uncommon query terms are generated (e.g., by wildcard queries) term lookup may become a dominant cost.
In particular, numUniqueTerms/interval terms are read into memory by an IndexReader, and, on average, interval/2 terms must be scanned for each random term access.
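To make the trade-off concrete, the following sketch computes the two quantities just described for a hypothetical index. TermIndexEstimate and its method names are illustrative helpers, not part of Lucene, and the use of integer division for the in-memory count is an assumption made to keep the arithmetic simple.

```java
// Illustrative arithmetic only; TermIndexEstimate is a hypothetical helper, not a Lucene class.
final class TermIndexEstimate {

    // Approximate number of terms an IndexReader holds in memory: numUniqueTerms / interval.
    static long termsInMemory(long numUniqueTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        return numUniqueTerms / interval;
    }

    // Average number of terms scanned per random term access: interval / 2.
    static double averageTermsScanned(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        return interval / 2.0;
    }

    public static void main(String[] args) {
        long uniqueTerms = 10_000_000L; // hypothetical number of unique terms
        for (int interval : new int[] {32, 128, 512}) {
            System.out.printf("interval=%d inMemory=%d avgScanned=%.1f%n",
                    interval, termsInMemory(uniqueTerms, interval), averageTermsScanned(interval));
        }
    }
}
```

A larger interval shrinks the in-memory count and raises the average scan length in direct proportion, which is the memory-versus-lookup-cost trade-off this setting controls.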
Takes effect immediately, but only applies to newly flushed/merged segments.
NOTE: This parameter does not apply to all PostingsFormat implementations, including the default one in this release. It only makes sense for term indexes that are implemented as a fixed gap between terms. For example, Lucene40PostingsFormat implements the term index instead based upon how terms share prefixes. To configure its parameters (the minimum and maximum size for a block), you would instead use Lucene40PostingsFormat.Lucene40PostingsFormat(int, int), which can also be configured on a per-field basis:

```java
// Customize Lucene40PostingsFormat, passing minBlockSize=50, maxBlockSize=100.
// iwc is the IndexWriterConfig being set up (assumed to exist already).
final PostingsFormat tweakedPostings = new Lucene40PostingsFormat(50, 100);
iwc.setCodec(new Lucene40Codec() {
  @Override
  public PostingsFormat getPostingsFormatForField(String field) {
    if (field.equals("fieldWithTonsOfTerms")) {
      return tweakedPostings;
    } else {
      return super.getPostingsFormatForField(field);
    }
  }
});
```

Note that other implementations may have their own parameters, or no parameters at all.
See Also: IndexWriterConfig.DEFAULT_TERM_INDEX_INTERVAL
public int getTermIndexInterval()
Returns the interval between indexed terms.
See Also: setTermIndexInterval(int)
public LiveIndexWriterConfig setMaxBufferedDeleteTerms(int maxBufferedDeleteTerms)
Determines the minimal number of delete terms required before the buffered in-memory delete terms and queries are applied and flushed.
Disabled by default (writer flushes by RAM usage).
NOTE: This setting won't trigger a segment flush.
Takes effect immediately, but only the next time a document is added, updated or deleted.
Throws: IllegalArgumentException - if maxBufferedDeleteTerms is enabled but smaller than 1
See Also: setRAMBufferSizeMB(double)
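public int getMaxBufferedDeleteTerms()
Returns the number of buffered deleted terms that will trigger a flush of all buffered deletes if enabled.
See Also: setMaxBufferedDeleteTerms(int)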
public LiveIndexWriterConfig setRAMBufferSizeMB(double ramBufferSizeMB)
Determines the amount of RAM that may be used for buffering added documents and deletions before they are flushed to the Directory.
When this is set, the writer will flush whenever buffered documents and deletions use this much RAM. Pass in IndexWriterConfig.DISABLE_AUTO_FLUSH to prevent triggering a flush due to RAM usage. Note that if flushing by document count is also enabled, then the flush will be triggered by whichever comes first.
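The "whichever comes first" rule can be sketched as a simple predicate. This is a simplified model, not Lucene's FlushPolicy: FlushTriggerSketch and its parameter names are hypothetical, the DISABLE_AUTO_FLUSH constant is redefined locally as -1 (assumed to mirror IndexWriterConfig.DISABLE_AUTO_FLUSH), and the comparison with "greater than or equal" is an assumption about where exactly the threshold sits.

```java
// Simplified model of the two flush triggers; not Lucene's actual flush logic.
final class FlushTriggerSketch {
    // Local stand-in for IndexWriterConfig.DISABLE_AUTO_FLUSH (assumed to be -1).
    static final int DISABLE_AUTO_FLUSH = -1;

    // Returns true if either enabled trigger has been reached.
    static boolean shouldFlush(double ramBufferSizeMB, int maxBufferedDocs,
                               double ramUsedMB, int bufferedDocs) {
        boolean byRam = ramBufferSizeMB != DISABLE_AUTO_FLUSH && ramUsedMB >= ramBufferSizeMB;
        boolean byDocs = maxBufferedDocs != DISABLE_AUTO_FLUSH && bufferedDocs >= maxBufferedDocs;
        return byRam || byDocs;
    }

    public static void main(String[] args) {
        // RAM trigger only: 16 MB limit, document-count trigger disabled.
        System.out.println(shouldFlush(16.0, DISABLE_AUTO_FLUSH, 16.5, 10));
        // Both enabled: the document count is reached first.
        System.out.println(shouldFlush(16.0, 1000, 1.0, 1000));
        // Both enabled: neither limit reached yet.
        System.out.println(shouldFlush(16.0, 1000, 1.0, 999));
    }
}
```

The predicate only models when a flush is triggered; it does not model how much of the buffer a flush actually releases.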
The maximum RAM limit is inherently determined by the JVM's available memory. However, an IndexWriter session can consume significantly more memory than the given RAM limit, since this limit is just an indicator of when to flush memory-resident documents to the Directory. Flushes are likely to happen concurrently while other threads are adding documents to the writer. For application stability, the available memory in the JVM should be significantly larger than the RAM buffer used for indexing.
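One way to act on this advice is a startup sanity check that compares the planned RAM buffer against the JVM's maximum heap. The sketch below is only an assumption about how such a check might look: RamBufferHeadroomCheck is a hypothetical helper, and the factor of 4 is an arbitrary stand-in for "significantly larger", which this page does not quantify.

```java
// Hypothetical startup check; the headroom factor is an arbitrary illustration.
final class RamBufferHeadroomCheck {

    // True if the heap is at least minRatio times the size of the planned RAM buffer.
    static boolean hasHeadroom(double ramBufferSizeMB, long maxHeapBytes, double minRatio) {
        double heapMB = maxHeapBytes / (1024.0 * 1024.0);
        return heapMB >= ramBufferSizeMB * minRatio;
    }

    public static void main(String[] args) {
        double plannedBufferMB = 64.0;                  // value intended for setRAMBufferSizeMB
        long maxHeap = Runtime.getRuntime().maxMemory(); // maximum heap the JVM will try to use
        if (!hasHeadroom(plannedBufferMB, maxHeap, 4.0)) {
            System.out.println("Warning: RAM buffer is large relative to the JVM heap");
        } else {
            System.out.println("Heap headroom looks sufficient for the planned RAM buffer");
        }
    }
}
```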
NOTE: the accounting of RAM usage for pending deletions is only approximate. Specifically, if you delete by Query, Lucene currently has no way to measure the RAM usage of individual Queries, so the accounting will under-estimate. You should compensate by either calling commit() periodically yourself, or by using setMaxBufferedDeleteTerms(int) to flush and apply buffered deletes by count instead of RAM usage (for each buffered delete Query, a constant number of bytes is used to estimate RAM usage). Note that enabling setMaxBufferedDeleteTerms(int) will not trigger any segment flushes.
NOTE: It's not guaranteed that all memory-resident documents are flushed once this limit is exceeded. Depending on the configured FlushPolicy, only a subset of the buffered documents may be flushed, and therefore only part of the RAM buffer is released.
The default value is IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB.
Takes effect immediately, but only the next time a document is added, updated or deleted.
Throws: IllegalArgumentException - if ramBufferSize is enabled but non-positive, or it disables ramBufferSize when maxBufferedDocs is already disabled
See Also: IndexWriterConfig.setRAMPerThreadHardLimitMB(int)
public double getRAMBufferSizeMB()
Returns the value set by setRAMBufferSizeMB(double) if enabled.
public LiveIndexWriterConfig setMaxBufferedDocs(int maxBufferedDocs)
Determines the minimal number of documents required before the buffered in-memory documents are flushed as a new Segment.
When this is set, the writer will flush every maxBufferedDocs added documents. Pass in IndexWriterConfig.DISABLE_AUTO_FLUSH to prevent triggering a flush due to number of buffered documents. Note that if flushing by RAM usage is also enabled, then the flush will be triggered by whichever comes first.
Disabled by default (writer flushes by RAM usage).
Takes effect immediately, but only the next time a document is added, updated or deleted.
Throws: IllegalArgumentException - if maxBufferedDocs is enabled but smaller than 2, or it disables maxBufferedDocs when ramBufferSize is already disabled
See Also: setRAMBufferSizeMB(double)
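The IllegalArgumentException conditions documented for setRAMBufferSizeMB(double) and setMaxBufferedDocs(int) amount to two rules: an enabled value must be in range, and the two triggers cannot both be disabled. The sketch below restates those rules as standalone checks. FlushSettingsValidator, its method names, and its exception messages are hypothetical, and DISABLE_AUTO_FLUSH is redefined locally as -1 (assumed to mirror IndexWriterConfig.DISABLE_AUTO_FLUSH); Lucene's own checks may be written differently.

```java
// Restates the documented argument rules; not Lucene's actual validation code.
final class FlushSettingsValidator {
    // Local stand-in for IndexWriterConfig.DISABLE_AUTO_FLUSH (assumed to be -1).
    static final int DISABLE_AUTO_FLUSH = -1;

    // Rules for a new maxBufferedDocs value, given the current RAM buffer setting.
    static void checkMaxBufferedDocs(int maxBufferedDocs, double currentRamBufferSizeMB) {
        if (maxBufferedDocs != DISABLE_AUTO_FLUSH && maxBufferedDocs < 2) {
            throw new IllegalArgumentException("maxBufferedDocs must be at least 2 when enabled");
        }
        if (maxBufferedDocs == DISABLE_AUTO_FLUSH && currentRamBufferSizeMB == DISABLE_AUTO_FLUSH) {
            throw new IllegalArgumentException("ramBufferSize and maxBufferedDocs cannot both be disabled");
        }
    }

    // Rules for a new ramBufferSizeMB value, given the current document-count setting.
    static void checkRamBufferSizeMB(double ramBufferSizeMB, int currentMaxBufferedDocs) {
        if (ramBufferSizeMB != DISABLE_AUTO_FLUSH && ramBufferSizeMB <= 0.0) {
            throw new IllegalArgumentException("ramBufferSize must be positive when enabled");
        }
        if (ramBufferSizeMB == DISABLE_AUTO_FLUSH && currentMaxBufferedDocs == DISABLE_AUTO_FLUSH) {
            throw new IllegalArgumentException("ramBufferSize and maxBufferedDocs cannot both be disabled");
        }
    }

    public static void main(String[] args) {
        checkMaxBufferedDocs(1000, 16.0);             // accepted: enabled and >= 2
        checkRamBufferSizeMB(DISABLE_AUTO_FLUSH, 1000); // accepted: doc-count trigger still enabled
        try {
            checkMaxBufferedDocs(1, 16.0);            // rejected: enabled but smaller than 2
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

One practical consequence of the second rule is ordering: when switching from one trigger to the other, enable the new trigger before disabling the old one.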
public int getMaxBufferedDocs()
Returns the number of buffered added documents that will trigger a flush if enabled.
See Also: setMaxBufferedDocs(int)
public LiveIndexWriterConfig setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer mergeSegmentWarmer)
Sets the merged segment warmer. See IndexWriter.IndexReaderWarmer.
Takes effect on the next merge.
public IndexWriter.IndexReaderWarmer getMergedSegmentWarmer()
Returns the current merged segment warmer. See IndexWriter.IndexReaderWarmer.
public LiveIndexWriterConfig setReaderTermsIndexDivisor(int divisor)
Sets the termsIndexDivisor passed to any readers that IndexWriter opens, for example when applying deletes or creating a near-real-time reader in DirectoryReader.open(IndexWriter, boolean). If you pass -1, the terms index won't be loaded by the readers. This is only useful in advanced situations when you will only .next() through all terms; attempts to seek will hit an exception.
Takes effect immediately, but only applies to readers opened after this call.
NOTE: divisor settings > 1 do not apply to all PostingsFormat implementations, including the default one in this release. It only makes sense for terms indexes that can efficiently re-sample terms at load time.
public int getReaderTermsIndexDivisor()
See Also: setReaderTermsIndexDivisor(int)
public IndexWriterConfig.OpenMode getOpenMode()
Returns the IndexWriterConfig.OpenMode set by IndexWriterConfig.setOpenMode(OpenMode).
public IndexDeletionPolicy getIndexDeletionPolicy()
Returns the IndexDeletionPolicy specified in IndexWriterConfig.setIndexDeletionPolicy(IndexDeletionPolicy) or the default KeepOnlyLastCommitDeletionPolicy.
public IndexCommit getIndexCommit()
Returns the IndexCommit as specified in IndexWriterConfig.setIndexCommit(IndexCommit) or the default, null, which specifies opening the latest index commit point.
public Similarity getSimilarity()
Expert: returns the Similarity implementation used by this IndexWriter.
public MergeScheduler getMergeScheduler()
Returns the MergeScheduler that was set by IndexWriterConfig.setMergeScheduler(MergeScheduler).
public long getWriteLockTimeout()
Returns allowed timeout when acquiring the write lock.
See Also: IndexWriterConfig.setWriteLockTimeout(long)
public Codec getCodec()
Returns the current Codec.
public MergePolicy getMergePolicy()
Returns the current MergePolicy in use by this writer.
See Also: IndexWriterConfig.setMergePolicy(MergePolicy)
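public int getMaxThreadStates()
Returns the max number of simultaneous threads that may be indexing documents at once in IndexWriter.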
public boolean getReaderPooling()
Returns true if IndexWriter should pool readers even if DirectoryReader.open(IndexWriter, boolean) has not been called.
public int getRAMPerThreadHardLimitMB()
Returns the max amount of memory each DocumentsWriterPerThread can consume until forcefully flushed.
See Also: IndexWriterConfig.setRAMPerThreadHardLimitMB(int)
public InfoStream getInfoStream()
See Also: IndexWriterConfig.setInfoStream(InfoStream)
public String toString()
Overrides: toString in class Object