org.apache.lucene.codecs.bloom
Class BloomFilteringPostingsFormat

java.lang.Object
  extended by org.apache.lucene.codecs.PostingsFormat
      extended by org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat
All Implemented Interfaces:
NamedSPILoader.NamedSPI

public class BloomFilteringPostingsFormat
extends PostingsFormat

A PostingsFormat useful for fields with low document frequency, such as primary keys. Bloom filters are maintained in a ".blm" file, which offers a "fast-fail" for reads in segments known to have no record of the key. A delegate PostingsFormat of your choice records all other postings data.
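To illustrate the "fast-fail" idea, here is a minimal, self-contained sketch of a Bloom filter. It is not the Lucene implementation: the class name SimpleBloomFilter, the two hash functions, and the bit count are all assumptions chosen for illustration. The property that matters is that a negative answer is always correct, so a segment whose filter says "absent" can be skipped without consulting its postings.

```java
import java.util.BitSet;

/** Hypothetical, simplified Bloom filter; illustrative only, not Lucene code. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;

    SimpleBloomFilter(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // First bit position: String.hashCode() reduced to the filter size.
    private int h1(String key) {
        return Math.floorMod(key.hashCode(), numBits);
    }

    // Second bit position: the same hash code, scrambled with a few mixing steps.
    private int h2(String key) {
        int h = key.hashCode();
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return Math.floorMod(h, numBits);
    }

    /** Records a key by setting both of its bit positions. */
    void add(String key) {
        bits.set(h1(key));
        bits.set(h2(key));
    }

    /** Returns false if the key was definitely never added, true if it may have been. */
    boolean mightContain(String key) {
        return bits.get(h1(key)) && bits.get(h2(key));
    }
}
```

A reader holding one such filter per segment would call mightContain(key) first and only open the segment's term dictionary when the answer is true. False positives are possible; false negatives are not.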

A BloomFilterFactory can be passed to tailor Bloom filter settings on a per-field basis. The default is DefaultBloomFilterFactory, which allocates a bitset of roughly 8 MB and hashes values using MurmurHash2. This should be suitable for most purposes.
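When deciding whether the default size suits a field, a rough false-positive estimate can help. The sketch below uses the standard Bloom filter approximation p = (1 - e^(-k*n/m))^k for m bits, n keys, and k hash functions. The class name BloomSizing is hypothetical, and reading the default as 8 * 1024 * 1024 bytes with a single hash function is an assumption, not something this page states.

```java
/** Hypothetical helper for estimating Bloom filter false-positive rates. */
final class BloomSizing {
    /**
     * Standard approximation: p = (1 - e^(-k*n/m))^k
     * where m = numBits, n = numKeys, k = numHashes.
     */
    static double falsePositiveRate(long numBits, long numKeys, int numHashes) {
        double exponent = -((double) numHashes * numKeys) / numBits;
        double fractionOfBitsSet = 1.0 - Math.exp(exponent);
        return Math.pow(fractionOfBitsSet, numHashes);
    }

    public static void main(String[] args) {
        long bits = 8L * 1024 * 1024 * 8; // assumed: 8 MB expressed in bits
        // Assumed workload: one million unique keys, one hash function.
        System.out.println(falsePositiveRate(bits, 1_000_000L, 1));
    }
}
```

Under these assumptions, a segment with one million unique keys has an estimated false-positive rate between 1% and 2%. The estimate grows as the number of keys per segment grows relative to the bitset size.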

The format of the blm file is as follows:

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
 class BloomFilteringPostingsFormat.BloomFilteredFieldsProducer
           
 
Field Summary
static String BLOOM_CODEC_NAME
           
static int BLOOM_CODEC_VERSION
           
 
Fields inherited from class org.apache.lucene.codecs.PostingsFormat
EMPTY
 
Constructor Summary
BloomFilteringPostingsFormat()
           
BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat)
          Creates Bloom filters for a selection of fields created in the index.
BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory)
          Creates Bloom filters for a selection of fields created in the index.
 
Method Summary
 FieldsConsumer fieldsConsumer(SegmentWriteState state)
          Writes a new segment.
 FieldsProducer fieldsProducer(SegmentReadState state)
          Reads a segment.
 
Methods inherited from class org.apache.lucene.codecs.PostingsFormat
availablePostingsFormats, forName, getName, reloadPostingsFormats, toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

BLOOM_CODEC_NAME

public static final String BLOOM_CODEC_NAME
See Also:
Constant Field Values

BLOOM_CODEC_VERSION

public static final int BLOOM_CODEC_VERSION
See Also:
Constant Field Values
Constructor Detail

BloomFilteringPostingsFormat

public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat,
                                    BloomFilterFactory bloomFilterFactory)
Creates Bloom filters for a selection of fields created in the index. The filters are recorded as a set of bitsets held as a segment summary in an additional "blm" file. All other postings data is encoded by the supplied delegate PostingsFormat.

Parameters:
delegatePostingsFormat - The PostingsFormat that records all the non-Bloom-filter data, i.e. the postings info.
bloomFilterFactory - The BloomFilterFactory responsible for sizing the Bloom filters appropriately.
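The division of labour between the per-field filters and the delegate can be sketched in plain Java. This is a conceptual model only, not the Lucene classes: FilteredTermLookup, TermLookup, record, and the single-hash slot function are hypothetical names introduced for illustration. Fields that have a filter get a fast-fail check before the delegate is asked; fields without one go straight to the delegate.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of "per-field filter in front of a delegate". */
final class FilteredTermLookup {
    /** Hypothetical stand-in for the delegate's term dictionary. */
    interface TermLookup {
        boolean termExists(String field, String term);
    }

    private final TermLookup delegate;
    private final Map<String, BitSet> filtersByField = new HashMap<>();
    private final int numBits;
    int delegateCalls = 0; // exposed so the effect of the filter is observable

    FilteredTermLookup(TermLookup delegate, int numBits) {
        this.delegate = delegate;
        this.numBits = numBits;
    }

    private int slot(String term) {
        return Math.floorMod(term.hashCode(), numBits);
    }

    /** Called at write time for each term of a field selected for filtering. */
    void record(String field, String term) {
        filtersByField.computeIfAbsent(field, f -> new BitSet(numBits)).set(slot(term));
    }

    /** Consults the field's filter, if any, before falling through to the delegate. */
    boolean termExists(String field, String term) {
        BitSet filter = filtersByField.get(field);
        if (filter != null && !filter.get(slot(term))) {
            return false; // fast-fail: the delegate is never consulted
        }
        delegateCalls++;
        return delegate.termExists(field, term);
    }
}
```

In this model the delegate remains the source of truth: a positive filter answer is always confirmed against it, so false positives cost one extra lookup but never produce a wrong result.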

BloomFilteringPostingsFormat

public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat)
Creates Bloom filters for a selection of fields created in the index. The filters are recorded as a set of bitsets held as a segment summary in an additional "blm" file. All other postings data is encoded by the supplied delegate PostingsFormat. This constructor uses the DefaultBloomFilterFactory to configure the per-field Bloom filters.

Parameters:
delegatePostingsFormat - The PostingsFormat that records all the non-Bloom-filter data, i.e. the postings info.

BloomFilteringPostingsFormat

public BloomFilteringPostingsFormat()
Method Detail

fieldsConsumer

public FieldsConsumer fieldsConsumer(SegmentWriteState state)
                              throws IOException
Description copied from class: PostingsFormat
Writes a new segment.

Specified by:
fieldsConsumer in class PostingsFormat
Throws:
IOException

fieldsProducer

public FieldsProducer fieldsProducer(SegmentReadState state)
                              throws IOException
Description copied from class: PostingsFormat
Reads a segment. NOTE: by the time this call returns, it must hold open any files it will need to use; else, those files may be deleted. Additionally, required files may be deleted during the execution of this call before there is a chance to open them. Under these circumstances an IOException should be thrown by the implementation. IOExceptions are expected and will automatically cause a retry of the segment opening logic with the newly revised segments.

Specified by:
fieldsProducer in class PostingsFormat
Throws:
IOException


Copyright © 2000-2012 Apache Software Foundation. All Rights Reserved.