org.apache.hadoop.hbase.util
Interface BloomFilter

All Known Implementing Classes:
ByteBloomFilter, DynamicByteBloomFilter

public interface BloomFilter

Defines the general behavior of a bloom filter.

The Bloom filter is a data structure that was introduced in 1970 and that has been adopted by the networking research community in the past decade thanks to the bandwidth efficiencies that it offers for the transmission of set membership information between networked hosts. A sender encodes the information into a bit vector, the Bloom filter, that is more compact than a conventional representation. Computation and space costs for construction are linear in the number of elements. The receiver uses the filter to test whether various elements are members of the set. Though the filter will occasionally return a false positive, it will never return a false negative. When creating the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size.
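The behavior described above can be illustrated with a minimal, self-contained sketch. The class name `SimpleBloomSketch`, the seeded FNV-1a style hash, and the double-hashing scheme are assumptions chosen for brevity; none of them is taken from ByteBloomFilter or any other HBase class.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical illustration of a Bloom filter; not the HBase implementation. */
final class SimpleBloomSketch {
    private final BitSet bits;
    private final int bitSize;
    private final int hashCount;

    SimpleBloomSketch(int bitSize, int hashCount) {
        if (bitSize <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("bitSize and hashCount must be positive");
        }
        this.bits = new BitSet(bitSize);
        this.bitSize = bitSize;
        this.hashCount = hashCount;
    }

    /** Seeded FNV-1a style hash (an assumption; any well-mixed hash would do). */
    private static int hash(byte[] buf, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : buf) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    /** Derives hashCount bit positions from two base hashes (double hashing). */
    private int[] positions(byte[] buf) {
        int h1 = hash(buf, 0);
        int h2 = hash(buf, 0x9747B28C);
        int[] result = new int[hashCount];
        for (int i = 0; i < hashCount; i++) {
            result[i] = Math.floorMod(h1 + i * h2, bitSize);
        }
        return result;
    }

    /** Sets every bit position derived from the key. */
    void add(byte[] buf) {
        for (int pos : positions(buf)) {
            bits.set(pos);
        }
    }

    /** Returns false only if at least one derived bit is unset. */
    boolean contains(byte[] buf) {
        for (int pos : positions(buf)) {
            if (!bits.get(pos)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomSketch filter = new SimpleBloomSketch(1024, 4);
        filter.add("row-1".getBytes(StandardCharsets.UTF_8));
        // A key that was added is always reported as present.
        System.out.println(filter.contains("row-1".getBytes(StandardCharsets.UTF_8)));
        // A key that was never added is usually, but not always, reported as absent.
        System.out.println(filter.contains("row-2".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because add only ever sets bits, a key that has been added can never fail the contains check, which is why false negatives cannot occur. A false positive arises when other keys happen to have set all of the bits a missing key maps to.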

Originally created by European Commission One-Lab Project 034819.

This interface must be implemented in order to define the actual behavior.


Method Summary
 void add(byte[] buf)
          Add the specified binary data to the bloom filter.
 void add(byte[] buf, int offset, int len)
          Add the specified range of binary data to the bloom filter.
 void allocBloom()
          Allocate memory for the bloom filter data.
 void compactBloom()
          Compact the bloom before writing metadata and data to disk.
 boolean contains(byte[] buf, ByteBuffer bloom)
          Check if the specified key is contained in the bloom filter.
 boolean contains(byte[] buf, int offset, int length, ByteBuffer bloom)
          Check if the specified key is contained in the bloom filter.
 int getByteSize()
          Get the size of the bloom, in bytes.
 org.apache.hadoop.io.Writable getDataWriter()
          Get a writable interface into bloom filter data (actual bloom).
 int getKeyCount()
          Get the number of keys added to the bloom.
 int getMaxKeys()
          Get the maximum number of keys that can be inserted while maintaining the desired error rate.
 org.apache.hadoop.io.Writable getMetaWriter()
          Get a writable interface into bloom filter meta data.
 

Method Detail

allocBloom

void allocBloom()
Allocate memory for the bloom filter data. Note that bloom data is not allocated by default because it can grow large, and reads would be better managed by the LRU cache.
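The deferred allocation described above might look like the following sketch. The class `LazyBloomSketch` and its `isAllocated()` and `data()` helpers are hypothetical and exist only for illustration. Treating a repeated call as a no-op is also an assumption; an implementation could equally reject a second call.

```java
import java.nio.ByteBuffer;

/** Hypothetical holder that defers allocating its bit vector until asked. */
final class LazyBloomSketch {
    private final int byteSize;
    private ByteBuffer bloom; // stays null until allocBloom() is called

    LazyBloomSketch(int byteSize) {
        if (byteSize <= 0) {
            throw new IllegalArgumentException("byteSize must be positive");
        }
        this.byteSize = byteSize;
    }

    /** Allocates a zeroed bit vector on the first call; later calls do nothing (assumption). */
    void allocBloom() {
        if (bloom == null) {
            bloom = ByteBuffer.allocate(byteSize);
        }
    }

    /** Hypothetical helper: reports whether the bit vector has been allocated. */
    boolean isAllocated() {
        return bloom != null;
    }

    /** Hypothetical helper: exposes the bit vector, or null if not yet allocated. */
    ByteBuffer data() {
        return bloom;
    }

    int getByteSize() {
        return byteSize;
    }
}
```

In this arrangement a writer would call allocBloom() before adding keys, while a reader could skip the allocation and pass externally loaded bloom data to contains.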


add

void add(byte[] buf)
Add the specified binary data to the bloom filter.

Parameters:
buf - data to be added to the bloom

add

void add(byte[] buf,
         int offset,
         int len)
Add the specified range of binary data to the bloom filter.

Parameters:
buf - data to be added to the bloom
offset - offset into the data to be added
len - length of the data to be added

contains

boolean contains(byte[] buf,
                 ByteBuffer bloom)
Check if the specified key is contained in the bloom filter.

Parameters:
buf - data whose presence is to be checked
bloom - bloom filter data to search
Returns:
true if matched by the bloom (the key may be present), false if not (the key is definitely absent)

contains

boolean contains(byte[] buf,
                 int offset,
                 int length,
                 ByteBuffer bloom)
Check if the specified key is contained in the bloom filter.

Parameters:
buf - data whose presence is to be checked
offset - offset into the data
length - length of the data
bloom - bloom filter data to search
Returns:
true if matched by the bloom (the key may be present), false if not (the key is definitely absent)
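Both contains variants take the bloom data as a separate ByteBuffer argument, so the bits being tested need not be owned by the filter object itself. The sketch below shows one way such a check against an external buffer could work. The class `BufferBloomSketch`, its hash function, and the explicit `hashCount` parameter are assumptions for illustration; the interface methods take no such parameter, and an implementation would presumably keep it in its own metadata.

```java
import java.nio.ByteBuffer;

/** Hypothetical bit-level helpers operating on bloom data held in a ByteBuffer. */
final class BufferBloomSketch {
    /** Seeded FNV-1a style hash over buf[offset, offset + length) (an assumption). */
    static int hash(byte[] buf, int offset, int length, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (int i = offset; i < offset + length; i++) {
            h ^= (buf[i] & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    /** Maps the i-th probe of a key to a bit position inside the buffer. */
    static int bitPos(byte[] buf, int offset, int length, int i, ByteBuffer bloom) {
        int h1 = hash(buf, offset, length, 0);
        int h2 = hash(buf, offset, length, 0x9747B28C);
        return Math.floorMod(h1 + i * h2, bloom.capacity() * 8);
    }

    /** Sets the bits for the given key range using absolute buffer access. */
    static void add(byte[] buf, int offset, int length, ByteBuffer bloom, int hashCount) {
        for (int i = 0; i < hashCount; i++) {
            int pos = bitPos(buf, offset, length, i, bloom);
            byte cur = bloom.get(pos >>> 3);
            bloom.put(pos >>> 3, (byte) (cur | (1 << (pos & 7))));
        }
    }

    /** Returns false as soon as one probed bit is unset, true otherwise. */
    static boolean contains(byte[] buf, int offset, int length, ByteBuffer bloom, int hashCount) {
        for (int i = 0; i < hashCount; i++) {
            int pos = bitPos(buf, offset, length, i, bloom);
            if ((bloom.get(pos >>> 3) & (1 << (pos & 7))) == 0) {
                return false;
            }
        }
        return true;
    }

    /** Whole-array convenience form, assumed equivalent to the range (0, buf.length). */
    static boolean contains(byte[] buf, ByteBuffer bloom, int hashCount) {
        return contains(buf, 0, buf.length, bloom, hashCount);
    }
}
```

The offset and length parameters let a caller test a key that sits inside a larger array without copying it out first. Only the bytes in the given range take part in the hash.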

getKeyCount

int getKeyCount()
Returns:
The number of keys added to the bloom

getMaxKeys

int getMaxKeys()
Returns:
The maximum number of keys that can be inserted while maintaining the desired error rate
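The link between the key limit, the filter size, and the error rate can be sketched with the standard textbook sizing relations for a Bloom filter with an optimal number of hash functions: m = -n ln(p) / (ln 2)^2 bits for n keys at false positive rate p, and k = (m / n) ln 2 hash functions. The class `BloomSizingSketch` and its method names are hypothetical, and nothing here asserts that an implementation of this interface computes its limits in exactly this way.

```java
/** Hypothetical sizing helpers based on the textbook Bloom filter formulas. */
final class BloomSizingSketch {
    private static final double LN2_SQUARED = Math.log(2) * Math.log(2);

    private static void checkRate(double errorRate) {
        if (!(errorRate > 0.0 && errorRate < 1.0)) {
            throw new IllegalArgumentException("errorRate must be in (0, 1)");
        }
    }

    /** Bits needed to hold maxKeys keys at the given false positive rate. */
    static long bitSizeFor(long maxKeys, double errorRate) {
        checkRate(errorRate);
        return (long) Math.ceil(maxKeys * -Math.log(errorRate) / LN2_SQUARED);
    }

    /** Keys that fit in bitSize bits without exceeding the given false positive rate. */
    static long maxKeysFor(long bitSize, double errorRate) {
        checkRate(errorRate);
        return (long) (bitSize * LN2_SQUARED / -Math.log(errorRate));
    }

    /** Number of hash functions that minimizes the false positive rate. */
    static int hashCountFor(long bitSize, long maxKeys) {
        return (int) Math.max(1, Math.round((double) bitSize / maxKeys * Math.log(2)));
    }

    public static void main(String[] args) {
        long bits = bitSizeFor(1000, 0.01);
        System.out.println("bits=" + bits
                + " maxKeys=" + maxKeysFor(bits, 0.01)
                + " hashes=" + hashCountFor(bits, 1000));
    }
}
```

Under these formulas, inserting more than the computed number of keys does not break the filter; it only pushes the false positive rate above the target.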

getByteSize

int getByteSize()
Returns:
Size of the bloom, in bytes

compactBloom

void compactBloom()
Compact the bloom before writing metadata and data to disk.


getMetaWriter

org.apache.hadoop.io.Writable getMetaWriter()
Get a writable interface into bloom filter meta data.

Returns:
a Writable for the bloom filter meta data

getDataWriter

org.apache.hadoop.io.Writable getDataWriter()
Get a writable interface into bloom filter data (actual bloom).

Returns:
a Writable for the bloom filter data


Copyright © 2011 The Apache Software Foundation. All Rights Reserved.