org.apache.hadoop.hbase
Class BloomFilterDescriptor

java.lang.Object
  extended by org.apache.hadoop.hbase.BloomFilterDescriptor
All Implemented Interfaces:
Comparable, Writable, WritableComparable

public class BloomFilterDescriptor
extends Object
implements WritableComparable

Supplied as a parameter to HColumnDescriptor to specify the kind of bloom filter to use for a column and its configuration parameters.

There is no way to determine the vector size and the number of hash functions automatically. In particular, bloom filters are very sensitive to the number of elements inserted into them. For HBase, the number of entries depends on the size of the data stored in the column. The default region size is currently 64MB, so the number of entries is approximately 64MB / (average value size for the column).

If m denotes the number of bits in the bloom filter (vectorSize), n the number of elements inserted into the bloom filter, and k the number of hash functions used (nbHash), then according to Broder and Mitzenmacher ( http://www.eecs.harvard.edu/~michaelm/NEWWORK/postscripts/BloomFilterSurvey.pdf ) the probability of false positives is minimized when k is approximately (m/n) * ln(2).
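
As a rough illustration of the sizing reasoning above, the following sketch estimates n from the region size and then derives k from the (m/n) * ln(2) rule. The class name, the method names, the 1KB average value size and the choice of 8 bits per entry are assumptions made for this example only; none of them are part of the HBase API.

```java
public class BloomParameterSketch {

    /** Approximates n as region size divided by the average value size (both in bytes). */
    static long estimateEntries(long regionSizeBytes, long averageValueSizeBytes) {
        return regionSizeBytes / averageValueSizeBytes;
    }

    /** Rounds (m/n) * ln(2) to the nearest integer, using at least one hash function. */
    static int optimalHashCount(long vectorSizeBits, long entries) {
        double ratio = (double) vectorSizeBits / entries;
        return Math.max(1, (int) Math.round(ratio * Math.log(2)));
    }

    public static void main(String[] args) {
        long regionSize = 64L * 1024 * 1024;   // 64MB default region size
        long averageValueSize = 1024;          // assumed 1KB average value
        long n = estimateEntries(regionSize, averageValueSize);
        long m = 8 * n;                        // assumed budget of 8 bits per entry
        int k = optimalHashCount(m, n);
        System.out.println("n=" + n + " m=" + m + " k=" + k);
    }
}
```

With these assumed inputs, n is 65536 and the rule suggests 6 hash functions. A different average value size changes n proportionally, which is why the parameters must be chosen per column.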


Nested Class Summary
static class BloomFilterDescriptor.BloomFilterType
          The type of bloom filter
 
Constructor Summary
BloomFilterDescriptor()
          Default constructor - used in conjunction with Writable
BloomFilterDescriptor(BloomFilterDescriptor.BloomFilterType type, int numberOfEntries)
          Creates a BloomFilterDescriptor for the specified type of filter. The number of hash functions is fixed at 4 and the vector size is computed as vectorSize = ceil((4 * n) / ln(2)), where n is numberOfEntries.
BloomFilterDescriptor(BloomFilterDescriptor.BloomFilterType type, int vectorSize, int nbHash)
           
 
Method Summary
 int compareTo(Object o)
          
 boolean equals(Object obj)
          
 int getNbHash()
           
 BloomFilterDescriptor.BloomFilterType getType()
           
 int getVectorSize()
           
 int hashCode()
          
 void readFields(DataInput in)
          Deserialize the fields of this object from in.
 String toString()
          
 void write(DataOutput out)
          Serialize the fields of this object to out.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

BloomFilterDescriptor

public BloomFilterDescriptor()
Default constructor - used in conjunction with Writable


BloomFilterDescriptor

public BloomFilterDescriptor(BloomFilterDescriptor.BloomFilterType type,
                             int numberOfEntries)
Creates a BloomFilterDescriptor for the specified type of filter. The number of hash functions is fixed at 4 and the vector size is computed as vectorSize = ceil((4 * n) / ln(2)), where n is numberOfEntries.

Parameters:
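type - The kind of bloom filter to use.
numberOfEntries - The expected number of entries (n) to be inserted into the filter.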
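
The vector size formula is what results from solving k = (m/n) * ln(2) for m with k fixed at 4. The sketch below reproduces only that arithmetic; the class and method names are hypothetical and this is not the constructor's actual source.

```java
public class VectorSizeSketch {

    /** Number of hash functions that this constructor is documented to use. */
    static final int NB_HASH = 4;

    /** Computes ceil((4 * n) / ln(2)) for an expected entry count n. */
    static int vectorSizeFor(int numberOfEntries) {
        return (int) Math.ceil((NB_HASH * (double) numberOfEntries) / Math.log(2));
    }

    public static void main(String[] args) {
        int n = 100;                      // assumed number of entries
        int m = vectorSizeFor(n);         // 400 / 0.6931... = 577.08, rounded up to 578
        System.out.println("vectorSize=" + m + " nbHash=" + NB_HASH);
    }
}
```

For 100 entries this gives a vector of 578 bits, or roughly 5.8 bits per entry.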

BloomFilterDescriptor

public BloomFilterDescriptor(BloomFilterDescriptor.BloomFilterType type,
                             int vectorSize,
                             int nbHash)
Parameters:
type - The kind of bloom filter to use.
vectorSize - The vector size of this filter.
nbHash - The number of hash functions to consider.
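
When vectorSize and nbHash are chosen by hand, it may help to estimate the false positive rate they imply. The sketch below uses the standard approximation p = (1 - e^(-k*n/m))^k from the Bloom filter literature; the class name, method name and sample values are assumptions for illustration and are not part of this class.

```java
public class FalsePositiveSketch {

    /**
     * Approximate false positive probability for a filter of m bits
     * holding n entries with k hash functions: (1 - e^(-k*n/m))^k.
     */
    static double falsePositiveRate(int vectorSize, int nbHash, int entries) {
        double exponent = -((double) nbHash * entries) / vectorSize;
        return Math.pow(1.0 - Math.exp(exponent), nbHash);
    }

    public static void main(String[] args) {
        int entries = 100;       // assumed n
        int vectorSize = 578;    // ceil((4 * 100) / ln(2))
        int nbHash = 4;
        double p = falsePositiveRate(vectorSize, nbHash, entries);
        System.out.printf("estimated false positive rate: %.4f%n", p);
    }
}
```

With k = 4 and the matching vector size, the estimate is close to (1/2)^4, that is, about 6%. Doubling vectorSize while keeping nbHash at 4 lowers the rate, but 4 is then no longer the optimal hash count for that size.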
Method Detail

toString

public String toString()

Overrides:
toString in class Object

getType

public BloomFilterDescriptor.BloomFilterType getType()

getVectorSize

public int getVectorSize()

getNbHash

public int getNbHash()

equals

public boolean equals(Object obj)

Overrides:
equals in class Object

hashCode

public int hashCode()

Overrides:
hashCode in class Object

readFields

public void readFields(DataInput in)
                throws IOException
Deserialize the fields of this object from in.

For efficiency, implementations should attempt to re-use storage in the existing object where possible.

Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException

write

public void write(DataOutput out)
           throws IOException
Serialize the fields of this object to out.

Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException
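
The write and readFields pair is normally exercised as a round trip through a byte stream. The sketch below shows that pattern using only java.io. DescriptorStandIn is a hypothetical stand-in with an assumed field layout (three ints); it is not the real BloomFilterDescriptor and does not reflect its actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableRoundTripSketch {

    /** Hypothetical stand-in; the field layout is assumed for illustration. */
    static final class DescriptorStandIn {
        int typeOrdinal;
        int vectorSize;
        int nbHash;

        void write(DataOutput out) throws IOException {
            out.writeInt(typeOrdinal);
            out.writeInt(vectorSize);
            out.writeInt(nbHash);
        }

        /** Overwrites the fields of this existing object instead of allocating a new one. */
        void readFields(DataInput in) throws IOException {
            typeOrdinal = in.readInt();
            vectorSize = in.readInt();
            nbHash = in.readInt();
        }
    }

    /** Serializes the given object to a byte array and reads it back into a fresh instance. */
    static DescriptorStandIn roundTrip(DescriptorStandIn original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            DescriptorStandIn copy = new DescriptorStandIn();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        DescriptorStandIn original = new DescriptorStandIn();
        original.typeOrdinal = 0;
        original.vectorSize = 578;
        original.nbHash = 4;
        DescriptorStandIn copy = roundTrip(original);
        System.out.println(copy.vectorSize + " " + copy.nbHash);
    }
}
```

The only property the real class needs to share with this sketch is that readFields consumes fields in the same order that write emits them.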

compareTo

public int compareTo(Object o)

Specified by:
compareTo in interface Comparable


Copyright © 2006 The Apache Software Foundation