|
||||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectorg.apache.hadoop.hbase.util.ByteBloomFilter
public class ByteBloomFilter
Implements a Bloom filter, as defined by Bloom in 1970.
The Bloom filter is a data structure that was introduced in 1970 and that has been adopted by the networking research community in the past decade thanks to the bandwidth efficiencies that it offers for the transmission of set membership information between networked hosts. A sender encodes the information into a bit vector, the Bloom filter, that is more compact than a conventional representation. Computation and space costs for construction are linear in the number of elements. The receiver uses the filter to test whether various elements are members of the set. Though the filter will occasionally return a false positive, it will never return a false negative. When creating the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size.
Originally inspired by European Commission One-Lab Project 034819.
The general behavior of a filter
,
Space/Time Trade-Offs in Hash Coding with Allowable ErrorsField Summary | |
---|---|
protected ByteBuffer |
bloom
Bloom bits |
protected long |
byteSize
Bytes (B) in the array |
protected Hash |
hash
Hash Function |
protected int |
hashCount
Number of hash functions |
protected int |
hashType
Hash type |
protected int |
keyCount
Keys currently in the bloom |
protected int |
maxKeys
Max Keys expected for the bloom |
static int |
VERSION
Current file format version |
Constructor Summary | |
---|---|
ByteBloomFilter(ByteBuffer meta)
Loads bloom filter meta data from file input. |
|
ByteBloomFilter(int maxKeys,
float errorRate,
int hashType,
int foldFactor)
Determines & initializes bloom filter meta data from user config. |
Method Summary | |
---|---|
void |
add(byte[] buf)
Add the specified binary to the bloom filter. |
void |
add(byte[] buf,
int offset,
int len)
Add the specified binary to the bloom filter. |
void |
allocBloom()
Allocate memory for the bloom filter data. |
void |
compactBloom()
Compact the bloom before writing metadata & data to disk |
boolean |
contains(byte[] buf,
ByteBuffer theBloom)
Check if the specified key is contained in the bloom filter. |
boolean |
contains(byte[] buf,
int offset,
int length,
ByteBuffer theBloom)
Check if the specified key is contained in the bloom filter. |
int |
getByteSize()
|
org.apache.hadoop.io.Writable |
getDataWriter()
Get a writable interface into bloom filter data (actual bloom). |
int |
getKeyCount()
|
int |
getMaxKeys()
|
org.apache.hadoop.io.Writable |
getMetaWriter()
Get a writable interface into bloom filter meta data. |
void |
writeBloom(DataOutput out)
Writes just the bloom filter to the output array |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int VERSION
protected long byteSize
protected final int hashCount
protected final int hashType
protected final Hash hash
protected int keyCount
protected int maxKeys
protected ByteBuffer bloom
Constructor Detail |
---|
public ByteBloomFilter(ByteBuffer meta) throws IllegalArgumentException
meta
- stored bloom meta data
IllegalArgumentException
- meta data is invalidpublic ByteBloomFilter(int maxKeys, float errorRate, int hashType, int foldFactor) throws IllegalArgumentException
allocBloom()
to allocate bloom filter data.
maxKeys
- Maximum expected number of keys that will be stored in this bloomerrorRate
- Desired false positive error rate. Lower rate = more storage requiredhashType
- Type of hash function to usefoldFactor
- When finished adding entries, you may be able to 'fold'
this bloom to save space. Tradeoff potentially excess bytes in bloom for
ability to fold if keyCount is exponentially greater than maxKeys.
IllegalArgumentException
Method Detail |
---|
public void allocBloom()
BloomFilter
allocBloom
in interface BloomFilter
public void add(byte[] buf)
BloomFilter
add
in interface BloomFilter
buf
- data to be added to the bloompublic void add(byte[] buf, int offset, int len)
BloomFilter
add
in interface BloomFilter
buf
- data to be added to the bloomoffset
- offset into the data to be addedlen
- length of the data to be addedpublic boolean contains(byte[] buf, ByteBuffer theBloom)
BloomFilter
contains
in interface BloomFilter
buf
- data to check for existence oftheBloom
- bloom filter data to search
public boolean contains(byte[] buf, int offset, int length, ByteBuffer theBloom)
BloomFilter
contains
in interface BloomFilter
buf
- data to check for existence ofoffset
- offset into the datalength
- length of the datatheBloom
- bloom filter data to search
public int getKeyCount()
getKeyCount
in interface BloomFilter
public int getMaxKeys()
getMaxKeys
in interface BloomFilter
public int getByteSize()
getByteSize
in interface BloomFilter
public void compactBloom()
BloomFilter
compactBloom
in interface BloomFilter
public void writeBloom(DataOutput out) throws IOException
out
- OutputStream to place bloom
IOException
- Error writing bloom arraypublic org.apache.hadoop.io.Writable getMetaWriter()
BloomFilter
getMetaWriter
in interface BloomFilter
public org.apache.hadoop.io.Writable getDataWriter()
BloomFilter
getDataWriter
in interface BloomFilter
|
||||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |