org.apache.hadoop.hbase.io.hfile
Class LruBlockCache

java.lang.Object
  extended by org.apache.hadoop.hbase.io.hfile.LruBlockCache
All Implemented Interfaces:
HeapSize, BlockCache

public class LruBlockCache
extends Object
implements BlockCache, HeapSize

A block cache implementation that is memory-aware using HeapSize, memory-bound using an LRU eviction algorithm, and concurrent: backed by a ConcurrentHashMap and with a non-blocking eviction thread giving constant-time cacheBlock(BlockCacheKey, Cacheable, boolean) and getBlock(BlockCacheKey, boolean, boolean) operations.

Contains three levels of block priority to allow for scan-resistance and in-memory families. A block is added with the inMemory flag if necessary; otherwise it starts at single-access priority. Once a block is accessed again, it is promoted to multiple-access priority. This prevents scans from thrashing the cache by adding a least-frequently-used element to the eviction algorithm.
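The promotion rule can be illustrated with a minimal, self-contained sketch. The names PrioritySketch, Priority, and Entry are hypothetical and are not part of the HBase API; the sketch only models how a block's priority might change on access and makes no claim about the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the three block priorities; not the HBase implementation.
class PrioritySketch {
    enum Priority { SINGLE, MULTI, MEMORY }

    static final class Entry {
        Priority priority;

        Entry(boolean inMemory) {
            // In-memory blocks get their own priority; all others start as single-access.
            this.priority = inMemory ? Priority.MEMORY : Priority.SINGLE;
        }

        void access() {
            // A further access promotes a single-access block; other priorities are unchanged.
            if (priority == Priority.SINGLE) {
                priority = Priority.MULTI;
            }
        }
    }

    private final Map<String, Entry> map = new HashMap<>();

    void cache(String key, boolean inMemory) {
        map.put(key, new Entry(inMemory));
    }

    // Looks up a block, records the access, and returns the resulting priority.
    Priority get(String key) {
        Entry e = map.get(key);
        if (e == null) {
            return null;
        }
        e.access();
        return e.priority;
    }

    // Returns the current priority without recording an access.
    Priority peek(String key) {
        Entry e = map.get(key);
        return e == null ? null : e.priority;
    }
}
```

In this model a block read once by a scan stays at SINGLE and is therefore a cheap eviction candidate, while a block that is read again moves to MULTI.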

Each priority is given its own chunk of the total cache to ensure fairness during eviction. Each priority retains close to its maximum size; however, if any priority is not using its entire chunk, the others can grow beyond their chunk sizes.

Instantiated at a minimum with the total size and average block size. All sizes are in bytes. The block size is not especially important as this cache is fully dynamic in its sizing of blocks. It is only used for pre-allocating data structures and in initial heap estimation of the map.

The detailed constructor defines the sizes for the three priorities (they should total to the maximum size defined). It also sets the levels that trigger and control the eviction thread.

The acceptable size is the cache size that triggers the eviction process. Eviction then removes enough blocks to bring the cache size below the specified minimum size.
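The relationship between the two thresholds can be worked through with a small sketch. ThresholdSketch and its methods are hypothetical helpers, the factors are assumed to be fractions of the maximum size, and the values used in the note that follows are purely illustrative.

```java
// Hypothetical helper showing the threshold arithmetic; not the HBase implementation.
class ThresholdSketch {
    // Cache size above which eviction is triggered.
    static long acceptableSize(long maxSize, double acceptableFactor) {
        return (long) Math.floor(maxSize * acceptableFactor);
    }

    // Cache size that an eviction run tries to get down to.
    static long minSize(long maxSize, double minFactor) {
        return (long) Math.floor(maxSize * minFactor);
    }

    // True once the current size has passed the acceptable size.
    static boolean needsEviction(long currentSize, long maxSize, double acceptableFactor) {
        return currentSize > acceptableSize(maxSize, acceptableFactor);
    }

    // Bytes an eviction run would have to free; never negative.
    static long bytesToFree(long currentSize, long maxSize, double minFactor) {
        return Math.max(0L, currentSize - minSize(maxSize, minFactor));
    }
}
```

For example, with a maximum size of 1024 bytes, an acceptable factor of 0.875, and a minimum factor of 0.75, eviction starts once the cache exceeds 896 bytes and aims for 768 bytes, so a cache holding 900 bytes would need to free 132 bytes.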

Eviction happens in a separate thread and involves a single full scan of the map. It determines how many bytes must be freed to reach the minimum size, and then, while scanning, selects the least-recently-used blocks from each of the three priorities that would be needed to free that many bytes (up to three times the bytes to free in total). It then uses the priority chunk sizes to evict fairly according to the relative sizes and usage.
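One possible way to split an eviction target fairly across the priorities is sketched here. This is a simplified model under stated assumptions, not the class's actual code: FairEvictionSketch, Bucket, and plan are hypothetical names, each bucket is reduced to two numbers (bytes used and bytes it is entitled to), and the sketch assumes a bucket can free exactly the number of bytes requested, whereas a real cache frees whole blocks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of splitting an eviction target across priority buckets.
class FairEvictionSketch {
    static final class Bucket {
        final String name;
        final long used;   // bytes currently held by this priority
        final long chunk;  // bytes this priority is entitled to
        long toFree;       // bytes this bucket is asked to give up

        Bucket(String name, long used, long chunk) {
            this.name = name;
            this.used = used;
            this.chunk = chunk;
        }

        long overflow() {
            return used - chunk;
        }
    }

    // Visits buckets from smallest to largest overflow. Each over-budget bucket
    // frees at most its overflow and at most an even share of what is still owed.
    static long plan(List<Bucket> buckets, long bytesToFree) {
        List<Bucket> sorted = new ArrayList<>(buckets);
        sorted.sort(Comparator.comparingLong(Bucket::overflow));
        int remaining = sorted.size();
        long freed = 0;
        for (Bucket b : sorted) {
            long overflow = b.overflow();
            if (overflow > 0) {
                long share = Math.max(0L, (bytesToFree - freed) / remaining);
                b.toFree = Math.min(overflow, share);
                freed += b.toFree;
            }
            remaining--;
        }
        return freed;
    }
}
```

Buckets that are under their chunk size are skipped, so the priorities that have grown beyond their share absorb the eviction. With buckets of 400/250, 600/500, and 0/250 (used/chunk) and 200 bytes to free, the two over-budget buckets are each asked for 100 bytes.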


Field Summary
static long CACHE_FIXED_OVERHEAD
           
 
Constructor Summary
LruBlockCache(long maxSize, long blockSize, boolean evictionThread, org.apache.hadoop.conf.Configuration conf)
          Constructor used for testing.
LruBlockCache(long maxSize, long blockSize, boolean evictionThread, int mapInitialSize, float mapLoadFactor, int mapConcurrencyLevel, float minFactor, float acceptableFactor, float singleFactor, float multiFactor, float memoryFactor)
          Configurable constructor.
LruBlockCache(long maxSize, long blockSize, org.apache.hadoop.conf.Configuration conf)
          Default constructor.
 
Method Summary
 void cacheBlock(BlockCacheKey cacheKey, Cacheable buf)
          Cache the block with the specified name and buffer.
 void cacheBlock(BlockCacheKey cacheKey, Cacheable buf, boolean inMemory)
          Cache the block with the specified name and buffer.
static long calculateOverhead(long maxSize, long blockSize, int concurrency)
           
 void clearCache()
          Clears the cache.
 boolean evictBlock(BlockCacheKey cacheKey)
          Evict block from cache.
protected  long evictBlock(CachedBlock block)
           
 int evictBlocksByHfileName(String hfileName)
          Evicts all blocks for a specific HFile.
 Cacheable getBlock(BlockCacheKey cacheKey, boolean caching, boolean repeat)
          Get the buffer of the block with the specified name.
 List<BlockCacheColumnFamilySummary> getBlockCacheColumnFamilySummaries(org.apache.hadoop.conf.Configuration conf)
          Performs a BlockCache summary and returns a List of BlockCacheColumnFamilySummary objects.
 long getBlockCount()
          Returns the number of blocks currently cached in the block cache.
 long getCurrentSize()
          Get the current size of this cache.
 Map<DataBlockEncoding,Integer> getEncodingCountsForTest()
           
 long getEvictedCount()
          Get the number of blocks that have been evicted during the lifetime of this cache.
 long getEvictionCount()
          Get the number of eviction runs that have occurred.
 long getFreeSize()
          Get the free size of this cache.
 long getMaxSize()
          Get the maximum size of this cache.
 CacheStats getStats()
          Get counter statistics for this cache.
 long heapSize()
           
 void logStats()
           
 void setMaxSize(long maxSize)
           
 void shutdown()
          Shutdown the cache.
 long size()
          Get the size of this cache (number of cached blocks)
protected  long updateSizeMetrics(CachedBlock cb, boolean evict)
          Helper function that updates the local size counter and also updates any per-cf or per-blocktype metrics it can discern from given CachedBlock
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CACHE_FIXED_OVERHEAD

public static final long CACHE_FIXED_OVERHEAD
Constructor Detail

LruBlockCache

public LruBlockCache(long maxSize,
                     long blockSize,
                     org.apache.hadoop.conf.Configuration conf)
Default constructor. Specify maximum size and expected average block size (approximation is fine).

All other factors will be calculated based on defaults specified in this class.

Parameters:
maxSize - maximum size of cache, in bytes
blockSize - approximate size of each block, in bytes
conf - configuration

LruBlockCache

public LruBlockCache(long maxSize,
                     long blockSize,
                     boolean evictionThread,
                     org.apache.hadoop.conf.Configuration conf)
Constructor used for testing. Allows disabling of the eviction thread.


LruBlockCache

public LruBlockCache(long maxSize,
                     long blockSize,
                     boolean evictionThread,
                     int mapInitialSize,
                     float mapLoadFactor,
                     int mapConcurrencyLevel,
                     float minFactor,
                     float acceptableFactor,
                     float singleFactor,
                     float multiFactor,
                     float memoryFactor)
Configurable constructor. Use this constructor if not using defaults.

Parameters:
maxSize - maximum size of this cache, in bytes
blockSize - expected average size of blocks, in bytes
evictionThread - whether to run evictions in a background thread
mapInitialSize - initial size of backing ConcurrentHashMap
mapLoadFactor - initial load factor of backing ConcurrentHashMap
mapConcurrencyLevel - initial concurrency level of backing ConcurrentHashMap
minFactor - fraction of total size that eviction will evict until
acceptableFactor - fraction of total size that triggers eviction
singleFactor - fraction of total size for single-access blocks
multiFactor - fraction of total size for multiple-access blocks
memoryFactor - fraction of total size for in-memory blocks
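The constraints implied by these parameters can be sketched as follows. This is a hedged illustration: FactorSketch, validate, and chunkSize are hypothetical names, the factors are assumed to be fractions between 0 and 1, the specific checks are assumptions drawn from the descriptions above (the three priority factors should total the whole cache, and the minimum must lie below the acceptable level), and the chunk size follows the parameter descriptions literally. The real constructor may validate or scale differently.

```java
// Hypothetical validation of the constructor factors; not the HBase implementation.
class FactorSketch {
    static void validate(double minFactor, double acceptableFactor,
                         double singleFactor, double multiFactor, double memoryFactor) {
        // The three priority chunks are expected to add up to the whole cache.
        if (Math.abs(singleFactor + multiFactor + memoryFactor - 1.0) > 1e-6) {
            throw new IllegalArgumentException("single, multi, and memory factors should total 1.0");
        }
        // Eviction must have somewhere to go: the target lies below the trigger.
        if (minFactor >= acceptableFactor) {
            throw new IllegalArgumentException("minFactor must be less than acceptableFactor");
        }
        // Both thresholds lie below the maximum size.
        if (acceptableFactor >= 1.0) {
            throw new IllegalArgumentException("acceptableFactor must be less than 1.0");
        }
    }

    // Bytes reserved for one priority, taken as a plain fraction of the total size.
    static long chunkSize(long maxSize, double priorityFactor) {
        return (long) Math.floor(maxSize * priorityFactor);
    }
}
```

With illustrative factors of 0.25, 0.5, and 0.25 for single-access, multiple-access, and in-memory blocks, a 1024-byte cache would be divided into chunks of 256, 512, and 256 bytes.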
Method Detail

setMaxSize

public void setMaxSize(long maxSize)

cacheBlock

public void cacheBlock(BlockCacheKey cacheKey,
                       Cacheable buf,
                       boolean inMemory)
Cache the block with the specified name and buffer.

It is assumed this will NEVER be called on an already cached block. If that is done, an exception will be thrown.

Specified by:
cacheBlock in interface BlockCache
Parameters:
cacheKey - block's cache key
buf - block buffer
inMemory - if block is in-memory
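
The "never cache an already cached block" contract can be modeled with a short sketch. CacheOnceSketch is a hypothetical class that uses plain strings and byte arrays in place of BlockCacheKey and Cacheable, and the exception type is an assumption, since the description above only states that an exception will be thrown.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of rejecting a second insert for the same key.
class CacheOnceSketch {
    private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();
    private final AtomicLong size = new AtomicLong();

    void cacheBlock(String key, byte[] buf) {
        // putIfAbsent makes the existence check and the insert one atomic step.
        if (map.putIfAbsent(key, buf) != null) {
            throw new IllegalStateException("Block already cached: " + key);
        }
        // Only a successful insert changes the tracked cache size.
        size.addAndGet(buf.length);
    }

    long currentSize() {
        return size.get();
    }
}
```

Callers that may race to load the same block would typically look the block up first and cache it only on a miss.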

cacheBlock

public void cacheBlock(BlockCacheKey cacheKey,
                       Cacheable buf)
Cache the block with the specified name and buffer.

It is assumed this will NEVER be called on an already cached block. If that is done, it is assumed that you are reinserting the same exact block due to a race condition and will update the buffer but not modify the size of the cache.

Specified by:
cacheBlock in interface BlockCache
Parameters:
cacheKey - block's cache key
buf - block buffer

updateSizeMetrics

protected long updateSizeMetrics(CachedBlock cb,
                                 boolean evict)
Helper function that updates the local size counter and also updates any per-cf or per-blocktype metrics it can discern from the given CachedBlock.

Parameters:
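cb - the cached block being added or evicted
evict - true if the block is being evicted, false if it is being added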

getBlock

public Cacheable getBlock(BlockCacheKey cacheKey,
                          boolean caching,
                          boolean repeat)
Get the buffer of the block with the specified name.

Specified by:
getBlock in interface BlockCache
Parameters:
cacheKey - block's cache key
caching - true if the caller caches blocks on cache misses
repeat - whether this is a repeat lookup for the same block (used to avoid double counting cache misses when doing double-checked locking; see HFileReaderV2.readBlock(long, long, boolean, boolean, boolean, BlockType))
Returns:
buffer of specified cache key, or null if not in cache
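
The purpose of the repeat flag can be shown with a small model of a double-checked read. RepeatLookupSketch and its members are hypothetical; strings and byte arrays stand in for BlockCacheKey and Cacheable, and the caching parameter is omitted for brevity.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a lookup that avoids double counting misses.
class RepeatLookupSketch {
    private final ConcurrentHashMap<String, byte[]> map = new ConcurrentHashMap<>();
    final AtomicLong hits = new AtomicLong();
    final AtomicLong misses = new AtomicLong();

    byte[] getBlock(String key, boolean repeat) {
        byte[] buf = map.get(key);
        if (buf == null) {
            // A repeat lookup has already been counted as a miss once.
            if (!repeat) {
                misses.incrementAndGet();
            }
            return null;
        }
        hits.incrementAndGet();
        return buf;
    }

    // Double-checked load: look up, lock, look up again, then load and cache.
    byte[] read(String key) {
        byte[] buf = getBlock(key, false);
        if (buf != null) {
            return buf;
        }
        synchronized (this) {
            buf = getBlock(key, true);
            if (buf != null) {
                return buf;
            }
            buf = new byte[] {1, 2, 3}; // stand-in for reading the block from disk
            map.put(key, buf);
            return buf;
        }
    }
}
```

In this model the first read of a key records exactly one miss even though the cache is consulted twice, and the second read records one hit.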

evictBlock

public boolean evictBlock(BlockCacheKey cacheKey)
Description copied from interface: BlockCache
Evict block from cache.

Specified by:
evictBlock in interface BlockCache
Parameters:
cacheKey - Block to evict
Returns:
true if block existed and was evicted, false if not

evictBlocksByHfileName

public int evictBlocksByHfileName(String hfileName)
Evicts all blocks for a specific HFile. This is an expensive operation implemented as a linear-time search through all blocks in the cache. Ideally this should be a search in a log-access-time map.

This is used for evict-on-close to remove all blocks of a specific HFile.

Specified by:
evictBlocksByHfileName in interface BlockCache
Returns:
the number of blocks evicted
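
The linear-time search described above can be sketched as a scan over every cached key. EvictByFileSketch and its Key class are hypothetical; the key is assumed to carry an HFile name and a block offset, which may differ from the fields of the real BlockCacheKey.

```java
import java.util.Iterator;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of evicting every block that belongs to one file.
class EvictByFileSketch {
    static final class Key {
        final String hfileName;
        final long offset;

        Key(String hfileName, long offset) {
            this.hfileName = hfileName;
            this.offset = offset;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return offset == other.offset && hfileName.equals(other.hfileName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(hfileName, offset);
        }
    }

    private final ConcurrentHashMap<Key, byte[]> map = new ConcurrentHashMap<>();

    void cacheBlock(String hfileName, long offset, byte[] buf) {
        map.put(new Key(hfileName, offset), buf);
    }

    // Visits every key once, so the cost grows with the total number of cached blocks.
    int evictBlocksByHfileName(String hfileName) {
        int evicted = 0;
        for (Iterator<Key> it = map.keySet().iterator(); it.hasNext();) {
            if (it.next().hfileName.equals(hfileName)) {
                it.remove();
                evicted++;
            }
        }
        return evicted;
    }

    int size() {
        return map.size();
    }
}
```

An index keyed by file name would avoid the full scan at the cost of extra bookkeeping on every insert and eviction.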

evictBlock

protected long evictBlock(CachedBlock block)

getMaxSize

public long getMaxSize()
Get the maximum size of this cache.

Returns:
max size in bytes

getCurrentSize

public long getCurrentSize()
Get the current size of this cache.

Specified by:
getCurrentSize in interface BlockCache
Returns:
current size in bytes

getFreeSize

public long getFreeSize()
Get the free size of this cache.

Specified by:
getFreeSize in interface BlockCache
Returns:
free size in bytes

size

public long size()
Get the size of this cache (number of cached blocks)

Specified by:
size in interface BlockCache
Returns:
number of cached blocks

getBlockCount

public long getBlockCount()
Description copied from interface: BlockCache
Returns the number of blocks currently cached in the block cache.

Specified by:
getBlockCount in interface BlockCache
Returns:
number of blocks in the cache

getEvictionCount

public long getEvictionCount()
Get the number of eviction runs that have occurred.


getEvictedCount

public long getEvictedCount()
Get the number of blocks that have been evicted during the lifetime of this cache.

Specified by:
getEvictedCount in interface BlockCache
Returns:
number of evicted blocks

logStats

public void logStats()

getStats

public CacheStats getStats()
Get counter statistics for this cache.

Includes: total accesses, hits, misses, evicted blocks, and runs of the eviction processes.

Specified by:
getStats in interface BlockCache
Returns:
Stats

heapSize

public long heapSize()
Specified by:
heapSize in interface HeapSize
Returns:
Approximate 'exclusive deep size' of implementing object. Includes count of payload and hosting object sizings.

calculateOverhead

public static long calculateOverhead(long maxSize,
                                     long blockSize,
                                     int concurrency)

getBlockCacheColumnFamilySummaries

public List<BlockCacheColumnFamilySummary> getBlockCacheColumnFamilySummaries(org.apache.hadoop.conf.Configuration conf)
                                                                       throws IOException
Description copied from interface: BlockCache
Performs a BlockCache summary and returns a List of BlockCacheColumnFamilySummary objects. This method could be fairly heavyweight in that it evaluates the entire HBase file-system against what is in the RegionServer BlockCache.

The contract of this interface is to return the List in sorted order by Table name, then ColumnFamily.

Specified by:
getBlockCacheColumnFamilySummaries in interface BlockCache
Parameters:
conf - HBaseConfiguration
Returns:
List of BlockCacheColumnFamilySummary
Throws:
IOException - exception

shutdown

public void shutdown()
Description copied from interface: BlockCache
Shutdown the cache.

Specified by:
shutdown in interface BlockCache

clearCache

public void clearCache()
Clears the cache. Used in tests.


getEncodingCountsForTest

public Map<DataBlockEncoding,Integer> getEncodingCountsForTest()


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.