org.apache.hadoop.hbase.master.snapshot
Class SnapshotFileCache

java.lang.Object
  extended by org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache
All Implemented Interfaces:
Stoppable

@InterfaceAudience.Private
@InterfaceStability.Evolving
public class SnapshotFileCache
extends Object
implements Stoppable

Intelligently keep track of all the files for all the snapshots.

A cache of files is kept to avoid querying the FileSystem frequently. If there is a cache miss, the directory modification time is used to ensure that we don't rescan directories that we already have in the cache. We only check the modification times of the snapshot directories (/hbase/.snapshot/[snapshot_name]) to determine if the files need to be loaded into the cache.

New snapshots will be added to the cache and deleted snapshots will be removed when we refresh the cache. If the files underneath a snapshot directory are changed, but not the snapshot itself, we will ignore updates to that snapshot's files.

This is sufficient because each snapshot has its own directory and is added via an atomic rename once, when the snapshot is created. We don't need to worry about the data in a snapshot changing after that.

Further, the cache is periodically refreshed to ensure that files in snapshots that were deleted are also removed from the cache.

A SnapshotFileCache.SnapshotFileInspector must be passed when creating this cache to allow extraction of the files under the /hbase/.snapshot/[snapshot name] directory for each snapshot. This lets you cache only a subset of each snapshot's files: for instance, all the logs in the .logs directory, or all the files under all the regions.

This also considers all running snapshots (those under /hbase/.snapshot/.tmp) as valid snapshots and will attempt to cache files from those snapshots as well.

Queries about a given file are thread-safe with respect to multiple queries and cache refreshes.
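The modification-time check described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the HBase implementation: ModTimeCacheSketch, SnapshotDir, and scanCount are hypothetical names, and a plain map of snapshot name to directory state stands in for the FileSystem.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of a cache keyed on directory modification time. */
class ModTimeCacheSketch {
    /** Stand-in for one snapshot directory as seen on the filesystem. */
    static final class SnapshotDir {
        final long modTime;
        final Collection<String> files;

        SnapshotDir(long modTime, String... files) {
            this.modTime = modTime;
            this.files = Arrays.asList(files);
        }
    }

    /** What the cache remembers about a snapshot it has already scanned. */
    private static final class Cached {
        final long modTime;
        final Set<String> files;

        Cached(long modTime, Collection<String> files) {
            this.modTime = modTime;
            this.files = new HashSet<>(files);
        }
    }

    private final Map<String, Cached> snapshots = new HashMap<>();
    private int scans = 0; // how many directories were actually (re)scanned

    /** Refresh against the current directory listing (assumed input, not a real FileSystem call). */
    synchronized void refresh(Map<String, SnapshotDir> onDisk) {
        // Deleted snapshots are dropped from the cache.
        snapshots.keySet().retainAll(onDisk.keySet());
        for (Map.Entry<String, SnapshotDir> e : onDisk.entrySet()) {
            Cached known = snapshots.get(e.getKey());
            // Only new directories, or ones whose modification time changed, are scanned.
            if (known == null || known.modTime != e.getValue().modTime) {
                snapshots.put(e.getKey(), new Cached(e.getValue().modTime, e.getValue().files));
                scans++;
            }
        }
    }

    synchronized boolean isCached(String fileName) {
        for (Cached c : snapshots.values()) {
            if (c.files.contains(fileName)) {
                return true;
            }
        }
        return false;
    }

    synchronized int scanCount() {
        return scans;
    }
}
```

In this model, refreshing twice against an unchanged listing scans each directory only once, and a file added under a directory whose modification time did not change is ignored, matching the behavior described for files changed underneath a snapshot directory.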


Nested Class Summary
 class SnapshotFileCache.RefreshCacheTask
          Simple helper task that just periodically attempts to refresh the cache
 
Constructor Summary
SnapshotFileCache(org.apache.hadoop.conf.Configuration conf, long cacheRefreshPeriod, String refreshThreadName, org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.SnapshotFileInspector inspectSnapshotFiles)
          Create a snapshot file cache for all snapshots under the specified [root]/.snapshot on the filesystem.
SnapshotFileCache(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir, long cacheRefreshPeriod, long cacheRefreshDelay, String refreshThreadName, org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.SnapshotFileInspector inspectSnapshotFiles)
          Create a snapshot file cache for all snapshots under the specified [root]/.snapshot on the filesystem.
 
Method Summary
 boolean contains(String fileName)
          Check to see if the passed file name is contained in any of the snapshots.
 boolean isStopped()
           
 void stop(String why)
          Stop this service.
 void triggerCacheRefreshForTesting()
          Trigger a cache refresh, even if it's before the next scheduled cache refresh.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SnapshotFileCache

public SnapshotFileCache(org.apache.hadoop.conf.Configuration conf,
                         long cacheRefreshPeriod,
                         String refreshThreadName,
                         org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.SnapshotFileInspector inspectSnapshotFiles)
                  throws IOException
Create a snapshot file cache for all snapshots under the specified [root]/.snapshot on the filesystem.

Immediately loads the file cache.

Parameters:
conf - to extract the configured FileSystem where the snapshots are stored and hbase root directory
cacheRefreshPeriod - frequency (ms) with which the cache should be refreshed
refreshThreadName - name of the cache refresh thread
inspectSnapshotFiles - Filter to apply to each snapshot to extract the files.
Throws:
IOException - if the FileSystem or root directory cannot be loaded

SnapshotFileCache

public SnapshotFileCache(org.apache.hadoop.fs.FileSystem fs,
                         org.apache.hadoop.fs.Path rootDir,
                         long cacheRefreshPeriod,
                         long cacheRefreshDelay,
                         String refreshThreadName,
                         org.apache.hadoop.hbase.master.snapshot.SnapshotFileCache.SnapshotFileInspector inspectSnapshotFiles)
Create a snapshot file cache for all snapshots under the specified [root]/.snapshot on the filesystem.

Parameters:
fs - FileSystem where the snapshots are stored
rootDir - hbase root directory
cacheRefreshPeriod - period (ms) with which the cache should be refreshed
cacheRefreshDelay - amount of time to wait for the cache to be refreshed
refreshThreadName - name of the cache refresh thread
inspectSnapshotFiles - Filter to apply to each snapshot to extract the files.
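To show how a refresh period, a delay, and a thread name might fit together, here is a minimal sketch built on java.util.Timer. It rests on assumptions: the page does not say which scheduling mechanism the class uses, treating cacheRefreshDelay as the initial delay before the first scheduled refresh is an interpretation, and PeriodicRefreshSketch, triggerRefreshNow, and refreshCount are hypothetical names.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a periodically refreshed, stoppable service. */
class PeriodicRefreshSketch {
    private final Timer timer;
    private final AtomicInteger refreshCount = new AtomicInteger();
    private volatile boolean stopped = false;

    PeriodicRefreshSketch(long refreshDelayMs, long refreshPeriodMs, String refreshThreadName) {
        // Named daemon thread, so the refresher never keeps the JVM alive on its own.
        this.timer = new Timer(refreshThreadName, true);
        this.timer.schedule(new TimerTask() {
            @Override
            public void run() {
                refresh();
            }
        }, refreshDelayMs, refreshPeriodMs);
    }

    /** Placeholder for the real work of rescanning snapshot directories. */
    private synchronized void refresh() {
        refreshCount.incrementAndGet();
    }

    /** On-demand refresh on the caller's thread; scheduled refreshes are unaffected. */
    void triggerRefreshNow() {
        refresh();
    }

    /** Cancels future scheduled refreshes; the reason is ignored in this sketch. */
    void stop(String why) {
        stopped = true;
        timer.cancel();
    }

    boolean isStopped() {
        return stopped;
    }

    int refreshCount() {
        return refreshCount.get();
    }
}
```

Because the on-demand refresh runs on the calling thread, it returns only after the refresh has completed, while the timer keeps its own schedule until stop is called.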
Method Detail

triggerCacheRefreshForTesting

public void triggerCacheRefreshForTesting()
Trigger a cache refresh, even if it's before the next scheduled cache refresh. Does not affect pending cache refreshes.

Blocks until the cache is refreshed.

Exposed for TESTING.


contains

public boolean contains(String fileName)
                 throws IOException
Check to see if the passed file name is contained in any of the snapshots. First checks an in-memory cache of the files to keep. If it's not in the cache, then the cache is refreshed and checked again for that file. This ensures that we always return true for a file that exists.

Note this may lead to periodic false positives for the file being referenced. The cache is also refreshed periodically, even if there are no requests, to ensure that these false positives get removed eventually. For instance, suppose a file in a snapshot gets loaded into the cache, and at some later point that snapshot is deleted. If the cache has not been refreshed since then, the cache will still think the file system contains that file and return true, even though it is no longer present (false positive). However, if the file never was on the filesystem, we will never find it and always return false.

Parameters:
fileName - file to check
Returns:
false if the file is not referenced in any current or running snapshot, true if the file is in the cache.
Throws:
IOException - if there is an unexpected error reaching the filesystem.
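The check, refresh on miss, check again sequence and the resulting false positive can be sketched as follows. This is a simplified model rather than the actual implementation: ContainsSketch is a hypothetical name, and a Supplier returning the set of currently referenced file names stands in for scanning the snapshot directories.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical model of a lookup that refreshes the cache only on a miss. */
class ContainsSketch {
    // Assumed stand-in for "all files referenced by snapshots on the filesystem".
    private final Supplier<Set<String>> filesystemView;
    private Set<String> cache = new HashSet<>();

    ContainsSketch(Supplier<Set<String>> filesystemView) {
        this.filesystemView = filesystemView;
        refresh(); // load the cache immediately
    }

    /** Replace the cache contents with a fresh copy of the filesystem view. */
    synchronized void refresh() {
        cache = new HashSet<>(filesystemView.get());
    }

    synchronized boolean contains(String fileName) {
        if (cache.contains(fileName)) {
            return true; // cache hit; may be stale if the snapshot was deleted since
        }
        refresh(); // cache miss: reload, then check once more
        return cache.contains(fileName);
    }
}
```

In this model a hit never triggers a refresh, so a file whose snapshot was deleted keeps returning true until a miss or a periodic refresh reloads the cache. A file that was never present always causes a refresh and then returns false.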

stop

public void stop(String why)
Description copied from interface: Stoppable
Stop this service.

Specified by:
stop in interface Stoppable
Parameters:
why - Why we're stopping.

isStopped

public boolean isStopped()
Specified by:
isStopped in interface Stoppable
Returns:
True if Stoppable.stop(String) has been called.


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.