org.apache.hadoop.hbase.regionserver
Class StoreFile

java.lang.Object
  extended by org.apache.hadoop.hbase.regionserver.StoreFile

public class StoreFile
extends Object

A Store data file. Stores usually have one or more of these files. They are produced by flushing the memstore to disk. To create one, call createWriter(FileSystem, Path, int, Configuration, CacheConfig) and append data. Be sure to add any metadata before calling close on the Writer (use the appendMetadata convenience methods). After close, the StoreFile sits in the filesystem. To refer to it, create a StoreFile instance, passing the filesystem and path. To read it, call createReader().

StoreFiles may also reference store files in another Store. A different instance is used for the writer and for the reader because a store file is written once but read many times.
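
The write-once, read-many lifecycle described above can be illustrated with a minimal in-memory sketch. Everything in it is hypothetical: SketchWriter and SketchClosedFile are stand-ins invented for illustration, their signatures do not mirror StoreFile.Writer, and returning the finished file from close() is a simplification. The point is only that data and metadata are accepted until close, after which the result is immutable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical immutable result of a finished write; not an HBase class. */
final class SketchClosedFile {
    final List<String> rows;
    final Map<String, String> metadata;

    SketchClosedFile(List<String> rows, Map<String, String> metadata) {
        // Defensive copies, so later changes in the writer cannot leak in.
        this.rows = Collections.unmodifiableList(new ArrayList<>(rows));
        this.metadata = Collections.unmodifiableMap(new LinkedHashMap<>(metadata));
    }
}

/** Hypothetical writer: accepts data and metadata only until close(). */
final class SketchWriter {
    private final List<String> rows = new ArrayList<>();
    private final Map<String, String> metadata = new LinkedHashMap<>();
    private boolean closed;

    void append(String row) {
        ensureOpen();
        rows.add(row);
    }

    void appendMetadata(String key, String value) {
        ensureOpen();
        metadata.put(key, value);
    }

    /** Freezes the content; assumption: the sketch hands back the finished file. */
    SketchClosedFile close() {
        ensureOpen();
        closed = true;
        return new SketchClosedFile(rows, metadata);
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("writer already closed");
        }
    }
}
```

In this sketch, calling appendMetadata after close() throws IllegalStateException, which mirrors the advice to add metadata before closing the Writer.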


Nested Class Summary
static class StoreFile.BloomType
           
static class StoreFile.Reader
          Reader for a StoreFile.
static class StoreFile.Writer
          A StoreFile writer.
 
Field Summary
static byte[] BULKLOAD_TASK_KEY
          Meta key set when store file is a result of a bulk load
static byte[] BULKLOAD_TIME_KEY
           
static int DEFAULT_BLOCKSIZE_SMALL
           
static byte[] MAJOR_COMPACTION_KEY
          Major compaction flag in FileInfo
static byte[] MAX_SEQ_ID_KEY
          Max Sequence ID in FileInfo
static byte[] TIMERANGE_KEY
          Key for Timerange information in metadata
 
Method Summary
 void closeReader(boolean evictOnClose)
           
static HDFSBlocksDistribution computeHDFSBlockDistribution(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p)
          Helper function to compute the HDFS blocks distribution of a given file.
 StoreFile.Reader createReader()
           
static StoreFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, int blocksize, Compression.Algorithm algorithm, KeyValue.KVComparator c, org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, StoreFile.BloomType bloomType, long maxKeyCount)
          Create a store file writer.
static StoreFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, int blocksize, org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf)
          Get a store file writer.
 void deleteReader()
          Delete this file
 long getBulkLoadTimestamp()
          Return the timestamp at which this bulk load file was generated.
 HDFSBlocksDistribution getHDFSBlockDistribution()
           
 long getMaxMemstoreTS()
           
static long getMaxMemstoreTSInList(Collection<StoreFile> sfs)
          Return the largest memstoreTS found across all storefiles in the given list.
 long getMaxSequenceId()
           
static long getMaxSequenceIdInList(Collection<StoreFile> sfs)
          Return the highest sequence ID found across all storefiles in the given list.
 long getModificationTimeStamp()
           
 StoreFile.Reader getReader()
           
static org.apache.hadoop.fs.Path getUniqueFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
           
static boolean isReference(org.apache.hadoop.fs.Path p)
           
static boolean isReference(org.apache.hadoop.fs.Path p, Matcher m)
           
static org.apache.hadoop.fs.Path rename(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path tgt)
          Utility to help with rename.
 void setMaxMemstoreTS(long maxMemstoreTS)
           
 String toString()
           
 String toStringDetailed()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

MAX_SEQ_ID_KEY

public static final byte[] MAX_SEQ_ID_KEY
Max Sequence ID in FileInfo


MAJOR_COMPACTION_KEY

public static final byte[] MAJOR_COMPACTION_KEY
Major compaction flag in FileInfo


TIMERANGE_KEY

public static final byte[] TIMERANGE_KEY
Key for Timerange information in metadata


DEFAULT_BLOCKSIZE_SMALL

public static final int DEFAULT_BLOCKSIZE_SMALL
See Also:
Constant Field Values

BULKLOAD_TASK_KEY

public static final byte[] BULKLOAD_TASK_KEY
Meta key set when store file is a result of a bulk load


BULKLOAD_TIME_KEY

public static final byte[] BULKLOAD_TIME_KEY

Method Detail

getMaxMemstoreTS

public long getMaxMemstoreTS()

setMaxMemstoreTS

public void setMaxMemstoreTS(long maxMemstoreTS)

isReference

public static boolean isReference(org.apache.hadoop.fs.Path p)
Parameters:
p - Path to check.
Returns:
True if the path has the format of an HStoreFile reference.

isReference

public static boolean isReference(org.apache.hadoop.fs.Path p,
                                  Matcher m)
Parameters:
p - Path to check.
m - Matcher to use.
Returns:
True if the path has the format of an HStoreFile reference.

getMaxSequenceId

public long getMaxSequenceId()
Returns:
This file's maximum edit sequence id.

getModificationTimeStamp

public long getModificationTimeStamp()

getMaxMemstoreTSInList

public static long getMaxMemstoreTSInList(Collection<StoreFile> sfs)
Return the largest memstoreTS found across all storefiles in the given list. Store files that were created by a MapReduce bulk load are ignored, as they do not correspond to any specific put operation and thus have no memstoreTS associated with them.

Returns:
0 if no non-bulk-load files are provided, or if this is a Store that does not yet have any store files.

getMaxSequenceIdInList

public static long getMaxSequenceIdInList(Collection<StoreFile> sfs)
Return the highest sequence ID found across all storefiles in the given list. Store files that were created by a MapReduce bulk load are ignored, as they do not correspond to any edit log items.

Returns:
0 if no non-bulk-load files are provided, or if this is a Store that does not yet have any store files.
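
The contract shared by getMaxMemstoreTSInList and getMaxSequenceIdInList (skip bulk-loaded files, return 0 when nothing qualifies) can be sketched as follows. SketchFile is a hypothetical stand-in with just the two properties the rule needs; it is not the HBase StoreFile class, and the loop is an assumption about how such a maximum might be computed, not the library's implementation.

```java
import java.util.Collection;

/** Hypothetical stand-in for a store file; not the HBase StoreFile class. */
final class SketchFile {
    final long maxSequenceId;
    final boolean bulkLoaded;

    SketchFile(long maxSequenceId, boolean bulkLoaded) {
        this.maxSequenceId = maxSequenceId;
        this.bulkLoaded = bulkLoaded;
    }
}

final class MaxSequenceIdSketch {
    /** Highest sequence id among non-bulk-load files, or 0 if there are none. */
    static long maxSequenceIdInList(Collection<SketchFile> files) {
        long max = 0;
        for (SketchFile f : files) {
            // Bulk-loaded files do not correspond to edit log items, so skip them.
            if (!f.bulkLoaded) {
                max = Math.max(max, f.maxSequenceId);
            }
        }
        return max;
    }
}
```

Given files with sequence ids 7, 99 (bulk-loaded), and 12, the sketch returns 12. An empty collection, or one containing only bulk-loaded files, yields 0.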

getBulkLoadTimestamp

public long getBulkLoadTimestamp()
Return the timestamp at which this bulk load file was generated.


getHDFSBlockDistribution

public HDFSBlocksDistribution getHDFSBlockDistribution()
Returns:
The cached HDFS blocks distribution. The cached value is calculated when the store file is opened.

computeHDFSBlockDistribution

public static HDFSBlocksDistribution computeHDFSBlockDistribution(org.apache.hadoop.fs.FileSystem fs,
                                                                  org.apache.hadoop.fs.Path p)
                                                           throws IOException
Helper function to compute the HDFS blocks distribution of a given file. For a reference file, the result is an estimate.

Parameters:
fs - The FileSystem
p - The path of the file
Returns:
HDFS blocks distribution
Throws:
IOException

createReader

public StoreFile.Reader createReader()
                              throws IOException
Returns:
Reader for this StoreFile. Creates one if necessary.
Throws:
IOException

getReader

public StoreFile.Reader getReader()
Returns:
Current reader. Returns null unless createReader() has been called first.
See Also:
createReader()
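
The relationship between createReader() and getReader() is a create-on-first-use pattern, sketched below. LazyReaderSketch and its opener parameter are hypothetical names introduced for illustration; the real StoreFile.Reader is opened from the filesystem and may throw IOException, which this sketch leaves out.

```java
import java.util.function.Supplier;

/** Hypothetical handle that opens its reader lazily; not an HBase class. */
final class LazyReaderSketch<R> {
    private final Supplier<R> opener; // assumption: whatever opens the underlying file
    private R reader;                 // stays null until createReader() is called

    LazyReaderSketch(Supplier<R> opener) {
        this.opener = opener;
    }

    /** Returns the existing reader, creating it on the first call. */
    R createReader() {
        if (reader == null) {
            reader = opener.get();
        }
        return reader;
    }

    /** Returns the current reader, or null if createReader() was never called. */
    R getReader() {
        return reader;
    }
}
```

Callers that cannot be sure a reader exists should therefore call createReader() rather than getReader().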

closeReader

public void closeReader(boolean evictOnClose)
                 throws IOException
Parameters:
evictOnClose -
Throws:
IOException

deleteReader

public void deleteReader()
                  throws IOException
Delete this file

Throws:
IOException

toString

public String toString()
Overrides:
toString in class Object

toStringDetailed

public String toStringDetailed()
Returns:
A lengthy description of this StoreFile, suitable for debug output.

rename

public static org.apache.hadoop.fs.Path rename(org.apache.hadoop.fs.FileSystem fs,
                                               org.apache.hadoop.fs.Path src,
                                               org.apache.hadoop.fs.Path tgt)
                                        throws IOException
Utility to help with rename.

Parameters:
fs -
src -
tgt -
Returns:
The target path if the rename succeeded.
Throws:
IOException

createWriter

public static StoreFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs,
                                            org.apache.hadoop.fs.Path dir,
                                            int blocksize,
                                            org.apache.hadoop.conf.Configuration conf,
                                            CacheConfig cacheConf)
                                     throws IOException
Get a store file writer. The client is responsible for closing the file when done.

Parameters:
fs -
dir - Path to family directory. Makes the directory if it doesn't exist. Creates a file with a unique name in this directory.
blocksize - size per filesystem block
Returns:
StoreFile.Writer
Throws:
IOException

createWriter

public static StoreFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs,
                                            org.apache.hadoop.fs.Path dir,
                                            int blocksize,
                                            Compression.Algorithm algorithm,
                                            KeyValue.KVComparator c,
                                            org.apache.hadoop.conf.Configuration conf,
                                            CacheConfig cacheConf,
                                            StoreFile.BloomType bloomType,
                                            long maxKeyCount)
                                     throws IOException
Create a store file writer. The client is responsible for closing the file when done. If there is metadata to add, add it BEFORE closing, using appendMetadata().

Parameters:
fs -
dir - Path to family directory. Makes the directory if it doesn't exist. Creates a file with a unique name in this directory.
blocksize -
algorithm - Pass null to get default.
c - Pass null to get default.
conf - HBase system configuration. Used with bloom filters.
cacheConf - Cache configuration and reference.
bloomType - column family setting for bloom filters
maxKeyCount - estimated maximum number of keys we expect to add
Returns:
StoreFile.Writer
Throws:
IOException

getUniqueFile

public static org.apache.hadoop.fs.Path getUniqueFile(org.apache.hadoop.fs.FileSystem fs,
                                                      org.apache.hadoop.fs.Path dir)
                                               throws IOException
Parameters:
fs -
dir - Directory to create file in.
Returns:
A random filename inside the passed directory.
Throws:
IOException
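
One way to pick a random, unused file name inside a directory is sketched below. This is not the HBase implementation: it uses java.nio.file.Path rather than the Hadoop FileSystem and Path types, and the dash-free UUID naming, the retry loop, and the class name UniqueFileSketch are all assumptions made for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

final class UniqueFileSketch {
    /** Returns a path under dir whose random name is not currently in use. */
    static Path uniqueFile(Path dir) {
        if (!Files.isDirectory(dir)) {
            throw new IllegalArgumentException("Expecting a directory: " + dir);
        }
        Path candidate;
        do {
            // Assumption: a dash-free UUID serves as the random name.
            String name = UUID.randomUUID().toString().replace("-", "");
            candidate = dir.resolve(name);
        } while (Files.exists(candidate)); // retry on the unlikely collision
        return candidate;
    }
}
```

The sketch only chooses a name and does not create the file, so another process could still claim the same name before the caller writes to it.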


Copyright © 2012 The Apache Software Foundation. All Rights Reserved.