org.apache.hadoop.hbase.regionserver
Class StoreFile

java.lang.Object
  extended by org.apache.hadoop.hbase.regionserver.StoreFile

@InterfaceAudience.LimitedPrivate(value="Coprocessor")
public class StoreFile
extends Object

A Store data file. Stores usually have one or more of these files. They are produced by flushing the memstore to disk. To create one, instantiate a writer using StoreFile.WriterBuilder and append data. Be sure to add any metadata before calling close on the Writer (use the appendMetadata convenience methods). After close, the StoreFile exists in the filesystem. To refer to it, create a StoreFile instance, passing the filesystem and path. To read, call createReader().

StoreFiles may also reference store files in another Store. The writer and the reader are separate instances because a store file is written once but read many times.


Nested Class Summary
static class StoreFile.Comparators
          Useful comparators for comparing StoreFiles.
static class StoreFile.Reader
          Reader for a StoreFile.
static class StoreFile.Writer
          A StoreFile writer.
static class StoreFile.WriterBuilder
           
 
Field Summary
static byte[] BULKLOAD_TASK_KEY
          Meta key set when store file is a result of a bulk load
static byte[] BULKLOAD_TIME_KEY
           
static int DEFAULT_BLOCKSIZE_SMALL
           
static byte[] DELETE_FAMILY_COUNT
          Delete Family Count in FileInfo
static byte[] EARLIEST_PUT_TS
          Key for timestamp of earliest-put in metadata
static byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY
          Minor compaction flag in FileInfo
static byte[] MAJOR_COMPACTION_KEY
          Major compaction flag in FileInfo
static byte[] MAX_SEQ_ID_KEY
          Max Sequence ID in FileInfo
static byte[] TIMERANGE_KEY
          Key for Timerange information in metadata
 
Constructor Summary
StoreFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p, org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, BloomType cfBloomType, HFileDataBlockEncoder dataBlockEncoder)
          Constructor; loads a reader and its indices, etc.
StoreFile(org.apache.hadoop.fs.FileSystem fs, StoreFileInfo fileInfo, org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, BloomType cfBloomType, HFileDataBlockEncoder dataBlockEncoder)
          Constructor; loads a reader and its indices, etc.
 
Method Summary
 void closeReader(boolean evictOnClose)
           
 StoreFile.Reader createReader()
           
 void deleteReader()
          Delete this file
 boolean excludeFromMinorCompaction()
           
 long getBulkLoadTimestamp()
          Return the timestamp at which this bulk load file was generated.
 HDFSBlocksDistribution getHDFSBlockDistribution()
           
 long getMaxMemstoreTS()
           
static long getMaxMemstoreTSInList(Collection<StoreFile> sfs)
          Return the largest memstoreTS found across all storefiles in the given list.
 long getMaxSequenceId()
           
static long getMaxSequenceIdInList(Collection<StoreFile> sfs, boolean includeBulkLoadedFiles)
          Return the highest sequence ID found across all storefiles in the given list.
 Long getMinimumTimestamp()
           
 long getModificationTimeStamp()
           
 org.apache.hadoop.fs.Path getPath()
           
 StoreFile.Reader getReader()
           
static org.apache.hadoop.fs.Path getUniqueFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
           
 boolean isMajorCompaction()
           
 boolean isReference()
           
 void setMaxMemstoreTS(long maxMemstoreTS)
           
 String toString()
           
 String toStringDetailed()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

MAX_SEQ_ID_KEY

public static final byte[] MAX_SEQ_ID_KEY
Max Sequence ID in FileInfo


MAJOR_COMPACTION_KEY

public static final byte[] MAJOR_COMPACTION_KEY
Major compaction flag in FileInfo


EXCLUDE_FROM_MINOR_COMPACTION_KEY

public static final byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY
Minor compaction flag in FileInfo


DELETE_FAMILY_COUNT

public static final byte[] DELETE_FAMILY_COUNT
Delete Family Count in FileInfo


TIMERANGE_KEY

public static final byte[] TIMERANGE_KEY
Key for Timerange information in metadata


EARLIEST_PUT_TS

public static final byte[] EARLIEST_PUT_TS
Key for timestamp of earliest-put in metadata


DEFAULT_BLOCKSIZE_SMALL

public static final int DEFAULT_BLOCKSIZE_SMALL
See Also:
Constant Field Values

BULKLOAD_TASK_KEY

public static final byte[] BULKLOAD_TASK_KEY
Meta key set when store file is a result of a bulk load


BULKLOAD_TIME_KEY

public static final byte[] BULKLOAD_TIME_KEY

Constructor Detail

StoreFile

public StoreFile(org.apache.hadoop.fs.FileSystem fs,
                 org.apache.hadoop.fs.Path p,
                 org.apache.hadoop.conf.Configuration conf,
                 CacheConfig cacheConf,
                 BloomType cfBloomType,
                 HFileDataBlockEncoder dataBlockEncoder)
          throws IOException
Constructor; loads a reader and its indices, etc. May allocate a substantial amount of RAM depending on the underlying files (10-20MB?).

Parameters:
fs - The current file system to use.
p - The path of the file.
conf - The current configuration.
cacheConf - The cache configuration and block cache reference.
cfBloomType - The bloom type to use for this store file as specified by column family configuration. This may or may not be the same as the Bloom filter type actually present in the HFile, because column family configuration might change. If this is BloomType.NONE, the existing Bloom filter is ignored.
dataBlockEncoder - data block encoding algorithm.
Throws:
IOException - When opening the reader fails.

StoreFile

public StoreFile(org.apache.hadoop.fs.FileSystem fs,
                 StoreFileInfo fileInfo,
                 org.apache.hadoop.conf.Configuration conf,
                 CacheConfig cacheConf,
                 BloomType cfBloomType,
                 HFileDataBlockEncoder dataBlockEncoder)
          throws IOException
Constructor; loads a reader and its indices, etc. May allocate a substantial amount of RAM depending on the underlying files (10-20MB?).

Parameters:
fs - The current file system to use.
fileInfo - The store file information.
conf - The current configuration.
cacheConf - The cache configuration and block cache reference.
cfBloomType - The bloom type to use for this store file as specified by column family configuration. This may or may not be the same as the Bloom filter type actually present in the HFile, because column family configuration might change. If this is BloomType.NONE, the existing Bloom filter is ignored.
dataBlockEncoder - data block encoding algorithm.
Throws:
IOException - When opening the reader fails.

Method Detail

getMaxMemstoreTS

public long getMaxMemstoreTS()

setMaxMemstoreTS

public void setMaxMemstoreTS(long maxMemstoreTS)

getPath

public org.apache.hadoop.fs.Path getPath()
Returns:
Path or null if this StoreFile was made with a Stream.

isReference

public boolean isReference()
Returns:
True if this is a StoreFile Reference. Call only after open(); otherwise the answer may be wrong.

isMajorCompaction

public boolean isMajorCompaction()
Returns:
True if this file was made by a major compaction.

excludeFromMinorCompaction

public boolean excludeFromMinorCompaction()
Returns:
True if this file should not be part of a minor compaction.

getMaxSequenceId

public long getMaxSequenceId()
Returns:
This file's maximum edit sequence id.

getModificationTimeStamp

public long getModificationTimeStamp()

getMaxMemstoreTSInList

public static long getMaxMemstoreTSInList(Collection<StoreFile> sfs)
Return the largest memstoreTS found across all storefiles in the given list. Store files created by a MapReduce bulk load are ignored: they do not correspond to any specific put operation and therefore have no memstoreTS associated with them.

Returns:
0 if no non-bulk-load files are provided, or if this is a Store that does not yet have any store files.

getMaxSequenceIdInList

public static long getMaxSequenceIdInList(Collection<StoreFile> sfs,
                                          boolean includeBulkLoadedFiles)
Return the highest sequence ID found across all storefiles in the given list. Store files created by a MapReduce bulk load do not correspond to any edit log items, so they are ignored unless includeBulkLoadedFiles is true.

Parameters:
sfs - The store files to examine.
includeBulkLoadedFiles - Whether bulk-loaded files are also considered.
Returns:
0 if no non-bulk-load files are provided, or if this is a Store that does not yet have any store files.
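
The selection rule described for getMaxSequenceIdInList can be illustrated with a minimal, self-contained sketch. It is not the HBase implementation: FileMeta is a hypothetical stand-in for the two pieces of per-file state the rule needs (a maximum sequence id and a bulk-load marker), and the reading of includeBulkLoadedFiles as "also consider bulk-loaded files" is an assumption taken from the parameter name.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class MaxSequenceIdSketch {
    // Hypothetical stand-in for per-file metadata; not an HBase class.
    static final class FileMeta {
        final long maxSequenceId;
        final boolean bulkLoadResult;

        FileMeta(long maxSequenceId, boolean bulkLoadResult) {
            this.maxSequenceId = maxSequenceId;
            this.bulkLoadResult = bulkLoadResult;
        }
    }

    // Highest sequence id among the files that qualify, or 0 if none do.
    static long maxSequenceIdInList(Collection<FileMeta> files,
                                    boolean includeBulkLoadedFiles) {
        long max = 0;
        for (FileMeta f : files) {
            // Assumption: bulk-loaded files count only when the flag is set.
            if (includeBulkLoadedFiles || !f.bulkLoadResult) {
                max = Math.max(max, f.maxSequenceId);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        List<FileMeta> files = Arrays.asList(
            new FileMeta(17L, false),
            new FileMeta(42L, true),   // marked as a bulk-load result
            new FileMeta(23L, false));
        System.out.println(maxSequenceIdInList(files, false)); // 23
        System.out.println(maxSequenceIdInList(files, true));  // 42
        System.out.println(maxSequenceIdInList(
            Collections.<FileMeta>emptyList(), false));        // 0
    }
}
```

Starting the running maximum at 0 is what yields the documented return value when the list is empty or contains only bulk-loaded files.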

getBulkLoadTimestamp

public long getBulkLoadTimestamp()
Return the timestamp at which this bulk load file was generated.


getHDFSBlockDistribution

public HDFSBlocksDistribution getHDFSBlockDistribution()
Returns:
The cached HDFS blocks distribution. The cached value is calculated when the store file is opened.

createReader

public StoreFile.Reader createReader()
                              throws IOException
Returns:
Reader for this StoreFile, created if necessary.
Throws:
IOException

getReader

public StoreFile.Reader getReader()
Returns:
Current reader. Call createReader() first; otherwise this returns null.
See Also:
createReader()

closeReader

public void closeReader(boolean evictOnClose)
                 throws IOException
Parameters:
evictOnClose - whether to evict blocks belonging to this file
Throws:
IOException
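
Taken together, createReader, getReader and closeReader describe a lazily created, cached reader. The sketch below models only that contract with stand-in types; LazyReaderSketch and its nested Reader are hypothetical and do not touch HBase. Resetting the cached reader to null on close is an assumption, since the documentation above does not say what getReader returns after closeReader.

```java
import java.io.IOException;

class LazyReaderSketch {
    // Hypothetical stand-in for StoreFile.Reader.
    static final class Reader {
        private boolean closed;

        void close(boolean evictOnClose) {
            // A real reader would release resources and, if asked,
            // evict this file's blocks from the block cache.
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    private Reader reader; // stays null until createReader() is called

    // Creates the reader on first use and returns the cached one afterwards.
    Reader createReader() throws IOException {
        if (reader == null) {
            reader = open();
        }
        return reader;
    }

    // Never creates a reader; returns null if createReader() was not called.
    Reader getReader() {
        return reader;
    }

    // Assumption: closing also drops the cached reference.
    void closeReader(boolean evictOnClose) throws IOException {
        if (reader != null) {
            reader.close(evictOnClose);
            reader = null;
        }
    }

    // Placeholder for the expensive step of opening the file and its indices.
    private Reader open() throws IOException {
        return new Reader();
    }

    public static void main(String[] args) throws IOException {
        LazyReaderSketch file = new LazyReaderSketch();
        System.out.println(file.getReader() == null);      // true: nothing created yet
        Reader first = file.createReader();
        System.out.println(first == file.createReader());  // true: reader is reused
        file.closeReader(true);
        System.out.println(file.getReader() == null);      // true: reference dropped
    }
}
```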

deleteReader

public void deleteReader()
                  throws IOException
Delete this file

Throws:
IOException

toString

public String toString()
Overrides:
toString in class Object

toStringDetailed

public String toStringDetailed()
Returns:
A lengthy description of this StoreFile, suitable for debug output.

getUniqueFile

public static org.apache.hadoop.fs.Path getUniqueFile(org.apache.hadoop.fs.FileSystem fs,
                                                      org.apache.hadoop.fs.Path dir)
                                               throws IOException
Parameters:
fs - The file system to use.
dir - Directory to create file in.
Returns:
A random filename inside the passed directory.
Throws:
IOException
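
One way to produce a random filename inside a directory, as getUniqueFile is documented to do, is to derive the name from a random UUID. The sketch below is an illustration under assumptions, not the HBase code: java.nio.file.Path stands in for org.apache.hadoop.fs.Path, UniqueFileSketch is a hypothetical class, and both the UUID-based name and the directory check (a guess at why IOException is declared) are assumptions. It only chooses a path; it does not create the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

class UniqueFileSketch {
    // Returns dir/<32 hex characters> without creating anything on disk.
    static Path uniqueFile(Path dir) throws IOException {
        // Assumption: a non-directory argument is reported as an IOException.
        if (!Files.isDirectory(dir)) {
            throw new IOException("Expecting " + dir + " to be a directory");
        }
        // A random UUID with its hyphens removed gives a 32-character hex name.
        String name = UUID.randomUUID().toString().replace("-", "");
        return dir.resolve(name);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("storefile-sketch");
        System.out.println(uniqueFile(dir));
        Files.delete(dir); // clean up the temporary directory
    }
}
```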

getMinimumTimestamp

public Long getMinimumTimestamp()


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.