org.apache.hadoop.hbase.regionserver
Class StoreFile

java.lang.Object
  extended by org.apache.hadoop.hbase.regionserver.StoreFile

@InterfaceAudience.LimitedPrivate(value="Coprocessor")
public class StoreFile
extends Object

A Store data file. Stores usually have one or more of these files. They are produced by flushing the memstore to disk. To create one, instantiate a writer using StoreFile.WriterBuilder and append data. Be sure to add any metadata before calling close on the Writer (use the appendMetadata convenience methods). After close, the StoreFile exists in the filesystem. To refer to it, create a StoreFile instance, passing the filesystem and path. To read, call createReader().

StoreFiles may also reference store files in another Store. The writer and the reader are deliberately separate instances because a store file is written once but read many times.
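
The write-once, read-many lifecycle described above can be illustrated with a small self-contained model. This is only a sketch: SketchFs, SketchStoreWriter, and SketchStoreHandle are hypothetical names, the map stands in for a real filesystem, and none of this is the HBase StoreFile.Writer or StoreFile.Reader API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a filesystem: path -> immutable file contents.
final class SketchFs {
    final Map<String, List<String>> files = new HashMap<String, List<String>>();
}

// Hypothetical writer: append data, add metadata, then close exactly once.
final class SketchStoreWriter {
    private final SketchFs fs;
    private final String path;
    private final List<String> buffer = new ArrayList<String>();
    private boolean closed;

    SketchStoreWriter(SketchFs fs, String path) {
        this.fs = fs;
        this.path = path;
    }

    void append(String cell) {
        ensureOpen();
        buffer.add("cell:" + cell);
    }

    // Metadata is accepted only before close(), as the class description requires.
    void appendMetadata(String key, String value) {
        ensureOpen();
        buffer.add("meta:" + key + "=" + value);
    }

    // On close the file becomes visible in the sketch filesystem and is never modified again.
    void close() {
        ensureOpen();
        fs.files.put(path, Collections.unmodifiableList(new ArrayList<String>(buffer)));
        closed = true;
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("writer already closed");
        }
    }
}

// Hypothetical read-side handle, built from filesystem and path after the writer has closed.
final class SketchStoreHandle {
    private final SketchFs fs;
    private final String path;

    SketchStoreHandle(SketchFs fs, String path) {
        this.fs = fs;
        this.path = path;
    }

    // Any number of reads can be served from the same immutable contents.
    List<String> read() {
        List<String> contents = fs.files.get(path);
        if (contents == null) {
            throw new IllegalStateException("no such file: " + path);
        }
        return contents;
    }
}
```

Keeping the writer and the read handle as separate types means the read path never has to guard against a half-written file: a handle can only see contents that a writer has already closed.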


Nested Class Summary
static class StoreFile.Comparators
          Useful comparators for comparing StoreFiles.
static class StoreFile.Reader
          Reader for a StoreFile.
static class StoreFile.Writer
          A StoreFile writer.
static class StoreFile.WriterBuilder
           
 
Field Summary
static byte[] BLOOM_FILTER_TYPE_KEY
          Bloom filter Type in FileInfo
static byte[] BULKLOAD_TASK_KEY
          Meta key set when store file is a result of a bulk load
static byte[] BULKLOAD_TIME_KEY
           
static byte[] DELETE_FAMILY_COUNT
          Delete Family Count in FileInfo
static byte[] EARLIEST_PUT_TS
          Key for timestamp of earliest-put in metadata
static byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY
          Minor compaction flag in FileInfo
static byte[] MAJOR_COMPACTION_KEY
          Major compaction flag in FileInfo
static byte[] MAX_SEQ_ID_KEY
          Max Sequence ID in FileInfo
static byte[] TIMERANGE_KEY
          Key for Timerange information in metadata
 
Constructor Summary
StoreFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p, org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, BloomType cfBloomType)
          Constructor, loads a reader and its indices, etc.
StoreFile(org.apache.hadoop.fs.FileSystem fs, StoreFileInfo fileInfo, org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf, BloomType cfBloomType)
          Constructor, loads a reader and its indices, etc.
StoreFile(StoreFile other)
          Clone
 
Method Summary
 void closeReader(boolean evictOnClose)
           
 StoreFile.Reader createReader()
           
 StoreFile.Reader createReader(boolean canUseDropBehind)
           
 void deleteReader()
          Delete this file
 boolean excludeFromMinorCompaction()
           
 long getBulkLoadTimestamp()
          Return the timestamp at which this bulk load file was generated.
 CacheConfig getCacheConf()
           
 StoreFileInfo getFileInfo()
           
 HDFSBlocksDistribution getHDFSBlockDistribution()
           
 Long getMaximumTimestamp()
           
 long getMaxMemstoreTS()
           
static long getMaxMemstoreTSInList(Collection<StoreFile> sfs)
          Return the largest memstoreTS found across all storefiles in the given list.
 long getMaxSequenceId()
           
static long getMaxSequenceIdInList(Collection<StoreFile> sfs)
          Return the highest sequence ID found across all storefiles in the given list.
 byte[] getMetadataValue(byte[] key)
           
 Long getMinimumTimestamp()
           
 long getModificationTimeStamp()
           
 org.apache.hadoop.fs.Path getPath()
           
 org.apache.hadoop.fs.Path getQualifiedPath()
           
 StoreFile.Reader getReader()
           
static org.apache.hadoop.fs.Path getUniqueFile(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
           
 boolean isHFile()
           
 boolean isMajorCompaction()
           
 boolean isReference()
           
 void setMaxMemstoreTS(long maxMemstoreTS)
           
 String toString()
           
 String toStringDetailed()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

MAX_SEQ_ID_KEY

public static final byte[] MAX_SEQ_ID_KEY
Max Sequence ID in FileInfo


MAJOR_COMPACTION_KEY

public static final byte[] MAJOR_COMPACTION_KEY
Major compaction flag in FileInfo


EXCLUDE_FROM_MINOR_COMPACTION_KEY

public static final byte[] EXCLUDE_FROM_MINOR_COMPACTION_KEY
Minor compaction flag in FileInfo


BLOOM_FILTER_TYPE_KEY

public static final byte[] BLOOM_FILTER_TYPE_KEY
Bloom filter Type in FileInfo


DELETE_FAMILY_COUNT

public static final byte[] DELETE_FAMILY_COUNT
Delete Family Count in FileInfo


TIMERANGE_KEY

public static final byte[] TIMERANGE_KEY
Key for Timerange information in metadata


EARLIEST_PUT_TS

public static final byte[] EARLIEST_PUT_TS
Key for timestamp of earliest-put in metadata


BULKLOAD_TASK_KEY

public static final byte[] BULKLOAD_TASK_KEY
Meta key set when store file is a result of a bulk load


BULKLOAD_TIME_KEY

public static final byte[] BULKLOAD_TIME_KEY

Constructor Detail

StoreFile

public StoreFile(org.apache.hadoop.fs.FileSystem fs,
                 org.apache.hadoop.fs.Path p,
                 org.apache.hadoop.conf.Configuration conf,
                 CacheConfig cacheConf,
                 BloomType cfBloomType)
          throws IOException
Constructor, loads a reader and its indices, etc. May allocate a substantial amount of RAM depending on the underlying files (10-20MB?).

Parameters:
fs - The current file system to use.
p - The path of the file.
conf - The current configuration.
cacheConf - The cache configuration and block cache reference.
cfBloomType - The bloom type to use for this store file as specified by column family configuration. This may or may not be the same as the Bloom filter type actually present in the HFile, because column family configuration might change. If this is BloomType.NONE, the existing Bloom filter is ignored.
Throws:
IOException - When opening the reader fails.
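
The cfBloomType rule above can be written down as a small decision function. This is a sketch, not the HBase implementation: SketchBloomType and SketchBloomResolver are hypothetical names, and the behavior when the configured type and the file's type differ (use whatever is in the file) is an assumption, since the parameter description only specifies the BloomType.NONE case.

```java
// Hypothetical mirror of the Bloom filter kinds; the real enum is HBase's BloomType.
enum SketchBloomType { NONE, ROW, ROWCOL }

final class SketchBloomResolver {
    private SketchBloomResolver() {}

    /**
     * Decides which Bloom filter, if any, a reader would consult.
     * configured: type from the column family configuration (cfBloomType).
     * inFile: type actually recorded in the file.
     */
    static SketchBloomType effective(SketchBloomType configured, SketchBloomType inFile) {
        if (configured == SketchBloomType.NONE) {
            // Documented rule: a NONE configuration means the existing filter is ignored.
            return SketchBloomType.NONE;
        }
        // Assumption: otherwise the filter already stored in the file is the one used,
        // even when it differs from the configuration, because the file is immutable.
        return inFile;
    }
}
```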

StoreFile

public StoreFile(org.apache.hadoop.fs.FileSystem fs,
                 StoreFileInfo fileInfo,
                 org.apache.hadoop.conf.Configuration conf,
                 CacheConfig cacheConf,
                 BloomType cfBloomType)
          throws IOException
Constructor, loads a reader and its indices, etc. May allocate a substantial amount of RAM depending on the underlying files (10-20MB?).

Parameters:
fs - The current file system to use.
fileInfo - The store file information.
conf - The current configuration.
cacheConf - The cache configuration and block cache reference.
cfBloomType - The bloom type to use for this store file as specified by column family configuration. This may or may not be the same as the Bloom filter type actually present in the HFile, because column family configuration might change. If this is BloomType.NONE, the existing Bloom filter is ignored.
Throws:
IOException - When opening the reader fails.

StoreFile

public StoreFile(StoreFile other)
Clone

Parameters:
other - The StoreFile to clone from

Method Detail

getMaxMemstoreTS

public long getMaxMemstoreTS()

setMaxMemstoreTS

public void setMaxMemstoreTS(long maxMemstoreTS)

getFileInfo

public StoreFileInfo getFileInfo()
Returns:
the StoreFileInfo object associated with this StoreFile. null if the StoreFile is not a reference.

getPath

public org.apache.hadoop.fs.Path getPath()
Returns:
Path or null if this StoreFile was made with a Stream.

getQualifiedPath

public org.apache.hadoop.fs.Path getQualifiedPath()
Returns:
The qualified path of this StoreFile.

isReference

public boolean isReference()
Returns:
True if this is a StoreFile Reference. Call only after open(boolean canUseDropBehind); otherwise the answer may be wrong.

isHFile

public boolean isHFile()
Returns:
True if this is HFile.

isMajorCompaction

public boolean isMajorCompaction()
Returns:
True if this file was made by a major compaction.

excludeFromMinorCompaction

public boolean excludeFromMinorCompaction()
Returns:
True if this file should not be part of a minor compaction.

getMaxSequenceId

public long getMaxSequenceId()
Returns:
This file's maximum edit sequence ID.

getModificationTimeStamp

public long getModificationTimeStamp()

getMetadataValue

public byte[] getMetadataValue(byte[] key)
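
Because metadata keys and values are raw byte arrays, a lookup by key needs content-based comparison rather than Java's default array identity. The following sketch shows one way such a lookup could work. SketchFileInfo is a hypothetical class; the key bytes, the 8-byte big-endian long encoding, and returning null for a missing key are all assumptions made for illustration, not documented behavior of getMetadataValue.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical metadata table keyed by byte[]; not the HBase FileInfo class.
final class SketchFileInfo {
    // byte[] uses identity-based equals/hashCode, so compare contents explicitly
    // (unsigned, lexicographic order).
    private static final Comparator<byte[]> LEXICOGRAPHIC = new Comparator<byte[]>() {
        @Override
        public int compare(byte[] a, byte[] b) {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int diff = (a[i] & 0xff) - (b[i] & 0xff);
                if (diff != 0) {
                    return diff;
                }
            }
            return a.length - b.length;
        }
    };

    private final TreeMap<byte[], byte[]> entries = new TreeMap<byte[], byte[]>(LEXICOGRAPHIC);

    void put(byte[] key, byte[] value) {
        entries.put(key, value);
    }

    // Assumption: a missing key yields null.
    byte[] getMetadataValue(byte[] key) {
        return entries.get(key);
    }

    static byte[] key(String name) {
        return name.getBytes(StandardCharsets.UTF_8);
    }

    // Assumption: longs are stored as 8 big-endian bytes.
    static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    static long decodeLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }
}
```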

getMaxMemstoreTSInList

public static long getMaxMemstoreTSInList(Collection<StoreFile> sfs)
Return the largest memstoreTS found across all storefiles in the given list. Store files that were created by a mapreduce bulk load are ignored, as they do not correspond to any specific put operation, and thus do not have a memstoreTS associated with them.

Returns:
0 if no non-bulk-load files are provided, or if this is a Store that does not yet have any store files.
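
The rule described for getMaxMemstoreTSInList (take the maximum, skip bulk-load files, fall back to 0) can be sketched as follows. SketchFileSummary and SketchMemstoreTs are hypothetical stand-ins that carry only the two facts the rule needs; they are not HBase classes, and the real method operates on StoreFile instances.

```java
import java.util.Collection;

// Hypothetical summary of one store file.
final class SketchFileSummary {
    final long maxMemstoreTS;
    final boolean bulkLoadResult;

    SketchFileSummary(long maxMemstoreTS, boolean bulkLoadResult) {
        this.maxMemstoreTS = maxMemstoreTS;
        this.bulkLoadResult = bulkLoadResult;
    }
}

final class SketchMemstoreTs {
    private SketchMemstoreTs() {}

    // Largest memstoreTS among non-bulk-load files; 0 when there are none.
    static long maxMemstoreTSInList(Collection<SketchFileSummary> files) {
        long max = 0;
        for (SketchFileSummary f : files) {
            if (!f.bulkLoadResult) {
                // Bulk-loaded files have no associated put, so they are skipped.
                max = Math.max(max, f.maxMemstoreTS);
            }
        }
        return max;
    }
}
```

Starting the accumulator at 0 covers both documented edge cases (an empty collection and a collection of only bulk-load files) without a separate branch.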

getMaxSequenceIdInList

public static long getMaxSequenceIdInList(Collection<StoreFile> sfs)
Return the highest sequence ID found across all storefiles in the given list.

Parameters:
sfs - The store files to examine.
Returns:
0 if no non-bulk-load files are provided, or if this is a Store that does not yet have any store files.

getCacheConf

public CacheConfig getCacheConf()

getBulkLoadTimestamp

public long getBulkLoadTimestamp()
Return the timestamp at which this bulk load file was generated.


getHDFSBlockDistribution

public HDFSBlocksDistribution getHDFSBlockDistribution()
Returns:
the cached value of HDFS blocks distribution. The cached value is calculated when store file is opened.

createReader

public StoreFile.Reader createReader()
                              throws IOException
Throws:
IOException

createReader

public StoreFile.Reader createReader(boolean canUseDropBehind)
                              throws IOException
Returns:
Reader for this StoreFile, created if necessary.
Throws:
IOException

getReader

public StoreFile.Reader getReader()
Returns:
The current reader. Call createReader first; otherwise this returns null.
See Also:
createReader()
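
The relationship between createReader, getReader, and closeReader amounts to a lazily created, cached reader. The sketch below models that contract with hypothetical classes (SketchReader, SketchLazyFile). It assumes that closing clears the cached reader, and it ignores thread safety, I/O errors, and block eviction, all of which the real class has to handle.

```java
// Hypothetical reader type standing in for StoreFile.Reader.
final class SketchReader {
    final String path;

    SketchReader(String path) {
        this.path = path;
    }
}

// Hypothetical file handle that opens its reader on demand.
final class SketchLazyFile {
    private final String path;
    private SketchReader reader; // null until createReader() is called
    private int opens;           // counts how many times a reader was actually opened

    SketchLazyFile(String path) {
        this.path = path;
    }

    // Returns the cached reader, opening one only if none exists yet.
    SketchReader createReader() {
        if (reader == null) {
            reader = new SketchReader(path);
            opens++;
        }
        return reader;
    }

    // Never opens anything: returns null if createReader() has not been called.
    SketchReader getReader() {
        return reader;
    }

    // Assumption: closing drops the cached reader, so getReader() returns null again.
    void closeReader() {
        reader = null;
    }

    int openCount() {
        return opens;
    }
}
```

Callers that cannot be sure a reader already exists should therefore go through createReader() and reserve getReader() for code paths that run after the file has been opened.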

closeReader

public void closeReader(boolean evictOnClose)
                 throws IOException
Parameters:
evictOnClose - whether to evict blocks belonging to this file
Throws:
IOException

deleteReader

public void deleteReader()
                  throws IOException
Delete this file

Throws:
IOException

toString

public String toString()
Overrides:
toString in class Object

toStringDetailed

public String toStringDetailed()
Returns:
a lengthy description of this StoreFile, suitable for debug output

getUniqueFile

public static org.apache.hadoop.fs.Path getUniqueFile(org.apache.hadoop.fs.FileSystem fs,
                                                      org.apache.hadoop.fs.Path dir)
                                               throws IOException
Parameters:
fs - The file system to use.
dir - Directory to create file in.
Returns:
random filename inside passed dir
Throws:
IOException
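
One common way to produce a "random filename inside passed dir" is to derive the name from a random UUID. The sketch below shows that technique using java.nio.file.Path rather than Hadoop's Path. SketchUniqueFile is a hypothetical helper, the UUID-based naming is an assumption about how such a name could be generated, and unlike getUniqueFile the sketch never touches a filesystem and so cannot throw IOException.

```java
import java.nio.file.Path;
import java.util.UUID;

final class SketchUniqueFile {
    private SketchUniqueFile() {}

    // Builds a path under dir whose file name is a random 32-character hex string.
    static Path uniqueFile(Path dir) {
        // A random UUID prints as 36 characters; removing the 4 dashes leaves 32 hex digits.
        String name = UUID.randomUUID().toString().replace("-", "");
        return dir.resolve(name);
    }
}
```

The name is unique only with overwhelming probability; code that needs a hard guarantee would still have to check the target directory before creating the file.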

getMinimumTimestamp

public Long getMinimumTimestamp()

getMaximumTimestamp

public Long getMaximumTimestamp()


Copyright © 2007–2016 The Apache Software Foundation. All rights reserved.