org.apache.hadoop.hbase.regionserver.compactions
Class DateTieredCompactionPolicy

java.lang.Object
  extended by org.apache.hadoop.hbase.regionserver.compactions.CompactionPolicy
      extended by org.apache.hadoop.hbase.regionserver.compactions.RatioBasedCompactionPolicy
          extended by org.apache.hadoop.hbase.regionserver.compactions.DateTieredCompactionPolicy

public class DateTieredCompactionPolicy
extends RatioBasedCompactionPolicy

HBASE-15181: A simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits: 1. Improves date-range-based scans by structuring store files in a date-based tiered layout. 2. Reduces compaction overhead. 3. Improves TTL efficiency. It is a perfect fit for use cases that: 1. write and scan data mostly by date, with a focus on the most recent data, and 2. never or rarely delete data. Out-of-order writes are handled gracefully. Time-range overlap among store files is tolerated and its performance impact is minimized. Configuration can be set in hbase-site or overridden at the per-table or per-column-family level via the HBase shell. The design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/
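The core idea of the tiered layout is that recent data lives in many small time windows while older data is promoted into exponentially larger ones. The sketch below is an illustration of that windowing concept only, assuming an exponential scheme with a base window size and a fixed number of windows per tier; the class name, method, and parameters are hypothetical and are not HBase's internal API.

```java
// Hypothetical illustration of date-based tiering: maps a cell timestamp to a
// tier index (0 = newest), given "now". Tier t uses windows of size
// baseWindowMillis * windowsPerTier^t, with windowsPerTier windows per tier.
public final class TierWindowSketch {
    private final long baseWindowMillis; // size of the newest (tier-0) windows
    private final int windowsPerTier;    // windows per tier before promotion

    public TierWindowSketch(long baseWindowMillis, int windowsPerTier) {
        this.baseWindowMillis = baseWindowMillis;
        this.windowsPerTier = windowsPerTier;
    }

    /** Returns the tier index (0 = newest) that a timestamp falls into. */
    public int tierOf(long timestamp, long now) {
        long age = Math.max(0, now - timestamp);
        long windowSize = baseWindowMillis;
        long boundary = windowsPerTier * windowSize; // end of tier 0 (by age)
        int tier = 0;
        while (age >= boundary) {
            windowSize *= windowsPerTier;            // windows grow per tier
            boundary += windowsPerTier * windowSize; // extend by one full tier
            tier++;
        }
        return tier;
    }
}
```

With a 1-second base window and 4 windows per tier, data younger than 4 seconds sits in tier 0, data between 4 and 20 seconds old in tier 1, and so on; this is why recent, hot data stays in small windows while old data compacts into a few large files.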


Field Summary
 
Fields inherited from class org.apache.hadoop.hbase.regionserver.compactions.CompactionPolicy
comConf, storeConfigInfo
 
Constructor Summary
DateTieredCompactionPolicy(org.apache.hadoop.conf.Configuration conf, StoreConfigInformation storeConfigInfo)
           
 
Method Summary
 ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates, boolean mayUseOffPeak, boolean mayBeStuck)
           
 ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates, boolean mayUseOffPeak, boolean mayBeStuck, long now)
           
 boolean isMajorCompaction(Collection<StoreFile> filesToCompact)
           
 boolean needsCompaction(Collection<StoreFile> storeFiles, List<StoreFile> filesCompacting)
           
 boolean needsCompaction(Collection<StoreFile> storeFiles, List<StoreFile> filesCompacting, long now)
           
 
Methods inherited from class org.apache.hadoop.hbase.regionserver.compactions.RatioBasedCompactionPolicy
checkMinFilesCriteria, filterBulk, getNextMajorCompactTime, preSelectCompactionForCoprocessor, selectCompaction, setMinThreshold, skipLargeFiles, throttleCompaction
 
Methods inherited from class org.apache.hadoop.hbase.regionserver.compactions.CompactionPolicy
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DateTieredCompactionPolicy

public DateTieredCompactionPolicy(org.apache.hadoop.conf.Configuration conf,
                                  StoreConfigInformation storeConfigInfo)
                           throws IOException
Throws:
IOException
Method Detail

isMajorCompaction

public boolean isMajorCompaction(Collection<StoreFile> filesToCompact)
                          throws IOException
Overrides:
isMajorCompaction in class RatioBasedCompactionPolicy
Parameters:
filesToCompact - Files to compact. Can be null.
Returns:
True if we should run a major compaction.
Throws:
IOException

needsCompaction

public boolean needsCompaction(Collection<StoreFile> storeFiles,
                               List<StoreFile> filesCompacting)
Overrides:
needsCompaction in class RatioBasedCompactionPolicy

needsCompaction

public boolean needsCompaction(Collection<StoreFile> storeFiles,
                               List<StoreFile> filesCompacting,
                               long now)

applyCompactionPolicy

public ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates,
                                                  boolean mayUseOffPeak,
                                                  boolean mayBeStuck)
                                           throws IOException
Parameters:
candidates - pre-filtrate
Returns:
filtered subset. Default minor compaction selection algorithm: choose a CompactSelection from the candidates. First, exclude bulk-load files if so configured. Then start at the oldest file and stop at the first file that meets the compaction criteria: (1) a recently flushed, small file (i.e. <= minCompactSize), or (2) a file within compactRatio of sum(newer_files). Given normal skew, any newer files will also meet these criteria.

Additional Note: If fileSizes.size() >> maxFilesToCompact, we will recurse on compact(). Consider the oldest files first to avoid a situation where we always compact [end-threshold, end); otherwise the last file would become an aggregate of all the previous compactions.

 normal skew:

        older ----> newer (increasing seqID)
    _
   | |   _
   | |  | |   _
 --|-|- |-|- |-|---_-------_-------  minCompactSize
   | |  | |  | |  | |  _  | |
   | |  | |  | |  | | | | | |
   | |  | |  | |  | | | | | |

Throws:
IOException
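The selection rule described above can be modeled in a few lines: walk the files from oldest to newest and start selecting at the first file that is either small (<= minCompactSize) or within compactRatio of the total size of all newer files. This is a simplified sketch of that rule on plain file sizes, not HBase's StoreFile-based implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the default minor-compaction selection rule: files are
// given as sizes ordered oldest -> newest, and the selected suffix is returned.
public final class RatioSelectionSketch {
    public static List<Long> select(List<Long> fileSizes, long minCompactSize,
                                    double compactRatio) {
        // Precompute, for each file, the total size of all strictly newer files.
        long[] sumNewer = new long[fileSizes.size()];
        long running = 0;
        for (int i = fileSizes.size() - 1; i >= 0; i--) {
            sumNewer[i] = running;
            running += fileSizes.get(i);
        }
        int start = 0;
        while (start < fileSizes.size()) {
            long size = fileSizes.get(start);
            // Criteria: recently flushed small file, OR within compactRatio of
            // the sum of newer files. Given normal skew, once one file
            // qualifies, every newer file qualifies too, so we can stop here.
            if (size <= minCompactSize || size <= compactRatio * sumNewer[start]) {
                break;
            }
            start++; // too large relative to newer files; leave it alone
        }
        return new ArrayList<>(fileSizes.subList(start, fileSizes.size()));
    }
}
```

For example, with sizes [500, 50, 23, 12, 12] (oldest first), ratio 1.2 and minCompactSize 10, the 500 file is skipped (500 > 1.2 * 97) and selection starts at 50, since 50 <= 1.2 * 47.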

applyCompactionPolicy

public ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates,
                                                  boolean mayUseOffPeak,
                                                  boolean mayBeStuck,
                                                  long now)
                                           throws IOException
Throws:
IOException


Copyright © 2007–2016 The Apache Software Foundation. All rights reserved.