org.apache.hadoop.hbase.regionserver.compactions
Class DateTieredCompactionPolicy
java.lang.Object
org.apache.hadoop.hbase.regionserver.compactions.CompactionPolicy
org.apache.hadoop.hbase.regionserver.compactions.RatioBasedCompactionPolicy
org.apache.hadoop.hbase.regionserver.compactions.DateTieredCompactionPolicy
public class DateTieredCompactionPolicy
- extends RatioBasedCompactionPolicy
HBASE-15181: a simple implementation of date-based tiered compaction, similar to
Cassandra's, with the following benefits:
1. Improves date-range-based scans by structuring store files in a date-based tiered layout.
2. Reduces compaction overhead.
3. Improves TTL efficiency.
A perfect fit for use cases that:
1. write and scan mostly date-based data, with a focus on the most recent data.
2. never or rarely delete data. Out-of-order writes are handled gracefully. Time-range
overlap among store files is tolerated and the performance impact is minimized. Configuration
can be set in hbase-site or overridden at the per-table or per-column-family level via the hbase shell.
Design spec is at
https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/
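To illustrate the tiered layout, the sketch below computes exponentially growing time windows going back from "now": each tier's windows are a fixed multiple wider than the previous tier's, so recent data sits in many narrow windows while old data is grouped into a few wide ones. This is a simplified illustration, not the actual HBase implementation; the class name `TieredWindows` and the constants `BASE_WINDOW_MILLIS` and `WINDOWS_PER_TIER` are hypothetical stand-ins for the real configuration values.

```java
import java.util.ArrayList;
import java.util.List;

public class TieredWindows {
    // Hypothetical defaults for illustration; the real values come from HBase configuration.
    static final long BASE_WINDOW_MILLIS = 6L * 3600 * 1000; // 6 hours
    static final int WINDOWS_PER_TIER = 4;

    /**
     * Returns the lower boundaries of the time windows covering the recent past,
     * newest first. Each tier contains WINDOWS_PER_TIER windows, and every tier's
     * window is WINDOWS_PER_TIER times wider than the previous tier's.
     */
    static List<Long> windowBoundaries(long now, int maxTiers) {
        List<Long> boundaries = new ArrayList<>();
        long windowSize = BASE_WINDOW_MILLIS;
        // Align 'now' to the start of the current base window.
        long boundary = (now / windowSize) * windowSize;
        for (int tier = 0; tier < maxTiers && boundary > 0; tier++) {
            for (int i = 0; i < WINDOWS_PER_TIER && boundary > 0; i++) {
                boundaries.add(boundary);
                boundary -= windowSize;
            }
            windowSize *= WINDOWS_PER_TIER; // next tier is wider
        }
        return boundaries;
    }

    public static void main(String[] args) {
        // 2 tiers of 4 windows each: the first 4 are 6h wide, the next 4 are 24h wide.
        List<Long> b = windowBoundaries(100L * BASE_WINDOW_MILLIS, 2);
        System.out.println(b.size());
    }
}
```

Files are then assigned to windows by their timestamps, and compaction operates within a window, which is what keeps date-range scans confined to a small set of store files.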
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DateTieredCompactionPolicy
public DateTieredCompactionPolicy(org.apache.hadoop.conf.Configuration conf,
StoreConfigInformation storeConfigInfo)
throws IOException
- Throws:
IOException
isMajorCompaction
public boolean isMajorCompaction(Collection<StoreFile> filesToCompact)
throws IOException
- Overrides:
isMajorCompaction
in class RatioBasedCompactionPolicy
- Parameters:
filesToCompact
- Files to compact. Can be null.
- Returns:
- True if we should run a major compaction.
- Throws:
IOException
needsCompaction
public boolean needsCompaction(Collection<StoreFile> storeFiles,
List<StoreFile> filesCompacting)
- Overrides:
needsCompaction
in class RatioBasedCompactionPolicy
needsCompaction
public boolean needsCompaction(Collection<StoreFile> storeFiles,
List<StoreFile> filesCompacting,
long now)
applyCompactionPolicy
public ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates,
boolean mayUseOffPeak,
boolean mayBeStuck)
throws IOException
- Parameters:
candidates
- pre-filtered candidate files
- Returns:
- filtered subset
-- Default minor compaction selection algorithm:
choose CompactSelection from candidates --
First exclude bulk-load files if indicated in configuration.
Start at the oldest file and stop when you find the first file that
meets compaction criteria:
(1) a recently-flushed, small file (i.e. <= minCompactSize)
OR
(2) within the compactRatio of sum(newer_files)
Given normal skew, any newer files will also meet this criteria
Additional Note:
If fileSizes.size() >> maxFilesToCompact, we will recurse on
compact(). Consider the oldest files first to avoid a
situation where we always compact [end-threshold,end). Then, the
last file becomes an aggregate of the previous compactions.
normal skew:
older ----> newer (increasing seqID)
_
| | _
| | | | _
--|-|- |-|- |-|---_-------_------- minCompactSize
| | | | | | | | _ | |
| | | | | | | | | | | |
| | | | | | | | | | | |
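The selection rule described above can be sketched in plain Java: walk the candidates from oldest to newest and take the suffix starting at the first file that is either small or within the compaction ratio of the total size of all newer files. `selectForCompaction`, `COMPACT_RATIO`, and `MIN_COMPACT_SIZE` are hypothetical stand-ins for the real HBase types and configuration values, and file sizes stand in for StoreFile objects.

```java
import java.util.ArrayList;
import java.util.List;

public class RatioSelectionSketch {
    // Hypothetical configuration values for illustration only.
    static final double COMPACT_RATIO = 1.2;
    static final long MIN_COMPACT_SIZE = 10L * 1024 * 1024; // 10 MB

    /**
     * Given candidate file sizes ordered oldest -> newest, returns the suffix
     * selected for minor compaction: starting from the oldest file, skip files
     * until one is either small (<= MIN_COMPACT_SIZE) or within COMPACT_RATIO
     * of the combined size of all newer files. Given normal skew, every file
     * newer than the first match also qualifies, so the whole suffix is taken.
     */
    static List<Long> selectForCompaction(List<Long> sizesOldestFirst) {
        int n = sizesOldestFirst.size();
        // Precompute, for each index, the total size of all newer files.
        long[] sumNewer = new long[n];
        long running = 0;
        for (int i = n - 1; i >= 0; i--) {
            sumNewer[i] = running;
            running += sizesOldestFirst.get(i);
        }
        int start = n; // default: nothing qualifies
        for (int i = 0; i < n; i++) {
            long size = sizesOldestFirst.get(i);
            if (size <= MIN_COMPACT_SIZE || size <= COMPACT_RATIO * sumNewer[i]) {
                start = i;
                break;
            }
        }
        return new ArrayList<>(sizesOldestFirst.subList(start, n));
    }
}
```

Scanning from the oldest file first is what avoids repeatedly compacting only the newest tail, which would otherwise leave the last file as an ever-growing aggregate of previous compactions.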
- Throws:
IOException
applyCompactionPolicy
public ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates,
boolean mayUseOffPeak,
boolean mayBeStuck,
long now)
throws IOException
- Throws:
IOException
Copyright © 2007–2016 The Apache Software Foundation. All rights reserved.