java.lang.Object
  org.apache.hadoop.hbase.regionserver.HRegion
All Implemented Interfaces:
HeapSize

public class HRegion
extends Object
implements HeapSize
HRegion stores data for a certain region of a table. It stores all columns for each row. A given table consists of one or more HRegions.
We maintain multiple HStores for a single HRegion.
A Store is a set of rows with some column data; together, they make up all the data for the rows.
Each HRegion has a 'startKey' and 'endKey'.
The first is inclusive, the second is exclusive (except for the final region). The endKey of region 0 is the same as the startKey for region 1 (if it exists). The startKey for the first region is null. The endKey for the final region is null.
Locking at the HRegion level serves only one purpose: preventing the region from being closed (and consequently split) while other operations are ongoing. Each row level operation obtains both a row lock and a region read lock for the duration of the operation. While a scanner is being constructed, getScanner holds a read lock. If the scanner is successfully constructed, it holds a read lock until it is closed. A close takes out a write lock and consequently will block for ongoing operations and will block new operations from starting while the close is in progress.
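For illustration, a minimal sketch of the scanner lifecycle this implies (not part of the original Javadoc; the already-opened region instance and the "f1" family name are assumptions):

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.util.Bytes;

class ScanExample {
  // Scans every row of an already-opened region; the scanner holds a region
  // read lock until it is closed, so close it in finally.
  static void scanAll(HRegion region) throws java.io.IOException {
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("f1")); // "f1" is a hypothetical family
    InternalScanner scanner = region.getScanner(scan);
    try {
      List<KeyValue> kvs = new ArrayList<KeyValue>();
      boolean moreRows;
      do {
        kvs.clear();
        moreRows = scanner.next(kvs); // fills kvs with one row's cells
        // ... process kvs ...
      } while (moreRows);
    } finally {
      scanner.close(); // releases the read lock held by the scanner
    }
  }
}
```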
An HRegion is defined by its table and its key extent.
It consists of at least one Store. The number of Stores should be configurable, so that data which is accessed together is stored in the same Store. Right now, we approximate that by building a single Store for each column family. (This config info will be communicated via the tabledesc.)
The HTableDescriptor contains metainfo about the HRegion's table. regionName is a unique identifier for this HRegion. [startKey, endKey) defines the keyspace for this HRegion.
Field Summary

| Modifier and Type | Field and Description |
|---|---|
| static long | DEEP_OVERHEAD |
| static long | FIXED_OVERHEAD |
| static org.apache.commons.logging.Log | LOG |
| static String | REGIONINFO_FILE - Name of the region info file that resides just under the region directory. |
| protected Map<byte[],Store> | stores |
Constructor Summary

| Constructor and Description |
|---|
| HRegion() - Should only be used for testing purposes. |
| HRegion(org.apache.hadoop.fs.Path tableDir, HLog log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, HRegionInfo regionInfo, FlushRequester flushRequester) - HRegion constructor. |
Method Summary

| Modifier and Type | Method and Description |
|---|---|
| static void | addRegionToMETA(HRegion meta, HRegion r) - Inserts a new region's meta information into the passed meta region. |
| void | bulkLoadHFile(String hfilePath, byte[] familyName) |
| boolean | checkAndMutate(byte[] row, byte[] family, byte[] qualifier, byte[] expectedValue, org.apache.hadoop.io.Writable w, Integer lockId, boolean writeToWAL) |
| protected void | checkReadOnly() |
| List<StoreFile> | close() - Close down this HRegion. |
| List<StoreFile> | close(boolean abort) - Close down this HRegion. |
| byte[] | compactStores() - Called by compaction thread and after region is opened to compact the HStores if necessary. |
| static HRegion | createHRegion(HRegionInfo info, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.conf.Configuration conf) - Convenience method creating new HRegions. |
| void | delete(Delete delete, Integer lockid, boolean writeToWAL) |
| void | delete(Map<byte[],List<KeyValue>> familyMap, boolean writeToWAL) |
| static void | deleteRegion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, HRegionInfo info) - Deletes all the files for a HRegion. |
| boolean | equals(Object o) |
| boolean | flushcache() - Flush the cache. |
| Result | get(Get get, Integer lockid) |
| Result | getClosestRowBefore(byte[] row, byte[] family) - Return all the data for the row that matches row exactly, or the one that immediately precedes it. |
| int | getCompactPriority() |
| protected long | getCompleteCacheFlushSequenceId(long currentSequenceId) - Get the sequence number to be associated with this cache flush. |
| org.apache.hadoop.conf.Configuration | getConf() |
| byte[] | getEndKey() |
| org.apache.hadoop.fs.FileSystem | getFilesystem() |
| long | getLargestHStoreSize() |
| Pair<Long,Long> | getLastCompactInfo() |
| long | getLastFlushTime() |
| HLog | getLog() |
| List<Pair<Long,Long>> | getRecentFlushInfo() |
| org.apache.hadoop.fs.Path | getRegionDir() |
| static org.apache.hadoop.fs.Path | getRegionDir(org.apache.hadoop.fs.Path rootdir, HRegionInfo info) - Computes the Path of the HRegion. |
| static org.apache.hadoop.fs.Path | getRegionDir(org.apache.hadoop.fs.Path tabledir, String name) - Computes the Path of the HRegion. |
| long | getRegionId() |
| HRegionInfo | getRegionInfo() |
| byte[] | getRegionName() |
| String | getRegionNameAsString() |
| ReadWriteConsistencyControl | getRWCC() |
| InternalScanner | getScanner(Scan scan) - Return an iterator that scans over the HRegion, returning the indicated columns and rows specified by the Scan. |
| protected InternalScanner | getScanner(Scan scan, List<KeyValueScanner> additionalScanners) |
| byte[] | getStartKey() |
| Store | getStore(byte[] column) - Return HStore instance. |
| HTableDescriptor | getTableDesc() |
| org.apache.hadoop.fs.Path | getTableDir() |
| int | hashCode() |
| boolean | hasReferences() |
| boolean | hasTooManyStoreFiles() - Checks every store to see if one has too many store files. |
| long | heapSize() |
| Result | increment(Increment increment, Integer lockid, boolean writeToWAL) - Perform one or more increment operations on a row. |
| long | incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount, boolean writeToWAL) |
| long | initialize() - Initialize this region. |
| long | initialize(CancelableProgressable reporter) - Initialize this region. |
| protected Store | instantiateHStore(org.apache.hadoop.fs.Path tableDir, HColumnDescriptor c) |
| protected InternalScanner | instantiateInternalScanner(Scan scan, List<KeyValueScanner> additionalScanners) |
| protected boolean | internalFlushcache() - Flush the memstore. |
| protected boolean | internalFlushcache(HLog wal, long myseqid) |
| boolean | isClosed() |
| boolean | isClosing() |
| static void | main(String[] args) - Facility for dumping and compacting catalog tables. |
| static void | makeColumnFamilyDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path tabledir, HRegionInfo hri, byte[] colFamily) - Make the directories for a specific column family. |
| static HRegion | merge(HRegion a, HRegion b) - Merge two regions whether they are adjacent or not. |
| static HRegion | mergeAdjacent(HRegion srcA, HRegion srcB) - Merge two HRegions. |
| static HRegion | newHRegion(org.apache.hadoop.fs.Path tableDir, HLog log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, HRegionInfo regionInfo, FlushRequester flushListener) - A utility method to create new instances of HRegion based on the HConstants.REGION_IMPL configuration property. |
| Integer | obtainRowLock(byte[] row) - Obtain a lock on the given row. |
| protected HRegion | openHRegion(CancelableProgressable reporter) - Open HRegion. |
| static HRegion | openHRegion(HRegionInfo info, HLog wal, org.apache.hadoop.conf.Configuration conf) - Open a Region. |
| static HRegion | openHRegion(HRegionInfo info, HLog wal, org.apache.hadoop.conf.Configuration conf, FlushRequester flusher, CancelableProgressable reporter) - Open a Region. |
| protected void | prepareToSplit() - Give the region a chance to prepare before it is split. |
| HConstants.OperationStatusCode[] | put(Pair<Put,Integer>[] putsAndLocks) - Perform a batch of puts. |
| void | put(Put put) |
| HConstants.OperationStatusCode[] | put(Put[] puts) - Perform a batch put with no pre-specified locks. |
| void | put(Put put, boolean writeToWAL) |
| void | put(Put put, Integer lockid) |
| void | put(Put put, Integer lockid, boolean writeToWAL) |
| protected long | replayRecoveredEditsIfAny(org.apache.hadoop.fs.Path regiondir, long minSeqId, CancelableProgressable reporter) - Read the edits log put under this region by the wal log splitting process. |
| protected boolean | restoreEdit(Store s, KeyValue kv) - Used by tests. |
| static boolean | rowIsInRange(HRegionInfo info, byte[] row) - Determines if the specified row is within the row range specified by the specified HRegionInfo. |
| String | toString() |
| Integer | tryObtainRowLock(byte[] row) - Tries to obtain a row lock on the given row, but does not block if the row lock is not available. |
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Field Detail
public static final org.apache.commons.logging.Log LOG
protected final Map<byte[],Store> stores
public static final String REGIONINFO_FILE
public static final long FIXED_OVERHEAD
public static final long DEEP_OVERHEAD
Constructor Detail
public HRegion()
public HRegion(org.apache.hadoop.fs.Path tableDir, HLog log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, HRegionInfo regionInfo, FlushRequester flushRequester)
HRegion constructor. Instances should normally be created via the newHRegion(Path, HLog, FileSystem, Configuration, org.apache.hadoop.hbase.HRegionInfo, FlushRequester) method.
Parameters:
  tableDir - qualified path of directory where region should be located, usually the table directory.
  log - The HLog is the outbound log for any updates to the HRegion (there's a single HLog for all the HRegions on a single HRegionServer). The log file is a logfile from the previous execution that's custom-computed for this HRegion. The HRegionServer computes and sorts the appropriate log info for this HRegion. If there is a previous log file (implying that the HRegion has been written-to before), then read it from the supplied path.
  fs - is the filesystem.
  conf - is global configuration settings.
  regionInfo - HRegionInfo that describes the region.
  flushRequester - an object that implements FlushRequester, or null.
See Also:
  newHRegion(Path, HLog, FileSystem, Configuration, org.apache.hadoop.hbase.HRegionInfo, FlushRequester)
Method Detail
public long initialize() throws IOException
Initialize this region.
Throws:
  IOException

public long initialize(CancelableProgressable reporter) throws IOException
Initialize this region.
Parameters:
  reporter - Tickle every so often if initialize is taking a while.
Throws:
  IOException

public boolean hasReferences()
public HRegionInfo getRegionInfo()
public boolean isClosed()
public boolean isClosing()
public ReadWriteConsistencyControl getRWCC()
public List<StoreFile> close() throws IOException
Close down this HRegion. This method could take some time to execute, so don't call it from a time-sensitive thread.
Throws:
  IOException

public List<StoreFile> close(boolean abort) throws IOException
Close down this HRegion.
Parameters:
  abort - true if server is aborting (only during testing)
Throws:
  IOException

public byte[] getStartKey()
public byte[] getEndKey()
public long getRegionId()
public byte[] getRegionName()
public String getRegionNameAsString()
public HTableDescriptor getTableDesc()
public HLog getLog()
public org.apache.hadoop.conf.Configuration getConf()
public org.apache.hadoop.fs.Path getRegionDir()
public static org.apache.hadoop.fs.Path getRegionDir(org.apache.hadoop.fs.Path tabledir, String name)
Computes the Path of the HRegion.
Parameters:
  tabledir - qualified path for table
  name - ENCODED region name
public org.apache.hadoop.fs.FileSystem getFilesystem()
public Pair<Long,Long> getLastCompactInfo()
public long getLastFlushTime()
public List<Pair<Long,Long>> getRecentFlushInfo()
public long getLargestHStoreSize()
public byte[] compactStores() throws IOException
Called by compaction thread and after region is opened to compact the HStores if necessary. This operation could block for a long time, so don't call it from a time-sensitive thread. Note that no locking is necessary at this level because compaction only conflicts with a region split, and that cannot happen because the region server does them sequentially and not in parallel.
Throws:
  IOException

public boolean flushcache() throws IOException
Flush the cache. This method may block for some time, so it should not be called from a time-sensitive thread.
Throws:
  IOException - general io exceptions
  DroppedSnapshotException - Thrown when replay of hlog is required because a Snapshot was not properly persisted.

protected boolean internalFlushcache() throws IOException
Flush the memstore. So, we have a three-step process:
  A. Flush the memstore to the on-disk stores, noting the current sequence ID for the log.
  B. Write a FLUSHCACHE-COMPLETE message to the log, using the sequence ID that was current at the time of the memstore flush.
  C. Get rid of the memstore structures that are now redundant, as they've been flushed to the on-disk stores.
This method is protected, but can be accessed via several public routes. This method may block for some time.
Throws:
  IOException - general io exceptions
  DroppedSnapshotException - Thrown when replay of hlog is required because a Snapshot was not properly persisted.

protected boolean internalFlushcache(HLog wal, long myseqid) throws IOException
Parameters:
  wal - Null if we're NOT to go via hlog/wal.
  myseqid - The seqid to use if wal is null when writing out the flush file.
Throws:
  IOException
See Also:
  internalFlushcache()

protected long getCompleteCacheFlushSequenceId(long currentSequenceId)
Get the sequence number to be associated with this cache flush.
Parameters:
  currentSequenceId
public Result getClosestRowBefore(byte[] row, byte[] family) throws IOException
Return all the data for the row that matches row exactly, or the one that immediately precedes it.
Parameters:
  row - row key
  family - column family to find on
Throws:
  IOException - read exceptions

public InternalScanner getScanner(Scan scan) throws IOException
Return an iterator that scans over the HRegion, returning the indicated columns and rows specified by the Scan. This Iterator must be closed by the caller.
Parameters:
  scan - configured Scan
Throws:
  IOException - read exceptions

protected InternalScanner getScanner(Scan scan, List<KeyValueScanner> additionalScanners) throws IOException
Throws:
  IOException

protected InternalScanner instantiateInternalScanner(Scan scan, List<KeyValueScanner> additionalScanners) throws IOException
Throws:
  IOException
public void delete(Delete delete, Integer lockid, boolean writeToWAL) throws IOException
Parameters:
  delete - delete object
  lockid - existing lock id, or null to grab a lock
  writeToWAL - append to the write ahead log or not
Throws:
  IOException - read exceptions

public void delete(Map<byte[],List<KeyValue>> familyMap, boolean writeToWAL) throws IOException
Parameters:
  familyMap - map of family to edits for the given family.
  writeToWAL
Throws:
  IOException
public void put(Put put) throws IOException
Parameters:
  put
Throws:
  IOException

public void put(Put put, boolean writeToWAL) throws IOException
Parameters:
  put
  writeToWAL
Throws:
  IOException

public void put(Put put, Integer lockid) throws IOException
Parameters:
  put
  lockid
Throws:
  IOException

public void put(Put put, Integer lockid, boolean writeToWAL) throws IOException
Parameters:
  put
  lockid
  writeToWAL
Throws:
  IOException

public HConstants.OperationStatusCode[] put(Put[] puts) throws IOException
Perform a batch put with no pre-specified locks.
Throws:
  IOException
See Also:
  put(Pair[])

public HConstants.OperationStatusCode[] put(Pair<Put,Integer>[] putsAndLocks) throws IOException
Perform a batch of puts.
Parameters:
  putsAndLocks - the list of puts paired with their requested lock IDs.
Throws:
  IOException
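As an illustration of the put variants above, a hedged sketch of a single put (the region instance and all row/family/qualifier names are assumptions, not part of this Javadoc):

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

class PutExample {
  // Writes one cell; with writeToWAL=true the edit is appended to the
  // region's HLog, so it survives a server crash.
  static void writeCell(HRegion region) throws IOException {
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("f1"), Bytes.toBytes("q1"), Bytes.toBytes("v1"));
    region.put(put, true); // put(Put, boolean writeToWAL)
  }
}
```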
public boolean checkAndMutate(byte[] row, byte[] family, byte[] qualifier, byte[] expectedValue, org.apache.hadoop.io.Writable w, Integer lockId, boolean writeToWAL) throws IOException
Parameters:
  row
  family
  qualifier
  expectedValue
  lockId
  writeToWAL
Throws:
  IOException
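A hedged sketch of a compare-and-set via checkAndMutate; in this era of the API a Put (which implements Writable) can be passed as w. All row/family/value names are assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

class CheckAndPutExample {
  // Atomically applies the Put only if f1:q1 currently equals "old-value";
  // returns true when the mutation was applied.
  static boolean swap(HRegion region) throws IOException {
    byte[] row = Bytes.toBytes("row1");
    byte[] fam = Bytes.toBytes("f1");
    byte[] qual = Bytes.toBytes("q1");
    Put update = new Put(row);
    update.add(fam, qual, Bytes.toBytes("new-value"));
    return region.checkAndMutate(row, fam, qual, Bytes.toBytes("old-value"),
        update, null /* lockId */, true /* writeToWAL */);
  }
}
```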
protected void checkReadOnly() throws IOException
Throws:
  IOException - Throws exception if region is in read-only mode.

protected long replayRecoveredEditsIfAny(org.apache.hadoop.fs.Path regiondir, long minSeqId, CancelableProgressable reporter) throws UnsupportedEncodingException, IOException
Read the edits log put under this region by the wal log splitting process. We can ignore any log message that has a sequence ID that's equal to or lower than minSeqId. (We know such log messages are already reflected in the HFiles.)

While this is running we are putting pressure on memory, yet we are outside of our usual accounting because we are not yet an onlined region (this stuff is being run as part of Region initialization). This means that if we're up against global memory limits, we'll not be flagged to flush because we are not online. We can't be flushed by the usual mechanisms anyway; we're not yet online, so our relative sequence ids are not yet aligned with HLog sequence ids -- not until we come up online, after processing of split edits.

But to help relieve memory pressure, at least manage our own heap size flushing if we are in excess of per-region limits. When flushing, though, we have to be careful to avoid using the regionserver/hlog sequence id: it runs on a different track from what's going on in this region context, so if we crashed while replaying these edits but in the midst had a flush that used the regionserver log with a sequence id in excess of what's in this region and its split editlogs, then we could miss edits the next time we go to recover. So we have to flush inline, using sequence ids that make sense in this single region context only -- until we are online.
Parameters:
  regiondir
  minSeqId - Any edit found in split editlogs needs to be in excess of this minSeqId to be applied; otherwise it is skipped.
  reporter
Returns:
  minSeqId if nothing was added from editlogs.
Throws:
  UnsupportedEncodingException
  IOException
protected boolean restoreEdit(Store s, KeyValue kv)
Used by tests.
Parameters:
  s - Store to add edit to.
  kv - KeyValue to add.

protected Store instantiateHStore(org.apache.hadoop.fs.Path tableDir, HColumnDescriptor c) throws IOException
Throws:
  IOException
public Store getStore(byte[] column)
Return HStore instance.
Parameters:
  column - Name of column family hosted by this region.
Returns:
  Store that goes with the family on the passed column. TODO: Make this lookup faster.

public Integer obtainRowLock(byte[] row) throws IOException
Obtain a lock on the given row. It is strange to have two mappings:
  ROWS ==> LOCKS
as well as
  LOCKS ==> ROWS
But it acts as a guard on the client; a miswritten client just can't submit the name of a row and start writing to it; it must know the correct lockid, which matches the lock list in memory.

It would be more memory-efficient to assume a correctly-written client, which maybe we'll do in the future.
Parameters:
  row - Name of row to lock.
Throws:
  IOException

public Integer tryObtainRowLock(byte[] row) throws IOException
Tries to obtain a row lock on the given row, but does not block if the row lock is not available.
Throws:
  IOException
See Also:
  obtainRowLock(byte[])
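A hedged sketch of explicit row locking; note that the companion release call is not listed in this summary, so releaseRowLock below is an assumption about the API:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

class RowLockExample {
  // Takes an explicit row lock so the lockid can be reused by put, then
  // releases it in finally.
  static void lockedUpdate(HRegion region) throws IOException {
    byte[] row = Bytes.toBytes("row1");
    Integer lockid = region.obtainRowLock(row); // blocks until the lock is held
    try {
      Put p = new Put(row);
      p.add(Bytes.toBytes("f1"), Bytes.toBytes("q1"), Bytes.toBytes("v2"));
      region.put(p, lockid); // reuse the lock we already hold
    } finally {
      region.releaseRowLock(lockid); // assumed companion of obtainRowLock
    }
  }
}
```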
public void bulkLoadHFile(String hfilePath, byte[] familyName) throws IOException
Throws:
  IOException
public boolean equals(Object o)
Overrides:
  equals in class Object

public int hashCode()
Overrides:
  hashCode in class Object

public String toString()
Overrides:
  toString in class Object
public org.apache.hadoop.fs.Path getTableDir()
public static HRegion newHRegion(org.apache.hadoop.fs.Path tableDir, HLog log, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, HRegionInfo regionInfo, FlushRequester flushListener)
A utility method to create new instances of HRegion based on the HConstants.REGION_IMPL configuration property.
Parameters:
  tableDir - qualified path of directory where region should be located, usually the table directory.
  log - The HLog is the outbound log for any updates to the HRegion (there's a single HLog for all the HRegions on a single HRegionServer). The log file is a logfile from the previous execution that's custom-computed for this HRegion. The HRegionServer computes and sorts the appropriate log info for this HRegion. If there is a previous log file (implying that the HRegion has been written-to before), then read it from the supplied path.
  fs - is the filesystem.
  conf - is global configuration settings.
  regionInfo - HRegionInfo that describes the region.
  flushListener - an object that implements CacheFlushListener, or null.
public static HRegion createHRegion(HRegionInfo info, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.conf.Configuration conf) throws IOException
Convenience method creating new HRegions. Note, this method creates an HLog for the created region. It needs to be closed explicitly. Use getLog() to get access.
Parameters:
  info - Info for region to create.
  rootDir - Root directory for HBase instance
  conf
Throws:
  IOException
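Since createHRegion opens an HLog that the caller must close, here is a minimal end-to-end sketch (assuming the 0.90-era constructors shown; the table name, family, and root directory are placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.regionserver.HRegion;

class CreateRegionExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTableDescriptor htd = new HTableDescriptor("demo_table");
    htd.addFamily(new HColumnDescriptor("f1"));
    // Null start and end keys give this single region the whole keyspace.
    HRegionInfo info = new HRegionInfo(htd, null, null);
    Path rootDir = new Path(args[0]); // HBase root directory, e.g. on HDFS
    HRegion region = HRegion.createHRegion(info, rootDir, conf);
    try {
      // ... put/get/scan against the region ...
    } finally {
      region.close();          // flush and shut down the region's stores
      region.getLog().close(); // createHRegion opened this HLog; close it explicitly
    }
  }
}
```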
public static HRegion openHRegion(HRegionInfo info, HLog wal, org.apache.hadoop.conf.Configuration conf) throws IOException
Open a Region.
Parameters:
  info - Info for region to be opened.
  wal - HLog for region to use. This method will call HLog#setSequenceNumber(long), passing the result of the call to HRegion#getMinSequenceId() to ensure the log id is properly kept up. HRegionStore does this every time it opens a new region.
  conf
Throws:
  IOException
public static HRegion openHRegion(HRegionInfo info, HLog wal, org.apache.hadoop.conf.Configuration conf, FlushRequester flusher, CancelableProgressable reporter) throws IOException
Open a Region.
Parameters:
  info - Info for region to be opened.
  wal - HLog for region to use. This method will call HLog#setSequenceNumber(long), passing the result of the call to HRegion#getMinSequenceId() to ensure the log id is properly kept up. HRegionStore does this every time it opens a new region.
  conf
  flusher - An interface we can request flushes against.
  reporter - An interface we can report progress against.
Throws:
  IOException
protected HRegion openHRegion(CancelableProgressable reporter) throws IOException
Open HRegion.
Parameters:
  reporter
Returns:
  this
Throws:
  IOException
public static void addRegionToMETA(HRegion meta, HRegion r) throws IOException
Inserts a new region's meta information into the passed meta region. Used by the HMaster bootstrap code adding new table to ROOT table.
Parameters:
  meta - META HRegion to be updated
  r - HRegion to add to meta
Throws:
  IOException
public static void deleteRegion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, HRegionInfo info) throws IOException
Deletes all the files for a HRegion.
Parameters:
  fs - the file system object
  rootdir - qualified path of HBase root directory
  info - HRegionInfo for region to be deleted
Throws:
  IOException
public static org.apache.hadoop.fs.Path getRegionDir(org.apache.hadoop.fs.Path rootdir, HRegionInfo info)
Computes the Path of the HRegion.
Parameters:
  rootdir - qualified path of HBase root directory
  info - HRegionInfo for the region
public static boolean rowIsInRange(HRegionInfo info, byte[] row)
Determines if the specified row is within the row range specified by the specified HRegionInfo.
Parameters:
  info - HRegionInfo that specifies the row range
  row - row to be checked
Returns:
  true if the row is within the range specified by the HRegionInfo
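A small sketch of the range convention (startKey inclusive, endKey exclusive) from the class description; table and family names are placeholders:

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

class RowRangeExample {
  public static void main(String[] args) {
    HTableDescriptor htd = new HTableDescriptor("demo_table");
    htd.addFamily(new HColumnDescriptor("f1"));
    // Region covering [b, m): startKey inclusive, endKey exclusive.
    HRegionInfo info = new HRegionInfo(htd, Bytes.toBytes("b"), Bytes.toBytes("m"));
    System.out.println(HRegion.rowIsInRange(info, Bytes.toBytes("b"))); // true
    System.out.println(HRegion.rowIsInRange(info, Bytes.toBytes("m"))); // false
  }
}
```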
public static void makeColumnFamilyDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path tabledir, HRegionInfo hri, byte[] colFamily) throws IOException
Make the directories for a specific column family.
Parameters:
  fs - the file system
  tabledir - base directory where region will live (usually the table dir)
  hri
  colFamily - the column family
Throws:
  IOException
public static HRegion mergeAdjacent(HRegion srcA, HRegion srcB) throws IOException
Merge two HRegions. The regions must be adjacent and must not overlap.
Parameters:
  srcA
  srcB
Returns:
  new merged HRegion
Throws:
  IOException
public static HRegion merge(HRegion a, HRegion b) throws IOException
Merge two regions whether they are adjacent or not.
Parameters:
  a - region a
  b - region b
Returns:
  new merged region
Throws:
  IOException
public Result get(Get get, Integer lockid) throws IOException
Parameters:
  get - get object
  lockid - existing lock id, or null for no previous lock
Throws:
  IOException - read exceptions

public Result increment(Increment increment, Integer lockid, boolean writeToWAL) throws IOException
Perform one or more increment operations on a row. Increments performed are done under row lock, but reads do not take locks out, so this can be seen partially complete by gets and scans.
Parameters:
  increment
  lockid
  writeToWAL
Throws:
  IOException
public long incrementColumnValue(byte[] row, byte[] family, byte[] qualifier, long amount, boolean writeToWAL) throws IOException
Parameters:
  row
  family
  qualifier
  amount
  writeToWAL
Throws:
  IOException
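A hedged sketch of an atomic counter built on incrementColumnValue; the region instance and all names are assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

class CounterExample {
  // Atomically adds 1 to the long stored at row1/f1:hits and returns the
  // new total; names are illustrative, not from this Javadoc.
  static long bump(HRegion region) throws IOException {
    return region.incrementColumnValue(
        Bytes.toBytes("row1"),  // row
        Bytes.toBytes("f1"),    // column family
        Bytes.toBytes("hits"),  // qualifier
        1L,                     // amount to add
        true);                  // writeToWAL
  }
}
```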
public long heapSize()
Specified by:
  heapSize in interface HeapSize
protected void prepareToSplit()
public int getCompactPriority()
public boolean hasTooManyStoreFiles()
public static void main(String[] args) throws IOException
Facility for dumping and compacting catalog tables. For usage, run:
  ./bin/hbase org.apache.hadoop.hbase.regionserver.HRegion
Parameters:
  args
Throws:
  IOException