org.apache.hadoop.hbase.regionserver
Class HLog

java.lang.Object
  extended by org.apache.hadoop.hbase.regionserver.HLog
All Implemented Interfaces:
HConstants

public class HLog
extends Object
implements HConstants

HLog stores all the edits to the HStore. It handles log-file rolling internally, so external callers are not aware that the underlying file is being rolled.

A single HLog is used by several HRegions simultaneously.

Each HRegion is identified by a unique long int. HRegions do not need to declare themselves before using the HLog; they simply include their HRegion-id in the append or completeCacheFlush calls.

An HLog consists of multiple on-disk files, which have a chronological order. As data is flushed to other (better) on-disk structures, the log becomes obsolete. We can destroy all the log messages for a given HRegion-id up to the most-recent CACHEFLUSH message from that HRegion.

It's only practical to delete entire files. Thus, we delete an entire on-disk file F when all of the messages in F have a log-sequence-id that's older (smaller) than the most-recent CACHEFLUSH message for every HRegion that has a message in F.
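The deletion rule above can be illustrated with a minimal sketch. This is not the HLog implementation: the class LogFileRetention, the method canDelete, and both map parameters are hypothetical names introduced here, and the sketch assumes the caller has already summarized each file by the largest log-sequence-id it holds per HRegion-id.

```java
import java.util.Map;

/** Illustrative sketch only; every name in this class is hypothetical. */
final class LogFileRetention {

    /**
     * Decides whether an on-disk log file F may be deleted.
     *
     * @param maxSeqIdInFile  for each HRegion-id with a message in F, the
     *                        largest log-sequence-id of its messages in F
     * @param lastFlushSeqId  for each HRegion-id, the log-sequence-id of its
     *                        most recent CACHEFLUSH message
     * @return true only if every message in F is older (smaller) than the
     *         most recent CACHEFLUSH of the region that wrote it
     */
    static boolean canDelete(Map<Long, Long> maxSeqIdInFile,
                             Map<Long, Long> lastFlushSeqId) {
        for (Map.Entry<Long, Long> entry : maxSeqIdInFile.entrySet()) {
            Long flushed = lastFlushSeqId.get(entry.getKey());
            // A region that has never flushed, or whose newest message in F
            // is not older than its last flush, pins the whole file.
            if (flushed == null || entry.getValue() >= flushed) {
                return false;
            }
        }
        return true;
    }
}
```

A single region with unflushed edits in F is enough to keep the file, which is why a slow-flushing region can delay cleanup of logs shared with other regions.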

Synchronized methods can never execute in parallel. However, between the start of a cache flush and its completion point, appends are allowed but log rolling is not. To prevent log rolling from taking place during this period, a separate reentrant lock is used.

TODO: Vuk Ercegovac also pointed out that keeping HBase HRegion edit logs in HDFS is currently flawed. HBase writes edits to logs and to a memcache. The 'atomic' write to the log is meant to serve as insurance against abnormal RegionServer exit: on startup, the log is rerun to reconstruct an HRegion's last wholesome state. But files in HDFS do not 'exist' until they are cleanly closed -- something that will not happen if RegionServer exits without running its 'close'.


Field Summary
 
Fields inherited from interface org.apache.hadoop.hbase.HConstants
ALL_META_COLUMNS, ALL_VERSIONS, COL_REGIONINFO, COL_REGIONINFO_ARRAY, COL_SERVER, COL_SPLITA, COL_SPLITB, COL_STARTCODE, COLUMN_FAMILY, COLUMN_FAMILY_ARRAY, COLUMN_FAMILY_HISTORIAN, COLUMN_FAMILY_HISTORIAN_STR, COLUMN_FAMILY_STR, DEFAULT_CLIENT_RETRIES, DEFAULT_HOST, DEFAULT_MASTER_ADDRESS, DEFAULT_MASTER_INFOPORT, DEFAULT_MASTER_PORT, DEFAULT_MAX_FILE_SIZE, DEFAULT_REGION_SERVER_CLASS, DEFAULT_REGIONSERVER_ADDRESS, DEFAULT_REGIONSERVER_INFOPORT, DEFAULT_SIZE_RESERVATION_BLOCK, EMPTY_BYTE_ARRAY, EMPTY_END_ROW, EMPTY_START_ROW, FILE_SYSTEM_VERSION, FOREVER, HBASE_CLIENT_RETRIES_NUMBER_KEY, HBASE_DIR, HREGION_LOGDIR_NAME, HREGION_OLDLOGFILE_NAME, IN_MEMORY, LAST_ROW, LATEST_TIMESTAMP, MASTER_ADDRESS, META_TABLE_NAME, NAME, NINES, REGION_SERVER_CLASS, REGION_SERVER_IMPL, REGIONSERVER_ADDRESS, RETRY_BACKOFF, ROOT_TABLE_NAME, THREAD_WAKE_FREQUENCY, UTF8_ENCODING, VERSION_FILE_NAME, VERSIONS, ZERO_L, ZEROES
 
Constructor Summary
HLog(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.conf.Configuration conf, LogRollListener listener)
          Create an edit log at the given dir location.
 
Method Summary
 void closeAndDelete()
          Shut down the log and delete the log directory.
static void main(String[] args)
          Given one or more log file names, either dump a text version to stdout or split the specified log files.
 void rollWriter()
          Roll the log writer.
static void splitLog(org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path srcDir, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf)
          Split a set of log files that are no longer being written to into new files, one per region.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HLog

public HLog(org.apache.hadoop.fs.FileSystem fs,
            org.apache.hadoop.fs.Path dir,
            org.apache.hadoop.conf.Configuration conf,
            LogRollListener listener)
     throws IOException
Create an edit log at the given dir location. You should never have to load an existing log. If there is a log at startup, it should have already been processed and deleted by the time the HLog object is started up.

Parameters:
fs -
dir -
conf -
listener -
Throws:
IOException
Method Detail

rollWriter

public void rollWriter()
                throws IOException
Roll the log writer. That is, start writing log messages to a new file. Because a log cannot be rolled during a cache flush, and a cache flush spans two method calls, a special lock needs to be obtained so that a cache flush cannot start when the log is being rolled and the log cannot be rolled during a cache flush.

Note that this method cannot be synchronized, because that would allow a deadlock. startCacheFlush could run and obtain the cacheFlushLock; this method could then start, obtain the lock on this, and block waiting for the cacheFlushLock; completeCacheFlush could then be called, wait for the lock on this, and consequently never release the cacheFlushLock.

Throws:
IOException
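
The lock ordering described for rollWriter can be sketched as follows. This is a simplified illustration under stated assumptions, not the HLog source: the class RollVsFlushSketch, the counters, and the accessor methods are hypothetical, and only the names cacheFlushLock, startCacheFlush, completeCacheFlush and rollWriter come from the description above. The point it shows is that rollWriter takes the cacheFlushLock first and the lock on this second, so it never holds the monitor while waiting for a flush to finish.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative sketch only; fields and accessors are hypothetical. */
final class RollVsFlushSketch {
    private final ReentrantLock cacheFlushLock = new ReentrantLock();
    private int rolls = 0;
    private int appends = 0;

    /** Appends need only the monitor, so they proceed during a flush. */
    synchronized void append() {
        appends++;
    }

    /** Starts a flush; the lock stays held until completeCacheFlush. */
    void startCacheFlush() {
        cacheFlushLock.lock();
    }

    /** Finishes a flush under the monitor, then releases the flush lock. */
    void completeCacheFlush() {
        try {
            synchronized (this) {
                // A real log would write its CACHEFLUSH message here.
            }
        } finally {
            cacheFlushLock.unlock();
        }
    }

    /** Deliberately not synchronized: flush lock first, monitor second. */
    void rollWriter() {
        cacheFlushLock.lock();
        try {
            synchronized (this) {
                rolls++; // stand-in for switching to a new log file
            }
        } finally {
            cacheFlushLock.unlock();
        }
    }

    boolean flushInProgress() { return cacheFlushLock.isLocked(); }
    synchronized int rollCount() { return rolls; }
    synchronized int appendCount() { return appends; }
}
```

If rollWriter were declared synchronized instead, it would hold the monitor while blocked on cacheFlushLock, and completeCacheFlush could never enter its synchronized block to release that lock.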

closeAndDelete

public void closeAndDelete()
                    throws IOException
Shut down the log and delete the log directory.

Throws:
IOException

splitLog

public static void splitLog(org.apache.hadoop.fs.Path rootDir,
                            org.apache.hadoop.fs.Path srcDir,
                            org.apache.hadoop.fs.FileSystem fs,
                            org.apache.hadoop.conf.Configuration conf)
                     throws IOException
Split a set of log files that are no longer being written to into new files, one per region. Delete the old log files when finished.

Parameters:
rootDir - qualified root directory of the HBase instance
srcDir - Directory of log files to split: e.g. ${ROOTDIR}/log_HOST_PORT
fs - FileSystem
conf - HBaseConfiguration
Throws:
IOException
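
The per-region split that splitLog performs can be pictured with an in-memory sketch. This is an assumption-laden illustration, not the actual method: it uses plain lists instead of files on a FileSystem, and the class SplitSketch, the nested Edit type and the method splitByRegion are hypothetical names. It assumes only what the description states, namely that edits from several regions are interleaved in the source logs and end up grouped one output per region.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; every name in this class is hypothetical. */
final class SplitSketch {

    /** Stand-in for one log message: owning region plus sequence id. */
    static final class Edit {
        final long regionId;
        final long seqId;

        Edit(long regionId, long seqId) {
            this.regionId = regionId;
            this.seqId = seqId;
        }
    }

    /**
     * Regroups edits from several closed log files into one list per region.
     * Files are read in the order given and edits keep their relative order,
     * so each region's output stays in the order it was logged.
     */
    static Map<Long, List<Edit>> splitByRegion(List<List<Edit>> logFiles) {
        Map<Long, List<Edit>> perRegion = new LinkedHashMap<>();
        for (List<Edit> file : logFiles) {
            for (Edit edit : file) {
                perRegion
                    .computeIfAbsent(edit.regionId, id -> new ArrayList<>())
                    .add(edit);
            }
        }
        return perRegion;
    }
}
```

In the real method each per-region group would be written to its own new file and the source files deleted afterwards; the sketch leaves out all file handling.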

main

public static void main(String[] args)
                 throws IOException
Given one or more log file names, either dump a text version to stdout or split the specified log files.

Parameters:
args -
Throws:
IOException


Copyright © 2008 The Apache Software Foundation