org.apache.hadoop.hbase
Class HLog

java.lang.Object
  extended by org.apache.hadoop.hbase.HLog
All Implemented Interfaces:
HConstants

public class HLog
extends Object
implements HConstants

HLog stores all the edits to the HStore. It performs logfile-rolling, so external callers are not aware that the underlying file is being rolled.

A single HLog is used by several HRegions simultaneously.

Each HRegion is identified by a unique long int. HRegions do not need to declare themselves before using the HLog; they simply include their HRegion-id in the append or completeCacheFlush calls.

An HLog consists of multiple on-disk files, which have a chronological order. As data is flushed to other (better) on-disk structures, the log becomes obsolete. We can destroy all the log messages for a given HRegion-id up to the most-recent CACHEFLUSH message from that HRegion.

It's only practical to delete entire files. Thus, we delete an entire on-disk file F when all of the messages in F have a log-sequence-id that's older (smaller) than the most-recent CACHEFLUSH message for every HRegion that has a message in F.

Synchronized methods can never execute in parallel. However, between the start of a cache flush and the completion point, appends are allowed but log rolling is not. To prevent log rolling taking place during this period, a separate reentrant lock is used.

TODO: Vuk Ercegovac also pointed out that keeping HBase HRegion edit logs in HDFS is currently flawed. HBase writes edits to logs and to a memcache. The 'atomic' write to the log is meant to serve as insurance against abnormal RegionServer exit: on startup, the log is rerun to reconstruct an HRegion's last wholesome state. But files in HDFS do not 'exist' until they are cleanly closed -- something that will not happen if RegionServer exits without running its 'close'.


Field Summary
 
Fields inherited from interface org.apache.hadoop.hbase.HConstants
ALL_VERSIONS, COL_REGIONINFO, COL_REGIONINFO_ARRAY, COL_SERVER, COL_SPLITA, COL_SPLITB, COL_STARTCODE, COLUMN_FAMILY, COLUMN_FAMILY_ARRAY, COLUMN_FAMILY_STR, DEFAULT_HBASE_DIR, DEFAULT_HOST, DEFAULT_MASTER_ADDRESS, DEFAULT_MASTER_INFOPORT, DEFAULT_MAX_FILE_SIZE, DEFAULT_REGION_SERVER_CLASS, DEFAULT_REGIONSERVER_ADDRESS, DEFAULT_REGIONSERVER_INFOPORT, EMPTY_START_ROW, EMPTY_TEXT, HBASE_DIR, HREGION_LOGDIR_NAME, HREGION_OLDLOGFILE_NAME, HREGIONDIR_PREFIX, LATEST_TIMESTAMP, MASTER_ADDRESS, META_TABLE_NAME, REGION_SERVER_CLASS, REGIONSERVER_ADDRESS, ROOT_TABLE_NAME, THREAD_WAKE_FREQUENCY, UTF8_ENCODING
 
Method Summary
static void main(String[] args)
          Pass one or more log file names and it will either dump out a text version on stdout or split the specified log files.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

main

public static void main(String[] args)
                 throws IOException
Pass one or more log file names and it will either dump out a text version on stdout or split the specified log files.

Parameters:
args -
Throws:
IOException


Copyright © 2006 The Apache Software Foundation