The Apache HBase Book

Revision History
Revision 0.90.0
Adding first cuts at Configuration, Getting Started, Data Model
Revision 0.89.20100924 5 October 2010 stack
Initial layout

Abstract

This is the official book of Apache HBase, a distributed, versioned, column-oriented database built on top of Apache Hadoop and Apache ZooKeeper.


Table of Contents

Preface
1. Getting Started
1.1. Introduction
1.2. Quick Start
1.2.1. Download and unpack the latest stable release.
1.2.2. Start HBase
1.2.3. Shell Exercises
1.2.4. Stopping HBase
1.2.5. Where to go next
1.3. Not-so-quick Start Guide
1.3.1. Requirements
1.3.2. HBase run modes: Standalone and Distributed
1.3.3. Example Configurations
2. Configuration
2.1. hbase-site.xml and hbase-default.xml
2.1.1. HBase Default Configuration
2.2. hbase-env.sh
2.3. log4j.properties
2.4. The Important Configurations
2.4.1. Required Configurations
2.4.2. Recommended Configurations
2.5. Client configuration and dependencies for connecting to an HBase cluster
3. The HBase Shell
3.1. Scripting
3.2. Shell Tricks
3.2.1. irbrc
3.2.2. LOG data to timestamp
3.2.3. Debug
4. HBase and MapReduce
5. Metrics
6. Cluster Replication
7. Data Model
7.1. Table
7.2. Row
7.3. Column Family
7.4. Cells
7.5. Versions
7.5.1. Versions and HBase Operations
7.5.2. Current Limitations
8. Architecture
8.1. Daemons
8.1.1. Master
8.1.2. RegionServer
8.2. Regions
8.2.1. Region Size
8.2.2. Region Splits
8.2.3. Region Load Balancer
8.2.4. Store
9. The WAL
9.1. What is the purpose of the HBase WAL?
9.2. WAL splitting
9.2.1. hbase.hlog.split.skip.errors
9.2.2. How EOFExceptions are treated when splitting a crashed RegionServer's WALs
10. Bloom Filters
10.1. Configurations
10.1.1. HColumnDescriptor option
10.1.2. io.hfile.bloom.enabled global kill switch
10.1.3. io.hfile.bloom.error.rate
10.1.4. io.hfile.bloom.max.fold
10.2. Bloom StoreFile footprint
10.2.1. BloomFilter in the StoreFile FileInfo data structure
10.2.2. BloomFilter entries in StoreFile metadata
A. Tools
A.1. HBase hbck
A.2. HFile Tool
A.3. WAL Tools
A.3.1. HLog tool
B. Compression
B.1. LZO
B.2. hbase.regionserver.codecs
C. FAQ
Index