Chapter 7. HBase and MapReduce

Table of Contents

7.1. The default HBase MapReduce Splitter
7.2. HBase Input MapReduce Example
7.3. Accessing Other HBase Tables in a MapReduce Job
7.4. Speculative Execution

See the HBase and MapReduce documentation in the javadocs and start there. The sections below provide some additional help.

7.1. The default HBase MapReduce Splitter

When TableInputFormat is used to source an HBase table in a MapReduce job, its splitter creates one map task for each region of the table. Thus, if the table has 100 regions, the job will have 100 map tasks, regardless of how many column families are selected in the Scan.
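The rule above can be illustrated with a small, self-contained model. This is only a sketch of the behavior described, not HBase's actual splitter code: the class RegionSplitSketch, its Region and Split types, and the splitsFor method are hypothetical names invented for this example, and it deliberately does not use any HBase classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of "one split (map task) per region".
// All names here are hypothetical; this is not HBase's implementation.
public class RegionSplitSketch {

    // Stand-in for a table region covering the row range [startKey, endKey).
    public static class Region {
        public final String startKey;
        public final String endKey;

        public Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Stand-in for an input split handed to a single map task.
    public static class Split {
        public final String startRow;
        public final String endRow;

        public Split(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }
    }

    // Produces exactly one split per region. The selected column families
    // are accepted but intentionally ignored: they affect what each map
    // task reads, not how many map tasks there are.
    public static List<Split> splitsFor(List<Region> regions,
                                        List<String> selectedFamilies) {
        List<Split> splits = new ArrayList<>();
        for (Region region : regions) {
            splits.add(new Split(region.startKey, region.endKey));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Build a model table with 100 contiguous regions.
        List<Region> regions = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            regions.add(new Region("row" + i, "row" + (i + 1)));
        }

        // One column family selected versus three: the split count is the same.
        System.out.println(splitsFor(regions, Arrays.asList("cf1")).size());
        System.out.println(splitsFor(regions, Arrays.asList("cf1", "cf2", "cf3")).size());
    }
}
```

In this model, both calls in main yield 100 splits, matching the 100-region example in the text: narrowing or widening the column family selection changes the data each task scans, but not the number of tasks.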