org.apache.hadoop.hbase.mapreduce
Class LoadIncrementalHFiles

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class LoadIncrementalHFiles
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool

Tool to load the output of HFileOutputFormat into an existing table.

See Also:
usage()

Field Summary
static String ASSIGN_SEQ_IDS
           
static String NAME
           
 
Constructor Summary
LoadIncrementalHFiles(org.apache.hadoop.conf.Configuration conf)
           
LoadIncrementalHFiles(org.apache.hadoop.conf.Configuration conf, boolean useSecureHBaseOverride)
           
 
Method Summary
protected  void bulkLoadPhase(HTable table, HConnection conn, ExecutorService pool, Deque<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> queue, com.google.common.collect.Multimap<ByteBuffer,org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> regionGroups)
          Takes the load queue items (LQIs), grouped by likely region, and attempts to bulk load them.
 void doBulkLoad(org.apache.hadoop.fs.Path hfofDir, HTable table)
          Perform a bulk load of the given directory into the given pre-existing table.
protected  List<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> groupOrSplit(com.google.common.collect.Multimap<ByteBuffer,org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> regionGroups, org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem item, HTable table, Pair<byte[][],byte[][]> startEndKeys)
          Attempt to assign the given load queue item into its target region group.
static byte[][] inferBoundaries(TreeMap<byte[],Integer> bdryMap)
           
static void main(String[] args)
           
 int run(String[] args)
           
protected  List<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> splitStoreFile(org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem item, HTable table, byte[] startKey, byte[] splitKey)
           
protected  List<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> tryAtomicRegionLoad(HConnection conn, byte[] tableName, byte[] first, Collection<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> lqis)
          Attempts to do an atomic load of many hfiles into a region.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

NAME

public static String NAME

ASSIGN_SEQ_IDS

public static String ASSIGN_SEQ_IDS
Constructor Detail

LoadIncrementalHFiles

public LoadIncrementalHFiles(org.apache.hadoop.conf.Configuration conf)
                      throws Exception
Throws:
Exception

LoadIncrementalHFiles

public LoadIncrementalHFiles(org.apache.hadoop.conf.Configuration conf,
                             boolean useSecureHBaseOverride)
                      throws Exception
Throws:
Exception
Method Detail

doBulkLoad

public void doBulkLoad(org.apache.hadoop.fs.Path hfofDir,
                       HTable table)
                throws TableNotFoundException,
                       IOException
Perform a bulk load of the given directory into the given pre-existing table. This method is not thread-safe.

Parameters:
hfofDir - the directory that was provided as the output path of a job using HFileOutputFormat
table - the table to load into
Throws:
TableNotFoundException - if the table does not yet exist
IOException

bulkLoadPhase

protected void bulkLoadPhase(HTable table,
                             HConnection conn,
                             ExecutorService pool,
                             Deque<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> queue,
                             com.google.common.collect.Multimap<ByteBuffer,org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> regionGroups)
                      throws IOException
Takes the load queue items (LQIs), grouped by likely region, and attempts to bulk load them. Any failures are re-queued for another pass with the groupOrSplitPhase.

Throws:
IOException
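
The re-queue behaviour described above can be pictured with a small stand-alone model. This is a hedged sketch, not the HBase implementation: the class RequeueLoopSketch, the method drain, the use of plain strings as load queue items, and the rule that an item's first character names its region are all assumptions made for illustration. It only shows the shape of the loop, in which items are grouped, each group is attempted, and whatever fails goes back on the queue for the next pass.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

/** Illustrative model only; none of these names belong to the HBase API. */
final class RequeueLoopSketch {

    /**
     * Runs "group, then load" passes until the queue is empty and returns the
     * number of passes. tryLoadGroup returns the items that must be retried
     * (an empty list means the whole group loaded).
     */
    static int drain(Deque<String> queue,
                     Function<List<String>, List<String>> tryLoadGroup,
                     int maxPasses) {
        int passes = 0;
        while (!queue.isEmpty()) {
            if (passes == maxPasses) {
                throw new IllegalStateException("gave up after " + maxPasses + " passes");
            }
            passes++;

            // Grouping phase: bucket each queued item by its likely region.
            // Hypothetical rule: the first character of a non-empty item names the region.
            Map<String, List<String>> regionGroups = new TreeMap<>();
            while (!queue.isEmpty()) {
                String item = queue.remove();
                regionGroups.computeIfAbsent(item.substring(0, 1), k -> new ArrayList<>()).add(item);
            }

            // Load phase: attempt every group; failures are re-queued for another pass.
            for (List<String> group : regionGroups.values()) {
                queue.addAll(tryLoadGroup.apply(group));
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(List.of("a1", "a2", "b1"));
        boolean[] failedOnce = {false};
        // Simulated loader: the "b" group fails on its first attempt only.
        int passes = drain(queue, group -> {
            if (group.get(0).startsWith("b") && !failedOnce[0]) {
                failedOnce[0] = true;
                return group;
            }
            return List.of();
        }, 5);
        System.out.println("passes = " + passes);
    }
}
```

The maxPasses bound is a choice made for this sketch so that a loader which never succeeds cannot spin forever; the original text does not state whether the real method enforces such a limit.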

splitStoreFile

protected List<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> splitStoreFile(org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem item,
                                                                                                     HTable table,
                                                                                                     byte[] startKey,
                                                                                                     byte[] splitKey)
                                                                                              throws IOException
Throws:
IOException

groupOrSplit

protected List<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> groupOrSplit(com.google.common.collect.Multimap<ByteBuffer,org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> regionGroups,
                                                                                                   org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem item,
                                                                                                   HTable table,
                                                                                                   Pair<byte[][],byte[][]> startEndKeys)
                                                                                            throws IOException
Attempt to assign the given load queue item into its target region group. If the hfile boundary no longer fits into a region, physically splits the hfile such that the new bottom half will fit and returns the list of load queue items (LQIs) corresponding to the resultant hfiles. Protected for testing.

Throws:
IOException
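
The group-or-split decision described above can be sketched without any HBase types. The following is an assumption-laden model, not the library's code: GroupOrSplitSketch is a hypothetical class, an hfile is represented as a sorted list of row keys, keys are plain strings compared lexicographically, and regions are given as parallel startKeys/endKeys arrays in which an empty start key marks the first region and an empty end key marks the last. A file that fits is added to its region group and an empty list is returned; a file that overruns its region is split at that region's end key so that the bottom half fits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative model only; names and the String key type are assumptions. */
final class GroupOrSplitSketch {

    /**
     * fileKeys: the non-empty, sorted row keys held by one file.
     * startKeys/endKeys: parallel sorted arrays, one entry per region;
     * startKeys[0] is assumed to be "" and the last end key is "" (unbounded).
     * Returns an empty list if the file was grouped, else its two halves.
     */
    static List<List<String>> groupOrSplit(Map<String, List<List<String>>> regionGroups,
                                           List<String> fileKeys,
                                           String[] startKeys,
                                           String[] endKeys) {
        String first = fileKeys.get(0);
        String last = fileKeys.get(fileKeys.size() - 1);

        // Locate the region whose start key is the greatest one <= first.
        int idx = Arrays.binarySearch(startKeys, first);
        if (idx < 0) {
            idx = -(idx + 1) - 1; // insertion point minus one
        }

        String endKey = endKeys[idx];
        boolean fits = endKey.isEmpty() || last.compareTo(endKey) < 0;
        if (fits) {
            regionGroups.computeIfAbsent(startKeys[idx], k -> new ArrayList<>()).add(fileKeys);
            return new ArrayList<>();
        }

        // Split at the region's end key: the bottom half now fits this region,
        // the top half is handed back and may need further splitting.
        List<String> bottom = new ArrayList<>();
        List<String> top = new ArrayList<>();
        for (String key : fileKeys) {
            (key.compareTo(endKey) < 0 ? bottom : top).add(key);
        }
        List<List<String>> halves = new ArrayList<>();
        halves.add(bottom);
        halves.add(top);
        return halves;
    }

    public static void main(String[] args) {
        String[] startKeys = {"", "m"};
        String[] endKeys = {"m", ""};
        Map<String, List<List<String>>> groups = new TreeMap<>();
        List<List<String>> halves =
            groupOrSplit(groups, List.of("apple", "kiwi", "pear"), startKeys, endKeys);
        System.out.println("halves = " + halves + ", groups = " + groups);
    }
}
```

In this model the caller would put the returned halves back on its work queue and call groupOrSplit again for each, which is how the top half ends up in the correct later region.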

tryAtomicRegionLoad

protected List<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> tryAtomicRegionLoad(HConnection conn,
                                                                                                          byte[] tableName,
                                                                                                          byte[] first,
                                                                                                          Collection<org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.LoadQueueItem> lqis)
                                                                                                   throws IOException
Attempts to do an atomic load of many hfiles into a region. If it fails, it returns a list of hfiles that need to be retried. If it is successful, it returns an empty list. NOTE: To maintain row atomicity guarantees, the region server callable should either succeed atomically or fail atomically. Protected for testing.

Returns:
empty list if success, list of items to retry on recoverable failure
Throws:
IOException
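
The return contract above (empty list on success, items to retry on recoverable failure) combined with the all-or-nothing requirement can be illustrated with a toy model. Everything here is hypothetical: AtomicLoadSketch, tryAtomicLoad, the use of strings for hfiles, and the acceptable predicate standing in for whatever check the region server performs. The sketch validates the whole batch before committing any of it, so a failure leaves the region untouched.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative model only; not the region server's actual mechanism. */
final class AtomicLoadSketch {

    /**
     * Loads every file into regionFiles or none of them.
     * Returns an empty list on success, or all files for retry on failure.
     */
    static List<String> tryAtomicLoad(Set<String> regionFiles,
                                      Collection<String> files,
                                      Predicate<String> acceptable) {
        // Check the whole batch first so that a rejection changes nothing.
        for (String file : files) {
            if (!acceptable.test(file)) {
                return new ArrayList<>(files); // recoverable failure: retry the batch
            }
        }
        regionFiles.addAll(files); // commit step, reached only if every file passed
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        Set<String> region = new HashSet<>();
        List<String> retry = tryAtomicLoad(region, List.of("f1", "bad"), f -> !f.equals("bad"));
        System.out.println("retry = " + retry + ", region = " + region);
    }
}
```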

inferBoundaries

public static byte[][] inferBoundaries(TreeMap<byte[],Integer> bdryMap)

run

public int run(String[] args)
        throws Exception
Specified by:
run in interface org.apache.hadoop.util.Tool
Throws:
Exception

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception


Copyright © 2015 The Apache Software Foundation. All Rights Reserved.