org.apache.hadoop.hbase.mapreduce
Class TableSnapshotInputFormatImpl
java.lang.Object
org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl
@InterfaceAudience.Private
@InterfaceStability.Evolving
public class TableSnapshotInputFormatImpl
- extends Object
API-agnostic implementation for mapreduce over table snapshots.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TableSnapshotInputFormatImpl
public TableSnapshotInputFormatImpl()
getSplits
public static List<TableSnapshotInputFormatImpl.InputSplit> getSplits(org.apache.hadoop.conf.Configuration conf)
throws IOException
- Throws:
IOException
getBestLocations
public static List<String> getBestLocations(org.apache.hadoop.conf.Configuration conf,
HDFSBlocksDistribution blockDistribution)
- Computes the locations to be passed from the InputSplit. MR/YARN schedulers do not take
weights into account, so they treat every location passed from the input split as equal. We
do not want to blindly pass all the locations: we create one split per region, and
the region's blocks are distributed throughout the cluster unless favored node assignment
is used. In the expected stable case, only one location will hold most of the blocks locally.
With favored node assignment, on the other hand, 3 nodes will hold highly local blocks. Here
we apply a simple heuristic: pass every host that has at least 80%
(hbase.tablesnapshotinputformat.locality.cutoff.multiplier) as much block locality as the
host with the best locality.
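The cutoff heuristic described for getBestLocations can be illustrated with a minimal, self-contained sketch. This is not the HBase implementation: the class LocalityCutoffSketch, the method bestLocations, and the plain Map of host weights (standing in for HDFSBlocksDistribution) are hypothetical names introduced only for illustration, and the multiplier is passed in directly rather than read from a Configuration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the locality cutoff heuristic; not HBase code.
public class LocalityCutoffSketch {

    /**
     * hostWeights maps a host name to the number of bytes of the region's
     * blocks stored locally on that host (an assumed stand-in for
     * HDFSBlocksDistribution). Returns the hosts whose weight is at least
     * cutoffMultiplier times the weight of the best host, best first.
     */
    public static List<String> bestLocations(Map<String, Long> hostWeights,
                                             double cutoffMultiplier) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(hostWeights.entrySet());
        // Heaviest host first; break ties by host name so the order is deterministic.
        sorted.sort((a, b) -> {
            int byWeight = Long.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });

        List<String> locations = new ArrayList<>();
        if (sorted.isEmpty()) {
            return locations;
        }
        double threshold = sorted.get(0).getValue() * cutoffMultiplier;
        for (Map.Entry<String, Long> entry : sorted) {
            if (entry.getValue() < threshold) {
                break; // sorted descending, so every remaining host is below the cutoff
            }
            locations.add(entry.getKey());
        }
        return locations;
    }

    public static void main(String[] args) {
        // Stable case: one host holds most of the blocks locally.
        Map<String, Long> stable = new HashMap<>();
        stable.put("host1", 1000L);
        stable.put("host2", 300L);
        stable.put("host3", 200L);
        System.out.println(bestLocations(stable, 0.8)); // prints [host1]

        // Favored node assignment: three hosts hold highly local blocks.
        Map<String, Long> favored = new HashMap<>();
        favored.put("host1", 1000L);
        favored.put("host2", 950L);
        favored.put("host3", 900L);
        favored.put("host4", 100L);
        System.out.println(bestLocations(favored, 0.8)); // prints [host1, host2, host3]
    }
}
```

With a multiplier of 0.8, the stable case yields a single location and the favored-node case yields three, which matches the two scenarios the description distinguishes.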
setInput
public static void setInput(org.apache.hadoop.conf.Configuration conf,
String snapshotName,
org.apache.hadoop.fs.Path restoreDir)
throws IOException
- Configures the job to use TableSnapshotInputFormat to read from a snapshot.
- Parameters:
conf - the job to configure
snapshotName - the name of the snapshot to read from
restoreDir - a temporary directory to restore the snapshot into. Current user should
have write permissions to this directory, and this should not be a subdirectory of rootdir.
After the job is finished, restoreDir can be deleted.
- Throws:
IOException - if an error occurs
Copyright © 2015 The Apache Software Foundation. All rights reserved.