org.apache.hadoop.hbase.mapreduce.hadoopbackport
Class InputSampler<K,V>

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.hbase.mapreduce.hadoopbackport.InputSampler<K,V>
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class InputSampler<K,V>
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool

Utility for collecting samples and writing a partition file for TotalOrderPartitioner. This is a copy of org.apache.hadoop.mapreduce.lib.partition.InputSampler from Hadoop trunk at r961542, identical except that TaskAttemptContextImpl is replaced with TaskAttemptContext.


Nested Class Summary
static class InputSampler.IntervalSampler<K,V>
          Sample from s splits at regular intervals.
static class InputSampler.RandomSampler<K,V>
          Sample from random points in the input.
static interface InputSampler.Sampler<K,V>
          Interface to sample using an InputFormat.
static class InputSampler.SplitSampler<K,V>
          Samples the first n records from s splits.
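
The samplers listed above differ mainly in which records they keep. The following is a minimal in-memory sketch of the "first n records" and "regular intervals" ideas, assuming the records are plain strings in a list rather than splits read through an InputFormat. The class and method names (SamplerSketch, firstN, atIntervals) are hypothetical and are not part of this package, and the interval rule shown is an assumption about how such a sampler could behave, not the backported source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of InputSampler. */
final class SamplerSketch {

    /** Keeps the first n records, in the spirit of SplitSampler. */
    static List<String> firstN(List<String> records, int n) {
        return new ArrayList<>(records.subList(0, Math.min(n, records.size())));
    }

    /**
     * Keeps a record whenever the fraction kept so far has dropped below freq,
     * which spreads the kept records at roughly regular intervals.
     */
    static List<String> atIntervals(List<String> records, double freq) {
        List<String> kept = new ArrayList<>();
        long seen = 0;
        for (String record : records) {
            seen++;
            if ((double) kept.size() / seen < freq) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println(firstN(records, 3));
        System.out.println(atIntervals(records, 0.25));
    }
}
```

Taking the first n records is cheap but only representative when keys are not ordered within a split; interval sampling reads every record of the sampled input but covers it evenly.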
 
Constructor Summary
InputSampler(org.apache.hadoop.conf.Configuration conf)
           
 
Method Summary
static org.apache.hadoop.mapreduce.TaskAttemptContext getTaskAttemptContext(org.apache.hadoop.mapreduce.Job job)
          Creates a TaskAttemptContext in a way that lets HBase run on more than just Hadoop 0.20.
static void main(String[] args)
           
 int run(String[] args)
          Driver for InputSampler from the command line.
static <K,V> void writePartitionFile(org.apache.hadoop.mapreduce.Job job, InputSampler.Sampler<K,V> sampler)
          Write a partition file for the given job, using the Sampler provided.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Constructor Detail

InputSampler

public InputSampler(org.apache.hadoop.conf.Configuration conf)
Method Detail

getTaskAttemptContext

public static org.apache.hadoop.mapreduce.TaskAttemptContext getTaskAttemptContext(org.apache.hadoop.mapreduce.Job job)
                                                                            throws IOException
This method exists to keep HBase portable, so that it can run on more than just Hadoop 0.20. In later Hadoop versions, TaskAttemptContext became an interface. On those versions the classes in this package should not be used at all; the native Hadoop ones should be used instead. (A ClassNotFoundException is thrown if execution ends up here when the native Hadoop TotalOrderPartitioner should have been used.)

Parameters:
job -
Returns:
Context
Throws:
IOException
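
One common way to tolerate a type that is a concrete class in one library version and an interface in another is to look up its constructor at run time. The sketch below illustrates that general technique with JDK types only; whether getTaskAttemptContext is implemented this way is an assumption, and the names ReflectiveContextSketch and newInstance are hypothetical.

```java
import java.io.IOException;
import java.lang.reflect.Constructor;

/** Hypothetical illustration only; not part of InputSampler. */
final class ReflectiveContextSketch {

    /** Instantiates type through a one-argument public constructor found at run time. */
    static <T> T newInstance(Class<T> type, Class<?> argType, Object arg) throws IOException {
        try {
            Constructor<T> constructor = type.getConstructor(argType);
            return constructor.newInstance(arg);
        } catch (ReflectiveOperationException e) {
            // An interface has no constructor, so the lookup fails and is reported here.
            throw new IOException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) throws IOException {
        // StringBuilder stands in for a type that is a concrete class.
        StringBuilder sb = newInstance(StringBuilder.class, String.class, "ctx");
        System.out.println(sb);
    }
}
```

Calling the helper with an interface such as CharSequence fails with the wrapped exception, which mirrors the situation described above where the backported classes are used against a Hadoop version they were not meant for.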

writePartitionFile

public static <K,V> void writePartitionFile(org.apache.hadoop.mapreduce.Job job,
                                            InputSampler.Sampler<K,V> sampler)
                               throws IOException,
                                      ClassNotFoundException,
                                      InterruptedException
Write a partition file for the given job, using the Sampler provided. Queries the sampler for a sample keyset, sorts by the output key comparator, selects the keys for each rank, and writes to the destination returned from TotalOrderPartitioner.getPartitionFile(org.apache.hadoop.conf.Configuration).

Throws:
IOException
ClassNotFoundException
InterruptedException
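
The "sort, then select the keys for each rank" step can be sketched without any Hadoop types. The example below is a simplified illustration, assuming the samples fit in an array and that boundaries are taken at evenly spaced ranks; the names PartitionKeySketch and selectBoundaries are hypothetical, and the duplicate handling is a design choice of this sketch, not a statement about the actual source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration only; not part of InputSampler. */
final class PartitionKeySketch {

    /**
     * Sorts the samples and returns up to numPartitions - 1 boundary keys taken
     * at evenly spaced ranks. Fewer keys are returned if the sample has too few
     * distinct values.
     */
    static <K> List<K> selectBoundaries(K[] samples, Comparator<? super K> cmp, int numPartitions) {
        K[] sorted = samples.clone();
        Arrays.sort(sorted, cmp);
        List<K> boundaries = new ArrayList<>();
        float step = sorted.length / (float) numPartitions;
        int last = -1; // index of the previously chosen boundary
        for (int i = 1; i < numPartitions; i++) {
            int k = Math.round(step * i);
            // Advance past keys not greater than the previous boundary so boundaries stay distinct.
            while (last >= 0 && k < sorted.length && cmp.compare(sorted[last], sorted[k]) >= 0) {
                k++;
            }
            if (k >= sorted.length) {
                break;
            }
            boundaries.add(sorted[k]);
            last = k;
        }
        return boundaries;
    }

    public static void main(String[] args) {
        Integer[] samples = {9, 3, 7, 1, 5, 8, 2, 6, 4, 0};
        System.out.println(selectBoundaries(samples, Comparator.<Integer>naturalOrder(), 4));
    }
}
```

With 4 partitions the sketch yields 3 boundary keys; a partitioner that reads them back can route each key to the partition whose range contains it.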

run

public int run(String[] args)
        throws Exception
Driver for InputSampler from the command line. Configures a Job instance and calls writePartitionFile(org.apache.hadoop.mapreduce.Job, org.apache.hadoop.hbase.mapreduce.hadoopbackport.InputSampler.Sampler).

Specified by:
run in interface org.apache.hadoop.util.Tool
Throws:
Exception

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception


Copyright © 2015 The Apache Software Foundation. All Rights Reserved.