org.apache.hadoop.examples
Class RandomWriter
java.lang.Object
org.apache.hadoop.mapred.MapReduceBase
org.apache.hadoop.examples.RandomWriter
- All Implemented Interfaces:
- Closeable, JobConfigurable, Reducer
public class RandomWriter
- extends MapReduceBase
- implements Reducer
This program uses map/reduce to run a distributed job in which the tasks
do not interact with each other. Each task writes a large, unsorted,
random binary sequence file of BytesWritable.
- Author:
- Owen O'Malley
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RandomWriter
public RandomWriter()
reduce
public void reduce(WritableComparable key,
Iterator values,
OutputCollector output,
Reporter reporter)
throws IOException
- Description copied from interface:
Reducer
- Combines values for a given key. Output values must be of the same type
as input values. Input keys must not be altered. Typically all values
are combined into zero or one value. Output pairs are collected with
calls to OutputCollector.collect(WritableComparable,Writable).
- Specified by:
reduce in interface Reducer
- Parameters:
key - the key
values - the values to combine
output - to collect combined values
- Throws:
IOException
main
public static void main(String[] args)
throws IOException
- This is the main routine for launching a distributed random write job.
It runs 10 maps per node, and each node writes 1 GB of data to a DFS file.
The reduce does nothing.
This program uses a useful pattern for dealing with Hadoop's constraints
on InputSplits. Since each input split can only consist of a file and
byte range and we want to control how many maps there are (and we don't
really have any inputs), we create a directory with a set of artificial
files that each contain the filename that we want a given map to write
to. Then, using the text line reader and this "fake" input directory, we
generate exactly the right number of maps. Each map gets a single record
that is the filename it is supposed to write its output to.
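The "fake" input directory described above can be illustrated with a minimal sketch. It is not the RandomWriter source: it assumes the local filesystem via java.nio.file rather than DFS, and the class name FakeInputDirSketch, the method createControlFiles, and the file names "dummy-N" and "part-N" are all hypothetical, chosen only for illustration. It shows one way to produce one small control file per desired map, each holding a single line that names the file the map should write.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; uses the local filesystem, not Hadoop's DFS.
final class FakeInputDirSketch {

    /**
     * Creates numMaps small control files in inputDir. Each file contains
     * exactly one line: the path of the output file that the corresponding
     * map is expected to write.
     */
    static List<Path> createControlFiles(Path inputDir, Path outputDir, int numMaps)
            throws IOException {
        Files.createDirectories(inputDir);
        List<Path> controlFiles = new ArrayList<>();
        for (int i = 0; i < numMaps; i++) {
            Path target = outputDir.resolve("part-" + i);   // hypothetical output name
            Path control = inputDir.resolve("dummy-" + i);  // hypothetical control name
            byte[] line = (target.toString() + "\n").getBytes(StandardCharsets.UTF_8);
            Files.write(control, line);
            controlFiles.add(control);
        }
        return controlFiles;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("random-writer-sketch");
        List<Path> files = createControlFiles(base.resolve("in"), base.resolve("out"), 4);
        for (Path f : files) {
            // A line-oriented reader would see exactly one record per control file.
            System.out.println(f.getFileName() + " -> " + Files.readAllLines(f).get(0));
        }
    }
}
```

Because each control file holds a single line, a line-oriented reader yields one record per file, so the number of control files sets the number of maps.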
- Throws:
IOException
Copyright © 2006 The Apache Software Foundation