org.apache.hadoop.streaming
Class PipeCombiner

java.lang.Object
  extended by org.apache.hadoop.streaming.PipeMapRed
      extended by org.apache.hadoop.streaming.PipeReducer
          extended by org.apache.hadoop.streaming.PipeCombiner
All Implemented Interfaces:
Closeable, JobConfigurable, Reducer

public class PipeCombiner
extends PipeReducer

A generic Combiner bridge.
To use a combiner, specify -combiner myprogram in hadoopStreaming. The class delegates operations to an external program via stdin and stdout. Within one run of the external program, all records with the same key appear together. Make no assumptions about how many times the combiner runs on your data. Ideally the combiner and the reducer are the same program: the combiner partially aggregates the data zero or more times, and the reducer applies the final aggregation pass. Do not use a combiner if your reduce logic does not support such multipass aggregation.
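
The multipass requirement is easiest to see with a small external program. The following is a minimal sketch, not part of Hadoop: the class name SumCombiner and the record format (one "key<TAB>integer count" per line) are assumptions made for illustration. Because summing is associative, running the program over its own output any number of times yields the same totals, which is the property a combiner needs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical external program for illustration; not part of Hadoop.
final class SumCombiner {

    // Reads "key<TAB>count" lines, assuming records with equal keys are
    // adjacent, and writes one "key<TAB>sum" line per run of equal keys.
    static void combine(BufferedReader in, PrintWriter out) throws IOException {
        String currentKey = null;
        long sum = 0;
        String line;
        while ((line = in.readLine()) != null) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skipping malformed lines is a choice of this sketch
            }
            String key = line.substring(0, tab);
            long value = Long.parseLong(line.substring(tab + 1).trim());
            if (currentKey != null && !key.equals(currentKey)) {
                // Key changed: emit the finished partial sum.
                out.print(currentKey + "\t" + sum + "\n");
                sum = 0;
            }
            currentKey = key;
            sum += value;
        }
        if (currentKey != null) {
            out.print(currentKey + "\t" + sum + "\n");
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(System.out, StandardCharsets.UTF_8));
        combine(in, out);
    }
}
```

Since the output format matches the input format, a program like this could serve as both the combiner and the reducer. A reduce step that is not safe to apply repeatedly, such as computing an average directly from raw values, would not fit this pattern.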

Author:
Michel Tourn

Field Summary
 
Fields inherited from class org.apache.hadoop.streaming.PipeMapRed
LOG
 
Constructor Summary
PipeCombiner()
           
 
Method Summary
 
Methods inherited from class org.apache.hadoop.streaming.PipeReducer
close, reduce
 
Methods inherited from class org.apache.hadoop.streaming.PipeMapRed
configure, getContext, mapRedFinished
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.mapred.JobConfigurable
configure
 

Constructor Detail

PipeCombiner

public PipeCombiner()


Copyright © 2006 The Apache Software Foundation