org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class CombinerOptimizer

java.lang.Object
  extended by org.apache.pig.impl.plan.PlanVisitor<MapReduceOper,MROperPlan>
      extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.plans.MROpPlanVisitor
          extended by org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer

public class CombinerOptimizer
extends MROpPlanVisitor

Optimizes map-reduce plans to use the combiner where possible. Algebraic functions and distinct in the nested plan of a foreach are partially computed in the map and combine phases. A new foreach statement with the initial and intermediate forms of the algebraic functions is added to the map and combine plans, respectively. If the bag portion of the group-by result is projected, or a non-algebraic expression/UDF takes the bag as input, the combiner is not used. In such cases the combiner is likely to degrade performance, because the combine stage would not reduce the data size enough to offset the cost of the additional (de)serialization.

Major areas for enhancement:
1. use of the combiner in cogroup
2. queries with order-by, limit, or sort in a nested foreach after group-by
3. cases where a group-by is followed by a filter that has an algebraic expression
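
The partial computation described above can be illustrated with a small, Pig-independent sketch. It is not the optimizer's implementation and uses no Pig classes: the class AlgebraicAvgSketch, its Partial type, and the method names initial, intermediate, and finalForm are all hypothetical, chosen only to mirror the initial (map), intermediate (combine), and final (reduce) forms of an algebraic function, here an average.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch of an algebraic average split into three forms.
 * None of these names come from Pig; they are assumptions for illustration.
 */
public class AlgebraicAvgSketch {

    /** Partial result carried between phases: a running sum and a count. */
    public static final class Partial {
        public final long sum;
        public final long count;

        public Partial(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    /** Initial form (map side): turn one input value into a partial result. */
    public static Partial initial(long value) {
        return new Partial(value, 1);
    }

    /** Intermediate form (combine side): merge many partials into one. */
    public static Partial intermediate(List<Partial> partials) {
        long sum = 0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return new Partial(sum, count);
    }

    /** Final form (reduce side): merge the remaining partials and compute the average. */
    public static double finalForm(List<Partial> partials) {
        Partial total = intermediate(partials);
        return total.count == 0 ? 0.0 : (double) total.sum / total.count;
    }

    /** Simulates two map tasks whose output is combined before reaching the reducer. */
    public static double demo() {
        Partial fromMap1 = intermediate(Arrays.asList(initial(1), initial(2), initial(3)));
        Partial fromMap2 = intermediate(Arrays.asList(initial(4), initial(5)));
        // The reducer sees two small partials instead of five raw values.
        return finalForm(Arrays.asList(fromMap1, fromMap2));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3.0
    }
}
```

In this sketch each combine step collapses a bag of values into a single (sum, count) pair, which is the reduction in data size that makes the combiner worthwhile. An expression that needs the whole bag (for example, projecting it unchanged) has no such compact partial form, which is consistent with the class description's statement that the combiner is skipped in that case.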


Field Summary
 
Fields inherited from class org.apache.pig.impl.plan.PlanVisitor
mCurrentWalker, mPlan
 
Constructor Summary
CombinerOptimizer(MROperPlan plan, String chunkSize)
           
CombinerOptimizer(MROperPlan plan, String chunkSize, CompilationMessageCollector messageCollector)
           
 
Method Summary
 CompilationMessageCollector getMessageCollector()
           
 void visitMROp(MapReduceOper mr)
           
 
Methods inherited from class org.apache.pig.impl.plan.PlanVisitor
getPlan, popWalker, pushWalker, visit
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CombinerOptimizer

public CombinerOptimizer(MROperPlan plan,
                         String chunkSize)

CombinerOptimizer

public CombinerOptimizer(MROperPlan plan,
                         String chunkSize,
                         CompilationMessageCollector messageCollector)
Method Detail

getMessageCollector

public CompilationMessageCollector getMessageCollector()

visitMROp

public void visitMROp(MapReduceOper mr)
               throws VisitorException
Overrides:
visitMROp in class MROpPlanVisitor
Throws:
VisitorException


Copyright © ${year} The Apache Software Foundation