org.apache.pig.backend.hadoop.executionengine.mapReduceLayer
Class MRCompiler
java.lang.Object
org.apache.pig.impl.plan.PlanVisitor<PhysicalOperator,PhysicalPlan>
org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhyPlanVisitor
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler
public class MRCompiler
- extends PhyPlanVisitor
Compiles a given physical plan into a DAG of MapReduce operators,
which can then be converted into the JobControl structure.
It is implemented as a visitor of the PhysicalPlan it is compiling.
Currently supports all operators except the MR Sort operator.
Uses a predecessor-based depth-first traversal.
To compile an operator, it first compiles the predecessors into
MapReduce operators and then tries to merge the current operator
into one of them, the goal being to keep the number of MROpers
to a minimum.
It also merges multiple map jobs, created by compiling the inputs
individually, into a single job. A new map job is created and the
contents of the previous map plans are added to it. Any other state
held by the previous map plans, however, must be moved over manually,
so take care of this when adding something new. requestedParallelism
is an example of such state.
A new MapReduce operator is started only for blocking operators and
splits, using a store-load combination to connect the two operators.
Whenever this happens, care is taken to add the MROper to the MRPlan
and connect it appropriately.
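The traversal described above can be illustrated with a small, self-contained toy model. This is a sketch under stated assumptions, not Pig's implementation: the types ToyPlanCompiler, Op and Job are hypothetical, a job is reduced to a list of operator names, every blocking operator is assumed to force a new job, and the temporary file names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these types are Pig classes. */
final class ToyPlanCompiler {

    /** A physical operator; 'blocking' marks operators that force a job boundary. */
    static final class Op {
        final String name;
        final boolean blocking;
        final List<Op> preds;

        Op(String name, boolean blocking, Op... preds) {
            this.name = name;
            this.blocking = blocking;
            this.preds = Arrays.asList(preds);
        }
    }

    /** A job is reduced to the ordered list of operator names it runs. */
    static final class Job {
        final List<String> ops = new ArrayList<>();
    }

    final List<Job> jobs = new ArrayList<>();
    private final Map<Op, Job> compiled = new LinkedHashMap<>();
    private int tmpCount = 0;

    Job compile(Op op) {
        Job done = compiled.get(op);
        if (done != null) {
            return done;
        }
        // Depth first: compile every predecessor before the operator itself.
        for (Op p : op.preds) {
            compile(p);
        }
        List<Job> predJobs = new ArrayList<>();
        for (Op p : op.preds) {
            Job j = compiled.get(p);
            if (!predJobs.contains(j)) {
                predJobs.add(j);
            }
        }
        Job target;
        if (predJobs.isEmpty()) {
            target = newJob();                  // a leaf (load) starts a job
        } else if (predJobs.size() == 1) {
            target = predJobs.get(0);           // merge into the predecessor's job
        } else {
            target = newJob();                  // merge several input jobs into one
            for (Job j : predJobs) {
                target.ops.addAll(j.ops);
                jobs.remove(j);
            }
            for (Map.Entry<Op, Job> e : compiled.entrySet()) {
                if (predJobs.contains(e.getValue())) {
                    e.setValue(target);
                }
            }
        }
        if (op.blocking && !op.preds.isEmpty()) {
            // Job boundary: close the current job with a store, open a new one with a load.
            String tmp = "tmp" + (tmpCount++);
            target.ops.add("store:" + tmp);
            target = newJob();
            target.ops.add("load:" + tmp);
        }
        target.ops.add(op.name);
        compiled.put(op, target);
        return target;
    }

    private Job newJob() {
        Job j = new Job();
        jobs.add(j);
        return j;
    }

    public static void main(String[] args) {
        Op load = new Op("load", false);
        Op filter = new Op("filter", false, load);
        Op sort = new Op("sort", true, filter);
        Op store = new Op("store", false, sort);
        ToyPlanCompiler c = new ToyPlanCompiler();
        c.compile(store);
        for (Job j : c.jobs) {
            System.out.println(j.ops);
        }
    }
}
```

Compiling from the leaf store operator walks back to the load first, so the non-blocking filter is merged into the load's job and only the blocking operator opens a second job. Note that the toy merge copies nothing but the operator list; in the real compiler, additional per-plan state such as requestedParallelism has to be carried over explicitly, as the description above warns.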
Method Summary
MROperPlan compile()
    The front-end method that the user calls to compile the plan.
CompilationMessageCollector getMessageCollector()
MROperPlan getMRPlan()
    Used to get the compiled plan
PhysicalPlan getPlan()
    Used to get the plan that was compiled
Pair<MapReduceOper,Integer> getQuantileJob(POSort inpSort, MapReduceOper prevJob, FileSpec lFile, FileSpec quantFile, int rp, Pair<Integer,Byte>[] fields)
MapReduceOper getSortJob(POSort sort, MapReduceOper quantJob, FileSpec lFile, FileSpec quantFile, int rp, Pair<Integer,Byte>[] fields)
void randomizeFileLocalizer()
void simpleConnectMapToReduce(MapReduceOper mro)
void visitDistinct(PODistinct op)
void visitFilter(POFilter op)
void visitFRJoin(POFRJoin op)
    This operator has multiple inputs (as many as there are join inputs). The compiler prunes off all inputs except the fragment input, creates a separate MR job for each replicated input, and uses their outputs as the replicated files configured in the POFRJoin operator.
void visitGlobalRearrange(POGlobalRearrange op)
void visitLimit(POLimit op)
void visitLoad(POLoad op)
void visitLocalRearrange(POLocalRearrange op)
void visitPackage(POPackage op)
void visitPOForEach(POForEach op)
void visitSort(POSort op)
void visitSplit(POSplit op)
    Compiles a split operator.
void visitStore(POStore op)
void visitStream(POStream op)
void visitUnion(POUnion op)
Methods inherited from class org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhyPlanVisitor
visitAdd, visitAnd, visitBinCond, visitCast, visitCogroup, visitCombinerPackage, visitComparisonFunc, visitConstant, visitCross, visitDemux, visitDivide, visitEqualTo, visitGreaterThan, visitGTOrEqual, visitIsNull, visitJoinPackage, visitLessThan, visitLocalRearrangeForIllustrate, visitLTOrEqual, visitMapLookUp, visitMod, visitMultiply, visitMultiQueryPackage, visitNegative, visitNot, visitNotEqualTo, visitOr, visitPOOptimizedForEach, visitPreCombinerLocalRearrange, visitProject, visitRead, visitRegexp, visitSplit, visitSubtract, visitUserFunc
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
USER_COMPARATOR_MARKER
public static String USER_COMPARATOR_MARKER
MRCompiler
public MRCompiler(PhysicalPlan plan)
throws MRCompilerException
- Throws:
MRCompilerException
MRCompiler
public MRCompiler(PhysicalPlan plan,
PigContext pigContext)
throws MRCompilerException
- Throws:
MRCompilerException
randomizeFileLocalizer
public void randomizeFileLocalizer()
getMRPlan
public MROperPlan getMRPlan()
- Used to get the compiled plan
- Returns:
- map reduce plan built by the compiler
getPlan
public PhysicalPlan getPlan()
- Used to get the plan that was compiled
- Overrides:
getPlan
in class PlanVisitor<PhysicalOperator,PhysicalPlan>
- Returns:
- physical plan
getMessageCollector
public CompilationMessageCollector getMessageCollector()
compile
public MROperPlan compile()
throws IOException,
PlanException,
VisitorException
- The front-end method that the user calls to compile
the plan. Assumes that every submitted plan has Store
operators as its leaves.
- Returns:
- A map reduce plan
- Throws:
IOException
PlanException
VisitorException
visitSplit
public void visitSplit(POSplit op)
throws VisitorException
- Compiles a split operator. The split job is closed by
replacing the split operator with a store. A new map MROper
is then created and returned as the current MROper, into
which subsequent operators are compiled. The new MROper is
connected to the split job by a store-load pair. The split
operator is also added to the splitsSeen map.
- Overrides:
visitSplit
in class PhyPlanVisitor
- Parameters:
op - The split operator
- Throws:
VisitorException
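The split handling described above can be sketched with a toy model. This is a minimal illustration, not Pig's code: ToySplitCompiler and Job are hypothetical types, the temporary file naming is invented, and splitsSeen is modelled as a plain map from a split key to its temporary file, which is an assumption about its shape.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these types are Pig classes. */
final class ToySplitCompiler {

    /** A job with a map plan (operator names) and the jobs it depends on. */
    static final class Job {
        final List<String> mapPlan = new ArrayList<>();
        final List<Job> predecessors = new ArrayList<>();
    }

    /** Hypothetical stand-in for splitsSeen: split key -> temporary file. */
    final Map<String, String> splitsSeen = new HashMap<>();

    /**
     * Closes splitJob with a store and returns a new map job that starts
     * with a load of the same temporary file.
     */
    Job visitSplit(Job splitJob, String splitKey) {
        String tmpFile = "tmp-" + splitKey;           // invented naming scheme
        splitJob.mapPlan.add("store(" + tmpFile + ")");

        Job next = new Job();
        next.mapPlan.add("load(" + tmpFile + ")");
        next.predecessors.add(splitJob);              // store-load connection

        splitsSeen.put(splitKey, tmpFile);
        return next;                                  // becomes the current job
    }
}
```

Operators that follow the split would be added to the returned job, while the job that contained the split ends at the store.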
visitLoad
public void visitLoad(POLoad op)
throws VisitorException
- Overrides:
visitLoad
in class PhyPlanVisitor
- Throws:
VisitorException
visitStore
public void visitStore(POStore op)
throws VisitorException
- Overrides:
visitStore
in class PhyPlanVisitor
- Throws:
VisitorException
visitFilter
public void visitFilter(POFilter op)
throws VisitorException
- Overrides:
visitFilter
in class PhyPlanVisitor
- Throws:
VisitorException
visitStream
public void visitStream(POStream op)
throws VisitorException
- Overrides:
visitStream
in class PhyPlanVisitor
- Throws:
VisitorException
simpleConnectMapToReduce
public void simpleConnectMapToReduce(MapReduceOper mro)
throws PlanException
- Throws:
PlanException
visitLimit
public void visitLimit(POLimit op)
throws VisitorException
- Overrides:
visitLimit
in class PhyPlanVisitor
- Throws:
VisitorException
visitLocalRearrange
public void visitLocalRearrange(POLocalRearrange op)
throws VisitorException
- Overrides:
visitLocalRearrange
in class PhyPlanVisitor
- Throws:
VisitorException
visitPOForEach
public void visitPOForEach(POForEach op)
throws VisitorException
- Overrides:
visitPOForEach
in class PhyPlanVisitor
- Throws:
VisitorException
visitGlobalRearrange
public void visitGlobalRearrange(POGlobalRearrange op)
throws VisitorException
- Overrides:
visitGlobalRearrange
in class PhyPlanVisitor
- Throws:
VisitorException
visitPackage
public void visitPackage(POPackage op)
throws VisitorException
- Overrides:
visitPackage
in class PhyPlanVisitor
- Throws:
VisitorException
visitUnion
public void visitUnion(POUnion op)
throws VisitorException
- Overrides:
visitUnion
in class PhyPlanVisitor
- Throws:
VisitorException
visitFRJoin
public void visitFRJoin(POFRJoin op)
throws VisitorException
- This operator has multiple inputs (as many as there are join inputs).
The compiler prunes off all inputs except the fragment input, creates a
separate MR job for each replicated input, and uses their outputs as the
replicated files configured in the POFRJoin operator. It also marks the
job as an FRJoin job and sets some parameters associated with it.
- Overrides:
visitFRJoin
in class PhyPlanVisitor
- Throws:
VisitorException
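The pruning described above can be sketched with a toy model. This is an illustration under stated assumptions, not Pig's implementation: ToyFRJoinCompiler, Job, the frJoin flag and the replicatedFiles list are hypothetical, and the file names are invented.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; none of these types are Pig classes. */
final class ToyFRJoinCompiler {

    /** A job with its operator names, plus FRJoin bookkeeping. */
    static final class Job {
        final List<String> ops = new ArrayList<>();
        final List<String> replicatedFiles = new ArrayList<>();
        boolean frJoin = false;
    }

    /**
     * inputJobs holds one already-compiled job per join input;
     * fragment is the index of the fragment input.
     */
    static Job compileFRJoin(List<Job> inputJobs, int fragment) {
        Job fragmentJob = inputJobs.get(fragment);
        for (int i = 0; i < inputJobs.size(); i++) {
            if (i == fragment) {
                continue;
            }
            // Each replicated input stays a separate job that ends in a store.
            String file = "repl" + i;                 // invented file name
            inputJobs.get(i).ops.add("store:" + file);
            fragmentJob.replicatedFiles.add(file);
        }
        // Only the fragment input feeds the join operator directly.
        fragmentJob.ops.add("frjoin");
        fragmentJob.frJoin = true;
        return fragmentJob;
    }
}
```

With three inputs and the first as the fragment, the fragment job ends up holding the join and two replicated file names, while the other two jobs each end in a store.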
visitDistinct
public void visitDistinct(PODistinct op)
throws VisitorException
- Overrides:
visitDistinct
in class PhyPlanVisitor
- Throws:
VisitorException
visitSort
public void visitSort(POSort op)
throws VisitorException
- Overrides:
visitSort
in class PhyPlanVisitor
- Throws:
VisitorException
getSortJob
public MapReduceOper getSortJob(POSort sort,
MapReduceOper quantJob,
FileSpec lFile,
FileSpec quantFile,
int rp,
Pair<Integer,Byte>[] fields)
throws PlanException
- Throws:
PlanException
getQuantileJob
public Pair<MapReduceOper,Integer> getQuantileJob(POSort inpSort,
MapReduceOper prevJob,
FileSpec lFile,
FileSpec quantFile,
int rp,
Pair<Integer,Byte>[] fields)
throws PlanException,
VisitorException
- Throws:
PlanException
VisitorException
Copyright © ${year} The Apache Software Foundation