org.apache.pig.impl.logicalLayer
Class LOCogroup

java.lang.Object
  extended by org.apache.pig.impl.plan.Operator<LOVisitor>
      extended by org.apache.pig.impl.logicalLayer.LogicalOperator
          extended by org.apache.pig.impl.logicalLayer.RelationalOperator
              extended by org.apache.pig.impl.logicalLayer.LOCogroup
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Operator>

public class LOCogroup
extends RelationalOperator

See Also:
Serialized Form

Nested Class Summary
static class LOCogroup.GROUPTYPE
          Enum for the type of group
 
Field Summary
 
Fields inherited from class org.apache.pig.impl.logicalLayer.LogicalOperator
mAlias, mIsProjectionMapComputed, mIsSchemaComputed, mPlan, mProjectionMap, mRequestedParallelism, mSchema, mType
 
Fields inherited from class org.apache.pig.impl.plan.Operator
mKey
 
Constructor Summary
LOCogroup(LogicalPlan plan, OperatorKey k, MultiMap<LogicalOperator,LogicalPlan> groupByPlans, boolean[] isInner)
           
LOCogroup(LogicalPlan plan, OperatorKey k, MultiMap<LogicalOperator,LogicalPlan> groupByPlans, LOCogroup.GROUPTYPE type, boolean[] isInner)
           
 
Method Summary
protected  Object clone()
           
 byte getAtomicGroupByType()
          Gets the merged type of the output group column. Valid only when the group column is of an atomic type. TODO: This does not work when grouping by a complex type.
 MultiMap<LogicalOperator,LogicalPlan> getGroupByPlans()
           
 LOCogroup.GROUPTYPE getGroupType()
           
 boolean[] getInner()
           
 List<LogicalOperator> getInputs()
           
 ProjectionMap getProjectionMap()
          Produce a map describing how this operator modifies its projection.
 List<RequiredFields> getRelevantInputs(int output, int column)
          Get relevant input columns of a particular output column.
 List<RequiredFields> getRequiredFields()
          Get a list of fields that this operator requires.
 Schema getSchema()
          Get a copy of the schema for the output of this operator.
 Schema getTupleGroupBySchema()
           
 boolean isTupleGroupCol()
           
 String name()
           
 boolean pruneColumns(List<Pair<Integer,Integer>> columns)
           
 void rewire(Operator<LOVisitor> oldPred, int oldPredIndex, Operator<LOVisitor> newPred, boolean useOldPred)
          Make any necessary changes to a node based on a change of position in the plan.
 void setGroupByPlans(MultiMap<LogicalOperator,LogicalPlan> groupByPlans)
           
 void setInner(boolean[] inner)
           
 boolean supportsMultipleInputs()
          Indicates whether this operator supports multiple inputs.
 void switchGroupByPlanOp(LogicalOperator oldOp, LogicalOperator newOp)
          Switches the mapping oldOp -> list of inner plans to newOp -> list of inner plans. Useful when there is a structural change in the LogicalPlan.
 void unsetSchema()
          Unset the schema as if it had not been calculated.
 void visit(LOVisitor v)
          Visit this node with the provided visitor.
 
Methods inherited from class org.apache.pig.impl.logicalLayer.RelationalOperator
pruneColumnInPlan, regenerateProjectionMap, unsetProjectionMap
 
Methods inherited from class org.apache.pig.impl.logicalLayer.LogicalOperator
forceSchema, getAlias, getOperatorKey, getPlan, getRequestedParallelism, getType, reconcileSchema, regenerateSchema, setAlias, setCanonicalNames, setPlan, setRequestedParallelism, setSchema, setSchemaComputed, setType, supportsMultipleOutputs, toString
 
Methods inherited from class org.apache.pig.impl.plan.Operator
compareTo, equals, hashCode
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

LOCogroup

public LOCogroup(LogicalPlan plan,
                 OperatorKey k,
                 MultiMap<LogicalOperator,LogicalPlan> groupByPlans,
                 boolean[] isInner)
Parameters:
plan - LogicalPlan this operator is a part of.
k - OperatorKey for this operator
groupByPlans - the group by columns
isInner - indicates whether the cogroup is inner for each relation

LOCogroup

public LOCogroup(LogicalPlan plan,
                 OperatorKey k,
                 MultiMap<LogicalOperator,LogicalPlan> groupByPlans,
                 LOCogroup.GROUPTYPE type,
                 boolean[] isInner)
Parameters:
plan - LogicalPlan this operator is a part of.
k - OperatorKey for this operator
groupByPlans - the group by columns
type - the type of this group
isInner - indicates whether the cogroup is inner for each relation
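
The effect of the per-relation isInner flags can be illustrated with a small stand-alone sketch. It does not use any Pig classes: InnerFlagSketch and its cogroup method are hypothetical names, relations are modeled as plain maps from group key to rows, and the behavior shown (a group is dropped when a relation flagged inner contributes no rows for that key) is an assumption based on the Pig Latin INNER keyword, not a description of how LOCogroup is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not part of Pig. */
final class InnerFlagSketch {

    /**
     * Cogroups several relations by key.
     *
     * @param inputs  inputs.get(i) maps a group key to the rows of relation i with that key
     * @param isInner isInner[i] is true if relation i is an inner input
     * @return group key -> one bag of rows per relation; a key is skipped when
     *         any relation flagged inner has no rows for it
     */
    static Map<String, List<List<String>>> cogroup(
            List<Map<String, List<String>>> inputs, boolean[] isInner) {
        Map<String, List<List<String>>> out = new LinkedHashMap<>();
        for (Map<String, List<String>> input : inputs) {
            for (String key : input.keySet()) {
                if (out.containsKey(key)) {
                    continue; // group already built for this key
                }
                List<List<String>> bags = new ArrayList<>();
                boolean keep = true;
                for (int i = 0; i < inputs.size(); i++) {
                    List<String> bag = inputs.get(i).get(key);
                    if (bag == null) {
                        bag = Collections.emptyList();
                    }
                    if (isInner[i] && bag.isEmpty()) {
                        keep = false; // an inner relation has nothing for this key
                    }
                    bags.add(bag);
                }
                if (keep) {
                    out.put(key, bags);
                }
            }
        }
        return out;
    }
}
```

In this sketch, passing all-false flags keeps every key seen in any relation, while setting isInner[i] to true removes the keys for which relation i has an empty bag.
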
Method Detail

getInputs

public List<LogicalOperator> getInputs()

getGroupByPlans

public MultiMap<LogicalOperator,LogicalPlan> getGroupByPlans()

setGroupByPlans

public void setGroupByPlans(MultiMap<LogicalOperator,LogicalPlan> groupByPlans)

getInner

public boolean[] getInner()

setInner

public void setInner(boolean[] inner)

getGroupType

public LOCogroup.GROUPTYPE getGroupType()

name

public String name()
Specified by:
name in class Operator<LOVisitor>

supportsMultipleInputs

public boolean supportsMultipleInputs()
Description copied from class: Operator
Indicates whether this operator supports multiple inputs.

Specified by:
supportsMultipleInputs in class Operator<LOVisitor>
Returns:
true if it does, otherwise false.

getSchema

public Schema getSchema()
                 throws FrontendException
Description copied from class: LogicalOperator
Get a copy of the schema for the output of this operator.

Specified by:
getSchema in class LogicalOperator
Throws:
FrontendException

isTupleGroupCol

public boolean isTupleGroupCol()

visit

public void visit(LOVisitor v)
           throws VisitorException
Description copied from class: LogicalOperator
Visit this node with the provided visitor. This should only be called by the visitor class itself, never directly.

Specified by:
visit in class LogicalOperator
Parameters:
v - Visitor to visit with.
Throws:
VisitorException - if the visitor has a problem.

switchGroupByPlanOp

public void switchGroupByPlanOp(LogicalOperator oldOp,
                                LogicalOperator newOp)
Switches the mapping oldOp -> list of inner plans to newOp -> list of inner plans. This is useful when there is a structural change in the LogicalPlan.

Parameters:
oldOp - the old operator
newOp - the new operator
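
The re-keying described above can be sketched without Pig on the classpath. The example below is only an illustration: RekeySketch and switchKey are hypothetical names, and a java.util.Map from key to list stands in for Pig's MultiMap. It is not the actual body of switchGroupByPlanOp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not part of Pig. */
final class RekeySketch {

    /**
     * Moves the list of inner plans registered under oldOp so that it is
     * registered under newOp instead. Does nothing if oldOp has no entry.
     */
    static <K, V> void switchKey(Map<K, List<V>> plans, K oldOp, K newOp) {
        List<V> innerPlans = plans.remove(oldOp);
        if (innerPlans == null) {
            return; // oldOp was not an input of this operator
        }
        // Append to any plans already registered under newOp.
        plans.computeIfAbsent(newOp, k -> new ArrayList<>()).addAll(innerPlans);
    }
}
```

After switchKey(plans, oldOp, newOp) returns, the map no longer contains oldOp, and the plans previously stored under it are reachable through newOp.
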

unsetSchema

public void unsetSchema()
                 throws VisitorException
Description copied from class: LogicalOperator
Unset the schema as if it had not been calculated. This is used by anyone who reorganizes the tree and needs to have schemas recalculated.

Overrides:
unsetSchema in class LogicalOperator
Throws:
VisitorException

getAtomicGroupByType

public byte getAtomicGroupByType()
                          throws FrontendException
Gets the merged type of the output group column. This is valid only when the group column is of an atomic type. TODO: This does not work when grouping by a complex type.

Returns:
The type of the group by
Throws:
FrontendException

getTupleGroupBySchema

public Schema getTupleGroupBySchema()
                             throws FrontendException
Throws:
FrontendException

clone

protected Object clone()
                throws CloneNotSupportedException
Overrides:
clone in class LogicalOperator
Throws:
CloneNotSupportedException
See Also:
Do not use the clone method directly. Operators are cloned when logical plans are cloned using LogicalPlanCloner.

getProjectionMap

public ProjectionMap getProjectionMap()
Description copied from class: RelationalOperator
Produce a map describing how this operator modifies its projection.

Overrides:
getProjectionMap in class RelationalOperator
Returns:
ProjectionMap; null indicates that the operator does not know how the projection changes, for example a join of two inputs where one input does not have a schema.

getRequiredFields

public List<RequiredFields> getRequiredFields()
Description copied from class: RelationalOperator
Get a list of fields that this operator requires. This is not necessarily equivalent to the list of fields the operator projects. For example, a filter will project anything passed to it, but requires only the fields explicitly referenced in its filter expression.

Overrides:
getRequiredFields in class RelationalOperator
Returns:
List of RequiredFields; null indicates that the operator does not need any fields from its input.

rewire

public void rewire(Operator<LOVisitor> oldPred,
                   int oldPredIndex,
                   Operator<LOVisitor> newPred,
                   boolean useOldPred)
            throws PlanException
Description copied from class: Operator
Make any necessary changes to a node based on a change of position in the plan. This allows operators to rewire their projections, etc. when they are relocated in a plan.

Overrides:
rewire in class Operator<LOVisitor>
Parameters:
oldPred - Operator that was previously the predecessor.
oldPredIndex - position of the old predecessor in the list of predecessors
newPred - Operator that will now be the predecessor.
useOldPred - If true use oldPred's projection map for the rewire; otherwise use newPred's projection map
Throws:
PlanException

getRelevantInputs

public List<RequiredFields> getRelevantInputs(int output,
                                              int column)
                                       throws FrontendException
Description copied from class: RelationalOperator
Get the relevant input columns of a particular output column. The resulting input columns are only those that directly produce the output column. Input columns that the RelationalOperator as a whole needs, and that therefore contribute only indirectly to the output columns, are not counted; those are required columns.

eg1: A = load 'a' AS (a0, a1, a2); B = filter A by a0=='1';
The relevant input column for B.$1 is A.a1, because A.a1 directly generates B.$1. A.a0 is needed by the filter operator and is considered a required field of the relational operator.

eg2: A = load 'a' AS (a0, a1); B = load 'b' AS (b0, b1); C = join A by a0, B by b0;
The relevant input column for C.$0 is A.a0. The relevant input column for C.$1 is A.a1.

eg3: A = load 'a' AS (a0, a1); B = load 'b' AS (b0, b1); C = cogroup A by a0, B by b0;
The relevant input columns for C.$0 are A.a0 and B.b0. The relevant input columns for C.$1 are A.*. The relevant input columns for C.$2 are B.*.

eg4: A = load 'a' AS (a0, a1, a2); B = foreach A generate a1, a0+a2;
The relevant input column for B.$0 is A.a1. The relevant input columns for B.$1 are A.a0 and A.a2.

eg5: A = load 'a' AS (a0, a1, a2); B = foreach A generate a1, *;
The relevant input column for B.$0 is A.a1. The relevant input column for B.$1 is A.a0. The relevant input column for B.$2 is A.a1. The relevant input column for B.$3 is A.a2.

Specified by:
getRelevantInputs in class RelationalOperator
Parameters:
output - output index. Currently only LOSplit has an output other than 0
column - output column
Returns:
List of relevant input columns. null if Pig cannot determine the relevant inputs or if an error occurs
Throws:
FrontendException
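
The cogroup case (eg3) can be written out as a small stand-alone sketch. It is only an illustration of the mapping described above and does not use RequiredFields or any other Pig class: CogroupRelevantInputsSketch, relevantInputs, and the array-based parameters are all hypothetical. Output column 0 (the group column) maps to the group-by columns of every input, and output column i + 1 maps to all columns of input i.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration only; not part of Pig. */
final class CogroupRelevantInputsSketch {

    /**
     * @param groupCols   groupCols[i] holds the group-by column indexes of input i
     * @param inputWidths inputWidths[i] is the number of columns of input i
     * @param column      output column of the cogroup
     * @return one list of relevant column indexes per input (empty if the input
     *         is not relevant), or null if the output column does not exist
     */
    static List<List<Integer>> relevantInputs(
            int[][] groupCols, int[] inputWidths, int column) {
        int inputCount = inputWidths.length;
        if (column < 0 || column > inputCount) {
            return null; // cannot determine relevant inputs
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < inputCount; i++) {
            List<Integer> cols = new ArrayList<>();
            if (column == 0) {
                // Group column: produced by the group-by keys of every input.
                for (int c : groupCols[i]) {
                    cols.add(c);
                }
            } else if (column == i + 1) {
                // Bag column for input i: produced by all of its columns.
                for (int c = 0; c < inputWidths[i]; c++) {
                    cols.add(c);
                }
            }
            result.add(cols);
        }
        return result;
    }
}
```

For eg3, with groupCols {{0}, {0}} and inputWidths {2, 2}, column 0 yields column 0 of both A and B, column 1 yields both columns of A and nothing from B, and column 2 yields both columns of B and nothing from A.
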

pruneColumns

public boolean pruneColumns(List<Pair<Integer,Integer>> columns)
                     throws FrontendException
Overrides:
pruneColumns in class RelationalOperator
Throws:
FrontendException


Copyright © ${year} The Apache Software Foundation