org.apache.pig.impl.logicalLayer
Class RelationalOperator

java.lang.Object
  extended by org.apache.pig.impl.plan.Operator<LOVisitor>
      extended by org.apache.pig.impl.logicalLayer.LogicalOperator
          extended by org.apache.pig.impl.logicalLayer.RelationalOperator
All Implemented Interfaces:
Serializable, Cloneable, Comparable<Operator>
Direct Known Subclasses:
LOCogroup, LOCross, LODistinct, LOFilter, LOForEach, LOJoin, LOLimit, LOLoad, LOSort, LOSplit, LOSplitOutput, LOStore, LOStream, LOUnion

public abstract class RelationalOperator
extends LogicalOperator

See Also:
Serialized Form

Field Summary
 
Fields inherited from class org.apache.pig.impl.logicalLayer.LogicalOperator
mAlias, mIsProjectionMapComputed, mIsSchemaComputed, mPlan, mProjectionMap, mRequestedParallelism, mSchema, mType
 
Fields inherited from class org.apache.pig.impl.plan.Operator
mKey
 
Constructor Summary
RelationalOperator(LogicalPlan plan, OperatorKey k)
           
RelationalOperator(LogicalPlan plan, OperatorKey k, int rp)
           
 
Method Summary
 ProjectionMap getProjectionMap()
          Produce a map describing how this operator modifies its projection.
abstract  List<RequiredFields> getRelevantInputs(int output, int column)
          Get relevant input columns of a particular output column.
 List<RequiredFields> getRequiredFields()
          Get a list of fields that this operator requires.
 ProjectionMap regenerateProjectionMap()
          Regenerate the projection map by unsetting and then getting the projection map.
 void unsetProjectionMap()
          Unset the projection map as if it had not been calculated.
 
Methods inherited from class org.apache.pig.impl.logicalLayer.LogicalOperator
clone, forceSchema, getAlias, getOperatorKey, getPlan, getRequestedParallelism, getSchema, getType, reconcileSchema, regenerateSchema, setAlias, setCanonicalNames, setPlan, setRequestedParallelism, setSchema, setSchemaComputed, setType, supportsMultipleOutputs, toString, unsetSchema, visit
 
Methods inherited from class org.apache.pig.impl.plan.Operator
compareTo, equals, hashCode, name, rewire, supportsMultipleInputs
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

RelationalOperator

public RelationalOperator(LogicalPlan plan,
                          OperatorKey k,
                          int rp)
Parameters:
plan - Logical plan this operator is a part of.
k - Operator key to assign to this node.
rp - degree of requested parallelism with which to execute this node.

RelationalOperator

public RelationalOperator(LogicalPlan plan,
                          OperatorKey k)
Parameters:
plan - Logical plan this operator is a part of.
k - Operator key to assign to this node.
Method Detail

getProjectionMap

public ProjectionMap getProjectionMap()
Produce a map describing how this operator modifies its projection.

Overrides:
getProjectionMap in class Operator<LOVisitor>
Returns:
The ProjectionMap, or null if the operator does not know how the projection changes (for example, a join of two inputs where one input does not have a schema).

unsetProjectionMap

public void unsetProjectionMap()
Unset the projection map as if it had not been calculated. This is used by anyone who reorganizes the tree and needs to have projection maps recalculated.

Overrides:
unsetProjectionMap in class Operator<LOVisitor>

regenerateProjectionMap

public ProjectionMap regenerateProjectionMap()
Regenerate the projection map by unsetting and then getting the projection map.

Overrides:
regenerateProjectionMap in class Operator<LOVisitor>
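
The three projection-map methods read as a compute-once, invalidate-on-change cache. The sketch below illustrates that pattern in plain Java under stated assumptions. It is not Pig's implementation: `CachedMapSketch`, its `Supplier` argument, and `computeCount()` are hypothetical names introduced only for illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in, not a Pig class: caches a computed value the way the
// text describes the projection map being computed, unset, and regenerated.
public class CachedMapSketch<T> {
    private final Supplier<T> compute; // assumption: how the map is derived
    private T cached;
    private boolean computed = false;  // plays the role of an "is computed" flag
    private int computeCount = 0;      // only here to make recomputation visible

    public CachedMapSketch(Supplier<T> compute) {
        this.compute = compute;
    }

    // Analogous to getProjectionMap(): compute on first use, then reuse.
    public T get() {
        if (!computed) {
            cached = compute.get();
            computed = true;
            computeCount++;
        }
        return cached;
    }

    // Analogous to unsetProjectionMap(): forget the cached value.
    public void unset() {
        cached = null;
        computed = false;
    }

    // Analogous to regenerateProjectionMap(): unset, then get.
    public T regenerate() {
        unset();
        return get();
    }

    // Number of times the value was actually computed.
    public int computeCount() {
        return computeCount;
    }
}
```

In this sketch, repeated calls to `get()` compute the value once, while each `regenerate()` forces one more computation, which is the behavior code that reorganizes the tree would rely on.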

getRequiredFields

public List<RequiredFields> getRequiredFields()
Get a list of fields that this operator requires. This is not necessarily equivalent to the list of fields the operator projects. For example, a filter will project anything passed to it, but requires only the fields explicitly referenced in its filter expression.

Returns:
A list of RequiredFields, or null if the operator does not need any fields from its input.
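
The difference between projected and required fields for a filter can be sketched as follows. This is a simplified model, not Pig code: `FilterFieldsSketch` is a hypothetical class, and plain integer column indexes stand in for the richer `RequiredFields` type. The null return mirrors the convention described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of a filter operator, not Pig's LOFilter.
public class FilterFieldsSketch {
    private final int inputWidth;                // number of input columns
    private final SortedSet<Integer> referenced; // columns used in the condition

    public FilterFieldsSketch(int inputWidth, List<Integer> referencedColumns) {
        this.inputWidth = inputWidth;
        this.referenced = new TreeSet<>(referencedColumns);
    }

    // A filter projects every input column unchanged.
    public List<Integer> projectedColumns() {
        List<Integer> cols = new ArrayList<>();
        for (int i = 0; i < inputWidth; i++) {
            cols.add(i);
        }
        return cols;
    }

    // A filter requires only the columns its condition references.
    // Returns null when no input fields are needed, as in the text above.
    public List<Integer> requiredColumns() {
        if (referenced.isEmpty()) {
            return null;
        }
        return new ArrayList<>(referenced);
    }

    // Caller-side helper: treat the null convention as an empty list.
    public static List<Integer> orEmpty(List<Integer> cols) {
        return cols == null ? Collections.<Integer>emptyList() : cols;
    }
}
```

For an input of three columns filtered on column 0, `projectedColumns()` yields columns 0, 1, and 2, while `requiredColumns()` yields only column 0.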

getRelevantInputs

public abstract List<RequiredFields> getRelevantInputs(int output,
                                                       int column)
Get the input columns relevant to a particular output column. The returned input columns are only those that feed the output column directly. Input columns that the RelationalOperator as a whole needs, and that therefore contribute to the output columns only indirectly, are not counted; those are required fields (see getRequiredFields).

Example 1:
    A = load 'a' AS (a0, a1, a2);
    B = filter A by a0 == '1';
The relevant input column for B.$1 is A.a1, because A.a1 directly generates B.$1. A.a0 is needed by the filter operator itself, so it counts as a required field of the relational operator.

Example 2:
    A = load 'a' AS (a0, a1);
    B = load 'b' AS (b0, b1);
    C = join A by a0, B by b0;
The relevant input column for C.$0 is A.a0. The relevant input column for C.$1 is A.a1.

Example 3:
    A = load 'a' AS (a0, a1);
    B = load 'b' AS (b0, b1);
    C = cogroup A by a0, B by b0;
The relevant input columns for C.$0 are A.a0 and B.b0. The relevant input columns for C.$1 are A.*. The relevant input columns for C.$2 are B.*.

Example 4:
    A = load 'a' AS (a0, a1, a2);
    B = foreach A generate a1, a0+a2;
The relevant input column for B.$0 is A.a1. The relevant input columns for B.$1 are A.a0 and A.a2.

Example 5:
    A = load 'a' AS (a0, a1, a2);
    B = foreach A generate a1, *;
The relevant input column for B.$0 is A.a1, for B.$1 is A.a0, for B.$2 is A.a1, and for B.$3 is A.a2.

Parameters:
output - output index. Currently only LOSplit has outputs other than 0.
column - output column
Returns:
List of relevant input columns, or null if Pig cannot determine the relevant inputs or an error occurs.
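
As a rough illustration of the foreach case, the sketch below models each generated expression as the list of input column indexes it references and looks up the relevant inputs for one output column. It is a simplification under stated assumptions, not Pig's LOForEach: `ForEachRelevantInputsSketch` is a hypothetical class, integer indexes stand in for `RequiredFields`, and returning null for an unknown output or column is only meant to echo the contract described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a foreach operator, not Pig's LOForEach.
public class ForEachRelevantInputsSketch {
    // generateExprs.get(i) holds the input columns referenced by output column i.
    private final List<List<Integer>> generateExprs;

    public ForEachRelevantInputsSketch(List<List<Integer>> generateExprs) {
        this.generateExprs = generateExprs;
    }

    // Returns the input columns that directly produce the given output column,
    // or null when they cannot be determined in this toy model.
    public List<Integer> relevantInputs(int output, int column) {
        if (output != 0) {
            return null; // this sketch models a single-output operator only
        }
        if (column < 0 || column >= generateExprs.size()) {
            return null; // unknown output column
        }
        // Defensive copy so callers cannot modify the operator's state.
        return new ArrayList<>(generateExprs.get(column));
    }
}
```

For `B = foreach A generate a1, a0+a2;`, the operator would be built from the lists `[1]` and `[0, 2]`, so `relevantInputs(0, 0)` yields `[1]` (A.a1) and `relevantInputs(0, 1)` yields `[0, 2]` (A.a0 and A.a2).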


Copyright © ${year} The Apache Software Foundation