Package org.apache.pig.impl.logicalLayer

The logical operators that represent a pig script and tools for manipulating those operators.


Class Summary
BinaryExpressionOperator This abstract class represents the logical binary expression operator. A binary operator has two operands and an operator.
CanonicalNamer A visitor to walk the logical plan and give canonical names to fields.
CastFinder A visitor to track the casts in a plan.
ColumnPruner  
DotLOPrinter This class can print a logical plan in the DOT format.
ExpressionOperator  
LOAdd  
LOAnd  
LOBinCond  
LOCast  
LOCogroup  
LOConst  
LOCross  
LODefine  
LODistinct  
LODivide  
LOEqual  
LOFilter  
LOForEach  
LOGenerate  
LogicalOperator Parent for all Logical operators.
LogicalPlan  
LogicalPlanBuilder PlanBuilder class outputs a logical plan given a query String and a set of ValidIDs.
LogicalPlanCloneHelper LogicalPlanCloneHelper implements a visitor mechanism to clone a logical plan and then patch up the connections held within the operators of the logical plan.
LogicalPlanCloner LogicalPlanCloner provides the only mechanism for cloning a logical plan and hence the logical operators in the plan.
LOGreaterThan  
LOGreaterThanEqual  
LOIsNull  
LOJoin  
LOLesserThan  
LOLesserThanEqual  
LOLimit  
LOLoad  
LOMapLookup  
LOMod  
LOMultiply  
LONegative  
LONot  
LONotEqual  
LOOr  
LOPrinter A visitor mechanism printing out the logical plan.
LOProject LOProject is designed like a singly linked list. An example illustrates this: a = load 'input1' as (name, age); b = group a by name; foreach b generate a, a.name; The project operator occurs in two places in this script: in generate a and in a.name. In the first occurrence, we are projecting the elements of the bag a. To retrieve the bag, we need to project the second column ($1), that is, column number 1 using the zero-based index, from the input (the relation or bag b). In the second occurrence, we are projecting the first column ($0), or column number 0, from the bag a, which in turn is column number 1 in the relation b. This nesting is what gives the design its singly-linked-list nature. Because it is a singly linked list, the null pointer or sentinel is marked explicitly using the boolean variable mSentinel. The sentinel is set to true only when the input is a relational operator, which occurs when the innermost operator is created.
LORegexp  
LOSort  
LOSplit  
LOSplitOutput  
LOStore  
LOStream LOStream represents the specification of an external command to be executed in a Pig Query.
LOSubtract  
LOUnion  
LOUserFunc  
LOVisitor A visitor mechanism for navigating and operating on a tree of Logical Operators.
PlanSetter A visitor to set plans correctly inside logical operators.
ProjectFixerUpper A class to visit all the projects and change them to attach to a new node.
ProjectionMapCalculator A visitor to calculate all the projection maps in a logical plan.
ProjectionMapRemover A visitor to reset all the projection maps in a logical plan.
ProjectStarTranslator A visitor to walk operators that contain a nested plan and translate project( * ) operators to a list of projection operators, i.e., project( * ) -> project(0), project(1), ...
RelationalOperator  
RemoveRedundantOperators A visitor to remove redundant operators in a plan.
TopLevelProjectFinder A visitor to track the top-level projection operators in a plan.
UDFFinder A visitor to track the UDFs in a plan.
UnaryExpressionOperator This abstract class represents the logical unary expression operator. A unary operator has one operand and an operator.
 

Enum Summary
LOCogroup.GROUPTYPE Enum for the type of group
LOJoin.JOINTYPE Enum for the type of join
 

Exception Summary
FrontendException  
 

Package org.apache.pig.impl.logicalLayer Description

The logical operators that represent a pig script and tools for manipulating those operators. The logical layer contains the logical operators themselves, as well as validators that check the logical plan, an optimizer, and a general visitor utility for working with the logical plans.

Design

Logical operators use the operator, plan, visitor, and optimizer framework provided by the org.apache.pig.impl.plan package.

Logical operators consist of both relational and expression operators. Relational operators work on an entire bag. Expression operators work on an element of a tuple (which may also be a bag). Due to Pig's nested data and execution model, the distinction between relational and expression operators is not always clear, and some operators, such as LOProject, function as both.

In a traditional database system, a query execution plan is constructed from relational operators, such as project, filter, sort, aggregate, and join. Each of these may contain an expression tree, made up of expression operators. For example, consider the SQL query select a from T where a = 5;. The where clause would be represented by a filter operator with an expression tree for a = 5.
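
As a rough illustration of the traditional arrangement described above, the following sketch models a filter operator that owns an expression tree for a = 5. Every name in it (FilterSketch, Expr, Column, Const, Equal, Filter) is hypothetical and is not part of Pig or of any database engine; rows are modeled as plain maps only to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical toy model; none of these classes exist in Pig.
class FilterSketch {
    /** An expression operator: evaluated against one row at a time. */
    interface Expr {
        Object eval(Map<String, Object> row);
    }

    /** Leaf expression that reads a named column from the row. */
    static final class Column implements Expr {
        private final String name;
        Column(String name) { this.name = name; }
        @Override public Object eval(Map<String, Object> row) { return row.get(name); }
    }

    /** Leaf expression that always yields the same constant. */
    static final class Const implements Expr {
        private final Object value;
        Const(Object value) { this.value = value; }
        @Override public Object eval(Map<String, Object> row) { return value; }
    }

    /** Binary expression: true when both operands evaluate to equal values. */
    static final class Equal implements Expr {
        private final Expr lhs;
        private final Expr rhs;
        Equal(Expr lhs, Expr rhs) { this.lhs = lhs; this.rhs = rhs; }
        @Override public Object eval(Map<String, Object> row) {
            return Objects.equals(lhs.eval(row), rhs.eval(row));
        }
    }

    /** A relational operator: consumes a whole input and holds an expression tree. */
    static final class Filter {
        private final Expr predicate;
        Filter(Expr predicate) { this.predicate = predicate; }

        List<Map<String, Object>> apply(List<Map<String, Object>> input) {
            List<Map<String, Object>> out = new ArrayList<>();
            for (Map<String, Object> row : input) {
                if (Boolean.TRUE.equals(predicate.eval(row))) {
                    out.add(row);
                }
            }
            return out;
        }
    }

    /** Builds a one-column row with column "a". */
    static Map<String, Object> row(int a) {
        Map<String, Object> r = new HashMap<>();
        r.put("a", a);
        return r;
    }

    /** Runs "where a = 5" over three sample rows and returns the match count. */
    static int countMatches() {
        List<Map<String, Object>> t = new ArrayList<>();
        t.add(row(5));
        t.add(row(7));
        t.add(row(5));
        Filter where = new Filter(new Equal(new Column("a"), new Const(5)));
        return where.apply(t).size();
    }

    public static void main(String[] args) {
        System.out.println(countMatches());
    }
}
```

The point of the sketch is the containment relationship: the relational operator holds only expression operators, never another relational operator.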

Pig takes a similar approach, except that the operators contained inside a relational operator may also be relational. For example, consider a foreach statement that has a nested script, such as foreach B { C = distinct $1; generate group, COUNT(C);}. This foreach needs to contain not just an expression tree but also the distinct relational operator. For this reason, Pig's relational operators do not contain expression trees. Instead, they contain one or more LogicalPlans themselves. This allows Pig to nest the logical plan arbitrarily. In this sense Pig is more similar to a traditional procedural language, where certain statements (e.g. if, while) can contain any other statement in the language, than to SQL, where statement execution tends to be more linear.
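
The nesting described above can be pictured with a small toy model in which an operator may own further plans. This is only a sketch: Plan and Operator are hypothetical stand-ins rather than Pig's LogicalPlan and LogicalOperator, plans are simplified to linear lists, and the way the nested foreach is split into two inner plans is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; these are not Pig's LogicalPlan or LogicalOperator classes.
class NestedPlanSketch {
    /** A plan, simplified here to an ordered list of operators. */
    static final class Plan {
        final List<Operator> operators = new ArrayList<>();
        Plan add(Operator op) { operators.add(op); return this; }
    }

    /** An operator that may itself contain any number of inner plans. */
    static final class Operator {
        final String name;
        final List<Plan> innerPlans = new ArrayList<>();
        Operator(String name) { this.name = name; }
        Operator nest(Plan inner) { innerPlans.add(inner); return this; }
    }

    /** Nesting depth: 1 for a flat plan, plus 1 per level of inner plans. */
    static int depth(Plan plan) {
        int deepestInner = 0;
        for (Operator op : plan.operators) {
            for (Plan inner : op.innerPlans) {
                deepestInner = Math.max(deepestInner, depth(inner));
            }
        }
        return 1 + deepestInner;
    }

    /** Counts operators in the plan and, recursively, in all inner plans. */
    static int countOperators(Plan plan) {
        int count = 0;
        for (Operator op : plan.operators) {
            count++;
            for (Plan inner : op.innerPlans) {
                count += countOperators(inner);
            }
        }
        return count;
    }

    /** Models: foreach B { C = distinct $1; generate group, COUNT(C); } */
    static Plan example() {
        // Assumed decomposition: one inner plan per generated column.
        Plan groupPlan = new Plan().add(new Operator("project group"));
        Plan countPlan = new Plan()
                .add(new Operator("project $1"))
                .add(new Operator("distinct"))   // a relational operator inside foreach
                .add(new Operator("COUNT"));
        Operator forEachOp = new Operator("foreach").nest(groupPlan).nest(countPlan);
        return new Plan()
                .add(new Operator("load"))
                .add(new Operator("group"))
                .add(forEachOp);
    }

    public static void main(String[] args) {
        Plan p = example();
        System.out.println("depth=" + depth(p) + " operators=" + countOperators(p));
    }
}
```

Because inner plans are ordinary plans, any traversal over them has to recurse, which is why both helper methods call themselves on every inner plan.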

Notes

Heads up to developers: when adding a new logical operator to the plan, there are a number of classes that need to know about every type of operator. These include PlanSetter, SchemaRemover, SchemaCalculator, and LogicalTransformer.
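
One way to see why such classes must be updated is a minimal visitor sketch, given here under stated assumptions: the names below (VisitorSketch, Op, Visitor, Load, Filter, Limit, NameCollector) are hypothetical and do not reproduce Pig's LOVisitor API. A visitor with one visit method per operator type has no way to handle a new type until a matching method is added.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a per-operator-type visitor; not Pig's LOVisitor API.
class VisitorSketch {
    /** Every operator dispatches to the visit method for its own type. */
    interface Op {
        void accept(Visitor v);
    }

    /** Base visitor: one method per operator type, each a no-op by default. */
    static abstract class Visitor {
        void visit(Load op) { }
        void visit(Filter op) { }
        // Adding the Limit operator required adding this method as well.
        void visit(Limit op) { }
    }

    static final class Load implements Op {
        @Override public void accept(Visitor v) { v.visit(this); }
    }

    static final class Filter implements Op {
        @Override public void accept(Visitor v) { v.visit(this); }
    }

    /** The "new" operator in this sketch. */
    static final class Limit implements Op {
        @Override public void accept(Visitor v) { v.visit(this); }
    }

    /** A concrete visitor that records the type name of each operator it sees. */
    static final class NameCollector extends Visitor {
        final List<String> names = new ArrayList<>();
        @Override void visit(Load op) { names.add("Load"); }
        @Override void visit(Filter op) { names.add("Filter"); }
        @Override void visit(Limit op) { names.add("Limit"); }
    }

    /** Walks a three-operator sequence and returns the collected names. */
    static String describe() {
        List<Op> ops = new ArrayList<>();
        ops.add(new Load());
        ops.add(new Filter());
        ops.add(new Limit());
        NameCollector collector = new NameCollector();
        for (Op op : ops) {
            op.accept(collector);
        }
        return String.join(" ", collector.names);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If NameCollector did not override visit(Limit), the inherited no-op would silently skip the new operator, which is the kind of omission the note above warns about.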



Copyright © ${year} The Apache Software Foundation