org.apache.hadoop.hive.ql.exec
Class LateralViewJoinOperator
java.lang.Object
org.apache.hadoop.hive.ql.exec.Operator<lateralViewJoinDesc>
org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator
- All Implemented Interfaces:
- Serializable, Node
public class LateralViewJoinOperator
- extends Operator<lateralViewJoinDesc>
The lateral view join operator is used to implement the lateral view
functionality. This operator was implemented with the following
operator DAG in mind. For a query such as
SELECT pageid, adid.* FROM example_table LATERAL VIEW explode(adid_list) AS adid
The top of the operator tree will look similar to
              [Table Scan]
                /     \
      [Select](*)     [Select](adid_list)
           |               |
           |          [UDTF] (explode)
            \             /
          [Lateral View Join]
                   |
                   |
         [Select] (pageid, adid.*)
                   |
                  ....
Rows from the table scan (TS) operator are first sent to two select operators.
The select operator on the left picks all the columns, while the select
operator on the right picks only the columns needed by the UDTF.
The output of the select in the left branch and the output of the UDTF in the
right branch are then sent to the lateral view join (LVJ). In most cases, the
UDTF generates more than one row for every row received from the TS, while the
left select operator generates exactly one. For each row output by the TS,
the LVJ outputs every row that can be created by joining the row from
the left select with one of the rows output by the UDTF.
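The join described above can be illustrated with a minimal sketch. It is not the Hive implementation: the class `LateralJoinSketch`, its `join` method, and the use of `List<Object>` to model a row are all hypothetical and exist only to show how one left row is paired with each UDTF row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the LVJ output rule; not part of Hive.
class LateralJoinSketch {

    // Pairs the single row from the left select with every row the UDTF
    // produced for the same table scan row.
    static List<List<Object>> join(List<Object> leftRow, List<List<Object>> udtfRows) {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> udtfRow : udtfRows) {
            // Output row = all left columns followed by the UDTF columns.
            List<Object> joined = new ArrayList<>(leftRow);
            joined.addAll(udtfRow);
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        // One table scan row: pageid = "front_page", adid_list = [1, 2, 3].
        List<Object> leftRow =
            Arrays.<Object>asList("front_page", Arrays.asList(1, 2, 3));
        // explode(adid_list) yields one single-column row per list element.
        List<List<Object>> udtfRows = Arrays.asList(
            Arrays.<Object>asList(1),
            Arrays.<Object>asList(2),
            Arrays.<Object>asList(3));
        for (List<Object> row : join(leftRow, udtfRows)) {
            System.out.println(row);
        }
    }
}
```

In this sketch, a table scan row whose UDTF output is empty contributes no joined rows at all.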
Additional lateral views can be supported by adding a similar DAG after the
previous LVJ operator.
- See Also:
- Serialized Form
Fields inherited from class org.apache.hadoop.hive.ql.exec.Operator
alias, beginTime, childOperators, childOperatorsArray, childOperatorsTag, colExprMap, conf, counterNames, counterNameToEnum, counters, done, fatalErrorCntr, id, inputObjInspectors, inputRows, LOG, numInputRowsCntr, numOutputRowsCntr, operatorId, out, outputObjInspector, outputRows, parentOperators, reporter, state, statsMap, timeTakenCntr, totalTime
Method Summary
protected void
initializeOp(org.apache.hadoop.conf.Configuration hconf)
Operator specific initialization.
void
processOp(Object row,
int tag)
An important assumption for processOp() is that for a given row from the
TS, the LVJ will first get the row from the left select operator, followed
by all the corresponding rows from the UDTF operator.
Methods inherited from class org.apache.hadoop.hive.ql.exec.Operator
areAllParentsInitialized, assignCounterNameToEnum, checkFatalErrors, close, closeOp, dump, dump, endGroup, fatalErrorMessage, forward, getChildOperators, getChildren, getColumnExprMap, getConf, getCounterNames, getCounterNameToEnum, getCounters, getDone, getIdentifier, getName, getOperatorId, getParentOperators, getSchema, getStats, getType, incrCounter, initEvaluators, initEvaluatorsAndReturnStruct, initialize, initializeChildren, initializeCounters, initOperatorId, jobClose, logStats, preorderMap, process, removeChild, replaceChild, replaceParent, resetId, resetLastEnumUsed, resetStats, setAlias, setChildOperators, setColumnExprMap, setConf, setCounterNames, setCounterNameToEnum, setDone, setId, setOperatorId, setOutputCollector, setParentOperators, setReporter, setSchema, startGroup, updateCounters
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
LateralViewJoinOperator
public LateralViewJoinOperator()
initializeOp
protected void initializeOp(org.apache.hadoop.conf.Configuration hconf)
throws HiveException
- Description copied from class:
Operator
- Operator specific initialization.
- Overrides:
initializeOp
in class Operator<lateralViewJoinDesc>
- Throws:
HiveException
processOp
public void processOp(Object row,
int tag)
throws HiveException
- An important assumption for processOp() is that for a given row from the
TS, the LVJ will first get the row from the left select operator, followed
by all the corresponding rows from the UDTF operator. The same sequence then
repeats for the next row from the TS.
- Specified by:
processOp
in class Operator<lateralViewJoinDesc>
- Parameters:
row
- The object representing the row.
tag
- The tag of the row usually means which parent this row comes from.
Rows with the same tag should have exactly the same rowInspector all the time.
- Throws:
HiveException
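The ordering assumption behind processOp() can be sketched as a small stateful class. This is a hypothetical stand-in, not the Hive code: the class `TaggedJoinSketch`, the tag values 0 and 1, and the `forwarded` list (used here in place of forwarding rows to child operators) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a tag-driven join; not the Hive implementation.
class TaggedJoinSketch {
    // Assumed tag values for this sketch: 0 = left select, 1 = UDTF.
    static final int SELECT_TAG = 0;
    static final int UDTF_TAG = 1;

    // Most recent row received from the left select.
    private List<Object> currentLeftRow;
    // Collects output rows; stands in for forwarding to child operators.
    final List<List<Object>> forwarded = new ArrayList<>();

    void process(List<Object> row, int tag) {
        if (tag == SELECT_TAG) {
            // Remember the left row; the matching UDTF rows arrive next.
            currentLeftRow = row;
        } else if (tag == UDTF_TAG) {
            if (currentLeftRow == null) {
                throw new IllegalStateException("UDTF row arrived before any select row");
            }
            // Join the remembered left row with this UDTF row.
            List<Object> joined = new ArrayList<>(currentLeftRow);
            joined.addAll(row);
            forwarded.add(joined);
        } else {
            throw new IllegalArgumentException("Unexpected tag: " + tag);
        }
    }

    public static void main(String[] args) {
        TaggedJoinSketch lvj = new TaggedJoinSketch();
        // First table scan row: left row, then its two UDTF rows.
        lvj.process(Arrays.<Object>asList("page_a"), SELECT_TAG);
        lvj.process(Arrays.<Object>asList(10), UDTF_TAG);
        lvj.process(Arrays.<Object>asList(11), UDTF_TAG);
        // Second table scan row: left row, then its single UDTF row.
        lvj.process(Arrays.<Object>asList("page_b"), SELECT_TAG);
        lvj.process(Arrays.<Object>asList(20), UDTF_TAG);
        for (List<Object> row : lvj.forwarded) {
            System.out.println(row);
        }
    }
}
```

Because the sketch keeps only the latest left row, it is correct only when rows arrive in the order the assumption requires. If rows from two table scan rows were interleaved, UDTF rows would be joined with the wrong left row.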
Copyright © 2009 The Apache Software Foundation