org.apache.hadoop.hive.ql.optimizer
Class MapJoinProcessor

java.lang.Object
  extended by org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor
All Implemented Interfaces:
Transform

public class MapJoinProcessor
extends Object
implements Transform

Implementation of one of the rule-based map join optimizations. The user passes hints to specify map joins, and during this optimization all user-specified joins are converted to MapJoins: the reduce sink operators above the join are converted to map sink operators. In the future, once statistics are implemented, this transformation could also be driven by cost.


Nested Class Summary
static class MapJoinProcessor.CurrentMapJoin
          CurrentMapJoin.
static class MapJoinProcessor.Default
          Default.
static class MapJoinProcessor.MapJoinDefault
          MapJoinDefault.
static class MapJoinProcessor.MapJoinFS
          MapJoinFS.
static class MapJoinProcessor.MapJoinWalkerCtx
          MapJoinWalkerCtx.
 
Constructor Summary
MapJoinProcessor()
          empty constructor.
 
Method Summary
static void checkMapJoin(int mapJoinPos, JoinCondDesc[] condns)
           
static MapJoinOperator convertMapJoin(LinkedHashMap<Operator<? extends Serializable>,OpParseContext> opParseCtxMap, JoinOperator op, QBJoinTree joinTree, int mapJoinPos, boolean noCheckOuterJoin)
          convert a regular join to a map-side join.
 MapJoinOperator generateMapJoinOperator(ParseContext pctx, JoinOperator op, QBJoinTree joinTree, int mapJoinPos)
           
static String genMapJoinOpAndLocalWork(MapredWork newWork, JoinOperator op, int mapJoinPos)
           
static HashSet<Integer> getBigTableCandidates(JoinCondDesc[] condns)
          Get a list of big table candidates.
static NodeProcessor getCurrentMapJoin()
           
static NodeProcessor getDefault()
           
static NodeProcessor getMapJoinDefault()
           
static NodeProcessor getMapJoinFS()
           
 ParseContext transform(ParseContext pactx)
          Transform the query tree.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MapJoinProcessor

public MapJoinProcessor()
empty constructor.

Method Detail

genMapJoinOpAndLocalWork

public static String genMapJoinOpAndLocalWork(MapredWork newWork,
                                              JoinOperator op,
                                              int mapJoinPos)
                                       throws SemanticException
Throws:
SemanticException

convertMapJoin

public static MapJoinOperator convertMapJoin(LinkedHashMap<Operator<? extends Serializable>,OpParseContext> opParseCtxMap,
                                             JoinOperator op,
                                             QBJoinTree joinTree,
                                             int mapJoinPos,
                                             boolean noCheckOuterJoin)
                                      throws SemanticException
convert a regular join to a map-side join.

Parameters:
opParseCtxMap -
op - join operator
joinTree - qb join tree
mapJoinPos - position of the source that is read through the map-reduce framework; all other sources are cached in memory
noCheckOuterJoin -
Throws:
SemanticException
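
The role of mapJoinPos can be illustrated without any Hive types. The following is a minimal sketch of the general map-side join idea, not Hive's implementation: the class MapSideJoinSketch, its join method, and the {key, value} row layout are all hypothetical. One input (the "big" side, corresponding to the source at mapJoinPos) is streamed, while the other input is cached in an in-memory hash table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hive.
final class MapSideJoinSketch {
    // Inner-joins rows of the form {key, value}. The small side is loaded
    // into an in-memory hash table; the big side is streamed row by row.
    static List<String> join(List<String[]> bigRows, List<String[]> smallRows) {
        // Build phase: cache every small-side row under its join key.
        Map<String, List<String>> cached = new HashMap<>();
        for (String[] row : smallRows) {
            cached.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }

        // Probe phase: look up each streamed big-side row in the cache.
        List<String> out = new ArrayList<>();
        for (String[] row : bigRows) {
            List<String> matches = cached.get(row[0]);
            if (matches == null) {
                continue; // inner join: rows without a match are dropped
            }
            for (String match : matches) {
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }
}
```

Because only the cached side must fit in memory, the choice of which source is streamed matters; that choice is what mapJoinPos expresses.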

generateMapJoinOperator

public MapJoinOperator generateMapJoinOperator(ParseContext pctx,
                                               JoinOperator op,
                                               QBJoinTree joinTree,
                                               int mapJoinPos)
                                        throws SemanticException
Throws:
SemanticException

getBigTableCandidates

public static HashSet<Integer> getBigTableCandidates(JoinCondDesc[] condns)
Get a list of big table candidates. Only the tables in the returned set can be used as the big table in the join operation. The logic scans the join condition array from left to right and handles each join type as follows.
On an inner join: if bigTableCandidates is empty, or the most recently seen outer join is a right outer join, add each side of the inner join to bigTableCandidates, provided that side is not in a bad position.
On a left outer join: set lastSeenRightOuterJoin to false. If bigTableCandidates is empty, add the left side to it; otherwise do nothing (the existing candidates come from the left side).
On a right outer join: set lastSeenRightOuterJoin to true, clear bigTableCandidates, and add the right side to it. The right side of a right outer join always wins.
On a full outer join: return null immediately (no table can be the big table, so a map join is not possible).

Parameters:
condns -
Returns:
the set of big table candidates, or null if the join contains a full outer join
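
The scan described for getBigTableCandidates can be sketched as follows. This is a literal reading of the description above, not the Hive source, and the real implementation may differ in details. The types JoinKind and JoinCond are hypothetical stand-ins for JoinCondDesc, and BigTableCandidatesSketch is a hypothetical name. The description does not define "bad position"; the sketch assumes it means any position seen before the most recent right outer join other than that join's right side.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for Hive's JoinCondDesc; not the real Hive types.
enum JoinKind { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

final class JoinCond {
    final int left;      // position of the left input
    final int right;     // position of the right input
    final JoinKind kind;

    JoinCond(int left, int right, JoinKind kind) {
        this.left = left;
        this.right = right;
        this.kind = kind;
    }
}

final class BigTableCandidatesSketch {
    // Returns the positions that may serve as the big table, or null when a
    // full outer join rules out a map join.
    static Set<Integer> candidates(JoinCond[] conds) {
        Set<Integer> candidates = new HashSet<>();
        Set<Integer> seen = new HashSet<>();
        // Assumption: "bad positions" are those seen up to the most recent
        // right outer join, except that join's right side.
        Set<Integer> bad = new HashSet<>();
        boolean lastSeenRightOuterJoin = false;

        for (JoinCond c : conds) {
            seen.add(c.left);
            seen.add(c.right);
            switch (c.kind) {
                case FULL_OUTER:
                    return null; // no table can be the big table
                case LEFT_OUTER:
                    lastSeenRightOuterJoin = false;
                    if (candidates.isEmpty()) {
                        candidates.add(c.left);
                    }
                    break;
                case RIGHT_OUTER:
                    lastSeenRightOuterJoin = true;
                    bad.clear();
                    bad.addAll(seen);
                    bad.remove(c.right);
                    candidates.clear();
                    candidates.add(c.right); // the right side always wins
                    break;
                case INNER:
                    if (candidates.isEmpty() || lastSeenRightOuterJoin) {
                        if (!bad.contains(c.left)) {
                            candidates.add(c.left);
                        }
                        if (!bad.contains(c.right)) {
                            candidates.add(c.right);
                        }
                    }
                    break;
            }
        }
        return candidates;
    }
}
```

Under these assumptions, a right outer join of positions 0 and 1 followed by an inner join of positions 1 and 2 yields positions 1 and 2 as candidates, while a single left outer join of positions 0 and 1 yields only position 0.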

checkMapJoin

public static void checkMapJoin(int mapJoinPos,
                                JoinCondDesc[] condns)
                         throws SemanticException
Throws:
SemanticException

transform

public ParseContext transform(ParseContext pactx)
                       throws SemanticException
Transform the query tree. For each join, check if it is a map-side join (user specified). If yes, convert it to a map-side join.

Specified by:
transform in interface Transform
Parameters:
pactx - current parse context
Returns:
ParseContext
Throws:
SemanticException

getMapJoinFS

public static NodeProcessor getMapJoinFS()

getMapJoinDefault

public static NodeProcessor getMapJoinDefault()

getDefault

public static NodeProcessor getDefault()

getCurrentMapJoin

public static NodeProcessor getCurrentMapJoin()


Copyright © 2011 The Apache Software Foundation