org.apache.hadoop.hive.ql.exec
Class GroupByOperator

java.lang.Object
  extended by org.apache.hadoop.hive.ql.exec.Operator<groupByDesc>
      extended by org.apache.hadoop.hive.ql.exec.GroupByOperator
All Implemented Interfaces:
Serializable, Node

public class GroupByOperator
extends Operator<groupByDesc>
implements Serializable

GroupBy operator implementation.

See Also:
Serialized Form
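
As an illustration of what a hash-based group-by does, the following is a minimal, stand-alone sketch. It is not the Hive implementation and uses no Hive classes: the class name HashGroupBySketch, the fixed COUNT/SUM buffers, and the handling of the abort flag are all assumptions made for the example. Only the idea of a map from grouping keys to aggregation buffers (compare the hashAggregations field) and of forwarding all groups on close (compare closeOp) is taken from this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; none of these types are Hive classes.
class HashGroupBySketch {
    // Plays the role of hashAggregations: grouping keys -> one buffer per aggregate.
    // Buffer slot 0 holds COUNT(*), slot 1 holds SUM(value).
    private final Map<List<Object>, long[]> hashAggregations = new HashMap<>();

    // Rough analogue of process(row, tag): find or create the buffers
    // for the row's keys, then fold the row's value into them.
    void process(List<Object> keys, long value) {
        long[] aggs = hashAggregations.computeIfAbsent(
                new ArrayList<>(keys), k -> new long[2]);
        aggs[0] += 1;
        aggs[1] += value;
    }

    // Rough analogue of closeOp(abort): hand every group to the caller.
    // Skipping the output when abort is true is an assumption of this sketch.
    Map<List<Object>, List<Long>> close(boolean abort) {
        Map<List<Object>, List<Long>> forwarded = new HashMap<>();
        if (!abort) {
            for (Map.Entry<List<Object>, long[]> e : hashAggregations.entrySet()) {
                long[] aggs = e.getValue();
                forwarded.put(e.getKey(), Arrays.asList(aggs[0], aggs[1]));
            }
        }
        hashAggregations.clear();
        return forwarded;
    }
}
```

Feeding the rows ("a", 2), ("b", 7), ("a", 3) and then calling close(false) yields a count of 2 and a sum of 5 for key [a], and a count of 1 and a sum of 7 for key [b].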

Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.hive.ql.exec.Operator
Operator.OperatorFunc, Operator.State
 
Field Summary
protected  boolean[] aggregationIsDistinct
           
protected  ExprNodeEvaluator[][] aggregationParameterFields
           
protected  ObjectInspector[][] aggregationParameterObjectInspectors
           
protected  Object[][] aggregationParameterObjects
           
protected  ObjectInspector[][] aggregationParameterStandardObjectInspectors
           
protected  GenericUDAFEvaluator.AggregationBuffer[] aggregations
           
protected  Object[][] aggregationsParametersLastInvoke
           
protected  ObjectInspector[] currentKeyObjectInspectors
           
protected  ArrayList<Object> currentKeys
           
protected  HashMap<ArrayList<Object>,GenericUDAFEvaluator.AggregationBuffer[]> hashAggregations
           
protected  ExprNodeEvaluator[] keyFields
           
protected  ObjectInspector[] keyObjectInspectors
           
protected  Object[] keyObjects
           
protected  HashSet<ArrayList<Object>> keysCurrentGroup
           
protected  ArrayList<Object> newKeys
           
protected  ArrayList<ObjectInspector> objectInspectors
           
 
Fields inherited from class org.apache.hadoop.hive.ql.exec.Operator
alias, childOperators, childOperatorsArray, childOperatorsTag, colExprMap, conf, done, id, inputObjInspectors, out, outputObjInspector, parentOperators, reporter, state, statsMap
 
Constructor Summary
GroupByOperator()
           
 
Method Summary
 void closeOp(boolean abort)
          Forwards all aggregations to the child operators.
 void endGroup()
           
protected  void forward(ArrayList<Object> keys, GenericUDAFEvaluator.AggregationBuffer[] aggs)
          Forward a record of keys and aggregation results.
 List<String> genColLists(HashMap<Operator<? extends Serializable>,OpParseContext> opParseCtx)
           
 String getName()
          Implements the getName function for the Node Interface.
protected  void initializeOp(org.apache.hadoop.conf.Configuration hconf)
          Operator specific initialization.
protected  GenericUDAFEvaluator.AggregationBuffer[] newAggregations()
           
 void process(Object row, int tag)
          Process the row.
protected  void resetAggregations(GenericUDAFEvaluator.AggregationBuffer[] aggs)
           
 void startGroup()
           
protected  void updateAggregations(GenericUDAFEvaluator.AggregationBuffer[] aggs, Object row, ObjectInspector rowInspector, boolean hashAggr, boolean newEntryForHashAggr, Object[][] lastInvoke)
           
 
Methods inherited from class org.apache.hadoop.hive.ql.exec.Operator
areAllParentsInitialized, close, dump, forward, getChildOperators, getChildren, getColumnExprMap, getConf, getDone, getIdentifier, getParentOperators, getSchema, getStats, initEvaluators, initEvaluatorsAndReturnStruct, initialize, initializeChildren, jobClose, logStats, preorderMap, removeChild, replaceChild, replaceParent, resetStats, setAlias, setChildOperators, setColumnExprMap, setConf, setDone, setId, setOutputCollector, setParentOperators, setReporter, setSchema
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

keyFields

protected transient ExprNodeEvaluator[] keyFields

keyObjectInspectors

protected transient ObjectInspector[] keyObjectInspectors

keyObjects

protected transient Object[] keyObjects

aggregationParameterFields

protected transient ExprNodeEvaluator[][] aggregationParameterFields

aggregationParameterObjectInspectors

protected transient ObjectInspector[][] aggregationParameterObjectInspectors

aggregationParameterStandardObjectInspectors

protected transient ObjectInspector[][] aggregationParameterStandardObjectInspectors

aggregationParameterObjects

protected transient Object[][] aggregationParameterObjects

aggregationIsDistinct

protected transient boolean[] aggregationIsDistinct

objectInspectors

protected transient ArrayList<ObjectInspector> objectInspectors

currentKeys

protected transient ArrayList<Object> currentKeys

newKeys

protected transient ArrayList<Object> newKeys

aggregations

protected transient GenericUDAFEvaluator.AggregationBuffer[] aggregations

aggregationsParametersLastInvoke

protected transient Object[][] aggregationsParametersLastInvoke

hashAggregations

protected transient HashMap<ArrayList<Object>,GenericUDAFEvaluator.AggregationBuffer[]> hashAggregations

keysCurrentGroup

protected transient HashSet<ArrayList<Object>> keysCurrentGroup

currentKeyObjectInspectors

protected transient ObjectInspector[] currentKeyObjectInspectors
Constructor Detail

GroupByOperator

public GroupByOperator()
Method Detail

initializeOp

protected void initializeOp(org.apache.hadoop.conf.Configuration hconf)
                     throws HiveException
Description copied from class: Operator
Operator specific initialization.

Overrides:
initializeOp in class Operator<groupByDesc>
Throws:
HiveException

newAggregations

protected GenericUDAFEvaluator.AggregationBuffer[] newAggregations()
                                                            throws HiveException
Throws:
HiveException

resetAggregations

protected void resetAggregations(GenericUDAFEvaluator.AggregationBuffer[] aggs)
                          throws HiveException
Throws:
HiveException

updateAggregations

protected void updateAggregations(GenericUDAFEvaluator.AggregationBuffer[] aggs,
                                  Object row,
                                  ObjectInspector rowInspector,
                                  boolean hashAggr,
                                  boolean newEntryForHashAggr,
                                  Object[][] lastInvoke)
                           throws HiveException
Throws:
HiveException

startGroup

public void startGroup()
                throws HiveException
Overrides:
startGroup in class Operator<groupByDesc>
Throws:
HiveException

endGroup

public void endGroup()
              throws HiveException
Overrides:
endGroup in class Operator<groupByDesc>
Throws:
HiveException

process

public void process(Object row,
                    int tag)
             throws HiveException
Description copied from class: Operator
Process the row.

Specified by:
process in class Operator<groupByDesc>
Parameters:
row - The object representing the row.
tag - The tag of the row, which usually identifies the parent operator the row came from. Rows with the same tag should always have exactly the same rowInspector.
Throws:
HiveException
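
The currentKeys and newKeys fields suggest that rows can also be aggregated in a streaming fashion when the input arrives sorted by key. This page does not describe that logic, so the following is only a sketch under that assumption: the class SortedGroupBySketch, its single COUNT buffer, and its string output are hypothetical and do not mirror Hive's code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch of sorted (streaming) aggregation.
class SortedGroupBySketch {
    private List<Object> currentKeys;   // keys of the open group, null before the first row
    private long count;                 // single COUNT(*) buffer for the open group
    final List<String> forwarded = new ArrayList<>();

    // Called once per row; rows are assumed to arrive sorted by key.
    void process(List<Object> newKeys) {
        if (!newKeys.equals(currentKeys)) {
            // The key changed (or this is the first row): emit the finished
            // group, then reset the buffer for the new one.
            if (currentKeys != null) {
                forward();
            }
            currentKeys = new ArrayList<>(newKeys);
            count = 0;
        }
        count++;
    }

    // Emits the last open group, since no later key change will trigger it.
    void close() {
        if (currentKeys != null) {
            forward();
            currentKeys = null;
        }
    }

    private void forward() {
        forwarded.add(currentKeys + "=" + count);
    }
}
```

Only one group's buffers are held in memory at a time, which is the usual trade-off against the hash-based approach: less memory, but the input must already be grouped by key.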

forward

protected void forward(ArrayList<Object> keys,
                       GenericUDAFEvaluator.AggregationBuffer[] aggs)
                throws HiveException
Forward a record of keys and aggregation results.

Parameters:
keys - The keys in the record
aggs - The aggregation buffers whose results are forwarded with the keys
Throws:
HiveException

closeOp

public void closeOp(boolean abort)
             throws HiveException
Forwards all aggregations to the child operators.

Overrides:
closeOp in class Operator<groupByDesc>
Throws:
HiveException

genColLists

public List<String> genColLists(HashMap<Operator<? extends Serializable>,OpParseContext> opParseCtx)

getName

public String getName()
Description copied from class: Operator
Implements the getName function for the Node Interface.

Specified by:
getName in interface Node
Overrides:
getName in class Operator<groupByDesc>
Returns:
the name of the operator


Copyright © 2009 The Apache Software Foundation