org.apache.pig.data
Class DataType

java.lang.Object
  extended by org.apache.pig.data.DataType

public class DataType
extends Object

A class of static final values used to encode data types, plus a number of static helper functions for manipulating data objects. The data type values could have been defined as an enumeration, but byte codes are used instead to avoid creating objects.
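
The byte-code approach can be illustrated with a minimal sketch. This is not Pig's source: the class name TypeCodesSketch, the numeric values, and the returned names are assumptions chosen for illustration, and the real values are the ones listed under Constant Field Values.

```java
/** Illustrative only: type tags held as byte constants rather than enum values. */
final class TypeCodesSketch {
    // Assumed values for illustration; see Constant Field Values for the real ones.
    static final byte NULL = 1;
    static final byte INTEGER = 10;
    static final byte CHARARRAY = 55;
    static final byte ERROR = -1;

    private TypeCodesSketch() {}

    /** A primitive byte tag can be switched on directly. */
    static String name(byte code) {
        switch (code) {
            case NULL: return "NULL";
            case INTEGER: return "int";
            case CHARARRAY: return "chararray";
            default: return "Unknown";
        }
    }
}
```

Because each tag is a primitive, it can be stored in a byte field or a byte[] and compared with ==.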


Field Summary
static byte BAG
           
static byte BIGCHARARRAY
           
static byte BOOLEAN
           
static byte BYTE
           
static byte BYTEARRAY
           
static byte CHARARRAY
           
static byte DOUBLE
           
static byte ERROR
           
static byte FLOAT
           
static byte INTEGER
           
static byte INTERNALMAP
           
static byte LONG
           
static byte MAP
           
static byte NULL
           
static byte TUPLE
           
static byte UNKNOWN
           
 
Constructor Summary
DataType()
           
 
Method Summary
static int compare(Object o1, Object o2)
          Compare two objects to each other.
static Schema.FieldSchema determineFieldSchema(Object o)
          Determine the field schema of an object
static boolean equalByteArrays(byte[] lhs, byte[] rhs)
           
static byte findType(Object o)
          Determine the datatype of an object.
static byte findType(Type t)
          Given a Type object determine the data type it represents.
static String findTypeName(byte dt)
          Get the type name from the type byte code
static String findTypeName(Object o)
          Get the type name.
static byte[] genAllTypes()
           
static Map<String,Byte> genNameToTypeMap()
           
static Map<Byte,String> genTypeToNameMap()
           
static boolean isAtomic(byte dataType)
          Determine whether this data type is atomic.
static boolean isAtomic(Object o)
          Determine whether this object's data type is atomic.
static boolean isComplex(byte dataType)
          Determine whether this data type is complex.
static boolean isComplex(Object o)
          Determine whether the object is complex or atomic.
static boolean isNumberType(byte t)
           
static boolean isSchemaType(byte dataType)
          Determine whether this data type can have a schema.
static boolean isSchemaType(Object o)
          Determine whether this object can have a schema.
static boolean isUsableType(byte t)
           
static String mapToString(Map<String,Object> m)
           
static byte mergeType(byte type1, byte type2)
          Merge types if possible
static int numTypes()
           
static void spillTupleContents(Tuple t, String label)
          Purely for debugging
static DataBag toBag(Object o)
          If this object is a bag, return it as a bag.
static Double toDouble(Object o)
          If the type of the object is not known, use this method, which in turn calls toDouble(object,type) after finding the type.
static Double toDouble(Object o, byte type)
          Force a data object to a Double, if possible.
static Float toFloat(Object o)
          If the type of the object is not known, use this method, which in turn calls toFloat(object,type) after finding the type.
static Float toFloat(Object o, byte type)
          Force a data object to a Float, if possible.
static Integer toInteger(Object o)
          If type of object is not known, use this method, which internally calls toInteger(object,type)
static Integer toInteger(Object o, byte type)
          Force a data object to an Integer, if possible.
static Long toLong(Object o)
          If the type of the object is not known, use this method, which in turn calls toLong(object,type) after finding the type.
static Long toLong(Object o, byte type)
          Force a data object to a Long, if possible.
static Map<String,Object> toMap(Object o)
          If this object is a map, return it as a map.
static String toString(Object o)
          If the type of the object is not known, use this method, which in turn calls toString(object,type) after finding the type.
static String toString(Object o, byte type)
          Force a data object to a String, if possible.
static Tuple toTuple(Object o)
          If this object is a tuple, return it as a tuple.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UNKNOWN

public static final byte UNKNOWN
See Also:
Constant Field Values

NULL

public static final byte NULL
See Also:
Constant Field Values

BOOLEAN

public static final byte BOOLEAN
See Also:
Constant Field Values

BYTE

public static final byte BYTE
See Also:
Constant Field Values

INTEGER

public static final byte INTEGER
See Also:
Constant Field Values

LONG

public static final byte LONG
See Also:
Constant Field Values

FLOAT

public static final byte FLOAT
See Also:
Constant Field Values

DOUBLE

public static final byte DOUBLE
See Also:
Constant Field Values

BYTEARRAY

public static final byte BYTEARRAY
See Also:
Constant Field Values

CHARARRAY

public static final byte CHARARRAY
See Also:
Constant Field Values

BIGCHARARRAY

public static final byte BIGCHARARRAY
See Also:
Constant Field Values

MAP

public static final byte MAP
See Also:
Constant Field Values

TUPLE

public static final byte TUPLE
See Also:
Constant Field Values

BAG

public static final byte BAG
See Also:
Constant Field Values

INTERNALMAP

public static final byte INTERNALMAP
See Also:
Constant Field Values

ERROR

public static final byte ERROR
See Also:
Constant Field Values
Constructor Detail

DataType

public DataType()
Method Detail

findType

public static byte findType(Object o)
Determine the datatype of an object.

Parameters:
o - Object to test.
Returns:
byte code of the type, or ERROR if we don't know.
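
A minimal sketch of how such a lookup could work, assuming a chain of instanceof checks. FindTypeSketch and its constant values are hypothetical stand-ins, the null handling is an assumption, and only a handful of types are modelled; the real class also defines codes for tuples, bags, and byte arrays.

```java
import java.util.Map;

/** Illustrative only: mapping a Java object to a byte type code. */
final class FindTypeSketch {
    // Assumed values for illustration only.
    static final byte NULL = 1, BOOLEAN = 5, INTEGER = 10, LONG = 15,
                      DOUBLE = 25, CHARARRAY = 55, MAP = 100, ERROR = -1;

    static byte findType(Object o) {
        if (o == null) return NULL;              // assumption: null maps to NULL
        if (o instanceof Boolean) return BOOLEAN;
        if (o instanceof Integer) return INTEGER;
        if (o instanceof Long) return LONG;
        if (o instanceof Double) return DOUBLE;
        if (o instanceof String) return CHARARRAY;
        if (o instanceof Map) return MAP;
        return ERROR;                            // type not recognised
    }
}
```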

findType

public static byte findType(Type t)
Given a Type object determine the data type it represents. This isn't cheap, as it uses reflection, so use sparingly.

Parameters:
t - Type to examine
Returns:
byte code of the type, or ERROR if we don't know.
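
The reflective lookup can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: FindTypeByTypeSketch and its constant values are hypothetical, and only a few classes are handled.

```java
import java.lang.reflect.Type;
import java.util.Map;

/** Illustrative only: resolving a java.lang.reflect.Type to a type code. */
final class FindTypeByTypeSketch {
    // Assumed values for illustration only.
    static final byte INTEGER = 10, LONG = 15, CHARARRAY = 55, MAP = 100, ERROR = -1;

    static byte findType(Type t) {
        if (!(t instanceof Class)) return ERROR; // e.g. parameterized types are not handled here
        Class<?> c = (Class<?>) t;
        if (c == Integer.class) return INTEGER;
        if (c == Long.class) return LONG;
        if (c == String.class) return CHARARRAY;
        // Reflection step: checks whether c is Map or a subtype of it.
        if (Map.class.isAssignableFrom(c)) return MAP;
        return ERROR;
    }
}
```

When an instance is already at hand, an instanceof test on the object avoids the hierarchy walk.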

numTypes

public static int numTypes()

genAllTypes

public static byte[] genAllTypes()

genTypeToNameMap

public static Map<Byte,String> genTypeToNameMap()

genNameToTypeMap

public static Map<String,Byte> genNameToTypeMap()

findTypeName

public static String findTypeName(Object o)
Get the type name.

Parameters:
o - Object to test.
Returns:
type name, as a String.

findTypeName

public static String findTypeName(byte dt)
Get the type name from the type byte code

Parameters:
dt - Type byte code
Returns:
type name, as a String.
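
Name lookup in both directions, as offered by findTypeName, genTypeToNameMap, and genNameToTypeMap, can be sketched with a pair of maps. The class TypeNameMapsSketch, the constant values, and the name strings below are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a code-to-name map and its inverse. */
final class TypeNameMapsSketch {
    // Assumed values for illustration only.
    static final byte INTEGER = 10, LONG = 15, CHARARRAY = 55;

    static Map<Byte, String> typeToName() {
        Map<Byte, String> m = new HashMap<>();
        m.put(INTEGER, "int");
        m.put(LONG, "long");
        m.put(CHARARRAY, "chararray");
        return m;
    }

    /** Builds the inverse map by swapping each key and value. */
    static Map<String, Byte> nameToType() {
        Map<String, Byte> inverse = new HashMap<>();
        for (Map.Entry<Byte, String> e : typeToName().entrySet()) {
            inverse.put(e.getValue(), e.getKey());
        }
        return inverse;
    }
}
```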

isComplex

public static boolean isComplex(byte dataType)
Determine whether this data type is complex.

Parameters:
dataType - Data type code to test.
Returns:
true if dataType is bag, tuple, or map.

isComplex

public static boolean isComplex(Object o)
Determine whether the object is complex or atomic.

Parameters:
o - Object to determine type of.
Returns:
true if dataType is bag, tuple, or map.

isAtomic

public static boolean isAtomic(byte dataType)
Determine whether this data type is atomic.

Parameters:
dataType - Data type code to test.
Returns:
true if dataType is bytearray, bigchararray, chararray, integer, long, float, or boolean.

isAtomic

public static boolean isAtomic(Object o)
Determine whether this object's data type is atomic.

Parameters:
o - Object to determine type of.
Returns:
true if dataType is bytearray, chararray, integer, long, float, or boolean.
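
The complex/atomic split documented above can be sketched as two predicates over the byte code. TypeClassSketch and its constant values are assumptions; the atomic list follows the one given for isAtomic(byte).

```java
/** Illustrative only: classifying type codes as complex or atomic. */
final class TypeClassSketch {
    // Assumed values for illustration only.
    static final byte BOOLEAN = 5, INTEGER = 10, LONG = 15, FLOAT = 20,
                      BYTEARRAY = 50, CHARARRAY = 55, BIGCHARARRAY = 60,
                      MAP = 100, TUPLE = 110, BAG = 120;

    /** Complex types are the containers: bag, tuple, and map. */
    static boolean isComplex(byte t) {
        return t == BAG || t == TUPLE || t == MAP;
    }

    /** Atomic types as listed in the isAtomic(byte) documentation. */
    static boolean isAtomic(byte t) {
        switch (t) {
            case BYTEARRAY:
            case BIGCHARARRAY:
            case CHARARRAY:
            case INTEGER:
            case LONG:
            case FLOAT:
            case BOOLEAN:
                return true;
            default:
                return false;
        }
    }
}
```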

isSchemaType

public static boolean isSchemaType(Object o)
Determine whether this object can have a schema.

Parameters:
o - Object to determine if it has a schema
Returns:
true if the type can have a valid schema (i.e., bag or tuple)

isSchemaType

public static boolean isSchemaType(byte dataType)
Determine whether this data type can have a schema.

Parameters:
dataType - dataType to determine if it has a schema
Returns:
true if the type can have a valid schema (i.e., bag or tuple)

compare

public static int compare(Object o1,
                          Object o2)
Compare two objects to each other. This function is necessary because there is no superclass that implements compareTo. It provides an (arbitrary) ordering of objects of different types as follows: NULL < BOOLEAN < BYTE < INTEGER < LONG < FLOAT < DOUBLE < BYTEARRAY < STRING < MAP < TUPLE < BAG. No other function should implement this cross-object logic; callers should use this function instead.

Parameters:
o1 - First object
o2 - Second object
Returns:
-1 if o1 is less, 0 if they are equal, 1 if o2 is less.
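
The cross-type ordering can be sketched as a two-step comparison: order by type rank first, and fall back to the natural ordering only when both objects have the same type. CompareSketch is a hypothetical class that models only a few of the listed types, and the rank numbers are arbitrary.

```java
/** Illustrative only: ordering objects of different types by a fixed type rank. */
final class CompareSketch {
    /** Rank follows the documented order for the types modelled here. */
    static int rank(Object o) {
        if (o == null) return 0;            // NULL sorts first
        if (o instanceof Boolean) return 1;
        if (o instanceof Integer) return 2;
        if (o instanceof Long) return 3;
        if (o instanceof Double) return 4;
        if (o instanceof String) return 5;
        throw new IllegalArgumentException("unsupported type: " + o.getClass());
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compare(Object o1, Object o2) {
        int r1 = rank(o1);
        int r2 = rank(o2);
        if (r1 != r2) return r1 < r2 ? -1 : 1; // different types: order by type rank
        if (o1 == null) return 0;               // both are null
        // Same type: use the natural ordering, normalised to -1, 0, or 1.
        return Integer.signum(((Comparable) o1).compareTo(o2));
    }
}
```

Note that under this scheme a Long always sorts after an Integer, regardless of the numeric values involved.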

toInteger

public static Integer toInteger(Object o,
                                byte type)
                         throws ExecException
Force a data object to an Integer, if possible. Any numeric type can be forced to an Integer (though precision may be lost), as well as CharArray, ByteArray, or Boolean. Complex types cannot be forced to an Integer. This isn't particularly efficient, so if you already know that the object you have is an Integer you should just cast it.

Returns:
The object as an Integer.
Throws:
ExecException - if the type can't be forced to an Integer.

toInteger

public static Integer toInteger(Object o)
                         throws ExecException
If type of object is not known, use this method, which internally calls toInteger(object,type)

Parameters:
o - Object to convert.
Returns:
Object as Integer.
Throws:
ExecException
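
The relationship between the two toInteger overloads can be sketched as follows: the one-argument form finds the type and delegates to the two-argument form. ToIntegerSketch, its constant values, and the use of IllegalArgumentException in place of ExecException are all assumptions made to keep the example self-contained.

```java
/** Illustrative only: forcing a value to Integer based on its type code. */
final class ToIntegerSketch {
    // Assumed values for illustration only.
    static final byte BOOLEAN = 5, INTEGER = 10, LONG = 15, DOUBLE = 25,
                      CHARARRAY = 55, ERROR = -1;

    static byte findType(Object o) {
        if (o instanceof Boolean) return BOOLEAN;
        if (o instanceof Integer) return INTEGER;
        if (o instanceof Long) return LONG;
        if (o instanceof Double) return DOUBLE;
        if (o instanceof String) return CHARARRAY;
        return ERROR;
    }

    static Integer toInteger(Object o, byte type) {
        switch (type) {
            case BOOLEAN:
                return ((Boolean) o) ? 1 : 0;
            case INTEGER:
                return (Integer) o;
            case LONG:
            case DOUBLE:
                return ((Number) o).intValue();      // precision may be lost
            case CHARARRAY:
                return Integer.valueOf((String) o);  // throws NumberFormatException on bad input
            default:
                // Stand-in for ExecException in this sketch.
                throw new IllegalArgumentException("cannot convert type code " + type + " to Integer");
        }
    }

    /** Type unknown: look it up, then delegate to the two-argument overload. */
    static Integer toInteger(Object o) {
        return toInteger(o, findType(o));
    }
}
```

Passing the type explicitly skips the lookup, which is why the two-argument form is preferable when the caller already knows the type.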

toLong

public static Long toLong(Object o,
                          byte type)
                   throws ExecException
Force a data object to a Long, if possible. Any numeric type can be forced to a Long (though precision may be lost), as well as CharArray, ByteArray, or Boolean. Complex types cannot be forced to a Long. This isn't particularly efficient, so if you already know that the object you have is a Long you should just cast it.

Returns:
The object as a Long.
Throws:
ExecException - if the type can't be forced to a Long.

toLong

public static Long toLong(Object o)
                   throws ExecException
If the type of the object is not known, use this method, which in turn calls toLong(object,type) after finding the type.

Parameters:
o - Object to convert.
Returns:
Object as Long.
Throws:
ExecException

toFloat

public static Float toFloat(Object o,
                            byte type)
                     throws ExecException
Force a data object to a Float, if possible. Any numeric type can be forced to a Float (though precision may be lost), as well as CharArray, ByteArray. Complex types cannot be forced to a Float. This isn't particularly efficient, so if you already know that the object you have is a Float you should just cast it.

Returns:
The object as a Float.
Throws:
ExecException - if the type can't be forced to a Float.

toFloat

public static Float toFloat(Object o)
                     throws ExecException
If the type of the object is not known, use this method, which in turn calls toFloat(object,type) after finding the type.

Parameters:
o - Object to convert.
Returns:
Object as Float.
Throws:
ExecException

toDouble

public static Double toDouble(Object o,
                              byte type)
                       throws ExecException
Force a data object to a Double, if possible. Any numeric type can be forced to a Double, as well as CharArray, ByteArray. Complex types cannot be forced to a Double. This isn't particularly efficient, so if you already know that the object you have is a Double you should just cast it.

Returns:
The object as a Double.
Throws:
ExecException - if the type can't be forced to a Double.

toDouble

public static Double toDouble(Object o)
                       throws ExecException
If the type of the object is not known, use this method, which in turn calls toDouble(object,type) after finding the type.

Parameters:
o - Object to convert.
Returns:
Object as Double.
Throws:
ExecException

toString

public static String toString(Object o,
                              byte type)
                       throws ExecException
Force a data object to a String, if possible. Any simple (atomic) type can be forced to a String including ByteArray. Complex types cannot be forced to a String. This isn't particularly efficient, so if you already know that the object you have is a String you should just cast it.

Returns:
The object as a String.
Throws:
ExecException - if the type can't be forced to a String.

toString

public static String toString(Object o)
                       throws ExecException
If the type of the object is not known, use this method, which in turn calls toString(object,type) after finding the type.

Parameters:
o - Object to convert.
Returns:
Object as String.
Throws:
ExecException

toMap

public static Map<String,Object> toMap(Object o)
                                throws ExecException
If this object is a map, return it as a map. This isn't particularly efficient, so if you already know that the object you have is a Map you should just cast it.

Returns:
The object as a Map.
Throws:
ExecException - if the object is not a Map.
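
The checked-cast pattern shared by toMap, toTuple, and toBag can be sketched for the Map case. ToMapSketch is hypothetical, IllegalArgumentException stands in for ExecException, and passing null through unchanged is an assumption.

```java
import java.util.Map;

/** Illustrative only: return the object as a Map if it is one, otherwise fail. */
final class ToMapSketch {
    @SuppressWarnings("unchecked")
    static Map<String, Object> toMap(Object o) {
        if (o == null) return null; // assumption: null passes through unchanged
        if (o instanceof Map) return (Map<String, Object>) o;
        // Stand-in for ExecException in this sketch.
        throw new IllegalArgumentException(
                "Cannot convert a " + o.getClass().getName() + " to a Map");
    }
}
```

The method returns the same instance rather than a copy, so the only work beyond a plain cast is the type check.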

toTuple

public static Tuple toTuple(Object o)
                     throws ExecException
If this object is a tuple, return it as a tuple. This isn't particularly efficient, so if you already know that the object you have is a Tuple you should just cast it.

Returns:
The object as a Tuple.
Throws:
ExecException - if the object is not a Tuple.

toBag

public static DataBag toBag(Object o)
                     throws ExecException
If this object is a bag, return it as a bag. This isn't particularly efficient, so if you already know that the object you have is a bag you should just cast it.

Returns:
The object as a DataBag.
Throws:
ExecException - if the object is not a bag.

spillTupleContents

public static void spillTupleContents(Tuple t,
                                      String label)
Purely for debugging


isNumberType

public static boolean isNumberType(byte t)

isUsableType

public static boolean isUsableType(byte t)

mergeType

public static byte mergeType(byte type1,
                             byte type2)
Merge types if possible

Parameters:
type1 - first type byte code
type2 - second type byte code
Returns:
the merged type, or DataType.ERROR if not successful
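
The documentation does not state the merge rules, so the following is only one plausible rule set, given as a sketch: identical types merge to themselves, two numeric types merge to the wider one, a bytearray adopts the other side's type, and anything else yields ERROR. MergeTypeSketch, its constant values, and every rule in it are assumptions.

```java
/** Illustrative only: one possible set of type-merging rules. */
final class MergeTypeSketch {
    // Assumed values; numeric codes are chosen so that wider types have larger codes.
    static final byte INTEGER = 10, LONG = 15, FLOAT = 20, DOUBLE = 25,
                      BYTEARRAY = 50, CHARARRAY = 55, ERROR = -1;

    static boolean isNumber(byte t) {
        return t == INTEGER || t == LONG || t == FLOAT || t == DOUBLE;
    }

    static byte mergeType(byte a, byte b) {
        if (a == b) return a;                                 // same type: nothing to merge
        if (isNumber(a) && isNumber(b)) return a > b ? a : b; // keep the wider numeric type
        if (a == BYTEARRAY) return b;                         // assumed: untyped bytes adopt the other type
        if (b == BYTEARRAY) return a;
        return ERROR;                                         // no common type
    }
}
```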

mapToString

public static String mapToString(Map<String,Object> m)

equalByteArrays

public static boolean equalByteArrays(byte[] lhs,
                                      byte[] rhs)

determineFieldSchema

public static Schema.FieldSchema determineFieldSchema(Object o)
                                               throws ExecException,
                                                      FrontendException,
                                                      SchemaMergeException
Determine the field schema of an object

Parameters:
o - the object whose field schema is to be determined
Returns:
the field schema corresponding to the object
Throws:
ExecException
FrontendException
SchemaMergeException


Copyright © ${year} The Apache Software Foundation