org.apache.pig.impl.logicalLayer.schema
Class Schema

java.lang.Object
  extended by org.apache.pig.impl.logicalLayer.schema.Schema
All Implemented Interfaces:
Serializable, Cloneable

public class Schema
extends Object
implements Serializable, Cloneable

The Schema class encapsulates the notion of a schema for a relational operator. A schema is a list of columns that describes the output of a relational operator. Each column in the relation is represented as a FieldSchema, a static class nested inside Schema. A column by definition has an alias, a type, and possibly an inner schema (if the column is a bag or a tuple). In addition, each column in the schema has a unique auto-generated name used for tracking the lineage of the column through a sequence of statements. The lineage of a column is tracked using a map from the predecessor columns to the operators that generate them, where the predecessor columns are the columns required to generate the column under consideration. A reverse lookup, from the operators that generate the predecessor columns to those columns, is also maintained.
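
As a rough illustration of the structure described above, the sketch below models a schema as an ordered list of columns, each with an alias, a type, and an optional inner schema. The Column class and the describe method are hypothetical stand-ins written for this example; they are not Pig's Schema.FieldSchema, and lineage tracking is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the Schema/FieldSchema relationship; not Pig's classes.
final class SchemaShapeSketch {

    // Hypothetical column: alias, type name, and an inner schema for bags/tuples.
    static final class Column {
        final String alias;
        final String type;
        final List<Column> inner; // null for simple types

        Column(String alias, String type, List<Column> inner) {
            this.alias = alias;
            this.type = type;
            this.inner = inner;
        }
    }

    // Render a schema recursively; nested schemas are appended to their column.
    static String describe(List<Column> schema) {
        List<String> parts = new ArrayList<String>();
        for (Column c : schema) {
            String s = c.alias + ": " + c.type;
            if (c.inner != null) {
                s += describe(c.inner);
            }
            parts.add(s);
        }
        return "(" + String.join(", ", parts) + ")";
    }

    public static void main(String[] args) {
        Column score = new Column("score", "int", null);
        Column scores = new Column("scores", "bag", Arrays.asList(score));
        Column name = new Column("name", "chararray", null);
        // prints "(name: chararray, scores: bag(score: int))"
        System.out.println(describe(Arrays.asList(name, scores)));
    }
}
```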

See Also:
Serialized Form

Nested Class Summary
static class Schema.FieldSchema
           
 
Constructor Summary
Schema()
           
Schema(List<Schema.FieldSchema> fields)
           
Schema(Schema.FieldSchema fieldSchema)
          Create a schema with only one field.
Schema(Schema s)
          Copy Constructor.
 
Method Summary
 void add(Schema.FieldSchema f)
           
 void addAlias(String alias, Schema.FieldSchema fs)
           
static boolean castable(Schema cast, Schema input)
          Recursively compare two schemas to check if the input schema can be cast to the cast schema
 Schema clone()
          Make a deep copy of a schema.
 boolean equals(Object other)
          For two schemas to be equal, they have to be deeply equal.
static boolean equals(Schema schema, Schema other, boolean relaxInner, boolean relaxAlias)
          Recursively compare two schemas for equality
static Schema generateNestedSchema(byte topLevelType, byte... innerTypes)
           
 Set<String> getAliases()
           
 Schema.FieldSchema getField(int fieldNum)
          Given a field number, find the associated FieldSchema.
 Schema.FieldSchema getField(String alias)
          Given an alias name, find the associated FieldSchema.
 List<Schema.FieldSchema> getFields()
           
 int getPosition(String alias)
          Given an alias, find the associated position of the field schema.
 int hashCode()
           
 boolean isTwoLevelAccessRequired()
           
 Schema merge(Schema other, boolean otherTakesAliasPrecedence)
          Merge this schema with the other schema
 Schema mergePrefixSchema(Schema other, boolean otherTakesAliasPrecedence)
          Recursively prefix merge two schemas
 Schema mergePrefixSchema(Schema other, boolean otherTakesAliasPrecedence, boolean allowMergeableTypes)
          Recursively prefix merge two schemas
static Schema mergeSchema(Schema schema, Schema other, boolean otherTakesAliasPrecedence)
          Recursively merge two schemas
static Schema mergeSchema(Schema schema, Schema other, boolean otherTakesAliasPrecedence, boolean allowDifferentSizeMerge, boolean allowIncompatibleTypes)
          Recursively merge two schemas
 void printAliases()
           
 void reconcile(Schema other)
          Reconcile this schema with another schema.
static void setSchemaDefaultType(Schema s, byte t)
          Recursively set NULL type to the specified type in a schema
 void setTwoLevelAccessRequired(boolean twoLevelAccess)
           
 int size()
          Find the number of fields in the schema.
static void stringifySchema(StringBuilder sb, Schema schema, byte type)
           
 String toString()
           
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Schema

public Schema()

Schema

public Schema(List<Schema.FieldSchema> fields)
Parameters:
fields - List of field schemas that describes the fields.

Schema

public Schema(Schema.FieldSchema fieldSchema)
Create a schema with only one field.

Parameters:
fieldSchema - field to put in this schema.

Schema

public Schema(Schema s)
Copy Constructor.

Parameters:
s - source schema
Method Detail

getField

public Schema.FieldSchema getField(String alias)
                            throws FrontendException
Given an alias name, find the associated FieldSchema.

Parameters:
alias - Alias to look up.
Returns:
FieldSchema, or null if no such alias is in this tuple.
Throws:
FrontendException

getField

public Schema.FieldSchema getField(int fieldNum)
                            throws FrontendException
Given a field number, find the associated FieldSchema.

Parameters:
fieldNum - Field number to look up.
Returns:
FieldSchema for this field.
Throws:
ParseException - if the field number exceeds the number of fields in the tuple.
FrontendException
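
The two getField variants above differ in how they report a miss: the alias lookup returns null, while the positional lookup raises an exception. The sketch below imitates that contract with a hypothetical FieldLookupSketch class that stores each column as a plain string. It uses IndexOutOfBoundsException as a stand-in for the exception Pig throws and is not Pig's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of alias and positional lookup; not Pig's implementation.
final class FieldLookupSketch {
    private final List<String> fields = new ArrayList<String>();
    private final Map<String, Integer> positions = new HashMap<String, Integer>();

    // Append a column, recording its position under its alias.
    void add(String alias, String type) {
        positions.put(alias, fields.size());
        fields.add(alias + ": " + type);
    }

    // Alias lookup: returns null when no column has this alias.
    String getField(String alias) {
        Integer pos = positions.get(alias);
        return pos == null ? null : fields.get(pos);
    }

    // Positional lookup: fails when fieldNum is outside the schema.
    String getField(int fieldNum) {
        if (fieldNum < 0 || fieldNum >= fields.size()) {
            throw new IndexOutOfBoundsException("no field at position " + fieldNum);
        }
        return fields.get(fieldNum);
    }

    public static void main(String[] args) {
        FieldLookupSketch schema = new FieldLookupSketch();
        schema.add("name", "chararray");
        schema.add("age", "int");
        System.out.println(schema.getField("age"));
        System.out.println(schema.getField("missing"));
        System.out.println(schema.getField(0));
    }
}
```

Callers of the alias variant therefore need a null check, whereas callers of the positional variant need to handle the exception.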

size

public int size()
Find the number of fields in the schema.

Returns:
number of fields.

reconcile

public void reconcile(Schema other)
               throws FrontendException
Reconcile this schema with another schema. The schema being reconciled with should have the same number of columns. The use case is where a schema already exists but may not have alias and/or type information. If an alias exists in this schema and a new one is given, then the new one will be used. The same applies to types, though this should be used carefully, as types should not be changed lightly.

Parameters:
other - Schema to reconcile with.
Throws:
ParseException - if this cannot be reconciled.
FrontendException
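
The reconcile rule above can be sketched over parallel arrays of aliases and types. This is a minimal model under stated assumptions: a null entry in the other schema means "no information given", and IllegalArgumentException stands in for the exception Pig raises when the column counts differ. ReconcileSketch and its signature are hypothetical and not part of Pig.

```java
import java.util.Arrays;

// Simplified model of reconcile(); not Pig's implementation.
final class ReconcileSketch {

    // Overwrite aliases and types in the target with any values the other
    // schema supplies. A null entry in the other schema leaves the target as is.
    static void reconcile(String[] aliases, String[] types,
                          String[] otherAliases, String[] otherTypes) {
        if (aliases.length != otherAliases.length || types.length != otherTypes.length) {
            throw new IllegalArgumentException("schemas have different column counts");
        }
        for (int i = 0; i < aliases.length; i++) {
            if (otherAliases[i] != null) {
                aliases[i] = otherAliases[i]; // the new alias wins
            }
            if (otherTypes[i] != null) {
                types[i] = otherTypes[i];     // the new type wins
            }
        }
    }

    public static void main(String[] args) {
        String[] aliases = {null, "b"};
        String[] types = {"bytearray", "int"};
        reconcile(aliases, types, new String[] {"a", "c"}, new String[] {null, "long"});
        System.out.println(Arrays.toString(aliases) + " " + Arrays.toString(types));
    }
}
```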

equals

public boolean equals(Object other)
For two schemas to be equal, they have to be deeply equal. Use Schema.equals(Schema schema, Schema other, boolean relaxInner, boolean relaxAlias) if relaxation of aliases is a requirement.

Overrides:
equals in class Object

clone

public Schema clone()
             throws CloneNotSupportedException
Make a deep copy of a schema.

Overrides:
clone in class Object
Throws:
CloneNotSupportedException

hashCode

public int hashCode()
Overrides:
hashCode in class Object

toString

public String toString()
Overrides:
toString in class Object

stringifySchema

public static void stringifySchema(StringBuilder sb,
                                   Schema schema,
                                   byte type)
                            throws FrontendException
Throws:
FrontendException

add

public void add(Schema.FieldSchema f)

getPosition

public int getPosition(String alias)
                throws FrontendException
Given an alias, find the associated position of the field schema.

Parameters:
alias - alias of the FieldSchema.
Returns:
position of the FieldSchema.
Throws:
FrontendException

addAlias

public void addAlias(String alias,
                     Schema.FieldSchema fs)

getAliases

public Set<String> getAliases()

printAliases

public void printAliases()

getFields

public List<Schema.FieldSchema> getFields()

castable

public static boolean castable(Schema cast,
                               Schema input)
Recursively compare two schemas to check if the input schema can be cast to the cast schema

Parameters:
cast - schema of the cast operator
input - schema of the cast input
Returns:
true if the input schema can be cast to the cast schema, false otherwise

equals

public static boolean equals(Schema schema,
                             Schema other,
                             boolean relaxInner,
                             boolean relaxAlias)
Recursively compare two schemas for equality

Parameters:
schema - the first schema to compare
other - the second schema to compare
relaxInner - if true, inner schemas will not be checked
relaxAlias - if true, aliases will not be checked
Returns:
true if schemas are equal, false otherwise
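
To show how the two relaxation flags interact, the sketch below compares two simplified schemas column by column. The Col class is a hypothetical stand-in for FieldSchema, and the handling of absent inner schemas (two absent inner schemas compare as equal) is an assumption of this example, not a statement about Pig's behaviour.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Simplified model of recursive schema equality; not Pig's implementation.
final class RelaxedEqualsSketch {

    static final class Col {
        final String alias;
        final String type;
        final List<Col> inner; // null for simple types

        Col(String alias, String type, List<Col> inner) {
            this.alias = alias;
            this.type = type;
            this.inner = inner;
        }
    }

    static boolean equals(List<Col> a, List<Col> b, boolean relaxInner, boolean relaxAlias) {
        if (a == null || b == null) {
            return a == b; // assumption: equal only when both are absent
        }
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            Col x = a.get(i);
            Col y = b.get(i);
            if (!x.type.equals(y.type)) {
                return false; // types are always checked
            }
            if (!relaxAlias && !Objects.equals(x.alias, y.alias)) {
                return false;
            }
            if (!relaxInner && !equals(x.inner, y.inner, false, relaxAlias)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Col> a = Arrays.asList(new Col("x", "int", null),
                new Col("b", "bag", Arrays.asList(new Col("y", "int", null))));
        List<Col> b = Arrays.asList(new Col("z", "int", null),
                new Col("b", "bag", Arrays.asList(new Col("y", "long", null))));
        System.out.println(equals(a, b, false, false)); // strict comparison
        System.out.println(equals(a, b, true, true));   // top-level types only
    }
}
```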

merge

public Schema merge(Schema other,
                    boolean otherTakesAliasPrecedence)
Merge this schema with the other schema

Parameters:
other - the other schema to be merged with
otherTakesAliasPrecedence - true if aliases from the other schema take precedence
Returns:
the merged schema, null if they are not compatible

mergeSchema

public static Schema mergeSchema(Schema schema,
                                 Schema other,
                                 boolean otherTakesAliasPrecedence)
Recursively merge two schemas

Parameters:
schema - the initial schema
other - the other schema to be merged with
otherTakesAliasPrecedence - true if aliases from the other schema take precedence
Returns:
the merged schema, null if they are not compatible

mergeSchema

public static Schema mergeSchema(Schema schema,
                                 Schema other,
                                 boolean otherTakesAliasPrecedence,
                                 boolean allowDifferentSizeMerge,
                                 boolean allowIncompatibleTypes)
                          throws SchemaMergeException
Recursively merge two schemas

Parameters:
schema - the initial schema
other - the other schema to be merged with
otherTakesAliasPrecedence - true if aliases from the other schema take precedence
allowDifferentSizeMerge - allow merging of schemas of different sizes
allowIncompatibleTypes - if true, 1) incompatible types in the schemas are treated as ByteArray (untyped), and 2) incompatible inner schemas are set to null in the output.
Returns:
the merged schema; this can be null if one schema is null and allowIncompatibleTypes is true
Throws:
SchemaMergeException - if they cannot be merged
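
The effect of the three flags can be sketched on flat schemas. The model below makes several assumptions that go beyond the text above: two types count as compatible only when they are identical, surplus columns from the longer schema are appended unchanged, nested schemas are ignored, and IllegalStateException stands in for SchemaMergeException. MergeSchemaSketch, Col, and pickAlias are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, flat model of a position-wise schema merge; not Pig's implementation.
final class MergeSchemaSketch {

    static final class Col {
        final String alias;
        final String type;

        Col(String alias, String type) {
            this.alias = alias;
            this.type = type;
        }
    }

    static List<Col> merge(List<Col> schema, List<Col> other,
                           boolean otherTakesAliasPrecedence,
                           boolean allowDifferentSizeMerge,
                           boolean allowIncompatibleTypes) {
        if (schema.size() != other.size() && !allowDifferentSizeMerge) {
            throw new IllegalStateException("schemas differ in size");
        }
        List<Col> merged = new ArrayList<Col>();
        int common = Math.min(schema.size(), other.size());
        for (int i = 0; i < common; i++) {
            Col x = schema.get(i);
            Col y = other.get(i);
            String type;
            if (x.type.equals(y.type)) {
                type = x.type;
            } else if (allowIncompatibleTypes) {
                type = "bytearray"; // fall back to the untyped representation
            } else {
                throw new IllegalStateException("incompatible types at column " + i);
            }
            merged.add(new Col(pickAlias(x.alias, y.alias, otherTakesAliasPrecedence), type));
        }
        // Assumption: surplus columns of the longer schema are kept as they are.
        List<Col> longer = schema.size() > other.size() ? schema : other;
        merged.addAll(longer.subList(common, longer.size()));
        return merged;
    }

    // Prefer the alias from the side that has precedence; fall back to the other side.
    static String pickAlias(String mine, String theirs, boolean otherTakesAliasPrecedence) {
        String first = otherTakesAliasPrecedence ? theirs : mine;
        String second = otherTakesAliasPrecedence ? mine : theirs;
        return first != null ? first : second;
    }

    public static void main(String[] args) {
        List<Col> a = Arrays.asList(new Col("a", "int"), new Col("b", "chararray"));
        List<Col> b = Arrays.asList(new Col("x", "long"), new Col(null, "chararray"),
                new Col("c", "double"));
        for (Col c : merge(a, b, true, true, true)) {
            System.out.println(c.alias + ": " + c.type);
        }
    }
}
```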

generateNestedSchema

public static Schema generateNestedSchema(byte topLevelType,
                                          byte... innerTypes)
                                   throws FrontendException
Parameters:
topLevelType - DataType type of the top level element
innerTypes - DataType types of the inner level element
Returns:
nested schema representing type of top level element at first level and inner schema representing types of inner element(s)
Throws:
FrontendException

mergePrefixSchema

public Schema mergePrefixSchema(Schema other,
                                boolean otherTakesAliasPrecedence)
                         throws SchemaMergeException
Recursively prefix merge two schemas

Parameters:
other - the other schema to be merged with
otherTakesAliasPrecedence - true if aliases from the other schema take precedence
Returns:
the prefix-merged schema; this can be null if one schema is null and allowIncompatibleTypes is true
Throws:
SchemaMergeException - if they cannot be merged

mergePrefixSchema

public Schema mergePrefixSchema(Schema other,
                                boolean otherTakesAliasPrecedence,
                                boolean allowMergeableTypes)
                         throws SchemaMergeException
Recursively prefix merge two schemas

Parameters:
other - the other schema to be merged with
otherTakesAliasPrecedence - true if aliases from the other schema take precedence
allowMergeableTypes - true if "mergeable" types should be allowed. Two types are mergeable if any of the following conditions holds, checked in this order:
1) if either type is null or unknown and the other is a type other than null or unknown, the result type is the latter (non-null, known) type;
2) if either type is bytearray, the result type is the other (possibly non-bytearray) type;
3) if the current type can be cast to the other type, the result type is the other type.
Returns:
the prefix-merged schema; this can be null if one schema is null and allowIncompatibleTypes is true
Throws:
SchemaMergeException - if they cannot be merged
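
The ordered "mergeable type" checks listed for allowMergeableTypes can be written as a small decision function. In the sketch below the Type enum and the castable relation (numeric widening only) are assumptions made for illustration, and a null return stands for "not mergeable". Pig's real cast rules are richer than this table.

```java
// Simplified model of the ordered "mergeable type" checks; not Pig's implementation.
final class MergeableTypeSketch {

    // Hypothetical type set; numeric types are declared in widening order.
    enum Type { NULL, UNKNOWN, BYTEARRAY, CHARARRAY, INT, LONG, DOUBLE }

    static boolean isUnknown(Type t) {
        return t == Type.NULL || t == Type.UNKNOWN;
    }

    static boolean isNumeric(Type t) {
        return t == Type.INT || t == Type.LONG || t == Type.DOUBLE;
    }

    // Assumed cast table: identical types, or widening between numeric types.
    static boolean castable(Type from, Type to) {
        return from == to
                || (isNumeric(from) && isNumeric(to) && from.ordinal() < to.ordinal());
    }

    // Returns the merged type, or null when the two types are not mergeable.
    static Type mergeType(Type current, Type other) {
        // 1) exactly one side is null/unknown: take the known side
        if (isUnknown(current) != isUnknown(other)) {
            return isUnknown(current) ? other : current;
        }
        // 2) either side is bytearray: take the other side
        if (current == Type.BYTEARRAY) {
            return other;
        }
        if (other == Type.BYTEARRAY) {
            return current;
        }
        // 3) the current type can be cast to the other type: take the other type
        if (castable(current, other)) {
            return other;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(mergeType(Type.NULL, Type.INT));
        System.out.println(mergeType(Type.BYTEARRAY, Type.CHARARRAY));
        System.out.println(mergeType(Type.INT, Type.LONG));
        System.out.println(mergeType(Type.CHARARRAY, Type.INT));
    }
}
```

Because the checks run in order, a bytearray paired with an unknown type is resolved by the first rule and never reaches the second.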

setSchemaDefaultType

public static void setSchemaDefaultType(Schema s,
                                        byte t)
Recursively set NULL type to the specified type in a schema

Parameters:
s - the schema whose NULL type has to be set
t - the specified type
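
A recursive default-type pass of this kind might look like the sketch below. The Col class, the "null" type name, and setDefaultType are hypothetical stand-ins chosen for this example; Pig itself represents types as byte constants.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model of recursively replacing NULL types; not Pig's implementation.
final class DefaultTypeSketch {

    static final String NULL_TYPE = "null"; // stand-in for the NULL data type

    static final class Col {
        String type;
        final List<Col> inner; // null for simple types

        Col(String type, List<Col> inner) {
            this.type = type;
            this.inner = inner;
        }
    }

    // Replace every NULL type in the schema, including nested schemas, with t.
    static void setDefaultType(List<Col> schema, String t) {
        if (schema == null) {
            return;
        }
        for (Col c : schema) {
            if (NULL_TYPE.equals(c.type)) {
                c.type = t;
            }
            setDefaultType(c.inner, t);
        }
    }

    public static void main(String[] args) {
        Col nested = new Col(NULL_TYPE, null);
        Col bag = new Col("bag", Arrays.asList(nested));
        Col plain = new Col(NULL_TYPE, null);
        setDefaultType(Arrays.asList(plain, bag), "bytearray");
        System.out.println(plain.type + ", " + bag.type + ", " + nested.type);
    }
}
```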

isTwoLevelAccessRequired

public boolean isTwoLevelAccessRequired()
Returns:
the twoLevelAccess

setTwoLevelAccessRequired

public void setTwoLevelAccessRequired(boolean twoLevelAccess)
Parameters:
twoLevelAccess - the twoLevelAccess to set


Copyright © ${year} The Apache Software Foundation