Uses of Interface
org.apache.pig.OrderedLoadFunc

Packages that use OrderedLoadFunc
org.apache.hadoop.zebra.pig Implementation of PIG Storer/Loader Interfaces 
org.apache.pig Public interfaces and classes for Pig. 
org.apache.pig.builtin   
org.apache.pig.piggybank.storage   
 

Uses of OrderedLoadFunc in org.apache.hadoop.zebra.pig
 

Classes in org.apache.hadoop.zebra.pig that implement OrderedLoadFunc
 class TableLoader
          Pig IndexableLoadFunc and Slicer for Zebra Table
 

Uses of OrderedLoadFunc in org.apache.pig
 

Classes in org.apache.pig that implement OrderedLoadFunc
 class FileInputLoadFunc
          Provides an implementation of the OrderedLoadFunc interface that LoadFuncs using FileInputFormat can optionally reuse by extending this class.
 

Uses of OrderedLoadFunc in org.apache.pig.builtin
 

Classes in org.apache.pig.builtin that implement OrderedLoadFunc
 class BinStorage
           
 class PigStorage
          A load function that parses a line of input into fields using a delimiter to set the fields.
 

Uses of OrderedLoadFunc in org.apache.pig.piggybank.storage
 

Classes in org.apache.pig.piggybank.storage that implement OrderedLoadFunc
 class HiveColumnarLoader
          Loader for Hive RC Columnar files.
Supports the following types:
  Hive Type        Pig Type from DataType
  string           CHARARRAY
  int              INTEGER
  bigint or long   LONG
  float            FLOAT
  double           DOUBLE
  boolean          BOOLEAN
  byte             BYTE
  array            TUPLE
  map              MAP
Usage 1:
To load a hive table with columns uid bigint, ts long, arr ARRAY, m MAP:
a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map');
-- to reference the fields
b = FOREACH a GENERATE uid, ts, arr, m;

Usage 2:
To load a hive table with columns uid bigint, ts long, arr ARRAY, m MAP, processing only dates 2009-10-01 to 2009-10-02 in a date partitioned hive table:
a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map', '2009-10-01:2009-10-02');
-- to reference the fields
b = FOREACH a GENERATE uid, ts, arr, m;

Usage 3:
To load a hive table with columns uid bigint, ts long, arr ARRAY, m MAP, reading only columns uid and ts:
a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map', '', 'uid,ts');
-- to reference the fields
b = FOREACH a GENERATE uid, ts, arr, m;

Usage 4:
To load a hive table with columns uid bigint, ts long, arr ARRAY, m MAP, reading only columns uid and ts for dates 2009-10-01 to 2009-10-02:
a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map', '2009-10-01:2009-10-02', 'uid,ts');
-- to reference the fields
b = FOREACH a GENERATE uid, ts, arr, m;

Issues

Table schema definition
Each column in the schema definition must be written as the column name, a space, and the type. Columns are separated by a comma with no space after it.
So column1 string, column2 string will not work; it must be column1 string,column2 string
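
The spacing rule above can be applied mechanically before the schema string is passed to the loader. The following is a minimal sketch, not part of Pig or piggybank: the class HiveSchemaStrings and its normalize method are hypothetical names, and the code assumes simple type names that contain no commas, as in the usage examples.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Pig or piggybank class): rewrites a schema
// string into the "name type,name type" form described above.
final class HiveSchemaStrings {
    private HiveSchemaStrings() {}

    // Assumes type names contain no commas (e.g. "string", "bigint", "map").
    static String normalize(String schema) {
        List<String> columns = new ArrayList<>();
        for (String column : schema.split(",")) {
            // Trim each column and collapse inner whitespace to one space.
            String cleaned = column.trim().replaceAll("\\s+", " ");
            if (!cleaned.isEmpty()) {
                columns.add(cleaned);
            }
        }
        // Join with a bare comma: no space before the next column name.
        return String.join(",", columns);
    }
}
```

Under these assumptions, HiveSchemaStrings.normalize("column1 string, column2 string") yields the accepted form column1 string,column2 string.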

Date partitioning
Hive date partition folders must have the format daydate=[date].
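
To illustrate the folder naming, the sketch below lists the daydate=[date] folder names that a date range string such as 2009-10-01:2009-10-02 would correspond to. It is an illustration only: the class DayDatePartitions and its folderNames method are hypothetical, dates are assumed to be in yyyy-MM-dd form as in the usage examples, and treating both ends of the range as inclusive is an assumption rather than documented loader behavior.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Pig or piggybank class): expands a
// "start:end" date range into daydate=[date] folder names.
final class DayDatePartitions {
    private DayDatePartitions() {}

    // Assumption: both bounds are inclusive and use the yyyy-MM-dd format.
    static List<String> folderNames(String dateRange) {
        String[] bounds = dateRange.split(":");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected start:end, got " + dateRange);
        }
        LocalDate start = LocalDate.parse(bounds[0]);
        LocalDate end = LocalDate.parse(bounds[1]);
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("End date is before start date");
        }
        List<String> folders = new ArrayList<>();
        for (LocalDate day = start; !day.isAfter(end); day = day.plusDays(1)) {
            // LocalDate.toString() prints ISO yyyy-MM-dd.
            folders.add("daydate=" + day);
        }
        return folders;
    }
}
```

For the range used in the usage examples, this produces the two names daydate=2009-10-01 and daydate=2009-10-02.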

 class PigStorageSchema
          This Load/Store Func reads/writes metafiles that allow the schema and aliases to be determined at load time, saving one from having to manually enter schemas for pig-generated datasets.
 class SequenceFileLoader
          A Loader for Hadoop-Standard SequenceFiles.
 



Copyright © ${year} The Apache Software Foundation