Uses of Class
org.apache.pig.FileInputLoadFunc

Packages that use FileInputLoadFunc
org.apache.pig.builtin   
org.apache.pig.piggybank.storage   
 

Uses of FileInputLoadFunc in org.apache.pig.builtin
 

Subclasses of FileInputLoadFunc in org.apache.pig.builtin
 class BinStorage
           
 class PigStorage
          A load function that parses each line of input into fields, using a delimiter to separate the fields.
 

Uses of FileInputLoadFunc in org.apache.pig.piggybank.storage
 

Subclasses of FileInputLoadFunc in org.apache.pig.piggybank.storage
 class HiveColumnarLoader
          Loader for Hive RC Columnar files.
Supports the following types:

Hive Type        Pig Type from DataType
string           CHARARRAY
int              INTEGER
bigint or long   LONG
float            FLOAT
double           DOUBLE
boolean          BOOLEAN
byte             BYTE
array            TUPLE
map              MAP

Usage 1:
To load a Hive table with columns uid bigint, ts long, arr ARRAY, m MAP:
    a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map');
    -- to reference the fields
    b = FOREACH a GENERATE uid, ts, arr, m;

Usage 2:
To load a Hive table with columns uid bigint, ts long, arr ARRAY, m MAP, processing only the dates 2009-10-01 to 2009-10-02 in a date-partitioned Hive table:
    a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map', '2009-10-01:2009-10-02');
    -- to reference the fields
    b = FOREACH a GENERATE uid, ts, arr, m;

Usage 3:
To load a Hive table with columns uid bigint, ts long, arr ARRAY, m MAP, reading only the columns uid and ts:
    a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map', '', 'uid,ts');
    -- to reference the fields
    b = FOREACH a GENERATE uid, ts, arr, m;

Usage 4:
To load a Hive table with columns uid bigint, ts long, arr ARRAY, m MAP, reading only the columns uid and ts for the dates 2009-10-01 to 2009-10-02:
    a = LOAD 'file' USING HiveColumnarLoader('uid bigint,ts long,arr array,m map', '2009-10-01:2009-10-02', 'uid,ts');
    -- to reference the fields
    b = FOREACH a GENERATE uid, ts, arr, m;

Issues

Table schema definition
The schema definition must give each column as its name, a single space, and its type. Columns are separated by a comma with no space after it.
So column1 string, column2 string will not work; it must be column1 string,column2 string
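
As a rough illustration of the rule above, the following sketch rewrites a loosely formatted schema string into the strict form before it is passed to the loader. HiveSchemaStrings and normalize are hypothetical names and are not part of Pig or piggybank. The sketch also assumes that type names contain no commas, as in the examples on this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Pig or piggybank. */
final class HiveSchemaStrings {

    private HiveSchemaStrings() {}

    /**
     * Rewrites "name type, name type" as "name type,name type":
     * one space between name and type, no whitespace around commas.
     * Assumes type names contain no commas.
     */
    static String normalize(String schema) {
        List<String> columns = new ArrayList<>();
        for (String column : schema.split(",")) {
            String trimmed = column.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("empty column definition in: " + schema);
            }
            // Collapse runs of whitespace between the column name and its type.
            columns.add(trimmed.replaceAll("\\s+", " "));
        }
        return String.join(",", columns);
    }

    public static void main(String[] args) {
        System.out.println(normalize("uid bigint, ts long,  arr array , m map"));
    }
}
```

The normalized string can then be used as the first argument to HiveColumnarLoader in a Pig script.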

Date partitioning
Hive date partition folders must be named in the format daydate=[date], for example daydate=2009-10-01.
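
The naming convention can be sketched as follows. This is not the loader's own code: DayDatePartitions, folderName and inRange are hypothetical names, and the sketch assumes that dates are written as yyyy-MM-dd (as in the usage examples) and that both ends of a from:to range are included.

```java
import java.time.LocalDate;

/** Hypothetical helper for illustration only; not part of Pig or piggybank. */
final class DayDatePartitions {

    private static final String PREFIX = "daydate=";

    private DayDatePartitions() {}

    /** Builds the folder name for a date; LocalDate prints as yyyy-MM-dd. */
    static String folderName(LocalDate date) {
        return PREFIX + date;
    }

    /**
     * Returns true if the folder is a daydate=[date] folder whose date lies
     * within the "from:to" range. Treating both ends as inclusive is an assumption.
     */
    static boolean inRange(String folder, String range) {
        if (!folder.startsWith(PREFIX)) {
            return false; // not a date partition folder in the expected format
        }
        String[] bounds = range.split(":");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("expected from:to but got: " + range);
        }
        LocalDate day = LocalDate.parse(folder.substring(PREFIX.length()));
        LocalDate from = LocalDate.parse(bounds[0]);
        LocalDate to = LocalDate.parse(bounds[1]);
        return !day.isBefore(from) && !day.isAfter(to);
    }

    public static void main(String[] args) {
        System.out.println(folderName(LocalDate.of(2009, 10, 1)));
        System.out.println(inRange("daydate=2009-10-02", "2009-10-01:2009-10-02"));
    }
}
```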

 class PigStorageSchema
          This Load/Store Func reads/writes metafiles that allow the schema and aliases to be determined at load time, saving one from having to manually enter schemas for pig-generated datasets.
 class SequenceFileLoader
          A Loader for Hadoop-Standard SequenceFiles.
 



Copyright © The Apache Software Foundation