org.apache.pig.builtin
Class TextLoader

java.lang.Object
  extended by org.apache.pig.builtin.TextLoader
All Implemented Interfaces:
LoadFunc

public class TextLoader
extends Object
implements LoadFunc

This load function creates one tuple per line of text. Each tuple has a single field containing that line.
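
The line-to-tuple mapping described above can be sketched in plain Java. This is only an illustration of the behavior, not Pig's implementation: the class name LinePerTupleSketch is hypothetical, and a List<Object> stands in for Pig's Tuple so the example runs without Pig on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LinePerTupleSketch {
    // Hypothetical stand-in: each inner List<Object> plays the role of a Tuple.
    static List<List<Object>> load(String text) {
        List<List<Object>> tuples = new ArrayList<>();
        // String.split drops trailing empty strings, so a final newline
        // does not produce an extra empty tuple.
        for (String line : text.split("\r?\n")) {
            // One tuple per line; the whole line is its only field,
            // so delimiters such as tabs are left untouched.
            tuples.add(Collections.<Object>singletonList(line));
        }
        return tuples;
    }
}
```

Note that the line is not split into columns: a tab-separated line still yields a tuple with exactly one field.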


Constructor Summary
TextLoader()
           
 
Method Summary
 void bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end)
          Specifies the portion of an InputStream from which to read tuples.
 DataBag bytesToBag(byte[] b)
          TextLoader does not support conversion to Bag.
 Boolean bytesToBoolean(byte[] b)
          TextLoader does not support conversion to Boolean.
 String bytesToCharArray(byte[] b)
          Cast data from bytes to chararray value.
 Double bytesToDouble(byte[] b)
          TextLoader does not support conversion to Double.
 Float bytesToFloat(byte[] b)
          TextLoader does not support conversion to Float.
 Integer bytesToInteger(byte[] b)
          TextLoader does not support conversion to Integer.
 Long bytesToLong(byte[] b)
          TextLoader does not support conversion to Long.
 Map<String,Object> bytesToMap(byte[] b)
          TextLoader does not support conversion to Map.
 Tuple bytesToTuple(byte[] b)
          TextLoader does not support conversion to Tuple.
 Schema determineSchema(String fileName, ExecType execType, DataStorage storage)
          TextLoader does not provide a schema.
 void fieldsToRead(Schema schema)
          TextLoader does not use this information.
 Tuple getNext()
          Retrieves the next tuple to be processed.
 byte[] toBytes(DataBag bag)
           
 byte[] toBytes(Double d)
           
 byte[] toBytes(Float f)
           
 byte[] toBytes(Integer i)
           
 byte[] toBytes(Long l)
           
 byte[] toBytes(Map<String,Object> m)
           
 byte[] toBytes(String s)
           
 byte[] toBytes(Tuple t)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextLoader

public TextLoader()
Method Detail

bindTo

public void bindTo(String fileName,
                   BufferedPositionedInputStream in,
                   long offset,
                   long end)
            throws IOException
Description copied from interface: LoadFunc
Specifies the portion of an InputStream from which to read tuples. Because the starting and ending offsets may not fall on record boundaries, the implementor must determine the actual starting and ending offsets so that a file sliced at arbitrary points is still processed in its entirety.

A common way of handling slices in the middle of records is to start at the given offset and, if the offset is not zero, skip to the end of the first record (which may be a partial record) before reading tuples. Reading continues until a tuple has been read that ends at an offset past the ending offset.

The load function should not do any buffering on the input stream. Buffering will cause the offsets returned by is.getPos() to be unreliable.
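
The slice-handling strategy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual TextLoader code: the class SliceReaderSketch and its readSlice method are hypothetical, records are assumed to be newline-terminated lines, and a byte array stands in for the positioned input stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SliceReaderSketch {
    /** Returns the lines that the slice [offset, end) is responsible for. */
    static List<String> readSlice(byte[] data, long offset, long end) {
        int pos = (int) offset; // sketch only: assumes the data fits in an array
        // A non-zero offset may land mid-record, so skip to the end of that
        // (possibly partial) record; the preceding slice reads it in full.
        if (offset != 0) {
            while (pos < data.length && data[pos] != '\n') pos++;
            pos++; // step past the newline
        }
        List<String> lines = new ArrayList<>();
        // Read any record that starts at or before 'end'. The last record
        // read therefore ends past 'end', which is where reading stops.
        while (pos < data.length && pos <= end) {
            int start = pos;
            while (pos < data.length && data[pos] != '\n') pos++;
            lines.add(new String(data, start, pos - start, StandardCharsets.UTF_8));
            pos++; // step past the newline
        }
        return lines;
    }
}
```

With this rule, a record that straddles a slice boundary belongs to the slice in which it starts, so concatenating the results of adjacent slices yields every line exactly once regardless of where the cut falls.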

Specified by:
bindTo in interface LoadFunc
Parameters:
fileName - the name of the file to be read
in - the stream representing the file to be processed; it can also report its current position.
offset - the offset at which to start reading tuples.
end - the ending offset for reading.
Throws:
IOException

getNext

public Tuple getNext()
              throws IOException
Description copied from interface: LoadFunc
Retrieves the next tuple to be processed.

Specified by:
getNext in interface LoadFunc
Returns:
the next tuple to be processed, or null if there are no more tuples.
Throws:
IOException
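
The null-terminated contract of getNext suggests a simple consumption loop. The sketch below illustrates that pattern only: TupleSource, drain, and over are hypothetical names introduced for the example, and String stands in for Tuple so the code runs without Pig.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GetNextLoopSketch {
    // Hypothetical stand-in for the getNext() part of a load function.
    interface TupleSource {
        String getNext() throws IOException;
    }

    // Calls getNext() until it returns null, collecting every result.
    static List<String> drain(TupleSource source) throws IOException {
        List<String> out = new ArrayList<>();
        String tuple;
        while ((tuple = source.getNext()) != null) {
            out.add(tuple);
        }
        return out;
    }

    // Adapts an iterator to the null-terminated getNext() convention.
    static TupleSource over(Iterator<String> it) {
        return () -> it.hasNext() ? it.next() : null;
    }
}
```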

bytesToBoolean

public Boolean bytesToBoolean(byte[] b)
                       throws IOException
TextLoader does not support conversion to Boolean.

Throws:
IOException - if the value cannot be cast.

bytesToInteger

public Integer bytesToInteger(byte[] b)
                       throws IOException
TextLoader does not support conversion to Integer.

Specified by:
bytesToInteger in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Integer value.
Throws:
IOException - if the value cannot be cast.

bytesToLong

public Long bytesToLong(byte[] b)
                 throws IOException
TextLoader does not support conversion to Long.

Specified by:
bytesToLong in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Long value.
Throws:
IOException - if the value cannot be cast.

bytesToFloat

public Float bytesToFloat(byte[] b)
                   throws IOException
TextLoader does not support conversion to Float.

Specified by:
bytesToFloat in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Float value.
Throws:
IOException - if the value cannot be cast.

bytesToDouble

public Double bytesToDouble(byte[] b)
                     throws IOException
TextLoader does not support conversion to Double.

Specified by:
bytesToDouble in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Double value.
Throws:
IOException - if the value cannot be cast.

bytesToCharArray

public String bytesToCharArray(byte[] b)
                        throws IOException
Cast data from bytes to chararray value.

Specified by:
bytesToCharArray in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
String value.
Throws:
IOException - if the value cannot be cast.
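
Of the bytesTo* methods on this page, only the chararray conversion is supported. The sketch below shows one way such a caster could look. It is not TextLoader's source: the class name CharArrayOnlyCasterSketch is hypothetical, UTF-8 is an assumed encoding that this page does not state, and reporting an unsupported conversion by throwing IOException is an assumption based on the declared throws clauses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class CharArrayOnlyCasterSketch {
    // Supported: decode the raw bytes as text (UTF-8 is an assumption).
    static String bytesToCharArray(byte[] b) throws IOException {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Unsupported: signal the failure instead of guessing a value
    // (throwing IOException here is an assumption, see the lead-in).
    static Integer bytesToInteger(byte[] b) throws IOException {
        throw new IOException("conversion to Integer is not supported");
    }
}
```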

bytesToMap

public Map<String,Object> bytesToMap(byte[] b)
                              throws IOException
TextLoader does not support conversion to Map.

Specified by:
bytesToMap in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Map value.
Throws:
IOException - if the value cannot be cast.

bytesToTuple

public Tuple bytesToTuple(byte[] b)
                   throws IOException
TextLoader does not support conversion to Tuple.

Specified by:
bytesToTuple in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Tuple value.
Throws:
IOException - if the value cannot be cast.

bytesToBag

public DataBag bytesToBag(byte[] b)
                   throws IOException
TextLoader does not support conversion to Bag.

Specified by:
bytesToBag in interface LoadFunc
Parameters:
b - byte array to be cast.
Returns:
Bag value.
Throws:
IOException - if the value cannot be cast.

fieldsToRead

public void fieldsToRead(Schema schema)
TextLoader does not use this information.

Specified by:
fieldsToRead in interface LoadFunc
Parameters:
schema - Schema indicating which columns will be needed.

determineSchema

public Schema determineSchema(String fileName,
                              ExecType execType,
                              DataStorage storage)
                       throws IOException
TextLoader does not provide a schema.

Specified by:
determineSchema in interface LoadFunc
Parameters:
fileName - Name of the file to be read (the same as the filename in the "load" statement of the script).
execType - execution mode of the Pig script: one of ExecType.LOCAL or ExecType.MAPREDUCE
storage - the DataStorage object corresponding to the execType
Returns:
a Schema describing the data if possible, or null otherwise.
Throws:
IOException

toBytes

public byte[] toBytes(DataBag bag)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(String s)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(Double d)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(Float f)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(Integer i)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(Long l)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(Map<String,Object> m)
               throws IOException
Throws:
IOException

toBytes

public byte[] toBytes(Tuple t)
               throws IOException
Throws:
IOException


Copyright © The Apache Software Foundation