java.lang.Object
  org.apache.pig.impl.builtin.DefaultIndexableLoader
public class DefaultIndexableLoader
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface org.apache.pig.LoadFunc |
---|
LoadFunc.RequiredField, LoadFunc.RequiredFieldList, LoadFunc.RequiredFieldResponse |
Constructor Summary | |
---|---|
DefaultIndexableLoader(String loaderFuncSpec, String indexFile, String indexFileLoadFuncSpec, String scope) | |
Method Summary | |
---|---|
void | bindTo(String fileName, BufferedPositionedInputStream is, long offset, long end) Specifies a portion of an InputStream to read tuples. |
DataBag | bytesToBag(byte[] b) Cast data from bytes to bag value. |
String | bytesToCharArray(byte[] b) Cast data from bytes to chararray value. |
Double | bytesToDouble(byte[] b) Cast data from bytes to double value. |
Float | bytesToFloat(byte[] b) Cast data from bytes to float value. |
Integer | bytesToInteger(byte[] b) Cast data from bytes to integer value. |
Long | bytesToLong(byte[] b) Cast data from bytes to long value. |
Map<String,Object> | bytesToMap(byte[] b) Cast data from bytes to map value. |
Tuple | bytesToTuple(byte[] b) Cast data from bytes to tuple value. |
void | close() A method called by the Pig runtime to give implementations an opportunity to perform cleanup actions, such as closing the underlying input stream. |
Schema | determineSchema(String fileName, ExecType execType, DataStorage storage) Find the schema from the loader. |
LoadFunc.RequiredFieldResponse | fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList) Indicate to the loader fields that will be needed. |
Tuple | getNext() Retrieves the next tuple to be processed. |
void | initialize(org.apache.hadoop.conf.Configuration conf) This method is called by the Pig runtime to allow the IndexableLoadFunc to perform any initialization actions. |
void | seekNear(Tuple keys) This method is called by the Pig runtime to tell the LoadFunc to position its underlying input stream near the keys supplied as the argument. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public DefaultIndexableLoader(String loaderFuncSpec, String indexFile, String indexFileLoadFuncSpec, String scope)
Method Detail |
---|
public void seekNear(Tuple keys) throws IOException
Description copied from interface: IndexableLoadFunc
This method is called by the Pig runtime to tell the LoadFunc to position its underlying input stream near the keys supplied as the argument.
Specified by:
seekNear in interface IndexableLoadFunc
Parameters:
keys - Tuple with join keys (which are a prefix of the sort keys of the input data). For example, if the data is sorted on the columns in positions 2, 4, 5, any of the following Tuples is valid as an argument value:
(fieldAt(2))
(fieldAt(2), fieldAt(4))
(fieldAt(2), fieldAt(4), fieldAt(5))
The following are some invalid cases:
(fieldAt(4))
(fieldAt(2), fieldAt(5))
(fieldAt(4), fieldAt(5))
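The prefix rule in the examples above can be sketched as a small check. This is only an illustration: the class JoinKeyPrefix and the method isValidPrefix are hypothetical names, not part of Pig, and rejecting an empty key list is an assumption that the text above does not state.

```java
import java.util.List;

/** Hypothetical helper (not part of Pig) illustrating the key-prefix rule. */
final class JoinKeyPrefix {
    private JoinKeyPrefix() {}

    /**
     * Returns true if keyColumns is a leading prefix of sortColumns:
     * the same column positions, in the same order, starting at the first one.
     * Assumption: an empty key list is treated as invalid.
     */
    static boolean isValidPrefix(List<Integer> sortColumns, List<Integer> keyColumns) {
        if (keyColumns.isEmpty() || keyColumns.size() > sortColumns.size()) {
            return false;
        }
        return sortColumns.subList(0, keyColumns.size()).equals(keyColumns);
    }

    public static void main(String[] args) {
        List<Integer> sortColumns = List.of(2, 4, 5);
        System.out.println(isValidPrefix(sortColumns, List.of(2, 4))); // true
        System.out.println(isValidPrefix(sortColumns, List.of(2, 5))); // false: skips column 4
        System.out.println(isValidPrefix(sortColumns, List.of(4)));    // false: does not start at column 2
    }
}
```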
Throws:
IOException - When the loadFunc is unable to position to the required point in its input stream
public void bindTo(String fileName, BufferedPositionedInputStream is, long offset, long end) throws IOException
Description copied from interface: LoadFunc
Specifies a portion of an InputStream to read tuples.
A common way of handling slices in the middle of records is to start at the given offset and, if the offset is not zero, skip to the end of the first record (which may be a partial record) before reading tuples. Reading continues until a tuple has been read that ends at an offset past the ending offset.
The load function should not do any buffering on the input stream. Buffering will cause the offsets returned by is.getPos() to be unreliable.
Specified by:
bindTo in interface LoadFunc
Parameters:
fileName - the name of the file to be read.
is - the stream representing the file to be processed, and which can also provide its position.
offset - the offset to start reading tuples.
end - the ending offset for reading.
Throws:
IOException
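The slice-handling approach described for bindTo can be sketched over an in-memory byte array. This is a minimal illustration under stated assumptions, not Pig's implementation. Records are assumed to be newline-terminated. The names SliceReader, readSlice and nextRecordStart are hypothetical. A record that starts exactly at the ending offset is assumed to belong to the current slice, so the following slice skips it as its "first record".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Pig code) of reading newline-terminated records from one slice. */
final class SliceReader {
    private SliceReader() {}

    /** Reads the records owned by the slice [offset, end] of data. */
    static List<String> readSlice(byte[] data, long offset, long end) {
        int pos = (int) offset;
        if (offset != 0) {
            // Skip the first record, which may be partial; the previous slice reads it.
            pos = nextRecordStart(data, pos);
        }
        List<String> records = new ArrayList<>();
        // Keep reading while the record starts at or before the ending offset,
        // so the last record read is the one that ends past the ending offset.
        while (pos <= end && pos < data.length) {
            int next = nextRecordStart(data, pos);
            // Exclude the trailing newline, if present, from the record text.
            int recordEnd = (next > pos && data[next - 1] == '\n') ? next - 1 : next;
            records.add(new String(data, pos, recordEnd - pos, StandardCharsets.UTF_8));
            pos = next;
        }
        return records;
    }

    /** Returns the index just past the next '\n' at or after from, or data.length if none. */
    static int nextRecordStart(byte[] data, int from) {
        int i = from;
        while (i < data.length && data[i] != '\n') {
            i++;
        }
        return Math.min(i + 1, data.length);
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbb\ncc\n".getBytes(StandardCharsets.UTF_8);
        // Two slices that split the file in the middle of the record "bb".
        System.out.println(readSlice(data, 0, 4)); // [aa, bb]
        System.out.println(readSlice(data, 4, 9)); // [cc]
    }
}
```

With this policy each record is read by exactly one slice, whether the slice boundary falls inside a record or exactly on a record boundary.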
public DataBag bytesToBag(byte[] b) throws IOException
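Description copied from interface: LoadFunc
Cast data from bytes to bag value.
Specified by:
bytesToBag in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public String bytesToCharArray(byte[] b) throws IOException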
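Description copied from interface: LoadFunc
Cast data from bytes to chararray value.
Specified by:
bytesToCharArray in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Double bytesToDouble(byte[] b) throws IOException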
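Description copied from interface: LoadFunc
Cast data from bytes to double value.
Specified by:
bytesToDouble in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Float bytesToFloat(byte[] b) throws IOException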
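Description copied from interface: LoadFunc
Cast data from bytes to float value.
Specified by:
bytesToFloat in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Integer bytesToInteger(byte[] b) throws IOException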
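Description copied from interface: LoadFunc
Cast data from bytes to integer value.
Specified by:
bytesToInteger in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Long bytesToLong(byte[] b) throws IOException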
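Description copied from interface: LoadFunc
Cast data from bytes to long value.
Specified by:
bytesToLong in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Map<String,Object> bytesToMap(byte[] b) throws IOException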
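Description copied from interface: LoadFunc
Cast data from bytes to map value.
Specified by:
bytesToMap in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Tuple bytesToTuple(byte[] b) throws IOException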
Description copied from interface: LoadFunc
Cast data from bytes to tuple value.
Specified by:
bytesToTuple in interface LoadFunc
Parameters:
b - byte array to be cast.
Throws:
IOException - if the value cannot be cast.
public Schema determineSchema(String fileName, ExecType execType, DataStorage storage) throws IOException
Description copied from interface: LoadFunc
Find the schema from the loader.
Specified by:
determineSchema in interface LoadFunc
Parameters:
fileName - name of the file to be read (this will be the same as the filename in the "load" statement of the script).
execType - execution mode of the Pig script, one of ExecType.LOCAL or ExecType.MAPREDUCE.
storage - the DataStorage object corresponding to the execType.
Throws:
IOException
public LoadFunc.RequiredFieldResponse fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList) throws FrontendException
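Description copied from interface: LoadFunc
Indicate to the loader fields that will be needed.
Specified by:
fieldsToRead in interface LoadFunc
Parameters:
requiredFieldList - RequiredFieldList indicating which columns will be needed.
Throws:
FrontendException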
public Tuple getNext() throws IOException
Description copied from interface: LoadFunc
Retrieves the next tuple to be processed.
Specified by:
getNext in interface LoadFunc
Throws:
IOException
public void close() throws IOException
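Description copied from interface: IndexableLoadFunc
A method called by the Pig runtime to give implementations an opportunity to perform cleanup actions, such as closing the underlying input stream.
Specified by:
close in interface IndexableLoadFunc
Throws:
IOException - if the loadfunc is unable to perform its close actions.
public void initialize(org.apache.hadoop.conf.Configuration conf) throws IOException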
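Description copied from interface: IndexableLoadFunc
This method is called by the Pig runtime to allow the IndexableLoadFunc to perform any initialization actions.
Specified by:
initialize in interface IndexableLoadFunc
Parameters:
conf - the job configuration object.
Throws:
IOException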
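Read together, the descriptions of initialize, seekNear, getNext and close suggest a call order of initialize, then seekNear, then repeated getNext calls, then close. The sketch below illustrates that order with a hypothetical in-memory stand-in. SortedKeyLoader is not a Pig class, it uses plain integers instead of Tuples, and it takes no Configuration. Treating "near" as "the first key greater than or equal to the requested key" is an assumption made for the example only.

```java
import java.util.List;

/** Hypothetical in-memory stand-in for an indexable loader; not part of Pig. */
final class SortedKeyLoader {
    private final List<Integer> sortedKeys; // assumed to be sorted in ascending order
    private int position;
    private boolean open;

    SortedKeyLoader(List<Integer> sortedKeys) {
        this.sortedKeys = sortedKeys;
    }

    /** Prepares the loader for use; must be called before seekNear or getNext. */
    void initialize() {
        position = 0;
        open = true;
    }

    /** Positions at the first key >= target (the assumed meaning of "near"), by binary search. */
    void seekNear(int target) {
        requireOpen();
        int lo = 0;
        int hi = sortedKeys.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedKeys.get(mid) < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        position = lo;
    }

    /** Returns the next key, or null when the input is exhausted. */
    Integer getNext() {
        requireOpen();
        return position < sortedKeys.size() ? sortedKeys.get(position++) : null;
    }

    /** Releases the loader; later calls to seekNear or getNext fail. */
    void close() {
        open = false;
    }

    private void requireOpen() {
        if (!open) {
            throw new IllegalStateException("loader is not initialized or is already closed");
        }
    }
}
```

A caller would invoke initialize() once, seekNear(key) to position the loader, getNext() until it returns null or the keys move past the range of interest, and finally close().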