java.lang.Object
  org.apache.pig.builtin.Utf8StorageConverter
    org.apache.pig.builtin.BinaryStorage
public class BinaryStorage extends Utf8StorageConverter implements LoadFunc, StoreFunc
BinaryStorage is a simple, as-is serializer/deserializer pair.
It is a LoadFunc which loads all the given data from the given InputStream into a single Tuple, and a StoreFunc which writes out all input data as a single Tuple.
BinaryStorage is intended for cases where input files are to be sent whole for processing, without any splitting or interpretation of their data.
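To make the "as-is" behavior concrete, here is a minimal sketch of reading an entire stream into one opaque value through a scratch buffer of configurable size. It uses only the Java standard library; `WholeStreamSketch` and `readAll` are hypothetical names for illustration, not part of Pig, and the sketch does not claim to reproduce how BinaryStorage is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Pig: treats a whole stream as one opaque value.
final class WholeStreamSketch {

    // Reads every byte from 'in' through a scratch buffer of 'bufferSize' bytes
    // and returns the bytes unmodified: no splitting, no interpretation.
    static byte[] readAll(InputStream in, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                collected.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) {
        // Arbitrary binary content, including bytes that look like line endings.
        byte[] input = {0x00, 0x7f, (byte) 0xff, 0x0a, 0x0d};
        byte[] output = readAll(new ByteArrayInputStream(input), 2);
        System.out.println(Arrays.equals(input, output)); // prints "true"
    }
}
```

The buffer size only affects how many bytes are moved per read call; the resulting value is the same for any positive size.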
Field Summary | |
---|---|
protected int | bufferSize |
protected long | end |
protected BufferedPositionedInputStream | in |
protected long | offset |
Fields inherited from class org.apache.pig.builtin.Utf8StorageConverter |
---|
mBagFactory, mLog, mTupleFactory |
Constructor Summary | |
---|---|
BinaryStorage() | Create a BinaryStorage with the default buffer size for reading inputs. |
BinaryStorage(int bufferSize) | Create a BinaryStorage with the given buffer size for reading inputs. |
Method Summary | |
---|---|
void | bindTo(OutputStream out) - Specifies the OutputStream to write to. |
void | bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end) - Specifies a portion of an InputStream from which to read tuples. |
Schema | determineSchema(String fileName, ExecType execType, DataStorage storage) - Find the schema from the loader. |
boolean | equals(Object obj) |
void | fieldsToRead(Schema schema) - Indicate to the loader which fields will be needed. |
void | finish() - Do any post-processing needed once the last tuple has been stored. |
Tuple | getNext() - Retrieves the next tuple to be processed. |
Class | getStorePreparationClass() - Specify a backend-specific class to use to prepare for storing output. |
void | putNext(Tuple f) - Write a tuple to the output stream to which this instance was previously bound. |
String | toString() |
Methods inherited from class org.apache.pig.builtin.Utf8StorageConverter |
---|
bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface org.apache.pig.LoadFunc |
---|
bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple |
Field Detail |
---|
protected int bufferSize
protected BufferedPositionedInputStream in
protected long offset
protected long end
Constructor Detail |
---|
public BinaryStorage()
Create a BinaryStorage with the default buffer size for reading inputs.
public BinaryStorage(int bufferSize)
Create a BinaryStorage with the given buffer size for reading inputs.
Parameters:
bufferSize - buffer size to be used
Method Detail |
---|
public void bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end) throws IOException
Description copied from interface: LoadFunc
Specifies a portion of an InputStream from which to read tuples.
A common way of handling slices in the middle of records is to start at the given offset and, if the offset is not zero, skip to the end of the first record (which may be a partial record) before reading tuples. Reading continues until a tuple has been read that ends at an offset past the ending offset.
The load function should not do any buffering on the input stream. Buffering will cause the offsets returned by is.getPos() to be unreliable.
Specified by:
bindTo in interface LoadFunc
Parameters:
fileName - the name of the file to be read
in - the stream representing the file to be processed, and which can also provide its position.
offset - the offset at which to start reading tuples.
end - the ending offset for reading.
Throws:
IOException
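The slice-handling approach described above can be sketched as follows. This is an illustration under stated assumptions, not Pig's implementation: records are assumed to be newline-delimited, the "file" is an in-memory byte array rather than a positioned stream, and `SliceReaderSketch` and `readSlice` are hypothetical names. The sketch uses one possible boundary convention: a slice with a non-zero offset skips past the first newline at or after its offset, and a slice keeps reading as long as a record starts at or before its ending offset, so every record is read by exactly one slice.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of Pig: reads newline-delimited records from a slice.
final class SliceReaderSketch {

    // Returns the records owned by the slice [offset, end] of 'data'.
    static List<String> readSlice(byte[] data, long offset, long end) {
        int pos = (int) offset;
        if (offset != 0) {
            // Skip the (possibly partial) first record; the previous slice reads it.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the delimiter
        }
        List<String> records = new ArrayList<>();
        // Read whole records while they start at or before 'end'; the last one
        // read may therefore finish past the ending offset.
        while (pos < data.length && pos <= end) {
            int start = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, start, pos - start, StandardCharsets.UTF_8));
            pos++; // step past the delimiter
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbbb\ncc\ndddd\n".getBytes(StandardCharsets.UTF_8);
        // Slice boundaries fall in the middle of records on purpose.
        System.out.println(readSlice(data, 0, 5));  // prints "[aa, bbb]"
        System.out.println(readSlice(data, 5, 7));  // prints "[cc]"
        System.out.println(readSlice(data, 7, 15)); // prints "[dddd]"
    }
}
```

Because the position is tracked directly on the underlying data, there is no hidden read-ahead; a buffering reader would consume bytes beyond the record boundary and make the reported position unreliable, which is the concern raised above.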
public Tuple getNext() throws IOException
Description copied from interface: LoadFunc
Retrieves the next tuple to be processed.
Specified by:
getNext in interface LoadFunc
Throws:
IOException
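public void bindTo(OutputStream out) throws IOException
Description copied from interface: StoreFunc
Specifies the OutputStream to write to.
Specified by:
bindTo in interface StoreFunc
Parameters:
out - the stream to write tuples to.
Throws:
IOException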
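public void finish() throws IOException
Description copied from interface: StoreFunc
Do any post-processing needed once the last tuple has been stored.
Specified by:
finish in interface StoreFunc
Throws:
IOException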
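public void putNext(Tuple f) throws IOException
Description copied from interface: StoreFunc
Write a tuple to the output stream to which this instance was previously bound.
Specified by:
putNext in interface StoreFunc
Parameters:
f - the tuple to store.
Throws:
IOException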
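The three store-side methods (bindTo, putNext, finish) imply a call order: bind an output stream, write values to it, then finish. The sketch below illustrates that lifecycle with plain byte arrays standing in for tuples. `RawStoreSketch` and `writeAll` are hypothetical names, the class is not Pig's StoreFunc, and rejecting putNext before bindTo with an IllegalStateException is an assumption made for the example rather than documented behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a store function; not Pig's StoreFunc.
final class RawStoreSketch {
    private OutputStream out; // null until bindTo is called

    // Step 1: remember the stream that later writes go to.
    void bindTo(OutputStream out) {
        this.out = out;
    }

    // Step 2: write one value, as-is, to the previously bound stream.
    void putNext(byte[] value) throws IOException {
        if (out == null) {
            // Assumption for this sketch: writing before binding is an error.
            throw new IllegalStateException("bindTo must be called before putNext");
        }
        out.write(value);
    }

    // Step 3: post-processing after the last value; here, just a flush.
    void finish() throws IOException {
        if (out != null) {
            out.flush();
        }
    }

    // Convenience driver that runs the full lifecycle against an in-memory sink.
    static byte[] writeAll(byte[]... values) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        RawStoreSketch store = new RawStoreSketch();
        try {
            store.bindTo(sink);
            for (byte[] value : values) {
                store.putNext(value);
            }
            store.finish();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] written = writeAll(new byte[] {1, 2}, new byte[] {3});
        System.out.println(written.length); // prints "3"
    }
}
```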
public String toString()
Overrides:
toString in class Object
public boolean equals(Object obj)
Overrides:
equals in class Object
public Schema determineSchema(String fileName, ExecType execType, DataStorage storage) throws IOException
Description copied from interface: LoadFunc
Find the schema from the loader.
Specified by:
determineSchema in interface LoadFunc
Parameters:
fileName - name of the file to be read (this will be the same as the filename in the "load" statement of the script)
execType - execution mode of the Pig script: one of ExecType.LOCAL or ExecType.MAPREDUCE
storage - the DataStorage object corresponding to the execType
Throws:
IOException
public void fieldsToRead(Schema schema)
Description copied from interface: LoadFunc
Indicate to the loader which fields will be needed.
Specified by:
fieldsToRead in interface LoadFunc
Parameters:
schema - Schema indicating which columns will be needed.
public Class getStorePreparationClass() throws IOException
Description copied from interface: StoreFunc
Specify a backend-specific class to use to prepare for storing output.
See also: PigOutputFormat.getRecordWriter(org.apache.hadoop.fs.FileSystem, org.apache.hadoop.mapred.JobConf, String, org.apache.hadoop.util.Progressable)
Specified by:
getStorePreparationClass in interface StoreFunc
Returns:
If the StoreFunc implementation does not have a class to prepare for storing output, it can return null and a default Pig implementation will be used to prepare for storing output.
Throws:
IOException - if the class does not implement the expected interface(s).