java.lang.Object
  org.apache.pig.builtin.BinStorage
public class BinStorage
Field Summary | |
---|---|
protected long | end |
protected BufferedPositionedInputStream | in |
static int | RECORD_1 |
static int | RECORD_2 |
static int | RECORD_3 |
Constructor Summary | |
---|---|
BinStorage()
Simple binary nested reader format |
Method Summary | |
---|---|
void | bindTo(OutputStream os) - Specifies the OutputStream to write to. |
void | bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end) - Specifies a portion of an InputStream to read tuples. |
DataBag | bytesToBag(byte[] b) - Cast data from bytes to bag value. |
String | bytesToCharArray(byte[] b) - Cast data from bytes to chararray value. |
Double | bytesToDouble(byte[] b) - Cast data from bytes to double value. |
Float | bytesToFloat(byte[] b) - Cast data from bytes to float value. |
Integer | bytesToInteger(byte[] b) - Cast data from bytes to integer value. |
Long | bytesToLong(byte[] b) - Cast data from bytes to long value. |
Map<String,Object> | bytesToMap(byte[] b) - Cast data from bytes to map value. |
Tuple | bytesToTuple(byte[] b) - Cast data from bytes to tuple value. |
Schema | determineSchema(String fileName, ExecType execType, DataStorage storage) - Find the schema from the loader. |
boolean | equals(Object obj) |
void | fieldsToRead(Schema schema) - Indicate to the loader fields that will be needed. |
void | finish() - Do any kind of post processing because the last tuple has been stored. |
Tuple | getNext() - Retrieves the next tuple to be processed. |
long | getPosition() - Get the current position in the stream. |
Tuple | getSampledTuple() - Get the next tuple from the stream starting from the current read position. |
Class | getStorePreparationClass() - Specify a backend specific class to use to prepare for storing output. |
void | putNext(Tuple t) - Write a tuple to the output stream to which this instance was previously bound. |
long | skip(long n) - Skip ahead in the input stream. |
byte[] | toBytes(DataBag bag) |
byte[] | toBytes(Double d) |
byte[] | toBytes(Float f) |
byte[] | toBytes(Integer i) |
byte[] | toBytes(Long l) |
byte[] | toBytes(Map<String,Object> m) |
byte[] | toBytes(String s) |
byte[] | toBytes(Tuple t) |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int RECORD_1
public static final int RECORD_2
public static final int RECORD_3
protected BufferedPositionedInputStream in
protected long end
Constructor Detail |
---|
public BinStorage()
Method Detail |
---|
public long getPosition() throws IOException
SamplableLoader
getPosition
in interface SamplableLoader
IOException
public long skip(long n) throws IOException
SamplableLoader
skip
in interface SamplableLoader
n
- number of bytes to skip
InputStream
IOException
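The reference to InputStream above suggests, though this page does not confirm it, that skip follows the java.io.InputStream.skip(long) contract, under which a single call may skip fewer bytes than requested. A minimal sketch under that assumption; SkipSketch and skipFully are hypothetical names, not part of BinStorage:

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper; assumes InputStream.skip semantics, not BinStorage internals. */
class SkipSketch {

    /**
     * Tries to skip n bytes, repeating the call because one skip() may
     * skip fewer bytes than requested. Returns the bytes actually skipped.
     */
    static long skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                break; // end of stream reached, or no progress possible
            }
            remaining -= skipped;
        }
        return n - remaining;
    }
}
```

Callers that need an exact position can compare the returned count with the requested one, or check getPosition() afterwards.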
public Tuple getNext() throws IOException
LoadFunc
getNext
in interface LoadFunc
IOException
public void bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end) throws IOException
LoadFunc
A common way of handling slices in the middle of records is to start at the given offset and, if the offset is not zero, skip to the end of the first record (which may be a partial record) before reading tuples. Reading continues until a tuple has been read that ends at an offset past the ending offset.
The load function should not do any buffering on the input stream. Buffering will cause the offsets returned by is.getPos() to be unreliable.
bindTo
in interface LoadFunc
fileName
- the name of the file to be read
in
- the stream representing the file to be processed, and which can also provide its position.
offset
- the offset to start reading tuples.
end
- the ending offset for reading.
IOException
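The slice-handling convention described for bindTo can be sketched as follows. This is an illustration only: newline-delimited text records stand in for BinStorage's binary records, and SliceReaderSketch and readSlice are hypothetical names that do not appear in Pig.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: newline-delimited records, not BinStorage's real format. */
class SliceReaderSketch {

    /** Returns the records that the slice [offset, end) is responsible for. */
    static List<String> readSlice(byte[] data, long offset, long end) {
        List<String> records = new ArrayList<>();
        int pos = (int) offset;

        // A non-zero offset may land mid-record: skip to the end of that
        // (possibly partial) record, which the previous slice will read.
        if (offset != 0) {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step over the delimiter
        }

        // Keep reading until a record has been read that ends past `end`.
        while (pos < data.length && pos <= end) {
            int start = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, start, pos - start, StandardCharsets.US_ASCII));
            pos++; // step over the delimiter
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbbb\ncc\ndddd\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readSlice(data, 0, 5));   // [aa, bbb]
        System.out.println(readSlice(data, 5, 10));  // [cc, dddd]
        System.out.println(readSlice(data, 10, 15)); // []
    }
}
```

Because each slice skips its first (possibly partial) record and reads one record past its end offset, every record is produced by exactly one slice even when the boundaries fall in the middle of a record.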
public void bindTo(OutputStream os) throws IOException
StoreFunc
bindTo
in interface StoreFunc
os
- The stream to write tuples to.
IOException
public void finish() throws IOException
StoreFunc
finish
in interface StoreFunc
IOException
public void putNext(Tuple t) throws IOException
StoreFunc
putNext
in interface StoreFunc
t
- the tuple to store.
IOException
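The three StoreFunc methods above imply a call order: bindTo(OutputStream) first, then putNext once per tuple, then finish after the last tuple. A minimal sketch of that lifecycle, assuming a hypothetical length-prefixed string format (this is not BinStorage's actual encoding, and StoreLifecycleSketch is not a Pig class):

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical store; strings stand in for Pig tuples. */
class StoreLifecycleSketch {
    private DataOutputStream out;

    /** Remembers the stream that later putNext calls write to. */
    void bindTo(OutputStream os) {
        out = new DataOutputStream(os);
    }

    /** Writes one record as a 4-byte big-endian length followed by UTF-8 bytes. */
    void putNext(String record) throws IOException {
        if (out == null) {
            throw new IllegalStateException("bindTo must be called before putNext");
        }
        byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /** Post-processing after the last record; here it only flushes. */
    void finish() throws IOException {
        out.flush();
    }
}
```

Calling putNext before bindTo fails fast in this sketch; the page does not state how BinStorage itself behaves in that case.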
public DataBag bytesToBag(byte[] b)
LoadFunc
bytesToBag
in interface LoadFunc
b
- byte array to be cast.
public String bytesToCharArray(byte[] b)
LoadFunc
bytesToCharArray
in interface LoadFunc
b
- byte array to be cast.
public Double bytesToDouble(byte[] b)
LoadFunc
bytesToDouble
in interface LoadFunc
b
- byte array to be cast.
public Float bytesToFloat(byte[] b)
LoadFunc
bytesToFloat
in interface LoadFunc
b
- byte array to be cast.
public Integer bytesToInteger(byte[] b)
LoadFunc
bytesToInteger
in interface LoadFunc
b
- byte array to be cast.
public Long bytesToLong(byte[] b)
LoadFunc
bytesToLong
in interface LoadFunc
b
- byte array to be cast.
public Map<String,Object> bytesToMap(byte[] b)
LoadFunc
bytesToMap
in interface LoadFunc
b
- byte array to be cast.
public Tuple bytesToTuple(byte[] b)
LoadFunc
bytesToTuple
in interface LoadFunc
b
- byte array to be cast.
public Schema determineSchema(String fileName, ExecType execType, DataStorage storage) throws IOException
LoadFunc
determineSchema
in interface LoadFunc
fileName
- Name of the file to be read (this will be the same as the filename in the "load" statement of the script).
execType
- execution mode of the Pig script: one of ExecType.LOCAL or ExecType.MAPREDUCE
storage
- the DataStorage object corresponding to the execType
IOException
public void fieldsToRead(Schema schema)
LoadFunc
fieldsToRead
in interface LoadFunc
schema
- Schema indicating which columns will be needed.
public byte[] toBytes(DataBag bag) throws IOException
IOException
public byte[] toBytes(String s) throws IOException
IOException
public byte[] toBytes(Double d) throws IOException
IOException
public byte[] toBytes(Float f) throws IOException
IOException
public byte[] toBytes(Integer i) throws IOException
IOException
public byte[] toBytes(Long l) throws IOException
IOException
public byte[] toBytes(Map<String,Object> m) throws IOException
IOException
public byte[] toBytes(Tuple t) throws IOException
IOException
public boolean equals(Object obj)
equals
in class Object
public Class getStorePreparationClass() throws IOException
StoreFunc
PigOutputFormat.getRecordWriter(org.apache.hadoop.fs.FileSystem, org.apache.hadoop.mapred.JobConf, String, org.apache.hadoop.util.Progressable)
getStorePreparationClass
in interface StoreFunc
If the StoreFunc implementation does not have a class to prepare for storing output, it can return null and a default Pig implementation will be used to prepare for storing output.
IOException
- if the class does not implement the expected interface(s).
public Tuple getSampledTuple() throws IOException
SamplableLoader
getSampledTuple
in interface SamplableLoader
IOException