java.lang.Object
org.apache.pig.builtin.Utf8StorageConverter
org.apache.pig.builtin.PigStorage
public class PigStorage
A load function that splits each line of input into fields at a delimiter. The delimiter is given as a regular expression. See String.split(delimiter) and http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html for more information.
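The following is a minimal sketch of delimiter-based line splitting with String.split, not PigStorage's actual implementation. The class name DelimiterSplitSketch and the helpers splitRegex and splitLiteral are hypothetical and exist only for illustration. Because String.split interprets its argument as a regular expression, a delimiter such as "|" has to be quoted when it is meant literally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class DelimiterSplitSketch {

    // Hypothetical helper: treats the delimiter as a regular expression,
    // as String.split does. The limit of -1 keeps trailing empty fields.
    static List<String> splitRegex(String line, String delimiter) {
        return Arrays.asList(line.split(delimiter, -1));
    }

    // Hypothetical helper: treats the delimiter as literal text by quoting
    // it first, so regex metacharacters such as '|' lose their special meaning.
    static List<String> splitLiteral(String line, String delimiter) {
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        // Tab-separated line with an empty third field.
        System.out.println(splitRegex("a\tb\t\tc", "\t"));
        // Pipe-separated line; the pipe must be quoted to be taken literally.
        System.out.println(splitLiteral("a|b|c", "|"));
    }
}
```

Passing -1 as the limit is a choice made for this sketch so that a line ending in a delimiter still yields a final empty field; with the default limit, String.split drops trailing empty strings.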
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface org.apache.pig.LoadFunc |
---|
LoadFunc.RequiredField, LoadFunc.RequiredFieldList, LoadFunc.RequiredFieldResponse |
Field Summary | |
---|---|
protected PigLineRecordReader |
in
|
protected org.apache.commons.logging.Log |
mLog
|
Fields inherited from class org.apache.pig.builtin.Utf8StorageConverter |
---|
mBagFactory, mTupleFactory |
Constructor Summary | |
---|---|
PigStorage()
|
|
PigStorage(String delimiter)
Constructs a Pig loader that uses specified regex as a field delimiter. |
Method Summary | |
---|---|
void |
bindTo(OutputStream os)
Specifies the OutputStream to write to. |
void |
bindTo(String fileName,
BufferedPositionedInputStream in,
long offset,
long end)
Specifies a portion of an InputStream to read tuples. |
Schema |
determineSchema(String fileName,
ExecType execType,
DataStorage storage)
Find the schema from the loader. |
boolean |
equals(Object obj)
|
boolean |
equals(PigStorage other)
|
LoadFunc.RequiredFieldResponse |
fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList)
Indicate to the loader fields that will be needed. |
void |
finish()
Do any kind of post processing because the last tuple has been stored. |
Tuple |
getNext()
Retrieves the next tuple to be processed. |
long |
getPosition()
Get the current position in the stream. |
Tuple |
getSampledTuple()
Get the next tuple from the stream starting from the current read position. |
Class |
getStorePreparationClass()
Specify a backend specific class to use to prepare for storing output. |
int |
hashCode()
|
void |
putNext(Tuple f)
Write a tuple to the output stream to which this instance was previously bound. |
void |
setSignature(String signature)
|
long |
skip(long n)
Skip ahead in the input stream. |
Methods inherited from class org.apache.pig.builtin.Utf8StorageConverter |
---|
bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.pig.LoadFunc |
---|
bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple |
Field Detail |
---|
protected PigLineRecordReader in
protected final org.apache.commons.logging.Log mLog
Constructor Detail |
---|
public PigStorage()
public PigStorage(String delimiter)
delimiter
- the single byte character that is used to separate fields. ("\t" is the default.)
Method Detail |
---|
public long getPosition() throws IOException
SamplableLoader
getPosition
in interface SamplableLoader
IOException
public long skip(long n) throws IOException
SamplableLoader
skip
in interface SamplableLoader
n
- number of bytes to skip
InputStream
IOException
public Tuple getNext() throws IOException
LoadFunc
getNext
in interface LoadFunc
IOException
public Tuple getSampledTuple() throws IOException
SamplableLoader
getSampledTuple
in interface SamplableLoader
IOException
public void bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end) throws IOException
LoadFunc
A common way of handling slices in the middle of records is to start at the given offset and, if the offset is not zero, skip to the end of the first record (which may be a partial record) before reading tuples. Reading continues until a tuple has been read that ends at an offset past the ending offset.
The load function should not do any buffering on the input stream. Buffering will cause the offsets returned by is.getPos() to be unreliable.
bindTo
in interface LoadFunc
fileName
- the name of the file to be read
in
- the stream representing the file to be processed, and which can also provide its position.
offset
- the offset to start reading tuples.
end
- the ending offset for reading.
IOException
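The slice handling described for bindTo(String fileName, BufferedPositionedInputStream in, long offset, long end) can be sketched as follows. This is an illustration under stated assumptions, not Pig's code: it reads from an in-memory byte array rather than a BufferedPositionedInputStream, assumes newline-terminated records, and the names SliceReaderSketch and readSlice are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SliceReaderSketch {

    // Hypothetical helper: returns the records owned by the slice [offset, end].
    static List<String> readSlice(byte[] data, long offset, long end) {
        int pos = (int) offset;
        if (offset != 0) {
            // Skip to the end of the first (possibly partial) record;
            // the reader of the preceding slice is responsible for it.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step over the newline
        }
        List<String> records = new ArrayList<>();
        // Keep reading until a record has been read that ends past 'end'.
        while (pos <= end && pos < data.length) {
            int start = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, start, pos - start, StandardCharsets.UTF_8));
            pos++; // step over the newline
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbbb\ncc\ndddd\n".getBytes(StandardCharsets.UTF_8);
        // Three adjacent slices; each record is read by exactly one of them.
        System.out.println(readSlice(data, 0, 5));
        System.out.println(readSlice(data, 5, 10));
        System.out.println(readSlice(data, 10, 15));
    }
}
```

A record that begins exactly at the ending offset is read by the earlier slice, because the reader of the next slice always skips the first record it encounters when its offset is not zero.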
public void bindTo(OutputStream os) throws IOException
StoreFunc
bindTo
in interface StoreFunc
os
- The stream to write tuples to.
IOException
public void putNext(Tuple f) throws IOException
StoreFunc
putNext
in interface StoreFunc
f
- the tuple to store.
IOException
public void finish() throws IOException
StoreFunc
finish
in interface StoreFunc
IOException
public Schema determineSchema(String fileName, ExecType execType, DataStorage storage) throws IOException
LoadFunc
determineSchema
in interface LoadFunc
fileName
- Name of the file to be read (this will be the same as the filename in the "load" statement of the script).
execType
- execution mode of the Pig script, one of ExecType.LOCAL or ExecType.MAPREDUCE
storage
- the DataStorage object corresponding to the execType
IOException
public LoadFunc.RequiredFieldResponse fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList) throws FrontendException
LoadFunc
fieldsToRead
in interface LoadFunc
requiredFieldList
- RequiredFieldList indicating which columns will be needed.
FrontendException
public boolean equals(Object obj)
equals
in class Object
public boolean equals(PigStorage other)
public int hashCode()
hashCode
in class Object
public Class getStorePreparationClass() throws IOException
StoreFunc
PigOutputFormat.getRecordWriter(org.apache.hadoop.fs.FileSystem, org.apache.hadoop.mapred.JobConf, String, org.apache.hadoop.util.Progressable)
getStorePreparationClass
in interface StoreFunc
If the StoreFunc implementation does not have a class to prepare for storing output, it can return null and a default Pig implementation will be used to prepare for storing output.
IOException
- if the class does not implement the expected interface(s).
public void setSignature(String signature)