java.lang.Object
  org.apache.pig.builtin.Utf8StorageConverter
      org.apache.pig.backend.hadoop.hbase.HBaseStorage
public class HBaseStorage
A Slicer that splits the HBase table into HBaseSlices.
It also acts as a load function, but one that performs no loading itself; the actual
load operations are done in HBaseSlice.
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface org.apache.pig.LoadFunc |
---|
LoadFunc.RequiredField, LoadFunc.RequiredFieldList, LoadFunc.RequiredFieldResponse |
Field Summary |
---|
Fields inherited from class org.apache.pig.builtin.Utf8StorageConverter |
---|
mBagFactory, mLog, mTupleFactory |
Constructor Summary | |
---|---|
HBaseStorage(String columnList)
Constructor. |
Method Summary | |
---|---|
void |
bindTo(String fileName,
BufferedPositionedInputStream is,
long offset,
long end)
Specifies a portion of an InputStream to read tuples. |
Schema |
determineSchema(String fileName,
ExecType execType,
DataStorage storage)
Find the schema from the loader. |
LoadFunc.RequiredFieldResponse |
fieldsToRead(LoadFunc.RequiredFieldList requiredFields)
Indicate to the loader fields that will be needed. |
Tuple |
getNext()
Retrieves the next tuple to be processed. |
Slice[] |
slice(DataStorage store,
String tablename)
Creates slices of data from store at location. |
void |
validate(DataStorage store,
String tablename)
Checks that location is parsable by this Slicer, and that
if the DataStorage is used by the Slicer, it's readable from there. |
Methods inherited from class org.apache.pig.builtin.Utf8StorageConverter |
---|
bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes, toBytes |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.pig.LoadFunc |
---|
bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple |
Constructor Detail |
---|
public HBaseStorage(String columnList)
Parameters:
columnList - the list of columns, given as a space-delimited string.
Method Detail |
---|
public Slice[] slice(DataStorage store, String tablename) throws IOException
Description copied from interface: Slicer
Creates slices of data from store at location.
Specified by:
slice in interface Slicer
Throws:
IOException
public void validate(DataStorage store, String tablename) throws IOException
Description copied from interface: Slicer
Checks that location is parsable by this Slicer, and that
if the DataStorage is used by the Slicer, it's readable from there. If it
isn't, an IOException with a message explaining why will be thrown.
This does not ensure that all the data in location is
valid. It's a preflight check that there's some chance of the Slicer
working before actual Slices are created and sent off for processing.
Specified by:
validate in interface Slicer
Throws:
IOException
public void bindTo(String fileName, BufferedPositionedInputStream is, long offset, long end) throws IOException
Description copied from interface: LoadFunc
Specifies a portion of an InputStream to read tuples.
A common way of handling slices in the middle of records is to start at the given offset and, if the offset is not zero, skip to the end of the first record (which may be a partial record) before reading tuples. Reading continues until a tuple has been read that ends at an offset past the ending offset.
The load function should not do any buffering on the input stream. Buffering will cause the offsets returned by is.getPos() to be unreliable.
Specified by:
bindTo in interface LoadFunc
Parameters:
fileName - the name of the file to be read.
is - the stream representing the file to be processed, which can also provide its position.
offset - the offset at which to start reading tuples.
end - the ending offset for reading.
Throws:
IOException
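
The slice-boundary convention described for bindTo can be illustrated with a minimal, self-contained sketch. It is not the HBaseStorage or LoadFunc implementation: the class name SliceReaderSketch, the method readSlice, and the use of newline-terminated records in a byte array (in place of a BufferedPositionedInputStream) are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Pig API.
public class SliceReaderSketch {

    // Reads the newline-terminated records that belong to the slice [offset, end].
    static List<String> readSlice(byte[] data, long offset, long end) {
        int pos = (int) offset;

        // A non-zero offset may fall in the middle of a record, so skip to the
        // end of that (possibly partial) record. The previous slice reads it.
        if (offset != 0) {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step over the newline itself
        }

        List<String> records = new ArrayList<>();
        // Keep reading until a record has been read that ends past 'end'.
        while (pos <= end && pos < data.length) {
            int start = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, start, pos - start, StandardCharsets.UTF_8));
            pos++; // step over the newline itself
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbb\nccc\n".getBytes(StandardCharsets.UTF_8);
        // The boundary at byte 6 falls inside "bbb": the first slice finishes
        // that record, and the second slice skips what is left of it.
        System.out.println(readSlice(data, 0, 6));           // prints [aaa, bbb]
        System.out.println(readSlice(data, 6, data.length)); // prints [ccc]
    }
}
```

Because every slice applies the same two rules, each record is read exactly once regardless of where the boundaries fall. The sketch tracks the position itself rather than reading through a buffered wrapper, which mirrors the warning above about buffering making is.getPos() unreliable.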
public Schema determineSchema(String fileName, ExecType execType, DataStorage storage) throws IOException
Description copied from interface: LoadFunc
Find the schema from the loader.
Specified by:
determineSchema in interface LoadFunc
Parameters:
fileName - name of the file to be read (this will be the same as the filename in the "load" statement of the script).
execType - execution mode of the Pig script, one of ExecType.LOCAL or ExecType.MAPREDUCE.
storage - the DataStorage object corresponding to the execType.
Throws:
IOException
public LoadFunc.RequiredFieldResponse fieldsToRead(LoadFunc.RequiredFieldList requiredFields) throws FrontendException
Description copied from interface: LoadFunc
Indicate to the loader fields that will be needed.
Specified by:
fieldsToRead in interface LoadFunc
Parameters:
requiredFields - RequiredFieldList indicating which columns will be needed.
Throws:
FrontendException
public Tuple getNext() throws IOException
Description copied from interface: LoadFunc
Retrieves the next tuple to be processed.
Specified by:
getNext in interface LoadFunc
Throws:
IOException