Interface | Description |
---|---|
Batch.Call<T,R> | Defines a unit of work to be executed. |
Batch.Callback<R> | Defines a generic callback to be triggered for each Batch.Call.call(Object) result. |

Class | Description |
---|---|
AggregationClient | This client class is for invoking the aggregate functions deployed on the Region Server side via the AggregateProtocol. |
Batch | A collection of interfaces and utilities used for interacting with custom RPC interfaces exposed by Coprocessors. |
BigDecimalColumnInterpreter | ColumnInterpreter for doing aggregations with BigDecimal columns. |
Exec | Represents an arbitrary method invocation against a Coprocessor instance. |
ExecResult | Represents the return value from an Exec invocation. |
LongColumnInterpreter | A concrete column interpreter implementation. |

The coprocessor framework provides a way for custom code to run in place on the HBase region servers with each of a table's regions. These client classes enable applications to communicate with coprocessor instances via custom RPC protocols.

In order to provide a custom RPC protocol to clients, a coprocessor implementation defines an interface that extends CoprocessorProtocol. The interface can define any methods that the coprocessor wishes to expose. Using this protocol, you can communicate with the coprocessor instances via the HTable.coprocessorProxy(Class, byte[]) and HTable.coprocessorExec(Class, byte[], byte[], org.apache.hadoop.hbase.client.coprocessor.Batch.Call, org.apache.hadoop.hbase.client.coprocessor.Batch.Callback) methods.

Since CoprocessorProtocol instances are associated with individual regions within the table, the client RPC calls must ultimately identify which regions should be used in the CoprocessorProtocol method invocations. Since regions are seldom handled directly in client code and the region names may change over time, the coprocessor RPC calls use row keys to identify which regions should be used for the method invocations. Clients can call CoprocessorProtocol methods against either:

- a single region, by calling HTable.coprocessorProxy(Class, byte[]) with a single row key. This returns a dynamic proxy of the CoprocessorProtocol interface which uses the region containing the given row key (even if the row does not exist) as the RPC endpoint.
- a range of regions, by calling HTable.coprocessorExec(Class, byte[], byte[], org.apache.hadoop.hbase.client.coprocessor.Batch.Call, org.apache.hadoop.hbase.client.coprocessor.Batch.Callback) with a starting row key and an ending row key. All regions in the table from the region containing the start row key to the region containing the end row key (inclusive) will be used as the RPC endpoints.

Note that the row keys passed as parameters to the HTable methods are not passed to the CoprocessorProtocol implementations. They are only used to identify the regions for endpoints of the remote calls.
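
To make the row-key-to-region mapping concrete, here is a minimal, self-contained sketch that does not use the HBase API. It assumes a toy table whose regions are identified by their start row key, and it uses String keys for readability where HBase uses byte[]. The class RegionLookupSketch and its methods are hypothetical names introduced only for illustration; this is not how HBase itself locates regions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model (an assumption, not HBase code): regions keyed by their start row key. */
class RegionLookupSketch {
    // Start key -> region name, sorted by start key. The first region starts at "".
    private final NavigableMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String regionName) {
        regionsByStartKey.put(startKey, regionName);
    }

    /** Region whose key range contains rowKey, whether or not that row exists. */
    String regionFor(String rowKey) {
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    /**
     * Regions from the one containing startRow to the one containing endRow,
     * inclusive. A null bound is treated as "unbounded" on that side.
     */
    List<String> regionsInRange(String startRow, String endRow) {
        String from = (startRow == null)
                ? regionsByStartKey.firstKey() : regionsByStartKey.floorKey(startRow);
        String to = (endRow == null)
                ? regionsByStartKey.lastKey() : regionsByStartKey.floorKey(endRow);
        return new ArrayList<>(regionsByStartKey.subMap(from, true, to, true).values());
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "region-1");
        table.addRegion("g", "region-2");
        table.addRegion("p", "region-3");

        // A single row key selects exactly one region.
        System.out.println(table.regionFor("hello"));
        // A start/end pair selects every region the range touches.
        System.out.println(table.regionsInRange("a", "h"));
        // null, null selects all regions.
        System.out.println(table.regionsInRange(null, null));
    }
}
```

Note that the row keys only pick the endpoints in this model; nothing in it forwards them to the code that runs against each region, which mirrors the behavior described above.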

The Batch class defines two interfaces used for CoprocessorProtocol invocations against multiple regions. Clients implement Batch.Call to call methods of the actual CoprocessorProtocol instance. The interface's call() method will be called once per selected region, passing the CoprocessorProtocol instance for the region as a parameter. Clients can optionally implement Batch.Callback to be notified of the results from each region invocation as they complete. The instance's Batch.Callback.update(byte[], byte[], Object) method will be called with the Batch.Call.call(Object) return value from each region.
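
The division of labor between the two interfaces can be illustrated with a small stand-alone simulation. Everything in it is an assumption made for illustration: Call, Callback, RowCounter and execOnRegions are local stand-ins rather than the HBase types, the callback takes only a region name and a result (the real update method also takes a second byte[] argument), region names are Strings, and the regions are visited sequentially in one thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone simulation of the Call/Callback pattern; not the HBase classes. */
class BatchCallSketch {
    /** Stand-in for Batch.Call: one unit of work against a region's protocol instance. */
    interface Call<T, R> {
        R call(T instance);
    }

    /** Simplified stand-in for Batch.Callback: notified once per region result. */
    interface Callback<R> {
        void update(String regionName, R result);
    }

    /** Hypothetical protocol exposed by each region. */
    interface RowCounter {
        long getRowCount();
    }

    /** Invokes call once per region, reports each result, and collects them by region. */
    static <T, R> Map<String, R> execOnRegions(Map<String, T> instancesByRegion,
                                               Call<T, R> call,
                                               Callback<R> callback) {
        Map<String, R> results = new LinkedHashMap<>();
        for (Map.Entry<String, T> entry : instancesByRegion.entrySet()) {
            R result = call.call(entry.getValue());
            results.put(entry.getKey(), result);
            if (callback != null) {
                callback.update(entry.getKey(), result); // the callback is optional
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, RowCounter> regions = new LinkedHashMap<>();
        regions.put("region-1", () -> 10L);
        regions.put("region-2", () -> 32L);

        Map<String, Long> results = execOnRegions(
                regions,
                counter -> counter.getRowCount(),
                (region, count) -> System.out.println(region + " -> " + count));
        System.out.println(results);
    }
}
```

The call object decides what to ask each region, while the callback only observes results, so the same call can be reused with or without a callback.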

To start with, let's use a fictitious coprocessor, RowCountCoprocessor, that counts the number of rows and key-values in each region where it is running. For clients to query this information, the coprocessor defines and implements the following CoprocessorProtocol extension interface:

    public interface RowCountProtocol extends CoprocessorProtocol {
      long getRowCount();
      long getRowCount(Filter filt);
      long getKeyValueCount();
    }

Now we need a way to access the results that RowCountCoprocessor is making available. If we want to find the row count for all regions, we could use:

    HTable table = new HTable("mytable");
    // find row count keyed by region name
    Map<byte[],Long> results = table.coprocessorExec(
        RowCountProtocol.class, // the protocol interface we're invoking
        null, null,             // start and end row keys
        new Batch.Call<RowCountProtocol,Long>() {
          public Long call(RowCountProtocol counter) {
            return counter.getRowCount();
          }
        });

This will return a java.util.Map of the counter.getRowCount() result for the RowCountCoprocessor instance running in each region of mytable, keyed by the region name.
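
Once such a map is in hand, a common next step is to combine the per-region values on the client. The sketch below is stand-alone and makes two assumptions: the class RegionResultsSketch is a hypothetical helper, and the sample map is built with an unsigned lexicographic byte[] comparator so that lookups by an equal-content key work (Java arrays do not define value equality, so a plain HashMap would not find them).

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical client-side helper for a map of per-region counts. */
class RegionResultsSketch {
    /** Orders byte arrays lexicographically, treating each byte as unsigned. */
    static final Comparator<byte[]> UNSIGNED_LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    /** Sums the per-region counts into a table-wide total. */
    static long total(Map<byte[], Long> perRegion) {
        long sum = 0;
        for (long count : perRegion.values()) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Sample data standing in for the map returned by the call above.
        Map<byte[], Long> results = new TreeMap<>(UNSIGNED_LEXICOGRAPHIC);
        results.put("region-a".getBytes(StandardCharsets.UTF_8), 10L);
        results.put("region-b".getBytes(StandardCharsets.UTF_8), 32L);

        for (Map.Entry<byte[], Long> entry : results.entrySet()) {
            String name = new String(entry.getKey(), StandardCharsets.UTF_8);
            System.out.println(name + ": " + entry.getValue());
        }
        System.out.println("total rows: " + total(results));
    }
}
```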

By implementing Batch.Call as an anonymous class, we can invoke RowCountProtocol methods directly against the Batch.Call.call(Object) method's argument. Calling HTable.coprocessorExec(Class, byte[], byte[], org.apache.hadoop.hbase.client.coprocessor.Batch.Call) will take care of invoking Batch.Call.call() against our anonymous class with the RowCountCoprocessor instance for each table region.

For this simple case, where we only want to obtain the result from a single CoprocessorProtocol method, there's also a bit of syntactic sugar we can use to cut down on the amount of code required:

    HTable table = new HTable("mytable");
    Batch.Call<RowCountProtocol,Long> call =
        Batch.forMethod(RowCountProtocol.class, "getRowCount");
    Map<byte[],Long> results =
        table.coprocessorExec(RowCountProtocol.class, null, null, call);

Batch.forMethod(Class, String, Object...) is a simple factory method that will return a Batch.Call instance that will call RowCountProtocol.getRowCount() for us using reflection.
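
For readers curious what a reflection-based factory of this kind might look like, here is a minimal stand-alone sketch. It is not the HBase implementation: ForMethodSketch, its Call interface and RowCounter are hypothetical stand-ins, and the method lookup simply uses the runtime classes of the arguments, which is an assumption that would not match primitive or interface-typed parameters.

```java
import java.lang.reflect.Method;

/** Illustrative reflection-based factory; a sketch, not the HBase Batch class. */
class ForMethodSketch {
    /** Stand-in for Batch.Call. */
    interface Call<T, R> {
        R call(T instance);
    }

    /** Hypothetical protocol interface. */
    interface RowCounter {
        long getRowCount();
    }

    /** Returns a Call that invokes the named method on whatever instance it is given. */
    static <T, R> Call<T, R> forMethod(Class<T> protocol, String methodName, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass(); // assumption: exact runtime types match
        }
        final Method method;
        try {
            method = protocol.getMethod(methodName, types);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
        return instance -> {
            try {
                @SuppressWarnings("unchecked")
                R result = (R) method.invoke(instance, args);
                return result;
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    public static void main(String[] args) {
        Call<RowCounter, Long> call = forMethod(RowCounter.class, "getRowCount");
        RowCounter counter = () -> 7L;
        System.out.println(call.call(counter));
    }
}
```

The trade-off of this style is that a misspelled method name or a wrong result type is only detected at run time, whereas an explicit Batch.Call implementation is checked by the compiler.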

However, if you want to perform additional processing on the results, implementing Batch.Call directly will provide more power and flexibility. For example, if you would like to combine row count and key-value count for each region:

    HTable table = new HTable("mytable");
    // combine row count and kv count for region
    Map<byte[],Pair<Long,Long>> results = table.coprocessorExec(
        RowCountProtocol.class,
        null, null,
        new Batch.Call<RowCountProtocol,Pair<Long,Long>>() {
          public Pair<Long,Long> call(RowCountProtocol counter) {
            return new Pair<Long,Long>(counter.getRowCount(),
                                       counter.getKeyValueCount());
          }
        });

Similarly, you could average the number of key-values per row for each region:

    Map<byte[],Double> results = table.coprocessorExec(
        RowCountProtocol.class,
        null, null,
        new Batch.Call<RowCountProtocol,Double>() {
          public Double call(RowCountProtocol counter) {
            return ((double)counter.getKeyValueCount()) /
                   ((double)counter.getRowCount());
          }
        });

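
One detail worth keeping in mind with the averaging example: for a region with no rows, both counts are zero, and in Java the floating-point division 0.0 / 0.0 evaluates to NaN rather than throwing. The small stand-alone helper below shows one way to guard against that; the class name KeyValuesPerRowSketch and the choice to report 0.0 for an empty region are assumptions for illustration, not part of the API described here.

```java
/** Hypothetical helper: average key-values per row with an empty-region guard. */
class KeyValuesPerRowSketch {
    /** Returns keyValues / rows, or 0.0 when the region has no rows. */
    static double average(long keyValues, long rows) {
        if (rows == 0) {
            return 0.0; // assumption: report 0.0 instead of NaN for an empty region
        }
        return (double) keyValues / (double) rows;
    }

    public static void main(String[] args) {
        System.out.println(average(30, 10));
        System.out.println(average(0, 0));
    }
}
```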
Copyright © 2014 The Apache Software Foundation. All Rights Reserved.