Newer releases of Apache HBase (>= 0.92) support optional access control list (ACL) based protection of resources on a column family and/or table basis.
This section describes how to set up Secure HBase for access control and includes an example of granting and revoking user permissions on table resources.
You must configure HBase for secure or simple user access operation. Refer to the Secure Client Access to HBase or Simple User Access to HBase sections and complete all of the steps described there.
For secure access, you must also configure ZooKeeper for secure operation. Changes to ACLs are synchronized throughout the cluster using ZooKeeper. Secure authentication to ZooKeeper must be enabled; otherwise it is possible to subvert HBase access control via direct client access to ZooKeeper. Refer to the section on secure ZooKeeper configuration and complete all of the steps described there.
With Secure RPC and Access Control enabled, client access to HBase is authenticated and user data is private unless access has been explicitly granted. Access to data can be granted on a per-table or per-column-family basis.
However, the following items have been left out of the initial implementation for simplicity:
Row-level or per-value (cell) access control: available from HBase 0.98 using tags in HFile V3.
Push down of file ownership to HDFS: HBase is not designed for the case where files may have different permissions than the HBase system principal. Pushing file ownership down into HDFS would necessitate changes to core code. Also, while HDFS file ownership would make applying quotas easy, and possibly make bulk imports more straightforward, it is not clear that it would offer a more secure setup.
HBase managed "roles" as collections of permissions: We will not model "roles" internally in HBase to begin with. We instead allow group names to be granted permissions, which allows external modeling of roles via group membership. Groups are created and manipulated externally to HBase, via the Hadoop group mapping service.
Access control mechanisms are mature and fairly standardized in the relational database world. The HBase implementation approximates current convention, but HBase has a simpler feature set than relational databases, especially in terms of client operations. We don't distinguish between an insert (new record) and update (of existing record), for example, as both collapse down into a Put. Accordingly, the important operations condense to four permissions: READ, WRITE, CREATE, and ADMIN.
Table 8.1. Operation To Permission Mapping
Permission | Operation |
---|---|
Read | Get |
Read | Exists |
Read | Scan |
Write | Put |
Write | Delete |
Write | Lock/UnlockRow |
Write | IncrementColumnValue |
Write | CheckAndDelete/Put |
Create | Create |
Create | Alter |
Create | Drop |
Create | Bulk Load |
Admin | Enable/Disable |
Admin | Snapshot/Restore/Clone |
Admin | Split |
Admin | Flush |
Admin | Compact |
Admin | Major Compact |
Admin | Grant |
Admin | Revoke |
Admin | Shutdown |
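The mapping in Table 8.1 can be read as a simple lookup from operation to required permission. The following Java sketch is illustrative only: the `OperationPermissions` class and its nested `Action` enum are hypothetical names introduced here and are not part of the HBase API. The operation names are copied from the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of Table 8.1; not part of the HBase API.
final class OperationPermissions {
    enum Action { READ, WRITE, CREATE, ADMIN }

    private static final Map<String, Action> REQUIRED = new LinkedHashMap<>();

    private static void register(Action action, String... operations) {
        for (String operation : operations) {
            REQUIRED.put(operation, action);
        }
    }

    static {
        register(Action.READ, "Get", "Exists", "Scan");
        register(Action.WRITE, "Put", "Delete", "Lock/UnlockRow",
                "IncrementColumnValue", "CheckAndDelete/Put");
        register(Action.CREATE, "Create", "Alter", "Drop", "Bulk Load");
        register(Action.ADMIN, "Enable/Disable", "Snapshot/Restore/Clone", "Split",
                "Flush", "Compact", "Major Compact", "Grant", "Revoke", "Shutdown");
    }

    // Returns the permission Table 8.1 lists for the operation, or null if it is not listed.
    static Action requiredFor(String operation) {
        return REQUIRED.get(operation);
    }
}
```

For instance, `OperationPermissions.requiredFor("Scan")` yields `READ`, while an operation name that does not appear in the table yields `null`.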
Permissions can be granted in any of the following scopes, though CREATE and ADMIN permissions are effective only at table scope.
Table
Read: User can read from any column family in table
Write: User can write to any column family in table
Create: User can alter table attributes; add, alter, or drop column families; and drop the table.
Admin: User can alter table attributes; add, alter, or drop column families; and enable, disable, or drop the table. User can also trigger region (re)assignments or relocation.
Column Family
Read: User can read from the column family
Write: User can write to the column family
There is also an implicit global scope for the superuser.
The superuser is a principal, specified in the HBase site configuration file, that has access to HBase equivalent to that of the 'root' user on a UNIX-derived system. Normally this is the principal that the HBase processes themselves authenticate as. Although future versions of HBase Access Control may support multiple superusers, the superuser privilege will always include the principal used to run the HMaster process. Only the superuser is allowed to create tables, switch the balancer on or off, or take other actions with global consequence. Furthermore, the superuser has an implicit grant of all permissions to all resources.
Tables have a new metadata attribute: OWNER, the user principal who owns the table. By default this will be set to the user principal who creates the table, though it may be changed at table creation time or during an alter operation by setting or changing the OWNER table attribute. Only a single user principal can own a table at a given time. A table owner will have all permissions over a given table.
The following matrix shows the minimum permission set required to perform operations in HBase. Before using the table, read through the information about how to interpret it.
Interpreting the ACL Matrix Table
The following conventions are used in the ACL Matrix table:
Permissions are evaluated starting at the widest scope and working to the narrowest scope. A scope corresponds to a level of the data model. From broadest to narrowest, the scopes are as follows:
Global
Namespace (NS)
Table
Column Family (CF)
Column Qualifier (CQ)
Cell
For instance, a permission granted at table level dominates any grants done at the ColumnFamily, ColumnQualifier, or cell level. The user can do what that grant implies at any location in the table. A permission granted at global scope dominates all: the user is always allowed to take that action everywhere.
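The widest-to-narrowest evaluation described above can be sketched as a walk over the scopes. This is a simplified model, not the AccessController implementation: the `ScopeCheck` class and its enums are hypothetical, and resource names are omitted on the assumption that every recorded grant lies on the path from the global scope down to the target of the request.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hypothetical model of scope dominance; not part of the HBase API.
final class ScopeCheck {
    // Scopes from broadest to narrowest; the declaration order is significant.
    enum Scope { GLOBAL, NAMESPACE, TABLE, COLUMN_FAMILY, COLUMN_QUALIFIER, CELL }
    enum Action { READ, WRITE, CREATE, ADMIN, EXEC }

    private final Map<Scope, EnumSet<Action>> grants = new EnumMap<>(Scope.class);

    void grant(Scope scope, Action action) {
        grants.computeIfAbsent(scope, s -> EnumSet.noneOf(Action.class)).add(action);
    }

    // Walk from the widest scope down to the target; the first matching grant allows the action.
    boolean isAllowed(Scope target, Action action) {
        for (Scope scope : Scope.values()) {
            EnumSet<Action> granted = grants.get(scope);
            if (granted != null && granted.contains(action)) {
                return true;
            }
            if (scope == target) {
                break; // grants at narrower scopes do not cover this request
            }
        }
        return false;
    }
}
```

With a READ grant recorded at `TABLE` scope, a READ request at `CELL` scope is allowed, while a READ request at `NAMESPACE` scope is not, because the grant sits at a narrower scope than the request.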
Possible permissions include the following:
Superuser - a special user that belongs to group "supergroup" and has unlimited access
Admin (A)
Create (C)
Write (W)
Read (R)
Execute (X)
For the most part, permissions work in an expected way, with the following caveats:
Having Write permission does not imply Read permission. It is possible and sometimes desirable for a user to be able to write data that same user cannot read. One such example is a log-writing process.
Admin is a superset of Create, so a user with Admin permissions does not also need Create permissions to perform an action such as creating a table.
The hbase:meta table is readable by every user, regardless of the user's other grants or restrictions. This is a requirement for HBase to function correctly.
Users with Create or Admin permissions are granted Write permission on meta regions, so the table operations they are allowed to perform can complete, even if technically the bits can be granted separately in any possible combination.
CheckAndPut and CheckAndDelete operations will fail if the user does not have both Write and Read permission.
Increment and Append operations do not require Read access.
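The caveats above can be summarized in a few predicates. The sketch below is a hypothetical model (the `PermissionCaveats` class is not part of HBase) that encodes three of the rules as stated: Write does not imply Read, Admin is treated as a superset of Create, and check-and-mutate operations need both Read and Write.

```java
import java.util.Set;

// Hypothetical model of the permission caveats; not part of the HBase API.
final class PermissionCaveats {
    enum Action { READ, WRITE, CREATE, ADMIN }

    // Admin is treated as a superset of Create; no other permission implies another.
    static boolean has(Set<Action> granted, Action needed) {
        if (needed == Action.CREATE && granted.contains(Action.ADMIN)) {
            return true;
        }
        return granted.contains(needed);
    }

    // CheckAndPut and CheckAndDelete need both Read and Write.
    static boolean canCheckAndMutate(Set<Action> granted) {
        return has(granted, Action.READ) && has(granted, Action.WRITE);
    }

    // Increment and Append need Write only.
    static boolean canIncrementOrAppend(Set<Action> granted) {
        return has(granted, Action.WRITE);
    }
}
```

A log-writing process holding only Write would pass `canIncrementOrAppend` but fail `canCheckAndMutate`, which matches the write-without-read case described above.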
The following table is sorted by the interface that provides each operation. In case the table goes out of date, the unit tests which check for accuracy of permissions can be found in hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java, and the access controls themselves can be examined in hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java.
Table 8.2. ACL Matrix
Interface | Operation | Minimum Scope | Minimum Permission |
---|---|---|---|
Master | createTable | Global | A |
Master | modifyTable | Table | A\|CW |
Master | deleteTable | Table | A\|CW |
Master | truncateTable | Table | A\|CW |
Master | addColumn | Table | A\|CW |
Master | modifyColumn | Table | A\|CW |
Master | deleteColumn | Table | A\|CW |
Master | disableTable | Table | A\|CW |
Master | disableAclTable | None | Not allowed |
Master | enableTable | Table | A\|CW |
Master | move | Global | A |
Master | assign | Global | A |
Master | unassign | Global | A |
Master | regionOffline | Global | A |
Master | balance | Global | A |
Master | balanceSwitch | Global | A |
Master | shutdown | Global | A |
Master | stopMaster | Global | A |
Master | snapshot | Global | A |
Master | clone | Global | A |
Master | restore | Global | A |
Master | deleteSnapshot | Global | A |
Master | createNamespace | Global | A |
Master | deleteNamespace | Namespace | A |
Master | modifyNamespace | Namespace | A |
Master | flushTable | Table | A\|CW |
Master | getTableDescriptors | Global\|Table | A |
Master | mergeRegions | Global | A |
Region | preOpen | Global | A |
Region | openRegion | Global | A |
Region | preClose | Global | A |
Region | closeRegion | Global | A |
Region | preStopRegionServer | Global | A |
Region | stopRegionServer | Global | A |
Region | mergeRegions | Global | A |
Region | append | Table | W |
Region | delete | Table\|CF\|CQ | W |
Region | exists | Table\|CF\|CQ | R |
Region | get | Table\|CF\|CQ | R |
Region | getClosestRowBefore | Table\|CF\|CQ | R |
Region | increment | Table\|CF\|CQ | W |
Region | put | Table\|CF\|CQ | W |
Region | flush | Global | A\|CW |
Region | split | Global | A |
Region | compact | Global | A\|CW |
Region | bulkLoadHFile | Table | W |
Region | prepareBulkLoad | Table | CW |
Region | cleanupBulkLoad | Table | W |
Region | checkAndDelete | Table\|CF\|CQ | RW |
Region | checkAndPut | Table\|CF\|CQ | RW |
Region | incrementColumnValue | Table\|CF\|CQ | RW |
Region | ScannerClose | Table | R |
Region | ScannerNext | Table | R |
Region | ScannerOpen | Table\|CF\|CQ | R |
Endpoint | invoke | Endpoint | X |
AccessController | grant | Global\|Table\|NS | A |
AccessController | revoke | Global\|Table\|NS | A |
AccessController | userPermissions | Global\|Table\|NS | A |
AccessController | checkPermissions | Global\|Table\|NS | A |
Enable the AccessController coprocessor in the cluster configuration and restart HBase. The restart can be a rolling one. Complete the restart of all Master and RegionServer processes before setting up ACLs.
To enable the AccessController, modify the hbase-site.xml file on every server machine in the cluster to look like:
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider, org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
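Before restarting, it can be useful to confirm that both coprocessor properties actually list the AccessController. The following sketch uses only the JDK's DOM parser and does not depend on HBase or Hadoop. The `CoprocessorConfigCheck` class is a hypothetical helper, and it assumes the file content is passed in as a string whose `<property>` elements each contain a `<name>` and a `<value>` child.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for checking an hbase-site.xml string; not part of HBase.
final class CoprocessorConfigCheck {
    static final String ACCESS_CONTROLLER =
            "org.apache.hadoop.hbase.security.access.AccessController";

    // Returns the <value> text of the named property, or null if the property is absent.
    static String property(String siteXml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(siteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String found = property.getElementsByTagName("name").item(0).getTextContent().trim();
                if (found.equals(name)) {
                    return property.getElementsByTagName("value").item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse site configuration", e);
        }
    }

    // True if the comma-separated class list of the property includes the AccessController.
    static boolean lists(String siteXml, String name) {
        String value = property(siteXml, name);
        if (value == null) {
            return false;
        }
        for (String className : value.split(",")) {
            if (className.trim().equals(ACCESS_CONTROLLER)) {
                return true;
            }
        }
        return false;
    }

    // Both the master and the region coprocessor lists must include the AccessController.
    static boolean accessControllerEnabled(String siteXml) {
        return lists(siteXml, "hbase.coprocessor.master.classes")
                && lists(siteXml, "hbase.coprocessor.region.classes");
    }
}
```

The check splits the value on commas and trims whitespace, so a region class list that names TokenProvider first and AccessController second is accepted.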
Prior to HBase 0.98, access control was restricted to the table and column family level. The tags feature introduced in 0.98 allows access control at the cell level. The existing AccessController coprocessor also enforces cell-level access control. For details on configuring it, refer to the Access Control section.
The ACLs can be specified for every mutation using the following APIs:
Mutation.setACL(String user, Permission perms)
Mutation.setACL(Map<String, Permission> perms)
For example, to grant read permission to the user 'user1':
put.setACL("user1", new Permission(Permission.Action.READ));
Generally, the ACL applied on the table and column family takes precedence over the cell-level ACL. To make the cell-level ACL take precedence, use the following API:
Mutation.setACLStrategy(boolean cellFirstStrategy)
Note that to use this feature, HFile version 3 must be enabled:
<property>
  <name>hfile.format.version</name>
  <value>3</value>
</property>
Note that deletes with ACLs do not have any effect. To keep things simple, the ACL applied on the current Put does not change the ACL of any previous Put; that is, it does not affect older versions of Puts for the same row.
The HBase shell has been extended to provide simple commands for editing and updating user permissions. The following commands have been added for access control list management:
Example 8.1. Grant
grant <user|@group> <permissions> [ <table> [ <column family> [ <column qualifier> ] ] ]
<user|@group> is a user or group (group names start with the character '@'). Groups are created and manipulated via the Hadoop group mapping service.
<permissions> is zero or more letters from the set "RWCA": READ('R'), WRITE('W'), CREATE('C'), ADMIN('A').
Note: Grants and revocations of individual permissions on a resource are both accomplished using the grant command. A separate revoke command is also provided by the shell, but this is only for fast revocation of all of a user's access rights to a given resource.
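The `<permissions>` argument is a compact letter encoding. As a minimal sketch of how such a string maps to the four permissions, the hypothetical `PermissionLetters` class below (not the shell's actual implementation) parses letters from the set "RWCA" and rejects anything else.

```java
import java.util.EnumSet;

// Hypothetical parser for the "RWCA" permission letters; not part of the HBase shell.
final class PermissionLetters {
    enum Action { READ, WRITE, CREATE, ADMIN }

    // Parses a string such as "RW" into a set of actions; an empty string yields an empty set.
    static EnumSet<Action> parse(String letters) {
        EnumSet<Action> actions = EnumSet.noneOf(Action.class);
        for (char letter : letters.toCharArray()) {
            switch (letter) {
                case 'R': actions.add(Action.READ); break;
                case 'W': actions.add(Action.WRITE); break;
                case 'C': actions.add(Action.CREATE); break;
                case 'A': actions.add(Action.ADMIN); break;
                default:
                    throw new IllegalArgumentException("Unknown permission letter: " + letter);
            }
        }
        return actions;
    }
}
```

Parsing "RW" yields the READ and WRITE actions, and the empty string yields no actions, which corresponds to the "zero or more letters" wording above.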
Example 8.3. Alter
The alter command has been extended to allow ownership assignment:
alter 'tablename', {OWNER => 'username|@group'}
Example 8.4. User Permission
The user_permission command shows all access permissions for the current user for a given table:
user_permission <table>