Table of Contents
The Apache HBase Shell is (J)Ruby's IRB with some HBase particular commands added. Anything you can do in IRB, you should be able to do in the HBase Shell.
To run the HBase shell, do as follows:
$ ./bin/hbase shell
Type help and then <RETURN> to see a listing of shell commands and options. Browse at least the paragraphs at the end of the help emission for the gist of how variables and command arguments are entered into the HBase shell; in particular note how table names, rows, and columns, etc., must be quoted.
See ??? for example basic shell operation.
Here is a nicely formatted listing of all shell commands by Rajeshbabu Chintaguntla.
For examples scripting Apache HBase, look in the HBase bin
directory. Look at the files that end in *.rb
. To run one of these
files, do as follows:
$ ./bin/hbase org.jruby.Main PATH_TO_SCRIPT
A new non-interactive mode has been added to the HBase Shell (HBASE-11658).
Non-interactive mode captures the exit status (success or failure) of HBase Shell
commands and passes that status back to the command interpreter. If you use the normal
interactive mode, the HBase Shell will only ever return its own exit status, which will
nearly always be 0
for success.
To invoke non-interactive mode, pass the -n
or
--non-interactive
option to HBase Shell.
You can use the HBase shell from within operating system script interpreters like the Bash shell which is the default command interpreter for most Linux and UNIX distributions. The following guidelines use Bash syntax, but could be adjusted to work with C-style shells such as csh or tcsh, and could probably be modified to work with the Microsoft Windows script interpreter as well. Submissions are welcome.
Spawning HBase Shell commands in this way is slow, so keep that in mind when you are deciding when combining HBase operations with the operating system command line is appropriate.
Example 1.1. Passing Commands to the HBase Shell
You can pass commands to the HBase Shell in non-interactive mode (see ???) using the echo
command and the |
(pipe) operator. Be sure to escape characters
in the HBase commands which would otherwise be interpreted by the shell. Some
debug-level output has been truncated from the example below.
$echo "describe 'test1'" | ./hbase shell -n
Version 0.98.3-hadoop2, rd5e65a9144e315bb0a964e7730871af32f5018d5, Sat May 31 19:56:09 PDT 2014 describe 'test1' DESCRIPTION ENABLED 'test1', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NON true E', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIO NS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false' , BLOCKCACHE => 'true'} 1 row(s) in 3.2410 seconds
To suppress all output, echo it to /dev/null:
$ echo "describe 'test'" | ./hbase shell -n > /dev/null 2>&1
Example 1.2. Checking the Result of a Scripted Command
Since scripts are not designed to be run interactively, you need a way to check
whether your command failed or succeeded. The HBase shell uses the standard
convention of returning a value of 0
for successful commands, and
some non-zero value for failed commands. Bash stores a command's return value in a
special environment variable called $?
. Because that variable is
overwritten each time the shell runs any command, you should store the result in a
different, script-defined variable.
This is a naive script that shows one way to store the return value and make a decision based upon it.
#!/bin/bash echo "describe 'test'" | ./hbase shell -n > /dev/null 2>&1 status=$? echo "The status was " $status if ($status == 0); then echo "The command succeeded" else echo "The command may have failed." fi return $status
Getting an exit code of 0 means that the command you scripted definitely succeeded. However, getting a non-zero exit code does not necessarily mean the command failed. The command could have succeeded, but the client lost connectivity, or some other event obscured its success. This is because RPC commands are stateless. The only way to be sure of the status of an operation is to check. For instance, if your script creates a table, but returns a non-zero exit value, you should check whether the table was actually created before trying again to create it.
You can enter HBase Shell commands into a text file, one command per line, and pass that file to the HBase Shell.
Example 1.3. Example Command File
create 'test', 'cf' list 'test' put 'test', 'row1', 'cf:a', 'value1' put 'test', 'row2', 'cf:b', 'value2' put 'test', 'row3', 'cf:c', 'value3' put 'test', 'row4', 'cf:d', 'value4' scan 'test' get 'test', 'row1' disable 'test' enable 'test'
Example 1.4. Directing HBase Shell to Execute the Commands
Pass the path to the command file as the only argument to the hbase shell command. Each command is executed and its output is shown. If you do not include the exit command in your script, you are returned to the HBase shell prompt. There is no way to programmatically check each individual command for success or failure. Also, though you see the output for each command, the commands themselves are not echoed to the screen so it can be difficult to line up the command with its output.
$./hbase shell ./sample_commands.txt
0 row(s) in 3.4170 seconds TABLE test 1 row(s) in 0.0590 seconds 0 row(s) in 0.1540 seconds 0 row(s) in 0.0080 seconds 0 row(s) in 0.0060 seconds 0 row(s) in 0.0060 seconds ROW COLUMN+CELL row1 column=cf:a, timestamp=1407130286968, value=value1 row2 column=cf:b, timestamp=1407130286997, value=value2 row3 column=cf:c, timestamp=1407130287007, value=value3 row4 column=cf:d, timestamp=1407130287015, value=value4 4 row(s) in 0.0420 seconds COLUMN CELL cf:a timestamp=1407130286968, value=value1 1 row(s) in 0.0110 seconds 0 row(s) in 1.5630 seconds 0 row(s) in 0.4360 seconds
You can pass VM options to the HBase Shell using the HBASE_SHELL_OPTS
environment variable. You can set this in your environment, for instance by editing
~/.bashrc
, or set it as part of the command to launch HBase
Shell. The following example sets several garbage-collection-related variables, just for
the lifetime of the VM running the HBase Shell. The command should be run all on a
single line, but is broken by the \
character, for
readability.
$ HBASE_SHELL_OPTS="-verbose:gc -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps \
-XX:+PrintGCDetails -Xloggc:$HBASE_HOME/logs/gc-hbase.log" ./bin/hbase shell
HBase 0.95 adds shell commands that provide a jruby-style object-oriented references for tables. Previously all of the shell commands that act upon a table have a procedural style that always took the name of the table as an argument. HBase 0.95 introduces the ability to assign a table to a jruby variable. The table reference can be used to perform data read write operations such as puts, scans, and gets well as admin functionality such as disabling, dropping, describing tables.
For example, previously you would always specify a table name:
hbase(main):000:0> create ‘t’, ‘f’ 0 row(s) in 1.0970 seconds hbase(main):001:0> put 't', 'rold', 'f', 'v' 0 row(s) in 0.0080 seconds hbase(main):002:0> scan 't' ROW COLUMN+CELL rold column=f:, timestamp=1378473207660, value=v 1 row(s) in 0.0130 seconds hbase(main):003:0> describe 't' DESCRIPTION ENABLED 't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2 147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false ', BLOCKCACHE => 'true'} 1 row(s) in 1.4430 seconds hbase(main):004:0> disable 't' 0 row(s) in 14.8700 seconds hbase(main):005:0> drop 't' 0 row(s) in 23.1670 seconds hbase(main):006:0>
Now you can assign the table to a variable and use the results in jruby shell code.
hbase(main):007 > t = create 't', 'f' 0 row(s) in 1.0970 seconds => Hbase::Table - t hbase(main):008 > t.put 'r', 'f', 'v' 0 row(s) in 0.0640 seconds hbase(main):009 > t.scan ROW COLUMN+CELL r column=f:, timestamp=1331865816290, value=v 1 row(s) in 0.0110 seconds hbase(main):010:0> t.describe DESCRIPTION ENABLED 't', {NAME => 'f', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_ true SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2 147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false ', BLOCKCACHE => 'true'} 1 row(s) in 0.0210 seconds hbase(main):038:0> t.disable 0 row(s) in 6.2350 seconds hbase(main):039:0> t.drop 0 row(s) in 0.2340 seconds
If the table has already been created, you can assign a Table to a variable by using the get_table method:
hbase(main):011 > create 't','f' 0 row(s) in 1.2500 seconds => Hbase::Table - t hbase(main):012:0> tab = get_table 't' 0 row(s) in 0.0010 seconds => Hbase::Table - t hbase(main):013:0> tab.put ‘r1’ ,’f’, ‘v’ 0 row(s) in 0.0100 seconds hbase(main):014:0> tab.scan ROW COLUMN+CELL r1 column=f:, timestamp=1378473876949, value=v 1 row(s) in 0.0240 seconds hbase(main):015:0>
The list functionality has also been extended so that it returns a list of table names as strings. You can then use jruby to script table operations based on these names. The list_snapshots command also acts similarly.
hbase(main):016 > tables = list(‘t.*’) TABLE t 1 row(s) in 0.1040 seconds => #<#<Class:0x7677ce29>:0x21d377a4> hbase(main):017:0> tables.map { |t| disable t ; drop t} 0 row(s) in 2.2510 seconds => [nil] hbase(main):018:0>
Create an .irbrc
file for yourself in your home directory.
Add customizations. A useful one is command history so commands are save across
Shell invocations:
$ more .irbrc require 'irb/ext/save-history' IRB.conf[:SAVE_HISTORY] = 100 IRB.conf[:HISTORY_FILE] = "#{ENV['HOME']}/.irb-save-history"
See the ruby documentation of
.irbrc
to learn about other possible configurations.
To convert the date '08/08/16 20:56:29' from an hbase log into a timestamp, do:
hbase(main):021:0> import java.text.SimpleDateFormat hbase(main):022:0> import java.text.ParsePosition hbase(main):023:0> SimpleDateFormat.new("yy/MM/dd HH:mm:ss").parse("08/08/16 20:56:29", ParsePosition.new(0)).getTime() => 1218920189000
To go the other direction:
hbase(main):021:0> import java.util.Date hbase(main):022:0> Date.new(1218920189000).toString() => "Sat Aug 16 20:56:29 UTC 2008"
To output in a format that is exactly like that of the HBase log format will take a little messing with SimpleDateFormat.
You can set a debug switch in the shell to see more output -- e.g. more of the stack trace on exception -- when you run a command:
hbase> debug <RETURN>