You cannot skip major versions when upgrading. If you are upgrading from version 0.90.x to 0.94.x, you must first go from 0.90.x to 0.92.x and then go from 0.92.x to 0.94.x.
It may be possible to skip across versions -- for example go from 0.92.2 straight to 0.98.0 just following the 0.96.x upgrade instructions -- but we have not tried it so cannot say whether it works or not.
Review ???, in particular the section on Hadoop version.
HBase has two versioning schemes, pre-1.0 and post-1.0. Both are detailed below.
Starting with the 1.0.0 release, HBase uses Semantic Versioning for its release versioning. In summary:
Given a version number MAJOR.MINOR.PATCH, increment the:
- MAJOR version when you make incompatible API changes,
- MINOR version when you add functionality in a backwards-compatible manner, and
- PATCH version when you make backwards-compatible bug fixes.
- Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
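As a rough illustration of the rules above, the following Python sketch classifies the difference between two version strings. The function names `parse_version` and `bump_kind` are made up for this example and are not part of HBase. Pre-release and build-metadata labels are simply ignored here.

```python
def parse_version(version):
    """Split 'MAJOR.MINOR.PATCH[-prerelease][+build]' into three ints."""
    # Drop build metadata ('+...') and pre-release labels ('-...') first.
    core = version.split("+")[0].split("-")[0]
    major, minor, patch = core.split(".")
    return int(major), int(minor), int(patch)


def bump_kind(old, new):
    """Return the first component that differs: 'major', 'minor', 'patch' or 'none'."""
    names = ("major", "minor", "patch")
    for name, a, b in zip(names, parse_version(old), parse_version(new)):
        if a != b:
            return name
    return "none"
```

Comparing the result against the compatibility matrix below tells you which guarantees apply to a given upgrade.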
In addition to the usual API versioning considerations HBase has other compatibility dimensions that we need to consider.
(Y means we support the compatibility. N means we can break it.)
Table 1.1. Compatibility Matrix
| | Major | Minor | Patch |
|---|---|---|---|
| Client-Server wire Compatibility | N | Y | Y |
| Server-Server Compatibility | N | Y | Y |
| File Format Compatibility | N[a] | Y | Y |
| Client API Compatibility | N | Y | Y |
| Client Binary Compatibility | N | N | Y |
| Server-Side Limited API Compatibility | | | |
| | N | Y | Y |
| | N | N | Y |
| | N | N | N |
| Dependency Compatibility | N | Y | Y |
| Operational Compatibility | N | N | Y |

[a] Running an offline upgrade tool without rollback might be needed. We will typically only support migrating data from major version X to major version X+1.
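One way to read the matrix is as a lookup keyed by compatibility dimension and upgrade type. The sketch below is only an illustration: the dictionary and the `is_preserved` helper are hypothetical names, and the three Server-Side Limited API rows are left out because their labels are not given in this text.

```python
# Rows of the compatibility matrix as (major, minor, patch).
# True means compatibility is kept (Y); False means it may be broken (N).
COMPATIBILITY = {
    "client-server wire": (False, True, True),
    "server-server": (False, True, True),
    "file format": (False, True, True),  # major: see footnote [a]
    "client api": (False, True, True),
    "client binary": (False, False, True),
    "dependency": (False, True, True),
    "operational": (False, False, True),
}

_COLUMN = {"major": 0, "minor": 1, "patch": 2}


def is_preserved(dimension, upgrade_kind):
    """Look up whether a compatibility dimension survives a given upgrade kind."""
    return COMPATIBILITY[dimension.lower()][_COLUMN[upgrade_kind]]
```

For instance, `is_preserved("Client Binary", "minor")` is false, which is why client applications may need recompiling after a minor upgrade.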
HBase has a lot of API points, but for the compatibility matrix above, we differentiate between Client API, Limited Private API, and Private API. HBase uses a version of Hadoop's Interface classification. HBase's Interface classification classes can be found here.
HBase Client API consists of all the classes or methods that are marked with the InterfaceAudience.Public annotation. All main classes in hbase-client and dependent modules have either an InterfaceAudience.Public, InterfaceAudience.LimitedPrivate, or InterfaceAudience.Private marker. Not all classes in other modules (hbase-server, etc.) have the marker. If a class is not annotated with one of these, it is assumed to be an InterfaceAudience.Private class.
The LimitedPrivate annotation comes with a set of target consumers for the interfaces. Those consumers are coprocessors, phoenix, replication endpoint implementations or similar. At this point, HBase only guarantees source and binary compatibility for these interfaces between patch versions.
All classes annotated with InterfaceAudience.Private or all classes that do not have the annotation are for HBase internal use only. The interfaces and method signatures can change at any point in time. If you are relying on a particular interface that is marked Private, you should open a jira to propose changing the interface to be Public or LimitedPrivate, or an interface exposed for this purpose.
Before the semantic versioning scheme was adopted, HBase tracked either Hadoop's versions (0.2x) or 0.9x versions. If you are into the arcane, check out our old wiki page on HBase Versioning which tries to connect the HBase version dots. The sections below cover ONLY the releases before 1.0.
Ahead of big releases, we have been putting up preview versions to start the feedback cycle turning over earlier. These "Development" Series releases, always odd-numbered, come with no guarantees, not even regarding the ability to upgrade between two sequential releases (we reserve the right to break compatibility across "Development" Series releases). Needless to say, these releases are not for production deploys. They are a preview of what is coming in the hope that interested parties will take the release for a test drive and flag us early if there are issues we've missed ahead of our rolling a production-worthy release.
Our first "Development" Series was the 0.89 set that came out ahead of HBase 0.90.0. HBase 0.95 is another "Development" Series that portends HBase 0.96.0. 0.99.x is the last series in "developer preview" mode before 1.0. Afterwards, we will be using the semantic versioning naming scheme (see above).
When we say two HBase versions are compatible, we mean that the versions are wire and binary compatible. Compatible HBase versions means that clients can talk to compatible but differently versioned servers. It means too that you can just swap out the jars of one version and replace them with the jars of another, compatible version and all will just work. Unless otherwise specified, HBase point versions are (mostly) binary compatible. You can safely do rolling upgrades between binary compatible versions; i.e. across point versions: e.g. from 0.94.5 to 0.94.6. See the "Does compatibility between versions also mean binary compatibility?" discussion on the hbase dev mailing list.
A rolling upgrade is the process by which you update the servers in your cluster a server at a time. You can do a rolling upgrade across HBase versions if they are binary or wire compatible (see the discussion of compatible versions above for more on what this means). Roughly, a rolling upgrade gracefully stops each server, updates the software, and then restarts it. You do this for each server in the cluster. Usually you upgrade the Master first and then the regionservers. The rolling-restart script used in the example that follows can help with the rolling upgrade process.
For example, in the below, hbase was symlinked to the actual hbase install. On upgrade, before running a rolling restart over the cluster, we changed the symlink to point at the new HBase software version and then ran:
$ HADOOP_HOME=~/hadoop-2.6.0-CRC-SNAPSHOT ~/hbase/bin/rolling-restart.sh --config ~/conf_hbase
The rolling-restart script will first gracefully stop and restart the master, and then each of the regionservers in turn. Because the symlink was changed, on restart the server will come up using the new hbase version. Check logs for errors as the rolling upgrade proceeds.
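The ordering described above (Master first, then each regionserver, one server at a time) can be written down as a small planning function. This is only a sketch of the sequence, not what rolling-restart.sh actually executes; the host names and the `rolling_upgrade_plan` function are hypothetical.

```python
def rolling_upgrade_plan(masters, regionservers):
    """Return the ordered (host, action) steps for a rolling upgrade."""
    plan = []
    # Masters go first, then each regionserver in turn, one server at a time.
    for host in list(masters) + list(regionservers):
        plan.append((host, "graceful stop"))
        plan.append((host, "update software"))
        plan.append((host, "restart"))
    return plan
```

In the symlink setup from the example, the "update software" step is done once up front by repointing the symlink, so each server only needs the stop and restart.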
Unless otherwise specified, HBase point versions are binary compatible. You can do a rolling upgrade between hbase point versions. For example, you can go to 0.94.6 from 0.94.5 by doing a rolling upgrade across the cluster replacing the 0.94.5 binary with a 0.94.6 binary.
In the minor version-particular sections below, we call out where the versions are wire/protocol compatible and in this case, it is also possible to do a rolling upgrade. For example, in the section on upgrading to 1.0.0 from 0.98.x, we state that it is possible to do a rolling upgrade between hbase-0.98.x and hbase-1.0.0.
In this section we first note the significant changes that come in with HBase 1.0.0 and then we go over the upgrade process. Be sure to read the significant changes section with care so you avoid surprises.
Here we list important changes in 1.0.0 since 0.98.x: changes you should be aware of that will go into effect once you upgrade.
See ???.
The ports used by HBase changed. They used to be in the 600XX range. In hbase-1.0.0 they have been moved up out of the ephemeral port range and are 160XX instead (Master web UI was 60010 and is now 16010; the RegionServer web UI was 60030 and is now 16030, etc). If you want to keep the old port locations, copy the port setting configs from hbase-default.xml into hbase-site.xml, change them back to the old values from the hbase-0.98.x era, and ensure you've distributed your configurations before you restart.
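If you do want the old ports back, the overrides for hbase-site.xml could be generated along these lines. This is a sketch: the text above names only the two web UI ports, so the property names and the two RPC port values are assumptions here and should be checked against the hbase-default.xml that ships with your release. The `old_port_overrides` helper is a hypothetical name.

```python
# Property name -> old 0.98.x-era value. The property names and the two RPC
# ports (60000, 60020) are assumptions; verify them in your hbase-default.xml.
OLD_PORTS = {
    "hbase.master.port": 60000,
    "hbase.master.info.port": 60010,
    "hbase.regionserver.port": 60020,
    "hbase.regionserver.info.port": 60030,
}


def old_port_overrides():
    """Render hbase-site.xml <property> elements that restore the old ports."""
    lines = []
    for name, port in OLD_PORTS.items():
        lines.append("<property>")
        lines.append("  <name>%s</name>" % name)
        lines.append("  <value>%d</value>" % port)
        lines.append("</property>")
    return "\n".join(lines)
```

The rendered elements go inside the `<configuration>` element of hbase-site.xml on every node before the restart.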
You may have made use of the hbase.bucketcache.percentage.in.combinedcache configuration if you are using BucketCache. If NOT using BucketCache, its removal does not affect you. Its removal means that your L1 LruBlockCache is now sized using hfile.block.cache.size -- i.e. the way you would size the onheap L1 LruBlockCache if you were NOT doing BucketCache -- and the BucketCache size is now whatever the setting for hbase.bucketcache.size is. You may need to adjust configs to get the LruBlockCache and BucketCache sizes set to what they were in 0.98.x and previous. If you did not set this config, its default value was 0.9. If you do nothing, your BucketCache will increase in size by 10%. Your L1 LruBlockCache will become hfile.block.cache.size times your java heap size (hfile.block.cache.size is a float between 0.0 and 1.0). To read more, see HBASE-11520 Simplify offheap cache config by removing the confusing "hbase.bucketcache.percentage.in.combinedcache".
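The arithmetic can be sketched as follows. Two assumptions are baked in: that hbase.bucketcache.size is given in megabytes, and that in 0.98.x the share of that size not given to the BucketCache went to the L1 LruBlockCache. Both function names are hypothetical.

```python
def cache_sizes_before(bucketcache_size_mb, pct_in_combinedcache=0.9):
    """0.98.x-style split (assumed): returns (l1_mb, bucketcache_mb)."""
    bucket_mb = bucketcache_size_mb * pct_in_combinedcache
    l1_mb = bucketcache_size_mb - bucket_mb
    return l1_mb, bucket_mb


def cache_sizes_after(bucketcache_size_mb, heap_mb, hfile_block_cache_size):
    """1.0.0-style sizing: L1 comes from the heap, BucketCache gets its full setting."""
    l1_mb = heap_mb * hfile_block_cache_size
    return l1_mb, bucketcache_size_mb
```

Running both functions with your current settings shows how far the two caches would drift if you upgrade without touching the configuration.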
See the release notes on the issue HBASE-12068 [Branch-1] Avoid need to always do KeyValueUtil#ensureKeyValue for Filter transformCell; be sure to follow the recommendations therein.
Distributed log replay is off by default in hbase-1.0. Enabling it can make a big difference improving HBase MTTR. Enable this feature if you are doing a clean stop/start when you are upgrading. You cannot rolling upgrade on to this feature (with a caveat if you are running a version of hbase later than hbase-0.98.4 -- see HBASE-12577 Disable distributed log replay by default for more).
You cannot do a rolling upgrade from 0.96.x to 1.0.0 without first doing a rolling upgrade to 0.98.x. See the comment in HBASE-11164 Document and test rolling updates from 0.98 -> 1.0 for the why. There is a second reason you cannot rolling upgrade from hbase-0.96.x: hbase-1.0.0 enables hfilev3 by default (HBASE-9801 Change the default HFile version to V3), and support for hfilev3 only arrives in 0.98. If the rolling upgrade stalls, the 0.96.x servers cannot open files written by the newer hbase-1.0.0 servers, which write hfilev3.
There are no known issues running a rolling upgrade from hbase-0.98.x to hbase-1.0.0.
You cannot rolling upgrade from 0.94.x to 1.x.x. You must stop your cluster, install the 1.x.x software, run the migration described at Section 1.5.1, “Executing the 0.96 Upgrade” (substituting 1.x.x. wherever we make mention of 0.96.x in the section below), and then restart. Be sure to upgrade your zookeeper if it is a version less than the required 3.4.x.
A rolling upgrade from 0.96.x to 0.98.x works. The two versions are not binary compatible.
Additional steps are required to take advantage of some of the new features of 0.98.x, including cell visibility labels, cell ACLs, and transparent server side encryption. See the ??? chapter of this guide for more information. Significant performance improvements include a change to the write ahead log threading model that provides higher transaction throughput under high load, reverse scanners, MapReduce over snapshot files, and striped compaction.
Clients and servers can run with 0.98.x and 0.96.x versions. However, applications may need to be recompiled due to changes in the Java API.
A rolling upgrade from 0.94.x directly to 0.98.x does not work. The upgrade path follows the same procedures as Section 1.5, “Upgrading from 0.94.x to 0.96.x”. Additional steps are required to use some of the new features of 0.98.x. See Section 1.3, “Upgrading from 0.96.x to 0.98.x” for an abbreviated list of these features.
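The rolling-upgrade statements made so far can be collected into a small lookup. This is a sketch that only encodes what the text above says; the table and the `series` and `can_rolling_upgrade` names are hypothetical, and any pair the text does not mention yields `None` rather than a guess.

```python
# Rolling-upgrade statements from the text, keyed by (from, to) release series.
ROLLING_UPGRADE = {
    ("0.96", "0.98"): True,
    ("0.98", "1.0"): True,
    ("0.96", "1.0"): False,   # rolling upgrade to 0.98.x first
    ("0.94", "0.98"): False,  # full stop plus the 0.96 migration
    ("0.94", "1.0"): False,   # full stop plus the 0.96 migration
}


def series(version):
    """Reduce '0.98.9' to its release series '0.98'."""
    major, minor = version.split(".")[:2]
    return major + "." + minor


def can_rolling_upgrade(old, new):
    """Return True or False when the text makes a statement, otherwise None."""
    src, dst = series(old), series(new)
    if src == dst:
        # Point versions are binary compatible unless stated otherwise.
        return True
    return ROLLING_UPGRADE.get((src, dst))
```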
Do not deploy 0.96.x. Deploy at least 0.98.x. See EOL 0.96.
You will have to stop your old 0.94.x cluster completely to upgrade. If you are replicating between clusters, both clusters will have to go down to upgrade. Make sure it is a clean shutdown. The fewer WAL files around, the faster the upgrade will run (the upgrade will split any log files it finds in the filesystem as part of the upgrade process). All clients must be upgraded to 0.96 too.
The API has changed. You will need to recompile your code against 0.96 and you may need to adjust applications to go against new APIs (TODO: List of changes).
HDFS and ZooKeeper should be up and running during the upgrade process.
hbase-0.96.0 comes with an upgrade script. Run
$ bin/hbase upgrade
to see its usage. The script has two main modes: -check, and -execute.
The check step is run against a running 0.94 cluster. Run it from a downloaded 0.96.x binary. The check step is looking for the presence of HFileV1 files. These are unsupported in hbase-0.96.0. To purge them -- have them rewritten as HFileV2 -- you must run a compaction.
The check step prints stats at the end of its run (grep for “Result:” in the log) printing absolute path of the tables it scanned, any HFileV1 files found, the regions containing said files (the regions we need to major compact to purge the HFileV1s), and any corrupted files if any found. A corrupt file is unreadable, and so is undefined (neither HFileV1 nor HFileV2).
To run the check step, run $ bin/hbase upgrade -check. Here is sample output:
Tables Processed:
hdfs://localhost:41020/myHBase/.META.
hdfs://localhost:41020/myHBase/usertable
hdfs://localhost:41020/myHBase/TestTable
hdfs://localhost:41020/myHBase/t

Count of HFileV1: 2
HFileV1:
hdfs://localhost:41020/myHBase/usertable /fa02dac1f38d03577bd0f7e666f12812/family/249450144068442524
hdfs://localhost:41020/myHBase/usertable /ecdd3eaee2d2fcf8184ac025555bb2af/family/249450144068442512

Count of corrupted files: 1
Corrupted Files:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812/family/1
Count of Regions with HFileV1: 2
Regions to Major Compact:
hdfs://localhost:41020/myHBase/usertable/fa02dac1f38d03577bd0f7e666f12812
hdfs://localhost:41020/myHBase/usertable/ecdd3eaee2d2fcf8184ac025555bb2af

There are some HFileV1, or corrupt files (files with incorrect major version)
In the above sample output, there are two HFileV1 in two regions, and one corrupt file. Corrupt files should probably be removed. The regions that have HFileV1s need to be major compacted. To major compact, start up the hbase shell and review how to compact an individual region. After the major compaction is done, rerun the check step and the HFileV1s should be gone, replaced by HFileV2 instances.
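If you capture the check step's output, the three counters can be pulled out with a few regular expressions. The sketch below assumes the output keeps the "Count of ...: N" wording shown in the sample. `summarize_check` is a hypothetical helper, not something shipped with HBase.

```python
import re


def summarize_check(output):
    """Pull the three counters out of `hbase upgrade -check` style output."""
    def count(label):
        match = re.search(re.escape(label) + r":\s*(\d+)", output)
        return int(match.group(1)) if match else None

    summary = {
        "hfile_v1": count("Count of HFileV1"),
        "corrupted": count("Count of corrupted files"),
        "regions_to_compact": count("Count of Regions with HFileV1"),
    }
    # Proceed to the execute step only once no HFileV1 files remain.
    summary["ready_for_execute"] = summary["hfile_v1"] == 0
    return summary
```

A wrapper script could rerun the check after each major compaction and stop once `ready_for_execute` is true.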
By default, the check step scans the hbase root directory (defined as hbase.rootdir in the configuration). To scan a specific directory only, pass the -dir option.
$ bin/hbase upgrade -check -dir /myHBase/testTable
The above command would detect HFileV1s in the /myHBase/testTable directory.
Once the check step reports all the HFileV1 files have been rewritten, it is safe to proceed with the upgrade.
Next is the execute step. You must SHUT DOWN YOUR 0.94.x CLUSTER before you can run the execute step. The execute step will not run if it detects running HBase masters or regionservers.
HDFS and ZooKeeper should be up and running during the upgrade process. If zookeeper is managed by HBase, then you can start zookeeper so it is available to the upgrade by running $ ./hbase/bin/hbase-daemon.sh start zookeeper
The execute upgrade step is made of three substeps.
- Namespaces: HBase 0.96.0 has support for namespaces. The upgrade needs to reorder directories in the filesystem for namespaces to work.
- ZNodes: All znodes are purged so that new ones can be written in their place using a new protobuf'ed format and a few are migrated in place: e.g. replication and table state znodes.
- WAL Log Splitting: If the 0.94.x cluster shutdown was not clean, we'll split WAL logs as part of migration before we startup on 0.96.0. This WAL splitting runs slower than the native distributed WAL splitting because it is all inside the single upgrade process (so try and get a clean shutdown of the 0.94.0 cluster if you can).
To run the execute step, first make sure that you have copied the hbase-0.96.0 binaries to all servers and all clients. Make sure the 0.94.0 cluster is down. Then do as follows:
$ bin/hbase upgrade -execute
Here is some sample output.
Starting Namespace upgrade
Created version file at hdfs://localhost:41020/myHBase with version=7
Migrating table testTable to hdfs://localhost:41020/myHBase/.data/default/testTable
…..
Created version file at hdfs://localhost:41020/myHBase with version=8
Successfully completed NameSpace upgrade.
Starting Znode upgrade
….
Successfully completed Znode upgrade
Starting Log splitting
…
Successfully completed Log splitting
If the output from the execute step looks good, stop the zookeeper instance you started to do the upgrade:
$ ./hbase/bin/hbase-daemon.sh stop zookeeper
Now start up hbase-0.96.0.
An old (pre-0.96) client that tries to connect to the 0.96 cluster will fail with an exception like the one below. Upgrade the client.
17:22:15 Exception in thread "main" java.lang.IllegalArgumentException: Not a host:port pair: PBUF
17:22:15 *
17:22:15 api-compat-8.ent.cloudera.com �� ���(
17:22:15 at org.apache.hadoop.hbase.util.Addressing.parseHostname(Addressing.java:60)
17:22:15 at org.apache.hadoop.hbase.ServerName.<init>(ServerName.java:101)
17:22:15 at org.apache.hadoop.hbase.ServerName.parseVersionedServerName(ServerName.java:283)
17:22:15 at org.apache.hadoop.hbase.MasterAddressTracker.bytesToServerName(MasterAddressTracker.java:77)
17:22:15 at org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:61)
17:22:15 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:703)
17:22:15 at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:126)
17:22:15 at Client_4_3_0.setup(Client_4_3_0.java:716)
17:22:15 at Client_4_3_0.main(Client_4_3_0.java:63)
When you upgrade from versions prior to 0.96, META needs to be converted to use protocol buffers. This is controlled by the configuration option hbase.MetaMigrationConvertingToPB, which is set to true by default. Therefore, by default, no action is required on your part.
The migration is a one-time event. However, every time your cluster starts, META is scanned to ensure that it does not need to be converted. If you have a very large number of regions, this scan can take a long time. Starting in 0.98.5, you can set hbase.MetaMigrationConvertingToPB to false in hbase-site.xml, to disable this start-up scan. This should be considered an expert-level setting.
We used to think that 0.92 and 0.94 were interface compatible and that you could do a rolling upgrade between these versions, but then we found that HBASE-5357 Use builder pattern in HColumnDescriptor changed method signatures so that rather than returning void they instead return HColumnDescriptor. This will throw
java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V
.... so 0.92 and 0.94 are NOT compatible. You cannot do a rolling upgrade between them.
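The `(I)V` at the end of the error is a JVM method descriptor: one int parameter, void return. Compiled callers record the whole descriptor, return type included, so changing only the return type is enough to break already-compiled code. The sketch below builds descriptors for the old and new signatures. The helper names are hypothetical and only a handful of primitive types are covered.

```python
# JVM descriptor codes for a few primitive types (not the full set).
_PRIMITIVES = {"void": "V", "int": "I", "long": "J", "boolean": "Z"}


def jvm_type(name):
    """Encode a Java type name the way the JVM does in method descriptors."""
    if name in _PRIMITIVES:
        return _PRIMITIVES[name]
    # Object types are written as L<binary class name with slashes>;
    return "L" + name.replace(".", "/") + ";"


def method_descriptor(param_types, return_type):
    """Build '(params)return', e.g. ['int'] and 'void' give '(I)V'."""
    params = "".join(jvm_type(p) for p in param_types)
    return "(" + params + ")" + jvm_type(return_type)
```

Called with `["int"]` and `"void"` this gives the descriptor in the error message, while the builder-style method that returns HColumnDescriptor gets a different descriptor, so the lookup by the old one fails at link time.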
You will find that 0.92.0 runs a little differently to 0.90.x releases. Here are a few things to watch out for upgrading from 0.90.x to 0.92.0.
If you are short on patience, here are the important things to know when upgrading.
- Once you upgrade, you can’t go back.
- MSLAB is on by default. Watch that heap usage if you have a lot of regions.
- Distributed Log Splitting is on by default. It should make region server failover faster.
- There’s a separate tarball for security.
- If -XX:MaxDirectMemorySize is set in your hbase-env.sh, it’s going to enable the experimental off-heap cache (You may not want this).
To move to 0.92.0, all you need to do is shut down your cluster, replace your hbase 0.90.x with hbase 0.92.0 binaries (be sure you clear out all 0.90.x instances) and restart (You cannot do a rolling restart from 0.90.x to 0.92.x -- you must restart). On startup, the .META. table content is rewritten removing the table schema from the info:regioninfo column. Also, any flushes done post first startup will write out data in the new 0.92.0 file format, HFile V2. This means you cannot go back to 0.90.x once you’ve started HBase 0.92.0 over your HBase data directory.
In 0.92.0, the hbase.hregion.memstore.mslab.enabled flag is set to true (See ???). In 0.90.x it was false. When it is enabled, memstores will step allocate memory in MSLAB 2MB chunks even if the memstore has zero or just a few small elements. This is fine usually but if you had lots of regions per regionserver in a 0.90.x cluster (and MSLAB was off), you may find yourself OOME'ing on upgrade because the thousands of regions * number of column families * 2MB MSLAB (at a minimum) puts your heap over the top. Set hbase.hregion.memstore.mslab.enabled to false or set the MSLAB size down from 2MB by setting hbase.hregion.memstore.mslab.chunksize to something less.
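The lower bound mentioned above is plain multiplication, sketched here in Python. The function name is hypothetical, and the chunk size is taken in megabytes for readability rather than in the units the configuration property uses.

```python
def mslab_min_heap_mb(regions, column_families, chunk_mb=2):
    """Lower bound on heap held by MSLAB: one chunk per memstore."""
    # There is one memstore per column family per region.
    return regions * column_families * chunk_mb
```

For example, a regionserver carrying 2000 regions with 3 column families each would hold at least 2000 * 3 * 2MB = 12000MB in MSLAB chunks before storing any meaningful amount of data.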
Previously, WAL logs on crash were split by the Master alone. In 0.92.0, log splitting is done by the cluster (See “HBASE-1364 [performance] Distributed splitting of regionserver commit logs” or see the blog post Apache HBase Log Splitting). This should cut down significantly on the amount of time it takes splitting logs and getting regions back online again.
In 0.92.0, ??? indices and bloom filters take up residence in the same LRU used for caching blocks that come from the filesystem. In 0.90.x, the HFile v1 indices lived outside of the LRU so they took up space even if the index was on a ‘cold’ file, one that wasn’t being actively used. With the indices now in the LRU, you may find you have less space for block caching. Adjust your block cache accordingly. See the ??? for more detail. The block cache default size has been changed in 0.92.0 from 0.2 (20 percent of heap) to 0.25.
Run 0.92.0 on Hadoop 1.0.x (or CDH3u3 when it ships). The performance benefits are worth making the move. Otherwise, our Hadoop prescription is as it has been; you need a Hadoop that supports a working sync. See ???.
If running on Hadoop 1.0.x (or CDH3u3), enable local read. See Practical Caching presentation for ruminations on the performance benefits ‘going local’ (and for how to enable local reads).
If you can, upgrade your zookeeper. If you can’t, 3.4.2 clients should work against 3.3.X ensembles (HBase makes use of 3.4.2 API).
In 0.92.0, we’ve added an experimental online schema alter facility (See ???). It's off by default. Enable it at your own risk. Online alter and splitting tables do not play well together so be sure your cluster is quiescent when using this feature (for now).
The webui has had a few additions made in 0.92.0. It now shows a list of the regions currently transitioning, recent compactions/flushes, and a process list of running processes (usually empty if all is well and requests are being handled promptly). Other additions include requests by region, a debugging servlet dump, etc.
We now ship with two tarballs; secure and insecure HBase. Documentation on how to set up a secure HBase is on the way.
0.92.0 adds two new features: multi-slave and multi-master replication. The way to enable this is the same as adding a new peer, so in order to have multi-master you would just run add_peer for each cluster that acts as a master to the other slave clusters. Collisions are handled at the timestamp level which may or may not be what you want, this needs to be evaluated on a per use case basis. Replication is still experimental in 0.92 and is disabled by default, run it at your own risk.
On an OOME, we now have the JVM kill -9 the regionserver process so it goes down fast. Previously, a RegionServer might stick around after incurring an OOME, limping along in some wounded state. To disable this facility (we recommend you leave it in place), you’d need to edit the bin/hbase file. Look for the addition of the -XX:OnOutOfMemoryError="kill -9 %p" arguments (See [HBASE-4769] - ‘Abort RegionServer Immediately on OOME’).
0.92.0 stores data in a new format, ???. As HBase runs, it will move all your data from HFile v1 to HFile v2 format. This auto-migration will run in the background as flushes and compactions run. HFile V2 allows HBase to run with larger regions/files. In fact, we encourage that all HBasers going forward tend toward Facebook axiom #1, run with larger, fewer regions. If you have lots of regions now -- more than 100s per host -- you should look into setting your region size up after you move to 0.92.0 (In 0.92.0, default size is now 1G, up from 256M), and then running the online merge tool (See “HBASE-1621 merge tool should work on online cluster, but disabled table”).
This version of 0.90.x HBase can be started on data written by HBase 0.20.x or HBase 0.89.x. There is no need of a migration step. HBase 0.89.x and 0.90.x do write out the name of region directories differently -- they name them with a md5 hash of the region name rather than a jenkins hash -- so this means that once started, there is no going back to HBase 0.20.x.
Be sure to remove the hbase-default.xml from your conf directory on upgrade. A 0.20.x version of this file will have sub-optimal configurations for 0.90.x HBase. The hbase-default.xml file is now bundled into the HBase jar and read from there. If you would like to review the content of this file, see it in the src tree at src/main/resources/hbase-default.xml or see ???.
Finally, if upgrading from 0.20.x, check your .META. schema in the shell. In the past we would recommend that users run with a 16kb MEMSTORE_FLUSHSIZE. Run hbase> scan '-ROOT-' in the shell. This will output the current .META. schema. Check MEMSTORE_FLUSHSIZE size. Is it 16kb (16384)? If so, you will need to change this (The 'normal'/default value is 64MB (67108864)). Run the script bin/set_meta_memstore_size.rb. This will make the necessary edit to your .META. schema. Failure to run this change will make for a slow cluster. See HBASE-3499 Users upgrading to 0.90.0 need to have their .META. table updated with the right MEMSTORE_SIZE.
[2] Note that this indicates what could break, not that it will break. We will/should add specifics in our release notes.