2.3. Hadoop

This version of HBase will only run on Hadoop 0.20.x. It will not run on Hadoop 0.21.x (though it may run on 0.22.x/0.23.x). HBase will lose data unless it is running on an HDFS that has a durable sync. Hadoop 0.20.2, Hadoop 0.20.203.0, and Hadoop 0.20.204.0 DO NOT have this attribute. Currently only Hadoop versions 0.20.205.x or any release in excess of this version have a durable sync. You have to explicitly enable it, though, by setting dfs.support.append to true on both the client side -- in hbase-site.xml, though it should already be on in your hbase-default.xml file -- and on the server side in hdfs-site.xml (you will have to restart your cluster after setting this configuration). Ignore the chicken-little comment you'll find in the hdfs-site.xml description for this configuration, which claims the option is disabled because there are "... bugs in the 'append code' and it is not supported in any production cluster." That warning is no longer true: there are surely bugs, but the append code has been running in production at large-scale deploys and is on by default in the Hadoop offerings of commercial vendors [5][6][7].
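For example, the property would look something like the following; set it in both hbase-site.xml (client side) and hdfs-site.xml (server side), keeping in mind that your distribution's default files may already carry it:

      <!-- Enable durable sync/append support. Add this to hbase-site.xml
           on the client side and hdfs-site.xml on the server side, then
           restart the cluster so the setting takes effect. -->
      <property>
        <name>dfs.support.append</name>
        <value>true</value>
      </property>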

Or use the Cloudera or MapR distributions. Cloudera's CDH3 is Apache Hadoop 0.20.x plus patches, including all of the branch-0.20-append additions needed for a durable sync. Use the most recent released version of CDH3.

MapR includes a commercial reimplementation of HDFS. It has a durable sync as well as some other interesting features that are not yet in Apache Hadoop. Their M3 product is free to use and unlimited.

Because HBase depends on Hadoop, it bundles an instance of the Hadoop jar under its lib directory. The bundled jar is ONLY for use in standalone mode. In distributed mode, it is critical that the version of Hadoop that is out on your cluster matches what is under HBase. Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues. Make sure you replace the jar in HBase everywhere on your cluster. Hadoop version mismatch issues have various manifestations, but often everything just looks like it is hung up.

2.3.1. Hadoop Security

HBase will run on any Hadoop 0.20.x that incorporates Hadoop security features -- e.g. Y! 0.20S or CDH3B3 -- as long as you do as suggested above and replace the Hadoop jar that ships with HBase with the secure version.

2.3.2. dfs.datanode.max.xcievers

An Hadoop HDFS datanode has an upper bound on the number of files that it will serve at any one time. The upper bound parameter is called xcievers (yes, this is misspelled). Again, before doing any loading, make sure you have configured Hadoop's conf/hdfs-site.xml, setting the xcievers value to at least the following:

      <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
      </property>
      

Be sure to restart your HDFS after making the above configuration.

Not having this configuration in place makes for strange-looking failures. Eventually you'll see a complaint in the datanode logs about the xciever count being exceeded, but on the run-up to this, one manifestation is complaints about missing blocks. For example: 10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry... [8]



[5] Until recently only the branch-0.20-append branch had a working sync, but no official release was ever made from this branch, so you had to build it yourself. Michael Noll wrote a detailed blog post, Building an Hadoop 0.20.x version for HBase 0.90.2, on how to build an Hadoop from branch-0.20-append. Recommended.

[6] Praveen Kumar has written a complementary article, Building Hadoop and HBase for HBase Maven application development.

[7] dfs.support.append

[8] See Hadoop HDFS: Deceived by Xciever for an informative rant on xceivering.