Chapter 1. Building and Developing HBase

Table of Contents

1.1. HBase Repositories
1.1.1. SVN
1.1.2. Git
1.2. IDEs
1.2.1. Eclipse
1.3. Building HBase
1.3.1. Building in snappy compression support
1.3.2. Adding an HBase release to Apache's Maven Repository
1.4. Maven Build Commands
1.4.1. Compile
1.4.2. Run all Unit Tests
1.4.3. Run a Single Unit Test
1.4.4. Run a Few Unit Tests
1.4.5. Run all Unit Tests for a Package
1.4.6. Integration Tests
1.5. Getting Involved
1.5.1. Mailing Lists
1.5.2. Jira
1.6. Developing
1.6.1. Codelines
1.6.2. Unit Tests
1.7. Submitting Patches
1.7.1. Create Patch
1.7.2. Patch File Naming
1.7.3. Unit Tests
1.7.4. Attach Patch to Jira
1.7.5. Common Patch Feedback
1.7.6. ReviewBoard
1.7.7. Committing Patches

This chapter will be of interest only to those building and developing HBase (i.e., as opposed to just downloading the latest distribution).

1.1. HBase Repositories

1.1.1. SVN

svn co http://svn.apache.org/repos/asf/hbase/trunk hbase-core-trunk 
        

1.1.2. Git

git clone git://git.apache.org/hbase.git
        

1.2. IDEs

1.2.1. Eclipse

1.2.1.1. Code Formatting

See HBASE-3678 Add Eclipse-based Apache Formatter to HBase Wiki for an Eclipse formatter to help ensure your code conforms to the HBase coding conventions. The issue includes instructions for loading the attached formatter.

Also, do not use @author tags; that is a rule. Quality Javadoc comments are appreciated. Include the Apache license header as well.

1.2.1.2. Subversive Plugin

Download and install the Subversive plugin.

Set up an SVN Repository target from Section 1.1.1, “SVN”, then check out the code.

1.2.1.3. HBase Project Setup

To set up your Eclipse environment for HBase, close Eclipse and execute...
mvn eclipse:eclipse
            
... from your local HBase project directory in your workspace to generate new .project and .classpath files. Then reopen Eclipse.

1.2.1.4. Maven Plugin

Download and install the Maven plugin. For example, Help -> Install New Software -> (search for Maven Plugin)

1.2.1.5. Maven Classpath Variable

The M2_REPO classpath variable needs to be set up for the project. It must point to your local Maven repository, which is usually ~/.m2/repository.

If this classpath variable is not configured, you will see compile errors in Eclipse like this...
Description	Resource	Path	Location	Type
The project cannot be built until build path errors are resolved	hbase		Unknown	Java Problem 
Unbound classpath variable: 'M2_REPO/asm/asm/3.1/asm-3.1.jar' in project 'hbase'	hbase		Build path	Build Path Problem
Unbound classpath variable: 'M2_REPO/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar' in project 'hbase'	hbase		Build path	Build Path Problem 
Unbound classpath variable: 'M2_REPO/com/google/guava/guava/r09/guava-r09.jar' in project 'hbase'	hbase		Build path	Build Path Problem
Unbound classpath variable: 'M2_REPO/com/google/protobuf/protobuf-java/2.3.0/protobuf-java-2.3.0.jar' in project 'hbase'	hbase		Build path	Build Path Problem Unbound classpath variable:
            

1.2.1.6. Import via m2eclipse

If you install m2eclipse and import the HBase pom.xml into your workspace, you will have to fix your Eclipse Build Path. Remove the target folder, then add the target/generated-jamon and target/generated-sources/java folders. You may also remove from your Build Path the exclusions on src/main/resources and src/test/resources to avoid this error message in the console: 'Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (default) on project hbase: An Ant BuildException has occured: Replace: source file .../target/classes/hbase-default.xml doesn't exist'. This will also reduce the Eclipse build cycles and make your life easier when developing.

1.2.1.7. Eclipse Known Issues

Eclipse will currently complain about Bytes.java. It is not possible to turn these errors off.

            
Description	Resource	Path	Location	Type
Access restriction: The method arrayBaseOffset(Class) from the type Unsafe is not accessible due to restriction on required library /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Classes/classes.jar	Bytes.java	/hbase/src/main/java/org/apache/hadoop/hbase/util	line 1061	Java Problem
Access restriction: The method arrayIndexScale(Class) from the type Unsafe is not accessible due to restriction on required library /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Classes/classes.jar	Bytes.java	/hbase/src/main/java/org/apache/hadoop/hbase/util	line 1064	Java Problem
Access restriction: The method getLong(Object, long) from the type Unsafe is not accessible due to restriction on required library /System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Classes/classes.jar	Bytes.java	/hbase/src/main/java/org/apache/hadoop/hbase/util	line 1111	Java Problem
             

1.2.1.8. Eclipse - More Information

For additional information on setting up Eclipse for HBase development on Windows, see Michael Morello's blog on the topic.

1.3. Building HBase

This section will be of interest only to those building HBase from source.

1.3.1. Building in snappy compression support

Pass -Dsnappy to trigger the snappy Maven profile, which builds the snappy native libraries into HBase.

1.3.2. Adding an HBase release to Apache's Maven Repository

Follow the instructions at Publishing Maven Artifacts. The 'trick' to making it all work is answering the questions put to you by the mvn release plugin properly and making sure it is using the actual branch. VERY IMPORTANT: before doing the mvn release:perform step, hand edit the release.properties file that was put under ${HBASE_HOME} by the previous step, release:prepare. You need to edit it to make it point at the right locations in SVN.

If you run into the error below, it's because you need to edit the version in pom.xml and add -SNAPSHOT to it (and commit).

[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'release'.
[INFO] ------------------------------------------------------------------------
[INFO] Building HBase
[INFO]    task-segment: [release:prepare] (aggregator-style)
[INFO] ------------------------------------------------------------------------
[INFO] [release:prepare {execution: default-cli}]
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] You don't have a SNAPSHOT project in the reactor projects list.
[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3 seconds
[INFO] Finished at: Sat Mar 26 18:11:07 PDT 2011
[INFO] Final Memory: 35M/423M
[INFO] -----------------------------------------------------------------------

1.4. Maven Build Commands

All commands are executed from the local HBase project directory.

Note: use Maven 3 (Maven 2 may work but we suggest you use Maven 3).

1.4.1. Compile

mvn compile
          

1.4.2. Run all Unit Tests

mvn test
          

1.4.3. Run a Single Unit Test

mvn test -Dtest=TestXYZ
          

1.4.4. Run a Few Unit Tests

mvn test -Dtest=TestXYZ,TestABC
          

1.4.5. Run all Unit Tests for a Package

mvn test -Dtest=org.apache.hadoop.hbase.client.*
          

1.4.6. Integration Tests

HBase 0.92 added a verify Maven target. Invoking it will run all the phases up to and including the verify phase via the Maven failsafe plugin, running all the unit tests as well as the long-running unit and integration tests.

mvn verify
          

1.5. Getting Involved

HBase gets better only when people contribute!

As HBase is an Apache Software Foundation project, see ??? for more information about how the ASF functions.

1.5.1. Mailing Lists

Sign up for the dev-list and the user-list. See the mailing lists page. Posing questions, and helping to answer other people's questions, is encouraged! There are varying levels of experience on both lists, so patience and politeness are encouraged (and please stay on topic).

1.5.2. Jira

Check for existing issues in Jira. If yours is a new feature request, an enhancement, or a bug that has not been reported yet, file a ticket.

1.5.2.1. Jira Priorities

The following is a guideline on setting Jira issue priorities:

  • Blocker: Should only be used if the issue WILL cause data loss or cluster instability reliably.
  • Critical: The issue described can cause data loss or cluster instability in some cases.
  • Major: Important but not tragic issues, like updates to the client API that will add a lot of much-needed functionality or significant bugs that need to be fixed but that don't cause data loss.
  • Minor: Useful enhancements and annoying but not damaging bugs.
  • Trivial: Useful enhancements but generally cosmetic.

1.5.2.2. Code Blocks in Jira Comments

A commonly used macro in Jira is {code}. If you do this in a Jira comment...

{code}
   code snippet
{code}

... Jira will format the code snippet like code, instead of a regular comment. It improves readability.

1.6. Developing

1.6.1. Codelines

Most development is done on TRUNK. However, there are branches for minor releases (e.g., 0.90.1, 0.90.2, and 0.90.3 are on the 0.90 branch).

If you have any questions on this, just send an email to the dev mailing list.

1.6.2. Unit Tests

In HBase we use JUnit 4. If you need to run miniclusters of HDFS, ZooKeeper, HBase, or MapReduce for testing, be sure to check out HBaseTestingUtility. Alex Baranau of Sematext describes how it can be used in HBase Case-Study: Using HBaseTestingUtility for Local Testing and Development (2010).

1.6.2.1. Mockito

Sometimes you don't need a full running server for unit testing. For example, some methods can make do with an org.apache.hadoop.hbase.Server instance or an org.apache.hadoop.hbase.master.MasterServices interface reference rather than a full-blown org.apache.hadoop.hbase.master.HMaster. In these cases, you may be able to get away with a mocked Server instance. For example:

              TODO...
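
As a dependency-free sketch of the idea (this is not Mockito, and MiniServer and StatusReporter are hypothetical stand-ins rather than the real org.apache.hadoop.hbase.Server or any HBase class), the JDK's dynamic proxy can supply a canned implementation of an interface so that the code under test never needs a running server. Mockito's mock() and when(...).thenReturn(...) provide the same kind of stand-in with less ceremony.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical stand-in for a small slice of a server interface.
// It is NOT the real org.apache.hadoop.hbase.Server.
interface MiniServer {
  String getServerName();
  boolean isStopped();
}

// Hypothetical code under test: it needs only the interface, not a cluster.
final class StatusReporter {
  private final MiniServer server;

  StatusReporter(MiniServer server) {
    this.server = server;
  }

  String describe() {
    return server.getServerName()
        + (server.isStopped() ? " is stopped" : " is running");
  }
}

final class MiniServerMocks {
  private MiniServerMocks() {
  }

  // Builds a MiniServer whose methods return the canned values given here.
  // Any other method (including toString) is rejected by the handler.
  static MiniServer mockServer(final String name, final boolean stopped) {
    InvocationHandler handler = (proxy, method, args) -> {
      switch (method.getName()) {
        case "getServerName":
          return name;
        case "isStopped":
          return stopped;
        default:
          throw new UnsupportedOperationException(method.getName());
      }
    };
    return (MiniServer) Proxy.newProxyInstance(
        MiniServer.class.getClassLoader(),
        new Class<?>[] { MiniServer.class },
        handler);
  }
}
```

A test would then construct new StatusReporter(MiniServerMocks.mockServer("rs1", false)) and assert on describe() without starting anything.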
              

1.7. Submitting Patches

1.7.1. Create Patch

Patch files can be easily generated from Eclipse, for example by selecting "Team -> Create Patch". Patches can also be created with git diff or svn diff.

Please submit one patch file per Jira. A patch file can reflect changes in multiple files, so if multiple files are changed, make sure the selected resource when generating the patch is a directory.

Make sure you review Section 1.2.1.1, “Code Formatting” for code style.

1.7.2. Patch File Naming

The patch file should have the HBase Jira ticket in the name. For example, if a patch was submitted for Foo.java, then a patch file called Foo_HBASE_XXXX.patch would be acceptable where XXXX is the HBase Jira number.

If you are generating the patch from a branch, then including the target branch in the filename is advised, e.g., HBASE-XXXX-0.90.patch.
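
The naming convention can be checked mechanically. The sketch below is only an illustration: the class PatchNames and its pattern are assumptions, not an HBase tool. The pattern requires a .patch suffix and an HBASE issue key written with either a hyphen or an underscore, matching the two examples above.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of HBase.
final class PatchNames {
  // Assumed pattern: "HBASE", a hyphen or underscore, then the issue number.
  private static final Pattern ISSUE = Pattern.compile("HBASE[-_]\\d+");

  private PatchNames() {
  }

  // True if the name ends in .patch and mentions an HBase Jira issue.
  static boolean isAcceptable(String fileName) {
    return fileName.endsWith(".patch") && ISSUE.matcher(fileName).find();
  }

  // Builds a name such as HBASE-1234-0.90.patch; pass null for no branch.
  static String forBranch(int issueNumber, String branch) {
    String suffix = (branch == null) ? "" : "-" + branch;
    return "HBASE-" + issueNumber + suffix + ".patch";
  }
}
```

The issue number 1234 in the comment is a placeholder, not a real ticket reference.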

1.7.3. Unit Tests

Yes, please. Please try to include unit tests with every code patch (and especially new classes and large changes). Make sure unit tests pass locally before submitting the patch.

Also, see Section 1.6.2.1, “Mockito”.

1.7.4. Attach Patch to Jira

The patch should be attached to the associated Jira ticket via "More Actions -> Attach Files". Make sure you select the ASF license inclusion option; otherwise the patch can't be considered for inclusion.

Once attached to the ticket, click "Submit Patch" and the status of the ticket will change. Committers will review submitted patches for inclusion into the codebase. Please understand that not every patch may get committed, and that feedback will likely be provided on the patch. Fear not, though, because the HBase community is helpful!

1.7.5. Common Patch Feedback

The following items are representative of common patch feedback. Your patch process will go faster if these are taken into account before submission.

See the Java coding standards for more information on coding conventions in Java.

1.7.5.1. Space Invaders

Rather than do this...

if ( foo.equals( bar ) ) {     // don't do this

... do this instead...

if (foo.equals(bar)) {

Also, rather than do this...

foo = barArray[ i ];     // don't do this

... do this instead...

foo = barArray[i];   

1.7.5.2. Auto Generated Code

Auto-generated code in Eclipse often looks like this...

 public void readFields(DataInput arg0) throws IOException {    // don't do this
   foo = arg0.readUTF();                                       // don't do this

... do this instead ...

 public void readFields(DataInput di) throws IOException {
   foo = di.readUTF();

See the difference? 'arg0' is what Eclipse uses for arguments by default.

1.7.5.3. Long Lines

Keep lines less than 80 characters.

Bar bar = foo.veryLongMethodWithManyArguments(argument1, argument2, argument3, argument4, argument5);  // don't do this

... do this instead ...

Bar bar = foo.veryLongMethodWithManyArguments(argument1,
 argument2, argument3, argument4, argument5);

... or this, whichever looks better ...

Bar bar = foo.veryLongMethodWithManyArguments(
 argument1, argument2, argument3, argument4, argument5);

1.7.5.4. Trailing Spaces

This happens more than people would imagine.

Bar bar = foo.getBar();     <--- imagine there's an extra space(s) after the semicolon instead of a line break.

Make sure there's a line-break after the end of your code, and also avoid lines that have nothing but whitespace.
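
Both this item and the long-lines item are easy to check before generating a patch. A minimal sketch, assuming a hypothetical helper class StyleCheck (not part of HBase) and reading "less than 80 characters" as a maximum of 79:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of HBase.
final class StyleCheck {
  // "Less than 80 characters" means 79 is the longest acceptable line.
  static final int MAX_LENGTH = 79;

  private StyleCheck() {
  }

  // Returns one message per problem found, using 1-based line numbers.
  static List<String> check(List<String> lines) {
    List<String> problems = new ArrayList<String>();
    for (int i = 0; i < lines.size(); i++) {
      String line = lines.get(i);
      int number = i + 1;
      if (line.length() > MAX_LENGTH) {
        problems.add(number + ": line too long");
      }
      if (line.isEmpty()) {
        continue; // a truly empty line is fine
      }
      if (line.trim().isEmpty()) {
        problems.add(number + ": whitespace-only line");
      } else if (Character.isWhitespace(line.charAt(line.length() - 1))) {
        problems.add(number + ": trailing whitespace");
      }
    }
    return problems;
  }
}
```

The lines of a source file can be obtained with java.nio.file.Files.readAllLines and passed straight to check.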

1.7.5.5. Implementing Writable

Every class returned by RegionServers must implement Writable. If you are creating a new class that needs to implement this interface, don't forget the default constructor.
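
The default constructor matters because a Writable is typically created empty through reflection and then filled in by readFields. The sketch below shows that round trip using only the JDK. SimpleWritable is a local stand-in that declares the same two methods as org.apache.hadoop.io.Writable, and RowCount and WritableRoundTrip are hypothetical, so treat this as an illustration rather than HBase code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Local stand-in with the same two methods as org.apache.hadoop.io.Writable.
interface SimpleWritable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical example class.
final class RowCount implements SimpleWritable {
  private String table = "";
  private long rows;

  // Default constructor, so a receiver can create an empty instance.
  public RowCount() {
  }

  public RowCount(String table, long rows) {
    this.table = table;
    this.rows = rows;
  }

  public void write(DataOutput out) throws IOException {
    out.writeUTF(table);
    out.writeLong(rows);
  }

  public void readFields(DataInput in) throws IOException {
    table = in.readUTF();
    rows = in.readLong();
  }

  String getTable() {
    return table;
  }

  long getRows() {
    return rows;
  }
}

final class WritableRoundTrip {
  private WritableRoundTrip() {
  }

  // Serializes the value, then rebuilds it the way a receiver would:
  // instantiate through the no-argument constructor, then call readFields.
  static <T extends SimpleWritable> T copy(T value, Class<T> type)
      throws Exception {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    value.write(new DataOutputStream(bytes));
    T fresh = type.getDeclaredConstructor().newInstance();
    fresh.readFields(
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    return fresh;
  }
}
```

Without the no-argument constructor, the getDeclaredConstructor() call in copy fails with NoSuchMethodException, which mirrors the failure a missing default constructor causes at deserialization time.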

1.7.5.6. Javadoc

This is also a very common feedback item. Don't forget Javadoc!

1.7.5.7. Javadoc - Useless Defaults

Don't just leave the @param arguments the way your IDE generated them. Don't do this...

  /**
   * 
   * @param bar             <---- don't do this!!!!
   * @return                <---- or this!!!!
   */
  public Foo getFoo(Bar bar);

... either add something descriptive to the @param and @return lines, or just remove them. But the preference is to add something descriptive and useful.
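
For comparison, a descriptive version might look like the following. The class, its field, and the wording are hypothetical; the point is only that each tag says something a reader could not guess from the signature.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class used only to illustrate descriptive Javadoc.
final class FooRegistry {
  private final Map<String, String> foosByBar = new HashMap<String, String>();

  /**
   * Registers the foo that should be returned for a given bar.
   *
   * @param bar name of the bar to register; must not be null
   * @param foo value to associate with <code>bar</code>
   */
  void register(String bar, String foo) {
    if (bar == null) {
      throw new IllegalArgumentException("bar must not be null");
    }
    foosByBar.put(bar, foo);
  }

  /**
   * Looks up the foo registered for a bar.
   *
   * @param bar name of the bar to look up
   * @return the registered foo, or null if none was registered for bar
   */
  String getFoo(String bar) {
    return foosByBar.get(bar);
  }
}
```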

1.7.5.8. One Thing At A Time, Folks

If you submit a patch for one thing, don't do auto-reformatting or unrelated reformatting of code in a completely different area of the code.

Likewise, don't add unrelated cleanup or refactorings outside the scope of your Jira.

1.7.5.9. Ambiguous Unit Tests

Make sure that you're clear about what you are testing in your unit tests and why.

1.7.6. ReviewBoard

Larger patches should go through ReviewBoard.

For more information on how to use ReviewBoard, see the ReviewBoard documentation.

1.7.7. Committing Patches

Committers do this. See How To Commit in the HBase wiki.

Committers will also resolve the Jira, typically after the patch passes a build.