HBase Setup: Standalone Mode on a Windows Machine

HBase is a popular NoSQL database that runs on top of HDFS. It is a column-oriented database modeled on Google's Bigtable. It can be set up in three modes:
  • Standalone Mode (without HDFS)
  • Pseudo-Distributed mode (With single node HDFS)
  • Fully-distributed mode
Of the three, for now only standalone mode can be set up on Windows, but that is enough to play around and get familiar with HBase. The setup is quite easy, but there are a few caveats worth knowing when troubleshooting, especially if you are new to HBase. I'll call them out as tips as I go. So, let's get started with the basic setup:
  1. The first step is to download the stable version of HBase (found in the folder named "stable" at the mirror location) as a tarball from the website and extract it to a suitable location. On the suggested mirror site, look for the hbase-<version>-bin.tar.gz file.




  2. Next, make sure that Java is installed on your system and JAVA_HOME is set. To verify, open a command prompt (cmd) and type "java -version"; it should show the version of Java installed. Alternatively, you can set the Java path directly in hbase-env.bat, which resides in the conf folder of the HBase installation.
  3. Tip: Beware of spaces in the Java path; HBase has trouble handling such paths. If your Java resides in "C:/Program Files", use the short form "C:/Progra~1" instead.


  4. The other setting to change in hbase-env.bat is:
    • HBASE_IDENT_STRING (it may already be there and just need to be uncommented)
     set HBASE_IDENT_STRING=%USERNAME%  
    

  5. Now we need to modify hbase-site.xml, which is the main configuration file for HBase. A standalone instance needs the following properties, placed inside the <configuration> element:
    • hbase.rootdir: The directory where HBase stores its data. Note that the property name has no dot between "root" and "dir".
    • hbase.tmp.dir: A temporary directory on the local file system. HBase keeps its temporary files and the ZooKeeper-related folders here.
    • hbase.zookeeper.quorum: The ZooKeeper instance. In this case, it will be 127.0.0.1 (I'm using the IP instead of "localhost" as HBase sometimes has trouble resolving the IP from the host name).
     
          <property>  
               <name>hbase.rootdir</name>  
               <value>D:\tmp\hbase\data</value>  
          </property>  
          <property>  
               <name>hbase.tmp.dir</name>  
               <value>D:\tmp\hbase\tmp</value>  
          </property>  
          <property>  
               <name>hbase.zookeeper.quorum</name>  
               <value>127.0.0.1</value>  
          </property>  
    

  6. Set up HBASE_HOME as an environment variable and add %HBASE_HOME%\bin to the Path environment variable.
  7. Open a command prompt and go to the bin folder of the HBase installation. Run "start-hbase.cmd" to start HBase.
  8. Tip/Caveat: If you do not have a Hadoop installation on your machine, HBase startup may fail, saying that it cannot find the "winutils" binary. This is because HBase looks for the Windows native files winutils.exe and hadoop.dll under the path given by the HADOOP_HOME environment variable. So place these two files in the bin subfolder of a directory of your choice and set that directory as HADOOP_HOME, either as an environment variable or in hbase-env.bat.
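The environment-related steps above can be collected into a few lines of batch. This is only a sketch: the paths C:\Progra~1\Java\jdk1.8.0, D:\hbase and D:\hadoop are placeholders I made up for illustration, and the "set" commands only affect the current command prompt session. It also assumes that Hadoop resolves the native binary as %HADOOP_HOME%\bin\winutils.exe.

```bat
rem Sketch only: every path below is a placeholder, adjust it to your machine.

rem Use the 8.3 short name so the Java path contains no spaces.
set JAVA_HOME=C:\Progra~1\Java\jdk1.8.0

rem Root folder of the extracted HBase tarball.
set HBASE_HOME=D:\hbase

rem Folder whose bin subfolder holds winutils.exe and hadoop.dll.
set HADOOP_HOME=D:\hadoop

rem Make the HBase scripts callable from any folder in this session.
set PATH=%PATH%;%HBASE_HOME%\bin

rem Sanity checks: print the Java version and look for winutils.exe.
"%JAVA_HOME%\bin\java" -version
if exist "%HADOOP_HOME%\bin\winutils.exe" (echo winutils found) else (echo winutils missing)
```

To make the values permanent, set them once in the Windows environment variables dialog, or keep the JAVA_HOME and HADOOP_HOME lines in hbase-env.bat.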
Once HBase has started, open a command prompt, navigate to the bin folder of the HBase installation and run the command "hbase shell" to start the HBase shell. You can then run some HBase shell commands to verify your installation (more on the commands in the next post!).
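A quick smoke test might look like the following. The commands after "hbase shell" are typed at the shell's own prompt, not in cmd, so they are shown here as comments. The table name 'test' and the column family 'cf' are arbitrary names chosen for this sketch.

```bat
rem Start the interactive shell (assumes HBase is already running
rem and HBASE_HOME is set as described in the steps above).
cd /d %HBASE_HOME%\bin
hbase shell

rem Then, at the HBase shell prompt, type for example:
rem   status
rem   create 'test', 'cf'
rem   put 'test', 'row1', 'cf:a', 'value1'
rem   scan 'test'
rem   list
```

If "status" prints a cluster summary rather than an error and "scan" shows the row you just inserted, the standalone instance is working.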


Enjoy playing around with HBase!



