HBASE-3448 : RegionSplitter Utility class for manual region splitting

Review Request #1469 - Created Jan. 16, 2011 and updated

For certain use cases, manually splitting regions has a number of advantages over letting the HBase split code decide this for you automatically. Recent API additions to HBaseAdmin and HTable already allow you to split manually on a small scale. This JIRA imports a RegionSplitter utility program that helps pre-split a table and perform rolling splits on a live table when needed. It will also add documentation answering common questions about why you would pre-split.
bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f test:rs myTable
bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -r myTable
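To illustrate what pre-splitting into a fixed number of regions involves, here is a minimal sketch that computes evenly spaced split keys over a hex-string keyspace. It is not the code from this patch: the class `EvenSplitSketch`, the method `splitKeys`, and the 8-hex-digit keyspace are assumptions made for illustration, and the utility's own split algorithm may differ.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of the RegionSplitter patch.
class EvenSplitSketch {

    /**
     * Returns numRegions - 1 boundary keys that divide the assumed
     * keyspace [00000000, ffffffff] into numRegions equal ranges.
     */
    public static List<String> splitKeys(int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be >= 1");
        }
        BigInteger keyspace = BigInteger.ONE.shiftLeft(32); // 2^32 possible keys
        BigInteger n = BigInteger.valueOf(numRegions);
        List<String> keys = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            // i-th boundary = floor(keyspace * i / numRegions)
            BigInteger boundary = keyspace.multiply(BigInteger.valueOf(i)).divide(n);
            keys.add(String.format("%08x", boundary));
        }
        return keys;
    }

    public static void main(String[] args) {
        for (String key : splitKeys(4)) {
            System.out.println(key);
        }
    }
}
```

With 4 regions the sketch yields the boundaries 40000000, 80000000 and c0000000; with 60 regions, as in the first command above, it would yield 59 boundaries.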
  1. I did a quick pass.  Looks good to me, Nicolas.  A few comments below.
  2. Watch the old white space, and if you get a chance, use javadoc formatting (it's like HTML formatting) in here; otherwise you lose your nice layout when it's transformed.
  3. Once committed, I'll pull this stuff up into the HBase book.  It's good stuff on auto-split or not.
  4. Does this code exist in HBaseAdmin too?
    1. For the most part, yes.  The big difference is making sure the whole operation is synchronous, even if it takes a long time.
  5. This is a long method.
    1. Yes, but it's mostly coupled code.  It's definitely worth refactoring if some of this code is useful beyond this application.
  6. This is a long wait.  Make it shorter?
    1. Actually, this is pretty short, because you need to wait for 2 major compactions to complete (each was taking about 1 hr on our clusters).  If this is an issue, we can use an exponential backoff sleep interval like I did in the first section.
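
For reference, the exponential backoff idea from the last reply can be sketched as below. This is not the code from the patch: the class `BackoffWaitSketch`, its method names, and the interval values are hypothetical, and the completion check is left as a caller-supplied condition.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not part of the RegionSplitter patch.
class BackoffWaitSketch {

    /** Delay before retry number `attempt` (0-based): initial * 2^attempt, capped at maxMillis. */
    public static long delayMillis(int attempt, long initialMillis, long maxMillis) {
        long delay = initialMillis;
        for (int i = 0; i < attempt; i++) {
            if (delay > maxMillis / 2) {
                return maxMillis; // doubling would exceed the cap (also avoids overflow)
            }
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /**
     * Polls `done` until it returns true, sleeping longer after each failed check.
     * Returns the number of failed checks before success, or -1 if the
     * attempts run out or the thread is interrupted.
     */
    public static int waitUntil(BooleanSupplier done, long initialMillis,
                                long maxMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (done.getAsBoolean()) {
                return attempt;
            }
            try {
                Thread.sleep(delayMillis(attempt, initialMillis, maxMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return -1;
            }
        }
        return -1;
    }
}
```

The cap keeps the longest sleep bounded, so a wait for something slow like a major compaction starts with short checks and settles at a fixed maximum interval.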