HBASE-2501 refactor StoreFile and HFile

Review Request #182 — Created June 15, 2010 and submitted

Refactor StoreFile and StoreFileScanner and normalize the code to the structure we want. More work left to push KeyValue up to Store left.

  1. Lots of gratuitous changes making the patch fatter than it need be (lots of white-space changes).
    Publishing this review because afraid I might lose what I've done so far.... part 2 coming in a second.
  2. Whats changing here?
  3. whats changing here?
  4. Loads of changes above but seem to be white-space only.
    This buf.rewind() seems misaligned.  Is this buf.rewind correct here?
  5. Whats an hfile name?  Name of the file?
  6. Interesting, this goes a way?
  7. StoreFileReader implements Reader?  Or Reader goes away?
  8. We now have a StoreFileReader but we retain StoreFile.Writer, not a StoreFileWriter?
  9. This says HFile in comment.  That intentional?
  10. Whats this comment about?
  2. YOu mean the 'regionserver' package?
  3. I don't like putting the throws here...
  4. You mean change them so they don't mention hfile?  Just remove them but support the old one just in case?
  5. whats the change here?
  6. Needs a class comment
  7. These are duplicated?
  8. So, this is not in the interface any more but still an implemenation?
  9. shouldSeek is a bloom filter doo-hickey?
  10. I prefer the previous incarnation where the line ends in an operator... so reader naturally is led to the next line because line can't end in an operator -- not this operator anyways.
  11. what changed in above?
  12. We should add to NOTICE.txt mention of onelab.
  13. what are the changes in the above?
  2. lots of trailing white space auto-removed... 
  3. this doesnt belong here, it belongs in the StoreFileScanner level, which is responsible for higher level metadata (such as bloom filters)
  4. There will never be a unified reader interface between StoreFile and HFile - nor do we want one.  Since there is only one implementation of each, there are no interfaces.  StoreFileReader replaces the previous references to HFile.Reader and is the class that code will use now
  5. it's a mystery! no one knows!  Because StoreFileReader is bigger than StoreFile.Writer and I broke it out?  Balancing between excessively large StoreFile class and class file explosion?
  6. yes, because fileinfo is a HFile concept
  7. ah looks like leftovers
  8. that's right, the bloom filter is NOT a hfile concept.
  9. trailing white space - I had rebased this change, so I dont know why the previous trailing white space commits didn't remove it already?
  10. probably a stale comment
  11. StoreFileReader is the interface and the implementation, there is no seperate interface (yet).
  12. this is comments to myself for a future change, I should remove it
  13. it is, and the scanner now directly supports it (by delegating to the reader)
  1. Please also consider my second set of comments and post update to patch so we can get it in.  Thanks RR.
  1. How do I know my second set of comments have been addressed?  You make no reference to them.  Do I assume they have been accomodated or do I have to go verify for myself?
    1. you can check the diff between the diffs... but reviewboard doesnt seem to be carrying the comments to the new diff... odd
  2. Is this right?
    1. The HalfStoreFileReader derives from StoreFile.Reader, and it also provides a getScanner() (which is a HFileScanner). This is wrapped by StoreFile and StoreFileScanner.  Few people who use the getScanner() need to be changed to use the StoreFileScanner, and I have marked getScanner() as @Deprecated.
  3. Does this belong in this patch?
    1. Reviewboard says this is from Diff revision 1, and I have removed the file
Review request changed


  1. +1