Refactor "Corrupt Data" Tests in TestHLogSplit

Review Request #1115 - Created Oct. 28, 2010 and submitted

Alex Newman
old-hbase
hbase-2935
hbase
nspiegelberg
So I took the approach taken with the instrumented sequence file rider. Not sure if exposing those methods in the WALReader is cool so I am down with other suggestions.

- Add a FaultySequenceFileLogReader
- Un-commented the 2935 tests
Running on our internal hudson as I type this. Also I ran the TestHLogSplit test locally. I'll update the review if hudson is green.
  1. This looks great to me.  Only one thing that seems odd, the first thing below.   Nicolas, what you think?
  2. Do you need to do this?  You can't get your WALReader in before first construction?
    
    I think I see why you want to use it... its because the test already has an HLog?  What about just making a new HLog instance in your tests so you don't have to have this?
    1. I think it would be easier to change logReaderClass to package private and directly access that.  Especially since the wal namespace is so restrictive
  3. enum.valueOf might help here?  see http://leepoint.net/notes-java/oop/enums/enums.html
  4. 
      
  1. 
      
  2. i had a quick look and i dont think there is a way out, since logWriterClass (and reader) are statics, there is no way to be first to the punch, since i believe surefire is running all the tests in 1 VM.
    
    Probably making this a package private instead woudl be a good goal.  It'd be nice if there was a annotation to indicate this was  test hook, but doc should suffice for now. so 'static void resetLogReaderClass'
  3. 
      
Review request changed
  1. 
      
  2. woops pretend this says 
        return FailureType.valueOf(conf.get("faultysequencefilelogreader.failuretype", FailureType.NONE.name()));
    
  3. 
      
  1. I'm okay with the general concept/direction.  I think you should get rid of resetLogReaderClass(), but other than that, it looks pretty clean :)  I think there's more edge cases with IOE/ParseExceptions that we could theoretically handle, but this is a positive step and mitigates the regression.
    1. Right on, what are some edge cases? What's the best way to nuke that state? Eventually, with distributed log splitting, this will eventually need to work in a distributed way.
  2. 
      
  1. Anything else need to be done here?
  2. 
      
Loading...