HBASE-2223

Review Request #76 - Created May 21, 2010 and submitted

Jean-Daniel Cryans
old-hbase
HBASE-2223
hbase
This is HBASE-2223 AKA Replication 2.0, it is currently only a "preview patch" as it's pretty much feature complete, works on a cluster, has unit tests and whatnot, but it could use a lot more testing and cleaning and ideas from others.


  1. 
      
  2. Silly little nit: hbase.replication.enable instead. 
  3. Should not force a config variable like this? Set only if unset? Principle of least user surprise.
  4. Should be something like replicationMaster for clarity.
  5. Comment is out of date.
  6. Remove, don't comment.
  7. Remove, don't comment.
  8. Should be a replication specific config var?
  9. Should be a replication specific config var?
  10. Should be replication.peers?
  11. Should be replication.master?
  12. Should be replication.state?
  13. Should be replication.clusterId?
  14. Make this a real shell command now that replication is in core.
    1. https://issues.apache.org/jira/browse/HBASE-2201
  15. Should be a shell command.
  16. 
      
  1. 
      
  2. Why does this start at `true'?
    
    Furthermore, I don't see where this attribute is used, I guess I'm missing something.  It's only referenced in instantiateHLog, does it mean it can be a local variable in there?
    
    Ah, after reading more code, I think I get it.  ReplicationStatusWatcher can change this value, which will influence the creation of subsequent HLogs.  Is that correct?
    1. Actually this is a AtomicBoolean, and it is passed down to ReplicationSourceManager and ReplicationZookeeperHelper in instantiateHLog. The latter will change the value that will affect the former. It's starting at 'true' because when you enable replication you expect it to work without doing anything else fancy?
  3. The variable newlog is useless.  You can directly return the HLog instance.
  4. I don't understand this comment.
    1. could be clearer yes. I'm trying to say that we need to do the RZH instantiation here because we need to be set on the start code we use. It could be closer to where it's really set tho, it's coming from when replication was a contrib and I didn't have access to most of the HRS methods.
  5. This seems unnecessary to me.
  6. This call largely duplicates the one below.  In one case you use `fs' in the other `this.fs'.  Be consistent.
    
    Since all the arguments but one are the same, I suggest you move the call outside of the if block.  Furthermore, when this.replicationEnabled is false, you can still pass this.replicationManager since it will be null.
  7. In some places you use `this.fs' and in others you use `getFileSystem()'.  Since getFileSystem() is protected, I suppose it means it's expected to be overridden in certain sub-classes.  What's the rule as to when to use `this.fs' vs. `getFileSystem()'?
    1. yeah it's still leftovers from when replication was contrib. Should be using this.fs
  8. You're calling kv.getFamily 3 times in a row, how about adding a local variable?
  9. I suggest moving the check `scope != REPLICATION_SCOPE_LOCAL' first as it's faster.
  10. What's the status of this?
    1. I wasn't sure, but I think we can delete since it's done around line 1489
  11. How about promoting this to `LOG.info'?
    It seems like it's not going to be too noisy (correct me if I'm wrong) and it can be useful to know when firefighting.  I know a lot of production systems run with debug logging but maybe that's because we should revisit what is logged at the INFO level vs. DEBUG?
    1. I'd say the reason is that HBase is still a young project, and we often have bugs that are hard to find because of the distributed coordination. That said, yeah it could be INFO.
  12. Can we stop using this idiom to inherit from HConstants?  This is a known bad pattern and is recommended against in many places including Effective Java (item 17).
    1. Open a jira. Better do it at once.
  13. +1 with Andrew on both comments
  14. This line exceeds 80 characters, please wrap it around after the 1st argument of getLong.
  15. This local variable is useless IMO.
  16. You don't need to synchronize on concurrent collection.  Actually, Joshua Bloch says that doing this is "ALWAYS wrong!" and "BROKEN!"
    
    See slide 30:
    http://www.cs.umd.edu/class/fall2009/cmsc132H/slides/still-effective.pdf
    1. Yeah well this part should actually be refactored with HCM.TableServers.getHRegionConnection.
  17. use a <pre> block to make this more readable in the HTML version of the javadoc.
  18. Don't implement HConstants.
  19. Can you please comment all the attributes and explain what they contain / what they do?
    
    I don't find them all super explicit at first sight.
  20. Why use an AtomicBoolean instead of a volatile boolean?
  21. Normally the first sentence in a javadoc should be terminated by a period.
  22. Maybe it would be good to mention that this can be `null' and what it entails when it is.
  23. +1 on all the following naming comments.
  24. By reading this javadoc I'm not sure what this String is supposed to look like.
  25. Isn't this going to throw an NPE if get() returns null?
    1. mmm yeah should check for that.
  26. Please document the `znode' argument.
  27. So in this method the key in peerClusters is called `znode' whereas in the previous method it's called `peerClusterId'.  I find this confusing.
  28. I think this should throw an IllegalArgumentException instead of just logging an error.
  29. Don't do this.  Use proper logging.
  30. Don't do this.  Use proper logging.
  31. Don't do this.  Use proper logging.
    
  32. Don't do this.  Use proper logging.
  33. Don't do this.  Use proper logging.
  34. Don't do this.  Use proper logging.
  35. Don't do this.  Use proper logging.
  36. You can move this to just before line 345 (Effective Java #29).
  37. Don't do this.  Use proper logging.
  38. Don't do this.  Use proper logging.
  39. Don't do this.  Use proper logging.
  40. Don't do this.  Use proper logging.
  41. Remove unnecessary parentheses.
  42. Don't do this.  Use proper logging.
  43. Don't do this.  Use proper logging.
  44. Don't do this.  Use proper logging.
  45. Don't implement HConstants.
  46. Fix indentation.  Don't do this.  Use proper logging.
  47. Don't do this.  Use proper logging.
  48. 
      
  1. I stopped at ReplicationLogCleaner.java, will continue tomorrow.
  2. 
      
  1. 
      
  2. Please document.  The code also handles when this argument is null.  I think this needs to be spelled out in the javadoc.
  3. How much more expensive is it to complete the loop instead of doing the early return here?
    With the early return, each call is a bit less expensive but you have to go through this method more frequently since `this.hlogs' is almost always going to be partial.
    Without the early return the call might be a bit more expensive but at least `this.hlogs' is accurate.
    1. Good question. In the current code, we will only return to query Zookeeper when there's a new log that's being archived e.g. the rest of the time it will always serve from memory. IMO this solution will query a lot less, but will keep HLogs longer than they need to (unless the cache is refreshed, we could think a HLog is being used while it has not been for while). I still prefer that to hitting ZK as much times as you have region servers in your cluster. A smarter thing to do would be to keep watches on all HLogs, but a lot harder to manage and I thought that code be improved later.
  4. I don't think you need to store yet another reference to this object.  Since you're passing it on to the `ttlCleaner', you don't need that extra reference here, I think.
    1. Yeah, but it creates a weird cohesion between both objects.
  5. And if you do what I said above, you can replace this with
      return ttlCleaner.getConf();
  6. Shouldn't this forward the call to the `ttlCleaner'?
  7. It would be good to comment what this lock protects and what the condition is used for.
    1. yeah might as well comment all attributes
  8. Why are some arguments declared `final' and others not?  Please be consistent.  I'm in favor of using `final' everywhere.
    1. They're not final when the pointer needs to change?
    2. Ah sorry now I see it. Yeah I don't need final there, the class attributes already are
  9. I think it would be good to document the fact that this method will typically be called from another thread than the thread that executes `run' so that other people reading the code will quickly get a good grasp of what are the concurrency / locking requirements.
  10. I think you don't need to wrap this around.
  11. This should be outside of the `try' block, otherwise you may attempt to unlock something you've not locked, which will throw a runtime IllegalMonitorStateException.
  12. Move this local variable to right before the nested while loop at line 170.
  13. Shouldn't you catch and deal with the InterruptedException specifically for this call?  Right now it'll be caught by the very broad `catch (Exception ex)' at line 246 and it will cause replication to stop.  I'm not sure this is intended.
  14. You're calling `edit.getKeyValues()' 9 times below, which makes the code harder to read.  Can you please use a local variable?
  15. So Delete operations are "unbuffered" unlike Put operations, which you "buffer" in the `puts' list.  Does that mean that a Delete can be executed before the Put that was creating the data in the first place, and that the Delete will fail first and the Put will survive second?
  16. Shouldn't you call `putTable' in a `finally' statement to ensure you return the `table' to the pool if the call to `delete' throws an exception?
    1. If deletes throws an exception, it will stop the replication anyways.
  17. Shouldn't there also be an upper bound on how many elements we can add to `puts'?  If I use HBase with a single table then this `if' condition will never become true.
    1. True, but at line 207 it will take care of it
  18. Same comment as above regarding the use of a `finally'.
  19. You're calling editsSize.get() both here and 4 lines above.  Are you assuming that the value won't change in the mean time?  I think you should use a local variable to avoid this problem.
    1. Well it's used between 2 different threads. 
  20. instead of `+ ex', pass `ex' as a second argument.
  21. Ditto, pass `re' as a 2nd argument instead of concatenating it to the string.
  22. Since this method is `protected', please use Javadoc.
  23. Instead of duplicating the code of the method `close' from above, just call `close();'.
  24. Instead of using this constructor I recommend using the constructor where you can specify in advance the capacity of the StringBuilder, since you can trivially tell in advance what the final size of the string will be.
  25. 
      
  1. I'll pick up at ReplicationSource.java after lunch
  2. 
      
  1. I'll pick up where I stopped in TestReplication.java tomorrow.
  2. Terminate this sentence with a period.
    
    Generally speak you must terminate the first sentence of a javadoc with a period.
  3. I don't find `replication.source.size.capacity' particularly explicit.  What is it supposed to control?
    1. I'll put more details into their attributes. Also if we document those configs I will add details in hbase-default.xml
  4. Nit pick: mapOfAddr since in English "Address" starts with "Addr" :)
    1. Avoue que ça te mélanges autant que moi ;)
  5. How about using a do-while instead of repeating this line of code?
  6. If HServerAddress respects the equal contract, consider using a set instead of a map.
    1. And then, even better, might as well just use a HashSet :P
  7. Move this outside of the try block.
  8. The explicit call to toString() here is useless.
  9. Is there any reason why you're checking this here?
    1. It's a symptom of a problem with my code, I'll fix that.
  10. Log a message too.
    This will call Log#error(java.lang.Object message) and I presume you meant to call Log#error(java.lang.Object message, java.lang.Throwable t)
  11. I find this comment confusing.  Can you make it a bit more explicit, especially towards the "normally has a position" part?
    1. Yeah, in fact I could just remove it... It means that we recovered a queue from another region server, and normally you are tailing a file but a RS could fail between 2 logs and that means that the new one wasn't read from yet.
  12. Don't lock in the `try' block.
  13. Why are you checking this again here?
  14. All the double-negations with noIOE are harder to read than if instead you were starting with boolean gotIOE = false and set it to true when you get an IOE.
  15. This `try' block is massive, would it be possible to refactor it using a private method to make the code a bit more readable?
  16. Use a local variable for entry.getKey() instead of calling it so many times in a row.
  17. Unnecessary call to toString().
  18. It seems that considerDumping will always be true except when you fail to stat() the file due to an unexpected error.  That seems suspicious to me.  Is this intended?  If yes, can you explain?
    1. I agree that part is weird, I built this while HDFS wasn't stable yet for tail'ing files and ended up having to decide if I should ditch a file that was broken. Currently you consider dumping the file when you get an EOF and if:
       - The queue was recovered and is empty. It's very suspicious. Usually you get EOFs on file that are opened but totally empty. Might as well get rid of it?
       - The number of entries read is greater than 0. That means you were reading the file, and instead of just getting null at the end you get an EOF in the face. By experience, it's usually a race condition à la HDFS-1057 then just redo the reading. If it failed 10 times, it means that that file is totally broken. 
      
      Comments?
  19. Need a message in first argument.
  20. Please document everything properly.
  21. How about storing the result of edit.getKeyValues() in a local instead of repeating that code and repeating the call twice at every loop?
  22. Put this in a try-finally block.  Just because you're nulling one reference doesn't protect you from this thread getting interrupted and then you're left with a locked lock.
  23. You need to enclose te <li> in <ul> or <ol> and close the tags properly.
  24. Can you please document everything non obvious.
  25. queueRecovered should be documented too.
  26. Does this statement need to be in the synchronized block?  If yes, why?
  27. "Add" vs "tries" – be consistent.  Use imperative everywhere or don't use it.
  28. Since log messages can be intertwined if the server is busy logging a lot, how about making this and the previous line a single message.
    1. Already documented in the interface
  29. This doesn't need to be in the synchronized block, does it?
  30. Remove this line and the next to avoid code duplication with line 250/251.
  31. Don't you only need to synchronize on those two lines instead of all the other ones too?
    1. All from the implemented interface
  32. Why is this `protected' instead of `private'?
  33. Pardon my ignorance but what's the `2' supposed to mean here?
  34. I'm not sure whether this is useful since junit will already give us all the info in the event of an unexpected exception.  Furthermore, the first argument to LOG.error must be a message.
  35. This would be a better API if it received a boolean in argument and transformed it in a String itself.
  36. Maybe you can add a watcher for yourself too so you get notified when you can continue instead of sleeping SLEEP_TIME and hoping that the timing will work out OK?
    1. Seems the class would be even more complicated.
  37. Do you need an empty method?
    1. Was keeping it around in case I need it again.
  38. 
      
  1. 
      
  2. This test is pretty long and tests quite a few different things.  Consider to break it down into multiple smaller unit tests.
  3. I'm not sure to understand why this test is named like that.  Could you comment this test a bit more to explain what the intent is?
  4. This seems brittle to me.
    1. Kinda, but it's currently the best I found.
    1. There are some race conditions currently preventing that to work. Same reason I need to make sure in this test that the master cluster has all the edits when I kill a region server GC-style. See this code:
      
          // Test we actually have all the rows, we may miss some because we
          // don't have IO fencing.
          if (res.length != initialCount) {
            LOG.warn("We lost some rows on the master cluster!");
            // We don't really expect the other cluster to have more rows
            initialCount = res.length;
          }
      
      So if I can lose rows on the master cluster, I can also lose rows on the slave cluster... but rows that are lost because of a lack of IO-fencing or because of a bug in the replication will be the same from the client perspective.
  5. I believe this method can be made `static'.
  6. I'm not a big fan of this.
  7. Do you really need to define the 3 following empty methods?
  8. Consider writing a comment explaining what this test does.
  9. Missing space before `conf'.
  10. +"" is an idiom I don't like.
  11. I don't understand why you're doing +=2 and /2 instead of just setting the upper bound of the loop to BATCH_SIZE/2.
  12. Missing space before `scanRes'.
  13. 
      
  1. 
      
  2. One space before `port'.
  3. Unnecessary call to `toString()'.
    1. It's untouched by my patch, but ok ;)
  4. 
      
  1. I got as far as ReplicationLogCleaner... will continue later.
  2. bin/replication/add_peer.rb (Diff revision 5)
     
     
    Should you point at some replication documentation here?  Is there such a thing?
    1. package.html later, should I point to it?
  3. bin/replication/copy_tables_desc.rb (Diff revision 5)
     
     
    This could get a bit annoying I'd say.
    1. It helped me a lot, remove if people complain?
  4. This has to go here?  Can it go into one of the replication classes?
    1. Used by master and region server, to me it belongs there.
  5. Can't you just do c.get("key", defaultvalue)?
    1. No, I also do a check on replication.
  6. You writing startcode into zk?  Why not write servername -- the host+port+startcode combo?
    1. To be coherent with the rest of the code that uses zookeeper.
  7. Is this directory name?  Confusingly named given rootdir+regLogPathStr only adds up to repLogPath.
    1. I don't understand you, but this code is going to be removed in my next patch as I'm simplifying RepSink.
  8. Replication needs package documentation or else an article (like metrics) -- oh, i see it later... nm
  9. Peers are named '1', '2'?  Can't we have more meaningful names here?
    1. We agreed that peers are identified with a short internally as it is stored. We could use an external mapping of short->cute_name.
  10. Use servername instead of startcode
    1. Same comment as before, needs to be coherent.
  11. All RS's in a master cluster replicate?
    1. Yep... was that an implicit way of saying that I need to document that in RZH?
  12. Should this class be called WRapper instaad of Helper?
  13. You mean 'ensemble' here rather than 'quorum' (Patrick will kill you if he sees you calling it a 'quorum' when you mean the other)
    1. Argh I'm trying to correct myself but I'm still missing some of them. Thx!
  14. We keep up the replication position in zk?  How much do we replicate in one go?  Its not a single edit, is it?  We do this for every log file?
    1. Yes. A defined amount specified in ReplicationSource. No. Every current log file, we only replicate one at a time per region server.
    1. I'll do like the rest and log.error
  15. We return empty map if clusters size is == 1?  Should that be clusters.size == 0?
    1. That part isn't clear enough, so the reason it's 1 and not 0 is that we put a lock in there so it's listed in the znodes we fetch. Actually this should be <= 1 rather than ==.
    1. See previous comment, we lock the dead region server's znode by putting a lock in there, but we don't want to process the hlogs under since... it's not a cluster. Could use more doc.
  16. Just logging errors?  What if session expired (our discussion from last day)?
    1. Yes I need to review how I handle it in RZH, but I'd also need to review ZKW since some methods will hid it in there.
  17. Whats this about?  You need to run zk yourself but no zoo.cfg?
    1. I... don't remember why I wrote this.
  18. And if not?  What if replicating single-family only?
    1. Forgot to update that after we added scoping, updating.
  19. Has to be offline?  Will this always be the case?
    1. Currently everything is static, but I hope we can move on from that in the future.
    1. This is a log snippet that's coming from a region server. Do you want to see more documentation about it in package.html or in the logging itself?
  20. 
      
  1. First pass.  Maybe by the 3rd pass I'll have an idea of whats going on.  General comment is that there is a lot of new code here but tests seem to test replication system.  There are few instances of unit tests ensuring newly added methods are working properly.
  2. For sure setConf will have been called before we get here?  So, stuff gets setup by setConf?  Can setConf be called more than once?  How do I know how to use this class?  Not doc'd.  Doesn't have a Constructor.
    1. LogCleanerDelegate is the interface that defines the general behavior. Yes should have a constructor.
  3. The way this is done, if I didn't want to wait on the ttl, then I'd have to write a new class.  Can't we have ttl and recplication be distinct and then if I want delete based off ttl and whether log up in zk, then chain them?
    1. I don't follow, chaining is already how I do it.
    1. Yeah, RepSink is a mix of 2 solutions but only features the worst of both. The next patch will significantly make it better.
  4. Should read this out of config. rather than hardcode 10.
  5. Long while loop; can break it up?
  6. Only operate on the first kv?
  7. Do you have to write position back to zk?
  8. Can code from HLog be used here?
    1. I ain't.. but it's used like one.
    1. This is the down side of the way of I'm caping the log entries by size or number. I'm reusing the same HLog.Entry[] entriesArray to read from HLogs (and the entries in it). For example, replicationQueueSizeCapacity=64MB and replicationQueueNbCapacity=25k. Let's say on a first run we reach 25k without reaching the size, so we'll replicate the whole array. Now on the second run let's say we reached 64MB after only 10k rows, then we only want to replicate that and not the 15k "leftovers".
  9. Not a constructor.  If javadoc in an interface, you don't need to reproduce the javadoc in the implementation.
  10. This should be SortedSet, not TreeSet... or NavigableSet.
  11. No dfs in this test.  Thats intentional?
  12. Can't you squash some of these tests together?  They each start up own minidfscluster... just start it once?
    1. They don't?
      
        @Before
        public void setUp() throws Exception {
          table1 = TEST_UTIL.truncateTable(TABLE_NAME1);
          table2 = TEST_UTIL.truncateTable(TABLE_NAME2);
          Thread.sleep(SLEEP_TIME);
        }
  13. 
      
  1. Missing is more desciption of how this feature works.  There is package doc. on how to get it going but needs fuller description of how replication works (how zk is used, how it manages hlogs, how it sends edits to remote cluster, etc.).  There is insufficient description in javadoc.  Without this, there is a danger that only the author will be able to figure whats going on in here.
    
    How does one get insight on to how well or badly replication is working?
    
    Alot of errors are logged and then we just move on.  Thats fine for an alpha package but need a plan for making it all more robust?  What you thinking?
  2. bin/replication/add_peer.rb (Diff revision 7)
     
     
    These log messages add nothing?  Remove?  Just restating what was passed on cmdline.
  3. bin/replication/add_peer.rb (Diff revision 7)
     
     
    Why do this?  Its annoying?
    
    It'd be better spending time interrogating what the user passed to see if it makes sense throwing errors if incorrectly formatted or if it doesn't look like it makes sense?
  4. bin/replication/add_peer.rb (Diff revision 7)
     
     
    A message on end saying what it did can be comforting to users.
  5. bin/replication/copy_tables_desc.rb (Diff revision 7)
     
     
    See above comments.
  6. Change name of this config to hbase.replication.
  7. Does this method have prerequisites? For example, does replication have to be enabled?
  8. Is this a leak from another patch?
    1. Like I said in the description of this patch: "It also has the patch from HBASE-2707 applied because without it it's too easy to fail TestReplication. "
  9. This is a leak from another patch?
    1. Like I said in the description of this patch: "It also has the patch from HBASE-2707 applied because without it it's too easy to fail TestReplication. "
  10. Whats going on here?  We're setting into config the logcleaner to use but then in the lines after this we go ahead and create OldLogsCleaner anyways?
    1. OldLogsCleaner uses the the class that was specified in the configuration.
  11. Just call this 'replication' or 'replicationEnabled'
    1. Side note, I think every reviewer had a different preference for the name of this attribute.
  12. We are a ReplicationSink or a ReplicationMaster?  One or the other?  Can we not move all of this Replication stuff into a Replication class including the setup of whether we are a Sink or Master rather than have it hang out here in HRS?  
  13. I don't grok the comment
  14. This could be null if we are a replication master?
  15. If the above mentioned Replication class were a singleton, it could be shared here with HRS
    1. I created a visitor list to inspect log entries before they get appended. Replication is now out of HLog.
  16. It'd be good if we didn't have to create this byte [] per edit.  I should reinstitute the comparator I had that took KVs but only compared family portion of the KV... no need to create family byte [] then.
  17. YOu might say why its being moved.  Should be INFO level?  WALs are kinda important to track?
  18. isReplicating is what you would name the method that accesses the boolean replicating... with this name the accessor should be named isIsReplicating.
  19. Why do it this way?  Why an if/else?  Why not do if (...length != 3) throw .... no need of an else.
  20. This is an error? Is it critical fail (We dropped recording a WAL)  Are we just logging and moving on?
    1. https://issues.apache.org/jira/browse/HBASE-2791
  21. Same here.... We just log and keep going?  Are any of these fatal?
  22. If this fails, what are the repercusssions?
  23. What happens if this server dies?  It gets cleaned up by master?
    
    What happens to replication if master dies?  Does master failover need to do anything special to pick up running replication?
  24. Why this implementation have to know about other implementations?  Can't we do a chain of decision classes? Any class can say no?  As soon as any decision class says no, we exit the chain.... So in this case, first on the chain would be the ttl decision... then would be this one... and third would be the snapshotting decision. You don't have to do the chain as part of this patch but please open an issue to implement.
    1. https://issues.apache.org/jira/browse/HBASE-2792
  25. Its not just MDC, right?
    1. That'll be Multi Cluster Replication then!
  26. Is it user tables or CF in user tables?
  27. Maybe say something about how it works... pool of HTables to write remote peer, etc.
  28. Why again does replication keep its own set of logs rather than just keep positions on HRS logs (whether local HRS or the HRS of others?)  So it prevails across HRS failures?  Aren't logs archived now rather than thrown away so this should be fine?  Otherwise, we are writing edits to FS twice, right?
    1. It's totally different in this context, but I currently removed the logs in ReplicationSink until some refactoring happens.
  29. This should be inside a finally block?
  30. Duplicated code -- see up about ten lines.
    1. Not it's different, one inserts and the other deletes
    2. Ok oh yeah there is a better way to do it.
  31. You are stopping the regionserver because replication failed?
    1. It's actually more like 55 seconds
  32. We carry edits in memory?  And its outside of our normal accounting -- e.g. cache and max for memstores?  How we get it into the mix?  Else we could OOME if we don't account for this memory payload.
    1. You think ReplicationSource should implement HeapSize?
  33. Stop cluster or stop the host?
  34. Whats this about?  IN particular, the 'handling story'...
    1. I'll talk about this more in the doc
  35. When would it make sense to ever replicate catalog table entries?
    
    Why we even in here reading a file if replicating == false
    1. - When would it make sense to ever replicate catalog table entries?
      Never I'd say, why do you ask?
      
      - Why we even in here reading a file if replicating == false
      A user can switch that at anytime
  36. Are you checking size of these edits?  You'll read 25k entries of any size?
    1. That's exactly what this does:
      
            if ((this.reader.getPosition() - this.position)
                >= this.replicationQueueSizeCapacity ||
  37. Where do you explain replication hlog filenaming convention?
    1. Replication doesn't have a special naming convention.
  38. Can only do one sink, right?  Might want to say so.  How does the source know the sink to replicate to?  Should that be part of the Interface?
    
    The comment for the Interface could say how the Interface is used... init, then set sink, then per file archived, call ... etc.
  39. Make this "This class manages all the replication sources"
    
    "There are two classes of them"... what you mean to say here?  Two classes of source?
    1. Yes, recovered or not, and it's even explained the 2 lines after.
    1. Changing it to:
      Old sources, they were recovered from a failed region server and our only goal is to finish replicating the HLog queue it had up in ZK
  40. You should say more here about whats going on.. either here or up in the class comment.  It has watchers on all other RSs in current cluster and will try to take on the workloads of others on crash (?).
  41. You reporting status or updating updated position into zk?  There is a running status being kept up in zk?  Shouldn't this log message have the log path in it too?
    
    This method does more than update position or report status... looks like it changes state in this class updating set of logs?
    1. will change name and add doc
  42. Shouldn't this be nonmodifiable List that you give out here since it seems you have it synchronized internally here.
  43. What other regionserver?  There is alot that goes unexplained in here?  Could point to the doc. on how replication works.  Same for the transer method above (I have an idea because you white boarded it for me).
  44. 
      
Review request changed
  1. +1 on commit if all tests pass.  I sat wé J-D and went over the changes.  All is good for first commit.
  2. 
      
Loading...