Review Board 1.6.3

Basic throttling in Flume, http://issues.cloudera.com/browse/FLUME-64

Review Request #335 - submitted 3 years, 5 months ago

vibhor bhatt Reviewers
flume
FLUME-64 HenryR, jon, phunt
None flume
Basic throttling patch, I haven't put the throttling limits required in the configuration file so that one could play with it on run-time in debug mode.
Tested it with 10 threads pumping asciisynth data into a file; the bucket variable in controller class never went below zero, and the total bandwidth consumed by the FlumeNode never went more than 5% above the limit (observed it through "iotop").
Review request changed
Updated 3 years, 9 months ago (July 20th, 2010, 2:48 p.m.)
  • changed from This patch is to implement basic throttling in Flume. This new feature should be able to limit the number of bytes/sec being sent from all the source nodes to sink nodes on one single FlumeNode. This feature will help ensure that Flume does not exhaust all the bandwidth on one single machine. to Basic throttling in Flume, http://issues.cloudera.com/browse/FLUME-64

Posted 3 years, 9 months ago (July 20th, 2010, 7:51 p.m.)
Nice first shot.  We need to do alittle bit of work to make it stylistically ok, and there are a few questions about the implementation.

Fix the spacing.  2 space indent is our convention.  Check the wiki here for a eclipse formatter config.  See this:
https://wiki.cloudera.com/display/engineering/Java+Style+Guide
"you can find profiles in Dropbox under cloudera/development/eclipse"

Please comment public methods.  /** */ style.

Not sure making (Throttle)Controller a subclass of thread makes sense.  This seems like something that should be encapsulated within it. Would prefer just a class/limited interface that hides the fact that you are using a separate thread for this.

/** for javadoc comments  */ 
grammar: delete push
thrad

decrements # of tokens in bucket
till => until

call => calls
fix grammar
Please give a more specific name to the class because controller means nothing to me.

ThrottleController or Throttler sounds reasonable.
why not static also? (its a constant right?)
make a config property?
CONSTANT
static
currentBucketCapacity/Size/Tokens? or something like that
/** comment */

Also why return boolean? what does it mean for someone reading comments?

Restriction on when this is run?  add Preconditions to call?
synchronization issue?
payloadHeaderSize?
sp

units might help.
/** Comment please */
Use Clock.sleep instead of Thread.sleep
LOG error message
comment saying that this doesn't block  until thread has finished.  It is just a signal.

Why named halt?
Use CountDownLatch?
rename to spend/useTokens?
LOG exception
Config file?

Also, don't need 'this' here.
This thread is never going to be null!?

Also, this will break if the node gets close and restarted.
lower case variable names