FLUME-286: DFO mode does not detect network failure
Review Request #1162 - Created Nov. 3, 2010 and submitted
Previously, the Thrift RPC clients defaulted to a timeout value of 0, which means infinity (wait forever). This change makes the timeout a configurable value that defaults to 10 seconds. (A 5-second timeout would disconnect heartbeating nodes after every heartbeat!)
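To illustrate why a finite timeout matters, here is a minimal sketch using plain `java.net` sockets rather than the actual Thrift client code in this patch. The class name `RpcTimeoutSketch`, the method `readTimesOut`, and the constant `DEFAULT_TIMEOUT_MS` are hypothetical names chosen for the example; only the 10-second default comes from the description above. With a timeout of 0 the read would block indefinitely on a silent peer; with a positive timeout it fails with `SocketTimeoutException`, which is the signal a client would need in order to fail over.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class RpcTimeoutSketch {
    // Hypothetical constant mirroring the 10-second default described above.
    static final int DEFAULT_TIMEOUT_MS = 10_000;

    /**
     * Connects to host:port and attempts a single blocking read.
     * Returns true if the read timed out, false if the peer sent data
     * or closed the connection. A timeoutMs of 0 means "wait forever",
     * which is the old behavior this change moves away from.
     */
    static boolean readTimesOut(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs); // applies to blocking reads
            InputStream in = socket.getInputStream();
            try {
                in.read();
                return false;
            } catch (SocketTimeoutException e) {
                return true; // peer went silent: treat as a network failure
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A listening socket that never accepts or replies stands in for
        // a peer that has become unreachable after the connection was made.
        try (ServerSocket silent = new ServerSocket(0)) {
            boolean timedOut = readTimesOut("127.0.0.1", silent.getLocalPort(), 200);
            System.out.println("timed out: " + timedOut);
        }
    }
}
```

The example uses a 200 ms timeout so it finishes quickly; a real deployment would read the value from configuration and fall back to something like `DEFAULT_TIMEOUT_MS`.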
Previous Thrift RPC related tests continue to work. Manually tested that Thrift RPC for DFO seems to recover properly and that Thrift heartbeats recover properly. Details: two physical machines, one with a master + collector node, one with an agent node. Start the agent sending data (console via agentDFOSink), then disconnect the ethernet wire on the master. Notice that the agent is still available after the timeout and writes data to disk. Reconnect the wire and notice that, after the retry timeout, the DFO disk logs get sent to the collector. I think I could automate this test, but it would require Linux and root access to use a firewall to simulate network partition failures.