In this commit, we add a new CLI option to control if we D/C on slow
pongs or not. Due to the existence of head-of-the-line blocking at
various levels of abstraction (app buffer, slow processing, TCP kernel
buffers, etc), if there's a flurry of gossip messages (eg: 1K channel
updates), then even with a reasonable processing latency, a peer may
still not read our ping in time.
To give users another option, we add a flag that allows users to disable
this behavior. The default remains.
The error was never used as the init couldn't return an error, so we do
away with that. We also modify the main event loop dispatch to more
closely match other areas of the codebase.
In this commit, we make all calls to disconnect after a ping/pong
violation is detected in the `PingManager` async. We do this to avoid
circular waiting that may occur if the disconnect call back ends up
waiting on the peer goroutine to be torn down. If this happens, then the
peer goroutine will be blocked on the ping manager fully tearing down,
which is blocked on the peer disconnect succeeding.
This is a similar class of issue we've delt with recently as pertains to
the peer and the server: sync all back execution must not lead to
a circular waiting loop.
Fixes#8379
This commit refactors some of the bookkeeping around the ping logic
inside of the Brontide. If the pong response is noncompliant with
the spec or if it times out, we disconnect from the peer.