These race conditions originate from the mock database storing and
returning pointers, rather than returning copies.
Observed on Travis:
WARNING: DATA RACE
Read at 0x00c0003222b8 by goroutine 149:
github.com/lightningnetwork/lnd/watchtower/wtclient.(*sessionQueue).drainBackups()
/home/runner/work/lnd/lnd/watchtower/wtclient/session_queue.go:288 +0xed
github.com/lightningnetwork/lnd/watchtower/wtclient.(*sessionQueue).sessionManager()
/home/runner/work/lnd/lnd/watchtower/wtclient/session_queue.go:281 +0x450
Previous write at 0x00c0003222b8 by goroutine 93:
github.com/lightningnetwork/lnd/watchtower/wtclient.getClientSessions()
/home/runner/work/lnd/lnd/watchtower/wtclient/client.go:365 +0x24f
github.com/lightningnetwork/lnd/watchtower/wtclient.(*TowerClient).handleNewTower()
/home/runner/work/lnd/lnd/watchtower/wtclient/client.go:1063 +0x23e
github.com/lightningnetwork/lnd/watchtower/wtclient.(*TowerClient).backupDispatcher()
/home/runner/work/lnd/lnd/watchtower/wtclient/client.go:784 +0x10b9
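As a rough sketch of the pattern (the types and method names here are illustrative, not the actual mock DB code): handing out the stored pointer lets two goroutines share mutable state, while returning a copy keeps each caller isolated.

```go
package mockdb

import "sync"

// ClientSession is a stand-in for the real session type.
type ClientSession struct {
	SeqNum uint16
}

type mockSessionDB struct {
	mu       sync.Mutex
	sessions map[uint64]*ClientSession
}

// Racy: the caller receives the very pointer the store later mutates,
// so concurrent readers and writers race on the same struct.
func (db *mockSessionDB) getSessionShared(id uint64) *ClientSession {
	db.mu.Lock()
	defer db.mu.Unlock()
	return db.sessions[id]
}

// Safe: the caller receives an independent copy of the stored value.
func (db *mockSessionDB) getSessionCopy(id uint64) *ClientSession {
	db.mu.Lock()
	defer db.mu.Unlock()
	session := *db.sessions[id]
	return &session
}
```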
The logger string used to identify the wtclient and wtclientrpc loggers
was the same, making it impossible to modify the log level of the
wtclient logger: it would always be overwritten by the wtclientrpc one.
To simplify things, we use the existing RPC logger for wtclientrpc
instead.
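A minimal illustration of why the collision matters (the tag string and registry below are hypothetical, not lnd's actual logger code): subsystem loggers are keyed by their tag, so a second registration under the same tag silently replaces the first, and any level change only ever reaches the last logger registered.

```go
package main

import "fmt"

// Illustrative only: subsystem loggers keyed by their tag string.
var subLoggers = map[string]string{}

func register(tag, owner string) {
	subLoggers[tag] = owner
}

func main() {
	register("WTCL", "wtclient")
	register("WTCL", "wtclientrpc") // overwrites the wtclient entry

	fmt.Println(subLoggers["WTCL"]) // wtclientrpc
}
```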
The new table format for the pay command started to use the
`Milliseconds()` method on `time.Duration`. However, this method was only
added in Go 1.13, so it breaks the build for Go 1.12. We replace it with
manual division: a `time.Duration` natively counts nanoseconds, so we
convert to milliseconds by dividing by `time.Millisecond`, which is
1,000,000 nanoseconds.
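A small sketch of the replacement (the surrounding code is illustrative, not the exact lncli change):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	d := 1500 * time.Millisecond

	// Go 1.13+ only: d.Milliseconds()
	// Works on Go 1.12 as well: a time.Duration counts nanoseconds, so
	// dividing by time.Millisecond (1,000,000 ns) yields milliseconds.
	ms := int64(d / time.Millisecond)
	fmt.Println(ms) // 1500
}
```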
In this commit, we eliminate an extraneous copy in the `QueryPayments`
method. Before this commit, we would copy each payment from the initial
`FetchPayments` call into a new slice. However, `FetchPayments` returns
pointers to payments, so we can simply keep those references rather than
copying again when we limit our response.
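A rough sketch of the idea (names and the offset/limit handling are illustrative, not the actual channeldb code): since the fetch already yields a slice of pointers, re-slicing it is enough to bound the response, and no per-payment copy is needed.

```go
package payments

// Payment is a stand-in for the real payment record.
type Payment struct {
	PaymentHash [32]byte
}

// fetchPayments stands in for the call that already returns pointers.
func fetchPayments() ([]*Payment, error) {
	return []*Payment{{}, {}, {}}, nil
}

func queryPayments(indexOffset, maxPayments uint64) ([]*Payment, error) {
	payments, err := fetchPayments()
	if err != nil {
		return nil, err
	}

	// Re-slice the existing slice to enforce the offset and limit; the
	// pointers are carried over as-is instead of being copied into a
	// second slice.
	start := indexOffset
	if start > uint64(len(payments)) {
		start = uint64(len(payments))
	}
	end := start + maxPayments
	if end > uint64(len(payments)) {
		end = uint64(len(payments))
	}
	return payments[start:end], nil
}
```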
If the server hasn't fully started yet, it's possible that the channel
event store hasn't either, so it won't be able to consume any requests
until then. To prevent blocking, we'll just omit the uptime-related
fields for now.
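A sketch of the non-blocking pattern (all types, fields, and method names below are hypothetical): only consult the event store once the server reports it has fully started, and otherwise leave the uptime fields unset.

```go
package rpcsketch

import "time"

// Hypothetical stand-ins for the real server and event store types.
type eventStore struct{}

func (e *eventStore) uptime(chanPoint string) (time.Duration, time.Duration) {
	return 0, 0
}

type server struct {
	started    bool
	eventStore *eventStore
}

type channelInfo struct {
	chanPoint string
	uptime    time.Duration
	lifetime  time.Duration
}

// channelInfoFor only queries the event store once the server has fully
// started; before that, the uptime fields are simply omitted so the RPC
// never blocks on a subsystem that can't serve requests yet.
func (s *server) channelInfoFor(chanPoint string) *channelInfo {
	info := &channelInfo{chanPoint: chanPoint}
	if !s.started {
		return info
	}
	info.uptime, info.lifetime = s.eventStore.uptime(chanPoint)
	return info
}
```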
There seems to be a misinterpretation of a variable between the
btcwallet and gozmq libraries. When establishing a ZMQ connection, gozmq
expects a timeout, which is used to set read deadlines and determine how
long we should wait before attempting a reconnection. Within btcwallet
and lnd, however, this is interpreted as a polling duration, which
explains the current value of 100ms. Under load, especially on
less-capable hardware, this leads to high resource usage as we get into
a constant reconnection loop. To remedy this, we use a timeout of 5s
instead, which is a much more reasonable value for read timeouts and is
also what's used for LN peers.
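A generic illustration of the effect (this is not the gozmq API, just a plain net.Conn read loop): with a 5s read deadline, a healthy but quiet connection simply extends its deadline every 5 seconds, instead of timing out and triggering reconnection logic every 100ms.

```go
package zmqsketch

import (
	"net"
	"time"
)

// readTimeout mirrors the 5s value chosen for the ZMQ connection.
const readTimeout = 5 * time.Second

func readLoop(conn net.Conn, handle func([]byte)) error {
	buf := make([]byte, 4096)
	for {
		if err := conn.SetReadDeadline(time.Now().Add(readTimeout)); err != nil {
			return err
		}
		n, err := conn.Read(buf)
		if err != nil {
			if nerr, ok := err.(net.Error); ok && nerr.Timeout() {
				// Nothing arrived within the timeout; loop again
				// (or reconnect) rather than failing hard.
				continue
			}
			return err
		}
		handle(buf[:n])
	}
}
```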