This commit simplifies insertChanEdgePolicyMig. Much of the logic can be
removed given that this method is only used in the context of the graph
SQL migration.
This should improve the performance of the migration quite a lot since
it removes the extra GetChannelAndNodesBySCID call.
Finally, we make the channel-policy part of the SQL migration idempotent
by adding a migration-only policy insert query which will not error out
if the policy already exists and does not have a timestamp that is newer
than the existing records timestamp. To keep the commit simple, a
insertChanEdgePolicyMig function is added which is basically identical
to the updateChanEdgePolicy function except for the fact that it uses
the newly added query. In the next commit, it will be simplified even
more.
In this commit, we make the channel part of the graph SQL migration
idempotent (retry-safe!). We do this by adding a migration-only channel
insert query that will not error out if a the query is called and a
chanenl with the given scid&version already exists. We also ensure that
errors are not thrown if existing channel features & extra types are
re-added.
There is no need to use the "collect-then-update" pattern for node
insertion during the SQL migration since if we do have any previously
persisted data for the node and happen to re-run the insertion for that
node, the data will be exactly the same. So we can make use of "On
conflict, no nothing" here too.
In this commit, the graph SQL migration is updated so that the node
migration step is retry-safe. This is done by using migration specific
logic & queries that do not use the same node-update-constraint as the
normal node upsert logic. For normal "run-time" logic, we always expect
a node update to have a newer timestamp than any previously stored one.
But for the migration, we will only ever be dealing with a single
announcement for a given node & to make things retry-safe, we dont want
the query to error if we re-insert the exact same node.
Finally, we update the migrateZombieIndex function to use batch
validation just like was done in the previous commits. Here, we
additionally make sure to validate the entire zombie index entry and not
just the SCID.
As was done in the previous commits for nodes & channels, we update the
migrateClosedSCIDIndex function here so that it validates migrated
entries in batches rather than one-by-one.
As was done in the previous commits for nodes & channels, we update the
migratePruneLog function here so that it validates migrated entries in
batches rather than one-by-one.
Restructue the `migrateChannelsAndPolicies` function so that it does the
validation of migrated channels and policies in batches. So instead of
fetching channel and its policies individually after migrating it, we
wait for a minimum batch size to be reached and then validate a batch of
them together. This lets us make way fewer DB round trips.
Restructue the `migrateNodes` function so that it does the validation of
migrated nodes in batches. So instead of fetching each node individually
after migrating it, we wait for a minimum batch size to be reached and
then validate a batch of nodes together. This lets us make way fewer DB
round trips.
We do this so that this lookup is only done in the situation it is
actually needed. During a migration, we dont need to special case this
AlreadyExists error since we will terminate the transaction either way.
So there is no need for the extra lookup during the migration.
A timing analysis showed that this query was significantly impacting the
performance of the migration when run with a postgres backend.
The following performance gains were measured using the new benchmark
test.
```
name old time/op new time/op delta
ChanUpdatesInHorizon-native-sqlite-10 18.5s ± 3% 2.0s ± 5% -89.11% (p=0.000 n=9+9)
ChanUpdatesInHorizon-native-postgres-10 59.0s ± 3% 0.8s ±10% -98.65% (p=0.000 n=10+9)
```
Let the helper method only take the params it needs so that we dont need
to construct an entire models.ChannelEdgeInfo object to pass to it. This
will be useful later on.
Here we adjust the ForEachNodeCached graph DB method to pass in a node's
addresses into the provided call-back if requested. This will allow us
to improve the performance of node/channel iteration in the autopilot
subserver.
Previously, ForEachNodeCached would batch fetch node _feature_ data but
would still fetch the channel set of each node in a node-by-node fashion
which is not ideal. So this commit updates this method to make use of
the new sqldb.ExecuteCollectAndBatchWithSharedDataQuery helper. It lets
us batch load channel data for a range of node IDs.
This _greatly_ improves the performance of the method.
In this commit, we update the SQLStore's ForEachNodeCacheable and
ForEachNodeCached methods to use batch collection for node feature bits.
This results in the following performance gains:
```
name old time/op new time/op delta
ForEachNodeCacheable-native-sqlite-10 184ms ± 2% 145ms ±10% -21.45% (p=0.000 n=10+10)
ForEachNodeCacheable-native-postgres-10 697ms ± 8% 51ms ± 4% -92.68% (p=0.000 n=9+10)
```
This commit is a pure refactor. Here all we are doing is removing the
old `pageSize` constant used in the SQLStore code and instead using the
value provided in the QueryConfig struct.
We rename this helper along the config types & helper types for it
because the word "page" is used more often in the context of paging
through results using an offset and limit whereas this helper is
specifically used to split up the slice in queries of the form
"WHERE x in []slice". We do this rename so that there is mimimal
confusion in contexts where we use batching along with actual paging.
The config struct is also renamed to QueryConfig in preparation for it
holding more config options.
Here, we add a new `buildNodeWithBatchData` helper method that can be
used to construct a `models.LightningNode` object using pre-fetched
batch data. The existing `buildNode` method is then adjusted to use this
new helper.
In this commit, we add the queries that will be needed to batch-fetch
the data of a set of nodes. The logic for using these new queries is
also added but not used yet.
In this commit, we remove the LEFT JOIN query that was used for fetching
a nodes addresses. The reason it was used before was to ensure that we'd
get an empty address list if the node did exist but had no addresses.
This was for the purposes of the `AddrsForNode` method since it needs to
return false/true to indicate if the given node exists.
Use the new `SLICES` directive to add a DeleteChannels query which takes
a set of DB channel IDs. Then replace all our calls to DeleteChannel
with a paginated call to DeleteChannels.