In preparation for adding a NodeAnnouncement2 struct along with a
NodeAnnouncement interface, this commit renames the existing
NodeAnnouncement struct to NodeAnnouncement1.
In this commit, we update the SQL store implementation to support the
new iterator-based API for ChanUpdatesInHorizon. This includes adding
SQL query pagination support and helper functions for efficient batch
processing.
The SQL implementation uses cursor-based pagination with configurable
batch sizes, allowing efficient iteration over large result sets without
loading everything into memory. The query is optimized to use indexes
effectively and minimize database round trips.
The new GetChannelsByPolicyLastUpdateRange SQL query supports the following (see the sketch after this list):
- Cursor-based pagination using (max_update_time, id) compound cursor
- Configurable batch sizes via MaxResults parameter
- Efficient batch caching with updateChanCacheBatch helper
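As a rough sketch of the cursor-based batching described above (the table, column, and helper names here are illustrative placeholders rather than the actual generated sqlc code):

```go
package sketch

import (
	"context"
	"database/sql"
)

type chanUpdateRow struct {
	ID            int64 // Second cursor component.
	MaxUpdateTime int64 // Newest policy timestamp, first cursor component.
}

// chanUpdatesInHorizonBatches walks all rows whose max_update_time lies in
// [start, end), maxResults rows at a time. The (max_update_time, id) pair of
// the last row in each batch becomes the cursor for the next batch.
func chanUpdatesInHorizonBatches(ctx context.Context, db *sql.DB,
	start, end int64, maxResults int,
	handle func([]chanUpdateRow) error) error {

	// Seed the cursor so that the first batch starts at the horizon.
	cursorTime, cursorID := start, int64(-1)

	for {
		rows, err := db.QueryContext(ctx, `
			SELECT id, max_update_time
			FROM channel_update_horizon
			WHERE max_update_time < ?
			  AND (max_update_time > ?
			       OR (max_update_time = ? AND id > ?))
			ORDER BY max_update_time, id
			LIMIT ?`,
			end, cursorTime, cursorTime, cursorID, maxResults)
		if err != nil {
			return err
		}

		batch := make([]chanUpdateRow, 0, maxResults)
		for rows.Next() {
			var r chanUpdateRow
			if err := rows.Scan(&r.ID, &r.MaxUpdateTime); err != nil {
				rows.Close()
				return err
			}
			batch = append(batch, r)
		}
		if err := rows.Err(); err != nil {
			rows.Close()
			return err
		}
		rows.Close()

		if len(batch) == 0 {
			return nil
		}
		if err := handle(batch); err != nil {
			return err
		}

		// Advance the compound cursor past the last row seen.
		last := batch[len(batch)-1]
		cursorTime, cursorID = last.MaxUpdateTime, last.ID
	}
}
```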
In this commit, we update the SQL store implementation to support the
new iterator-based API for NodeUpdatesInHorizon. This includes adding a
new SQL query that supports efficient pagination through result sets.
The SQL implementation uses cursor-based pagination with configurable
batch sizes, allowing efficient iteration over large result sets without
loading everything into memory. The query is optimized to use indexes
effectively and minimize database round trips.
The new GetNodesByLastUpdateRange SQL query supports the following (see the sketch after this list):
* Cursor-based pagination using (last_update, pub_key) compound cursor
* Optional filtering for public nodes only
* Configurable batch sizes via MaxResults parameter
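The rough shape of the query and its parameters is shown below. The names mirror the description above but are not the exact sqlc output, and the public-node predicate in particular is only one possible way to express that filter:

```go
package sketch

// getNodesByLastUpdateRange pages nodes with a (last_update, pub_key)
// compound cursor and can optionally restrict the results to public nodes.
const getNodesByLastUpdateRange = `
	SELECT n.id, n.pub_key, n.last_update
	FROM nodes n
	WHERE n.last_update >= :start_time
	  AND n.last_update < :end_time
	  -- Resume strictly after the cursor returned by the previous batch.
	  AND (n.last_update > :cursor_time
	       OR (n.last_update = :cursor_time
	           AND n.pub_key > :cursor_pub_key))
	  -- Optionally keep only public nodes. The predicate below is a
	  -- placeholder for however public-ness is actually determined.
	  AND (:public_only = FALSE OR EXISTS (
		SELECT 1 FROM channels c
		WHERE (c.node_id_1 = n.id OR c.node_id_2 = n.id)
		  AND c.public = TRUE
	  ))
	ORDER BY n.last_update, n.pub_key
	LIMIT :max_results`

// getNodesByLastUpdateRangeParams carries the pagination state between
// batches; MaxResults is the configurable batch size.
type getNodesByLastUpdateRangeParams struct {
	StartTime    int64
	EndTime      int64
	CursorTime   int64
	CursorPubKey []byte
	PublicOnly   bool
	MaxResults   int32
}
```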
In this commit, we refactor the ChanUpdatesInHorizon method to return
an iterator instead of a slice. This change significantly reduces
memory usage when dealing with large result sets by allowing callers to
process items incrementally rather than loading everything into memory
at once.
In this commit, we refactor the NodeUpdatesInHorizon method to return
an iterator instead of a slice. This change significantly reduces
memory usage when dealing with large result sets by allowing callers to
process items incrementally rather than loading everything into memory
at once.
The new implementation uses Go 1.23's iter.Seq type to provide a
standard iterator interface. The method now supports configurable batch
sizes through functional options, allowing fine-tuned control over
memory usage and performance characteristics.
Rather than reading all the entries from disk into memory (before this
commit, we did consult the cache for most entries, skipping those disk
hits), we now expose a chunked iterator.
We also make filtering for public nodes a first-class option. This
saves many newly created DB transactions later.
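A minimal sketch of the iterator shape this enables, assuming a hypothetical fetchBatch callback in place of the real SQL pagination:

```go
package sketch

import "iter"

// chunkedSeq turns a batched fetch function into a single-use iterator.
// fetchBatch returns the next batch starting at cursor together with the
// cursor for the following batch; an empty batch means iteration is done.
func chunkedSeq[T any](fetchBatch func(cursor int64) ([]T, int64)) iter.Seq[T] {
	return func(yield func(T) bool) {
		var cursor int64
		for {
			batch, next := fetchBatch(cursor)
			if len(batch) == 0 {
				return
			}
			for _, item := range batch {
				// If the caller stops ranging early, no
				// further batches are read from disk.
				if !yield(item) {
					return
				}
			}
			cursor = next
		}
	}
}
```

A method shaped like this lets callers simply `for upd := range store.NodeUpdatesInHorizon(...)` while only one batch is resident in memory at a time.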
In this commit, we introduce a new options pattern for configuring
iterator behavior in the graph database. This includes configuration
for batch sizes when iterating over channel and node updates, as well
as an option to filter for public nodes only.
The new functional options pattern allows callers to customize iterator
behavior without breaking existing APIs. Default batch sizes are set to
1000 entries for both channel and node updates, which provides a good
balance between memory usage and performance.
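A sketch of the options shape; the option and field names here are placeholders rather than the exact exported API:

```go
package sketch

const defaultBatchSize = 1000

// iteratorCfg collects the tunables that the functional options below set.
type iteratorCfg struct {
	channelBatchSize int
	nodeBatchSize    int
	publicNodesOnly  bool
}

// IteratorOption mutates the iterator configuration.
type IteratorOption func(*iteratorCfg)

// WithNodeBatchSize overrides how many node updates are fetched per batch.
func WithNodeBatchSize(n int) IteratorOption {
	return func(c *iteratorCfg) { c.nodeBatchSize = n }
}

// WithPublicNodesOnly restricts iteration to publicly announced nodes.
func WithPublicNodesOnly() IteratorOption {
	return func(c *iteratorCfg) { c.publicNodesOnly = true }
}

// newIteratorCfg applies the given options on top of the defaults.
func newIteratorCfg(opts ...IteratorOption) *iteratorCfg {
	cfg := &iteratorCfg{
		channelBatchSize: defaultBatchSize,
		nodeBatchSize:    defaultBatchSize,
	}
	for _, opt := range opts {
		opt(cfg)
	}
	return cfg
}
```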
The SQL* helpers are meant to always set the `Valid` field of the
sql.Null* type to true. Otherwise they cannot be used to set a valid,
empty field. However, we don't want to break the behaviour of the
existing SQLStr helper and so this commit adds a new helper with the
desired functionality.
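Roughly, the difference between the two helpers looks like this (the existing behaviour is mirrored from the description above, and the new helper's name is a placeholder):

```go
package sketch

import "database/sql"

// sqlStr mirrors the existing SQLStr behaviour: an empty string is stored
// as NULL, so it cannot express a valid-but-empty column.
func sqlStr(s string) sql.NullString {
	if s == "" {
		return sql.NullString{}
	}
	return sql.NullString{String: s, Valid: true}
}

// sqlStrValid is the new flavour: Valid is always set, so an empty string
// is stored as an empty, non-NULL value.
func sqlStrValid(s string) sql.NullString {
	return sql.NullString{String: s, Valid: true}
}
```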
In this commit, we take advantage of the graph SQL migration and use it
to also extract DNS addresses from the opaque address type. We use
opaque addresses to store addresses that we don't understand yet. We
recently added logic for DNS addresses and so we may have persisted node
announcements that have DNS addresses but we would currently have them
stored under the opaque address type. So we use this migration to see if
we can extract such addresses.
A few decisions were made here:
1) If multiple DNS addresses are extracted, this is OK and we continue
to migrate the node even though this is actually invalid at a
protocol level. We do currently check (at a higher level) that a node
announcement only has one DNS address in it before we broadcast it, though.
2) If an invalid DNS address is encountered (i.e. we hit the DNS type
descriptor but the rest of the DNS address payload is invalid and
cannot be parsed into the expected hostname:port), then we skip
migrating the node completely (see the sketch after this list).
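A rough sketch of the kind of opportunistic parse involved. The descriptor constant and the exact wire layout shown here are illustrative, following the usual length-prefixed hostname plus 2-byte port encoding, and are not copied from the migration code:

```go
package sketch

import (
	"encoding/binary"
	"errors"
	"net"
	"strconv"
)

// dnsAddrDescriptor is an illustrative stand-in for the DNS address type
// byte.
const dnsAddrDescriptor = 0x05

// extractDNSAddrs scans an opaque address blob for leading DNS addresses.
// A malformed DNS payload is returned as an error so the caller can skip
// migrating the node entirely (decision 2 above). Multiple DNS addresses
// are all extracted (decision 1 above). Any remaining bytes that start with
// an unknown descriptor stay opaque, since their length cannot be known.
func extractDNSAddrs(opaque []byte) ([]string, []byte, error) {
	var addrs []string
	for len(opaque) > 0 && opaque[0] == dnsAddrDescriptor {
		payload := opaque[1:]

		if len(payload) < 1 {
			return nil, nil, errors.New("missing hostname length")
		}
		hostLen := int(payload[0])
		payload = payload[1:]

		if len(payload) < hostLen+2 {
			return nil, nil, errors.New("truncated DNS address")
		}
		host := string(payload[:hostLen])
		port := binary.BigEndian.Uint16(payload[hostLen : hostLen+2])

		addrs = append(addrs, net.JoinHostPort(
			host, strconv.Itoa(int(port)),
		))
		opaque = payload[hostLen+2:]
	}

	return addrs, opaque, nil
}
```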
The first byte of an opaque addr must be one that we don't understand
yet. We do this update in preparation for doing an on-the-fly parse of
persisted opaque addrs to see if they contain addrs that we now support.
For this to work, the first byte can't be 0x01 since this maps to a known
address.
This commit simplifies insertChanEdgePolicyMig. Much of the logic can be
removed given that this method is only used in the context of the graph
SQL migration.
This should improve the performance of the migration quite a lot since
it removes the extra GetChannelAndNodesBySCID call.
Finally, we make the channel-policy part of the SQL migration idempotent
by adding a migration-only policy insert query which will not error out
if the policy already exists and does not have a timestamp that is newer
than the existing record's timestamp. To keep the commit simple, an
insertChanEdgePolicyMig function is added which is basically identical
to the updateChanEdgePolicy function except for the fact that it uses
the newly added query. In the next commit, it will be simplified even
more.
In this commit, we make the channel part of the graph SQL migration
idempotent (retry-safe!). We do this by adding a migration-only channel
insert query that will not error out if the query is called and a
channel with the given SCID & version already exists. We also ensure that
errors are not thrown if existing channel features & extra types are
re-added.
There is no need to use the "collect-then-update" pattern for node
insertion during the SQL migration since if we do have any previously
persisted data for the node and happen to re-run the insertion for that
node, the data will be exactly the same. So we can make use of "On
conflict, do nothing" here too.
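For the node step, the migration-only insert boils down to an "ON CONFLICT DO NOTHING" statement along these lines (the table and column names are stand-ins for the real schema):

```go
package sketch

// insertNodeMig is the migration-only flavour of the node insert: a
// conflicting row can only be the exact same announcement, so a retried
// insert is silently ignored instead of tripping the run-time
// newer-timestamp constraint.
const insertNodeMig = `
	INSERT INTO graph_nodes (version, pub_key, alias, last_update)
	VALUES ($1, $2, $3, $4)
	ON CONFLICT (pub_key, version) DO NOTHING`
```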
In this commit, the graph SQL migration is updated so that the node
migration step is retry-safe. This is done by using migration-specific
logic & queries that do not use the same node-update-constraint as the
normal node upsert logic. For normal "run-time" logic, we always expect
a node update to have a newer timestamp than any previously stored one.
But for the migration, we will only ever be dealing with a single
announcement for a given node & to make things retry-safe, we don't want
the query to error if we re-insert the exact same node.
In preparation for handling retries on the source DB side, we thread
through the `reset` callbacks properly so that we can reset appropriate
variables.
In preparation for making the channel & policy migration logic
idempotent in a step-by-step manner, we add a test here that only tests
the migration of channels _without_ policies so that we can first focus
on just making the channel migration idempotent.
Currently, the graph SQL migration is not retry-safe, meaning that if
the source DB executes a retry under the hood, this could result in the
migration failing. In preparation for fixing this, we adjust the
migration test accordingly.
This will help us test idempotency later on, but it also ensures that
TestMigrateGraphToSQL is properly testing writes to the
graph_channel_policy_extra_types table.
Use the new t.Context feature of Go 1.24 and fix linter warnings.
This change was produced by:
- running golangci-lint run --fix
- sed 's/context.Background/t.Context/' -i `git grep -l context.Background | grep test.go`
- manually fixing broken tests
- itest, lntest: use ht.Context() where ht or hn is available
- in HarnessNode.Stop() we keep using context.Background(), because it is
called from a cleanup handler in which t.Context() is canceled already.
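For reference, the resulting pattern in tests looks like this (t.Context was added to the testing package in Go 1.24; the test name below is just an example):

```go
package sketch

import "testing"

func TestSomething(t *testing.T) {
	// Previously: ctx := context.Background()
	//
	// t.Context() is cancelled just before functions registered via
	// t.Cleanup run, which is why HarnessNode.Stop, called from a
	// cleanup handler, keeps using context.Background().
	ctx := t.Context()
	_ = ctx
}
```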
Finally, we update the migrateZombieIndex function to use batch
validation just like was done in the previous commits. Here, we
additionally make sure to validate the entire zombie index entry and not
just the SCID.
As was done in the previous commits for nodes & channels, we update the
migrateClosedSCIDIndex function here so that it validates migrated
entries in batches rather than one-by-one.
As was done in the previous commits for nodes & channels, we update the
migratePruneLog function here so that it validates migrated entries in
batches rather than one-by-one.
Restructure the `migrateChannelsAndPolicies` function so that it does the
validation of migrated channels and policies in batches. So instead of
fetching each channel and its policies individually after migrating it, we
wait for a minimum batch size to be reached and then validate a batch of
them together. This lets us make way fewer DB round trips.
Restructure the `migrateNodes` function so that it does the validation of
migrated nodes in batches. So instead of fetching each node individually
after migrating it, we wait for a minimum batch size to be reached and
then validate a batch of nodes together. This lets us make way fewer DB
round trips.
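A sketch of the collect-then-validate pattern used by these batched checks, with hypothetical types and callbacks standing in for the real migration and read-back queries:

```go
package sketch

import (
	"bytes"
	"fmt"
)

// node is a pared-down stand-in for the real node record.
type node struct {
	PubKey []byte
	Alias  string
}

func nodesEqual(a, b node) bool {
	return bytes.Equal(a.PubKey, b.PubKey) && a.Alias == b.Alias
}

const validationBatchSize = 1000 // Illustrative threshold.

// migrateAndValidateNodes migrates nodes one by one but only validates them
// once a full batch has accumulated, so reading back the migrated data costs
// one DB round trip per batch instead of one per node.
func migrateAndValidateNodes(src []node, migrate func(node) error,
	fetchBatch func(pubKeys [][]byte) (map[string]node, error)) error {

	pending := make([]node, 0, validationBatchSize)

	flush := func() error {
		if len(pending) == 0 {
			return nil
		}
		keys := make([][]byte, len(pending))
		for i, n := range pending {
			keys[i] = n.PubKey
		}

		// One round trip fetches every node in the batch.
		migrated, err := fetchBatch(keys)
		if err != nil {
			return err
		}
		for _, want := range pending {
			got, ok := migrated[string(want.PubKey)]
			if !ok || !nodesEqual(want, got) {
				return fmt.Errorf("node %x not migrated "+
					"correctly", want.PubKey)
			}
		}
		pending = pending[:0]

		return nil
	}

	for _, n := range src {
		if err := migrate(n); err != nil {
			return err
		}
		pending = append(pending, n)

		if len(pending) >= validationBatchSize {
			if err := flush(); err != nil {
				return err
			}
		}
	}

	// Validate whatever is left over in the final partial batch.
	return flush()
}
```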
We do this so that this lookup is only done in situations where it is
actually needed. During a migration, we don't need to special-case this
AlreadyExists error since we will terminate the transaction either way.
So there is no need for the extra lookup during the migration.
A timing analysis showed that this query was significantly impacting the
performance of the migration when run with a postgres backend.
This commit adds a BenchmarkFindOptimalSQLQueryConfig test in the
graph/db package which runs ForEachNode and ForEachChannel queries
against a local backend using various different values for the sql
QueryConfig struct. This is done to determine good default values to
use for the config options for sqlite vs postgres.
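The benchmark is roughly of this shape; the swept values, config fields, and helper below are indicative only and not the real benchmark body:

```go
package sketch

import (
	"fmt"
	"testing"
)

// queryConfig is a pared-down stand-in for the SQL query config being tuned.
type queryConfig struct {
	MaxBatchSize int
}

// runForEachNodeAndChannel stands in for running ForEachNode and
// ForEachChannel against a locally provisioned backend with the given
// config.
func runForEachNodeAndChannel(cfg queryConfig) {
	_ = cfg
}

func BenchmarkFindOptimalSQLQueryConfig(b *testing.B) {
	for _, batchSize := range []int{250, 500, 1000, 5000} {
		cfg := queryConfig{MaxBatchSize: batchSize}

		b.Run(fmt.Sprintf("batch=%d", batchSize), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				runForEachNodeAndChannel(cfg)
			}
		})
	}
}
```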