Merge bitcoin/bitcoin#25740: assumeutxo: background validation completion

2b373fe49d docs: update assumeutxo.md (James O'Beirne)
87a1108c81 test: add snapshot completion unittests (James O'Beirne)
d70919a88f refactor: make MempoolMutex() public (James O'Beirne)
7300ced9de log: add LoadBlockIndex() message for assumedvalid blocks (James O'Beirne)
d96c59cc5c validation: add ChainMan logic for completing UTXO snapshot validation (James O'Beirne)
f2a4f3376f move-only-ish: init: factor out chainstate initialization (James O'Beirne)
637a90b973 add Chainstate::HasCoinsViews() (James O'Beirne)
c29f26b47b validation: add CChainState::m_disabled and ChainMan::isUsable (James O'Beirne)
5ee22cdafd add ChainstateManager.GetSnapshot{BaseHeight,BaseBlock}() (James O'Beirne)

Pull request description:

  This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)

  Part two of replacing https://github.com/bitcoin/bitcoin/pull/24232.

  ---

  When a user activates a snapshot, the serialized UTXO set data is used to create an "assumed-valid" chainstate, which becomes active in an attempt to get the node to network tip as quickly as possible. Simultaneously in the background, the already-existing chainstate continues "conventional" IBD to both accumulate full block data and serve as a belt-and-suspenders to validate the assumed-valid chainstate.

  Once the background chainstate's tip reaches the base block of the snapshot used, we set `m_stop_use` on that chainstate and immediately take the hash of its UTXO set; we verify that this matches the assumeutxo value in the source code. Note that while we ultimately want to remove this background chainstate, we don't do so until the following initialization process, when we again check the UTXO set hash of the background chainstate, and if it continues to match, we remove the (now unnecessary) background chainstate, and move the (previously) assumed-valid chainstate into its place. We then reinitialize the chainstate in the normal way.

  As noted in previous comments, we could do the filesystem operations "inline" immediately when the background validation completes, but that's basically just an optimization that saves disk space until the next restart. It didn't strike me as worth the risk of moving chainstate data around on disk during runtime of the node, though maybe my concerns are overblown.

  The final result of this completion process is a fully-validated chain, where the only evidence that the user synced using assumeutxo is the existence of a `base_blockhash` file in the `chainstate` directory.

ACKs for top commit:
  achow101:
    ACK 2b373fe49d

Tree-SHA512: a204e1d6e6932dd83c799af3606b01a9faf893f04e9ee1a36d63f2f1ccfa9118bdc1c107d86976aa0312814267e6a42074bf3e2bf1dead4b2513efc6d955e13d
This commit is contained in:
Andrew Chow
2023-03-07 18:36:51 -05:00
5 changed files with 737 additions and 90 deletions

View File

@@ -3,9 +3,9 @@
Assumeutxo is a feature that allows fast bootstrapping of a validating bitcoind
instance with a very similar security model to assumevalid.
The RPC commands `dumptxoutset` and `loadtxoutset` are used to respectively generate
and load UTXO snapshots. The utility script `./contrib/devtools/utxo_snapshot.sh` may
be of use.
The RPC commands `dumptxoutset` and `loadtxoutset` (yet to be merged) are used to
respectively generate and load UTXO snapshots. The utility script
`./contrib/devtools/utxo_snapshot.sh` may be of use.
## General background
@@ -22,10 +22,6 @@ be of use.
chainstate running asynchronously in the background. We also use this flag to control
which index entries are added to setBlockIndexCandidates during LoadBlockIndex().
- Indexing implementations via BaseIndex can no longer assume that indexation happens
sequentially, since background validation chainstates can submit BlockConnected
events out of order with the active chain.
- The concept of UTXO snapshots is treated as an implementation detail that lives
behind the ChainstateManager interface. The external presentation of the changes
required to facilitate the use of UTXO snapshots is the understanding that there are
@@ -76,9 +72,15 @@ original chainstate remains in use as active.
Once the snapshot chainstate is loaded and validated, it is promoted to active
chainstate and a sync to tip begins. A new chainstate directory is created in the
datadir for the snapshot chainstate called `chainstate_snapshot`. When this directory
is present in the datadir, the snapshot chainstate will be detected and loaded as
active on node startup (via `DetectSnapshotChainstate()`).
datadir for the snapshot chainstate called `chainstate_snapshot`.
When this directory is present in the datadir, the snapshot chainstate will be detected
and loaded as active on node startup (via `DetectSnapshotChainstate()`).
A special file is created within that directory, `base_blockhash`, which contains the
serialized `uint256` of the base block of the snapshot. This is used to reinitialize
the snapshot chainstate on subsequent inits. Otherwise, the directory is a normal
leveldb database.
| | |
| ---------- | ----------- |
@@ -88,7 +90,7 @@ active on node startup (via `DetectSnapshotChainstate()`).
The snapshot begins to sync to tip from its base block, technically in parallel with
the original chainstate, but it is given priority during block download and is
allocated most of the cache (see `MaybeRebalanceCaches()` and usages) as our chief
consideration is getting to network tip.
goal is getting to network tip.
**Failure consideration:** if shutdown happens at any point during this phase, both
chainstates will be detected during the next init and the process will resume.
@@ -107,33 +109,32 @@ sequentially.
### Background chainstate hits snapshot base block
Once the tip of the background chainstate hits the base block of the snapshot
chainstate, we stop use of the background chainstate by setting `m_stop_use` (not yet
committed - see #15606), in `CompleteSnapshotValidation()`, which is checked in
`ActivateBestChain()`). We hash the background chainstate's UTXO set contents and
ensure it matches the compiled value in `CMainParams::m_assumeutxo_data`.
The background chainstate data lingers on disk until shutdown, when in
`ChainstateManager::Reset()`, the background chainstate is cleaned up with
`ValidatedSnapshotShutdownCleanup()`, which renames the `chainstate_[hash]` datadir as
`chainstate`.
chainstate, we stop use of the background chainstate by setting `m_disabled`, in
`CompleteSnapshotValidation()`, which is checked in `ActivateBestChain()`). We hash the
background chainstate's UTXO set contents and ensure it matches the compiled value in
`CMainParams::m_assumeutxo_data`.
| | |
| ---------- | ----------- |
| number of chainstates | 2 (ibd has `m_stop_use=true`) |
| number of chainstates | 2 (ibd has `m_disabled=true`) |
| active chainstate | snapshot |
**Failure consideration:** if bitcoind unexpectedly halts after `m_stop_use` is set on
the background chainstate but before `CompleteSnapshotValidation()` can finish, the
need to complete snapshot validation will be detected on subsequent init by
`ChainstateManager::CheckForUncleanShutdown()`.
The background chainstate data lingers on disk until the program is restarted.
### Bitcoind restarts sometime after snapshot validation has completed
When bitcoind initializes again, what began as the snapshot chainstate is now
indistinguishable from a chainstate that has been built from the traditional IBD
process, and will be initialized as such.
After a shutdown and subsequent restart, `LoadChainstate()` cleans up the background
chainstate with `ValidatedSnapshotCleanup()`, which renames the `chainstate_snapshot`
datadir as `chainstate` and removes the now unnecessary background chainstate data.
| | |
| ---------- | ----------- |
| number of chainstates | 1 |
| active chainstate | ibd |
| active chainstate | ibd (was snapshot, but is now fully validated) |
What began as the snapshot chainstate is now indistinguishable from a chainstate that
has been built from the traditional IBD process, and will be initialized as such.
A file will be left in `chainstate/base_blockhash`, which indicates that the
chainstate, even though now fully validated, was originally started from a snapshot
with the corresponding base blockhash.