It is redundant with -logsourcelocations and the log messages are
clearer without it.
Also, remove a double-space.
Also, add braces around `if` touched in the next commit.
This tiny behavior change requires a test fixup.
Make the block db open RAII style by calling it in the BlockManager
constructor.
Before this change the block tree db was needlessly re-opened during
startup when loading a completed snapshot. Improve this by letting the
block manager open it on construction. This also simplifies the test
code a bit.
The change was initially motivated to make it easier for users of the
kernel library to instantiate a BlockManager that may be used to read
data from disk without loading the block index into a cache.
This commit is done in preparation for the next commit. Here, the block
tree options are moved to the blockmanager options and the block tree is
instantiated through a helper method of the BlockManager, which is
removed again in the next commit.
Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
The check for whether the block tree db has been wiped before calling
NeedsRedownload() is confusing. The boolean is set in case of a reindex.
It was originally introduced to guard NeedsRedownload in case of a
reindex in #21009. However NeedsRedownload already returns early if the
chain's tip is not loaded. Since that is the case during a reindex, the
pre-check is redundant.
This avoids having to rely on implicit casts when passing them to the
various functions allocating the caches.
This also ensures that if the requested amount of db_cache does not fit
in a size_t, it is clamped to the maximum value of a size_t.
Also take this opportunity to make the total amounts of cache in the
chainstate manager a size_t too.
Carrying non-kernel related fields in the cache sizes for the indexes is
confusing for kernel library users. The cache sizes also are set
currently with magic numbers in bitcoin-chainstate. The comments for the
cache size calculations are also not completely clear.
Solve these things by moving the kernel-specific cache size fields to
their own struct.
This slightly changes the way the cache is allocated if the txindex
and/or blockfilterindex is used. Since they are now given precedence
over the block tree db cache, this results in a bit less cache being
allocated to the block tree db, coinsdb and coins caches. The effect is
negligible though, i.e. cache sizes with default dbcache reported
through the logs are:
master:
Cache configuration:
* Using 2.0 MiB for block index database
* Using 56.0 MiB for transaction index database
* Using 49.0 MiB for basic block filter index database
* Using 8.0 MiB for chain state database
* Using 335.0 MiB for in-memory UTXO set (plus up to 286.1 MiB of unused mempool space)
this branch:
Cache configuration:
* Using 2.0 MiB for block index database
* Using 56.2 MiB for transaction index database
* Using 49.2 MiB for basic block filter index database
* Using 8.0 MiB for chain state database
* Using 334.5 MiB for in-memory UTXO set (plus up to 286.1 MiB of unused mempool space)
3a4a788ee0 init: Correct coins db cache size setting (TheCharlatan)
Pull request description:
The chainstate caches are currently re-balanced on startup even in the non-assumeutxo case, leading to the database being needlessly re-opened and its cache re-allocated.
Similar to `InitCoinsCache` and `m_coinstip_cache_size_bytes`, the `m_coinsdb_cache_size_bytes` should be set in `InitCoinsDB`.
Together with only conservatively setting the cache values when a assumeutxo chainstate is present, this allows for skipping the cache re-balance during initialization in the normal non-assumeutxo case.
Before:
```
2024-10-09T21:22:17Z Checking all blk files are present...
2024-10-09T21:22:17Z Initializing chainstate Chainstate [ibd] @ height -1 (null)
2024-10-09T21:22:17Z Opening LevelDB in /home/drgrid/.bitcoin/signet/chainstate
2024-10-09T21:22:17Z Opened LevelDB successfully
2024-10-09T21:22:17Z Using obfuscation key for /home/drgrid/.bitcoin/signet/chainstate: b0a6f4e95fd05c92
2024-10-09T21:22:17Z Loaded best chain: hashBestChain=0000000e119967d4937dad58456885ae43fb1761db686947e2f8e168c9a39a4f height=216852 date=2024-10-09T21:06:16Z progress=0.999989
2024-10-09T21:22:17Z Opening LevelDB in /home/drgrid/.bitcoin/signet/chainstate
2024-10-09T21:22:17Z Opened LevelDB successfully
2024-10-09T21:22:17Z Using obfuscation key for /home/drgrid/.bitcoin/signet/chainstate: b0a6f4e95fd05c92
2024-10-09T21:22:17Z [Chainstate [ibd] @ height 216852 (0000000e119967d4937dad58456885ae43fb1761db686947e2f8e168c9a39a4f)] resized coinsdb cache to 8.0 MiB
2024-10-09T21:22:17Z [Chainstate [ibd] @ height 216852 (0000000e119967d4937dad58456885ae43fb1761db686947e2f8e168c9a39a4f)] resized coinstip cache to 440.0 MiB
2024-10-09T21:22:17Z init message: Verifying blocks…
```
After:
```
2024-10-09T21:21:37Z Checking all blk files are present...
2024-10-09T21:21:37Z Initializing chainstate Chainstate [ibd] @ height -1 (null)
2024-10-09T21:21:37Z Opening LevelDB in /home/drgrid/.bitcoin/signet/chainstate
2024-10-09T21:21:37Z Opened LevelDB successfully
2024-10-09T21:21:37Z Using obfuscation key for /home/drgrid/.bitcoin/signet/chainstate: b0a6f4e95fd05c92
2024-10-09T21:21:37Z Loaded best chain: hashBestChain=0000012c12b48011a7d9150ce96ed6a44bbf32b09eeecaff4a667789dda2a566 height=216850 date=2024-10-09T20:37:05Z progress=0.999971
2024-10-09T21:21:37Z init message: Verifying blocks…
```
The change may also be verified by looking at the `feature_assumeutxo.py` functional test debug logs.
ACKs for top commit:
fjahr:
utACK 3a4a788ee0
achow101:
ACK 3a4a788ee0
laanwj:
Code review ACK 3a4a788ee0
BrandonOdiwuor:
Code Review ACK 3a4a788ee0
Tree-SHA512: 87878d0d196bb426370d4b4bd180ca52a34017a0799ecea651c2532461fd2927b0f7cc8182276a7d9bb1fe0ede7d0ad677e3714ca22f321917d711c643acc578
The chainstate caches are currently re-balanced on startup
even in the non-assumeutxo case, leading to the database being
needlessly re-opened and its cache re-allocated.
Similar to `InitCoinsCache` and `m_coinstip_cache_size_bytes` the
`m_coinsdb_cache_size_bytes` should be set in `InitCoinsDB`.
Together with only conservatively setting the cache values when a
assumeutxo chainstate is present, this allows for skipping the cache
re-balance during initialization in the normal non-assumeutxo case.
This is a just a mechanical change, renaming and inverting the meaning
of the indexing variable.
"m_blockfiles_indexed" is a more straightforward name for this variable
because this variable just indicates whether or not
<datadir>/blocks/blk?????.dat files have been indexed in the
<datadir>/blocks/index LevelDB database. The name "m_reindexing" was
more confusing, it could be true even if -reindex was not specified, and
false when it was specified. Also, the previous name unnecessarily
required thinking about the whole reindexing process just to understand
simple checks in validation code about whether blocks were indexed.
The motivation for this change is to follow up on previous commits,
moving away from having multiple variables called "reindex" internally,
and instead naming variables individually after what they do and
represent.
Drop confusing kernel options:
BlockManagerOpts::reindex
ChainstateLoadOptions::reindex
ChainstateLoadOptions::reindex_chainstate
Replacing them with more straightforward options:
ChainstateLoadOptions::wipe_block_tree_db
ChainstateLoadOptions::wipe_chainstate_db
Having two options called "reindex" which did slightly different things
was needlessly confusing (one option wiped the block tree database, and
the other caused block files to be rescanned). Also the previous set of
options did not allow rebuilding the block database without also
rebuilding the chainstate database, when it should be possible to do
those independently.
fReindex is one of the last remaining globals exposed by the kernel
library, so move it into the blockstorage class to reduce the amount of
global mutable state and make the kernel library a bit less awkward to
use.
Block arrival information (and the preciousblock RPC, a related concept) are
both chainstate-agnostic, so these are moved to ChainstateManager. This should
just be a refactor, without any observable behavior changes.
The thread does not only load blocks, it loads the mempool and,
in a future commit, will start the indexes as well.
Also, renamed the 'ThreadImport' function to 'ImportBlocks'
And the 'm_load_block' class member to 'm_thread_load'.
-BEGIN VERIFY SCRIPT-
sed -i "s/ThreadImport/ImportBlocks/g" $(git grep -l ThreadImport -- ':!/doc/')
sed -i "s/loadblk/initload/g" $(git grep -l loadblk -- ':!/doc/release-notes/')
sed -i "s/m_load_block/m_thread_load/g" $(git grep -l m_load_block)
-END VERIFY SCRIPT-
Make LoadChainstate return an explicit error when snapshot validation succeeds,
but there is an error trying to replace the background chainstate with the
snapshot chainstate. Previously in this case LoadChainstate would trigger a
shutdown and return INTERRUPTED, now it will return an actual error code.
There's no real change to behavior other than error message being formatted a
little differently.
Motivation for this change is to replace error handling via callbacks with
error handling via return value ahead of
https://github.com/bitcoin/bitcoin/pull/27861
This commit is part of the libbitcoinkernel project and seeks to remove
the ChainstateManager's and, more generally, the kernel library's
dependency on interface_ui with options methods in this and the
following few commits. By removing interface_ui from the kernel library,
its dependency on boost is reduced to just boost::multi_index.
This is a commit in preparation for the next few commits. The functions
are moved to methods to avoid their re-declaration for the purpose of
passing in BlockManager options.
The functions that were now moved into the BlockManager should no longer
use the params as an argument, but instead use the member variable.
In the moved ReadBlockFromDisk and UndoReadFromDisk, change
the function signature to accept a reference to a CBlockIndex instead of
a raw pointer. The pointer is expected to be non-null, so reflect that
in the type.
To allow for the move of functions to BlockManager methods all call
sites require an instantiated BlockManager, or a callback to one.
The fs.* files are already part of the libbitcoin_util library. With the
introduction of the fs_helpers.* it makes sense to move fs.* into the
util/ directory as well.
2b373fe49d docs: update assumeutxo.md (James O'Beirne)
87a1108c81 test: add snapshot completion unittests (James O'Beirne)
d70919a88f refactor: make MempoolMutex() public (James O'Beirne)
7300ced9de log: add LoadBlockIndex() message for assumedvalid blocks (James O'Beirne)
d96c59cc5c validation: add ChainMan logic for completing UTXO snapshot validation (James O'Beirne)
f2a4f3376f move-only-ish: init: factor out chainstate initialization (James O'Beirne)
637a90b973 add Chainstate::HasCoinsViews() (James O'Beirne)
c29f26b47b validation: add CChainState::m_disabled and ChainMan::isUsable (James O'Beirne)
5ee22cdafd add ChainstateManager.GetSnapshot{BaseHeight,BaseBlock}() (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
Part two of replacing https://github.com/bitcoin/bitcoin/pull/24232.
---
When a user activates a snapshot, the serialized UTXO set data is used to create an "assumed-valid" chainstate, which becomes active in an attempt to get the node to network tip as quickly as possible. Simultaneously in the background, the already-existing chainstate continues "conventional" IBD to both accumulate full block data and serve as a belt-and-suspenders to validate the assumed-valid chainstate.
Once the background chainstate's tip reaches the base block of the snapshot used, we set `m_stop_use` on that chainstate and immediately take the hash of its UTXO set; we verify that this matches the assumeutxo value in the source code. Note that while we ultimately want to remove this background chainstate, we don't do so until the following initialization process, when we again check the UTXO set hash of the background chainstate, and if it continues to match, we remove the (now unnecessary) background chainstate, and move the (previously) assumed-valid chainstate into its place. We then reinitialize the chainstate in the normal way.
As noted in previous comments, we could do the filesystem operations "inline" immediately when the background validation completes, but that's basically just an optimization that saves disk space until the next restart. It didn't strike me as worth the risk of moving chainstate data around on disk during runtime of the node, though maybe my concerns are overblown.
The final result of this completion process is a fully-validated chain, where the only evidence that the user synced using assumeutxo is the existence of a `base_blockhash` file in the `chainstate` directory.
ACKs for top commit:
achow101:
ACK 2b373fe49d
Tree-SHA512: a204e1d6e6932dd83c799af3606b01a9faf893f04e9ee1a36d63f2f1ccfa9118bdc1c107d86976aa0312814267e6a42074bf3e2bf1dead4b2513efc6d955e13d
Trigger completion when a background validation chainstate reaches the
same height as a UTXO snapshot, and handle cleaning up the chainstate
on subsequent startup.
0af16e7134 doc: add release note for #25574 (Martin Zumsande)
57ef2a4812 validation: report if pruning prevents completion of verification (Martin Zumsande)
0c7785bb25 init, validation: Improve handling if VerifyDB() fails due to insufficient dbcache (Martin Zumsande)
d6f781f1cf validation: return VerifyDBResult::INTERRUPTED if verification was interrupted (Martin Zumsande)
6360b5302d validation: Change return value of VerifyDB to enum type (Martin Zumsande)
Pull request description:
`VerifyDB()` can fail to complete due to insufficient dbcache at the level 3 checks. This PR improves the error handling in this case in the following ways:
- The rpc `-verifychain` now returns false if the check can't be completed due to insufficient cache
- During init, we only log a warning if the default values for `-checkblocks` and `-checklevel` are taken and the check doesn't complete. However, if the user actively specifies one of these args, we return with an InitError if we can't complete the check.
This PR also changes `-verifychain` RPC to return `false` if the verification didn't finish due to missing block data (pruning) or due to being interrupted by the node being shutdown.
Previously, this PR also included a fix for a possible assert during verification - this was done in #27009 (now merged).
ACKs for top commit:
achow101:
ACK 0af16e7134
ryanofsky:
Code review ACK 0af16e7134. Only small suggested changes since the last review, like renaming some of the enum values. I did leave more suggestions, but they are not very important and could be followups
john-moffett:
ACK 0af16e7134
MarcoFalke:
lgtm re-ACK 0af16e7134 🎚
Tree-SHA512: 84b4f767cf9bfbafef362312757c9bf765b41ae3977f4ece840e40c52a2266b1457832df0cdf70440be0aac2168d9b58fc817238630b0b6812f3836ca950bc0e
Moves chainstate initialization into its own function. This is
necessary to later support a more readable way of handling
background-validation chainstate cleanup during init, since the
chainstate initialization functions may need to be repeated after
moving leveldb filesystem content around.
This commit isn't strictly necessary, but the alternative is to (ab)use
the `while` loop in init.cpp with a `continue` on the basis of a
specific ChainstateLoadingError return value from LoadChainstate. Not
only is this harder to read, but it can't be unittested.
The approach here lets us consolidate background-validation cleanup to
LoadChainstate, and therefore exercise it within tests.
This commit is most easily reviewed with
git diff --color-moved=dimmed-zebra
--color-moved-ws=ignore-space-change
The rpc command verifychain now fails if the dbcache was not sufficient
to complete the verification at the specified level and depth.
In the same situation, the VerifyDB check during Init will now fail (and lead to
an early shutdown) if the user has explicitly specified -checkblocks or
-checklevel but the check couldn't be executed because of the limited
cache. If the user didn't change any of the two and is using the defaults, log a warning
but don't prevent the node from starting up.
This does not change behavior. It is in preparation for
special handling of the case where VerifyDB doesn't finish
for various reasons, but doesn't fail.
Use DBParams struct to remove ArgsManager uses from txdb.
To reduce size of this commit, this moves references to gArgs variable out of
txdb.cpp to calling code in chainstate.cpp. But these moves are temporary. The
gArgs references in chainstate.cpp are moved out to calling code in init.cpp in
later commits.
This commit does not change behavior.
This changes the minimum chain work for the bitcoin-chainstate
executable. Previously it was uint256{}, now it is the chain's default
minimum chain work.
Add functionality for activating a snapshot-based chainstate if one is
detected on-disk.
Also cautiously initialize chainstate cache usages so that we don't
somehow blow past our cache allowances during initialization, then
rebalance at the end of init.
Co-authored-by: Russell Yanofsky <russ@yanofsky.org>