3141eab9c669488a2e7fef5f60d356ac92294922 test: add functional test for ScanAndUnlinkAlreadyPrunedFiles (Andrew Toth)
e252909e561e47d75cb3a892657662a139f6532c test: add unit test for ScanAndUnlinkAlreadyPrunedFiles (Andrew Toth)
77557dda4a123515d0fa2a545ee21d7c43a66988 prune: scan and unlink already pruned block files on startup (Andrew Toth)
Pull request description:
There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk.
1. If we call `FindFilesToPrune` or `FindFilesToPruneManual` and crash before `UnlinkPrunedFiles`.
2. If on Windows there is an open file handle to the file somewhere else when calling `fs::remove` in `UnlinkPrunedFiles` (https://en.cppreference.com/w/cpp/filesystem/remove, https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-deletefilew#remarks). This could be from another process, or if we are calling `ReadBlockFromDisk`/`ReadRawBlockFromDisk` without having a lock on `cs_main` (which has been allowed since ccd8ef65f9).
This PR mitigates this by scanning all pruned block files on startup after `LoadBlockIndexDB` and unlinking them again.
ACKs for top commit:
achow101:
ACK 3141eab9c669488a2e7fef5f60d356ac92294922
pablomartin4btc:
re-ACK with added functional test 3141eab9c669488a2e7fef5f60d356ac92294922.
furszy:
Code review ACK 3141eab9
theStack:
Code-review ACK 3141eab9c669488a2e7fef5f60d356ac92294922
Tree-SHA512: 6c73bc57838ad1b7e5d441af3c4d6bf4c61c4382e2b86485e57fbb74a61240710c0ceeceb8b4834e610ecfa3175c6955c81ea4b2285fee11ca6383f472979d8d
0af16e7134459e0820ab95d751093876c1ec4c6d doc: add release note for #25574 (Martin Zumsande)
57ef2a4812f443b2d734f43cebf3ef5038da83f2 validation: report if pruning prevents completion of verification (Martin Zumsande)
0c7785bb2540b69564104767d38342704230cbc2 init, validation: Improve handling if VerifyDB() fails due to insufficient dbcache (Martin Zumsande)
d6f781f1cfcbc2c2ad5ee289a0642ed00386d013 validation: return VerifyDBResult::INTERRUPTED if verification was interrupted (Martin Zumsande)
6360b5302d2675788de5c4a28ea77d823f6d809e validation: Change return value of VerifyDB to enum type (Martin Zumsande)
Pull request description:
`VerifyDB()` can fail to complete due to insufficient dbcache at the level 3 checks. This PR improves the error handling in this case in the following ways:
- The rpc `-verifychain` now returns false if the check can't be completed due to insufficient cache
- During init, we only log a warning if the default values for `-checkblocks` and `-checklevel` are taken and the check doesn't complete. However, if the user actively specifies one of these args, we return with an InitError if we can't complete the check.
This PR also changes `-verifychain` RPC to return `false` if the verification didn't finish due to missing block data (pruning) or due to being interrupted by the node being shutdown.
Previously, this PR also included a fix for a possible assert during verification - this was done in #27009 (now merged).
ACKs for top commit:
achow101:
ACK 0af16e7134459e0820ab95d751093876c1ec4c6d
ryanofsky:
Code review ACK 0af16e7134459e0820ab95d751093876c1ec4c6d. Only small suggested changes since the last review, like renaming some of the enum values. I did leave more suggestions, but they are not very important and could be followups
john-moffett:
ACK 0af16e7134459e0820ab95d751093876c1ec4c6d
MarcoFalke:
lgtm re-ACK 0af16e7134459e0820ab95d751093876c1ec4c6d 🎚
Tree-SHA512: 84b4f767cf9bfbafef362312757c9bf765b41ae3977f4ece840e40c52a2266b1457832df0cdf70440be0aac2168d9b58fc817238630b0b6812f3836ca950bc0e
and remove m_snapshot_validated. This state can now be inferred by the
number of isUsable chainstates.
m_disabled is used to signal that a chainstate should no longer be used
by validation logic; it is used as a sentinel when background validation
completes or if the snapshot chainstate is found to be invalid.
isUsable is a convenience method that incorporates m_disabled.
The rpc command verifychain now fails if the dbcache was not sufficient
to complete the verification at the specified level and depth.
In the same situation, the VerifyDB check during Init will now fail (and lead to
an early shutdown) if the user has explicitly specified -checkblocks or
-checklevel but the check couldn't be executed because of the limited
cache. If the user didn't change any of the two and is using the defaults, log a warning
but don't prevent the node from starting up.
This does not change behavior. It is in preparation for
special handling of the case where VerifyDB doesn't finish
for various reasons, but doesn't fail.
Add CoinsViewOptions struct to remove ArgsManager uses from txdb.
To reduce size of this commit, this moves references to gArgs variable out of
txdb.cpp to calling code in validation.cpp. But these moves are temporary. The
gArgs references in validation.cpp are moved out to calling code in init.cpp in
later commits.
This commit does not change behavior.
The previous behavior, skipping some L3 DisconnectBlock calls,
but still attempting to reconnect these blocks at L4, makes
ConnectBlock assert.
The variable skipped_l3_checks is introduced because even with an
insufficient cache for the L3 checks, the L1/L2 checks in the same
loop should still be completed.
Fixes#25563.
282019cd3ddb060db350654e6f855f7ea37e0d34 refactor: add kernel/cs_main.* (fanquake)
Pull request description:
One place to find / include `cs_main`.
No more:
> // Actually declared in validation.cpp; can't include because of circular dependency.
> extern RecursiveMutex cs_main;
Ultimately, no more need to include `validation.h` (which also includes (heavy/boost filled) `txmempool.h`) everywhere for `cs_main`. See #26087 for another example of why that is useful.
ACKs for top commit:
ajtowns:
ACK 282019cd3ddb060db350654e6f855f7ea37e0d34
Tree-SHA512: 142835b794873e7a09c3246d6101843ae81ec0c6295e6873130c98a2abfa5f7282748d0f1a37237a779cc71c3bc0a75d03b20313ef5398c83d4814215cbc8287
9567bfeab95cc0932073641dd162903850987d43 clang-tidy: Add `performance-no-automatic-move` check (Hennadii Stepanov)
Pull request description:
Split from bitcoin/bitcoin#26642 as [requested](https://github.com/bitcoin/bitcoin/pull/26642#discussion_r1054673201).
For the problem description see https://clang.llvm.org/extra/clang-tidy/checks/performance/no-automatic-move.html.
The following types are affected:
- `std::pair<CAddress, NodeSeconds>`
- `std::vector<CAddress>`
- `UniValue`, also see bitcoin/bitcoin#25429
- `QColor`
- `CBlock`
- `MempoolAcceptResult`
- `std::shared_ptr<CWallet>`
- `std::optional<SelectionResult>`
- `CTransactionRef`, which is `std::shared_ptr<const CTransaction>`
ACKs for top commit:
andrewtoth:
ACK 9567bfeab95cc0932073641dd162903850987d43
aureleoules:
ACK 9567bfeab95cc0932073641dd162903850987d43
Tree-SHA512: 9b6a5d539205b41d2c86402d384318ed2e1d89e66333ebd200a48fd7df3ce6f6c60a3e989eda5cc503fb34b8d82526f95e56776e1af51e63b49e3a1fef72dbcb
This makes the interface more predictable and useful. The caller
understands one or more transactions failed, and can learn what happened
with each transaction. We already have this information, so we might as
well return it.
It doesn't make sense to do this for other PackageValidationResult
values because:
- PCKG_RESULT_UNSET: this means everything succeeded, so the individual
failures are no longer accurate.
- PCKG_MEMPOOL_ERROR: something went wrong with the mempool logic;
transaction failures might not be meaningful.
- PCKG_POLICY: this means something was wrong with the package as a
whole. The caller should use the PackageValidationState to find the
error, rather than looking at individual MempoolAcceptResults.
This value creates an extremely confusing interface as its existence is
dependent upon implementation details (whether something was submitted
on its own, etc). MempoolAcceptResult::m_effective_feerate is much more
helpful, as it always exists for submitted transactions.
Bug: not setting package_state means package_state.IsValid() == true and
the caller does not know that this failed.
We won't be validating this transaction again, so it makes sense to return this
failure to the caller.
Rename package_state to package_state_quit_early to make it more clear
what this variable is used for and what its scope is.
Co-authored-by: Greg Sanders <gsanders87@gmail.com>
47c4b1f52ab8d95d7deef83050bad49d1e3e5990 mempool: log/halt when CalculateMemPoolAncestors fails unexpectedly (stickies-v)
5481f65849313ff947f38433b1ac28285a7f7694 mempool: add AssumeCalculateMemPoolAncestors helper function (stickies-v)
f911bdfff95eba3793fffaf71a31cc8bfc6f80c9 mempool: use util::Result for CalculateMemPoolAncestors (stickies-v)
66e028f7399b6511f9b73b1cef54b6a6ac38a024 mempool: use util::Result for CalculateAncestorsAndCheckLimits (stickies-v)
Pull request description:
Upon reviewing the documentation for `CTxMemPool::CalculateMemPoolAncestors`, I noticed `setAncestors` was meant to be an `out` parameter but actually is an `in,out` parameter, as can be observed by adding `assert(setAncestors.empty());` as the first line in the function and running `make check`. This PR fixes this unexpected behaviour and introduces refactoring improvements to make intents and effects of the code more clear.
## Unexpected behaviour
This behaviour occurs only in the package acceptance path, currently only triggered by `testmempoolaccept` and `submitpackage` RPCs.
In `MemPoolAccept::AcceptMultipleTransactions()`, we first call `PreChecks()` and then `SubmitPackage()` with the same `Workspace ws` reference. `PreChecks` leaves `ws.m_ancestors` in a potentially non-empty state, before it is passed on to `MemPoolAccept::SubmitPackage`. `SubmitPackage` is the only place where `setAncestors` isn't guaranteed to be empty before calling `CalculateMemPoolAncestors`. The most straightforward fix is to just forcefully clear `setAncestors` at the beginning of CalculateMemPoolAncestors, which is done in the first bugfix commit.
## Improvements
### Return value instead of out-parameters
This PR updates the function signatures for `CTxMemPool::CalculateMemPoolAncestors` and `CTxMemPool::CalculateAncestorsAndCheckLimits` to use a `util::Result` return type and eliminate both the `setAncestors` `in,out`-parameter as well as the error string. It simplifies the code and makes the intent and effects more explicit.
### Observability
There are 7 instances where we currently call `CalculateMemPoolAncestors` without actually checking if the function succeeded because we assume that it can't fail, such as in [miner.cpp](69b10212ea/src/node/miner.cpp (L399)). This PR adds a new wrapper `AssumeCalculateMemPoolAncestors` function that logs such unexpected failures, or in case of debug builds even halts the program. It's not crucial to the objective, more of an observability improvement that seems sensible to add on here.
ACKs for top commit:
achow101:
ACK 47c4b1f52ab8d95d7deef83050bad49d1e3e5990
w0xlt:
ACK 47c4b1f52a
glozow:
ACK 47c4b1f52ab8d95d7deef83050bad49d1e3e5990
furszy:
light code review ACK 47c4b1f5
aureleoules:
ACK 47c4b1f52ab8d95d7deef83050bad49d1e3e5990
Tree-SHA512: d908dad00d1a5645eb865c4877cc0bae74b9cd3332a3641eb4a285431aef119f9fc78172d38b55c592168a73dae83242e6af3348815f7b37cbe2d448a3a58648
b2aa9e85289fc654106a890c35935e9c76c411fb Add release note for MIN_STANDARD_TX_NONWITNESS_SIZE relaxation (Greg Sanders)
8c5b3646b5afe8a61f5c66478d8e11f0d2ce5108 Relax MIN_STANDARD_TX_NONWITNESS_SIZE to 65 non-witness bytes (Greg Sanders)
Pull request description:
Since the original fix was set to be a "reasonable" transaction to reduce allocations and the true motivation later revealed, it makes sense to relax this check to something more principled.
There are more exotic transaction patterns that could take advantage of a relaxed requirement, such as 1 input, 1 output OP_RETURN to burn a utxo to fees for CPFP purposes when change isn't practical.
Two changes could be accomplished:
1) Anything not 64 bytes could be allowed
2) Anything above 64 bytes could be allowed
In the Great Consensus Cleanup, suggestion (2)
was proposed as a consensus change, and is the simpler of the two suggestions. It would not allow an "empty" OP_RETURN but would reduce the required padding from 22 bytes to 5.
The functional test is also modified to test the actual case
we care about: 64 bytes
Related mailing list discussions here:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-October/020995.html
And a couple years earlier:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-May/017883.html
ACKs for top commit:
achow101:
reACK b2aa9e85289fc654106a890c35935e9c76c411fb
glozow:
reACK b2aa9e85289fc654106a890c35935e9c76c411fb
pablomartin4btc:
re-ACK b2aa9e8528
jonatack:
ACK b2aa9e85289fc654106a890c35935e9c76c411fb with some suggestions
Tree-SHA512: c1ec1af9ddcf31b2272209a4f1ee0c5607399f8172e5a1dfd4604cf98bfb933810dd9369a5917ad122add003327c9fcf6ee26995de3aca41d5c42dba527991ad
Since the original fix was set to be a "reasonable" transaction
to reduce allocations and the true motivation later revealed,
it makes sense to relax this check to something more principled.
There are more exotic transaction patterns that could take advantage
of a relaxed requirement, such as 1 input, 1 output OP_RETURN to burn
a utxo to fees for CPFP purposes when change isn't practical.
Two changes could be accomplished:
1) Anything not 64 bytes could be allowed
2) Anything above 64 bytes could be allowed
In the Great Consensus Cleanup, suggestion (2) was the route taken.
It would not allow an "empty" OP_RETURN
but would reduce the required padding from 22 bytes to 5.
The functional test is also modified to test the actual case
we care about: 64 bytes
07dfbb5bb8115c680621f361c65d9cde2f8c52f2 Make static nLastFlush and nLastWrite Chainstate members (Aurèle Oulès)
Pull request description:
Fixes#22189.
The `static std::multimap<uint256, FlatFilePos> mapBlocksUnknownParent; ` referenced in the issue was already fixed by #25571. I don't believe Chainstate references any other static variables.
ACKs for top commit:
jamesob:
ACK 07dfbb5bb8115c680621f361c65d9cde2f8c52f2 ([`jamesob/ackr/26513.1.aureleoules.make_static_nlastflush_a`](https://github.com/jamesob/bitcoin/tree/ackr/26513.1.aureleoules.make_static_nlastflush_a))
theStack:
Concept and code-review ACK 07dfbb5bb8115c680621f361c65d9cde2f8c52f2
Tree-SHA512: 0f26463c079bbc5e0e62707d4ca4c8c9bbb99edfa3391d48d4915d24e2a1190873ecd4f9f11da25b44527671cdc82c41fd8234d56a4592a246989448d34406b0
38941a703e079709bb465ad1fcde50e11350f8f1 refactor: Move `txmempool_entry.h` --> `kernel/mempool_entry.h` (Hennadii Stepanov)
Pull request description:
This PR addresses the https://github.com/bitcoin/bitcoin/pull/17786#discussion_r1027818360:
> why not move it to the right place, that is to `kernel/txmempool_entry.h`?
ACKs for top commit:
MarcoFalke:
review ACK 38941a703e079709bb465ad1fcde50e11350f8f1 📊
Tree-SHA512: 0145974b63b67ca1d9d89af2dd9d4438beca480c16a563f330da05fec49b8394d7ba20ed83cf7d50b2e19454e006978ebed42b0e07887b98d00210f3201ce9ba
d885bb2f6ea34cd21dacfebe763a07dbb389c1bd test: Test exclusion of OP_RETURN from getblockstats (Fabian Jahr)
ba9d288b2468f047e5a8e4637dd8749e247ff547 test: Fix getblockstats test data generator (Fabian Jahr)
2ca5a496c2f3cbcc63ea15fa05c1658e7f527bbc rpc: Improve getblockstats (Fabian Jahr)
cb94db119f4643f49da63520d64efc99fb0c0795 validation, index: Add unspendable coinbase helper functions (Fabian Jahr)
Pull request description:
Fixes#19885
The genesis block does not have undo data saved to disk so the RPC errored because of that.
ACKs for top commit:
achow101:
ACK d885bb2f6ea34cd21dacfebe763a07dbb389c1bd
aureleoules:
ACK d885bb2f6ea34cd21dacfebe763a07dbb389c1bd
stickies-v:
ACK d885bb2f6
Tree-SHA512: f37bda736ed605b7a41a81eeb4bfbb5d2b8518f847819e5d6a090548a61caf1455623e15165d72589ab3f4478252b00e7b624f9313ad6708cac06dd5edb62e9a
c8dc0e3eaa9e0f956c5177bcb69632beb0d51770 refactor: Inline `CTxMemPoolEntry` class's functions (Hennadii Stepanov)
75bbe594e54402ed248ecf2bdfe54e8283d20fff refactor: Move `CTxMemPoolEntry` class to its own module (Hennadii Stepanov)
Pull request description:
This PR:
- gets rid of the `policy/fees` -> `txmempool` -> `policy/fees` circular dependency
- is an alternative to #13949, which nukes only one circular dependency
ACKs for top commit:
ryanofsky:
Code review ACK c8dc0e3eaa9e0f956c5177bcb69632beb0d51770. Just include and whitespace changes since last review, and there's a moveonly commit now so it's very easy to review
theStack:
Code-review ACK c8dc0e3eaa9e0f956c5177bcb69632beb0d51770
glozow:
utACK c8dc0e3eaa9e0f956c5177bcb69632beb0d51770, agree these changes are an improvement.
Tree-SHA512: 36ece824e6ed3ab1a1e198b30a906c8ac12de24545f840eb046958a17315ac9260c7de26e11e2fbab7208adc3d74918db7a7e389444130f8810548ca2e81af41
db929893ef0bc86ea2708cdbcf41152240cd7c73 Faster -reindex by initially deserializing only headers (Larry Ruane)
c72de9990ae8f1744006d9c852023b882d5ed80c util: add CBufferedFile::SkipTo() to move ahead in the stream (Larry Ruane)
48a68908ba3d5e077cda7bd1e908b923fbead824 Add LoadExternalBlockFile() benchmark (Larry Ruane)
Pull request description:
### Background
During the first part of reindexing, `LoadExternalBlockFile()` sequentially reads raw blocks from the `blocks/blk00nnn.dat` files (rather than receiving them from peers, as with initial block download) and eventually adds all of them to the block index. When an individual block is initially read, it can't be immediately added unless all its ancestors have been added, which is rare (only about 8% of the time), because the blocks are not sorted by height. When the block can't be immediately added to the block index, its disk location is saved in a map so it can be added later. When its parent is later added to the block index, `LoadExternalBlockFile()` reads and deserializes the block from disk a second time and adds it to the block index. Most blocks (92%) get deserialized twice.
### This PR
During the initial read, it's rarely useful to deserialize the entire block; only the header is needed to determine if the block can be added to the block index immediately. This change to `LoadExternalBlockFile()` initially deserializes only a block's header, then deserializes the entire block only if it can be added immediately. This reduces reindex time on mainnet by 7 hours on a Raspberry Pi, which translates to around a 25% reduction in the first part of reindexing (adding blocks to the index), and about a 6% reduction in overall reindex time.
Summary: The performance gain is the result of deserializing each block only once, except its header which is deserialized twice, but the header is only 80 bytes.
ACKs for top commit:
andrewtoth:
ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
achow101:
ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
aureleoules:
ACK db929893ef0bc86ea2708cdbcf41152240cd7c73 - minor changes and new benchmark since last review
theStack:
re-ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
stickies-v:
re-ACK db929893e
Tree-SHA512: 5a5377192c11edb5b662e18f511c9beb8f250bc88aeadf2f404c92c3232a7617bade50477ebf16c0602b9bd3b68306d3ee7615de58acfd8cae664d28bb7b0136
Since faf44876db555f7488c8df96db9fa88b793f897c, the maxtipage comparison
in IsInitialBlockDownload() has been broken, since the NodeClock::now()
time_point is in the system's native denomination (micrcoseconds).
Without this patch, specifying the maximum allowable -maxtipage
(9223372036854775807) results in a SIGABRT crash.
Co-authored-by: MacroFake <falke.marco@gmail.com>
aaaa7bd0ba60aa7114810d4794940882d987c0aa iwyu: Add missing includes (MacroFake)
fa9ebec096ae185576a54aa80bd2a9e57f867ed4 Remove g_parallel_script_checks (MacroFake)
fa7c834b9f988fa7f2ace2d67b1628211b7819df Move ::fCheckBlockIndex into ChainstateManager (MacroFake)
fa43188d86288fa6666307a77c106c8f069ebdbe Move ::fCheckpointsEnabled into ChainstateManager (MacroFake)
cccca83099453bf0882bce4f897f77eee5836e8b Move ::nMinimumChainWork into ChainstateManager (MacroFake)
fa29d0b57cdeb91c8798d5c90ba9cc18085e99fb Move ::hashAssumeValid into ChainstateManager (MacroFake)
faf44876db555f7488c8df96db9fa88b793f897c Move ::nMaxTipAge into ChainstateManager (MacroFake)
Pull request description:
It seems preferable to assign globals to a class (in this case `ChainstateManager`), than to leave them dangling. This should clarify scope for code-readers, as well as clarifying unit test behaviour.
ACKs for top commit:
dergoegge:
Code review ACK aaaa7bd0ba60aa7114810d4794940882d987c0aa
ryanofsky:
Code review ACK aaaa7bd0ba60aa7114810d4794940882d987c0aa. No changes since last review, other than rebase
aureleoules:
reACK aaaa7bd0ba60aa7114810d4794940882d987c0aa
Tree-SHA512: 83ec3ba0fb4f1dad95810d4bd4e578454e0718dc1bdd3a794cc4e48aa819b6f5dad4ac4edab3719bdfd5f89cbe23c2740a50fd56c1ff81c99e521c5f6d4e898d
When a block is initially read from a blk*.dat file during reindexing,
it can be added to the block index only if all of its ancestor blocks
have been added, which is rare. If the block's ancestors have not been
added, the block must be re-read from disk later when it can be added.
This commit: During the initial block read, deserialize only its header,
rather than the entire block, since this is sufficient to determine
if its parent (and thus all its ancestors) has been added. This is a
performance improvement.
Making the checks to identify BIP30 available outside of validation.cpp is needed for reporting and tracking statistics on specific blocks and the UTXO set correctly.
5d3f98d27879cd6d84b8590e947336e8d09613ed refactor: Replace m_params with chainman.GetParams() (Aurèle Oulès)
Pull request description:
Fixes a TODO introduced in #24595.
Removes `m_params` from `CChainState` class and replaces it with `m_chainman.GetParams()`.
ACKs for top commit:
MarcoFalke:
review ACK 5d3f98d27879cd6d84b8590e947336e8d09613ed 🌎
Tree-SHA512: de0fe31450d281cc7307c0d820495e86c93c7998e77a148db2c703da66cff1059e6560c041f1864913c42075aa24d259c2623d45e929ca0a8056ed330a9f9978