This does not change behavior. It is in preparation for
special handling of the case where VerifyDB doesn't finish
for various reasons, but doesn't fail.
Add CoinsViewOptions struct to remove ArgsManager uses from txdb.
To reduce size of this commit, this moves references to gArgs variable out of
txdb.cpp to calling code in validation.cpp. But these moves are temporary. The
gArgs references in validation.cpp are moved out to calling code in init.cpp in
later commits.
This commit does not change behavior.
The previous behavior, skipping some L3 DisconnectBlock calls,
but still attempting to reconnect these blocks at L4, makes
ConnectBlock assert.
The variable skipped_l3_checks is introduced because even with an
insufficient cache for the L3 checks, the L1/L2 checks in the same
loop should still be completed.
Fixes#25563.
282019cd3ddb060db350654e6f855f7ea37e0d34 refactor: add kernel/cs_main.* (fanquake)
Pull request description:
One place to find / include `cs_main`.
No more:
> // Actually declared in validation.cpp; can't include because of circular dependency.
> extern RecursiveMutex cs_main;
Ultimately, no more need to include `validation.h` (which also includes (heavy/boost filled) `txmempool.h`) everywhere for `cs_main`. See #26087 for another example of why that is useful.
ACKs for top commit:
ajtowns:
ACK 282019cd3ddb060db350654e6f855f7ea37e0d34
Tree-SHA512: 142835b794873e7a09c3246d6101843ae81ec0c6295e6873130c98a2abfa5f7282748d0f1a37237a779cc71c3bc0a75d03b20313ef5398c83d4814215cbc8287
9567bfeab95cc0932073641dd162903850987d43 clang-tidy: Add `performance-no-automatic-move` check (Hennadii Stepanov)
Pull request description:
Split from bitcoin/bitcoin#26642 as [requested](https://github.com/bitcoin/bitcoin/pull/26642#discussion_r1054673201).
For the problem description see https://clang.llvm.org/extra/clang-tidy/checks/performance/no-automatic-move.html.
The following types are affected:
- `std::pair<CAddress, NodeSeconds>`
- `std::vector<CAddress>`
- `UniValue`, also see bitcoin/bitcoin#25429
- `QColor`
- `CBlock`
- `MempoolAcceptResult`
- `std::shared_ptr<CWallet>`
- `std::optional<SelectionResult>`
- `CTransactionRef`, which is `std::shared_ptr<const CTransaction>`
ACKs for top commit:
andrewtoth:
ACK 9567bfeab95cc0932073641dd162903850987d43
aureleoules:
ACK 9567bfeab95cc0932073641dd162903850987d43
Tree-SHA512: 9b6a5d539205b41d2c86402d384318ed2e1d89e66333ebd200a48fd7df3ce6f6c60a3e989eda5cc503fb34b8d82526f95e56776e1af51e63b49e3a1fef72dbcb
This makes the interface more predictable and useful. The caller
understands one or more transactions failed, and can learn what happened
with each transaction. We already have this information, so we might as
well return it.
It doesn't make sense to do this for other PackageValidationResult
values because:
- PCKG_RESULT_UNSET: this means everything succeeded, so the individual
failures are no longer accurate.
- PCKG_MEMPOOL_ERROR: something went wrong with the mempool logic;
transaction failures might not be meaningful.
- PCKG_POLICY: this means something was wrong with the package as a
whole. The caller should use the PackageValidationState to find the
error, rather than looking at individual MempoolAcceptResults.
This value creates an extremely confusing interface as its existence is
dependent upon implementation details (whether something was submitted
on its own, etc). MempoolAcceptResult::m_effective_feerate is much more
helpful, as it always exists for submitted transactions.
Bug: not setting package_state means package_state.IsValid() == true and
the caller does not know that this failed.
We won't be validating this transaction again, so it makes sense to return this
failure to the caller.
Rename package_state to package_state_quit_early to make it more clear
what this variable is used for and what its scope is.
Co-authored-by: Greg Sanders <gsanders87@gmail.com>
47c4b1f52ab8d95d7deef83050bad49d1e3e5990 mempool: log/halt when CalculateMemPoolAncestors fails unexpectedly (stickies-v)
5481f65849313ff947f38433b1ac28285a7f7694 mempool: add AssumeCalculateMemPoolAncestors helper function (stickies-v)
f911bdfff95eba3793fffaf71a31cc8bfc6f80c9 mempool: use util::Result for CalculateMemPoolAncestors (stickies-v)
66e028f7399b6511f9b73b1cef54b6a6ac38a024 mempool: use util::Result for CalculateAncestorsAndCheckLimits (stickies-v)
Pull request description:
Upon reviewing the documentation for `CTxMemPool::CalculateMemPoolAncestors`, I noticed `setAncestors` was meant to be an `out` parameter but actually is an `in,out` parameter, as can be observed by adding `assert(setAncestors.empty());` as the first line in the function and running `make check`. This PR fixes this unexpected behaviour and introduces refactoring improvements to make intents and effects of the code more clear.
## Unexpected behaviour
This behaviour occurs only in the package acceptance path, currently only triggered by `testmempoolaccept` and `submitpackage` RPCs.
In `MemPoolAccept::AcceptMultipleTransactions()`, we first call `PreChecks()` and then `SubmitPackage()` with the same `Workspace ws` reference. `PreChecks` leaves `ws.m_ancestors` in a potentially non-empty state, before it is passed on to `MemPoolAccept::SubmitPackage`. `SubmitPackage` is the only place where `setAncestors` isn't guaranteed to be empty before calling `CalculateMemPoolAncestors`. The most straightforward fix is to just forcefully clear `setAncestors` at the beginning of CalculateMemPoolAncestors, which is done in the first bugfix commit.
## Improvements
### Return value instead of out-parameters
This PR updates the function signatures for `CTxMemPool::CalculateMemPoolAncestors` and `CTxMemPool::CalculateAncestorsAndCheckLimits` to use a `util::Result` return type and eliminate both the `setAncestors` `in,out`-parameter as well as the error string. It simplifies the code and makes the intent and effects more explicit.
### Observability
There are 7 instances where we currently call `CalculateMemPoolAncestors` without actually checking if the function succeeded because we assume that it can't fail, such as in [miner.cpp](69b10212ea/src/node/miner.cpp (L399)). This PR adds a new wrapper `AssumeCalculateMemPoolAncestors` function that logs such unexpected failures, or in case of debug builds even halts the program. It's not crucial to the objective, more of an observability improvement that seems sensible to add on here.
ACKs for top commit:
achow101:
ACK 47c4b1f52ab8d95d7deef83050bad49d1e3e5990
w0xlt:
ACK 47c4b1f52a
glozow:
ACK 47c4b1f52ab8d95d7deef83050bad49d1e3e5990
furszy:
light code review ACK 47c4b1f5
aureleoules:
ACK 47c4b1f52ab8d95d7deef83050bad49d1e3e5990
Tree-SHA512: d908dad00d1a5645eb865c4877cc0bae74b9cd3332a3641eb4a285431aef119f9fc78172d38b55c592168a73dae83242e6af3348815f7b37cbe2d448a3a58648
b2aa9e85289fc654106a890c35935e9c76c411fb Add release note for MIN_STANDARD_TX_NONWITNESS_SIZE relaxation (Greg Sanders)
8c5b3646b5afe8a61f5c66478d8e11f0d2ce5108 Relax MIN_STANDARD_TX_NONWITNESS_SIZE to 65 non-witness bytes (Greg Sanders)
Pull request description:
Since the original fix was set to be a "reasonable" transaction to reduce allocations and the true motivation later revealed, it makes sense to relax this check to something more principled.
There are more exotic transaction patterns that could take advantage of a relaxed requirement, such as 1 input, 1 output OP_RETURN to burn a utxo to fees for CPFP purposes when change isn't practical.
Two changes could be accomplished:
1) Anything not 64 bytes could be allowed
2) Anything above 64 bytes could be allowed
In the Great Consensus Cleanup, suggestion (2)
was proposed as a consensus change, and is the simpler of the two suggestions. It would not allow an "empty" OP_RETURN but would reduce the required padding from 22 bytes to 5.
The functional test is also modified to test the actual case
we care about: 64 bytes
Related mailing list discussions here:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-October/020995.html
And a couple years earlier:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-May/017883.html
ACKs for top commit:
achow101:
reACK b2aa9e85289fc654106a890c35935e9c76c411fb
glozow:
reACK b2aa9e85289fc654106a890c35935e9c76c411fb
pablomartin4btc:
re-ACK b2aa9e8528
jonatack:
ACK b2aa9e85289fc654106a890c35935e9c76c411fb with some suggestions
Tree-SHA512: c1ec1af9ddcf31b2272209a4f1ee0c5607399f8172e5a1dfd4604cf98bfb933810dd9369a5917ad122add003327c9fcf6ee26995de3aca41d5c42dba527991ad
Since the original fix was set to be a "reasonable" transaction
to reduce allocations and the true motivation later revealed,
it makes sense to relax this check to something more principled.
There are more exotic transaction patterns that could take advantage
of a relaxed requirement, such as 1 input, 1 output OP_RETURN to burn
a utxo to fees for CPFP purposes when change isn't practical.
Two changes could be accomplished:
1) Anything not 64 bytes could be allowed
2) Anything above 64 bytes could be allowed
In the Great Consensus Cleanup, suggestion (2) was the route taken.
It would not allow an "empty" OP_RETURN
but would reduce the required padding from 22 bytes to 5.
The functional test is also modified to test the actual case
we care about: 64 bytes
07dfbb5bb8115c680621f361c65d9cde2f8c52f2 Make static nLastFlush and nLastWrite Chainstate members (Aurèle Oulès)
Pull request description:
Fixes#22189.
The `static std::multimap<uint256, FlatFilePos> mapBlocksUnknownParent; ` referenced in the issue was already fixed by #25571. I don't believe Chainstate references any other static variables.
ACKs for top commit:
jamesob:
ACK 07dfbb5bb8115c680621f361c65d9cde2f8c52f2 ([`jamesob/ackr/26513.1.aureleoules.make_static_nlastflush_a`](https://github.com/jamesob/bitcoin/tree/ackr/26513.1.aureleoules.make_static_nlastflush_a))
theStack:
Concept and code-review ACK 07dfbb5bb8115c680621f361c65d9cde2f8c52f2
Tree-SHA512: 0f26463c079bbc5e0e62707d4ca4c8c9bbb99edfa3391d48d4915d24e2a1190873ecd4f9f11da25b44527671cdc82c41fd8234d56a4592a246989448d34406b0
38941a703e079709bb465ad1fcde50e11350f8f1 refactor: Move `txmempool_entry.h` --> `kernel/mempool_entry.h` (Hennadii Stepanov)
Pull request description:
This PR addresses the https://github.com/bitcoin/bitcoin/pull/17786#discussion_r1027818360:
> why not move it to the right place, that is to `kernel/txmempool_entry.h`?
ACKs for top commit:
MarcoFalke:
review ACK 38941a703e079709bb465ad1fcde50e11350f8f1 📊
Tree-SHA512: 0145974b63b67ca1d9d89af2dd9d4438beca480c16a563f330da05fec49b8394d7ba20ed83cf7d50b2e19454e006978ebed42b0e07887b98d00210f3201ce9ba
d885bb2f6ea34cd21dacfebe763a07dbb389c1bd test: Test exclusion of OP_RETURN from getblockstats (Fabian Jahr)
ba9d288b2468f047e5a8e4637dd8749e247ff547 test: Fix getblockstats test data generator (Fabian Jahr)
2ca5a496c2f3cbcc63ea15fa05c1658e7f527bbc rpc: Improve getblockstats (Fabian Jahr)
cb94db119f4643f49da63520d64efc99fb0c0795 validation, index: Add unspendable coinbase helper functions (Fabian Jahr)
Pull request description:
Fixes#19885
The genesis block does not have undo data saved to disk so the RPC errored because of that.
ACKs for top commit:
achow101:
ACK d885bb2f6ea34cd21dacfebe763a07dbb389c1bd
aureleoules:
ACK d885bb2f6ea34cd21dacfebe763a07dbb389c1bd
stickies-v:
ACK d885bb2f6
Tree-SHA512: f37bda736ed605b7a41a81eeb4bfbb5d2b8518f847819e5d6a090548a61caf1455623e15165d72589ab3f4478252b00e7b624f9313ad6708cac06dd5edb62e9a
c8dc0e3eaa9e0f956c5177bcb69632beb0d51770 refactor: Inline `CTxMemPoolEntry` class's functions (Hennadii Stepanov)
75bbe594e54402ed248ecf2bdfe54e8283d20fff refactor: Move `CTxMemPoolEntry` class to its own module (Hennadii Stepanov)
Pull request description:
This PR:
- gets rid of the `policy/fees` -> `txmempool` -> `policy/fees` circular dependency
- is an alternative to #13949, which nukes only one circular dependency
ACKs for top commit:
ryanofsky:
Code review ACK c8dc0e3eaa9e0f956c5177bcb69632beb0d51770. Just include and whitespace changes since last review, and there's a moveonly commit now so it's very easy to review
theStack:
Code-review ACK c8dc0e3eaa9e0f956c5177bcb69632beb0d51770
glozow:
utACK c8dc0e3eaa9e0f956c5177bcb69632beb0d51770, agree these changes are an improvement.
Tree-SHA512: 36ece824e6ed3ab1a1e198b30a906c8ac12de24545f840eb046958a17315ac9260c7de26e11e2fbab7208adc3d74918db7a7e389444130f8810548ca2e81af41
db929893ef0bc86ea2708cdbcf41152240cd7c73 Faster -reindex by initially deserializing only headers (Larry Ruane)
c72de9990ae8f1744006d9c852023b882d5ed80c util: add CBufferedFile::SkipTo() to move ahead in the stream (Larry Ruane)
48a68908ba3d5e077cda7bd1e908b923fbead824 Add LoadExternalBlockFile() benchmark (Larry Ruane)
Pull request description:
### Background
During the first part of reindexing, `LoadExternalBlockFile()` sequentially reads raw blocks from the `blocks/blk00nnn.dat` files (rather than receiving them from peers, as with initial block download) and eventually adds all of them to the block index. When an individual block is initially read, it can't be immediately added unless all its ancestors have been added, which is rare (only about 8% of the time), because the blocks are not sorted by height. When the block can't be immediately added to the block index, its disk location is saved in a map so it can be added later. When its parent is later added to the block index, `LoadExternalBlockFile()` reads and deserializes the block from disk a second time and adds it to the block index. Most blocks (92%) get deserialized twice.
### This PR
During the initial read, it's rarely useful to deserialize the entire block; only the header is needed to determine if the block can be added to the block index immediately. This change to `LoadExternalBlockFile()` initially deserializes only a block's header, then deserializes the entire block only if it can be added immediately. This reduces reindex time on mainnet by 7 hours on a Raspberry Pi, which translates to around a 25% reduction in the first part of reindexing (adding blocks to the index), and about a 6% reduction in overall reindex time.
Summary: The performance gain is the result of deserializing each block only once, except its header which is deserialized twice, but the header is only 80 bytes.
ACKs for top commit:
andrewtoth:
ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
achow101:
ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
aureleoules:
ACK db929893ef0bc86ea2708cdbcf41152240cd7c73 - minor changes and new benchmark since last review
theStack:
re-ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
stickies-v:
re-ACK db929893e
Tree-SHA512: 5a5377192c11edb5b662e18f511c9beb8f250bc88aeadf2f404c92c3232a7617bade50477ebf16c0602b9bd3b68306d3ee7615de58acfd8cae664d28bb7b0136
Since faf44876db555f7488c8df96db9fa88b793f897c, the maxtipage comparison
in IsInitialBlockDownload() has been broken, since the NodeClock::now()
time_point is in the system's native denomination (micrcoseconds).
Without this patch, specifying the maximum allowable -maxtipage
(9223372036854775807) results in a SIGABRT crash.
Co-authored-by: MacroFake <falke.marco@gmail.com>
aaaa7bd0ba60aa7114810d4794940882d987c0aa iwyu: Add missing includes (MacroFake)
fa9ebec096ae185576a54aa80bd2a9e57f867ed4 Remove g_parallel_script_checks (MacroFake)
fa7c834b9f988fa7f2ace2d67b1628211b7819df Move ::fCheckBlockIndex into ChainstateManager (MacroFake)
fa43188d86288fa6666307a77c106c8f069ebdbe Move ::fCheckpointsEnabled into ChainstateManager (MacroFake)
cccca83099453bf0882bce4f897f77eee5836e8b Move ::nMinimumChainWork into ChainstateManager (MacroFake)
fa29d0b57cdeb91c8798d5c90ba9cc18085e99fb Move ::hashAssumeValid into ChainstateManager (MacroFake)
faf44876db555f7488c8df96db9fa88b793f897c Move ::nMaxTipAge into ChainstateManager (MacroFake)
Pull request description:
It seems preferable to assign globals to a class (in this case `ChainstateManager`), than to leave them dangling. This should clarify scope for code-readers, as well as clarifying unit test behaviour.
ACKs for top commit:
dergoegge:
Code review ACK aaaa7bd0ba60aa7114810d4794940882d987c0aa
ryanofsky:
Code review ACK aaaa7bd0ba60aa7114810d4794940882d987c0aa. No changes since last review, other than rebase
aureleoules:
reACK aaaa7bd0ba60aa7114810d4794940882d987c0aa
Tree-SHA512: 83ec3ba0fb4f1dad95810d4bd4e578454e0718dc1bdd3a794cc4e48aa819b6f5dad4ac4edab3719bdfd5f89cbe23c2740a50fd56c1ff81c99e521c5f6d4e898d
When a block is initially read from a blk*.dat file during reindexing,
it can be added to the block index only if all of its ancestor blocks
have been added, which is rare. If the block's ancestors have not been
added, the block must be re-read from disk later when it can be added.
This commit: During the initial block read, deserialize only its header,
rather than the entire block, since this is sufficient to determine
if its parent (and thus all its ancestors) has been added. This is a
performance improvement.
Making the checks to identify BIP30 available outside of validation.cpp is needed for reporting and tracking statistics on specific blocks and the UTXO set correctly.
5d3f98d27879cd6d84b8590e947336e8d09613ed refactor: Replace m_params with chainman.GetParams() (Aurèle Oulès)
Pull request description:
Fixes a TODO introduced in #24595.
Removes `m_params` from `CChainState` class and replaces it with `m_chainman.GetParams()`.
ACKs for top commit:
MarcoFalke:
review ACK 5d3f98d27879cd6d84b8590e947336e8d09613ed 🌎
Tree-SHA512: de0fe31450d281cc7307c0d820495e86c93c7998e77a148db2c703da66cff1059e6560c041f1864913c42075aa24d259c2623d45e929ca0a8056ed330a9f9978
This changes the flag for the bitcoin-chainstate executable. Previously
it was false, now it is the chain's default value (still false for the
main chain).
This changes the minimum chain work for the bitcoin-chainstate
executable. Previously it was uint256{}, now it is the chain's default
minimum chain work.
e899d4ca6f44785b6e5481e200a6a9a8d2448612 init: limit bip30 exceptions to coinbase txs (Chris Geihsler)
511eb7fdeac36da9d6576c878a1c8d390b38f1bd Ignore problematic blocks in DisconnectBlock (Chris Geihsler)
Pull request description:
Fixes https://github.com/bitcoin/bitcoin/issues/22596
When using checklevel=4, block verification fails because of duplicate coinbase transactions involving blocks 91812 and 91722. There was already a check in place within `ConnectBlock` to ignore the problematic blocks, but `DisconnectBlock` did not contain a similar check to ignore these blocks when called from `VerifyDB`.
By ignoring these two blocks in `DisconnectBlock`, the block verification process succeeds at checklevel=4.
(Note to reviewers: this is my first contribution to Bitcoin Core, so any feedback is most welcome. Thanks in advance for reviewing!)
## Steps to reproduce:
Use the following bitcoin.conf file and start bitcoind. I only used block data through block ~100000 so that the verification process was much faster.
```
assumevalid=0
checkblocks=0
checklevel=4
```
Without this change, you will see the following error when the blocks are verified:
```
2022-04-14T02:56:44Z init message: Verifying blocks…
2022-04-14T02:56:44Z Verifying last 101881 blocks at level 4
2022-04-14T02:56:44Z [0%]...[10%]...[20%]...[30%]...[40%]...ERROR: VerifyDB(): *** coin database inconsistencies found (last 10160 blocks, 142571 good transactions before that)
2022-04-14T02:57:01Z : Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
: Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
```
With this change, you will see this instead:
```
2022-04-14T02:32:29Z init message: Verifying blocks…
2022-04-14T02:32:29Z Verifying last 101746 blocks at level 4
2022-04-14T02:32:29Z [0%]...[10%]...[20%]...[30%]...[40%]...[50%]...[60%]...[70%]...[80%]...[90%]...[DONE].
2022-04-14T02:32:48Z No coin database inconsistencies in last 101746 blocks (226126 transactions)
```
ACKs for top commit:
laanwj:
Code review ACK e899d4ca6f44785b6e5481e200a6a9a8d2448612
achow101:
ACK e899d4ca6f44785b6e5481e200a6a9a8d2448612
jamesob:
(Biased) ACK e899d4ca6f44785b6e5481e200a6a9a8d2448612 ([`jamesob/ackr/24851.2.seejee.init_ignore_bip_30_verif`](https://github.com/jamesob/bitcoin/tree/ackr/24851.2.seejee.init_ignore_bip_30_verif))
Tree-SHA512: d2f6d25e9619aee32c1a73fe846b1b587698eaa5a4994fa6424f1038f45654f9fd52b74a69843cc84d90168d74827130ccf8e9201502f5d52281acdb20429291
bf9597606166323158bbf631137b82d41f39334f doc: add note about snapshot chainstate init (James O'Beirne)
e4d799528696c5ede38c257afaffd367917e0de8 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7336d93a2dc88e4a61c49941887766bd72 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc9241c08a00f1f407f1534853a5cddbbc0a23 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c361391b8f5971eb3c7b620aa7ad9b437cc515e test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c215ed900145bd770525a341ba0ed9c027 validation: add ResetChainstates() (James O'Beirne)
3a29dfbfb2c16a50d854f6f81428a68aa9180509 move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9247dad3982d54488bcdb3960470315290 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff377c2b271cb4683da2fb25fd295557f731 validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d159033106cc595cfa852695610bfe419c989c add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1e8bc5cdf4368ad55e827a873240535b28 init: add utxo snapshot detection (James O'Beirne)
f9f1735f139b6a1f1c7fea50717ff90dc4ba2bce validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf100aaaa25c7558eeed8b5c536da99885f db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
---
Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.
This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.
Don't be scared! There are some big move-only commits in here.
Accompanying changes include:
- moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](https://github.com/bitcoin/bitcoin/pull/24232#discussion_r832762880).
- adding a simple fix in `FlushBlockFile()` that avoids a crash when attemping to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
- improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.
ACKs for top commit:
naumenkogs:
utACK bf9597606166323158bbf631137b82d41f39334f
ariard:
Code Review ACK bf9597606
ryanofsky:
Code review ACK bf9597606166323158bbf631137b82d41f39334f. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
fjahr:
utACK bf9597606166323158bbf631137b82d41f39334f
aureleoules:
ACK bf9597606166323158bbf631137b82d41f39334f
Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
bcb0cacac28e98a39dc856c574a0872fe17059e9 reindex, log, test: fixes#21379 (mruddy)
Pull request description:
Fixes#21379.
The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
These additions to the blk files are non-fatal, but also not desirable.
That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.
The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).
This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.
ACKs for top commit:
LarryRuane:
tested code-review ACK bcb0cacac28e98a39dc856c574a0872fe17059e9
ryanofsky:
Code review ACK bcb0cacac28e98a39dc856c574a0872fe17059e9. This is a disturbing bug with an easy fix which seems well-worth merging.
mzumsande:
ACK bcb0cacac28e98a39dc856c574a0872fe17059e9 (reviewed code and did some testing, I agree that it fixes the bug).
w0xlt:
tACK bcb0cacac2
Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51