fa4cb13b52 test: [doc] Manually unify stale headers (MarcoFalke)
fa5f297748 scripted-diff: [doc] Unify stale copyright headers (MarcoFalke)
Pull request description:
Historically, the upper year range in file headers was bumped manually
or with a script.
This has many issues:
* The script is causing churn. See for example commit 306ccd4, or
drive-by first-time contributions bumping them one-by-one. (A few from
this year: https://github.com/bitcoin/bitcoin/pull/32008,
https://github.com/bitcoin/bitcoin/pull/31642,
https://github.com/bitcoin/bitcoin/pull/32963, ...)
* Some, or likely most, upper year values were wrong. Reasons for
incorrect dates could be code moves, cherry-picks, or simply bugs in
the script.
* The upper range is not needed for anything.
* Anyone who wants to find the initial file creation date, or file
history, can use `git log` or `git blame` to get more accurate
results.
* Many places are already using the `-present` suffix, with the meaning
that the upper range is omitted.
To fix all issues, this bumps the upper range of the copyright headers
to `-present`.
Further notes:
* Obviously, the yearly 4-line bump commit for the build system (c.f.
b537a2c02a) is fine and will remain.
* For new code, the date range can be fully omitted, as it is done
already by some developers. Obviously, developers are free to pick
whatever style they want. One can list the commits for each style.
* For example, to list all commits that use `-present`:
`git log --format='%an (%ae) [%h: %s]' -S 'present The Bitcoin'`.
* Alternatively, to list all commits that use no range at all:
`git log --format='%an (%ae) [%h: %s]' -S '(c) The Bitcoin'`.
<!--
* The lower range can be wrong as well, so it could be omitted as well,
but this is left for a follow-up. A previous attempt was in
https://github.com/bitcoin/bitcoin/pull/26817.
ACKs for top commit:
l0rinc:
ACK fa4cb13b52
rkrux:
re-ACK fa4cb13b52
janb84:
ACK fa4cb13b52
Tree-SHA512: e5132781bdc4417d1e2922809b27ef4cf0abb37ffb68c65aab8a5391d3c917b61a18928ec2ec2c75ef5184cb79a5b8c8290d63e949220dbeab3bd2c0dfbdc4c5
0465574c12 test: Fixes send_blocks_and_test docs (Sergi Delgado Segura)
09c95f21e7 test: Adds block tiebreak over restarts tests (Sergi Delgado Segura)
18524b072e Make nSequenceId init value constants (Sergi Delgado Segura)
8b91883a23 Set the same best tip on restart if two candidates have the same work (Sergi Delgado Segura)
5370bed21e test: add functional test for complex reorgs (Pieter Wuille)
ab145cb3b4 Updates CBlockIndexWorkComparator outdated comment (Sergi Delgado Segura)
Pull request description:
This PR grabs some interesting bits from https://github.com/bitcoin/bitcoin/pull/29284 and fixes some edge cases in how block tiebreaks are dealt with.
## Regarding #29284
The main functionality from the PR was dropped given it was not an issue anymore, however, reviewers pointed out some comments were outdated https://github.com/bitcoin/bitcoin/pull/29284#discussion_r1522023578 (which to my understanding may have led to thinking that there was still an issue) it also added test coverage for the aforementioned case which was already passing on master and is useful to keep.
## New functionality
While reviewing the superseded PR, it was noticed that blocks that are loaded from disk may face a similar issue (check https://github.com/bitcoin/bitcoin/pull/29284#issuecomment-1994317785 for more context).
The issue comes from how tiebreaks for equal work blocks are handled: if two blocks have the same amount of work, the one that is activatable first wins, that is, the one for which we have all its data (and all of its ancestors'). The variable that keeps track of this, within `CBlockIndex` is `nSequenceId`, which is not persisted over restarts. This means that when a node is restarted, all blocks loaded from disk are defaulted the same `nSequenceId`: 0.
Now, when trying to decide what chain is best on loading blocks from disk, the previous tiebreaker rule is not decisive anymore, so the `CBlockIndexWorkComparator` has to default to its last rule: whatever block is loaded first (has a smaller memory address).
This means that if multiple same work tip candidates were available before restarting the node, it could be the case that the selected chain tip after restarting does not match the one before.
Therefore, the way `nSequenceId` is initialized is changed to:
- 0 for blocks that belong to the previously known best chain
- 1 to all other blocks loaded from disk
ACKs for top commit:
sipa:
utACK 0465574c12
TheCharlatan:
ACK 0465574c12
furszy:
Tested ACK 0465574c12.
Tree-SHA512: 161da814da03ce10c34d27d79a315460a9c98d019b85ee35bc5daa991ed3b6a2e69a829e421fc70d093a83cf7a2e403763041e594df39ed1991445e54c16532a
Also removed CChain::GetLocator() and replaced its call
with GetLocator() which uses LocatorEntries instead.
Co-authored-by: ryanofsky <ryan@ofsky.org>
Co-authored-by: l0rinc <l0rinc@users.noreply.github.com>
Before this, if we had two (or more) same work tip candidates and restarted our node,
it could be the case that the block set as tip after bootstrap didn't match the one
before stopping. That's because the work and `nSequenceId` of both block will be the same
(the latter is only kept in memory), so the active chain after restart would have depended
on what tip candidate was loaded first.
This makes sure that we are consistent over reboots.
IsValid() also returns false for blocks that have not been
validated yet up to the default validity level of BLOCK_VALID_TRANSACTIONS but
are not marked as invalid - e.g. if we only know the header.
Here, we specifically want to filter for invalid blocks.
Also removes the default arg from IsValid() which is now unused outside
of tests, to prevent this kind of misuse for the future.
Co-authored-by: TheCharlatan <seb.kung@gmail.com>
The `PopulateAndValidateSnapshot` function introduced in
f6e2da5fb7 from #19806 has been setting fake
`nTx` and `nChainTx` values that can show up in RPC results (see #29328) and
make `CBlockIndex` state hard to reason about, because it is difficult to know
whether the values are real or fake.
Revert to previous behavior of setting `nTx` and `nChainTx` to 0 when the
values are unknown, instead of faking them.
This commit fixes at least two assert failures in the (pindex->nChainTx ==
pindex->nTx + prev_chain_tx) check that would happen previously. Tests for
these failures are added separately in the next two commits.
Compatibility note: This change could result in -checkblockindex failures if a
snapshot was loaded by a previous version of Bitcoin Core and not fully
validated, because fake nTx values will have been saved to the block index. It
would be pretty easy to avoid these failures by adding some compatibility code
to `LoadBlockIndex` and changing `nTx` values from 1 to 0 when they are fake
(when `(pindex->nStatus & BLOCK_VALID_MASK) < BLOCK_VALID_TRANSACTIONS`), but a
little simpler not to worry about being compatible in this case.
Copy construction of CBlockIndex objects is a footgun because of the
wide use of equality-by-pointer comparison in the code base. There are
also potential lifetime confusions of using copied instances, since
there are recursive pointer references (e.g. pprev).
We can't just delete the copy constructors because they are used for
derived classes (CDiskBlockIndex), so we mark them protected.
Delete move constructors and declare the destructor to satisfy the
"rule of 5."
This introduces an insignificant performance penalty, as it means locator
construction needs to use the skiplist-based CBlockIndex::GetAncestor()
function instead of the lookup-based CChain, but avoids the need for
callers to have access to a relevant CChain object.
and mark the inherited CBlockIndex#GetBlockHash public interface member
as deleted, to disallow calling it in the derived CDiskBlockIndex class.
Here is a failing test on master demonstrating the inconsistent behavior of the
current design: calling the same inherited public interface functions on the
same CDiskBlockIndex object should yield identical behavior.
```diff
diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp
index 6dc522b421..dac3840f32 100644
--- a/src/test/validation_chainstatemanager_tests.cpp
+++ b/src/test/validation_chainstatemanager_tests.cpp
@@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)
const CBlockIndex* tip = chainman.ActiveTip();
BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx);
+ // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it.
+ // Test that calling the same inherited interface functions on the same
+ // object yields identical behavior.
+ CDiskBlockIndex index{tip};
+ CBlockIndex *pB = &index;
+ CDiskBlockIndex *pD = &index;
+ BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash());
+ BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString());
+
```
The GetBlockHash() test assertion only passes on master because the different
methods invoked by the current design happen to return the same result. If one
of the two is changed, it fails like the ToString() assertion does.
Redefining inherited non-virtual functions is well-documented as incorrect
design to avoid inconsistent behavior (see Scott Meyers, "Effective C++", Item
36). Class usage is confusing when the behavior depends on the pointer
definition instead of the object definition (static binding happening where
dynamic binding was expected). This can lead to unsuspected or hard-to-track
bugs.
Outside of critical hot spots, correctness usually comes before optimisation,
but the current design dates back to main.cpp and it may possibly have been
chosen to avoid the overhead of dynamic dispatch. This solution does the same:
the class sizes are unchanged and no vptr or vtbl is added.
There are better designs for doing this that use composition instead of
inheritance or that separate the public interface from the private
implementations. One example of the latter would be a non-virtual public
interface that calls private virtual implementation methods, i.e. the Template
pattern via the Non-Virtual Interface (NVI) idiom.
and remove a now-redundant assert preceding a GetBlockHash() caller.
This protects against UB here, and in case of failure (which would
indicate a consensus bug), the debug log will print
bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion `phashBlock != nullptr' failed.
Aborted
instead of
Segmentation fault
fa832103aa Avoid integer sanitizer warnings in chain.o (MarcoFalke)
Pull request description:
The two changes make the code more self-documenting and also allow to remove 5 file-wide suppressions for the module
ACKs for top commit:
PastaPastaPasta:
utACK fa832103aa
jonatack:
ACK fa832103aa
Tree-SHA512: d32a06099c56eed9f69130a3209f989872acc593f849528acd7746ee6caa96688cc32de37e8e59ad5d25dcb8912e341f1a43e50642dadeff6ca7624d0873ad10
ef72e9bd41 doc: nChainTx needs to become a 64-bit earlier due to SegWit (Sjors Provoost)
Pull request description:
As of block 597,379 txcount is 460,596,047 (see `chainparams.cpp`), while `uint32` can handle up to 4,294,967,296.
Pre segwit the [minimum transaction size](https://en.bitcoin.it/wiki/Maximum_transaction_rate) was 166 bytes, so the worst case number of transactions per block was ~6000. As the original source comment for `unsigned int nChainTx` says, that should last until the year 2030.
With SegWit the smallest possible transaction is 60 bytes (potentially increased to 65 with a future soft fork, see #15482), without a witness:
```
4 bytes version
1 byte input count
36 bytes outpoint
1 byte scriptSigLen (0x00)
0 bytes scriptSig
4 bytes sequence
1 byte output count
8 bytes value
1 byte scriptPubKeyLen
1 byte scriptPubKey (OP_TRUE)
4 bytes locktime
```
That puts the maximum number of transactions per block at 16,666 so we might have to deal with this as early as a block 827,450 in early 2024.
Given that it's a memory-only thing and we want to allow users many years to upgrade, I would suggest fixing this in v0.20 and back-porting it.
ACKs for top commit:
practicalswift:
re-ACK ef72e9bd41
jarolrod:
ACK ef72e9bd41
theStack:
ACK ef72e9bd41
Tree-SHA512: d8509ba7641796cd82af156354ff3a12ff7ec0f7b11215edff6696e95f8ca0e3596f719f3492ac3acb4b0884ac4e5bddc76f107b656bc2ed95a8ef1b2b5d4f71
Instead of (ab)using the existing BLOCK_VALID_* flags to mark CBlockIndex entries which
we haven't yet fully validated (but assume validity for use with UTXO snapshot
loading), introduce a status flag that specifically marks an assumed-valid state.
This state is then removed in RaiseValidity() when the block has actually been
validated.
This distinction will allow us to make the necessary changes to various parts of the
system to facilitate assumeutxo/background chainstate validation but without leaking
details like snapshot height, as we had done previously.
Changes that actually make use of this flag follow in future commits.
* Rename RewindBlockIndex -> NeedsRedownload (follow-up to commit
d831e711ca)
* Fix typos
* Inline comments about faking chain data to avoid duplicating them