d9319b06cf refactor: unify container presence checks - non-trivial counts (Lőrinc)
039307554e refactor: unify container presence checks - trivial counts (Lőrinc)
8bb9219b63 refactor: unify container presence checks - find (Lőrinc)
Pull request description:
### Summary
Instead of counting occurrences in sets and maps, the C++20 `::contains` method expresses the intent unambiguously and can return early on first encounter.
### Context
Applied clang‑tidy's [readability‑container‑contains](https://clang.llvm.org/extra/clang-tidy/checks/readability/container-contains.html) check, though many cases required manual changes since tidy couldn't fix them automatically.
### Changes
The changes made here were:
| From | To |
|------------------------|------------------|
| `m.find(k) == m.end()` | `!m.contains(k)` |
| `m.find(k) != m.end()` | `m.contains(k)` |
| `m.count(k)` | `m.contains(k)` |
| `!m.count(k)` | `!m.contains(k)` |
| `m.count(k) == 0` | `!m.contains(k)` |
| `m.count(k) != 1` | `!m.contains(k)` |
| `m.count(k) == 1` | `m.contains(k)` |
| `m.count(k) < 1` | `!m.contains(k)` |
| `m.count(k) > 0` | `m.contains(k)` |
| `m.count(k) != 0` | `m.contains(k)` |
> Note that `== 1`/`!= 1`/`< 1` only apply to simple [maps](https://en.cppreference.com/w/cpp/container/map/contains)/[sets](https://en.cppreference.com/w/cpp/container/set/contains) and had to be changed manually.
There are many other cases that could have been changed, but we've reverted most of those to reduce conflict with other open PRs.
-----
<details>
<summary>clang-tidy command on Mac</summary>
```bash
rm -rfd build && \
cmake -B build \
-DCMAKE_C_COMPILER="$(brew --prefix llvm)/bin/clang" \
-DCMAKE_CXX_COMPILER="$(brew --prefix llvm)/bin/clang++" \
-DCMAKE_OSX_SYSROOT="$(xcrun --show-sdk-path)" \
-DCMAKE_C_FLAGS="-target arm64-apple-macos11" \
-DCMAKE_CXX_FLAGS="-target arm64-apple-macos11" \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_BENCH=ON -DBUILD_FUZZ_BINARY=ON -DBUILD_FOR_FUZZING=ON
"$(brew --prefix llvm)/bin/run-clang-tidy" -quiet -p build -j$(nproc) -checks='-*,readability-container-contains' | grep -v 'clang-tidy'
```
</details>
Note: this is a take 2 of https://github.com/bitcoin/bitcoin/pull/33094 with fewer contentious changes.
ACKs for top commit:
optout21:
reACK d9319b06cf
sedited:
ACK d9319b06cf
janb84:
re ACK d9319b06cf
pablomartin4btc:
re-ACK d9319b06cf
ryanofsky:
Code review ACK d9319b06cf. I manually reviewed the full change, and it seems there are a lot of positive comments about this and no more very significant conflicts, so I will merge it shortly.
Tree-SHA512: e4415221676cfb88413ccc446e5f4369df7a55b6642347277667b973f515c3c8ee5bfa9ee0022479c8de945c89fbc9ff61bd8ba086e70f30298cbc1762610fe1
Deduplicate code looping over chainstate objects and calling
ActivateBestChain() and avoid need for code outside ChainstateManager to use
the GetAll() method.
Use to replace m_active_chainstate, m_ibd_chainstate, and m_snapshot_chainstate
members. This has several benefits:
- Ensures ChainstateManager treats chainstates instances equally, making
distinctions based on their attributes, not having special cases and making
assumptions based on their identities.
- Normalizes ChainstateManager representation so states that should be
impossible to reach and validation code has no handling for (like
m_snapshot_chainstate being set and m_ibd_chainstate being unset, or both
being set but m_active_chainstate pointing to the m_ibd_chainstate) can no
longer be represented.
- Makes ChainstateManager more extensible so new chainstates can be added for
different purposes, like indexing or generating and validating assumeutxo
snapshots without interrupting regular node operations. With the
m_chainstates member, new chainstates can be added and handled without needing
to make changes all over validation code or to copy/paste/modify the existing
code that's been already been written to handle m_ibd_chainstate and
m_snapshot_chainstate.
- Avoids terms that are confusing and misleading:
- The term "active chainstate" term is confusing because multiple chainstates
will be active and in use at the same time. Before a snapshot is validated,
wallet code will use the snapshot chainstate, while indexes will use the IBD
chainstate, and netorking code will use both chainstates, downloading
snapshot blocks at higher priority, but also IBD blocks simultaneously.
- The term "snapshot chainstate" is ambiguous because it could refer either
to the chainstate originally loaded from a snapshot, or to the chainstate
being used to validate a snapshot that was loaded, or to a chainstate being
used to produce a snapshot, but it is arbitrary used to refer the first
thing. The terms "most-work chainstate" or "assumed-valid chainstate" should
be less ambiguous ways to refer to chainstates loaded from snapshots.
- The term "IBD chainstate" is not just ambiguous but actively confusing
because technically IBD ends and the node is considered synced when the
snapshot chainstate finishes syncing, so in practice the IBD chainstate
will mostly by synced after IBD is complete. The term "fully-validated" is
a better way of describing the characteristics and purpose of this
chainstate.
SnapshotBlockhash() is only called two places outside of tests, and is used
redundantly in some tests, checking the same field as other checks. Simplify by
dropping the method and using the m_from_snapshot_blockhash field directly.
IsSnapshotValidated() is only called one place outside of tests, and is use
redundantly in some tests, asserting that a snapshot is not validated when a
snapshot chainstate does not even exist. Simplify by dropping the method and
checking Chainstate m_assumeutxo field directly.
IsSnapshotActive() method is only called one place outside of tests and
asserts, and is confusing because it returns true even after the snapshot is
fully validated.
The documentation which said this "implies that a background validation
chainstate is also in use" is also incorrect, because after the snapshot is
validated, the background chainstate gets disabled and IsUsable() would return
false.
Change ChainstateRole parameter passed to wallets and indexes. Wallets and
indexes need to know whether chainstate is historical and whether it is fully
validated. They should not be aware of the assumeutxo snapshot validation
process.
CurrentChainstate() is basically the same as ActiveChainstate() except it
requires cs_main to be locked when it is called, instead of locking cs_main
internally.
The name "current" should also be less confusing than "active" because multiple
chainstates can be active, and CurrentChainstate() returns the chainstate
targeting the current network tip, regardless of what chainstates are being
downloaded or how they are used.
Use to simplify code determining the chainstate leveldb paths. New method is
the now the only code that needs to figure out the storage path, so the path
doesn't need to be constructed multiple places and backed out of leveldb.
Remove hardcoded references to m_ibd_chainstate and m_snapshot_chainstate so
MaybeCompleteSnapshotValidation function can be simpler and focus on validating
the snapshot without dealing with internal ChainstateManager states.
This is a step towards being able to validate the snapshot outside of
ActivateBestChain loop so cs_main is not locked for minutes when the snapshot
block is connected.
Move duplicate code from ChainstateManager::ActivateSnapshot and
ChainstateManager::ActivateExistingSnapshot methods to a new
ChainstateManager::AddChainstate method.
The "AddChainstate" method name doesn't mention snapshots even though it is
only used to add snapshot chainstates now, because it becomes more generalized
in a later commit in this PR ("refactor: Add ChainstateManager::m_chainstates
member")
Get rid of m_disabled/IsUsable members. Instead of marking chains disabled for
different reasons, store chainstate assumeutxo status explicitly and use that
information to determine how chains should be treated.
Make Chainstate objects aware of what block they are targeting. This makes
Chainstate objects more self contained, so it's possible for validation code to
look at one Chainstate object and know what blocks to connect to it without
needing to consider global validation state or look at other Chainstate
objects.
The motivation for this change is to make validation and networking code more
readable, so understanding it just requires knowing about chains and blocks,
not reasoning about assumeutxo download states. This change also enables
simplifications to the ChainstateManager interface in subsequent commits, and
could make it easier to implement new features like creating new Chainstate
objects to generate UTXO snapshots or index UTXO data.
Note that behavior of the MaybeCompleteSnapshotValidation function is not
changing here but some checks that were previously impossible to trigger like
the BASE_BLOCKHASH_MISMATCH case have been turned into asserts.
c1e554d3e5 refactor: consolidate 3 separate locks into one block (Andrew Toth)
41479ed1d2 test: add test for periodic flush inside ActivateBestChain (Andrew Toth)
84820561dc validation: periodically flush dbcache during reindex-chainstate (Andrew Toth)
Pull request description:
After #30611 we periodically do a non-erasing flush of the dbcache to disk roughly every hour during IBD.
The intention was to also do this periodic flush during reindex-chainstate, so we would not risk losing progress during a system failure when reindexing with a high dbcache value.
It was discovered that reindex-chainstate does not perform a PERIODIC flush until it has already reached the tip. Since reindexing to tip usually happens within 24 hours, this behaviour was unnoticed with the previous periodic flush interval. Note that reindex-chainstate still does IF_NEEDED flushes during `ConnectBlock`, so this also would not be noticed when running with a lower dbcache value.
This patch moves the PERIODIC flush from after the outer loop in `ActivateBestChain` to inside the outer loop after we release `cs_main`. This will periodically flush during IBD, reindex-chainstate, and steady state.
ACKs for top commit:
l0rinc:
ACK c1e554d3e5
achow101:
ACK c1e554d3e5
sipa:
utACK c1e554d3e5
Tree-SHA512: c447ad03e16c9978b8ed2c285b38e1b4c56e7778ab93b6f64435116f47b8931017f5f56ab53eb61656693146aaced776f666af573a41ab28e8f2b6d8657fa756
fa89f60e31 scripted-diff: LogPrintLevel(*,BCLog::Level::*,*) -> LogError()/LogWarning() (MarcoFalke)
fa6c7a1954 scripted-diff: LogPrintLevel(*,BCLog::Level::Debug,*) -> LogDebug() (MarcoFalke)
Pull request description:
Errors and warnings should normally not happen. However, if they do happen, it is easier to spot them, if they are all logged in the same format via `LogError` or `LogWarning`.
So do that with a scripted-diff.
This is a minimal behavior change and unifies the log output from:
[net:error] Something bad happened
[net:warning] Something problematic happened
to either
[error] Something bad happened
[warning] Something problematic happened
or, when `-loglevelalways=1` is enabled:
[all:error] Something bad happened
[all:warning] Something problematic happened
Such a behavior change is desired, because all warning and error logs are written in the same style in the source code and they are logged in the same format for log consumers.
Removing the category should be harmless, because warning and error messages should be descriptive and unique anyway.
ACKs for top commit:
ajtowns:
ACK fa89f60e31
stickies-v:
ACK fa89f60e31
rkrux:
lgtm code review ACK fa89f60e31
Tree-SHA512: dafa47ab561609a79005faf008fe188dd714f6e07bb2dfbe4db49290d6636b12eb7ac4a18ed32bcc5526641a9f258dbc37c08e10c223ec068b97976590ff0b52
0ac969cddf validation: don't reallocate cache for short-lived CCoinsViewCache (Lőrinc)
c8f5e446dc coins: reduce lookups in dbcache layer propagation (Lőrinc)
Pull request description:
This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)
### Summary
Previously, when the parent coins cache had no entry and the child did, `BatchWrite` performed a find followed by `try_emplace`, which resulted in multiple `SipHash` computations and bucket traversals on the common insert path.
On a different path, these caches were recreated needlessly for every block connection.
### Fix for double fetch
This change uses a single leading `try_emplace` and branches on the returned `inserted` flag. In the `FRESH && SPENT` case (not used in production, only exercised by tests), we erase the just-inserted placeholder (which is constant time with no rehash anyway). Semantics are unchanged for all valid parent/child state combinations.
This change is a minimal version of [bitcoin/bitcoin@`723c49b` (#32128)](723c49b63b) and draws simplification ideas [bitcoin/bitcoin@`ae76ec7` (#30673)](ae76ec7bcf) and https://github.com/bitcoin/bitcoin/pull/30326.
### Fix for temporary cache recreation
Related to parent cache propagation, the second commit makes it possible to avoid destructuring-recreating-destructuring of these short-live parent caches created for each new block.
A few temporary `CCoinsViewCache`'s are destructed right after the `Flush()`, therefore it is not necessary to call `ReallocateCache` to recreate them right before they're killed anyway.
This change was based on a subset of https://github.com/bitcoin/bitcoin/pull/28945, the original authors and relevant commenters were added as coauthors to this version.
-----
Reindex-chainstate indicates ~1% speedup.
<details>
<summary>Details</summary>
```python
COMMITS="647cdb4f7e8041affed887e2325ee03a91078bb1 0b0c3293ffd75afb27dadc0b28426b40132a8c6b"; \
STOP=909090; DBCACHE=4500; \
CC=gcc; CXX=g++; \
BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
(echo ""; for c in $COMMITS; do git fetch -q origin $c && git log -1 --pretty='%h %s' $c || exit 1; done; echo "") && \
hyperfine \
--sort command \
--runs 2 \
--export-json "$BASE_DIR/rdx-$(sed -E 's/(\w{8})\w+ ?/\1-/g;s/-$//'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
--parameter-list COMMIT ${COMMITS// /,} \
--prepare "killall bitcoind 2>/dev/null; rm -f $DATA_DIR/debug.log; git checkout {COMMIT}; git clean -fxd; git reset --hard && \
cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DENABLE_IPC=OFF && ninja -C build bitcoind && \
./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20" \
--cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
"COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0"
647cdb4f7e Merge bitcoin/bitcoin#33311: net: Quiet down logging when router doesn't support natpmp/pcp
0b0c3293ff validation: don't reallocate cache for short-lived CCoinsViewCache
Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4f7e)
Time (mean ± σ): 16233.508 s ± 9.501 s [User: 19064.578 s, System: 951.672 s]
Range (min … max): 16226.790 s … 16240.226 s 2 runs
Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b)
Time (mean ± σ): 16039.626 s ± 17.284 s [User: 18870.130 s, System: 950.722 s]
Range (min … max): 16027.405 s … 16051.848 s 2 runs
Relative speed comparison
1.01 ± 0.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4f7e)
1.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b)
```
</details>
ACKs for top commit:
optout21:
utACK 0ac969cddf
achow101:
ACK 0ac969cddf
andrewtoth:
utACK 0ac969cddf
sedited:
ACK 0ac969cddf
Tree-SHA512: 9fcc3f1a8314368576a4fba96ca72665527eaa3a97964ab5b39491757f3527147d134f79a5c3456f76c1330c7ef862989d23f764236c5e2563be89a81c1cee47
4b47113698 validation: Reword CheckForkWarningConditions and call it also during IBD and at startup (Martin Zumsande)
2f51951d03 p2p: Add warning message when receiving headers for blocks cached as invalid (Martin Zumsande)
Pull request description:
In case of corruption that leads to a block being marked as invalid that is seen as valid by the rest of the network, the user currently doesn't receive good error messages, but will often be stuck in an endless headers-sync loop with no explanation (#26391).
This PR improves warnings in two ways:
- When we receive a header that is already saved in our disk, but invalid, add a warning. This will happen repeatedly during the headerssync loop (see https://github.com/bitcoin/bitcoin/issues/26391#issuecomment-1291765534 on how to trigger it artificially).
- Removes the IBD check from `CheckForkWarningConditions` and adds a call to the function during init (`LoadChainTip()`). The existing check was added in 55ed3f1475 a long time ago when we had more sophisticated fork detection that could lead to false positives during IBD, but that logic was removed in fa62304c97 so that I don't see a reason to suppress the warning anymore.
Fixes#26391 (We'll still do the endless looping, trying to find a peer with a headers that we can use, but will now repeatedly log warnings while doing so).
ACKs for top commit:
glozow:
ACK `git range-diff 6d2c8ea9dbd77c71051935b5ab59224487509559...4b4711369880369729893ba7baef11ba2a36cf4b`
theStack:
re-ACK 4b47113698
sedited:
ACK 4b47113698
Tree-SHA512: 78bc53606374636d616ee10fdce0324adcc9bcee2806a7e13c9471e4c02ef00925ce6daef303bc153b7fcf5a8528fb4263c875b71d2e965f7c4332304bc4d922
This is a minimal behavior change and changes log output from:
[net:error] Something bad happened
[net:warning] Something problematic happened
to either
[error] Something bad happened
[warning] Something problematic happened
or, when -loglevelalways=1 is enabled:
[all:error] Something bad happened
[all:warning] Something problematic happened
Such a behavior change is desired, because all warning and error logs
are written in the same style in the source code and they are logged in
the same format for log consumers.
-BEGIN VERIFY SCRIPT-
sed --regexp-extended --in-place \
's/LogPrintLevel\((BCLog::[^,]*), BCLog::Level::(Error|Warning), */Log\2(/g' \
$( git grep -l LogPrintLevel ':(exclude)src/test/logging_tests.cpp' )
-END VERIFY SCRIPT-
The changes made here were:
| From | To |
|-------------------|------------------|
| `m.count(k)` | `m.contains(k)` |
| `!m.count(k)` | `!m.contains(k)` |
| `m.count(k) == 0` | `!m.contains(k)` |
| `m.count(k) != 0` | `m.contains(k)` |
| `m.count(k) > 0` | `m.contains(k)` |
The commit contains the trivial, mechanical refactors where it doesn't matter if the container can have multiple elements or not
Co-authored-by: Jan B <608446+janb84@users.noreply.github.com>
The existing IBD disable was added at a time when CheckForkWarningConditions
did also sophisticated fork detection that could lead to false positives
during IBD (55ed3f1475).
The fork detection logic doesn't exist anymore
(since fa62304c97), so the IBD check is no
longer necessary.
Displaying the log at startup will help node operators diagnose the
problem better.
Also unify log message and alert warning text, since a long invalid chain
could be due to chainstate corruption or an actual consensus incompatibility
with peers. Previously the log assumed the former and the alert the latter.
Currently, if database corruption leads to a block being marked as
invalid incorrectly, we can get stuck in an infinite headerssync
loop with no indication what went wrong or how to fix it.
With the added log message, users will receive an explicit warning after each
failed headerssync attempt with an outbound peer.
17cf9ff7ef Use cluster size limit for -maxmempool bound, and allow -maxmempool=0 in general (Suhas Daftuar)
315e43e5d8 Sanity check `GetFeerateDiagram()` in CTxMemPool::check() (Suhas Daftuar)
de2e9a24c4 test: extend package rbf functional test to larger clusters (Suhas Daftuar)
4ef4ddb504 doc: update policy/packages.md for new package acceptance logic (Suhas Daftuar)
79f73ad713 Add check that GetSortedScoreWithTopology() agrees with CompareMiningScoreWithTopology() (Suhas Daftuar)
a86ac11768 Update comments for CTxMemPool class (Suhas Daftuar)
9567eaa66d Invoke TxGraph::DoWork() at appropriate times (Suhas Daftuar)
6c5c44f774 test: add functional test for new cluster mempool RPCs (Suhas Daftuar)
72f60c877e doc: Update mempool_replacements.md to reflect feerate diagram checks (Suhas Daftuar)
21693f031a Expose cluster information via rpc (Suhas Daftuar)
72e74e0d42 fuzz: try to add more code coverage for mempool fuzzing (Suhas Daftuar)
f107417490 bench: add more mempool benchmarks (Suhas Daftuar)
7976eb1ae7 Avoid violating mempool policy limits in tests (Suhas Daftuar)
84de685cf7 Stop tracking parents/children outside of txgraph (Suhas Daftuar)
88672e205b Rewrite GatherClusters to use the txgraph implementation (Suhas Daftuar)
1ca4f01090 Fix miniminer_tests to work with cluster limits (Suhas Daftuar)
1902111e0f Eliminate CheckPackageLimits, which no longer does anything (Suhas Daftuar)
3a646ec462 Rework RBF and TRUC validation (Suhas Daftuar)
19b8479868 Make getting parents/children a function of the mempool, not a mempool entry (Suhas Daftuar)
5560913e51 Rework truc_policy to use descendants, not children (Suhas Daftuar)
a4458d6c40 Use txgraph to calculate descendants (Suhas Daftuar)
c8b6f70d64 Use txgraph to calculate ancestors (Suhas Daftuar)
241a3e666b Simplify ancestor calculation functions (Suhas Daftuar)
b9cec7f0a1 Make removeConflicts private (Suhas Daftuar)
0402e6c780 Remove unused limits from CalculateMemPoolAncestors (Suhas Daftuar)
08be765ac2 Remove mempool logic designed to maintain ancestor/descendant state (Suhas Daftuar)
fc4e3e6bc1 Remove unused members from CTxMemPoolEntry (Suhas Daftuar)
ff3b398d12 mempool: eliminate accessors to mempool entry ancestor/descendant cached state (Suhas Daftuar)
b9a2039f51 Eliminate use of cached ancestor data in miniminer_tests and truc_policy (Suhas Daftuar)
ba09fc9774 mempool: Remove unused function CalculateDescendantMaximum (Suhas Daftuar)
8e49477e86 wallet: Replace max descendant count with cluster_count (Suhas Daftuar)
e031085fd4 Eliminate Single-Conflict RBF Carve Out (Suhas Daftuar)
cf3ab8e1d0 Stop enforcing descendant size/count limits (Suhas Daftuar)
89ae38f489 test: remove rbf carveout test from mempool_limit.py (Suhas Daftuar)
c0bd04d18f Calculate descendant information for mempool RPC output on-the-fly (Suhas Daftuar)
bdcefb8a8b Use mempool/txgraph to determine if a tx has descendants (Suhas Daftuar)
69e1eaa6ed Add test case for cluster size limits to TRUC logic (Suhas Daftuar)
9cda64b86c Stop enforcing ancestor size/count limits (Suhas Daftuar)
1f93227a84 Remove dependency on cached ancestor data in mini-miner (Suhas Daftuar)
9fbe0a4ac2 rpc: Calculate ancestor data from scratch for mempool rpc calls (Suhas Daftuar)
7961496dda Reimplement GetTransactionAncestry() to not rely on cached data (Suhas Daftuar)
feceaa42e8 Remove CTxMemPool::GetSortedDepthAndScore (Suhas Daftuar)
21b5cea588 Use cluster linearization for transaction relay sort order (Suhas Daftuar)
6445aa7d97 Remove the ancestor and descendant indices from the mempool (Suhas Daftuar)
216e693729 Implement new RBF logic for cluster mempool (Suhas Daftuar)
ff8f115dec policy: Remove CPFP carveout rule (Suhas Daftuar)
c3f1afc934 test: rewrite PopulateMempool to not violate mempool policy (cluster size) limits (Suhas Daftuar)
47ab32fdb1 Select transactions for blocks based on chunk feerate (Suhas Daftuar)
dec138d1dd fuzz: remove comparison between mini_miner block construction and miner (Suhas Daftuar)
6c2bceb200 bench: rewrite ComplexMemPool to not create oversized clusters (Suhas Daftuar)
1ad4590f63 Limit mempool size based on chunk feerate (Suhas Daftuar)
b11c89cab2 Rework miner_tests to not require large cluster limit (Suhas Daftuar)
95a8297d48 Check cluster limits when using -walletrejectlongchains (Suhas Daftuar)
95762e6759 Do not allow mempool clusters to exceed configured limits (Suhas Daftuar)
edb3e7cdf6 [test] rework/delete feature_rbf tests requiring large clusters (glozow)
435fd56711 test: update feature_rbf.py replacement test (Suhas Daftuar)
34e32985e8 Add new (unused) limits for cluster size/count (Suhas Daftuar)
838d7e3553 Add transactions to txgraph, but without cluster dependencies (Suhas Daftuar)
d5ed9cb3eb Add accessor for sigops-adjusted weight (Suhas Daftuar)
1bf3b51396 Add sigops adjusted weight calculator (Suhas Daftuar)
c18c68a950 Create a txgraph inside CTxMemPool (Suhas Daftuar)
29a94d5b2f Make CTxMemPoolEntry derive from TxGraph::Ref (Suhas Daftuar)
92b0079fe3 Allow moving CTxMemPoolEntry objects, disallow copying (Suhas Daftuar)
6c73e47448 mempool: Store iterators into mapTx in mapNextTx (Suhas Daftuar)
51430680ec Allow moving an Epoch::Marker (Suhas Daftuar)
Pull request description:
[Reopening #28676 here as a new PR, because GitHub is slow to load the page making it hard to scroll through and see comments. Also, that PR was originally opened with a prototype implementation which has changed significantly with the introduction of `TxGraph`.]
This is an implementation of the [cluster mempool proposal](https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393).
This branch implements the following observable behavior changes:
- Maintains a partitioning of the mempool into connected clusters (via the `txgraph` class), which are limited in vsize to 101 kvB by default, and limited in count to 64 by default.
- Each cluster is sorted ("linearized") to try to optimize for selecting highest-feerate-subsets of a cluster first
- Transaction selection for mining is updated to use the cluster linearizations, selecting highest feerate "chunks" first for inclusion in a block template.
- Mempool eviction is updated to use the cluster linearizations, selecting lowest feerate "chunks" first for removal.
- The RBF rules are updated to: (a) drop the requirement that no new inputs are introduced; (b) change the feerate requirement to instead check that the feerate diagram of the mempool will strictly improve; (c) replace the direct conflicts limit with a directly-conflicting-clusters limit.
- The CPFP carveout rule is eliminated (it doesn't make sense in a cluster-limited mempool)
- The ancestor and descendant limits are no longer enforced.
- New cluster count/cluster vsize limits are now enforced instead.
- Transaction relay now uses chunk feerate comparisons to determine the order that newly received transactions are announced to peers.
Additionally, the cached ancestor and descendant data are dropped from the mempool, along with the multi_index indices that were maintained to sort the mempool by ancestor and descendant feerates. For compatibility (eg with wallet behavior or RPCs exposing this), this information is now calculated dynamically instead.
ACKs for top commit:
instagibbs:
reACK 17cf9ff7ef
glozow:
reACK 17cf9ff7ef
sipa:
ACK 17cf9ff7ef
Tree-SHA512: bbde46d913d56f8d9c0426cb0a6c4fa80b01b0a4c2299500769921f886082fb4f51f1694e0ee1bc318c52e1976d7ebed8134a64eda0b8044f3a708c04938eee7
Calculating mempool ancestors for a new transaction should not be done until
after cluster size limits have been enforced, to limit CPU DoS potential.
Achieve this by reworking TRUC and RBF validation logic:
- TRUC policy enforcement is now done using only mempool parents of
new transactions, not all mempool ancestors (note that it's fine to calculate
ancestors of in-mempool transactions, if the number of such calls is
reasonably bounded).
- RBF replacement checks are performed earlier (which allows for checking
cluster size limits earlier, because cluster size checks cannot happen until
after all conflicts are staged for removal).
- Verifying that a new transaction doesn't conflict with an ancestor now
happens later, in AcceptSingleTransaction() rather than in PreChecks(). This
means that the test is not performed at all in AcceptMultipleTransactions(),
but in package acceptance we already disallow RBF in situations where a
package transaction has in-mempool parents.
Also to ensure that all RBF validation logic is applied in both the single
transaction and multiple transaction cases, remove the optimization that skips
the PackageMempoolChecks() in the case of a single transaction being validated
in AcceptMultipleTransactions().
Now that ancestor calculation never fails (due to ancestor/descendant limits
being eliminated), we can eliminate the error handling from
CalculateMemPoolAncestors.
With a total ordering on mempool transactions, we are now able to calculate a
transaction's mining score at all times. Use this to improve the RBF logic:
- we no longer enforce a "no new unconfirmed parents" rule
- we now require that the mempool's feerate diagram must improve in order
to accept a replacement
- the topology restrictions for conflicts in the package rbf setting have been
eliminated
Revert the temporary change to mempool_ephemeral_dust.py that were previously
made due to RBF validation checks being reordered.
Co-authored-by: Gregory Sanders <gsanders87@gmail.com>, glozow <gloriajzhao@gmail.com>
The addition of a cluster size limit makes the CPFP carveout rule useless,
because carveout cannot be used to bypass the cluster size limit. Remove this
policy rule and update tests to no longer rely on the behavior.
Include an adjustment to mempool_tests.cpp due to the additional memory used by
txgraph.
Includes a temporary change to the mempool_ephemeral_dust.py functional test,
due to validation checks being reordered. This change will revert once the RBF
rules are changed in a later commit.
743abbcbde refactor: inline constant return value of `BlockTreeDB::WriteBatchSync` and `BlockManager::WriteBlockIndexDB` and `BlockTreeDB::WriteFlag` (Lőrinc)
e030240e90 refactor: inline constant return value of `CDBWrapper::Erase` and `BlockTreeDB::WriteReindexing` (Lőrinc)
cdab9480e9 refactor: inline constant return value of `CDBWrapper::Write` (Lőrinc)
d1847cf5b5 refactor: inline constant return value of `TxIndex::DB::WriteTxs` (Lőrinc)
50b63a5698 refactor: inline constant return value of `CDBWrapper::WriteBatch` (Lőrinc)
Pull request description:
Related to https://github.com/bitcoin/bitcoin/pull/31144#discussion_r2223587480
### Summary
`WriteBatch` always returns `true` - the errors are handled by throwing `dbwrapper_error` instead.
### Context
This boolean return value of the `Write` methods is confusing because it's inconsistent with `CDBWrapper::Read`, which catches exceptions and returns a boolean to indicate success/failure. It's bad that `Read` returns and `Write` throws - but it's a lot worse that `Write` advertises a return value when it actually communicates errors through exceptions.
### Solution
This PR removes the constant return values from write methods and inlines `true` at their call sites. Many upstream methods had boolean return values only because they were propagating these constants - those have been cleaned up as well.
Methods that returned a constant `true` value that now return `void`:
- `CDBWrapper::WriteBatch`, `CDBWrapper::Write`, `CDBWrapper::Erase`
- `TxIndex::DB::WriteTxs`
- `BlockTreeDB::WriteReindexing`, `BlockTreeDB::WriteBatchSync`, `BlockTreeDB::WriteFlag`
- `BlockManager::WriteBlockIndexDB`
### Note
`CCoinsView::BatchWrite` (and transitively `CCoinsViewCache::Flush` & `CCoinsViewCache::Sync`) were intentionally not changed here. While all implementations return `true`, the base `CCoinsView::BatchWrite` returns `false`. Changing this would cause `coins_view` tests to fail with:
> terminating due to uncaught exception of type std::logic_error: Not all unspent flagged entries were cleared
We can fix that in a follow-up PR.
ACKs for top commit:
achow101:
ACK 743abbcbde
janb84:
ACK 743abbcbde
TheCharlatan:
ACK 743abbcbde
sipa:
ACK 743abbcbde
Tree-SHA512: b2a550bff066216f1958d2dd9a7ef6a9949de518cc636f8ab9c670e0b7a330c1eb8c838e458a8629acb8ac980cea6616955cd84436a7b8ab9096f6d648073b1e
1fc7a81f1f log: reduce excessive messages during block replay (Lőrinc)
Pull request description:
### Summary
After an incomplete reindex the blocks will need to be replayed.
This results in excessive `Rolling back` and `Rolling forward` messages which quickly triggers the recently introduced log rate limiter.
Change the logging strategy to:
- Add single `LogInfo` messages showing the full range being replayed for both rollback and roll forward;
- Log progress at `LogInfo` level only every 10,000 blocks to track the long operations.
### Reproducer:
* Start a normal ibd, stop after some progress
* Do a reindex, stop before it finishes
* Restart the node normally without specifying the reindex parameter
It should start rolling the blocks forward.
Before this change the excessive logging would show:
```
[*] Rolling forward 000000002f4f55aecfccc911076dc3f73ac0288c83dc1d79db0a026441031d40 (46245)
[*] Rolling forward 0000000017ffcf34c8eac010c529670ba6745ea59cf1edf7b820928e3b40acf6 (46246)
```
After the change it shows:
```
Replaying blocks
Rolling forward to 00000000000000001034012d7e4facaf16ca747ea94b8ea66743086cfe298ef8 (326223 to 340991)
Rolling forward 00000000000000000faabab19f17c0178c754dbed023e6c871dcaf74159c5f02 (330000)
Rolling forward 00000000000000000d9b2508615d569e18f00c034d71474fc44a43af8d4a5003 (340000)
...
Rolled forward to 00000000000000001034012d7e4facaf16ca747ea94b8ea66743086cfe298ef8
```
(similarly to rolling back)
ACKs for top commit:
Crypt-iQ:
crACK 1fc7a81f1f
stickies-v:
ACK 1fc7a81f1f
achow101:
ACK 1fc7a81f1f
vasild:
ACK 1fc7a81f1f
hodlinator:
Concept ACK 1fc7a81f1f
Tree-SHA512: 44ed1da8336de5a3d937e11a13e6f1789064e23eb70640a1c406fbb0074255344268f6eb6b06f036ca8d22bfeb4bdea319c3085a2139d848f6d36a4f8352b76a
The `mutated` parameter is never used at any call site - all callers pass `nullptr`.
The explicit comment in `validation.cpp` explains the reason:
// The malleation check is ignored; as the transaction tree itself
// already does not permit it, it is impossible to trigger in the
// witness tree.
A few temporary `CCoinsViewCache`'s are destructed right after the `Flush()`, therefore it is not necessary to call `ReallocateCache` to recreate them right before they're killed anyway.
* `Flush()` - retains existing functionality;
* `Flush(/*will_reuse_cache=*/false)` - skips destruction and reallocation of the parent cache since it will soon go out of scope anyway;
For the `will_reuse_cache` parameter we want to see exactly which ones will reallocate memory and which won't - since both can be valid usages.
This change was based on a subset of https://github.com/bitcoin/bitcoin/pull/28945.
Co-authored-by: Martin Ankerl <martin.ankerl@gmail.com>
0465574c12 test: Fixes send_blocks_and_test docs (Sergi Delgado Segura)
09c95f21e7 test: Adds block tiebreak over restarts tests (Sergi Delgado Segura)
18524b072e Make nSequenceId init value constants (Sergi Delgado Segura)
8b91883a23 Set the same best tip on restart if two candidates have the same work (Sergi Delgado Segura)
5370bed21e test: add functional test for complex reorgs (Pieter Wuille)
ab145cb3b4 Updates CBlockIndexWorkComparator outdated comment (Sergi Delgado Segura)
Pull request description:
This PR grabs some interesting bits from https://github.com/bitcoin/bitcoin/pull/29284 and fixes some edge cases in how block tiebreaks are dealt with.
## Regarding #29284
The main functionality from the PR was dropped given it was not an issue anymore, however, reviewers pointed out some comments were outdated https://github.com/bitcoin/bitcoin/pull/29284#discussion_r1522023578 (which to my understanding may have led to thinking that there was still an issue) it also added test coverage for the aforementioned case which was already passing on master and is useful to keep.
## New functionality
While reviewing the superseded PR, it was noticed that blocks that are loaded from disk may face a similar issue (check https://github.com/bitcoin/bitcoin/pull/29284#issuecomment-1994317785 for more context).
The issue comes from how tiebreaks for equal work blocks are handled: if two blocks have the same amount of work, the one that is activatable first wins, that is, the one for which we have all its data (and all of its ancestors'). The variable that keeps track of this, within `CBlockIndex` is `nSequenceId`, which is not persisted over restarts. This means that when a node is restarted, all blocks loaded from disk are defaulted the same `nSequenceId`: 0.
Now, when trying to decide what chain is best on loading blocks from disk, the previous tiebreaker rule is not decisive anymore, so the `CBlockIndexWorkComparator` has to default to its last rule: whatever block is loaded first (has a smaller memory address).
This means that if multiple same work tip candidates were available before restarting the node, it could be the case that the selected chain tip after restarting does not match the one before.
Therefore, the way `nSequenceId` is initialized is changed to:
- 0 for blocks that belong to the previously known best chain
- 1 to all other blocks loaded from disk
ACKs for top commit:
sipa:
utACK 0465574c12
TheCharlatan:
ACK 0465574c12
furszy:
Tested ACK 0465574c12.
Tree-SHA512: 161da814da03ce10c34d27d79a315460a9c98d019b85ee35bc5daa991ed3b6a2e69a829e421fc70d093a83cf7a2e403763041e594df39ed1991445e54c16532a
45bd891465 log: split assumevalid ancestry-failure-reason message (Lőrinc)
6c13a38ab5 log: separate script verification reasons (Lőrinc)
f2ea6f04e7 refactor: untangle assumevalid decision branches (Lőrinc)
9bc298556c validation: log initial script verification state (Lőrinc)
4fad4e992c test: add assumevalid scenarios scaffold (Lőrinc)
91ac64b0a6 log: reword `signature validations` to `script verification` in `assumevalid` log (Lőrinc)
Pull request description:
### Summary
Users can encounter cases where script checks are unexpectedly enabled (e.g. after reindex, or when `assumevalid`/`minimumchainwork` gates fail). Without an explicit line, they must infer state from the absence of a message, which is incomplete and error-prone.
The existing "Assuming ancestors of block …" line does not reliably indicate whether script checks are actually enabled, which makes debugging/benchmarking confusing.
### What this changes
We make the initial **script-verification** state explicit and log **why** checks are enabled to avoid confusion.
* Always log the first script-verification state on startup, **before** the first `UpdateTip`.
* Flatten the nested `assumevalid` conditionals into a linear gating sequence for readability.
* Extend the functional test to assert the old behavior with the new reason strings.
This is a **logging-only** test change it shouldn't change any other behavior.
### Example output
The state (with reason) is logged at startup and whenever the reason changes, e.g.:
* `Disabling script verification at block #904336 (000000000000000000014106b2082b1a18aaf3091e8b337c6fed110db8c56620).`
* `Enabling script verification at block #912527 (000000000000000000010bb6aa3ecabd7d41738463b6c6621776c2e40dbe738a): block too recent relative to best header.`
* `Enabling script verification at block #912684 (00000000000000000001375cf7b90b2b86e559d05ed92ca764d376702ead3858): block height above assumevalid height.`
------
Follow-up to https://github.com/bitcoin/bitcoin/pull/32975#discussion_r2329269037
ACKs for top commit:
Eunovo:
re-ACK 45bd891465
achow101:
ACK 45bd891465
hodlinator:
re-ACK 45bd891465
yuvicc:
ACK 45bd891465
andrewtoth:
ACK 45bd891465
ajtowns:
ACK 45bd891465
Tree-SHA512: 58328d7c418a6fe18f1c7fe1dd31955bb6fce8b928b0df693f6200807932eb5933146300af886a80a1d922228d93faf531145186dae55ad4ad1f691970732eca
652424ad16 test: additional test coverage for script_verify_flags (Anthony Towns)
417437eb01 script/verify_flags: extend script_verify_flags to 64 bits (Anthony Towns)
3cbbcb66ef script/interpreter: make script_verify_flag_name an ordinary enum (Anthony Towns)
bddcadee82 script/verify_flags: make script_verify_flags type safe (Anthony Towns)
a5ead122fe script/interpreter: introduce script_verify_flags typename (Anthony Towns)
4577fb2b1e rpc: have getdeploymentinfo report script verify flags (Anthony Towns)
a3986935f0 validation: export GetBlockScriptFlags() (Anthony Towns)
5db8cd2d37 Move mapFlagNames and FormatScriptFlags logic to script/interpreter.h (Anthony Towns)
Pull request description:
We currently use 21 of 32 possible bits for `SCRIPT_VERIFY_*` flags, with open PRs that may use 8 more (#29247, #31989, #32247, #32453). The mutinynet fork that has included many experimental soft fork features is [already reusing bits here](d4a86277ed/src/script/interpreter.h (L175-L195)). Therefore, bump this to 64 bits.
In order to make it easier to update this logic in future, this PR also introduces a dedicated type for the script flags, and disables implicit conversion between that type and the underlying integer type. To make verifying that this change doesn't cause flags to disappear, this PR also resurrects the changes from #28806 so that the script flags that are consensus enforced on each block can be queried via getdeploymentinfo.
ACKs for top commit:
instagibbs:
reACK 652424ad16
achow101:
ACK 652424ad16
darosior:
ACK 652424ad16
theStack:
Code-review ACK 652424ad16🎏
Tree-SHA512: 7b30152196cdfdef8b9700b571b7d7d4e94d28fbc5c26ea7532788037efc02e4b1d8de392b0b20507badfdc26f5c125f8356a479604a9149b8aae23a7cf5549f
When the assumevalid ancestry check fails, log a precise reason:
- "block height above assumevalid height" if the block is above the assumevalid block (the default reason)
- "block not in of assumevalid chain" otherwise
The new split was added under the existing condition to simplify conceptually that the two cases are related.
It could still be useful to know when the block is just above the assumevalid block or when it's not even on the same chain.
Update the functional test to assert the new reason strings. No behavior change.
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Flatten nested conditionals into a linear gating sequence for readability and precise logging. No functional change, TODOs are addressed in next commit