55b931934a removed duplicate calling of GetDescriptorScriptPubKeyMan (Saikiran)
Pull request description:
Removed duplicate call to GetDescriptorScriptPubKeyMan and
Instead of checking linearly I have used find method so time complexity reduced significantly for GetDescriptorScriptPubKeyMan
after this fix improved performance of importdescriptor part refs https://github.com/bitcoin/bitcoin/issues/32013.
**Steps to reproduce in testnet environment**
**Input size:** 2 million address in the wallet
**Step1:** call importaddresdescriptor rpc method
observe the time it has taken.
**With the provided fix:**
Do the same steps again
observe the time it has taken.
There is a huge improvement in the performance. (previously it may take 5 to 6 seconds now it will take 1 seconds or less)
main changes i've made during this pr:
1. remove duplicate call to GetDescriptorScriptPubKeyMan method
2. And inside GetDescriptorScriptPubKeyMan method previously we checking **each address linearly** so each time it is calling HasWallet method which has aquired lock.
3. Now i've modified this logic call **find method on the map (O(logn)**) time it is taking, so only once we calling HasWallet method.
**Note:** Smaller inputs in the wallet you may not see the issue but huge wallet size it will definitely impact the performance.
ACKs for top commit:
achow101:
ACK 55b931934a
w0xlt:
ACK 55b931934a
Tree-SHA512: 4a7fdbcbb4e55bd034e9cf28ab4e7ee3fb1745fc8847adb388c98a19c952a1fb66d7b54f0f28b4c2a75a42473923742b4a99fb26771577183a98e0bcbf87a8ca
3669ecd4cc doc: Document fuzz build options (Anthony Towns)
c1d01f59ac fuzz: enable running fuzz test cases in Debug mode (Anthony Towns)
Pull request description:
When building with
BUILD_FOR_FUZZING=OFF
BUILD_FUZZ_BINARY=ON
CMAKE_BUILD_TYPE=Debug
allow the fuzz binary to execute given test cases (without actual fuzzing) to make it easier to reproduce fuzz test failures in a more normal debug build.
In Debug builds, deterministic fuzz behaviour is controlled via a runtime variable, which is normally false, but set to true automatically in the fuzz binary, unless the FUZZ_NONDETERMINISM environment variable is set.
ACKs for top commit:
maflcko:
re-ACK 3669ecd4cc🏉
marcofleon:
re ACK 3669ecd4cc
ryanofsky:
Code review ACK 3669ecd4cc with just variable renamed and documentation added since last review
Tree-SHA512: 5da5736462f98437d0aa1bd01aeacb9d46a9cc446a748080291067f7a27854c89f560f3a6481b760b9a0ea15a8d3ad90cd329ee2a008e5e347a101ed2516449e
When building with
BUILD_FOR_FUZZING=OFF
BUILD_FUZZ_BINARY=ON
CMAKE_BUILD_TYPE=Debug
allow the fuzz binary to execute given test cases (without actual
fuzzing) to make it easier to reproduce fuzz test failures in a more
normal debug build.
In Debug builds, deterministic fuzz behaviour is controlled via a runtime
variable, which is normally false, but set to true automatically in the
fuzz binary, unless the FUZZ_NONDETERMINISM environment variable is set.
2835216ec0 txgraph: make GroupClusters use partition numbers directly (optimization) (Pieter Wuille)
c72c8d5d45 txgraph: compare sequence numbers instead of Cluster* (bugfix) (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289
The implicit transaction ordering for transactions in a TxGraphImpl is defined by:
1. higher chunk feerate first
2. lower Cluster* object pointer first
3. lower position within cluster linearization first.
Number (2) is not deterministic, as it intricately depends on the heap allocation algorithm. Fix this by giving each Cluster a unique `uint64_t m_sequence` value, and sorting by those instead.
The second commit then uses this new approach to optimize GroupClusters a bit more, avoiding some repeated checks and dereferences, by making a local copy of the involved sequence numbers.
Thanks to @dergoegge for pointing this out.
ACKs for top commit:
instagibbs:
reACK 2835216ec0
marcofleon:
ACK 2835216ec0
glozow:
utACK 2835216ec0
Tree-SHA512: d772a55b9ed620159b934a42a39fca7f900d4aa89c099a280a0c61ea0bd7c4fc39b388281ffc775064ea77b0b17263871b4c9763aa71c710a79287d5eb2cd4b4
fa6a007b8e fuzz: Avoid integer sanitizer warnings in policy_estimator target (MarcoFalke)
Pull request description:
It seems odd to write a fuzz target to trigger integer sanitizer warnings in `CBlockPolicyEstimator::processBlockTx` and then suppress them. If the scenario can happen in reality, the code should be properly fixed to handle the cases. If not, it seems better to fix the fuzz target to not trigger meaningless traces.
Do that here by keeping track of the current height and limiting mempool entries to at most this entry height.
ACKs for top commit:
brunoerg:
ACK fa6a007b8e
dergoegge:
utACK fa6a007b8e
Tree-SHA512: 2092017dc309fb095fe5d43cfb76efb691795f303d567ee919be2b5cac19a944293636229903dc4d1e8b9fe5daf9dc3058544321eff1735f91f804c3baa36cd0
ec81a72b36 net: Add randomized prefix to Tor stream isolation credentials (laanwj)
c47f81e8ac net: Rename `_randomize_credentials` Proxy parameter to `tor_stream_isolation` (laanwj)
Pull request description:
Add a class TorsStreamIsolationCredentialsGenerator that generates unique credentials based on a randomly generated session prefix and an atomic counter. Use this in `ConnectThroughProxy` instead of a simple atomic int counter.
This makes sure that different launches of the application won't share the same credentials, and thus circuits, even in edge cases.
Example with `-debug=proxy`:
```
2025-03-31T16:30:27Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-0:0afb2da441f5c105-0
2025-03-31T16:30:31Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-1:0afb2da441f5c105-1
```
Thanks to hodlinator in https://github.com/bitcoin/bitcoin/pull/32166#discussion_r2020973352 for the idea.
ACKs for top commit:
hodlinator:
re-ACK ec81a72b36
jonatack:
ACK ec81a72b36
danielabrozzoni:
tACK ec81a72b36
Tree-SHA512: 195f5885fade77545977b91bdc41394234ae575679cb61631341df443fd8482cd74650104e323c7dbfff7826b10ad61692cca1284d6810f84500a3488f46597a
faa3ce3199 fuzz: Avoid influence on the global RNG from peerman m_rng (MarcoFalke)
faf4c1b6fc fuzz: Disable unused validation interface and scheduler in p2p_headers_presync (MarcoFalke)
fafaca6cbc fuzz: Avoid setting the mock-time twice (MarcoFalke)
fad22149f4 refactor: Use MockableSteadyClock in ReportHeadersPresync (MarcoFalke)
fa9c38794e test: Introduce MockableSteadyClock::mock_time_point and ElapseSteady helper (MarcoFalke)
faf2d512c5 fuzz: Move global node id counter along with other global state (MarcoFalke)
fa98455e4b fuzz: Set ignore_incoming_txs in p2p_headers_presync (MarcoFalke)
faf2e238fb fuzz: Shuffle files before testing them (MarcoFalke)
Pull request description:
This should make the `p2p_headers_presync` fuzz target more deterministic.
Tracking issue: https://github.com/bitcoin/bitcoin/issues/29018.
The first commits adds an `ElapseSteady` helper and type aliases. The second commit uses those helpers in `ReportHeadersPresync` and in the fuzz target to increase determinism.
### Testing
It can be tested via (setting 32 parallel threads):
```
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../b-c-qa-assets/fuzz_corpora/ p2p_headers_presync 32
```
The failing diff is contained in the commit messages, if applicable.
ACKs for top commit:
Crypt-iQ:
tACK faa3ce3199
janb84:
Re-ACK [faa3ce3](faa3ce3199)
marcofleon:
ACK faa3ce3199
Tree-SHA512: 7e2e0ddf3b4e818300373d6906384df57a87f1eeb507fa43de1ba88cf03c8e6752a26b6e91bfb3ee26a21efcaf1d0d9eaf70d311d1637b671965ef4cb96e6b59
This should avoid the remaining non-determistic code coverage paths.
Without this patch, the tool would report a diff (only when running
without libFuzzer):
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
It should be sufficient to set it once. Especially, if the dynamic value
is only used by ResetAndInitialize.
This also avoids non-determistic code paths, when ResetAndInitialize may
re-initialize m_next_inv_to_inbounds.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
- 1126| 3| m_next_inv_to_inbounds = now + m_rng.rand_exp_duration(average_interval);
- 1127| 3| }
+ 1126| 10| m_next_inv_to_inbounds = now + m_rng.rand_exp_duration(average_interval);
+ 1127| 10| }
1128| 491| return m_next_inv_to_inbounds;
...
This allows the clock to be mockable in tests. Also, replace cs_main
with GetMutex() while touching this function.
Also, use the ElapseSteady test helper in the p2p_headers_presync fuzz
target to make it more deterministic.
The m_last_presync_update variable is a global that is not reset in
ResetAndInitialize. However, it is only used for logging, so completely
disable it for now.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
4468| 81| auto now = std::chrono::steady_clock::now();
4469| 81| if (now < m_last_presync_update + std::chrono::milliseconds{250}) return;
- ^80
+ ^79
...
The global m_headers_presync_stats is not reset in ResetAndInitialize.
This may lead to non-determinism.
Fix it by incrementing the global node id counter instead.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
2587| 3.73k| if (best_it == m_headers_presync_stats.end()) {
------------------
- | Branch (2587:17): [True: 80, False: 3.65k]
+ | Branch (2587:17): [True: 73, False: 3.66k]
------------------
...
When iterating over all fuzz input files in a folder, the order should
not matter.
However, shuffling may be useful to detect non-determinism.
Thus, shuffle in fuzz.cpp, when using neither libFuzzer, nor AFL.
Also, shuffle in the deterministic-fuzz-coverage tool, when using
libFuzzer.
Rather than use an ad-hoc reimplementation of wide multiplication inside the
fuzz test, reuse arith_uint256, which already has this. It's larger than what we
need here, but performance isn't a concern in this test, and it does what we need.
Add a class TorsStreamIsolationCredentialsGenerator that generates
unique credentials based on a randomly generated session prefix
and an atomic counter.
This makes sure that different launches of the application won't share
the same credentials, and thus circuits, even in edge cases.
Example with `-debug=proxy`:
```
2025-03-31T16:30:27Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-0:0afb2da441f5c105-0
2025-03-31T16:30:31Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-1:0afb2da441f5c105-1
```
Thanks to hodlinator for the idea.
57d8b1f1b3 cmake: Avoid fuzzer "multiple definition of `main'" errors (Ryan Ofsky)
Pull request description:
This change builds libraries with `-fsanitize=fuzzer-no-link` instead of `-fsanitize=fuzzer` when the cmake `-DSANITIZERS=fuzzer` option is specified. This is necessary to make fuzzing and IPC cmake options compatible with each other and avoid CI failures in #30975 which enables IPC in the fuzzer CI build:
https://cirrus-ci.com/task/5366255504326656?logs=ci#L2817https://cirrus-ci.com/task/5233064575500288?logs=ci#L2384
The failures can also be reproduced by checking out #31741 and building with `cmake -B build -DBUILD_FOR_FUZZING=ON -DSANITIZERS=fuzzer -DENABLE_IPC=ON` with this fix reverted.
The fix updates the cmake build so when `-DSANITIZERS=fuzzer` is specified, the fuzz test binary is built with `-fsanitize=fuzzer` (so it can use libFuzzer's main function), and libraries are built with `-fsanitize=fuzzer-no-link` (so they can be linked into other executables with their own main functions).
Previously when `-DSANITIZERS=fuzzer` was specified, `-fsanitize=fuzzer` was applied to ALL libraries and executables. This was inappropriate because it made it impossible to build any executables other than the fuzz test executable without triggering link errors:
- `` multiple definition of `main' ``
- `` "undefined reference to `LLVMFuzzerTestOneInput' ``
if they depended on any libraries instrumented for fuzzing.
This was especially a problem when the `ENABLE_IPC` option was set because it made building the `mpgen` code generator impossible so nothing else that depended on generated sources, including the fuzz test binary, could be built either.
This commit was previously part of https://github.com/bitcoin/bitcoin/pull/31741 and had some discussion there starting in https://github.com/bitcoin/bitcoin/pull/31741#pullrequestreview-2619682385
---
This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).
ACKs for top commit:
hebasto:
ACK 57d8b1f1b3, tested on Ubuntu 24.04.
Tree-SHA512: 4011adbc0b08742e83cf7c0560d3d5b5694a863358e6ac9a21239626b4a8fedceca66db34b5a46136a7b26849bb1d8710c894689322ae97e1c407687c3f57d50
This abstracts out the finding of the connected component that includes
a given element from FindConnectedComponent (which just finds any connected
component).
Use this in the txgraph fuzz test, which was effectively reimplementing this
logic. At the same time, improve its performance by replacing a vector with a
set.
b1de59e896 fuzz: extract unsequenced operations with side-effects (Lőrinc)
Pull request description:
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
<details>
<summary>Tried to find other occurrences</summary>
```bash
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
```bash
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
</details>
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855Fixes#32135
ACKs for top commit:
maflcko:
lgtm ACK b1de59e896
hodlinator:
ACK b1de59e896
marcofleon:
Nice, ACK b1de59e896
brunoerg:
code review ACK b1de59e896
Tree-SHA512: d66524424c7f749eba870f5bd6038da79666ac638047b31dd8ff15a77d927facb54b4735e8afb7984648fdc9e2dd59ea213996c352301fa05978f041511361d4
b2ea365648 txgraph: Add Get{Ancestors,Descendants}Union functions (feature) (Pieter Wuille)
54bceddd3a txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation) (Pieter Wuille)
aded047019 txgraph: Add CountDistinctClusters function (feature) (Pieter Wuille)
b685d322c9 txgraph: Add DoWork function (feature) (Pieter Wuille)
295a1ca8bb txgraph: Expose ability to compare transactions (feature) (Pieter Wuille)
22c68cd153 txgraph: Allow Refs to outlive the TxGraph (feature) (Pieter Wuille)
82fa3573e1 txgraph: Destroying Ref means removing transaction (feature) (Pieter Wuille)
6b037ceddf txgraph: Cache oversizedness of graphs (optimization) (Pieter Wuille)
8c70688965 txgraph: Add staging support (feature) (Pieter Wuille)
c99c7300b4 txgraph: Abstract out ClearLocator (refactor) (Pieter Wuille)
34aa3da5ad txgraph: Group per-graph data in ClusterSet (refactor) (Pieter Wuille)
36dd5edca5 txgraph: Special-case removal of tail of cluster (Optimization) (Pieter Wuille)
5801e0fb2b txgraph: Delay chunking while sub-acceptable (optimization) (Pieter Wuille)
57f5499882 txgraph: Avoid looking up the same child cluster repeatedly (optimization) (Pieter Wuille)
1171953ac6 txgraph: Avoid representative lookup for each dependency (optimization) (Pieter Wuille)
64f69ec8c3 txgraph: Make max cluster count configurable and "oversize" state (feature) (Pieter Wuille)
1d27b74c8e txgraph: Add GetChunkFeerate function (feature) (Pieter Wuille)
c80aecc24d txgraph: Avoid per-group vectors for clusters & dependencies (optimization) (Pieter Wuille)
ee57e93099 txgraph: Add internal sanity check function (tests) (Pieter Wuille)
05abf336f9 txgraph: Add simulation fuzz test (tests) (Pieter Wuille)
8ad3ed2681 txgraph: Add initial version (feature) (Pieter Wuille)
6eab3b2d73 feefrac: Introduce tagged wrappers to distinguish vsize/WU rates (Pieter Wuille)
d449773899 scripted-diff: (refactor) ClusterIndex -> DepGraphIndex (Pieter Wuille)
bfeb69f6e0 clusterlin: Make IsAcyclic() a DepGraph member function (Pieter Wuille)
0aa874a357 clusterlin: Add FixLinearization function + fuzz test (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289.
### 1. Overview
This introduces the `TxGraph` class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about `CTransaction`, inputs, outputs, txids, wtxids, prioritization, validatity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and write simulation-based tests for (which are included in this PR).
### 2. Interface
The interface can be largely categorized into:
* Mutation functions:
* `AddTransaction` (add a new transaction with specified feerate, and get a `Ref` object back to identify it).
* `RemoveTransaction` (given a `Ref` object, remove the transaction).
* `AddDependency` (given two `Ref` objects, add a dependency between them).
* `SetTransactionFee` (modify the fee associated with a Ref object).
* Inspector functions:
* `GetAncestors` (get the ancestor set in the form of `Ref*` pointers)
* `GetAncestorsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
* `GetDescendants` (get the descendant set in the form of `Ref*` pointers)
* `GetDescendantsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
* `GetCluster` (get the connected component set in the form of `Ref*` pointers, in the order they would be mined).
* `GetIndividualFeerate` (get the feerate of a transaction)
* `GetChunkFeerate` (get the mining score of a transaction)
* `CountDistinctClusters` (count the number of distinct clusters a list of `Ref`s belong to)
* Staging functions:
* `StartStaging` (make all future mutations operate on a proposed transaction graph)
* `CommitStaging` (apply all the changes that are staged)
* `AbortStaging` (discard all the changes that are staged)
* Miscellaneous functions:
* `DoWork` (do queued-up computations now, so that future operations are fast)
This `TxGraph::Ref` type used as a "handle" on transactions in the graph can be inherited from, and the idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), `CTxMempoolEntry` will inherit from it, and all actually used Ref objects will be `CTxMempoolEntry`s. With that, the mempool code can just cast any `Ref*` returned by txgraph to `CTxMempoolEntry*`.
### 3. Implementation
Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the `cluster_linearize.h` algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing to go to fully instantiated and acceptable cluster linearizations, in order:
1. `ApplyRemovals` (take batches of to-be-removed transactions and translate them to "holes" in the corresponding Clusters/DepGraphs).
2. `SplitAll` (creating holes in Clusters may cause them to break apart into smaller connected components, so make turn them into separate Clusters/linearizations).
3. `GroupClusters` (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
4. `ApplyDependencies` (actually merge Clusters as precomputed by `GroupClusters`, and add the dependencies between them).
5. `MakeAcceptable` (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).
### 4. Future work
This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:
* The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
* Mining interface (ability to iterate transactions quickly in mining score order) (see #31444).
* Eviction interface (reverse of mining order, plus memory usage accounting) (see #31444).
* Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see #31553).
* Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.
Then there are further improvements possible which would not block other work:
* Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
* The ability to have background thread(s) for improving cluster linearizations.
ACKs for top commit:
instagibbs:
reACK b2ea365648
ajtowns:
reACK b2ea365648
ismaelsadeeq:
reACK b2ea365648🚀
glozow:
ACK b2ea365648
Tree-SHA512: 0f86f73d37651fe47d469db1384503bbd1237b4556e5d50b1d0a3dd27754792d6fc3481f77a201cf2ed36c6ca76e0e44c30e175d112aacb53dfdb9e11d8abc6b
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced an unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
Tried:
```
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
```
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
In order to make it possible for higher layers to compare transaction quality
(ordering within the implicit total ordering on the mempool), expose a comparison
function and test it.
Before this commit, if a TxGraph::Ref object is destroyed, it becomes impossible
to refer to, but the actual corresponding transaction node in the TxGraph remains,
and remains indefinitely as there is no way to remove it.
Fix this by making the destruction of TxGraph::Ref trigger immediate removal of
the corresponding transaction in TxGraph, both in main and staging if it exists.
In order to make it easy to evaluate proposed changes to a TxGraph, introduce a
"staging" mode, where mutators (AddTransaction, AddDependency, RemoveTransaction)
do not modify the actual graph, but just a staging version of it. That staging
graph can then be commited (replacing the main one with it), or aborted (discarding
the staging).
Instead of leaving the responsibility on higher layers to guarantee that
no connected component within TxGraph (a barely exposed concept, except through
GetCluster()) exceeds the cluster count limit, move this responsibility to
TxGraph itself:
* TxGraph retains a cluster count limit, but it becomes configurable at construction
time (this primarily helps with testing that it is properly enforced).
* It is always allowed to perform mutators on TxGraph, even if they would cause the
cluster count limit to be exceeded. Instead, TxGraph exposes an IsOversized()
function, which queries whether it is in a special "oversize" state.
* During oversize state, many inspectors are unavailable, but mutators remain valid,
so the higher layer can "fix" the oversize state before continuing.
To make testing more powerful, expose a function to perform an internal sanity
check on the state of a TxGraph. This is especially important as TxGraphImpl
contains many redundantly represented pieces of information:
* graph contains clusters, which refer to entries, but the entries refer back
* graph maintains pointers to Ref objects, which point back to the graph.
This lets us make sure they are always in sync.
This adds a simulation fuzz test for txgraph, by comparing with a naive
reimplementation that models the entire graph as a single DepGraph, and
clusters in TxGraph as connected components within that DepGraph.
Since cluster_linearize.h does not actually have a Cluster type anymore, it is more
appropriate to rename the index type to DepGraphIndex.
-BEGIN VERIFY SCRIPT-
sed -i 's/Data type to represent transaction indices in clusters./Data type to represent transaction indices in DepGraphs and the clusters they represent./' $(git grep -l 'using ClusterIndex')
sed -i 's|\<ClusterIndex\>|DepGraphIndex|g' $(git grep -l 'ClusterIndex')
-END VERIFY SCRIPT-
This function takes an existing ordering for transactions in a DepGraph, and
makes it a valid linearization for it (i.e., topological). Any topological
prefix of the input remains untouched.
Removed duplicate call to GetDescriptorScriptPubKeyMan and
Instead of checking linearly I have used find method so time complexity reduced significantly for GetDescriptorScriptPubKeyMan
after this fix improved performance of importdescriptor part refs #32013.
63b534f97e fuzz: sanity check hardcoded snapshot in utxo_snapshot target (Antoine Poinsot)
3b85eba83a test util: split up ConnectBlock from MineBlock (Antoine Poinsot)
d1527f6b88 qa: correct off-by-one in utxo snapshot fuzz target (Antoine Poinsot)
Pull request description:
The assumeutxo data for the fuzz target could change and invalidate the hash silently, preventing the fuzz target from reaching some code paths. Fix this by introducing a unit test which would break if the snapshot data the fuzz target relies on were to change.
In implementing this i noticed the height used for coins in the fuzz target is actually off-by-one (as if the first block in the created chain was the genesis but it's block `1`), so fix that too.
ACKs for top commit:
mzumsande:
Code Review ACK 63b534f97e
fjahr:
tACK 63b534f97e
Tree-SHA512: 2399b6e74db9b78aab8efba67c57a405d2d7d880ae3b7d8518a1c96cc6266f61f5e77722cd999adeac5d3e03e73d84cf9ae7bdbcc0afae198cc87049dde4012f
ffff4a293a bench: Update span-serialize comment (MarcoFalke)
fa4d6ec97b refactor: Avoid false-positive gcc warning (MarcoFalke)
fa942332b4 scripted-diff: Bump copyright headers after std::span changes (MarcoFalke)
fa0c6b7179 refactor: Remove unused Span alias (MarcoFalke)
fade0b5e5e scripted-diff: Use std::span over Span (MarcoFalke)
fadccc26c0 refactor: Make Span an alias of std::span (MarcoFalke)
fa27e36717 test: Fix broken span_tests (MarcoFalke)
fadf02ef8b refactor: Return std::span from MakeUCharSpan (MarcoFalke)
fa720b94be refactor: Return std::span from MakeByteSpan (MarcoFalke)
Pull request description:
`Span` has some issues:
* It does not support fixed-size spans, which are available through `std::span`.
* It is confusing to have it available and in use at the same time with `std::span`.
* It does not obey the standard library iterator build hardening flags. See https://github.com/bitcoin/bitcoin/issues/31272 for a discussion. For example, this allows to catch issues like the one fixed in commit fabeca3458.
Both types are type-safe and can even implicitly convert into each other in most contexts.
However, exclusively using `std::span` seems less confusing, so do it here with a scripted-diff.
ACKs for top commit:
l0rinc:
reACK ffff4a293a
theuni:
ACK ffff4a293a.
Tree-SHA512: 9cc2f1f43551e2c07cc09f38b1f27d11e57e9e9bc0c6138c8fddd0cef54b91acd8b14711205ff949be874294a121910d0aceffe0e8914c4cff07f1e0e87ad5b8
fac3d93c2b fuzz: Speed up *_package_eval fuzz targets a bit (MarcoFalke)
fa40fd043a fuzz: [refactor] Avoid confusing c-style cast (MarcoFalke)
Pull request description:
Each target is at least 10% faster for me when running over the current set of qa-assets, which seems nice.
The changes `outpoints_value` from a map to an unordered map, which is safe, because the element order is not used in the fuzz test and the map is only used for lookup.
(`mempool_outpoints` can't be changed, because the order matters here. Using unordered_set here may result in a non-deterministic fuzz target, given the same fuzz input.)
ACKs for top commit:
l0rinc:
ACK fac3d93c2b
dergoegge:
Code review ACK fac3d93c2b
Tree-SHA512: 8ae5d4e281505aff76a4003d6e9ea388dbb73860e167385bd6a0a201b3acc939db29ee212594952a9e80e85b3cc4cd726ce6dd49551f74013cb4da8a15cbdfb3
4cd95a2921 refactor: modernize remaining outdated trait patterns (Lőrinc)
ab2b67fce2 scripted-diff: modernize outdated trait patterns - values (Lőrinc)
8327889f35 scripted-diff: modernize outdated trait patterns - types (Lőrinc)
Pull request description:
The use of [`std::underlying_type_t<T>`](https://en.cppreference.com/w/cpp/types/underlying_type) or [`std::is_enum_v<T>`](https://en.cppreference.com/w/cpp/types/is_enum) (and similar ones, introduced in C++14) replace the `typename std::underlying_type<T>::type` and `std::is_enum<T>::value` constructs (available in C++11).
The `_t` and `_v` helper alias templates offer a more concise way to extract the type and value directly.
I've modified the instances I found in the codebase one-by-one (noticed them while investigating https://github.com/bitcoin/bitcoin/pull/31868), and afterwards extracted scripted diff commits to do the trivial ones automatically.
The last commit contains the values that were easier done manually.
I've excluded changes from `src/bench/nanobench.h`, `src/leveldb`, `src/minisketch`, `src/span.h` and `src/sync.h` - let me know if you think they should be included instead.
A few of the code changes can also be reproduced by clang-tidy (but not all of them):
```bash
cmake -B build -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_BENCH=ON -DBUILD_FUZZ_BINARY=ON -DBUILD_FOR_FUZZING=ON && cmake --build build -j$(nproc)
run-clang-tidy -quiet -p build -j $(nproc) -checks='-*,modernize-type-traits' -fix $(git grep -lE '::(value|type)' ./src ':(exclude)src/bench/nanobench.h' ':(exclude)src/leveldb' ':(exclude)src/minisketch' ':(exclude)src/span.h' ':(exclude)src/sync.h')
```
ACKs for top commit:
laanwj:
Concept and code review ACK 4cd95a2921
Tree-SHA512: a4bcf0f267c0f4e02983b4d548ed6f58d464ec379ac5cd1f998b9ec0cf698b53a9f2557a05a342b661f1d94adefc9a0ce2dc8f764d49453aaea95451e2c4c581
d5537c18a9 fuzz: make sure DecodeBase58(Check) is called with valid values more often (Lőrinc)
bad1433ef2 fuzz: Always restrict base conversion input lengths (Lőrinc)
Pull request description:
This is a follow-up to https://github.com/bitcoin/bitcoin/pull/30746, expanding coverage by:
* restricting every input for the base58 conversions, capping max sizes to `100` instead of `1000` or all available input (suggested by marcofleon in https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1963718683) since most actual usage has lengths of e.g. `21`, `34`, `78`.
* providing more valid values to the decoder (suggested by maflcko in https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1957847712) by randomly providing a random input or a valid encoded one; this also enables unifying the roundtrip tests to a single roundtrip per fuzz.
ACKs for top commit:
mzumsande:
Code Review / lightly tested ACK d5537c18a9
maflcko:
review ACK d5537c18a9🚛
Tree-SHA512: 50365654cdac8a38708a7475eaa43396642b7337e2ee8999374c3faafff4f05457abc1a54c701211e0ed24d36c12af77bcad17b49695699be42664f2be660659