e976bd3045 validation: add randomness to periodic write interval (Andrew Toth)
2e2f410681 refactor: replace m_last_write with m_next_write (Andrew Toth)
b557fa7a17 refactor: rename fDoFullFlush to should_write (Andrew Toth)
d73bd9fbe4 validation: write chainstate to disk every hour (Andrew Toth)
0ad7d7abdb test: chainstate write test for periodic chainstate flush (Andrew Toth)
Pull request description:
Since #28233, periodically writing the chainstate to disk every 24 hours does not clear the dbcache. Since #28280, periodically writing the chainstate to disk is proportional only to the amount of dirty entries in the cache. Due to these changes, it is no longer beneficial to only write the chainstate to disk every 24 hours. The periodic flush interval was necessary because every write of the chainstate would clear the dbcache. Now, we can get rid of the periodic flush interval and simply write the chainstate along with blocks and block index at least every hour.
Three benefits of doing this:
1. For IBD or reindex-chainstate with a combination of large dbcache setting, slow CPU, slow internet speed/unreliable peers, it could be up to 24 hours until the chainstate is persisted to disk. A power outage or crash could potentially lose up to 24 hours of progress. If there is a very large amount of dirty cache entries, writing to disk when a flush finally does occur will take a very long time. Crashing during this window of writing can cause https://github.com/bitcoin/bitcoin/issues/11600. By syncing every hour in unison with the block index we avoid this problem. Only a maximum of one hour of progress can be lost, and the window for crashing during writing is much smaller. For IBD with lower dbcache settings, faster CPU, or better internet speed/reliable peers, chainstate writes are already triggered more often than every hour so this change will have no effect on IBD.
2. Based on discussion in #28280, writing only once every 24 hours during long running operation of a node causes IO spikes. Writing smaller chainstate changes every hour like we do with blocks and block index will reduce IO spikes.
3. Faster shutdown speeds. All dirty chainstate entries must be persisted to disk on shutdown. If we have a lot of dirty entries, such as when close to 24 hours or if we sync with a large dbcache, it can take a long time to shutdown. By keeping the chainstate clean we avoid this problem.
Inspired by [this comment](https://github.com/bitcoin/bitcoin/pull/28280#issuecomment-2121088705).
Resolves https://github.com/bitcoin/bitcoin/issues/11600
ACKs for top commit:
achow101:
ACK e976bd3045
davidgumberg:
utACK e976bd3045
sipa:
utACK e976bd3045
l0rinc:
ACK e976bd3045
Tree-SHA512: 5bccd8f1dea47f9820a3fd32fe3bb6841c0167b3d6870cc8f3f7e2368f124af1a914bca6acb06889cd7183638a8dbdbace54d3237c3683f2b567eb7355e015ee
7e8ef959d0 refactor: Fix Sonar rule `cpp:S4998` - avoid unique_ptr const& as parameter (Lőrinc)
e400ac5352 refactor: simplify repeated comparisons in `FindChallenges` (Lőrinc)
f670836112 test: remove old recursive `FindChallenges_recursive` implementation (Lőrinc)
b80d0bdee4 test: avoid stack overflow in `FindChallenges` via manual iteration (Lőrinc)
Pull request description:
`FindChallenges` explores the `Miniscript` node tree by going deep into the first child's subtree, then the second, and so on - effectively performing a pre-order Traversal (Depth-First Search) recursively, using the call stack which can result in stack overflows on Windows debug builds.
This change replaces the recursive implementation with an iterative version using an explicit stack. The new implementation also performs a pre-order depth-first traversal, though it processes children in right-to-left order (rather than left-to-right) due to the LIFO nature of the stack. Since both versions store results in a `std::set`, which automatically sorts and deduplicates elements, the exact traversal order doesn't affect the final result.
It is an alternative to increasing the Windows stack size, as proposed in #32349, and addresses the issue raised in #32341 by avoiding deep recursion altogether.
The change is done in two commits:
* add a new iterative `FindChallenges` method and rename the old method to `*_recursive` (to simplify the next commit where we remove it), asserting that its result matches the original;
* remove the original recursive implementation.
This approach avoids ignoring the `misc-no-recursion` warning as well.
I tried modifying the new method to store results in a vector instead, but it demonstrated that the deduplication provided by `std::set` was necessary. One example showing the need for deduplication:
Recursive (using set):
```
(6, 9070746)
(6, 19532513)
(6, 3343376967)
```
Iterative (using vector attempt):
```
(6, 19532513)
(6, 9070746)
(6, 3343376967)
(6, 9070746) // Duplicate entry
```
The performance of the test is the same as before, with the recursive method.
Fixes https://github.com/bitcoin/bitcoin/issues/32341
ACKs for top commit:
achow101:
ACK 7e8ef959d0
sipa:
utACK 7e8ef959d0
hodlinator:
re-ACK 7e8ef959d0
Tree-SHA512: 9e52eff82a7d76f5d37e3b74c508f08e5fced5386dad504bed111b27ed2b529008a6dd12a5116f009609a94c7ee7ebe3e80a759dda55dd1cb3ae52078f65ec71
f1b142856a test: Same addr, diff port is already connected (David Gumberg)
94e85a82a7 net: remove unnecessary check from AlreadyConnectedToAddress() (Vasil Dimov)
Pull request description:
`CConnman::AlreadyConnectedToAddress()` searches the existent nodes by address or by address-and-port:
```cpp
FindNode(static_cast<CNetAddr>(addr)) || FindNode(addr.ToStringAddrPort())
```
but:
* if there is a match by just the address, then the address-and-port search will not be evaluated and the whole condition will be `true`
* if the there is no node with the same address, then the second search by address-and-port will not find a match either.
The search by address-and-port is comparing against `CNode::m_addr_name` which could be a hostname, e.g. `"node.foobar.com:8333"`, but `addr.ToStringAddrPort()` is always going to be numeric.
---
In other words: let `A` be "CNetAddr equals" and `B` be "addr:port string matches", then:
* If `A` (is `true`), then `B` is irrelevant, so the condition `A || B` is equivalent to `A` is `true`.
* Observation in this PR: if `!A` (`A` is `false`), then `!B` for sure, thus the condition `A || B` is equivalent to `A` is `false`.
So, simplify `A || B` to `A`.
https://en.wikipedia.org/wiki/Modus_tollens `!A => !B` is equivalent to `B => A`. So the added fuzz test asserts that if `B` is `true`, then `A` is `true`.
ACKs for top commit:
davidgumberg:
crACK f1b142856a
achow101:
ACK f1b142856a
theuni:
utACK f1b142856a
mzumsande:
Code Review ACK f1b142856a
Tree-SHA512: d744b60e9bace121faa3a746463f6b6e0e6ef08eac0e7879326cbd5f4721e47e6e10f6203dfd3870a2057c4ddd1860692c070ef048a76d773b84e6c2f840cc86
e3014017ba test: add IsActiveAfter tests for versionbits (Anthony Towns)
60950f77c3 versionbits: docstrings for BIP9Info (Anthony Towns)
7565563bc7 tests: refactor versionbits fuzz test (Anthony Towns)
2e4e9b9608 tests: refactor versionbits unit test (Anthony Towns)
525c00f91b versionbits: Expose VersionBitsConditionChecker via impl header (Anthony Towns)
e74a7049b4 versionbits: Expose StateName function (Anthony Towns)
d00d1ed52c versionbits: Split out internal details into impl header (Anthony Towns)
37b9b67a39 versionbits: Simplify VersionBitsCache API (Anthony Towns)
1198e7d2fd versionbits: Move BIP9 status logic for getblocktemplate to versionbits (Anthony Towns)
b1e967c3ec versionbits: Move getdeploymentinfo logic to versionbits (Anthony Towns)
3bd32c2055 versionbits: Move WarningBits logic from validation to versionbits (Anthony Towns)
5da119e5d0 versionbits: Change BIP9Stats to uint32_t types (Anthony Towns)
a679040ec1 consensus/params: Move version bits period/threshold to bip9 param (Anthony Towns)
e9d617095d versionbits: Remove params from AbstractThresholdConditionChecker (Anthony Towns)
9bc41f1b48 versionbits: Use std::array instead of C-style arrays (Anthony Towns)
Pull request description:
Increases the encapsulation/modularity of the versionbits code, moving more of the logic into the versionbits module rather than having it scattered across validation and rpc code. Updates unit/fuzz tests to test the actual code used rather than just a close approximation of it.
ACKs for top commit:
achow101:
ACK e3014017ba
TheCharlatan:
Re-ACK e3014017ba
darosior:
ACK e3014017ba
Tree-SHA512: 2978db5038354b56fa1dd6aafd511099e9c16504d6a88daeac2ff2702c87bcf3e55a32e2f0a7697e3de76963b68b9d5ede7976ee007e45862fa306911194496d
The original recursive `FindChallenges` explores the Miniscript node tree using depth-first search. Specifically, it performs a pre-order traversal (processing the node's data, then recursively visiting children from left-to-right). This recursion uses the call stack, which can lead to stack overflows on platforms with limited stack space, particularly noticeable in Windows debug builds.
This change replaces the recursive implementation with an iterative version using an explicit stack. The iterative version also performs a depth-first search and processes the node's data before exploring children (preserving pre-order characteristics), although the children are explored in right-to-left order due to the LIFO nature of the explicit stack.
Critically, both versions collect challenges into a `std::set`, which automatically deduplicates and sorts elements. This ensures that not only the final result, but the actual state of the set at any equivalent point in traversal remains identical, despite the difference in insertion order.
This iterative approach is an alternative to increasing the default stack size (as proposed in #32349) and directly addresses the stack overflow issue reported in #32341 by avoiding deep recursion.
The change is done in two commits:
* add a new iterative `FindChallenges` method and rename the old method to `*_recursive` (to simplify removal in the next commit), asserting that its result matches the original;
* Remove the original recursive implementation.
This approach avoids needing to suppress `misc-no-recursion` warnings and provides a portable, low-risk fix.
Using a `std::set` is necessary for deduplication, matching the original function's behavior. An experiment using an `std::vector` showed duplicate challenges being added, confirming the need for the set:
Example failure with vector:
Recursive (set):
(6, 9070746)
(6, 19532513)
(6, 3343376967)
Iterative (vector attempt):
(6, 19532513)
(6, 9070746)
(6, 3343376967)
(6, 9070746) // Duplicate
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
`CConnman::AlreadyConnectedToAddress()` searches the existent nodes by
address or by address-and-port:
```cpp
FindNode(static_cast<CNetAddr>(addr)) || FindNode(addr.ToStringAddrPort())
```
but:
* if there is a match by just the address, then the address-and-port
search will not be evaluated and the whole condition will be `true`
* if the there is no node with the same address, then the second search
by address-and-port will not find a match either.
The search by address-and-port is comparing against `CNode::m_addr_name`
which could be a hostname, e.g. `"node.foobar.com:8333"`, but
`addr.ToStringAddrPort()` is always going to be numeric.
3dbd50a576 Fix failing util_time_GetTime test on Windows (VolodymyrBg)
Pull request description:
Remove unreliable steady clock time checking from the test that was causing CI failures primarily on Windows. The test previously tried to verify that steady_clock time increases after a 1ms sleep, but this approach is not reliable on all platforms where such a short sleep interval may not consistently result in observable clock changes.
This addresses issue #32197 where the test was reporting failures in the cross-built Windows CI environment. As noted in the discussion, the test is not critical to the functionality of Bitcoin Core, and removing the unreliable part is the most straightforward solution.
ACKs for top commit:
maflcko:
lgtm ACK 3dbd50a576
achow101:
ACK 3dbd50a576
laanwj:
re-ACK 3dbd50a576
Tree-SHA512: 25c80558d9587c7845d3c14464e8d263c8bd9838a510faf44926e5cda5178aee10b03a52464246604e5d27544011d936442ecfa1e4cdaacb66d32c35f7213902
Remove unreliable steady clock time checking from the test that was causing
CI failures primarily on Windows. The test previously tried to verify that
steady_clock time increases after a 1ms sleep, but this approach is not reliable
on all platforms where such a short sleep interval may not consistently result
in observable clock changes.
This addresses issue #32197 where the test was reporting failures in the
cross-built Windows CI environment. As noted in the discussion, the test is not
critical to the functionality of Bitcoin Core, and removing the unreliable part
is the most straightforward solution.
Rename and refocus util_time_GetTime test to util_mocktime
Co-Authored-By: maflcko <6399679+maflcko@users.noreply.github.com>
3c3548a70e validation: clarify final |= BLOCK_FAILED_VALID in InvalidateBlock (Matt Corallo)
aac5488909 validation: correctly update BlockStatus for invalid block descendants (stratospher)
9e29653b42 test: check BlockStatus when InvalidateBlock is used (stratospher)
c99667583d validation: fix traversal condition to mark BLOCK_FAILED_CHILD (stratospher)
Pull request description:
This PR addresses 3 issues related to how `BLOCK_FAILED_CHILD` is set:
1. In `InvalidateBlock()`
- Previously, `BLOCK_FAILED_CHILD` was not being set when it should have been.
- This was due to an incorrect traversal condition, which is fixed in this PR.
2. In `SetBlockFailure()`
- `BLOCK_FAILED_VALID` is now cleared before setting `BLOCK_FAILED_CHILD`.
3. In `InvalidateBlock()`
- if block is already marked as `BLOCK_FAILED_CHILD`, don't mark it as `BLOCK_FAILED_VALID` again.
Also adds a unit test to check `BLOCK_FAILED_VALID` and `BLOCK_FAILED_CHILD` status in `InvalidateBlock()`.
<details>
<summary><h3>looking for feedback on an alternate approach</h3></summary>
<br>
An alternate approach could be removing `BLOCK_FAILED_CHILD` since even though we have a distinction between
`BLOCK_FAILED_VALID` and `BLOCK_FAILED_CHILD` in the codebase, we don't use it for anything. Whenever we check for BlockStatus, we use `BLOCK_FAILED_MASK` which encompasses both of them. See similar discussion in https://github.com/bitcoin/bitcoin/pull/16856.
I have a branch with this approach in https://github.com/stratospher/bitcoin/commits/2025_02_remove_block_failed_child/.
Compared to the version in #16856, it also resets `BLOCK_FAILED_CHILD` already on disk to `BLOCK_FAILED_VALID` when loading from disk so that we won't be in a dirty state in a no-`BLOCK_FAILED_CHILD`-world.
I'm not sure if it's a good idea to remove `BLOCK_FAILED_CHILD` though. would be curious to hear what others think of this approach.
thanks @ mzumsande for helpful discussion regarding this PR!
</details>
ACKs for top commit:
achow101:
ACK 3c3548a70e
TheCharlatan:
Re-ACK 3c3548a70e
mzumsande:
re-ACK 3c3548a70e
Tree-SHA512: 83e0d29dea95b97519d4868135c965b86f6f43be50b15c0bd8f998b3476388fc7cc22b49c0c54ec532ae8222e57dfc436438f0c8e98f54757b384f220488b6a6
55b931934a removed duplicate calling of GetDescriptorScriptPubKeyMan (Saikiran)
Pull request description:
Removed duplicate call to GetDescriptorScriptPubKeyMan and
Instead of checking linearly I have used find method so time complexity reduced significantly for GetDescriptorScriptPubKeyMan
after this fix improved performance of importdescriptor part refs https://github.com/bitcoin/bitcoin/issues/32013.
**Steps to reproduce in testnet environment**
**Input size:** 2 million address in the wallet
**Step1:** call importaddresdescriptor rpc method
observe the time it has taken.
**With the provided fix:**
Do the same steps again
observe the time it has taken.
There is a huge improvement in the performance. (previously it may take 5 to 6 seconds now it will take 1 seconds or less)
main changes i've made during this pr:
1. remove duplicate call to GetDescriptorScriptPubKeyMan method
2. And inside GetDescriptorScriptPubKeyMan method previously we checking **each address linearly** so each time it is calling HasWallet method which has aquired lock.
3. Now i've modified this logic call **find method on the map (O(logn)**) time it is taking, so only once we calling HasWallet method.
**Note:** Smaller inputs in the wallet you may not see the issue but huge wallet size it will definitely impact the performance.
ACKs for top commit:
achow101:
ACK 55b931934a
w0xlt:
ACK 55b931934a
Tree-SHA512: 4a7fdbcbb4e55bd034e9cf28ab4e7ee3fb1745fc8847adb388c98a19c952a1fb66d7b54f0f28b4c2a75a42473923742b4a99fb26771577183a98e0bcbf87a8ca
3669ecd4cc doc: Document fuzz build options (Anthony Towns)
c1d01f59ac fuzz: enable running fuzz test cases in Debug mode (Anthony Towns)
Pull request description:
When building with
BUILD_FOR_FUZZING=OFF
BUILD_FUZZ_BINARY=ON
CMAKE_BUILD_TYPE=Debug
allow the fuzz binary to execute given test cases (without actual fuzzing) to make it easier to reproduce fuzz test failures in a more normal debug build.
In Debug builds, deterministic fuzz behaviour is controlled via a runtime variable, which is normally false, but set to true automatically in the fuzz binary, unless the FUZZ_NONDETERMINISM environment variable is set.
ACKs for top commit:
maflcko:
re-ACK 3669ecd4cc🏉
marcofleon:
re ACK 3669ecd4cc
ryanofsky:
Code review ACK 3669ecd4cc with just variable renamed and documentation added since last review
Tree-SHA512: 5da5736462f98437d0aa1bd01aeacb9d46a9cc446a748080291067f7a27854c89f560f3a6481b760b9a0ea15a8d3ad90cd329ee2a008e5e347a101ed2516449e
When building with
BUILD_FOR_FUZZING=OFF
BUILD_FUZZ_BINARY=ON
CMAKE_BUILD_TYPE=Debug
allow the fuzz binary to execute given test cases (without actual
fuzzing) to make it easier to reproduce fuzz test failures in a more
normal debug build.
In Debug builds, deterministic fuzz behaviour is controlled via a runtime
variable, which is normally false, but set to true automatically in the
fuzz binary, unless the FUZZ_NONDETERMINISM environment variable is set.
e3d7533ac9 test: improves tapscript unit tests (Ethan Heilman)
3e167085ba test: Ensures test fails if witness is not hex (Ethan Heilman)
Pull request description:
This commit creates new test utilities for future Taproot script tests within script_tests.json. The key features of this commit are the addition of three new tags: `#SCRIPT#`, `#CONTROLBLOCK#`, and `#TAPROOTOUTPUT#`. These tags streamline the test creation process by eliminating the need to manually generate these components outside the test suite.
* `#SCRIPT#`: Parses Tapscript and outputs a byte string of opcodes.
* `#CONTROLBLOCK#`: Automatically generates the control block for a given Taproot output.
* `#TAPROOTOUTPUT#`: Generates the final Taproot scriptPubKey.
This code was originally part of the OP_CAT PR https://github.com/bitcoin/bitcoin/pull/29247 but was pulled out into a separate PR to reduce the rebase treadmill for the OP_CAT PR.
Additionally this PR adds a check to ensure that if the witness data can not be parsed as hex the test fails. Prior to this PR, the test code would fail silently and set the values it couldn't parse as empty stack elements. This fix was suggested by @instagibbs.
## Rationale
While writing JSON script tests (script_tests.json) for https://github.com/bitcoin/bitcoin/pull/29247 we ran into the following problem. The JSON script tests are simple and easy to write for pre-Tapscript scripts, but adding or changing a Tapscript test requires substantial work per test. Consider the following pre-tapscript test:
```
["'aa' 'bb'", "CAT 0x4c 0x02 0xaabb EQUAL", "P2SH,STRICTENC", "DISABLED_OPCODE", "CAT disabled"]
````
whereas a Tapscript test for the same script (annotated with comments for better readability) would look like:
```
[
[
"aa",
"bb",
"7e4c02aabb87", // output script
"c0d6889cb081036e0faefa3a35157ad71086b123b2b144b649798b494c300a961d", // control block
0.00000001
],
"",
"0x51 0x20 0x15048ed3a65748549c27b671936987093cf73a4c9cb18522a74fb9553060ca99", // Tapscript output
"P2SH,WITNESS,TAPROOT",
"OK",
"TAPSCRIPT CATs aa and bb together and checks if EQUAL to aabb"
]
```
Computing the Tapscript output, such as `0x51 0x20 0x15048ed3a65748549c27b671936987093cf73a4c9cb18522a74fb9553060ca99`, requires writing custom code and running it for each test. The same is true for the Tapscript control block, such as `c0d6889cb081036e0faefa3a35157ad71086b123b2b144b649798b494c300a961d`. If a test is changed or updated new outputs and control blocks must be computed. The complexity of doing this is likely the reason that no one has added any Tapscript tests to JSON script tests until this PR.
In this PR we address this issue by adding the following improvements to JSON script tests:
Adding simple macros ("#SCRIPT# and #CONTROLBLOCK#) that allow the script test parser to automatically generate and inject a valid Tapscript output and control block to be computed automatically from the JSON script.
Allowing Tapscript scripts to use the human readable strings like pre-script scripts by marking the location of the script in the witness stack using #SCRIPT#. This transforms the unreadable script 7e4c02aabb87 into #SCRIPT# CAT 0x4c 0x02 0xaabb EQUAL.
This results in the following JSON script test which is far easier to write and easier to read.
```
[
[
"aa",
"bb",
"#SCRIPT# CAT",
"#CONTROLBLOCK#",
0.00000001
],
"",
"0x51 0x20 #TAPROOTOUTPUT#",
"P2SH,WITNESS,TAPROOT,OP_CAT",
"OK",
"TAPSCRIPT Test of OP_CAT flag by calling CAT on two elements. TAPSCRIPT_OP_CAT flag is set so CAT is executed."
],
```
ACKs for top commit:
instagibbs:
reACK e3d7533ac9
sipa:
utACK e3d7533ac9
janb84:
Re ACK [e3d7533](e3d7533ac9)
Tree-SHA512: 948c3ec28a4b2b222c2d77e48918ed19d298b51d64662fc20959073edd9978fc796516a392da9755a7e173f556e3021816dc6ce8eb3ed16bbe0fa6ebc574fd48
This commit creates new test utilities for future Taproot script
tests within script_tests.json. The key features of this commit are the
addition of three new tags: `#SCRIPT#`, `#CONTROLBLOCK#`, and
`#TAPROOTOUTPUT#`. These tags streamline the test creation process by
eliminating the need to manually generate these components outside the
test suite.
* `#SCRIPT#`: Parses Tapscript and outputs a byte string of opcodes.
* `#CONTROLBLOCK#`: Automatically generates the control block for a given
Taproot output.
* `#TAPROOTOUTPUT#`: Generates the final Taproot scriptPubKey.
Update src/test/script_tests.cpp
Co-authored-by: Jan B <608446+janb84@users.noreply.github.com>
5cb1241814 feefrac: avoid integer overflow in temporary (Pieter Wuille)
Pull request description:
In `FeeFrac::Div(__int128 n, int32_t d, bool round_down)` in src/util/feefrac.h, the following line computes the result:
```c++
return quot + (mod > 0) - (mod && round_down);
```
The function can only be called under conditions where the result is in range, and thus doesn't involve any integer overflow. However, the intermediary result computed by just `quot + (mod > 0)` may still overflow if it's going to be corrected by the `- (mod && round_down)` that follows.
Fix this by balancing the two correction steps with each other first:
```c++
return quot + ((mod > 0) - (mod && round_down));
```
Fixes#32294.
ACKs for top commit:
l0rinc:
Tested ACK 5cb1241814
maflcko:
lgtm ACK 5cb1241814
achow101:
ACK 5cb1241814
Tree-SHA512: 9daaccdf9acd7652d53b52cad2dc12872558265e863acdde2d6015f885cb87c0505f9bd5be5499fc0a0eded29bec719643f6af1fbc3604518143985094226c95
2835216ec0 txgraph: make GroupClusters use partition numbers directly (optimization) (Pieter Wuille)
c72c8d5d45 txgraph: compare sequence numbers instead of Cluster* (bugfix) (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289
The implicit transaction ordering for transactions in a TxGraphImpl is defined by:
1. higher chunk feerate first
2. lower Cluster* object pointer first
3. lower position within cluster linearization first.
Number (2) is not deterministic, as it intricately depends on the heap allocation algorithm. Fix this by giving each Cluster a unique `uint64_t m_sequence` value, and sorting by those instead.
The second commit then uses this new approach to optimize GroupClusters a bit more, avoiding some repeated checks and dereferences, by making a local copy of the involved sequence numbers.
Thanks to @dergoegge for pointing this out.
ACKs for top commit:
instagibbs:
reACK 2835216ec0
marcofleon:
ACK 2835216ec0
glozow:
utACK 2835216ec0
Tree-SHA512: d772a55b9ed620159b934a42a39fca7f900d4aa89c099a280a0c61ea0bd7c4fc39b388281ffc775064ea77b0b17263871b4c9763aa71c710a79287d5eb2cd4b4
fa6a007b8e fuzz: Avoid integer sanitizer warnings in policy_estimator target (MarcoFalke)
Pull request description:
It seems odd to write a fuzz target to trigger integer sanitizer warnings in `CBlockPolicyEstimator::processBlockTx` and then suppress them. If the scenario can happen in reality, the code should be properly fixed to handle the cases. If not, it seems better to fix the fuzz target to not trigger meaningless traces.
Do that here by keeping track of the current height and limiting mempool entries to at most this entry height.
ACKs for top commit:
brunoerg:
ACK fa6a007b8e
dergoegge:
utACK fa6a007b8e
Tree-SHA512: 2092017dc309fb095fe5d43cfb76efb691795f303d567ee919be2b5cac19a944293636229903dc4d1e8b9fe5daf9dc3058544321eff1735f91f804c3baa36cd0
The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later.
Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention.
Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly.
Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality.
The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance.
Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns.
------
> macOS Sequoia 15.3.1
> C++ compiler .......................... Clang 19.1.7
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000
Before:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 2,271,441.67 | 440.25 | 0.1% | 11.00 | `ReadBlockBench`
After:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,738,971.29 | 575.05 | 0.2% | 10.97 | `ReadBlockBench`
------
> Ubuntu 24.04.2 LTS
> C++ compiler .......................... GNU 13.3.0
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000
Before:
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 6,895,987.11 | 145.01 | 0.0% | 71,055,269.86 | 23,977,374.37 | 2.963 | 5,074,828.78 | 0.4% | 22.00 | `ReadBlockBench`
After:
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 5,771,882.71 | 173.25 | 0.0% | 65,741,889.82 | 20,453,232.33 | 3.214 | 3,971,321.75 | 0.3% | 22.01 | `ReadBlockBench`
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
Renames the constant to be less verbose and better reflect its purpose:
it represents the size of the storage header that precedes serialized block data on disk,
not to be confused with a block's own header.
-BEGIN VERIFY SCRIPT-
git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1
sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE')
-END VERIFY SCRIPT-
babb9f5db6 depends: remove non-native libmultiprocess build (Cory Fields)
5d105fb8c3 depends: Switch libmultiprocess packages to use local git subtree (Ryan Ofsky)
9b35518d2f depends, moveonly: split up int_get_build_id function (Ryan Ofsky)
2d373e2707 lint: Add exclusions for libmultiprocess subtree (Ryan Ofsky)
e88ab394c1 doc: Update documentation to explain libmultiprocess subtree (Ryan Ofsky)
d4bc563982 cmake: Fix clang-tidy "no input files" errors (Ryan Ofsky)
abdf3cb645 cmake: Fix warnings from boost headers (Ryan Ofsky)
8532fcb1c3 cmake: Fix ctest mptest "Unable to find executable" errors (Ryan Ofsky)
d597ab1dee cmake: Support building with libmultiprocess subtree (Ryan Ofsky)
69f0d4adb7 scripted-diff: s/WITH_MULTIPROCESS/ENABLE_IPC/ in cmake (Ryan Ofsky)
a2f28e4be9 Squashed 'src/ipc/libmultiprocess/' content from commit 35944ffd23fa (Ryan Ofsky)
d6244f85c5 depends: Update libmultiprocess library to simplify cmake subtree build (Ryan Ofsky)
Pull request description:
This adds the [libmultiprocess](https://github.com/chaincodelabs/libmultiprocess) library and code generator as a subtree in `src/ipc/libmultiprocess` and allows it to be built with the cmake `-DENABLE_IPC` option, which is disabled by default.
This PR does not entirely remove the depends system [libmultiprocess package](https://github.com/bitcoin/bitcoin/blob/master/depends/packages/native_libmultiprocess.mk) because the package is useful when cross compiling. (A cross-compiling cmake build cannot easily build and run a native code generation tool.) However, it does update the depends package to build from the new git subtree, instead of being downloaded separately from github, so the same sources are used to build both the runtime library and the code generator.
This PR includes the following manual changes (not created automatically with `git subtree add`) which just update the build system and documentation:
- [`d6244f85c509` depends: Update libmultiprocess library to simplify cmake subtree build](d6244f85c5)
- [`69f0d4adb72c` scripted-diff: s/WITH_MULTIPROCESS/ENABLE_IPC/ in cmake](69f0d4adb7)
- [`d597ab1dee6b` cmake: Support building with libmultiprocess subtree](d597ab1dee)
- [`8532fcb1c30d` cmake: Fix ctest mptest "Unable to find executable" errors](8532fcb1c3)
- [`abdf3cb6456f` cmake: Fix warnings from boost headers](abdf3cb645)
- [`d4bc5639829f` cmake: Fix clang-tidy "no input files" errors](d4bc563982)
- [`e88ab394c163` doc: Update documentation to explain libmultiprocess subtree](e88ab394c1)
- [`2d373e27071f` lint: Add exclusions for libmultiprocess subtree](2d373e2707)
- [`9b35518d2f3f` depends, moveonly: split up int_get_build_id function](9b35518d2f)
- [`5d105fb8c3ff` depends: Switch libmultiprocess packages to use local git subtree](5d105fb8c3)
- [`babb9f5db641` depends: remove non-native libmultiprocess build](babb9f5db6)
---
Previous minisketch subtree PR #23114 may be useful for comparison
Instructions for subtree verification can be found:
- https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#subtrees
- https://github.com/bitcoin/bitcoin/tree/master/test/lint#git-subtree-checksh
TL&DR:
```sh
git remote add --fetch libmultiprocess https://github.com/chaincodelabs/libmultiprocess.git
test/lint/git-subtree-check.sh -r src/ipc/libmultiprocess
```
---
This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).
ACKs for top commit:
Sjors:
re-ACK babb9f5db6
TheCharlatan:
tACK babb9f5db6
vasild:
ACK babb9f5db6
Tree-SHA512: 43d4eecca5aab63e55c613de935965666eaced327f9fe859a0e9c9b85f7685dc16c5c8d6e03e09ca998628c5d468633f4f743529930b037049abe8e0101e0143
ec81a72b36 net: Add randomized prefix to Tor stream isolation credentials (laanwj)
c47f81e8ac net: Rename `_randomize_credentials` Proxy parameter to `tor_stream_isolation` (laanwj)
Pull request description:
Add a class TorsStreamIsolationCredentialsGenerator that generates unique credentials based on a randomly generated session prefix and an atomic counter. Use this in `ConnectThroughProxy` instead of a simple atomic int counter.
This makes sure that different launches of the application won't share the same credentials, and thus circuits, even in edge cases.
Example with `-debug=proxy`:
```
2025-03-31T16:30:27Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-0:0afb2da441f5c105-0
2025-03-31T16:30:31Z [proxy] SOCKS5 sending proxy authentication 0afb2da441f5c105-1:0afb2da441f5c105-1
```
Thanks to hodlinator in https://github.com/bitcoin/bitcoin/pull/32166#discussion_r2020973352 for the idea.
ACKs for top commit:
hodlinator:
re-ACK ec81a72b36
jonatack:
ACK ec81a72b36
danielabrozzoni:
tACK ec81a72b36
Tree-SHA512: 195f5885fade77545977b91bdc41394234ae575679cb61631341df443fd8482cd74650104e323c7dbfff7826b10ad61692cca1284d6810f84500a3488f46597a
faa3ce3199 fuzz: Avoid influence on the global RNG from peerman m_rng (MarcoFalke)
faf4c1b6fc fuzz: Disable unused validation interface and scheduler in p2p_headers_presync (MarcoFalke)
fafaca6cbc fuzz: Avoid setting the mock-time twice (MarcoFalke)
fad22149f4 refactor: Use MockableSteadyClock in ReportHeadersPresync (MarcoFalke)
fa9c38794e test: Introduce MockableSteadyClock::mock_time_point and ElapseSteady helper (MarcoFalke)
faf2d512c5 fuzz: Move global node id counter along with other global state (MarcoFalke)
fa98455e4b fuzz: Set ignore_incoming_txs in p2p_headers_presync (MarcoFalke)
faf2e238fb fuzz: Shuffle files before testing them (MarcoFalke)
Pull request description:
This should make the `p2p_headers_presync` fuzz target more deterministic.
Tracking issue: https://github.com/bitcoin/bitcoin/issues/29018.
The first commits adds an `ElapseSteady` helper and type aliases. The second commit uses those helpers in `ReportHeadersPresync` and in the fuzz target to increase determinism.
### Testing
It can be tested via (setting 32 parallel threads):
```
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../b-c-qa-assets/fuzz_corpora/ p2p_headers_presync 32
```
The failing diff is contained in the commit messages, if applicable.
ACKs for top commit:
Crypt-iQ:
tACK faa3ce3199
janb84:
Re-ACK [faa3ce3](faa3ce3199)
marcofleon:
ACK faa3ce3199
Tree-SHA512: 7e2e0ddf3b4e818300373d6906384df57a87f1eeb507fa43de1ba88cf03c8e6752a26b6e91bfb3ee26a21efcaf1d0d9eaf70d311d1637b671965ef4cb96e6b59
868816d962 refactor: Remove SetHexDeprecated (marcofleon)
6b63218ec2 qt: Update SetHexDeprecated to FromHex (marcofleon)
Pull request description:
This is part of https://github.com/bitcoin/bitcoin/pull/32189. I'm separating this out because it's not immediately obvious that it's just a refactor. `SetHexDeprecated()` doesn't do any correctness checks on the input, while `FromHex()` does, so it's theoretically possible that there's a behavior change.
Replaces `uint256::SetHexDeprecated()` calls with `Txid::FromHex()` in four locations:
- `TransactionTableModel::updateTransaction`
- `TransactionView::contextualMenu`
- `TransactionView::abandonTx`
- `TransactionView::bumpFee`
The input strings in these cases aren't user input, so they should only be valid hex strings from `GetHex()` (through `TransactionRecord::getTxHash()`). These conversions should be safe without additional checks.
ACKs for top commit:
laanwj:
Code review ACK 868816d962
w0xlt:
Code review ACK 868816d962
BrandonOdiwuor:
Code Review ACK 868816d962
TheCharlatan:
ACK 868816d962
hebasto:
ACK 868816d962, I have reviewed the code and it looks OK.
Tree-SHA512: 121f149dcc7358231d0327cb3212ec96486a88410174d3c74ab8cbd61bad35185bc0a9740d534492b714811f72a6736bc7ac6eeae590c0ea1365c61cc791da37
This should avoid the remaining non-determistic code coverage paths.
Without this patch, the tool would report a diff (only when running
without libFuzzer):
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
It should be sufficient to set it once. Especially, if the dynamic value
is only used by ResetAndInitialize.
This also avoids non-determistic code paths, when ResetAndInitialize may
re-initialize m_next_inv_to_inbounds.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
- 1126| 3| m_next_inv_to_inbounds = now + m_rng.rand_exp_duration(average_interval);
- 1127| 3| }
+ 1126| 10| m_next_inv_to_inbounds = now + m_rng.rand_exp_duration(average_interval);
+ 1127| 10| }
1128| 491| return m_next_inv_to_inbounds;
...
This allows the clock to be mockable in tests. Also, replace cs_main
with GetMutex() while touching this function.
Also, use the ElapseSteady test helper in the p2p_headers_presync fuzz
target to make it more deterministic.
The m_last_presync_update variable is a global that is not reset in
ResetAndInitialize. However, it is only used for logging, so completely
disable it for now.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
4468| 81| auto now = std::chrono::steady_clock::now();
4469| 81| if (now < m_last_presync_update + std::chrono::milliseconds{250}) return;
- ^80
+ ^79
...
This refactor clarifies that the MockableSteadyClock::mock_time_point
has millisecond precision by defining a type an using it.
Moreover, a ElapseSteady helper is added which can be re-used easily.
The global m_headers_presync_stats is not reset in ResetAndInitialize.
This may lead to non-determinism.
Fix it by incrementing the global node id counter instead.
Without this patch, the tool would report a diff:
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../qa-assets/fuzz_corpora/ p2p_headers_presync 32
...
2587| 3.73k| if (best_it == m_headers_presync_stats.end()) {
------------------
- | Branch (2587:17): [True: 80, False: 3.65k]
+ | Branch (2587:17): [True: 73, False: 3.66k]
------------------
...
When iterating over all fuzz input files in a folder, the order should
not matter.
However, shuffling may be useful to detect non-determinism.
Thus, shuffle in fuzz.cpp, when using neither libFuzzer, nor AFL.
Also, shuffle in the deterministic-fuzz-coverage tool, when using
libFuzzer.