b639417b39 net: Add Tor extended SOCKS5 error codes (laanwj)
Pull request description:
Add support for reporting Tor extended SOCKS5 error codes as defined here:
- https://spec.torproject.org/socks-extensions.html#extended-error-codes
- https://gitlab.torproject.org/tpo/core/arti/-/blob/main/crates/tor-socksproto/src/msg.rs?ref_type=heads#L183
These give a more direct indication of the problem in case of errors connecting to hidden services, for example:
```
2025-04-02T10:34:13Z [net] Socks5() connect to [elided].onion:8333 failed: onion service descriptor can not be found
```
In the C Tor implementation, to get these one should set the "ExtendedErrors" flag on the "SocksPort" definition, introduced in version 0.4.3.1.
In Arti, extended error codes are always enabled.
Also, report the raw error code in case of unknown reply values.
ACKs for top commit:
1440000bytes:
utACK b639417b39
w0xlt:
utACK b639417b39
pablomartin4btc:
utACK b639417b39
Tree-SHA512: b30e65cb0f5c9183701373b0ee64cdec40680a3de1a1a365b006538c4d0b7ca8a047d7c6f81a7f5b8a36bae3a20b47a4c2a9850423c7034866e3837fa8fdbfe2
e419b0e17f refactor: Remove manual CDBBatch size estimation (Lőrinc)
8b5e19d8b5 refactor: Delegate to LevelDB for CDBBatch size estimation (Lőrinc)
751077c6e2 Coins: Add `kHeader` to `CDBBatch::size_estimate` (Lőrinc)
Pull request description:
### Summary
The manual batch size estimation of `CDBBatch` serialized size was [added](e66dbde6d1) when LevelDB [didn't expose this functionality yet](https://github.com/google/leveldb/commit/69e2bd2).
The PR refactors the logic to use the native `leveldb::WriteBatch::ApproximateSize()` function, structured in 3 focused commits to incrementally replace the old behavior safely.
### Context
The previous manual size calculation initialized the estimate to 0, instead of LevelDB's header size (containing an 8-byte sequence number followed by a 4-byte count).
This PR corrects that and transitions to the now-available native LevelDB function for improved accuracy and maintainability.
### Approach
The fix and refactor follow a strangle pattern over three commits:
* correct the initialization bug in the existing manual calculation, isolating the fix and ensuring the subsequent assertions use the corrected logic;
* introduce the native `ApproximateSize()` method alongside the corrected manual one, adding assertions to verify their equivalence at runtime;
* remove the verified manual calculation logic and assertions, leaving only the native method.
ACKs for top commit:
sipa:
utACK e419b0e17f
TheCharlatan:
ACK e419b0e17f
laanwj:
Code review ACK e419b0e17f
Tree-SHA512: a12b973dd480d4ffec4ec89a119bf0b6f73bde4e634329d6e4cc3454b867f2faf3742b78ec4a3b6d98ac4fb28fb2174f44ede42d6c701ed871987a7274560691
Rather than use an ad-hoc reimplementation of wide multiplication inside the
fuzz test, reuse arith_uint256, which already has this. It's larger than what we
need here, but performance isn't a concern in this test, and it does what we need.
Since C++20, operator!= is implicitly defaulted using operator==, and
operator<, operator<=, operator>, and operator>= are defaulted using
operator<=>, so it suffices to just provide these two.
Remove the manual batch size estimation logic (`SizeEstimate()` method and `size_estimate` member) from `CDBBatch`.
Size is now determined solely by the `ApproximateSize()` method introduced in the previous commit, which delegates to the native LevelDB function.
The manual calculation is no longer necessary as LevelDB now provides this functionality directly, and the previous commit verified that the native function's results matched the manual estimation.
Assertions comparing the two methods are removed from `txdb.cpp`.
Co-authored-by: Wladimir J. van der Laan <laanwj@protonmail.com>
Serialized batch size can be queried via the underlying LevelDB implementation calling the native `leveldb::WriteBatch::ApproximateSize()`.
The previous manual calculation was added in e66dbde6d1 as part of https://github.com/bitcoin/bitcoin/pull/10195. At that time (April 2017), the version of LevelDB used by Bitcoin Core (and even the latest source) lacked a native function for this. LevelDB added this capability in 69e2bd224b, merged later that year.
The old manual estimation method (`SizeEstimate()`) is kept temporarily in this commit, and assertions are added in `txdb.cpp` to verify its results against `ApproximateSize()` during batch writes. This ensures the native function behaves as expected before removing the manual calculation in the subsequent commit.
The initialization of the manual `size_estimate` in `CDBBatch::Clear()` is corrected from `0` to `kHeader` (LevelDB's fixed batch header size).
This aligns the manual estimate with LevelDB's actual size immediately after clearing, fixing discrepancies that would otherwise be caught by tests in the next commit (e.g., `coins_tests`, `validation_chainstatemanager_tests`).
a40bd374aa Get*Union: disallow nulltpr Refs (Greg Sanders)
57433502e6 CountDistinctClusters: nullptrs disallowed (Greg Sanders)
8bca0d325a TxGraphImpl::Compact: m_main_clusterset.m_removed is always empty (Greg Sanders)
2c5cf987e9 TxGraphImpl::PullIn: only allowed when staging exists (Greg Sanders)
Pull request description:
Was looking at my local coverage report, and noticed a few spots that will not or cannot be hit.
CountDistinctClusters, GetAncestorsUnion, and GetDescendantsUnion accept nullptrs, but the test harness never employs them. Disallow them.
We never call PullIn whenever there isn't staging, so just enforce that invariant via assertion.
Remaining places that are not covered:
1) Relinearize: Currently we seem to always start with a cold (not known to be optimal) cluster, and after one attempt at linearization result into something optimal. This means we never shortcircuit, nor run PostLinearization, nor store the quality as ACCEPTABLE. Reducing iterations causes these lines to be hit. sipa says he will take this on as varying the amount of iterations was meant to be done eventually anyways.
2) We never do a move assignment operator when the lvalue already has a `m_graph` (so we never call UnlinkRef) 3358b1d105/src/txgraph.cpp (L2097)
3) We never use the move constructor: 3358b1d105/src/txgraph.cpp (L2108)
ACKs for top commit:
sipa:
utACK a40bd374aa
glozow:
utACK a40bd374aa
Tree-SHA512: ca88297222e80e0d590889698899f892b9335cfa587a76a6c6ca62c8d846f208b6b0b9a9b1829bafabdb929a1a0c3a75f23edf7dd2b4f5e2dad0235e5bc68ba3
Add support for reporting Tor extended SOCKS5 error codes as defined
here:
- https://spec.torproject.org/socks-extensions.html#extended-error-codes
- https://gitlab.torproject.org/tpo/core/arti/-/blob/main/crates/tor-socksproto/src/msg.rs?ref_type=heads#L183
These give a more direct indication of the problem in case of errors
connecting to hidden services, for example:
```
2025-04-02T10:34:13Z [net] Socks5() connect to [elided].onion:8333 failed: onion service descriptor can not be found
```
In the C Tor implementation, to get these one should set the
"ExtendedErrors" flag on the "SocksPort" definition, introduced in
version 0.4.3.1.
In Arti, extended error codes are always enabled.
Also, report the raw error code in case of unknown reply values.
fa51310121 contrib: Warn about using libFuzzer for coverage check (MarcoFalke)
fa17cdb191 test: Avoid script check worker threads while fuzzing (MarcoFalke)
fa900bb2dc contrib: Only print fuzz output on failure (MarcoFalke)
fa82fe2c73 contrib: Use -Xdemangler=llvm-cxxfilt in deterministic-*-coverage (MarcoFalke)
fa7e931130 contrib: Add optional parallelism to deterministic-fuzz-coverage (MarcoFalke)
Pull request description:
This should make the `partially_downloaded_block` fuzz target even more deterministic.
Follow-up to https://github.com/bitcoin/bitcoin/pull/31841. Tracking issue: https://github.com/bitcoin/bitcoin/issues/29018.
This bundles several changes:
* First, speed up the `deterministic-fuzz-coverage` helper by introducing parallelism.
* Then, a fix to remove spawned test threads or spawn them deterministically. (While testing this, high parallelism and thread contention may be needed)
### Testing
It can be tested via (setting 32 parallel threads):
```
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../b-c-qa-assets/fuzz_corpora/ partially_downloaded_block 32
```
Locally, on a failure, the output would look like:
```diff
....
- 150| 0| m_worker_threads.emplace_back([this, n]() {
- 151| 0| util::ThreadRename(strprintf("scriptch.%i", n));
+ 150| 1| m_worker_threads.emplace_back([this, n]() {
+ 151| 1| util::ThreadRename(strprintf("scriptch.%i", n));
...
```
This excerpt likely indicates that the script threads were started after the fuzz init function returned.
Similarly, for the scheduler thread, it would look like:
```diff
...
227| 0| m_node.scheduler = std::make_unique<CScheduler>();
- 228| 1| m_node.scheduler->m_service_thread = std::thread(util::TraceThread, "scheduler", [&] { m_node.scheduler->serviceQueue(); });
+ 228| 0| m_node.scheduler->m_service_thread = std::thread(util::TraceThread, "scheduler", [&] { m_node.scheduler->serviceQueue(); });
229| 0| m_node.validation_signals =
...
```
ACKs for top commit:
Prabhat1308:
re-ACK [`fa51310`](fa51310121)
hodlinator:
re-ACK fa51310121
janb84:
Re-ACK [fa51310](fa51310121)
Tree-SHA512: 1a935eb19da98c7c3810b8bcc5287e5649ffb55bf50ab78c414a424fef8e703839291bb24040a552c49274a4a0292910a00359bdff72fa29a4f53ad36d7a8720
28dc118001 fuzz: wallet: fix crypter target (brunoerg)
Pull request description:
The crypter target has an issue, it's calling `DecryptKey` with a random secret and a random public key that will unlikely be related to the key used to encrypt, so it won't have any effect. This PR changes fixes it and also removes the `DecryptSecret` call since this function is already (and only) called within `DecryptKey`.
ACKs for top commit:
maflcko:
lgtm ACK 28dc118001🥊
Tree-SHA512: e96b7d33879bf06eeec0726e74e8e0d7020997659bf97dfca5d7c1a7ba65c4d93c78e666b97eebde110564cef2eefc7209d3e3586e4658145827b14d1b01dfc9
fa69c42fdf refactor: Remove spurious virtual from final ~CZMQNotificationInterface (MarcoFalke)
Pull request description:
`virtual` does not make sense here, because:
* The class is `final`, thus the destructor isn't overridden in a derived class
* The destructor also isn't overriding the destructor of the base, clarified in commit 2b3ea39de4
* Clang 21 may warn about this
```
src/zmq/zmqnotificationinterface.h:25:13: error: virtual method '~CZMQNotificationInterface' is inside a 'final' class and can never be overridden [-Werror,-Wunnecessary-virtual-specifier]
25 | virtual ~CZMQNotificationInterface();
| ^
```
Fix all issues by removing it.
ACKs for top commit:
davidgumberg:
crACK fa69c42fdf
janb84:
ACK [fa69c42](fa69c42fdf)
TheCharlatan:
ACK fa69c42fdf
Tree-SHA512: 26ea977f31fe24c116d68dea6c583de7c6fc480877e1baefcde11db4ac191e352027d492ee6ad69a60fe4ff537e0841c638b3a3e81356d9e00c60030845fc96e
8e4a0ddd50 torcontrol: Add comment explaining Proxy credential randomization for Tor privacy (Eval EXEC)
ec5c0b26ce torcontrol: Define tor reply code as const to improve maintainability (Eval EXEC)
Pull request description:
This PR want to:
1. replace tor repy code with const to improve out maintainability.
2. cherry-picked https://github.com/bitcoin/bitcoin/pull/31973 , add comment to explain Proxy credential randomization for Tor privacy
ACKs for top commit:
hodlinator:
re-ACK 8e4a0ddd50
laanwj:
re-ACK 8e4a0ddd50
Tree-SHA512: 038daa6508ca88fceed5c8e155430614cb56976f36d1f8baee5114bca1141122cf94f51814a869848b3442691ee765cbf609cf946b2b35d5135015a9b749d917
0ff66b1c4a fuzz: coinselection: cover `SetBumpFeeDiscount` (brunoerg)
Pull request description:
`SetBumpFeeDiscount` sets the bump fee discount which is used to calculate the waste. We currently have no fuzz coverage for this function, so this PR adds it by calling `SetBumpFeeDiscount` before `RecalculateWaste`.
ACKs for top commit:
marcofleon:
ACK 0ff66b1c4a
Tree-SHA512: d5c1d97daaeb7f9b096bf9bdf6374b8a674a75f464e2b9bb3e1e1774a5805b22840ca1f31bae63f106640d9ce27a99432c3034524340be91c235f6ec3b185cff
8284229a28 refactor: deduplicate anchor witness program bytes (`0x4e,0x73`) (Sebastian Falbesoner)
41f2f058d0 test: add missing segwitv1 test cases to `script_standard_tests` (Sebastian Falbesoner)
Pull request description:
Currently we have two segwitv1 output script types that are considered standard:
- `TxoutType::WITNESS_V1_TAPROOT` (P2TR): witness program has size 32 (introduced with taproot soft-fork)
- `TxoutType::ANCHOR` (P2A): witness program is {0x4e, 0x7e} (introduced with #30352)
This PR adds them to the script standardness unit tests where missing, i.e. for using them with the `ExtractDestination` and `GetScriptForDestination` functions.
ACKs for top commit:
rkrux:
ACK 8284229a28
instagibbs:
reACK 8284229a28
hodlinator:
Code Review ACK 8284229a28
Tree-SHA512: d4a3b47fd31ba33f62d4367811e72a7f442c01b046b0a7217a66be0b9dea5c9041eebfe812c31839ec0f0b14c56948c7c016d3d2de79283583ad8e32c192c6ff
aa7a898c23 doc: use testnet4 in developer docs (Sjors Provoost)
6c217d22fd test: use testnet4 in argsman test (Sjors Provoost)
7c200ece80 test: use testnet4 in key_io_valid.json (Sjors Provoost)
d424bd5941 test: drop unused testnet3 magic bytes (Sjors Provoost)
8cfc09fafe test: cover testnet4 magic in assumeutxo.py (Sjors Provoost)
4281e3603a zmq: use testnet4 in zmq_sub.py example (Sjors Provoost)
Pull request description:
In preparation for dropping testnet3 entirely in #31974 this PR migrates a few things to testnet4:
* the ZMQ examples
* developer docs
* various unit tests
* the snapshot magic byte check in `feature_assumeutxo.py`
It drops `testnet3` from `MAGIC_BYTES` in the test framework, since no test uses it.
ACKs for top commit:
fjahr:
re-ACK aa7a898c23
maflcko:
lgtm ACK aa7a898c23🔊
hodlinator:
re-ACK aa7a898c23
Tree-SHA512: 235f74273234e8fb2aedf0017dea5c16bb9813ec7a1f89a51abe85691f09830a5ead834115d7db0936e12e55a40bc81888856a8002fe507c1474407e77f8b9fb
Threads may execute their function any time after they are spawned, so
coverage could be non-deterministic.
Fix this,
* for the script check worker threads by disabling them while fuzzing.
* for the scheduler thread by waiting for it to fully start and run the
service queue.
* Range-for avoids ++i/i++ debate and decreases linecount.
* seen_multipath is only used if multipath_segment_index hasn't already been set. Rename it to seen_substitutes to better describe what it does, now that the context implies its involved in multipath.
57d8b1f1b3 cmake: Avoid fuzzer "multiple definition of `main'" errors (Ryan Ofsky)
Pull request description:
This change builds libraries with `-fsanitize=fuzzer-no-link` instead of `-fsanitize=fuzzer` when the cmake `-DSANITIZERS=fuzzer` option is specified. This is necessary to make fuzzing and IPC cmake options compatible with each other and avoid CI failures in #30975 which enables IPC in the fuzzer CI build:
https://cirrus-ci.com/task/5366255504326656?logs=ci#L2817https://cirrus-ci.com/task/5233064575500288?logs=ci#L2384
The failures can also be reproduced by checking out #31741 and building with `cmake -B build -DBUILD_FOR_FUZZING=ON -DSANITIZERS=fuzzer -DENABLE_IPC=ON` with this fix reverted.
The fix updates the cmake build so when `-DSANITIZERS=fuzzer` is specified, the fuzz test binary is built with `-fsanitize=fuzzer` (so it can use libFuzzer's main function), and libraries are built with `-fsanitize=fuzzer-no-link` (so they can be linked into other executables with their own main functions).
Previously when `-DSANITIZERS=fuzzer` was specified, `-fsanitize=fuzzer` was applied to ALL libraries and executables. This was inappropriate because it made it impossible to build any executables other than the fuzz test executable without triggering link errors:
- `` multiple definition of `main' ``
- `` "undefined reference to `LLVMFuzzerTestOneInput' ``
if they depended on any libraries instrumented for fuzzing.
This was especially a problem when the `ENABLE_IPC` option was set because it made building the `mpgen` code generator impossible so nothing else that depended on generated sources, including the fuzz test binary, could be built either.
This commit was previously part of https://github.com/bitcoin/bitcoin/pull/31741 and had some discussion there starting in https://github.com/bitcoin/bitcoin/pull/31741#pullrequestreview-2619682385
---
This PR is part of the [process separation project](https://github.com/bitcoin/bitcoin/issues/28722).
ACKs for top commit:
hebasto:
ACK 57d8b1f1b3, tested on Ubuntu 24.04.
Tree-SHA512: 4011adbc0b08742e83cf7c0560d3d5b5694a863358e6ac9a21239626b4a8fedceca66db34b5a46136a7b26849bb1d8710c894689322ae97e1c407687c3f57d50
ae6b6ea296 wallet: remove redundant `Assert` call when block is disconnected (rkrux)
Pull request description:
It was highlighted in a PR discussion previously that the recently moved `Assert` macro call inside the block disconnected loop had been redundant for quite a while because of the presence of the `assert` macro call at the start of the function. Therefore, it is removed now.
refs #https://github.com/bitcoin/bitcoin/pull/31757#discussion_r1995416821
ACKs for top commit:
fjahr:
utACK ae6b6ea296
l0rinc:
crACK ae6b6ea296
hodlinator:
Code Review ACK ae6b6ea296
Prabhat1308:
Code Review ACK [`ae6b6ea`](ae6b6ea296)
Tree-SHA512: 6bbced88f4b39afcacefb7babe97c180a397d9cd55f18c4c2875bd594547dcdccb2059ac32495e0e8d4e7263b4c1349ca80b2f0fbd46b4450d1d847ba5abd903
This abstracts out the finding of the connected component that includes
a given element from FindConnectedComponent (which just finds any connected
component).
Use this in the txgraph fuzz test, which was effectively reimplementing this
logic. At the same time, improve its performance by replacing a vector with a
set.
It was highlighted in a PR discussion previously that the recently
moved `Assert` macro call inside the block disconnected loop had
been redundant for quite a while because of the presence of the
`assert` macro call at the start of the function. Therefore, it
is removed now.
refs #https://github.com/bitcoin/bitcoin/pull/31757#discussion_r1995416821
52ede28a8a doc: Update comments for AreInputsStandard to match code (Anthony Towns)
Pull request description:
The comment about extra data stuffed in scriptSigs was introduced in #4365 which introduced `ScriptSigArgsExpected()`, and became incorrect after #7387 / #7453 (checks are now performed by `SCRIPT_VERIFY_CLEANSTACK` during script validation and `IsPushOnly()` in `IsStandardTx()`). Drops the details on what a p2sh with many checksigs would look like, which was already done in #4365, but only for main.cpp not the duplicated comment in main.h, which was merged into policy/policy.cpp in #6335 and later moved to the right place in #10682.
ACKs for top commit:
instagibbs:
ACK 52ede28a8a
darosior:
ACK 52ede28a8a
Tree-SHA512: 5ee9a775c81d4c23aca2f8f938ab8bfa7605af489ddb78788613195be8744c7fb7a37bae271093f67f572577452651d4958706b55346e99cf8d32ac0fc34df03
b1de59e896 fuzz: extract unsequenced operations with side-effects (Lőrinc)
Pull request description:
https://github.com/bitcoin/bitcoin/pull/30746#discussion_r1817851827 introduced unsequenced operations with side-effects - which is undefined behavior, i.e. the right hand side can be evaluated before the left hand side, which happens to mutate it.
<details>
<summary>Tried to find other occurrences</summary>
```bash
clang++ --analyze -std=c++20 -I./src -I./src/test -I./src/test/fuzz src/test/fuzz/base_encode_decode.cpp src/psbt.cpp
```
but it didn't warn about UB.
Grepped for similar ones, but could find any other one in the codebase:
```bash
> grep -rnE --include='*.cpp' --include='*.h' '\b(\w+)\(([^)]*\b(\w+)\b[^)]*)\)\s*==\s*\3\.' .
./src/test/arith_uint256_tests.cpp:373: BOOST_CHECK(R1L.GetHex() == R1L.ToString());
./src/test/arith_uint256_tests.cpp:374: BOOST_CHECK(R2L.GetHex() == R2L.ToString());
./src/test/arith_uint256_tests.cpp:375: BOOST_CHECK(OneL.GetHex() == OneL.ToString());
./src/test/arith_uint256_tests.cpp:376: BOOST_CHECK(MaxL.GetHex() == MaxL.ToString());
./src/test/fuzz/cluster_linearize.cpp:565: assert(depgraph.FeeRate(best_anc.transactions) == best_anc.feerate);
./src/test/fuzz/cluster_linearize.cpp:646: assert(depgraph.FeeRate(found.transactions) == found.feerate);
./src/test/fuzz/cluster_linearize.cpp:765: assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
./src/test/fuzz/base_encode_decode.cpp:95: assert(DecodeBase64PSBT(psbt, random_string, error) == error.empty());
./src/test/fuzz/key.cpp:102: assert(pubkey.data() == pubkey.begin());
./src/test/skiplist_tests.cpp:42: BOOST_CHECK(vIndex[from].GetAncestor(0) == vIndex.data());
./src/script/signingprovider.cpp:535: ComputeTapbranchHash(node.sub[1]->hash, node.sub[1]->hash) == node.hash) {
./src/pubkey.h:78: return vch.size() > 0 && GetLen(vch[0]) == vch.size();
./src/cluster_linearize.h:881: Assume(elem.inc.feerate.IsEmpty() == elem.pot_feerate.IsEmpty());
```
</details>
Hodlinator deduced the UB on Windows in https://github.com/bitcoin/bitcoin/issues/32135#issuecomment-2751723855Fixes#32135
ACKs for top commit:
maflcko:
lgtm ACK b1de59e896
hodlinator:
ACK b1de59e896
marcofleon:
Nice, ACK b1de59e896
brunoerg:
code review ACK b1de59e896
Tree-SHA512: d66524424c7f749eba870f5bd6038da79666ac638047b31dd8ff15a77d927facb54b4735e8afb7984648fdc9e2dd59ea213996c352301fa05978f041511361d4