The UTXO set has grown significantly, and flushing it from memory to LevelDB often takes over 20 minutes after a successful IBD with large dbcache values.
The final UTXO set is written to disk in batches, which LevelDB sorts into SST files.
By increasing the default batch size, we can reduce overhead from repeated compaction cycles, minimize constant overhead per batch, and achieve more sequential writes.
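To illustrate why a larger batch size means fewer write cycles, here is a minimal sketch of threshold-based batching. It is not the actual flush code: `split_into_batches`, `count_batches`, and the fixed 64-byte entry size are hypothetical names and values chosen only for the example.

```python
from typing import Iterable, Iterator, List

MIB = 1 << 20


def split_into_batches(entry_sizes: Iterable[int], batch_size_bytes: int) -> Iterator[List[int]]:
    """Group entries; a batch is emitted once its accumulated size exceeds the threshold."""
    batch: List[int] = []
    batch_bytes = 0
    for size in entry_sizes:
        batch.append(size)
        batch_bytes += size
        if batch_bytes > batch_size_bytes:
            yield batch
            batch, batch_bytes = [], 0
    if batch:
        # Whatever is left over is written as a final, smaller batch.
        yield batch


def count_batches(total_entries: int, entry_size: int, batch_size_bytes: int) -> int:
    """Count how many batches a uniform stream of entries would be split into."""
    sizes = (entry_size for _ in range(total_entries))
    return sum(1 for _ in split_into_batches(sizes, batch_size_bytes))


if __name__ == "__main__":
    # Hypothetical workload: one million entries of 64 bytes each.
    for mib in (16, 64):
        print(mib, "MiB ->", count_batches(1_000_000, 64, mib * MIB), "batches")
```

Every batch carries a fixed cost, so fewer batches for the same data means less constant overhead. This sketch says nothing about LevelDB compaction itself.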
Experiments with different batch sizes (loaded via assumeutxo at block 840k, then measuring final flush time) show that 64 MiB batches significantly reduce flush time without notably increasing memory usage:
| dbbatchsize | flush_sum (ms) |
|-------------|----------------|
| 8 MiB | ~240,000 |
| 16 MiB | ~220,000 |
| 32 MiB | ~200,000 |
| *64 MiB* | *~150,000* |
| 128 MiB | ~156,000 |
| 256 MiB | ~166,000 |
| 512 MiB | ~186,000 |
| 1 GiB | ~186,000 |
Checking the impact of a `-reindex-chainstate` with `-stopatheight=878000` and `-dbcache=30000` gives:
16 << 20
```
2025-01-12T07:31:05Z Flushed fee estimates to fee_estimates.dat.
2025-01-12T07:31:05Z [warning] Flushing large (26 GiB) UTXO set to disk, it may take several minutes
2025-01-12T07:53:51Z Shutdown: done
```
Flush time: 22 minutes and 46 seconds
64 << 20
```
2025-01-12T18:30:00Z Flushed fee estimates to fee_estimates.dat.
2025-01-12T18:30:00Z [warning] Flushing large (26 GiB) UTXO set to disk, it may take several minutes
2025-01-12T18:44:43Z Shutdown: done
```
Flush time: ~14 minutes 43 seconds.
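The flush times above can be re-derived from the log timestamps. This small sketch treats the span from the "Flushing large" warning to "Shutdown: done" as the flush time, as the measurements above do; `elapsed_seconds` is a helper introduced only for this example.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def elapsed_seconds(start: str, end: str) -> int:
    """Return the whole seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return int(delta.total_seconds())


# Timestamps copied from the two log excerpts above.
flush_16_mib = elapsed_seconds("2025-01-12T07:31:05Z", "2025-01-12T07:53:51Z")
flush_64_mib = elapsed_seconds("2025-01-12T18:30:00Z", "2025-01-12T18:44:43Z")
reduction = 1 - flush_64_mib / flush_16_mib

if __name__ == "__main__":
    print(f"16 MiB batches: {flush_16_mib} s")
    print(f"64 MiB batches: {flush_64_mib} s")
    print(f"reduction: {reduction:.1%}")
```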
`IsCoinBase` means single input with NULL prevout, so it makes sense to restrict duplicate check to non-coinbase transactions only.
The behavior is the same as before, except that single-input transactions aren't checked for duplicates anymore (~70-90% of the cases, see https://transactionfee.info/charts/transactions-1in).
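A minimal sketch of the idea, not the actual `CheckTransaction` code: a transaction with fewer than two inputs cannot contain a duplicate, so the check can return early. `OutPoint` is a stand-in tuple for `COutPoint`, and `has_duplicate_inputs` is a hypothetical name.

```python
from typing import List, Tuple

# Stand-in for COutPoint: (txid bytes, output index).
OutPoint = Tuple[bytes, int]


def has_duplicate_inputs(prevouts: List[OutPoint]) -> bool:
    """Return True if any outpoint is spent more than once."""
    if len(prevouts) < 2:
        # Single-input transactions (which includes coinbases) cannot have duplicates.
        return False
    seen = set()
    for prevout in prevouts:
        if prevout in seen:
            return True
        seen.add(prevout)
    return False
```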
I've added braces to the conditions and loops to simplify review of followup commits.
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='CheckBlockBench|DuplicateInputs|ProcessTransactionBench' -min-time=10000
> C++ compiler .......................... AppleClang 16.0.0.16000026
| ns/block | block/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 335,917.12 | 2,976.92 | 1.3% | 11.01 | `CheckBlockBench`
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 3,286,337.42 | 304.29 | 1.1% | 10.90 | `DuplicateInputs`
| 9,561.02 | 104,591.35 | 0.2% | 11.02 | `ProcessTransactionBench`
The `CheckTransaction` validation function in https://github.com/bitcoin/bitcoin/blob/master/src/consensus/tx_check.cpp#L41-L45 relies on a correct ordering relation for detecting duplicate transaction inputs.
This update to the tests ensures that:
* Accurate detection of duplicates: Beyond trivial cases (e.g., two identical inputs), duplicates are detected correctly in more complex scenarios.
* Consistency across methods: Both sorted sets and hash-based sets behave identically when detecting duplicates for `COutPoint` and related values.
* Robust ordering and equality relations: The function maintains expected behavior for ordering and equality checks.
Using randomized testing with shuffled inputs (to avoid any remaining ordering bias), the enhanced test validates that `CheckTransaction` remains robust and reliable across various input configurations. It confirms identical behavior to a hashing-based duplicate detection mechanism, ensuring consistency and correctness.
To make sure the new branches in the follow-up commits will be covered, `basic_transaction_tests` was extended with a randomized test comparing against the old implementation (and also against an alternative duplicate check). The iterations and ranges were chosen such that every new branch is expected to be hit at least once.
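The shape of such a randomized comparison can be sketched as follows. This is not the test added to `basic_transaction_tests`; the helper names, the tuple stand-in for `COutPoint`, and the value ranges are assumptions chosen so that both outcomes occur often.

```python
import random
from typing import List, Tuple

OutPoint = Tuple[bytes, int]  # stand-in for COutPoint


def dup_sorted(prevouts: List[OutPoint]) -> bool:
    """Ordering-based detection: after sorting, duplicates are adjacent."""
    ordered = sorted(prevouts)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


def dup_hashed(prevouts: List[OutPoint]) -> bool:
    """Hash-based detection: a set drops repeated elements."""
    return len(set(prevouts)) != len(prevouts)


def run_randomized_check(iterations: int = 1000, seed: int = 0) -> Tuple[int, int]:
    """Compare both detectors on shuffled random inputs; return (with_dup, without_dup) counts."""
    rng = random.Random(seed)
    with_dup = without_dup = 0
    for _ in range(iterations):
        count = rng.randint(1, 8)
        # Small ranges so that collisions happen regularly.
        prevouts = [(bytes([rng.randint(0, 3)]) * 32, rng.randint(0, 3)) for _ in range(count)]
        rng.shuffle(prevouts)
        expected = dup_hashed(prevouts)
        assert dup_sorted(prevouts) == expected
        if expected:
            with_dup += 1
        else:
            without_dup += 1
    return with_dup, without_dup
```

Returning both counters makes it easy to confirm that the chosen ranges exercise the duplicate and the non-duplicate branch.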
Merged multiple template methods into single constexpr-delimited implementation to reduce template bloat (i.e. related functionality is grouped into a single method, but can be optimized because of C++20 constexpr conditions).
This unifies related methods that were previously bound only by similar signatures, and enables `SizeComputer` optimizations later.
`util::Xor` method was split out into more focused parts:
* one which assumes that the `uint64_t` key is properly aligned, doing the first few xors as 64 bits (the memcpy is eliminated in most compilers), and the last iteration is optimized for 8/16/32 bytes.
* an unaligned `uint64_t` key with a `key_offset` parameter which is rotated to accommodate the data (adjusting for endianness).
* a legacy `std::vector<std::byte>` key with an asserted 8 byte size, converted to `uint64_t`.
Note that the default statement alone would pass the tests, but would be very slow, since the 1, 2 and 4 byte versions won't be specialized by the compiler, hence the switch.
Asserts were added throughout the code to make sure every such vector has length 8, since in the next commit we're converting all of them to `uint64_t`.
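The key rotation can be illustrated with a small model. This is a sketch of the technique only, not the `util::Xor` implementation: `xor_reference` and `xor_chunked` are hypothetical names, and little-endian byte order is assumed for the 64-bit words.

```python
def xor_reference(data: bytes, key: bytes, key_offset: int = 0) -> bytes:
    """Byte-by-byte XOR with a repeating key, starting key_offset bytes into the key."""
    return bytes(b ^ key[(key_offset + i) % len(key)] for i, b in enumerate(data))


def xor_chunked(data: bytes, key: bytes, key_offset: int = 0) -> bytes:
    """Same result, but processing 8 bytes at a time with a pre-rotated 64-bit key."""
    assert len(key) == 8, "the obfuscation key is expected to be exactly 8 bytes"
    shift = key_offset % 8
    # Rotate the key so that its first byte lines up with data[0].
    rotated = key[shift:] + key[:shift]
    key64 = int.from_bytes(rotated, "little")

    out = bytearray(len(data))
    full = len(data) - len(data) % 8
    for i in range(0, full, 8):
        word = int.from_bytes(data[i:i + 8], "little") ^ key64
        out[i:i + 8] = word.to_bytes(8, "little")
    # The remaining 0..7 bytes are handled individually.
    for i in range(full, len(data)):
        out[i] = data[i] ^ rotated[i - full]
    return bytes(out)
```

Since XOR is its own inverse, applying either function twice with the same key and offset returns the original data.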
refactor: Migrate fixed-size obfuscation end-to-end from `std::vector<std::byte>` to `uint64_t`
Since `util::Xor` accepts `uint64_t` values, we're eliminating any repeated vector-to-uint64_t conversions going back to the loading/saving of these values (we're still serializing them as vectors, but converting as soon as possible to `uint64_t`). This is the reason the tests still generate vector values and convert to `uint64_t` later instead of generating it directly.
We're also short-circuiting `Xor` calls with 0 key values early to avoid unnecessary calculations (e.g. `MakeWritableByteSpan`) - even assuming that XOR is never called for 0.
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build -j$(nproc) \
&& build/src/bench/bench_bitcoin -filter='XorHistogram|AutoFileXor' -min-time=10000
C++ compiler .......................... AppleClang 16.0.0.16000026
| ns/byte | byte/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 0.09 | 10,799,585,470.46 | 1.3% | 11.00 | `AutoFileXor`
| 0.14 | 7,144,743,097.97 | 0.2% | 11.01 | `XorHistogram`
C++ compiler .......................... GNU 13.2.0
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 0.59 | 1,706,433,032.76 | 0.1% | 0.00 | 0.00 | 0.620 | 0.00 | 1.8% | 11.01 | `AutoFileXor`
| 0.47 | 2,145,375,849.71 | 0.0% | 0.95 | 1.48 | 0.642 | 0.20 | 9.6% | 10.93 | `XorHistogram`
----
A few other benchmarks that seem to have improved as well (tested with Clang only):
Before:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 2,237,168.64 | 446.99 | 0.3% | 10.91 | `ReadBlockFromDiskTest`
| 748,837.59 | 1,335.40 | 0.2% | 10.68 | `ReadRawBlockFromDiskTest`
After:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,827,436.12 | 547.21 | 0.7% | 10.95 | `ReadBlockFromDiskTest`
| 49,276.48 | 20,293.66 | 0.2% | 10.99 | `ReadRawBlockFromDiskTest`
To make the benchmarks representative, I've collected the write-vector's sizes during IBD for every invocation of `util::Xor` until 860k blocks, and used it as a basis for the micro-benchmarks, having a similar distribution of random data (taking the 1000 most frequent ones, making sure the very big ones are also covered).
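One way to pick such a representative subset is sketched below. This is not the script that was used: `representative_sizes` and its `largest` parameter are hypothetical, and how many of the very big sizes to keep is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable


def representative_sizes(observed_sizes: Iterable[int], most_frequent: int = 1000, largest: int = 10) -> Dict[int, int]:
    """Map size -> occurrence count for the most frequent sizes, plus the largest ones."""
    counts = Counter(observed_sizes)
    chosen = dict(counts.most_common(most_frequent))
    # Rare but very large writes would otherwise be dropped, so add them explicitly.
    for size in sorted(counts, reverse=True)[:largest]:
        chosen.setdefault(size, counts[size])
    return chosen
```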
And even though we already have serialization tests, `AutoFileXor` was added, serializing 1 MB via the provided `key_bytes`.
This was used to test the effect of disabling obfuscation.
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release \
&& cmake --build build -j$(nproc) \
&& build/src/bench/bench_bitcoin -filter='XorHistogram|AutoFileXor' -min-time=10000
C++ compiler .......................... AppleClang 16.0.0.16000026
| ns/byte | byte/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1.07 | 937,527,289.88 | 0.4% | 10.24 | `AutoFileXor`
| 0.87 | 1,149,859,017.49 | 0.3% | 10.80 | `XorHistogram`
C++ compiler .......................... GNU 13.2.0
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 1.87 | 535,253,389.72 | 0.0% | 9.20 | 3.45 | 2.669 | 1.03 | 0.1% | 11.02 | `AutoFileXor`
| 1.70 | 587,844,715.57 | 0.0% | 9.35 | 5.41 | 1.729 | 1.05 | 1.7% | 10.95 | `XorHistogram`
Instead of copying the data and doing the xor in a 4096 byte array, we're doing it directly on the input.
`DataStream` constructor was also added to enable presized serialization and writing in a single command.
The obfuscation (XOR) operations are currently done byte-by-byte during serialization; buffering the reads will enable batching the obfuscation operations later (not yet done here).
Also, different operating systems seem to handle file caching differently, so reading bigger batches (and processing those from memory) is also a bit faster (likely because of fewer native fread calls or less locking).
Since `ReadBlock[Undo]` is called with the file position being set after the [undo]block size, we have to start by backtracking 4 bytes to be able to read the expected size first.
As a consequence, the `FlatFilePos pos` parameter in `ReadBlock` is copied now.
`HashVerifier` was included in the try/catch to include the `undo_size` serialization there as well since the try is about `Deserialize` errors. This is why the final checksum verification was also included in the try.
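The backtracking step can be sketched on an in-memory file. This is not the `ReadBlock` implementation: `read_block` is a hypothetical helper, and the layout (4 magic bytes, a 4-byte little-endian size, then the payload) is an assumption made for the example.

```python
import io
import struct

SIZE_FIELD_BYTES = 4  # assumed width of the size prefix


def read_block(f: io.BufferedIOBase, pos_after_size: int) -> bytes:
    """Read one record, given the position just after its size field."""
    if pos_after_size < SIZE_FIELD_BYTES:
        raise ValueError("position is too small to have a size prefix before it")
    # Step back to the size field so the expected length is known up front.
    f.seek(pos_after_size - SIZE_FIELD_BYTES)
    (size,) = struct.unpack("<I", f.read(SIZE_FIELD_BYTES))
    # Fetch the whole payload in a single read instead of many small ones.
    data = f.read(size)
    if len(data) != size:
        raise EOFError("file ended before the full record was read")
    return data
```

Knowing the size before reading is what allows the whole record to be read into memory at once.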
> cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/src/bench/bench_bitcoin -filter='ReadBlockBench' -min-time=10000
> C++ compiler .......................... AppleClang 16.0.0.16000026
Before:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 2,289,743.62 | 436.73 | 0.3% | 11.03 | `ReadBlockBench`
After:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 1,724,703.14 | 579.81 | 0.4% | 11.06 | `ReadBlockBench`
> C++ compiler .......................... GNU 13.3.0
Before:
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 7,786,309.20 | 128.43 | 0.0% | 70,832,812.80 | 23,803,523.16 | 2.976 | 5,073,002.56 | 0.4% | 10.72 | `ReadBlockBench`
After:
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 6,272,557.28 | 159.42 | 0.0% | 63,251,231.42 | 19,739,780.92 | 3.204 | 3,589,886.66 | 0.3% | 10.57 | `ReadBlockBench`
Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
e637dc2c01c3b566e6c51c911c5881a8d206c924 refactor: Replace uint256 type with Wtxid in PackageMempoolAcceptResult struct (marcofleon)
a3baead7cb8376e3b09f1726b8c466648d187524 validation: use wtxid instead of txid in CheckEphemeralSpends (marcofleon)
Pull request description:
This PR addresses a small bug in [`AcceptMultipleTransactions`](45719390a1/src/validation.cpp (L1598)) where a txid was being inserted into a map that should only hold wtxids. `CheckEphemeralSpends` has an out parameter on failure that records that the child transaction did not spend the parent's dust. Instead of using the txid of this child, use its wtxid.
The second commit in this PR is a refactor of the `PackageMempoolAcceptResult` struct to use the `Wtxid` type instead of `uint256`. This helps to prevent errors like this in the future.
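The benefit of distinct identifier types can be sketched as follows. The PR gets this check at compile time in C++; this Python model only shows the same idea at runtime. All class and method names here are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Txid:
    value: bytes


@dataclass(frozen=True)
class Wtxid:
    value: bytes


class WtxidResults:
    """A result map that only accepts Wtxid keys."""

    def __init__(self) -> None:
        self._results: Dict[Wtxid, str] = {}

    def record(self, wtxid: Wtxid, result: str) -> None:
        if not isinstance(wtxid, Wtxid):
            # With a plain uint256-like key, this mistake would go unnoticed.
            raise TypeError(f"expected Wtxid, got {type(wtxid).__name__}")
        self._results[wtxid] = result

    def __len__(self) -> int:
        return len(self._results)
```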
ACKs for top commit:
instagibbs:
ACK e637dc2c01
glozow:
ACK e637dc2c01c, hooray for type safety
dergoegge:
Code review ACK e637dc2c01c3b566e6c51c911c5881a8d206c924
Tree-SHA512: 17039efbb241b7741e2610be5a6d6f88f4c1cbe22d476931ec99e43f993d259a1a5e9334e1042651aff49edbdf7b9e1c1cd070a28dcba5724be6db842e4ad1e0
59c4930394cafc939eb396224b3d60d01ba0ce37 qa: Enable feature_init.py on Windows (Hodlinator)
Pull request description:
Windows has been skipped since feature_init.py was added in #23289. Possibly due to poorer support on older Python versions, or attempts to use `CTRL_C_EVENT` (which didn't work in my testing either) instead of `CTRL_BREAK_EVENT`.
ACKs for top commit:
maflcko:
lgtm ACK 59c4930394cafc939eb396224b3d60d01ba0ce37
BrandonOdiwuor:
Code Review ACK 59c4930394cafc939eb396224b3d60d01ba0ce37
hebasto:
ACK 59c4930394cafc939eb396224b3d60d01ba0ce37, I have reviewed the code and it looks OK.
Tree-SHA512: 4f3649b41bcba2e8d03b8dcb1a7a6882edafb2c456db4b0768fc86018e9e9ed7171cb3d3c99e74b4ef38a3fcf3ab5d2f1865bbd49d791f1ce0a246806634e1a7
568fcdddaec2cc8decba5a098257f31729cc1caa scripted-diff: Adjust documentation per top-level target output location (Hennadii Stepanov)
026bb226e96919603af829d0b677779a234a0f6e cmake: Set top-level target output locations (Hennadii Stepanov)
Pull request description:
This PR sets the target output locations to the `bin` and `lib` subdirectories within the build tree, creating a directory structure that mirrors that of the installed targets.
This approach is widely adopted by large projects, such as [LLVM](e146c1867e/lldb/cmake/modules/LLDBStandalone.cmake (L128-L130)):
```cmake
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib${LLVM_LIBDIR_SUFFIX})
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib${LLVM_LIBDIR_SUFFIX})
```
The `libsecp256k1` project has also recently [adopted](https://github.com/bitcoin-core/secp256k1/pull/1553) this approach.
With this PR, all binaries are conveniently located. For example, run:
```
$ ./build/bin/fuzz
```
instead of:
```
$ ./build/src/test/fuzz/fuzz
```
On Windows, all required DLLs are now located in the same directory as the executables, making it possible to run `bitcoin-chainstate.exe` (which loads `bitcoinkernel.dll`) without the need to copy DLLs or modify the `PATH` variable.
The idea was briefly discussed among the build team during the recent CoreDev meeting.
---
**Warning**: This PR changes build locations of newly built executables like `bitcoind` and `test_bitcoin` from `src/` to `bin/` without deleting previously built executables. A clean build is recommended to avoid accidentally running old binaries.
ACKs for top commit:
theStack:
Light re-ACK 568fcdddaec2cc8decba5a098257f31729cc1caa
ryanofsky:
Code review ACK 568fcdddaec2cc8decba5a098257f31729cc1caa. Only change since last review was rebasing. I'm ok with this PR in its current form if other developers are happy with it. I just personally think it is inappropriate to \*silently\* break an everyday developer workflow like `git pull; make bitcoind`. I wouldn't have a problem with this PR if it triggered an explicit error, or if the problem was limited to less common workflows like changing cmake options in an existing build.
TheCharlatan:
Re-ACK 568fcdddaec2cc8decba5a098257f31729cc1caa
theuni:
ACK 568fcdddaec2cc8decba5a098257f31729cc1caa
Tree-SHA512: 1aa5ecd3cd49bd82f1dcc96c8e171d2d19c58aec8dade4bc329df89311f9e50cbf6cf021d004c58a0e1016c375b0fa348ccd52761bcdd179c2d1e61c105e3b9f
fac1dd9dffba1033245c283bc0468e801c14e910 test: Fix authproxy named args debug logging (MarcoFalke)
Pull request description:
In Python the meaning of `args or argsn` is that `argsn` is fully ignored when `args` is a list with at least one element. However, the RPC server accepts mixed positional and named args in the same RPC.
Fix the debug log by always printing both. Also, add a new `_json_dumps` helper to avoid bloated code.
Can be tested via `--tracerpc` on a call that uses named args mixed with positional args.
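The `or` behavior described above can be reproduced in isolation. The functions and the log line format below are hypothetical and only illustrate the semantics; they are not the authproxy code or its `_json_dumps` helper.

```python
import json


def log_line_old(method: str, args: list, argsn: dict) -> str:
    """Old behavior: `args or argsn` drops argsn whenever args is non-empty."""
    return f"{method} {json.dumps(args or argsn)}"


def log_line_new(method: str, args: list, argsn: dict) -> str:
    """Fixed behavior: always print both positional and named args."""
    return f"{method} {json.dumps(args)} {json.dumps(argsn)}"


if __name__ == "__main__":
    # A call mixing one positional and one named argument.
    print(log_line_old("getblock", ["hash"], {"verbosity": 2}))
    print(log_line_new("getblock", ["hash"], {"verbosity": 2}))
```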
ACKs for top commit:
i-am-yuvi:
Tested ACK fac1dd9dffba1033245c283bc0468e801c14e910
rkrux:
tACK fac1dd9dffba1033245c283bc0468e801c14e910
musaHaruna:
Tested ACK [fac1dd9](fac1dd9dff)
ryanofsky:
Code review ACK fac1dd9dffba1033245c283bc0468e801c14e910. Thanks for logging fix. This change should have been included in #19762
Tree-SHA512: ff63fbc2564b2c7589e9294baacf4c7a79f10d593776813392510702ca726e3893a29db3ba261f3aee1789a59bb215d7cb10fc85ca1a02632631d3722ddcdfc5
9132824947005421057f6a5f035082c7b99f3853 qt: 29.0 translations update (Hennadii Stepanov)
Pull request description:
This PR follows our [Release Process](bd0ee07310/doc/release-process.md) and concludes the translation-specific efforts for this release cycle. It follows two previous translation-related PRs, https://github.com/bitcoin/bitcoin/pull/31809 and https://github.com/bitcoin-core/gui/pull/854.
It is one of the steps required _before_ branch-off, as scheduled in https://github.com/bitcoin/bitcoin/issues/31029.
The previous similar PR: https://github.com/bitcoin/bitcoin/pull/30715.
**Notes for reviewers:**
1. This is the first release process conducted after migrating the build system to CMake. The [bitcoin-maintainer-tools/update-translations.py](https://github.com/bitcoin-core/bitcoin-maintainer-tools/blob/main/update-translations.py) tool, which is used to fetch translations from [Transifex.com](https://www.transifex.com/bitcoin/bitcoin), still generates the no-longer-needed `src/Makefile.qt_locale.include` file. Please ignore it.
2. The actual translations on Transifex are a moving target. Therefore, your diff after running [`bitcoin-maintainer-tools/update-translations.py`](https://github.com/bitcoin-core/bitcoin-maintainer-tools/blob/main/update-translations.py) might differ.
3. The translations for the following languages, which appear to be the result of a mistake or an act of vandalism, have been discarded:
- Czech (cs)
- Danish (da)
- Dutch (nl)
4. Changes to the Thai (th) translation have been discarded due to multiple unsolicited pronunciation notes.
ACKs for top commit:
glozow:
ACK 9132824947005421057f6a5f035082c7b99f3853
Tree-SHA512: 560dbd587eec563fa26f2ff07d950c2e86b89a7768deef7397aee80d527ad4b10c1f17d4abab6ecfcffd143e3a2d2a4e45b453197ad19c1a64087f98ab80ed4d
c94195c077ff227e5e2d80e803e1400d7f60812b doc: add note to windows build about stripping bin (fanquake)
Pull request description:
The Windows binaries are particularly big when they contain debug info, closing in on 500mb. Add a note to the Windows build instructions about using `--strip`.
I haven't tested this (the copying out to WSL). If we don't want to add this note, in favour of [user presets or similar](https://github.com/bitcoin/bitcoin/issues/30593#issuecomment-2271304490), then we should just close #30593.
Fixes #30593.
ACKs for top commit:
hodlinator:
ACK c94195c077ff227e5e2d80e803e1400d7f60812b
hebasto:
ACK c94195c077ff227e5e2d80e803e1400d7f60812b.
Tree-SHA512: c55670486ef60c6bda720e65443e17747b840e220c5bf6d6c0b77590d95cd6c8f040bc0e67dfa8eb11451f4f2eac9faf25d74ea68251b881773836f4113e8595
d79dab0fa999002a0c5b70c1688240e2a5032ce1 doc: warn against having qt6 installed on macOS (Sjors Provoost)
Pull request description:
Document #31009 in time for the v29 release.
ACKs for top commit:
achow101:
ACK d79dab0fa999002a0c5b70c1688240e2a5032ce1
hebasto:
ACK d79dab0fa999002a0c5b70c1688240e2a5032ce1.
Tree-SHA512: 4c6e557b6410c7fd766e1cdc356ae9f7410fbb4746732580e5bdf33ba43dca64e6f2fb66677d1e0c8fa71c19f212dc81ac73dc4277f2fd966bbd41c20d9291f8
611999e09777716d1fa686254db20845aff3dffe doc: link to benchcoin over bitcoinperf (fanquake)
Pull request description:
Seems like linking to https://github.com/bitcoin-dev-tools/benchcoin is now the best thing to do here. If not, we can just drop the other links.
ACKs for top commit:
l0rinc:
ACK 611999e09777716d1fa686254db20845aff3dffe
laanwj:
ACK 611999e09777716d1fa686254db20845aff3dffe
hebasto:
ACK 611999e09777716d1fa686254db20845aff3dffe. I agree. I've had a great experience using it.
Tree-SHA512: 558060bec92099befaa047e9192e5172e6a0cdfc5530d1f8b4d64ac717ce999a993d39c5d108fa9df3e30b2fc089e31d720f344153381e7c53f0ed40938ae1e0
a3c3f37e71efc1ad13fcad49b1ac651e5843b26b ci: Do not try to install for fuzz builds (Hennadii Stepanov)
Pull request description:
This PR is a follow-up to https://github.com/bitcoin/bitcoin/pull/31844 and extends the changes from fb0546b1c5ebb858605bef4c9fa001782e0ab213 to all fuzz builds in the CI.
Fixes https://github.com/bitcoin/bitcoin/issues/32001.
ACKs for top commit:
dergoegge:
utACK a3c3f37e71efc1ad13fcad49b1ac651e5843b26b
Tree-SHA512: bc422c53f6f06f25a0e13e788ade7c98711d864773a909487b8863e7cacfbc499ea466ec675a955279a89e247745ff0e845cd42896b4d405c4441d5e9f3a9c1b
f5d8b66a8cf23f9ccc51fb9702943c8a5f755f43 Squashed 'src/minisketch/' changes from eb37a9b8e7..d1e6bb8bbf (fanquake)
Pull request description:
Includes:
* https://github.com/bitcoin-core/minisketch/pull/92
ACKs for top commit:
hebasto:
ACK 4fde88bc469dc1c827591f764bd635038ccaf852, I've updated the subtree locally and got zero diff with this PR.
Tree-SHA512: 0ddaa6b64ca14da244d455594bc122a059fd1d199d28a7a78f266e352811568bd0f30d3b1e5e5d859f92753d3979831c095e3f6078f0ba2c909b1566a0e74a0c