bitcoin

mirror of https://github.com/bitcoin/bitcoin.git synced 2026-02-03 22:03:01 +01:00

Author	SHA1	Message	Date
Ava Chow	8c07800b19	Merge bitcoin/bitcoin#32497 : merkle: pre‑reserve leaves to prevent reallocs with odd vtx count `3dd815f048` validation: pre-reserve leaves to prevent reallocs with odd vtx count (Lőrinc) `7fd47e0e56` bench: make `MerkleRoot` benchmark more representative (Lőrinc) `f0a2183108` test: adjust `ComputeMerkleRoot` tests (Lőrinc) Pull request description: #### Summary `ComputeMerkleRoot` [duplicates the last hash](`39b6c139bd/src/consensus/merkle.cpp (L54-L56)`) when the input size is odd. If the caller provides a `std::vector` whose capacity equals its size, that extra `push_back` forces a reallocation, doubling its capacity (causing peak memory usage of 3x the necessary size). This affects roughly half of the created blocks (those with odd transaction counts), causing unnecessary memory fragmentation during every block validation. #### Fix * Pre-reserves vector capacity to account for the odd-count duplication using `(size + 1) & ~1ULL`. * This syntax produces [optimal assembly](https://github.com/bitcoin/bitcoin/pull/32497#discussion_r2553107836) across x86/ARM and 32/64-bit platforms for GCC & Clang. * Eliminates default construction of `uint256` objects that are immediately overwritten by switching from `resize` to `reserve` + `push_back`. #### Memory Impact [Memory profiling](https://github.com/bitcoin/bitcoin/pull/32497#issuecomment-3563724551) shows 50% reduction in peak allocation (576KB → 288KB) and elimination of reallocation overhead. #### Validation The benchmark was updated to use an odd leaf count to demonstrate the real-world scenario where the reallocation occurs. A full `-reindex-chainstate` up to block 896 408 ran without triggering the asserts. 
<details> <summary>Validation asserts</summary> Temporary asserts (not included in this PR) confirm that `push_back` never reallocates and that the coinbase witness hash remains null: ```cpp if (hashes.size() & 1) { assert(hashes.size() < hashes.capacity()); // TODO remove hashes.push_back(hashes.back()); } leaves.reserve((block.vtx.size() + 1) & ~1ULL); // capacity rounded up to even leaves.emplace_back(); assert(leaves.back().IsNull()); // TODO remove ``` </details> #### Benchmark Performance While the main purpose is to improve predictability, the reduced memory operations also improve hashing throughput slightly. ACKs for top commit: achow101: ACK `3dd815f048` optout21: reACK `3dd815f048` hodlinator: re-ACK `3dd815f048` vasild: ACK `3dd815f048` w0xlt: ACK `3dd815f048` with minor nits. danielabrozzoni: Code review ACK `3dd815f048` Tree-SHA512: e7b578f9deadc0de7d61c062c7f65c5e1d347548ead4a4bb74b056396ad7df3f1c564327edc219670e6e2b2cb51f4e1ccfd4f58dd414aeadf2008d427065c11f	2026-01-20 15:47:17 -08:00
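The capacity rounding described in #32497 can be sketched in isolation. This is a minimal illustration, not the code from the PR: `RoundUpToEven` and `DuplicateLastWithoutRealloc` are hypothetical names, `std::vector<int>` stands in for the vector of `uint256` leaves, and `~std::size_t{1}` is used as a variant of the PR's `~1ULL`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round a size up to the next even number, mirroring `(size + 1) & ~1ULL`.
inline std::size_t RoundUpToEven(std::size_t n)
{
    return (n + 1) & ~std::size_t{1};
}

// Hypothetical stand-in for collecting leaves: reserve an even capacity up
// front, fill the vector, then duplicate the last element when the count is
// odd. Returns true if the duplication did not move the buffer.
inline bool DuplicateLastWithoutRealloc(std::size_t count)
{
    std::vector<int> leaves;
    leaves.reserve(RoundUpToEven(count));
    for (std::size_t i = 0; i < count; ++i) leaves.push_back(static_cast<int>(i));
    const int* before = leaves.data();
    if (leaves.size() & 1) leaves.push_back(leaves.back());
    return leaves.data() == before;
}
```

Because the reserved capacity already covers the duplicated element, the extra `push_back` for an odd count stays within the existing allocation.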
Ava Chow	a365c9fe1f	Merge bitcoin/bitcoin#33738 : log: avoid collecting `GetSerializeSize` data when compact block logging is disabled `969c840db5` log,blocks: avoid `ComputeTotalSize` and `GetHash` work when logging is disabled (Lőrinc) `babfda332b` log,net: avoid `ComputeTotalSize` when logging is disabled (Lőrinc) `1658b8f82b` refactor: rename `CTransaction::GetTotalSize` to signal that it's not cached (Lőrinc) Pull request description: ### Context The new accounting options introduced in https://github.com/bitcoin/bitcoin/pull/32582 can be quite heavy, and are not needed when debug logging is disabled. ### Problem `PartiallyDownloadedBlock::FillBlock()` and `PeerManagerImpl::SendBlockTransactions()` accumulate transaction sizes for debug logging by calling `ComputeTotalSize()` in loops, which invokes expensive `GetSerializeSize()` serializations. The block header hash is also only computed for the debug log. ### Fixes Guard the size and hash calculations with `LogAcceptCategory()` checks so the serialization and hashing work only occurs when compact block debug logging is enabled. Also modernized the surrounding code a bit since the change is quite trivial. 
### Reproducer You can test the change by starting an up-to-date `bitcoind` node with `-debug=cmpctblock` and observing compact block log lines such as: > [cmpctblock] Successfully reconstructed block 00000000000000000001061eaa6c0fe79258e7f79606e67ac495765cb121a520 with 1 txn prefilled, 3122 txn from mempool (incl at least 3 from extra pool) and 641 txn (352126 bytes) requested <details> <summary>Test patch</summary> ```patch diff --git a/src/blockencodings.cpp b/src/blockencodings.cpp index 58620c93cc..f16eb38fa5 100644 --- a/src/blockencodings.cpp +++ b/src/blockencodings.cpp @@ -186,6 +186,7 @@ bool PartiallyDownloadedBlock::IsTxAvailable(size_t index) const ReadStatus PartiallyDownloadedBlock::FillBlock(CBlock& block, const std::vector<CTransactionRef>& vtx_missing, bool segwit_active) { + LogInfo("PartiallyDownloadedBlock::FillBlock called"); if (header.IsNull()) return READ_STATUS_INVALID; block = header; @@ -218,6 +219,7 @@ ReadStatus PartiallyDownloadedBlock::FillBlock(CBlock& block, const std::vector< } if (LogAcceptCategory(BCLog::CMPCTBLOCK, BCLog::Level::Debug)) { + LogInfo("debug log enabled"); const uint256 hash{block.GetHash()}; // avoid cleared header uint32_t tx_missing_size{0}; for (const auto& tx : vtx_missing) tx_missing_size += tx->ComputeTotalSize(); // avoid cleared txn_available diff --git a/src/net_processing.cpp b/src/net_processing.cpp index 5600c8d389..c081825f77 100644 --- a/src/net_processing.cpp +++ b/src/net_processing.cpp @@ -2470,6 +2470,7 @@ uint32_t PeerManagerImpl::GetFetchFlags(const Peer& peer) const void PeerManagerImpl::SendBlockTransactions(CNode& pfrom, Peer& peer, const CBlock& block, const BlockTransactionsRequest& req) { + LogInfo("PeerManagerImpl::SendBlockTransactions called"); BlockTransactions resp(req); for (size_t i = 0; i < req.indexes.size(); i++) { if (req.indexes[i] >= block.vtx.size()) { @@ -2480,6 +2481,7 @@ void PeerManagerImpl::SendBlockTransactions(CNode& pfrom, Peer& peer, const CBlo } if 
(LogAcceptCategory(BCLog::CMPCTBLOCK, BCLog::Level::Debug)) { + LogInfo("debug log enabled"); uint32_t tx_requested_size{0}; for (const auto i : req.indexes) tx_requested_size += block.vtx[i]->ComputeTotalSize(); LogDebug(BCLog::CMPCTBLOCK, "Peer %d sent us a GETBLOCKTXN for block %s, sending a BLOCKTXN with %u txns. (%u bytes)\n", pfrom.GetId(), block.GetHash().ToString(), resp.txn.size(), tx_requested_size); ``` </details> ACKs for top commit: davidgumberg: reACK `969c840db5` achow101: ACK `969c840db5` hodlinator: re-ACK `969c840db5` sedited: Re-ACK `969c840db5` danielabrozzoni: reACK `969c840db5` Tree-SHA512: 9780102d29778165144e3602d934ed4cb96660fd7b9ff2581b223c619e419139b8348e60f226af448702ae527736a1806d169b44342c5a82795590f664e16efe	2026-01-20 15:41:30 -08:00
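The guard pattern from #33738 can be shown with stand-in types. The sketch below assumes a plain `bool` in place of `LogAcceptCategory()` and a hypothetical `FakeTx` whose `ComputeTotalSize()` counts its own calls; none of these names are Bitcoin Core APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Counts how often the (pretend) expensive size computation runs.
static int g_size_computations = 0;

// Hypothetical transaction stand-in; the real method serializes the transaction.
struct FakeTx {
    uint32_t bytes;
    uint32_t ComputeTotalSize() const { ++g_size_computations; return bytes; }
};

// Sums the sizes only when debug logging is enabled; otherwise the
// transactions are not touched and 0 is returned.
inline uint32_t RequestedSizeForLog(const std::vector<FakeTx>& txs, bool debug_log_enabled)
{
    uint32_t total{0};
    if (debug_log_enabled) {
        for (const auto& tx : txs) total += tx.ComputeTotalSize();
    }
    return total;
}
```

With logging disabled, the loop body never runs, so the cost of the size computation is skipped entirely.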
Ava Chow	f7e88e298a	Merge bitcoin/bitcoin#32471 : wallet/rpc: fix listdescriptors RPC fails to return descriptors with private key information when wallet contains descriptors missing any key `9c7e4771b1` test: Test listdescs with priv works even with missing priv keys (Novo) `ed945a6854` walletrpc: reject listdes with priv key on w-only wallets (Novo) `9e5e9824f1` descriptor: ToPrivateString() pass if at least 1 priv key exists (Novo) `5c4db25b61` descriptor: refactor ToPrivateString for providers (Novo) `2dc74e3f4e` wallet/migration: use HavePrivateKeys in place of ToPrivateString (Novo) `e842eb90bb` descriptors: add HavePrivateKeys() (Novo) Pull request description: _TLDR: Currently, `listdescriptors [private=true]` will fail for a non-watch-only wallet if any descriptor has a missing private key (e.g. `tr()`, `multi()`, etc.). This PR changes that while making sure `listdescriptors [private=true]` still fails if there are no private keys. Closes #32078_ In non-watch-only wallets, it's possible to import descriptors as long as at least one private key is included. It's important that users can still view these descriptors when they need to create a backup, even if some private keys are missing ([#32078 (comment)](https://github.com/bitcoin/bitcoin/issues/32078#issuecomment-2781428475)). This change makes it possible to do so. This change also helps prevent `listdescriptors true` from failing completely because one descriptor is missing some private keys. ### Notes - The new behaviour is applied to all descriptors including miniscript descriptors - `listdescriptors true` still fails for watch-only wallets to preserve existing behaviour https://github.com/bitcoin/bitcoin/pull/24361#discussion_r920801352 - Wallet migration logic previously used `Descriptor::ToPrivateString()` to determine which descriptor was watchonly. This means that modifying the `ToPrivateString()` behaviour caused descriptors that were previously recognized as "watchonly" to be "non-watchonly".
In order to keep the scope of this PR limited to the RPC behaviour, this PR uses a different method to determine `watchonly` descriptors for the purpose of wallet migration. A follow-up PR can be opened to update migration logic to exclude descriptors with some private keys from the `watchonly` migration wallet. ### Relevant PRs https://github.com/bitcoin/bitcoin/pull/24361 https://github.com/bitcoin/bitcoin/pull/32186 ### Testing Functional tests were added to test the new behaviour EDIT `listdescriptors [private=true]` will still fail when there are no private keys because non-watchonly wallets must have private keys and calling `listdescriptors [private=true]` for watchonly wallet returns an error ACKs for top commit: Sjors: ACK `9c7e4771b1` achow101: ACK `9c7e4771b1` w0xlt: reACK `9c7e4771b1` with minor nits rkrux: re-ACK `9c7e4771b1` Tree-SHA512: f9b3b2c3e5425a26e158882e39e82e15b7cb13ffbfb6a5fa2868c79526e9b178fcc3cd88d3e2e286f64819d041f687353780bbcf5a355c63a136fb8179698b60	2026-01-20 12:17:19 -08:00
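The behavioural change in #32471 amounts to moving from an "every key" condition to an "at least one key" condition. A minimal sketch under that reading follows; `KeyInfo`, `AllKeysPrivate`, and `AnyKeyPrivate` are hypothetical names and do not reflect the actual `Descriptor` interface.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model: each key in a descriptor either has its private part
// available or not.
struct KeyInfo {
    bool has_private;
};

// Previous condition as described: printing fails if any private key is missing.
inline bool AllKeysPrivate(const std::vector<KeyInfo>& keys)
{
    return !keys.empty() &&
           std::all_of(keys.begin(), keys.end(), [](const KeyInfo& k) { return k.has_private; });
}

// New condition as described: one available private key is enough.
inline bool AnyKeyPrivate(const std::vector<KeyInfo>& keys)
{
    return std::any_of(keys.begin(), keys.end(), [](const KeyInfo& k) { return k.has_private; });
}
```

A descriptor with one known and one missing private key fails the first check but passes the second, which is the case the PR is meant to unblock.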
merge-script	7f5ebef56a	Merge bitcoin/bitcoin#34302 : fuzz: Restore SendMessages coverage in process_message(s) fuzz targets `fabf8d1c5b` fuzz: Restore SendMessages coverage in process_message(s) fuzz targets (MarcoFalke) `fac7fed397` refactor: Use std::reference_wrapper<AddrMan> in Connman (MarcoFalke) Pull request description: Found and reported by Crypt-iQ (thanks!) Currently the process_message(s) fuzz targets do not have any meaningful `SendMessages` code coverage. This is not ideal. Fix the problem by adding back the coverage, and by hardening the code here, so that the problem hopefully does not happen again in the future. ### Historic context for this regression The regression was introduced in commit `fa11eea405`, which built a new deterministic peerman object. However, the patch was incomplete, because it was missing one hunk to replace `g_setup->m_node.peerman->SendMessages(&p2p_node);` with `peerman->SendMessages(&p2p_node);`. This means the stale and empty peerman from the node context and not the freshly created and deterministic peerman was used. A simple fix would be to just submit the missing patch hunk. However, this still leaves the risk that the issue is re-introduced at any time in the future. So instead, I think the stale and empty peerman should be de-constructed, so that any call to it will lead to a hard sanitizer error and fuzz failure. Doing that also uncovered another issue: The connman was holding on to a reference to a stale and empty addrman. 
So fix all issues by: * Allowing the addrman reference in connman to be re-seatable * Clearing all stale objects, before creating new objects, and then using references to the new objects in all code ACKs for top commit: Crypt-iQ: crACK `fabf8d1c5b` frankomosh: ACK `fabf8d1c5b` marcofleon: code review ACK `fabf8d1c5b` sedited: ACK `fabf8d1c5b` Tree-SHA512: 2e478102b3e928dc7505f00c08d4b9e4f8368407b100bc88f3eb3b82aa6fea5a45bae736c211f5af1551ca0de1a5ffd4a5d196d9473d4c3b87cfed57c9a0b69d	2026-01-20 16:45:18 +01:00
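Making a reference member re-seatable, as #34302 does for the addrman reference in connman, can be illustrated with `std::reference_wrapper`. The classes below (`FakeAddrMan`, `FakeConnman`, `SetAddrman`) are hypothetical stand-ins, not the real `AddrMan` or `CConnman` types.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the address manager.
struct FakeAddrMan {
    int id;
};

// Holds a re-seatable reference. A plain `FakeAddrMan&` member could never be
// pointed at a different object after construction.
class FakeConnman
{
    std::reference_wrapper<FakeAddrMan> m_addrman;

public:
    explicit FakeConnman(FakeAddrMan& addrman) : m_addrman{addrman} {}
    void SetAddrman(FakeAddrMan& addrman) { m_addrman = std::ref(addrman); }
    int AddrmanId() const { return m_addrman.get().id; }
};
```

After `SetAddrman` is called with a freshly created object, later accesses go to the new object rather than the stale one.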
Lőrinc	1658b8f82b	refactor: rename `CTransaction::GetTotalSize` to signal that it's not cached Transaction hashes are cached, so it may not be intuitive that transaction sizes are recalculated on every call. This is done before the other refactors to clarify why we want to avoid calling this method. Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2026-01-19 20:20:13 +01:00
merge-script	c57fbbe99d	Merge bitcoin/bitcoin#31650 : refactor: Avoid copies by using const references or by move-construction `fa64d8424b` refactor: Enforce readability-avoid-const-params-in-decls (MarcoFalke) `faf0c2d942` refactor: Avoid copies by using const references or by move-construction (MarcoFalke) Pull request description: Top level `const` in declarations is problematic for many reasons: * It is often a typo, where one wanted to denote a const reference. For example `bool PSBTInputSignedAndVerified(const PartiallySignedTransaction psbt, ...` is missing the `&`. This will create a redundant copy of the value. * In constructors it prevents move construction. * It can incorrectly imply some data is const, like in an imaginary example `std::span<int> Shuffle(const std::span<int>);`, where the `int`s are not const. * The compiler ignores the `const` from the declaration in the implementation. * It isn't used consistently anyway, not even on the same line. Fix some issues by: * Using a const reference to avoid a copy, where read-only of the value is intended. This is only done for values that may be expensive to copy. * Using move-construction to avoid a copy * Applying `readability-avoid-const-params-in-decls` via clang-tidy ACKs for top commit: l0rinc: diff reACK `fa64d8424b` hebasto: ACK `fa64d8424b`, I have reviewed the code and it looks OK. sedited: ACK `fa64d8424b` Tree-SHA512: 293c000b4ebf8fdcc75259eb0283a2e4e7892c73facfb5c3182464d6cb6a868b7f4a6682d664426bf2edecd665cf839d790bef0bae43a8c3bf1ddfdd3d068d38	2026-01-19 11:44:04 +01:00
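The redundant copy caused by a top-level `const` by-value parameter, the first point in #31650, can be made visible with a type that counts its copies. This is a minimal sketch; `Payload`, `ReadByValue`, and `ReadByConstRef` are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Counts copy constructions of Payload.
static int g_copies = 0;

struct Payload {
    int value{0};
    Payload() = default;
    explicit Payload(int v) : value{v} {}
    Payload(const Payload& other) : value{other.value} { ++g_copies; }
};

// Top-level const on a by-value parameter: the argument is still copied.
inline int ReadByValue(const Payload p) { return p.value; }

// Const reference: read-only access without a copy.
inline int ReadByConstRef(const Payload& p) { return p.value; }
```

Passing an lvalue to `ReadByValue` runs the copy constructor once, while `ReadByConstRef` leaves the counter untouched.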
MarcoFalke	fabf8d1c5b	fuzz: Restore SendMessages coverage in process_message(s) fuzz targets	2026-01-15 15:17:12 +01:00
merge-script	d08c1b3ed9	Merge bitcoin/bitcoin#34288 : fuzz: Exclude too expensive inputs in miniscript_string target `fac70ea8b5` fuzz: Exclude too expensive inputs in miniscript_string target (MarcoFalke) `fa90786478` iwyu: Fix includes for test/fuzz/util/descriptor module (MarcoFalke) Pull request description: Fixes https://github.com/bitcoin/bitcoin/issues/30498 Accepting "expensive" fuzz inputs which have no real use-case is problematic, because it prevents the fuzz engine from spending time on the next useful fuzz input. For example this one will take several seconds (the flamegraph shows the time is spent in miniscript `NoDupCheck`): ``` curl -fLO '`41bae50cff`' FUZZ=miniscript_string /usr/bin/time ./bld-cmake/bin/fuzz ./41bae50cffd1741150a1b330d02ab09f46ff8cd1 ``` Inspecting the inputs shows that it has many sub frags, so rejecting based on `HasTooManySubFrag` should be sufficient. ACKs for top commit: darosior: ACK `fac70ea8b5` brunoerg: code review ACK `fac70ea8b5` dergoegge: utACK `fac70ea8b5` Tree-SHA512: 7f1e0d9ce24d67ec63e5b7c2dd194efa51f38beb013564690afe0f920e5ff1980c85ce344828c0dc3f34b6851db7fe72a76b1a775c6d51c94fb91431834f453b	2026-01-15 13:55:27 +00:00
merge-script	baa554f708	Merge bitcoin/bitcoin#34259 : Find minimal chunks in SFL `da56ef239b` clusterlin: minimize chunks (feature) (Pieter Wuille) Pull request description: Part of #30289. This was split off from #34023, because it's not really an optimization but a feature. The feature existed pre-SFL, so this brings SFL to parity in terms of functionality with the old code. The idea is that while optimality - as achieved by SFL before this PR - guarantees a linearization whose feerate diagram is optimal, it may be possible to split chunks into smaller equal-feerate parts. This is desirable because even though it doesn't change the diagram, it provides more flexibility for optimization (binpacking is easier when the pieces are smaller). Thus, this PR introduces the stronger notion of "minimality": optimal chunks, which are also split into their smallest possible pieces. To accomplish that, an additional step in the SFL algorithm is added which aims to split chunks into minimal equal-feerate parts where possible, without introducing circular dependencies between them. It works based on the observation that if an (already otherwise optimal) chunk has a way of being split into two equal-feerate parts, and T is a given transaction in the chunk, then we can find the split in two steps: * One time, pretend T has $\epsilon$ higher feerate than it really has. If a split exists with T in the top part, this will find it. * The other time, pretend T has $\epsilon$ lower feerate than it really has. If a split exists with T in the bottom part, this will find it. So we try both on each found optimal chunk. If neither works, the chunk is minimal. If one works, recurse into the split chunks to split them further. ACKs for top commit: instagibbs: reACK `da56ef239b` marcofleon: crACK `da56ef239b` Tree-SHA512: 2e94d6b78725f5f9470a939dedef46450b85c4e5e6f30cba0b038622ec2b417380747e8df923d1f303706602ab6d834350716df9678de144f857e3a8d163f6c2	2026-01-15 10:07:21 +00:00
MarcoFalke	fa64d8424b	refactor: Enforce readability-avoid-const-params-in-decls	2026-01-14 23:04:12 +01:00
Ava Chow	b0b65336e7	Merge bitcoin/bitcoin#32740 : refactor: Header sync optimisations & simplifications `de4242f474` refactor: Use reference for chain_start in HeadersSyncState (Daniela Brozzoni) `e37555e540` refactor: Use initializer list in CompressedHeader (Daniela Brozzoni) `0488bdfefe` refactor: Remove unused parameter in ReportHeadersPresync (Daniela Brozzoni) `256246a9fa` refactor: Remove redundant parameter from CheckHeadersPoW (Daniela Brozzoni) `ca0243e3a6` refactor: Remove useless CBlock::GetBlockHeader (Pieter Wuille) `4568652222` refactor: Use std::span in HasValidProofOfWork (Daniela Brozzoni) `4066bfe561` refactor: Compute work from headers without CBlockIndex (Daniela Brozzoni) `0bf6139e19` p2p: Avoid an IsAncestorOfBestHeaderOrTip call (Pieter Wuille) Pull request description: This is a partial* revival of #25968 It contains a list of mostly unrelated simplifications and optimizations to the code merged in #25717: - Avoid an IsAncestorOfBestHeaderOrTip call: Just don't call this function when it won't have any effect. - Compute work from headers without CBlockIndex: Avoid the need to construct a CBlockIndex object just to compute work for a header, when its nBits value suffices for that. Also use some Spans where possible. - Remove useless CBlock::GetBlockHeader: There is no need for a function to convert a CBlock to a CBlockHeader, as it's a child class of it.
It also contains the following code cleanups, which were suggested by reviewers in #25968: - Remove redundant parameter from CheckHeadersPoW: No need to pass consensusParams, as CheckHeadersPow already has access to m_chainparams.GetConsensus() - Remove unused parameter in ReportHeadersPresync - Use initializer list in CompressedHeader, also make GetFullHeader const - Use reference for chain_start in HeadersSyncState: chain_start can never be null, so it's better to pass it as a reference rather than a raw pointer *I decided to leave out three commits that were in #25968 (`4e7ac7b94d`, `ab52fb4e95`, `7f1cf440ca`), since they're a bit more involved, and I'm a new contributor. If this PR gets merged, I'll comment under #25968 to note that these three commits are still up for grabs :) ACKs for top commit: l0rinc: ACK `de4242f474` polespinasa: re-ACK `de4242f474` sipa: ACK `de4242f474` achow101: ACK `de4242f474` hodlinator: re-ACK `de4242f474` Tree-SHA512: 1de4f3ce0854a196712505f2b52ccb985856f5133769552bf37375225ea8664a3a7a6a9578c4fd461e935cd94a7cbbb08f15751a1da7651f8962c866146d9d4b	2026-01-14 11:38:07 -08:00
MarcoFalke	fac70ea8b5	fuzz: Exclude too expensive inputs in miniscript_string target	2026-01-14 20:02:38 +01:00
MarcoFalke	fa90786478	iwyu: Fix includes for test/fuzz/util/descriptor module Also, fix a typo.	2026-01-14 19:19:18 +01:00
Pieter Wuille	da56ef239b	clusterlin: minimize chunks (feature) After the normal optimization process finishes, and finds an optimal spanning forest, run a second process (while computation budget remains) to split chunks into minimal equal-feerate chunks.	2026-01-12 17:38:30 -05:00
MarcoFalke	fa8d56f9f0	fuzz: Reject too large descriptor leaf sizes in scriptpubkeyman target	2026-01-08 14:26:29 +01:00
MarcoFalke	333333356f	fuzz: [refactor] Use std::span over FuzzBufferType in descriptor utils They are exactly the same, but the descriptor utils should not prescribe to use the FuzzBufferType. Using a dedicated type for them clarifies that the utils are not tied to FuzzBufferType. Also, while touching the lines, use `const` only where it is meaningful.	2026-01-08 12:18:01 +01:00
Novo	9e5e9824f1	descriptor: ToPrivateString() pass if at least 1 priv key exists - Refactor Descriptor::ToPrivateString() to allow descriptors with missing private keys to be printed. This is useful for descriptors with multiple keys, e.g. tr(). - The existing behaviour of listdescriptors is preserved as much as possible: if no private keys are available, ToPrivateString will return false	2026-01-07 10:44:38 +01:00
Pieter Wuille	1808b5aaf7	clusterlin: remove unused FixLinearization (cleanup)	2026-01-05 11:48:34 -05:00
Pieter Wuille	01ffcf464a	clusterlin: support fixing linearizations (feature) This also updates FixLinearization to just be a thin wrapper around Linearize. In a future commit, FixLinearization will be removed entirely.	2026-01-05 11:48:16 -05:00
Ava Chow	ab233255d4	Merge bitcoin/bitcoin#33866 : refactor: Let CCoinsViewCache::BatchWrite return void `6da6f503a6` refactor: Let CCoinsViewCache::BatchWrite return void (TheCharlatan) Pull request description: CCoinsViewCache::BatchWrite always returns true if called from a backed cache, so just return void instead. Also return void from ::Sync and ::Flush. This allows for dropping a FatalError condition and simplifying some dead error handling code a bit. Since we now no longer exercise the "error path" when returning from `CCoinsView::BatchWrite`, make the method clear the cache instead. This should only be exercised by tests and not change production behaviour. This might slightly improve the coins_view fuzz test's ability to generate better coverage. ACKs for top commit: l0rinc: ACK `6da6f503a6` andrewtoth: re-ACK `6da6f503a6` achow101: ACK `6da6f503a6` w0xlt: ACK `6da6f503a6` Tree-SHA512: dfaa325b0cf8108910aebf1b27434aaddb639d10d860e96797c77ea42eca9035a54a7dc1d6a5d4eae2b75fcc9356206d3d5672243d2c906e80d19024c8b95408	2026-01-02 16:49:23 -08:00
bensig	08ed802bab	doc: fix double-word typos in comments	2025-12-30 12:12:26 -08:00
fanquake	3e4765ee10	scripted-diff: [doc] Unify stale copyright headers -BEGIN VERIFY SCRIPT- sed --in-place --regexp-extended \ 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' \ $( git grep -l 'The Bitcoin Core developers' -- ':(exclude)COPYING' ':(exclude)src/ipc/libmultiprocess' ':(exclude)src/minisketch' ) -END VERIFY SCRIPT-	2025-12-19 16:58:36 +00:00
merge-script	7f295e1d9b	Merge bitcoin/bitcoin#34084 : scripted-diff: [doc] Unify stale copyright headers `fa4cb13b52` test: [doc] Manually unify stale headers (MarcoFalke) `fa5f297748` scripted-diff: [doc] Unify stale copyright headers (MarcoFalke) Pull request description: Historically, the upper year range in file headers was bumped manually or with a script. This has many issues: * The script is causing churn. See for example commit `306ccd4`, or drive-by first-time contributions bumping them one-by-one. (A few from this year: https://github.com/bitcoin/bitcoin/pull/32008, https://github.com/bitcoin/bitcoin/pull/31642, https://github.com/bitcoin/bitcoin/pull/32963, ...) * Some, or likely most, upper year values were wrong. Reasons for incorrect dates could be code moves, cherry-picks, or simply bugs in the script. * The upper range is not needed for anything. * Anyone who wants to find the initial file creation date, or file history, can use `git log` or `git blame` to get more accurate results. * Many places are already using the `-present` suffix, with the meaning that the upper range is omitted. To fix all issues, this bumps the upper range of the copyright headers to `-present`. Further notes: * Obviously, the yearly 4-line bump commit for the build system (c.f. `b537a2c02a`) is fine and will remain. * For new code, the date range can be fully omitted, as it is done already by some developers. Obviously, developers are free to pick whatever style they want. One can list the commits for each style. * For example, to list all commits that use `-present`: `git log --format='%an (%ae) [%h: %s]' -S 'present The Bitcoin'`. * Alternatively, to list all commits that use no range at all: `git log --format='%an (%ae) [%h: %s]' -S '(c) The Bitcoin'`. <!-- * The lower range can be wrong as well, so it could be omitted as well, but this is left for a follow-up. A previous attempt was in https://github.com/bitcoin/bitcoin/pull/26817. 
ACKs for top commit: l0rinc: ACK `fa4cb13b52` rkrux: re-ACK `fa4cb13b52` janb84: ACK `fa4cb13b52` Tree-SHA512: e5132781bdc4417d1e2922809b27ef4cf0abb37ffb68c65aab8a5391d3c917b61a18928ec2ec2c75ef5184cb79a5b8c8290d63e949220dbeab3bd2c0dfbdc4c5	2025-12-19 16:56:02 +00:00
Pieter Wuille	75bdb925f4	clusterlin: drop support for improvable chunking (simplification) With MergeLinearizations() gone and the LIMO-based Linearize() replaced by SFL, we do not need a class (LinearizationChunking) that can maintain an incrementally-improving chunk set anymore. Replace it with a function (ChunkLinearizationInfo) that just computes the chunks as SetInfos once, and returns them as a vector. This simplifies several call sites too.	2025-12-18 16:01:31 -05:00
Pieter Wuille	91399a7912	clusterlin: remove unused MergeLinearizations (cleanup) This ended up never being used in txgraph.	2025-12-18 16:01:31 -05:00
Pieter Wuille	13aad26b78	clusterlin: randomize various decisions in SFL (feature) This introduces a local RNG inside the SFL state, which is used to randomize various decisions inside the algorithm, in order to make it hard to create pathological clusters which predictably have bad performance. The decisions being randomized are: * When deciding what chunk to attempt to split, the queue order is randomized. * When deciding which dependency to split on, a uniformly random one is chosen among those with higher top feerate than bottom feerate within the chosen chunk. * When deciding which chunks to merge, a uniformly random one among those with the higher feerate difference is picked. * When merging two chunks, a uniformly random dependency between them is now activated. * When making the state topological, the queue of chunks to process is randomized.	2025-12-18 16:01:31 -05:00
Pieter Wuille	ddbfa4dfac	clusterlin: keep FIFO queue of improvable chunks (preparation) This introduces a queue of chunks that still need processing, in both MakeTopological() and OptimizationStep(). This is simultaneously: * A preparation for introducing randomization, by allowing permuting the queue. * An improvement to the fairness of suboptimal solutions, by distributing the work more fairly over chunks. * An optimization, by avoiding retrying chunks over and over again which are already known to be optimal.	2025-12-18 16:01:31 -05:00
Pieter Wuille	3efc94d656	clusterlin: replace cluster linearization with SFL (feature) This replaces the existing LIMO linearization algorithm (which internally uses ancestor set finding and candidate set finding) with the much more performant spanning-forest linearization algorithm. This removes the old candidate-set search algorithm, and several of its tests, benchmarks, and needed utility code. The worst case time per cost is similar to the previous algorithm, so ACCEPTABLE_ITERS is unchanged.	2025-12-18 16:01:31 -05:00
Pieter Wuille	6a8fa821b8	clusterlin: add support for loading existing linearization (feature)	2025-12-18 16:01:22 -05:00
Pieter Wuille	da48ed9f34	clusterlin: ReadLinearization for non-topological (tests) Rather than using an ad-hoc no-dependency copy of the graph when a potentially non-topological linearization is needed in the clusterlin fuzz test, add this directly as a feature in ReadLinearization(). This is preparation for a later commit where another use for such a function is added.	2025-12-18 15:49:07 -05:00
Pieter Wuille	c461259fb6	clusterlin: add class implementing SFL state (preparation) This adds a data structure representing the optimization state for the spanning-forest linearization algorithm (SFL), plus a fuzz test for its correctness. This is preparation for switching over Linearize() to use this algorithm. See https://delvingbitcoin.org/t/spanning-forest-cluster-linearization/1419 for a description of the algorithm.	2025-12-18 15:49:01 -05:00
merge-script	516ae5ede4	Merge bitcoin/bitcoin#31533 : fuzz: Add fuzz target for block index tree and related validation events `db2d39f642` fuzz: add subtest for re-downloading a previously pruned block (Eugene Siegel) `45f5b2dac3` fuzz: Add fuzzer for block index (Martin Zumsande) `c011e3aa54` test: Wrap validation functions with TestChainstateManager (Martin Zumsande) Pull request description: This adds a fuzz target for the block index and various events in validation that interact with it. It can create arbitrary tree-like structure of block indexes, simulating (so far) the following events: - Adding a header - Receiving the full block (may be valid or not) - `ActivateBestChain()` - Reorging the chain to a new chain tip (possibly encountering invalid blocks on the way) - Pruning a block in the best chain - Receiving a previously pruned block again (`getblockfrompeer`) It might be interesting / possible to extend this to more events, such as dealing with more than one chainstate (assumeutxo). The test skips all actual validation of header/ block / transaction data by just simulating the outcome, and also doesn't interact with the data directory. The main goal is to ensure the integrity of the block index tree in all fuzzed constellations, by calling `CheckBlockIndex()` at the end of each iteration. Compared to #29158 this approach has a more limited scope (by skipping all actual validation), but it is fast - it doesn't do a full init sequence on each iteration, but "cleans up" after itself by resetting the global validation state after each iteration. ACKs for top commit: Crypt-iQ: reACK `db2d39f642` maflcko: review ACK `db2d39f642` 🍶 sedited: Re-ACK `db2d39f642` Tree-SHA512: 76cd5f8f4d7d7258620b46d7438bad4508c3bdc98825b48b60f694b5a9838e2b2cf4967c0ead181f86f66f4939ddfe552471851b9d18f84f584c03dd7e09fc43	2025-12-18 15:26:42 +00:00
merge-script	8d38b6f5f1	Merge bitcoin/bitcoin#34091 : fuzz: doc: remove any mention to `address_deserialize_v2` `caf4843a59` fuzz: doc: remove any mention to address_deserialize_v2 (brunoerg) Pull request description: We don't have `address_deserialize_v2` target anymore since `fac81affb5` (we used to have `address_deserialize_v1_notime`, `address_deserialize_v1_withtime` and `address_deserialize_v2` but now we only have a single `address_deserialize` target) so it removes any mention to it. ACKs for top commit: maflcko: review ACK `caf4843a59` 🎾 marcofleon: ACK `caf4843a59` Tree-SHA512: 539d69edbfe4ca11eb0701ed5c789ad81976e3e85e8a229e39e9dc1b1c72264f01d10a1c16d0a3bb4a354794412dc8b625298f4f72430905a00b65faeaa37d6b	2025-12-18 11:35:41 +00:00
Ryan Ofsky	ab513103df	Merge bitcoin/bitcoin#33192 : refactor: unify container presence checks `d9319b06cf` refactor: unify container presence checks - non-trivial counts (Lőrinc) `039307554e` refactor: unify container presence checks - trivial counts (Lőrinc) `8bb9219b63` refactor: unify container presence checks - find (Lőrinc) Pull request description: ### Summary Instead of counting occurrences in sets and maps, the C++20 `::contains` method expresses the intent unambiguously and can return early on first encounter. ### Context Applied clang‑tidy's [readability‑container‑contains](https://clang.llvm.org/extra/clang-tidy/checks/readability/container-contains.html) check, though many cases required manual changes since tidy couldn't fix them automatically. ### Changes The changes made here were: \| From \| To \| \|------------------------\|------------------\| \| `m.find(k) == m.end()` \| `!m.contains(k)` \| \| `m.find(k) != m.end()` \| `m.contains(k)` \| \| `m.count(k)` \| `m.contains(k)` \| \| `!m.count(k)` \| `!m.contains(k)` \| \| `m.count(k) == 0` \| `!m.contains(k)` \| \| `m.count(k) != 1` \| `!m.contains(k)` \| \| `m.count(k) == 1` \| `m.contains(k)` \| \| `m.count(k) < 1` \| `!m.contains(k)` \| \| `m.count(k) > 0` \| `m.contains(k)` \| \| `m.count(k) != 0` \| `m.contains(k)` \| > Note that `== 1`/`!= 1`/`< 1` only apply to simple [maps](https://en.cppreference.com/w/cpp/container/map/contains)/[sets](https://en.cppreference.com/w/cpp/container/set/contains) and had to be changed manually. There are many other cases that could have been changed, but we've reverted most of those to reduce conflict with other open PRs. 
----- <details> <summary>clang-tidy command on Mac</summary> ```bash rm -rfd build && \ cmake -B build \ -DCMAKE_C_COMPILER="$(brew --prefix llvm)/bin/clang" \ -DCMAKE_CXX_COMPILER="$(brew --prefix llvm)/bin/clang++" \ -DCMAKE_OSX_SYSROOT="$(xcrun --show-sdk-path)" \ -DCMAKE_C_FLAGS="-target arm64-apple-macos11" \ -DCMAKE_CXX_FLAGS="-target arm64-apple-macos11" \ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_BENCH=ON -DBUILD_FUZZ_BINARY=ON -DBUILD_FOR_FUZZING=ON "$(brew --prefix llvm)/bin/run-clang-tidy" -quiet -p build -j$(nproc) -checks='-*,readability-container-contains' \| grep -v 'clang-tidy' ``` </details> Note: this is a take 2 of https://github.com/bitcoin/bitcoin/pull/33094 with fewer contentious changes. ACKs for top commit: optout21: reACK `d9319b06cf` sedited: ACK `d9319b06cf` janb84: re ACK `d9319b06cf` pablomartin4btc: re-ACK `d9319b06cf` ryanofsky: Code review ACK `d9319b06cf`. I manually reviewed the full change, and it seems there are a lot of positive comments about this and no more very significant conflicts, so I will merge it shortly. Tree-SHA512: e4415221676cfb88413ccc446e5f4369df7a55b6642347277667b973f515c3c8ee5bfa9ee0022479c8de945c89fbc9ff61bd8ba086e70f30298cbc1762610fe1	2025-12-17 16:17:29 -05:00
brunoerg	caf4843a59	fuzz: doc: remove any mention to address_deserialize_v2	2025-12-17 11:57:11 -03:00
MarcoFalke	fa5f297748	scripted-diff: [doc] Unify stale copyright headers -BEGIN VERIFY SCRIPT- sed --in-place --regexp-extended \ 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' \ $( git grep -l 'The Bitcoin Core developers' -- ':(exclude)COPYING' ':(exclude)src/ipc/libmultiprocess' ':(exclude)src/minisketch' ) -END VERIFY SCRIPT-	2025-12-16 22:21:15 +01:00
Eugene Siegel	db2d39f642	fuzz: add subtest for re-downloading a previously pruned block This imitates the use of the getblockfrompeer rpc. Note that currently pruning is limited to blocks in the active chain. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>	2025-12-16 11:25:46 -05:00
Martin Zumsande	45f5b2dac3	fuzz: Add fuzzer for block index This fuzz target creates arbitrary tree-like structure of indices, simulating the following events: - Adding a header to the block tree db - Receiving the full block (may be valid or not) - Reorging to a new chain tip (possibly encountering invalid blocks on the way) - pruning The test skips all actual validation of header/ block / transaction data by just simulating the outcome, and also doesn't interact with the data directory. The main goal is to test the integrity of the block index tree in all fuzzed constellations, by calling CheckBlockIndex() at the end of each iteration.	2025-12-16 11:25:46 -05:00
merge-script	13891a8a68	Merge bitcoin/bitcoin#34050 : fuzz: exercise `ComputeMerkleRoot` without `mutated` parameter `7e9de20c0c` fuzz: exercise `ComputeMerkleRoot` without mutated parameter (Lőrinc) Pull request description: The `mutated` parameter in `ComputeMerkleRoot` unlocks a different path that was always exercised in the fuzz test. Adjusted the fuzzer to pass `nullptr` as well to make sure that path is also tested: `24ed820d4f/src/consensus/merkle.cpp (L49-L53)` Follow-up to https://github.com/bitcoin/bitcoin/pull/33805#discussion_r2589073735 ACKs for top commit: frankomosh: ACK [`7e9de20`](`7e9de20c0c`) hodlinator: ACK `7e9de20c0c` sedited: ACK `7e9de20c0c` Tree-SHA512: bf27029ac04003447b24a95544ec863f9ceca6c28d51ea811dd6ca2b412a2a780bb9fdbcdc82719f39dd710a746eb2446263e8377d67a8be52a1694571d03498	2025-12-16 14:25:55 +00:00
merge-script	4f11ef058b	Merge bitcoin/bitcoin#30214 : refactor: Improve assumeutxo state representation `82be652e40` doc: Improve ChainstateManager documentation, use consistent terms (Ryan Ofsky) `af455dcb39` refactor: Simplify pruning functions (TheCharlatan) `ae85c495f1` refactor: Delete ChainstateManager::GetAll() method (Ryan Ofsky) `6a572dbda9` refactor: Add ChainstateManager::ActivateBestChains() method (Ryan Ofsky) `491d827d52` refactor: Add ChainstateManager::m_chainstates member (Ryan Ofsky) `e514fe6116` refactor: Delete ChainstateManager::SnapshotBlockhash() method (Ryan Ofsky) `ee35250683` refactor: Delete ChainstateManager::IsSnapshotValidated() method (Ryan Ofsky) `d9e82299fc` refactor: Delete ChainstateManager::IsSnapshotActive() method (Ryan Ofsky) `4dfe383912` refactor: Convert ChainstateRole enum to struct (Ryan Ofsky) `352ad27fc1` refactor: Add ChainstateManager::ValidatedChainstate() method (Ryan Ofsky) `a229cb9477` refactor: Add ChainstateManager::CurrentChainstate() method (Ryan Ofsky) `a9b7f5614c` refactor: Add Chainstate::StoragePath() method (Ryan Ofsky) `840bd2ef23` refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation (Ryan Ofsky) `1598a15aed` refactor: Deduplicate Chainstate activation code (Ryan Ofsky) `9fe927b6d6` refactor: Add Chainstate m_assumeutxo and m_target_utxohash members (Ryan Ofsky) `6082c84713` refactor: Add Chainstate::m_target_blockhash member (Ryan Ofsky) `de00e87548` test: Fix broken chainstatemanager_snapshot_init check (Ryan Ofsky) Pull request description: This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking `cs_main` for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart. The changes in this PR are just refactoring. 
They make `Chainstate` objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying `ChainstateManager`, and to determine whether a Chainstate is validated without basing it on inferences like `&cs != &ActiveChainstate()` or `GetAll().size() == 1`. The PR also tries to make assumeutxo terminology less confusing, using "current chainstate" to refer to the chainstate targeting the current network tip, and "historical chainstate" to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms "active chainstate," "usable chainstate," "disabled chainstate," "ibd chainstate," and "snapshot chainstate" which are confusing for various reasons. ACKs for top commit: maflcko: re-review ACK `82be652e40` 🕍 fjahr: re-ACK `82be652e40` sedited: Re-ACK `82be652e40` Tree-SHA512: 81c67abba9fc5bb170e32b7bf8a1e4f7b5592315b4ef720be916d5f1f5a7088c0c59cfb697744dd385552f58aa31ee36176bae6a6e465723e65861089a1252e5	2025-12-16 14:03:34 +00:00
TheCharlatan	6da6f503a6	refactor: Let CCoinsViewCache::BatchWrite return void CCoinsViewCache::BatchWrite always returns true if called from a backed cache, so just return void instead. Also return void from ::Sync and ::Flush. This allows for dropping a FatalError condition and simplifying some dead error handling code a bit. Since we now no longer exercise the "error path" when returning from `CCoinsView::BatchWrite`, make the method clear the cache instead. This should only be exercised by tests and not change production behaviour. This might slightly improve the coins_view fuzz test's ability to generate better coverage. Co-authored-by: l0rinc <pap.lorinc@gmail.com>	2025-12-14 22:25:31 +01:00
marcofleon	a70a14a3f4	refactor: Separate out logic for building a tree-shaped dependency graph	2025-12-12 16:09:53 +01:00
marcofleon	ce29d7d626	fuzz: Fix variable in `clusterlin_postlinearize_tree` check The test intends to verify that running `PostLinearize` a second time on a tree-structured graph doesn't change the result. But `PostLinearize` was being called on the original variable, not the copy. So the check was comparing the unmodified copy against itself, which is useless. Fix by post-linearizing the correct variable.	2025-12-12 15:04:10 +00:00
marcofleon	876e2849b4	fuzz: Fix incorrect loop bounds in `clusterlin_postlinearize_tree` The dependency graphs generated by this test can have holes (unused indices) in them. This means some of the transactions were skipped when using `depgraph_gen.TxCount()` as the upper bound of the loop. Switch to using `depgraph.Positions()` to correctly handle sparse graphs.	2025-12-12 15:02:26 +00:00
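The loop-bound problem described in the commit above can be sketched with a hypothetical stand-in for a sparse graph. `SparseSlots`, `UsedCount`, `VisitByCount` and `VisitByPositions` are illustration-only names and are not the `DepGraph` API. The assumption is simply that used indices may have holes between them, so the number of used entries is not a valid upper bound for the index.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical sparse container: slot i holds a value if index i is in use,
// and std::nullopt if it is a hole.
using SparseSlots = std::vector<std::optional<int>>;

// Number of used slots (analogous to a transaction count).
std::size_t UsedCount(const SparseSlots& slots)
{
    std::size_t n{0};
    for (const auto& slot : slots) n += slot.has_value();
    return n;
}

// Buggy pattern: uses the count of used slots as the index bound,
// so used slots at indices >= count are never visited.
std::vector<std::size_t> VisitByCount(const SparseSlots& slots)
{
    std::vector<std::size_t> visited;
    const std::size_t count{UsedCount(slots)};
    for (std::size_t i{0}; i < count; ++i) {
        if (slots[i]) visited.push_back(i);
    }
    return visited;
}

// Fixed pattern: walks every position and keeps the used ones.
std::vector<std::size_t> VisitByPositions(const SparseSlots& slots)
{
    std::vector<std::size_t> visited;
    for (std::size_t i{0}; i < slots.size(); ++i) {
        if (slots[i]) visited.push_back(i);
    }
    return visited;
}
```

With slots `{1, hole, 2, hole, 3}` the count is 3, so the count-bounded loop only reaches indices 0 and 2, while the position-based loop also reaches index 4.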
Ryan Ofsky	e514fe6116	refactor: Delete ChainstateManager::SnapshotBlockhash() method SnapshotBlockhash() is only called in two places outside of tests, and is used redundantly in some tests, checking the same field as other checks. Simplify by dropping the method and using the m_from_snapshot_blockhash field directly.	2025-12-12 06:49:59 -04:00
Ryan Ofsky	840bd2ef23	refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation Remove hardcoded references to m_ibd_chainstate and m_snapshot_chainstate so MaybeCompleteSnapshotValidation function can be simpler and focus on validating the snapshot without dealing with internal ChainstateManager states. This is a step towards being able to validate the snapshot outside of ActivateBestChain loop so cs_main is not locked for minutes when the snapshot block is connected.	2025-12-12 06:49:59 -04:00
Lőrinc	f0a2183108	test: adjust `ComputeMerkleRoot` tests Update the integer fuzz test to move the vector into `ComputeMerkleRoot`, matching production usage patterns and avoiding unnecessary copies. Update `merkle_test_BlockWitness` to use an odd number of transactions to ensure the test covers the scenario where leaf duplication occurs. Also switch to `GetWitnessHash` to match `BlockWitnessMerkleRoot` semantics. The manual vector setup retains the exact-size `resize` to explicitly verify the behavior against the calculated root.	2025-12-11 14:47:48 +01:00
Lőrinc	7e9de20c0c	fuzz: exercise `ComputeMerkleRoot` without mutated parameter Co-authored-by: sedited <seb.kung@gmail.com>	2025-12-11 12:47:18 +01:00
Ava Chow	b26762bdcb	Merge bitcoin/bitcoin#33805 : merkle: migrate `path` arg to reference and drop unused args `24ed820d4f` merkle: remove unused `mutated` arg from `BlockWitnessMerkleRoot` (Lőrinc) `63d640fa6a` merkle: remove unused `proot` and `pmutated` args from `MerkleComputation` (Lőrinc) `be270551df` merkle: migrate `path` arg of `MerkleComputation` to a reference (Lőrinc) Pull request description: ### Summary Simplifies merkle tree computation by removing dead code found through coverage analysis (following up on #33768 and #33786). ### History #### MerkleComputation Original `MerkleComputation` was added in `ee60e5625b (diff-706988c23877f8a557484053887f932b2cafb3b5998b50497ce7ff8118ac85a3R131)` where it was called for either `&hash, mutated` or `position, &ret` args. In `1f0e7ca09c (diff-706988c23877f8a557484053887f932b2cafb3b5998b50497ce7ff8118ac85a3L135-L165)` the first usage was inlined in `ComputeMerkleRoot`, leaving the `proot` and `pmutated` values unused in `MerkleComputation`. Later in `4defdfab94` the method was moved to test and in `63d6ad7c89 (diff-706988c23877f8a557484053887f932b2cafb3b5998b50497ce7ff8118ac85a3R87-R95)` was restored to the code, though with unused parameters again. #### BlockWitnessMerkleRoot `BlockWitnessMerkleRoot` was introduced in `8b49040854` where it was already called with `NULL` `8b49040854 (diff-34d21af3c614ea3cee120df276c9c4ae95053830d7f1d3deaf009a4625409ad2R3509)` or an unused dummy `8b49040854 (diff-34d21af3c614ea3cee120df276c9c4ae95053830d7f1d3deaf009a4625409ad2R3598-R3599)` for the `mutated` parameter. 
### Fixes #### MerkleComputation - Converts `path` parameter from pointer to reference (always non-null at call site) - Removes `proot` and `pmutated` parameters (always `nullptr` at call site) #### BlockWitnessMerkleRoot - Removes unused `mutated` output parameter (always passed as `nullptr`) The change is a refactor that shouldn't introduce any behavioral change, only remove dead code, leftovers from previous refactors. ### Coverage proof https://maflcko.github.io/b-c-cov/total.coverage/src/consensus/merkle.cpp.gcov.html ACKs for top commit: optout21: utACK `24ed820d4f` Sjors: utACK `24ed820d4f` achow101: ACK `24ed820d4f` sedited: ACK `24ed820d4f` hodlinator: ACK `24ed820d4f` Tree-SHA512: 6960411304631bc381a3db7a682f6b6ba51bd58936ca85aa237c69a9109265b736b22ec4d891875bddfcbe8517bd3f014c44a4b387942eee4b01029c91ec93e1	2025-12-10 15:28:50 -08:00
Ava Chow	0f6d8a347a	Merge bitcoin/bitcoin#30442 : precalculate SipHash constant salt XORs `6eb5ba5691` refactor: extract shared `SipHash` state into `SipHashState` (Lőrinc) `118d22ddb4` optimization: cache `PresaltedSipHasher` in `CBlockHeaderAndShortTxIDs` (Lőrinc) `9ca52a4cbe` optimization: migrate `SipHashUint256` to `PresaltedSipHasher` (Lőrinc) `ec11b9fede` optimization: introduce `PresaltedSipHasher` for repeated hashing (Lőrinc) `20330548cf` refactor: extract `SipHash` C0-C3 constants to class scope (Lőrinc) `9f9eb7fbc0` test: rename k1/k2 to k0/k1 in `SipHash` consistency tests (Lőrinc) Pull request description: This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043) ### Summary The in-memory representation of the UTXO set uses (salted) [SipHash](https://github.com/bitcoin/bitcoin/blob/master/src/coins.h#L226) to avoid key collision attacks. Hashing `uint256` keys is performed frequently throughout the codebase. Previously, specialized optimizations existed as standalone functions (`SipHashUint256` and `SipHashUint256Extra`), but the constant salting operations (C0-C3 XOR with keys) were recomputed on every call. This PR introduces `PresaltedSipHasher`, a class that caches the initial SipHash state (v0-v3 after XORing constants with keys), eliminating redundant constant computations when hashing multiple values with the same keys. The optimization is applied uniformly across: - All `SaltedHasher` classes (`SaltedUint256Hasher`, `SaltedTxidHasher`, `SaltedWtxidHasher`, `SaltedOutpointHasher`) - `CBlockHeaderAndShortTxIDs` for compact block short ID computation ### Details The change replaces the standalone `SipHashUint256` and `SipHashUint256Extra` functions with `PresaltedSipHasher` class methods that cache the constant-salted state. 
This is particularly beneficial for hash map operations where the same salt is used repeatedly (as suggested by Sipa in https://github.com/bitcoin/bitcoin/pull/30442#issuecomment-2628994530). `CSipHasher` behavior remains unchanged; only the specialized `uint256` paths and callers now reuse the cached state instead of recomputing it. ### Measurements Benchmarks were run using local `SaltedOutpointHasherBench_` microbenchmarks (not included in this PR) that exercise `SaltedOutpointHasher` in realistic `std::unordered_set` scenarios. <details> <summary>Benchmarks</summary> ```C++ diff --git a/src/bench/crypto_hash.cpp b/src/bench/crypto_hash.cpp --- a/src/bench/crypto_hash.cpp(revision `9b1a7c3e8d`) +++ b/src/bench/crypto_hash.cpp(revision e1b4f056b3097e7e34b0eda31f57826d81c9d810) @@ -2,7 +2,6 @@ // Distributed under the MIT software license, see the accompanying // file COPYING or http://www.opensource.org/licenses/mit-license.php. - #include <bench/bench.h> #include <crypto/muhash.h> #include <crypto/ripemd160.h> @@ -12,9 +11,11 @@ #include <crypto/sha512.h> #include <crypto/siphash.h> #include <random.h> -#include <span.h> #include <tinyformat.h> #include <uint256.h> +#include <primitives/transaction.h> +#include <util/hasher.h> +#include <unordered_set> #include <cstdint> #include <vector> @@ -205,6 +206,98 @@ }); } +static void SaltedOutpointHasherBench_hash(benchmark::Bench& bench) +{ + FastRandomContext rng{/*fDeterministic=*/true}; + constexpr size_t size{1000}; + + std::vector<COutPoint> outpoints(size); + for (auto& outpoint : outpoints) { + outpoint = {Txid::FromUint256(rng.rand256()), rng.rand32()}; + } + + const SaltedOutpointHasher hasher; + bench.batch(size).run([&] { + size_t result{0}; + for (const auto& outpoint : outpoints) { + result ^= hasher(outpoint); + } + ankerl::nanobench::doNotOptimizeAway(result); + }); +} + +static void SaltedOutpointHasherBench_match(benchmark::Bench& bench) +{ + FastRandomContext rng{/*fDeterministic=*/true}; + constexpr 
size_t size{1000}; + + std::unordered_set<COutPoint, SaltedOutpointHasher> values; + std::vector<COutPoint> value_vector; + values.reserve(size); + value_vector.reserve(size); + + for (size_t i{0}; i < size; ++i) { + COutPoint outpoint{Txid::FromUint256(rng.rand256()), rng.rand32()}; + values.emplace(outpoint); + value_vector.push_back(outpoint); + assert(values.contains(outpoint)); + } + + bench.batch(size).run([&] { + bool result{true}; + for (const auto& outpoint : value_vector) { + result ^= values.contains(outpoint); + } + ankerl::nanobench::doNotOptimizeAway(result); + }); +} + +static void SaltedOutpointHasherBench_mismatch(benchmark::Bench& bench) +{ + FastRandomContext rng{/*fDeterministic=*/true}; + constexpr size_t size{1000}; + + std::unordered_set<COutPoint, SaltedOutpointHasher> values; + std::vector<COutPoint> missing_value_vector; + values.reserve(size); + missing_value_vector.reserve(size); + + for (size_t i{0}; i < size; ++i) { + values.emplace(Txid::FromUint256(rng.rand256()), rng.rand32()); + COutPoint missing_outpoint{Txid::FromUint256(rng.rand256()), rng.rand32()}; + missing_value_vector.push_back(missing_outpoint); + assert(!values.contains(missing_outpoint)); + } + + bench.batch(size).run([&] { + bool result{false}; + for (const auto& outpoint : missing_value_vector) { + result ^= values.contains(outpoint); + } + ankerl::nanobench::doNotOptimizeAway(result); + }); +} + +static void SaltedOutpointHasherBench_create_set(benchmark::Bench& bench) +{ + FastRandomContext rng{/*fDeterministic=*/true}; + constexpr size_t size{1000}; + + std::vector<COutPoint> outpoints(size); + for (auto& outpoint : outpoints) { + outpoint = {Txid::FromUint256(rng.rand256()), rng.rand32()}; + } + + bench.batch(size).run([&] { + std::unordered_set<COutPoint, SaltedOutpointHasher> set; + set.reserve(size); + for (const auto& outpoint : outpoints) { + set.emplace(outpoint); + } + ankerl::nanobench::doNotOptimizeAway(set.size()); + }); +} + static void 
MuHash(benchmark::Bench& bench) { MuHash3072 acc; @@ -276,6 +369,10 @@ BENCHMARK(SHA256_32b_AVX2, benchmark::PriorityLevel::HIGH); BENCHMARK(SHA256_32b_SHANI, benchmark::PriorityLevel::HIGH); BENCHMARK(SipHash_32b, benchmark::PriorityLevel::HIGH); +BENCHMARK(SaltedOutpointHasherBench_hash, benchmark::PriorityLevel::HIGH); +BENCHMARK(SaltedOutpointHasherBench_match, benchmark::PriorityLevel::HIGH); +BENCHMARK(SaltedOutpointHasherBench_mismatch, benchmark::PriorityLevel::HIGH); +BENCHMARK(SaltedOutpointHasherBench_create_set, benchmark::PriorityLevel::HIGH); BENCHMARK(SHA256D64_1024_STANDARD, benchmark::PriorityLevel::HIGH); BENCHMARK(SHA256D64_1024_SSE4, benchmark::PriorityLevel::HIGH); BENCHMARK(SHA256D64_1024_AVX2, benchmark::PriorityLevel::HIGH); ``` </details> > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='SaltedOutpointHasherBench' -min-time=10000 > Before: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 58.60 \| 17,065,922.04 \| 0.3% \| 11.02 \| `SaltedOutpointHasherBench_create_set` \| 11.97 \| 83,576,684.83 \| 0.1% \| 11.01 \| `SaltedOutpointHasherBench_hash` \| 14.50 \| 68,985,850.12 \| 0.3% \| 10.96 \| `SaltedOutpointHasherBench_match` \| 13.90 \| 71,942,033.47 \| 0.4% \| 11.03 \| `SaltedOutpointHasherBench_mismatch` > After: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 57.27 \| 17,462,299.19 \| 0.1% \| 11.02 \| `SaltedOutpointHasherBench_create_set` \| 11.24 \| 88,997,888.48 \| 0.3% \| 11.04 \| `SaltedOutpointHasherBench_hash` \| 13.91 \| 71,902,014.20 \| 0.2% \| 11.01 \| `SaltedOutpointHasherBench_match` \| 13.29 \| 75,230,390.31 \| 0.1% \| 11.00 \| `SaltedOutpointHasherBench_mismatch` compared to master: ```python create_set - 17,462,299.19 / 17,065,922.04 - 2.3% faster hash - 88,997,888.48 / 
83,576,684.83 - 6.4% faster match - 71,902,014.20 / 68,985,850.12 - 4.2% faster mismatch - 75,230,390.31 / 71,942,033.47 - 4.5% faster ``` > C++ compiler .......................... GNU 13.3.0 > Before: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 136.76 \| 7,312,133.16 \| 0.0% \| 1,086.67 \| 491.12 \| 2.213 \| 119.54 \| 1.1% \| 11.01 \| `SaltedOutpointHasherBench_create_set` \| 23.82 \| 41,978,882.62 \| 0.0% \| 252.01 \| 85.57 \| 2.945 \| 4.00 \| 0.0% \| 11.00 \| `SaltedOutpointHasherBench_hash` \| 60.42 \| 16,549,695.42 \| 0.1% \| 460.51 \| 217.04 \| 2.122 \| 21.00 \| 1.4% \| 10.99 \| `SaltedOutpointHasherBench_match` \| 78.66 \| 12,713,595.35 \| 0.1% \| 555.59 \| 282.52 \| 1.967 \| 20.19 \| 2.2% \| 10.74 \| `SaltedOutpointHasherBench_mismatch` > After: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 135.38 \| 7,386,349.49 \| 0.0% \| 1,078.19 \| 486.16 \| 2.218 \| 119.56 \| 1.1% \| 11.00 \| `SaltedOutpointHasherBench_create_set` \| 23.67 \| 42,254,558.08 \| 0.0% \| 247.01 \| 85.01 \| 2.906 \| 4.00 \| 0.0% \| 11.00 \| `SaltedOutpointHasherBench_hash` \| 58.95 \| 16,962,220.14 \| 0.1% \| 446.55 \| 211.74 \| 2.109 \| 20.86 \| 1.4% \| 11.01 \| `SaltedOutpointHasherBench_match` \| 76.98 \| 12,991,047.69 \| 0.1% \| 548.93 \| 276.50 \| 1.985 \| 20.25 \| 2.3% \| 10.72 \| `SaltedOutpointHasherBench_mismatch` ```python compared to master: create_set - 7,386,349.49 / 7,312,133.16 - 1.0% faster hash - 42,254,558.08 / 41,978,882.62 - 0.6% faster match - 16,962,220.14 / 16,549,695.42 - 2.4% faster mismatch - 12,991,047.69 / 12,713,595.35 - 2.1% faster ``` ACKs for 
top commit: achow101: ACK `6eb5ba5691` vasild: ACK `6eb5ba5691` sipa: ACK `6eb5ba5691` Tree-SHA512: 9688b87e1d79f8af9efc18a8487922c5f1735487a9c5b78029dd46abc1d94f05d499cd1036bd615849aa7d6b17d11653c968086050dd7d04300403ebd0e81210	2025-12-10 15:22:34 -08:00
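The caching idea described in the PR above can be sketched as follows. This is a simplified illustration and not the code from the PR: `sketch::State`, `Salt`, `Finish`, `HashPerCall` and `PresaltedSketch` are hypothetical names, and the input is assumed to be four 64-bit words standing in for a `uint256`. The only point is that the four XORs of the SipHash constants with the keys can be done once at construction instead of on every call.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

namespace sketch {

// SipHash initialization constants ("somepseudorandomlygeneratedbytes").
constexpr uint64_t C0{0x736f6d6570736575ULL};
constexpr uint64_t C1{0x646f72616e646f6dULL};
constexpr uint64_t C2{0x6c7967656e657261ULL};
constexpr uint64_t C3{0x7465646279746573ULL};

constexpr uint64_t Rotl(uint64_t x, int b) { return (x << b) | (x >> (64 - b)); }

struct State {
    uint64_t v0, v1, v2, v3;

    // One SipRound.
    void Round()
    {
        v0 += v1; v1 = Rotl(v1, 13); v1 ^= v0; v0 = Rotl(v0, 32);
        v2 += v3; v3 = Rotl(v3, 16); v3 ^= v2;
        v0 += v3; v3 = Rotl(v3, 21); v3 ^= v0;
        v2 += v1; v1 = Rotl(v1, 17); v1 ^= v2; v2 = Rotl(v2, 32);
    }
};

using Words = std::array<uint64_t, 4>; // stand-in for a 256-bit value

// The salting step: XOR the constants with the two key halves.
inline State Salt(uint64_t k0, uint64_t k1)
{
    return State{C0 ^ k0, C1 ^ k1, C2 ^ k0, C3 ^ k1};
}

// SipHash-2-4 compression and finalization for a 32-byte message,
// starting from an already salted state (taken by value, so the
// caller's cached copy is never modified).
inline uint64_t Finish(State s, const Words& words)
{
    for (const uint64_t m : words) {
        s.v3 ^= m; s.Round(); s.Round(); s.v0 ^= m;
    }
    const uint64_t b{uint64_t{32} << 56}; // message length in the top byte
    s.v3 ^= b; s.Round(); s.Round(); s.v0 ^= b;
    s.v2 ^= 0xff;
    for (int i{0}; i < 4; ++i) s.Round();
    return s.v0 ^ s.v1 ^ s.v2 ^ s.v3;
}

// Recomputes the salted state on every call.
inline uint64_t HashPerCall(uint64_t k0, uint64_t k1, const Words& words)
{
    return Finish(Salt(k0, k1), words);
}

// Computes the salted state once and reuses it for every hash.
class PresaltedSketch
{
    const State m_state;

public:
    PresaltedSketch(uint64_t k0, uint64_t k1) : m_state(Salt(k0, k1)) {}
    uint64_t operator()(const Words& words) const { return Finish(m_state, words); }
};

} // namespace sketch
```

Both paths produce the same value for the same keys and input; the presalted object just skips the four XORs on each call, which matters mainly when the same salt is reused for many lookups, as in a hash map.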

2027 Commits