merge-script b9bf24cfe2 Merge bitcoin/bitcoin#34616: Cluster mempool: SFL cost model (take 2)
744d47fcee clusterlin: adopt trained cost model (feature) (Pieter Wuille)
4eefdfc5b7 clusterlin: rescale costs (preparation) (Pieter Wuille)
ecc9a84f85 clusterlin: use 'cost' terminology instead of 'iters' (refactor) (Pieter Wuille)
9e7129df29 clusterlin: introduce CostModel class (preparation) (Pieter Wuille)

Pull request description:

  Part of #30289, replaces earlier #34138.

  This introduces a more accurate cost model for SFL, to control how much CPU time is spent inside the algorithm for clusters that cannot be linearized perfectly within a reasonable amount of time.

  The goal is having a metric for the amount of work performed, so that txmempool can impose limits on that work: a lower bound that is always performed (unless optimality is reached before that point, of course), and an upper bound to limit the latency and total CPU time spent on this. There are conflicting design goals here:
  * On the one hand, it seems ideal if this metric is closely correlated to actual CPU time, because otherwise the limits become inaccurate.
  * On the other hand, it seems a nightmare to have the metric be platform/system dependent, as it makes network-wide reasoning nearly impossible. It's expected that slower systems take longer to do the same thing; this holds for everything, and we don't need to compensate for this.

  There are multiple solutions to this:
  * One extreme is just measuring the time. This is very accurate, but extremely platform dependent, and also non-deterministic due to random scheduling/cache effects.
  * The other extreme is using a very abstract metric like counting how many times certain loops/function inside the algorithm run. That is what is implemented in master right now, just counting the sum of the numbers of transactions updated across all `UpdateChunks()` calls. It however necessarily fails to account for significant portions of runtime spent elsewhere, resulting in a rather wide range of "ns per cost" values.
  * This PR takes a middle ground, counting many function calls / branches / loops, with weights that were determined through benchmarking on an average on a number of systems.

  Specifically, the cost model was obtained by:
  * For a variety of machines:
    * Running a fixed collection of ~385000 clusters found through random generation and fuzzing, optimizing for difficulty of linearization.
      * Linearize each 1000-5000 times, with different random seeds. Sometimes without input linearization, sometimes with a bad one.
        * Gather cycle counts for each of the operations included in this cost model, broken down by their parameters.
    * Correct the data by subtracting the runtime of obtaining the cycle count.
    * Drop the 5% top and bottom samples from each cycle count dataset, and compute the average of the remaining samples.
    * For each operation, fit a least-squares linear function approximation through the samples.
  * Rescale all machine expressions to make their total time match, as we only care about relative cost of each operation.
  * Take the per-operation average of operation expressions across all machines, to construct expressions for an average machine.
  * Approximate the result with integer coefficients.

  The benchmarks were performed by `l0rinc <pap.lorinc@gmail.com>` and myself, on AMD Ryzen 5950X, AMD Ryzen 7995WX, AMD Ryzen 9980X, Apple M4 Max, Intel Core i5-12500H, Intel Core Ultra 7 155H, Intel N150 (Umbrel), Intel Core i7-7700, Intel Core i9-9900K, Intel Haswell (VPS, virtualized), Intel Xeon E5-2637, ARM Cortex-A76 (Raspberry Pi 5), ARM Cortex-A72 (Raspberry Pi 4).

  Based on final benchmarking, the "acceptable" iteration count (which is the minimum spent on every cluster) is to 75000 units, which corresponds to roughly 50 μs on Ryzen 5950X and similar modern desktop hardware.

ACKs for top commit:
  instagibbs:
    ACK 744d47fcee
  murchandamus:
    reACK 744d47fcee

Tree-SHA512: 5cb37a6bdd930389937c435f910410c3581e53ce609b9b594a8dc89601e6fca6e6e26216e961acfe9540581f889c14bf289b6a08438a2d7adafd696fc81ff517
2026-02-25 12:11:13 +00:00
2026-02-06 13:40:59 +00:00
2026-02-24 11:21:40 +00:00
2025-12-29 17:50:43 +00:00
2025-06-19 11:22:14 +01:00

Bitcoin Core integration/staging tree

https://bitcoincore.org

For an immediately usable, binary version of the Bitcoin Core software, see https://bitcoincore.org/en/download/.

What is Bitcoin Core?

Bitcoin Core connects to the Bitcoin peer-to-peer network to download and fully validate blocks and transactions. It also includes a wallet and graphical user interface, which can be optionally built.

Further information about Bitcoin Core is available in the doc folder.

License

Bitcoin Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/license/MIT.

Development Process

The master branch is regularly built (see doc/build-*.md for instructions) and tested, but it is not guaranteed to be completely stable. Tags are created regularly from release branches to indicate new official, stable release versions of Bitcoin Core.

The https://github.com/bitcoin-core/gui repository is used exclusively for the development of the GUI. Its master branch is identical in all monotree repositories. Release branches and tags do not exist, so please do not fork that repository unless it is for development reasons.

The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.

Testing

Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.

Automated Testing

Developers are strongly encouraged to write unit tests for new code, and to submit new unit tests for old code. Unit tests can be compiled and run (assuming they weren't disabled during the generation of the build system) with: ctest. Further details on running and extending unit tests can be found in /src/test/README.md.

There are also regression and integration tests, written in Python. These tests can be run (if the test dependencies are installed) with: build/test/functional/test_runner.py (assuming build is your build directory).

The CI (Continuous Integration) systems make sure that every pull request is tested on Windows, Linux, and macOS. The CI must pass on all commits before merge to avoid unrelated CI failures on new pull requests.

Manual Quality Assurance (QA) Testing

Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.

Translations

Changes to translations as well as new translations can be submitted to Bitcoin Core's Transifex page.

Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.

Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.

Description
Languages
C++ 65.1%
Python 19%
C 12.1%
CMake 1.3%
Shell 0.8%
Other 1.6%