Merge bitcoin/bitcoin#33591: Cluster mempool followups

b8d279a81c doc: add comment to explain correctness of GatherClusters() (Suhas Daftuar)
aba7500a30 Fix parameter name in getmempoolcluster rpc (Suhas Daftuar)
6c1325a091 Rename weight -> clusterweight in RPC output, and add doc explaining mempool terminology (Suhas Daftuar)
bc2eb931da Require mempool lock to be held when invoking TRUC checks (Suhas Daftuar)
957ae23241 Improve comments for getTransactionAncestry to reference cluster counts instead of descendants (Suhas Daftuar)
d97d6199ce Fix comment to reference cluster limits, not chain limits (Suhas Daftuar)
a1b341ef98 Sanity check feerate diagram in CTxMemPool::check() (Suhas Daftuar)
23d6f457c4 rpc: improve getmempoolcluster output (Suhas Daftuar)
d2dcd37aac Avoid using mapTx.modify() to update modified fees (Suhas Daftuar)
d84ffc24d2 doc: add release notes snippet for cluster mempool (Suhas Daftuar)
b0417ba944 doc: Add design notes for cluster mempool and explain new mempool limits (Suhas Daftuar)
2d88966e43 miner: replace "package" with "chunk" (Suhas Daftuar)
6f3e8eb300 Add a GetFeePerVSize() accessor to CFeeRate, and use it in the BlockAssembler (Suhas Daftuar)
b5f245f6f2 Remove unused DEFAULT_ANCESTOR_SIZE_LIMIT_KVB and DEFAULT_DESCENDANT_SIZE_LIMIT_KVB (Suhas Daftuar)
1dac54d506 Use cluster size limit instead of ancestor size limit in txpackage unit test (Suhas Daftuar)
04f65488ca Use cluster size limit instead of ancestor/descendant size limits when sanity checking TRUC policy limits (Suhas Daftuar)
634291a7dc Use cluster limits instead of ancestor/descendant limits when sanity checking package policy limits (Suhas Daftuar)
fc18ef1f3f Remove ancestor and descendant vsize limits from MemPoolLimits (Suhas Daftuar)
ed8e819121 Warn user if using -limitancestorsize/-limitdescendantsize that the options have no effect (Suhas Daftuar)
80d8df2d47 Invoke removeUnchecked() directly in removeForBlock() (Suhas Daftuar)
9292570f4c Rewrite GetChildren without sets (Suhas Daftuar)
3e39ea8c30 Rewrite removeForReorg to avoid using sets (Suhas Daftuar)
a3c31dfd71 scripted-diff: rename AddToMempool -> TryAddToMempool (Suhas Daftuar)
a5a7905d83 Simplify removeRecursive (Suhas Daftuar)
01d8520038 Remove unused argument to RemoveStaged (Suhas Daftuar)
bc64013e6f Remove unused variable (cacheMap) in mempool (Suhas Daftuar)

Pull request description:

  As suggested in the main cluster mempool PR (https://github.com/bitcoin/bitcoin/pull/28676#pullrequestreview-3177119367), I've pulled out some of the non-essential optimizations and cleanups into this separate PR.

  Will continue to add more commits here to address non-blocking suggestions/improvements as they come up.

ACKs for top commit:
  instagibbs:
    ACK b8d279a81c
  sipa:
    ACK b8d279a81c

Tree-SHA512: 1a05e99eaf8db2e274a1801307fed5d82f8f917e75ccb9ab0e1b0eb2f9672b13c79d691d78ea7cd96900d0e7d5031a3dd582ebcccc9b1d66eb7455b1d3642235
This commit is contained in:
merge-script
2025-12-02 09:46:00 +00:00
41 changed files with 587 additions and 334 deletions

@@ -9,7 +9,7 @@ contents. Policy is *not* applied to transactions in blocks.
This documentation is not an exhaustive list of all policy rules.
- [Mempool Limits](mempool-limits.md)
- [Mempool Design and Limits](mempool-design.md)
- [Mempool Replacements](mempool-replacements.md)
- [Packages](packages.md)

@@ -0,0 +1,104 @@
# Mempool design and limits
## Definitions
We view the unconfirmed transactions in the mempool as a directed graph,
with an edge from transaction B to transaction A if B spends an output created
by A (i.e., B is a **child** of A, and A is a **parent** of B).
A transaction's **ancestors** include, recursively, its parents, the parents of
its parents, etc. A transaction's **descendants** include, recursively, its
children, the children of its children, etc.
A **cluster** is a connected component of the graph, i.e., a set of
transactions where each transaction is reachable from any other transaction in
the set by following edges in either direction. The cluster corresponding to a
given transaction consists of that transaction, its ancestors and descendants,
and the ancestors and descendants of those transactions, and so on.
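As an illustration of the cluster definition above, the following is a minimal sketch that groups transactions into connected components. It assumes the mempool is given as a plain dictionary mapping each txid to the in-mempool txids it spends from; `find_clusters` and that input shape are hypothetical and are not Bitcoin Core's data structures.

```python
def find_clusters(parents):
    """Group transactions into clusters (connected components).

    `parents` maps each txid to the list of in-mempool txids it spends from.
    Returns a list of sets of txids, one set per cluster.
    """
    # Build an undirected adjacency map: edge direction is ignored for clustering.
    neighbors = {}
    for tx, tx_parents in parents.items():
        neighbors.setdefault(tx, set())  # keep transactions with no relatives
        for parent in tx_parents:
            neighbors[tx].add(parent)
            neighbors.setdefault(parent, set()).add(tx)

    seen = set()
    clusters = []
    for start in neighbors:
        if start in seen:
            continue
        # Walk parent and child edges alike until nothing new is reachable.
        component, stack = set(), [start]
        while stack:
            tx = stack.pop()
            if tx in component:
                continue
            component.add(tx)
            stack.extend(neighbors[tx] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

With `{"A": [], "B": ["A"], "C": ["A"], "D": []}`, transactions A, B and C form one cluster (B and C are siblings connected through A) and D is a cluster by itself.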
Each cluster is **linearized**, or sorted, in a topologically valid order (i.e.,
no transaction appears before any of its ancestors). Our goal is to construct a
linearization where the highest feerate subset of a cluster appears first,
followed by the next highest feerate subset of the remaining transactions, and
so on[1]. We call these subsets **chunks**, and the chunks of a linearization
have the property that they are always in monotonically decreasing feerate
order.
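One way to recover the chunks of an already topologically valid linearization is to walk it from front to back and merge the last group into its predecessor whenever the last group has the higher feerate. The sketch below assumes each transaction is a `(fee, size)` pair; `chunk_linearization` is an illustrative name, and the code is not taken from Bitcoin Core.

```python
def chunk_linearization(linearization):
    """Split a linearization into chunks.

    `linearization` is a list of (fee, size) pairs in topological order.
    Returns a list of [fee, size] totals, one per chunk.
    """
    chunks = []
    for fee, size in linearization:
        # Every transaction starts out as its own chunk.
        chunks.append([fee, size])
        # Merge backwards while the last chunk pays a higher feerate
        # than the chunk before it.
        while len(chunks) >= 2:
            (f1, s1), (f2, s2) = chunks[-2], chunks[-1]
            # Compare f2/s2 > f1/s1 by cross-multiplying to stay in integers.
            if f2 * s1 > f1 * s2:
                chunks[-2:] = [[f1 + f2, s1 + s2]]
            else:
                break
    return chunks
```

For example, a low-feerate parent `(1, 100)` followed by its high-feerate child `(9, 100)` and an unrelated `(2, 100)` transaction yields the chunks `[10, 200]` and `[2, 100]`.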
Given two or more linearized clusters, we can construct a linearization of the
union by simply merge sorting the chunks of each cluster by feerate.
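The merge step can be sketched with the standard library's `heapq.merge`. This assumes each cluster is supplied as a list of `(fee, size)` chunks already in decreasing feerate order; `merge_cluster_chunks` is a hypothetical helper, not a Bitcoin Core function.

```python
import heapq
from fractions import Fraction

def merge_cluster_chunks(clusters):
    """Merge per-cluster chunk lists into one list ordered by feerate.

    `clusters` is a list of chunk lists. Each chunk is a (fee, size) pair and
    each list is already sorted by decreasing feerate.
    """
    # heapq.merge yields items in ascending key order, so negate the feerate.
    # Fraction keeps the comparison exact.
    return list(heapq.merge(*clusters, key=lambda chunk: -Fraction(chunk[0], chunk[1])))
```

Because the merge never reorders chunks that come from the same cluster, each cluster's internal topological order is preserved in the result.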
For any set of linearized clusters, then, we can define the **feerate diagram**
of the set by plotting the cumulative fee (y-axis) against the cumulative size
(x-axis) as we progress from chunk to chunk. Given two linearizations for the
same set of transactions, we can compare their feerate diagrams by
comparing their cumulative fees at each size value. Two diagrams may be
**incomparable** if neither contains the other (i.e., there exist size values at
which each one has a greater cumulative fee than the other). Or, they may be
**equivalent** if they have identical cumulative fees at every size value; or
one may be **strictly better** than the other if they are comparable and there
exists at least one size value for which the cumulative fee is strictly higher
in one of them.
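The comparison described above can be sketched as follows, treating each diagram as a piecewise-linear function through its cumulative points. This is a simplified illustration: the function names are hypothetical, chunks are `(fee, size)` pairs with positive sizes, and a diagram is assumed to stay flat past its last point. Bitcoin Core's own comparison may differ in details.

```python
from fractions import Fraction

def diagram_points(chunks):
    """Cumulative (size, fee) points of a feerate diagram, starting at (0, 0)."""
    points, size, fee = [(0, 0)], 0, 0
    for chunk_fee, chunk_size in chunks:
        size += chunk_size
        fee += chunk_fee
        points.append((size, fee))
    return points

def fee_at(points, x):
    """Cumulative fee at size x, interpolating linearly inside a chunk."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + Fraction((y1 - y0) * (x - x0), x1 - x0)
    return Fraction(points[-1][1])  # past the end: assume no further fee

def compare_diagrams(chunks_a, chunks_b):
    """Classify two diagrams as equivalent, strictly better, or incomparable."""
    a, b = diagram_points(chunks_a), diagram_points(chunks_b)
    # Both functions are piecewise linear, so checking every breakpoint of
    # either diagram is enough to find any size where they differ.
    xs = sorted({x for x, _ in a} | {x for x, _ in b})
    a_ahead = any(fee_at(a, x) > fee_at(b, x) for x in xs)
    b_ahead = any(fee_at(b, x) > fee_at(a, x) for x in xs)
    if a_ahead and b_ahead:
        return "incomparable"
    if a_ahead:
        return "a strictly better"
    if b_ahead:
        return "b strictly better"
    return "equivalent"
```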
For more background and rationale, see [2] and [3] below.
## Mining/eviction
As described above, the linearization of each cluster gives us a linearization
of the entire mempool. We use this ordering for both block building and
eviction, by selecting chunks at the front of the linearization when
constructing a block template, and by evicting chunks from the back of the
linearization when we need to free up space in the mempool.
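A deliberately simplified sketch of both uses of the ordering is shown below. It assumes the mempool is already available as a single list of `(fee, size)` chunks in decreasing feerate order; `build_template` and `evict` are hypothetical names, and block building here simply stops at the first chunk that does not fit, which a real block assembler need not do.

```python
def build_template(chunks, max_block_size):
    """Take chunks from the front of the ordering until one no longer fits."""
    selected, used = [], 0
    for fee, size in chunks:
        if used + size > max_block_size:
            # Simplification: stop instead of searching for smaller chunks.
            break
        selected.append((fee, size))
        used += size
    return selected

def evict(chunks, max_mempool_size):
    """Drop chunks from the back until the total size is within the limit."""
    kept = list(chunks)
    total = sum(size for _, size in kept)
    while kept and total > max_mempool_size:
        _, size = kept.pop()  # lowest-feerate chunk goes first
        total -= size
    return kept
```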
## Replace-by-fee
Prior to the cluster mempool implementation, it was possible for replacements
to be prevented even if they would make the mempool more profitable for miners,
and it was possible for replacements to be permitted even if the newly accepted
transaction was less desirable to miners than the transactions it was
replacing. With the ability to construct linearizations of the mempool, we're
now able to compare the feerate diagram of the mempool before and after a
proposed replacement, and only accept the replacement if it makes the feerate
diagram strictly better.
In simple cases, the intuition is that a replacement should have a higher
feerate and fee than the transaction(s) it replaces. But for more complex cases
(where some transactions may have unconfirmed parents), there may not be a
simple way to describe the fee that is needed to successfully replace a set of
transactions, other than to say that the overall feerate diagram of the
resulting mempool must improve somewhere and not be worse anywhere.
## Mempool limits
### Motivation
Selecting chunks in decreasing feerate order when building a block template
will be close to optimal when the maximum size of any chunk is small compared
to the block size. And for mempool eviction, we don't wish to evict too much of
the mempool at once when a single (potentially small) transaction arrives that
takes us over our mempool size limit. For both of these reasons, it's desirable
to limit the maximum size of a cluster and thereby limit the maximum size of
any chunk (as a cluster may consist entirely of one chunk).
The computation required to linearize a cluster grows (polynomially) with the
number of transactions in the cluster, so limiting the number of transactions
in a cluster is necessary to ensure that we're able to find good (ideally,
optimal) linearizations in a reasonable amount of time.
### Limits
Transactions submitted to the mempool must not result in clusters that would
exceed the cluster limits (64 transactions and 101 kvB total per cluster).
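As a rough illustration of the rule above, the check below tests a would-be cluster against the two default limits. The constant and function names are hypothetical, 101 kvB is taken to mean 101,000 virtual bytes, and a cluster exactly at a limit is assumed to be acceptable.

```python
MAX_CLUSTER_COUNT = 64       # transactions per cluster
MAX_CLUSTER_VSIZE = 101_000  # 101 kvB, expressed in virtual bytes

def within_cluster_limits(cluster_vsizes):
    """Check a would-be cluster against both limits.

    `cluster_vsizes` lists the virtual size of every transaction that would be
    in the cluster after the new transaction is added.
    """
    return (len(cluster_vsizes) <= MAX_CLUSTER_COUNT
            and sum(cluster_vsizes) <= MAX_CLUSTER_VSIZE)
```

Note that a new transaction spending from two separate clusters joins them into one, so the check applies to the combined cluster.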
## References/Notes
[1] This is an instance of the maximal-ratio closure problem, which is closely
related to the maximal-weight closure problem, as found in the field of mineral
extraction for open pit mining.
[2] See
https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393
for a high level overview of the cluster mempool implementation (PR#33629,
since v31.0) and its design rationale.
[3] See https://delvingbitcoin.org/t/mempool-incentive-compatibility/553 for an
explanation of why and how we use feerate diagrams for mining, eviction, and
evaluating transaction replacements.

@@ -1,65 +0,0 @@
# Mempool Limits
## Definitions
Given any two transactions Tx0 and Tx1 where Tx1 spends an output of Tx0,
Tx0 is a *parent* of Tx1 and Tx1 is a *child* of Tx0.
A transaction's *ancestors* include, recursively, its parents, the parents of its parents, etc.
A transaction's *descendants* include, recursively, its children, the children of its children, etc.
A mempool entry's *ancestor count* is the total number of in-mempool (unconfirmed) transactions in
its ancestor set, including itself.
A mempool entry's *descendant count* is the total number of in-mempool (unconfirmed) transactions in
its descendant set, including itself.
A mempool entry's *ancestor size* is the aggregated virtual size of in-mempool (unconfirmed)
transactions in its ancestor set, including itself.
A mempool entry's *descendant size* is the aggregated virtual size of in-mempool (unconfirmed)
transactions in its descendant set, including itself.
Transactions submitted to the mempool must not exceed the ancestor and descendant limits (aka
mempool *package limits*) set by the node (see `-limitancestorcount`, `-limitancestorsize`,
`-limitdescendantcount`, `-limitdescendantsize`).
## Exemptions
### CPFP Carve Out
**CPFP Carve Out**: if a candidate transaction for submission to the
mempool would cause some mempool entry to exceed its descendant limits, an exemption is made if all
of the following conditions are met:
1. The candidate transaction is no more than 10,000 virtual bytes.
2. The candidate transaction has an ancestor count of 2 (itself and exactly 1 ancestor).
3. The in-mempool transaction's descendant count, including the candidate transaction, would only
exceed the limit by 1.
*Rationale*: this rule was introduced to prevent pinning by domination of a transaction's descendant
limits in two-party contract protocols such as LN. Also see the [mailing list
post](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-November/016518.html).
This rule was introduced in [PR #15681](https://github.com/bitcoin/bitcoin/pull/15681).
### Single-Conflict RBF Carve Out
When a candidate transaction for submission to the mempool would replace mempool entries, it may
also decrease the descendant count of other mempool entries. Since ancestor/descendant limits are
calculated prior to removing the would-be-replaced transactions, they may be overestimated.
An exemption is given for a candidate transaction that would replace mempool transactions and meets
all of the following conditions:
1. The candidate transaction has exactly 1 directly conflicting transaction.
2. The candidate transaction does not spend any unconfirmed inputs that are not also spent by the
directly conflicting transaction.
The following discounts are given to account for the would-be-replaced transaction(s):
1. The descendant count limit is temporarily increased by 1.
2. The descendant size limit is temporarily increased by the virtual size of the to-be-replaced
directly conflicting transaction.

@@ -0,0 +1,19 @@
## Fee and Size Terminology in Mempool Policy
* Each transaction has a **weight** and virtual size as defined in BIP 141 (different from serialized size for witness transactions, as witness data is discounted and the value is rounded up to the nearest integer).
* In the RPCs, "weight" refers to the weight as defined in BIP 141.
* A transaction has a **sigops size**, defined as its sigop cost multiplied by the node's `-bytespersigop`, an adjustable policy.
* A transaction's **virtual size (vsize)** refers to its **sigops-adjusted virtual size**: the maximum of its BIP 141 virtual size and its sigops size. This virtual size is used to simplify the process of building blocks that satisfy both the maximum weight limit and the sigop limit.
* In the RPCs, "vsize" refers to this sigops-adjusted virtual size.
* Mempool entry data with the suffix "-size" (e.g., "ancestorsize") refer to the cumulative sigops-adjusted virtual size of the transactions in the associated set.
* A transaction can also have a **sigops-adjusted weight**, defined similarly as the maximum of its BIP 141 weight and 4 times the sigops size. This value is used internally by the mempool to avoid losing precision, and mempool entry data with the suffix "-weight" (e.g., "chunkweight", "clusterweight") refer to this sigops-adjusted weight.
* A transaction's **base fee** is the difference between its input and output values.
* A transaction's **modified fee** is its base fee added to any **fee delta** introduced by using the `prioritisetransaction` RPC. Modified fee is used internally for all fee-related mempool policies and block building.

@@ -0,0 +1,43 @@
Mempool
=======
The mempool has been reimplemented with a new design ("cluster mempool"), to
facilitate better decision-making when constructing block templates, evicting
transactions, relaying transactions, and validating replacement transactions
(RBF). Most changes should be transparent to users, but some behavior changes
are noted:
- The mempool no longer enforces ancestor or descendant size/count limits.
Instead, two new default policy limits are introduced governing connected
components, or clusters, in the mempool, limiting clusters to 64 transactions
and up to 101 kvB in virtual size. Transactions are considered to be in the
same cluster if they are connected to each other via any combination of
parent/child relationships in the mempool. These limits can be overridden
using command line arguments; see the extended help (`-help-debug`)
for more information.
- Within the mempool, transactions are ordered based on the feerate at which
they are expected to be mined, which takes into account the full set, or
"chunk", of transactions that would be included together (e.g., a parent and
its child, or more complicated subsets of transactions). This ordering is
utilized by the algorithms that implement transaction selection for
constructing block templates; eviction from the mempool when it is full; and
transaction relay announcements to peers.
- The replace-by-fee validation logic has been updated so that transaction
replacements are only accepted if the resulting mempool's feerate diagram is
strictly better than before the replacement. This eliminates all known cases
in which a replacement makes the mempool worse off, which were possible
under the previous RBF rules. For singleton transactions (that are in clusters by
themselves) it's sufficient for a replacement to have a higher fee and
feerate than the original. See
[delvingbitcoin.org post](https://delvingbitcoin.org/t/an-overview-of-the-cluster-mempool-proposal/393#rbf-can-now-be-made-incentive-compatible-for-miners-11)
for more information.
- Two new RPCs have been added: `getmempoolcluster` will provide the set of
transactions in the same cluster as the given transaction, along with the
ordering of those transactions and grouping into chunks; and
`getmempoolfeeratediagram` will return the feerate diagram of the entire
mempool.
- Chunk size and chunk fees are now also included in the output of `getmempoolentry`.