ea17a9423f [doc] release note for relaxing requirement of all unconfirmed parents present (glozow)
12f48d5ed3 test: add chained 1p1c propagation test (Greg Sanders)
525be56741 [unit test] package submission 2p1c with 1 parent missing (glozow)
f24771af05 relax child-with-unconfirmed-parents rule (glozow)
Pull request description:
Broadens the package validation interface, see #27463 for wider context.
On master, package rules include that (1) the package topology must be child-wth-parents (2) all of the child's unconfirmed parents must be present. This PR relaxes the second rule, leaving the first rule untouched (there are plans to change that as well, but not here).
Original motivation for this rule was based on the idea that we would have a child-with-unconfirmed-parents package relay protocol, and this would verify that the peer provided the "correct" package. For various reasons, we're not planning on doing this. We could potentially do this for ancestor packages (with a similar definition that all UTXOs to make the tx valid are available in this package), but it's also questionable whether it's useful to enforce this.
This rule gets in the way of certain usage of 1p1c package relay currently. If a transaction has multiple parents, of which only 1 requires a package CPFP, this rule blocks the package from relaying. Even if all the non-low-feerate parents are already in mempool, when the p2p logic submits the 1p1c package, it gets rejected for not meeting this rule.
ACKs for top commit:
ishaanam:
re-utACK ea17a9423f
instagibbs:
ACK ea17a9423f
Tree-SHA512: c2231761ae7b2acea10a96735e7a36c646f517964d0acb59bacbae1c5a1950e0223458b84c6d5ce008f0c1d53c1605df0fb3cd0064ee535ead006eb7c0fa625b
62ed1f92ef txgraph: check that DoWork finds optimal if given high budget (tests) (Pieter Wuille)
f3c2fc867f txgraph: add work limit to DoWork(), try optimal (feature) (Pieter Wuille)
e96b00d99e txgraph: make number of acceptable iterations configurable (feature) (Pieter Wuille)
cfe9958852 txgraph: track amount of work done in linearization (preparation) (Pieter Wuille)
6ba316eaa0 txgraph: 1-or-2-tx split-off clusters are optimal (optimization) (Pieter Wuille)
fad0eb091e txgraph: reset quality when merging clusters (bugfix) (Pieter Wuille)
Pull request description:
Part of #30289. Builds on top of #31553.
So far, the `TxGraph::DoWork()` function took no parameters, and just made all clusters reach the "acceptable" internal quality level by performing a minimum number of improvement iterations on it, but:
* Did not attempt to go beyond that.
* Was broken, as the QualityLevel of optimal clusters that merge together was not being reset.
Fix this by adding an argument to `DoWork()` to control how much work it is allowed to do right now, which will first be used to get all clusters to the acceptable level, and if more budget remains, use it to try to get some or all clusters optimal. The function will now return `true` if all clusters are known to be optimal (and thus no further work remains). This is verified in the tests, by remembering whether the graph is optimal, and if it is at the end of the simulation run, verify that the overall linearization cannot be improved further.
ACKs for top commit:
instagibbs:
ACK 62ed1f92ef
ismaelsadeeq:
Code review ACK 62ed1f92ef
glozow:
ACK 62ed1f92ef
Tree-SHA512: 5f57d4052e369f3444e72e724f04c02004e0f66e365faa59c9f145323e606508380fc97bb038b68783a62ae9c10757f1b628b3b00b2ce9a46161fca2d4336d73
The current `prevector` size of 28 bytes (chosen to fill the `sizeof(CScript)` aligned size) was introduced in 2015 (https://github.com/bitcoin/bitcoin/pull/6914) before SegWit and TapRoot.
However, the increasingly common `P2WSH` and `P2TR` scripts are both 34 bytes, and are forced to use heap (re)allocation rather than efficient inline storage.
The core trade-off of this change is to eliminate heap allocations for common 34-36 byte scripts at the cost of increasing the base memory footprint of all `CScript` objects by 8 bytes (while still respecting peak memory usage defined by `-dbcache`).
Increasing the `prevector` size allows these scripts to be stored inline, avoiding extra heap allocations, reducing potential memory fragmentation, and improving performance during cache flushes. Massif analysis confirms a lower stable memory usage after flushing, suggesting the elimination of heap allocations outweighs the larger base size for common workloads.
Due to memory alignment, increasing the `prevector` size to 36 bytes doesn't change the overall `sizeof(CScript)` compared to an increase to 34 bytes, allowing us to include `P2PK` scripts as well at no additional memory cost.
Performance benchmarks for AssumeUTXO load and flush show:
* Small dbcache (450MB): ~1-3% performance improvement (despite more frequent flushes)
* Large dbcache (4500MB): ~6-8% performance improvement due to fewer heap allocations (and basically the number of flushes)
* Very large dbcache (4500MB): ~5-6% performance improvement due to fewer heap allocations (and memory limit not being reached, so there's no memory penalty)
Full IBD and reindex-chainstate with larger `dbcache` values also show an overall ~3-4% speedup.
Co-authored-by: Ava Chow <github@achow101.com>
Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Verifies that script types are correctly allocated using prevector's direct or indirect storage based on their size:
Direct allocated script types (size ≤ 28 bytes):
* OP_RETURN (small)
* P2WPKH
* P2SH
* P2PKH
Indirect allocated script types (size > 28 bytes):
* P2WSH
* P2TR
* P2PK
* MULTISIG (small)
This test provides a baseline for verifying changes to prevector's inline capacity.
The `CHECK_SCRIPT_STATIC_SIZE` and `CHECK_SCRIPT_DYNAMIC_SIZE` macros were added to differentiate the two cases - while preserving the correct source code line in case of failure.
1cb2399703 doc: clarify the GetAddresses/GetAddressesUnsafe documentation (Daniela Brozzoni)
e5a7dfd79f p2p: rename GetAddresses -> GetAddressesUnsafe (Daniela Brozzoni)
Pull request description:
Rename GetAddresses to GetAddressesUnsafe to make it clearer that this function should only be used in trusted contexts. This helps avoid accidental privacy leaks by preventing the uncached version from being used in non-trusted scenarios, like P2P.
Additionally, better reflect in the documentation that the two methods should be used in different contexts.
Also update the outdated "call the function without a parameter" phrasing in the cached version. This wording was accurate when the cache was introduced in #18991, but became outdated after later commits (f26502e9fc, 81b00f8780) added parameters to each
function, and the previous commit changed the function naming completely.
ACKs for top commit:
stickies-v:
re-ACK 1cb2399703
l0rinc:
ACK 1cb2399703
luisschwab:
ACK 1cb2399703
brunoerg:
ACK 1cb2399703
theStack:
Code-review ACK 1cb2399703
mzumsande:
Code review ACK 1cb2399703
Tree-SHA512: 02c05d88436abcdfabad994f47ec5144e9ba47668667a2c1818f57bf8710727505faf8426fd0672c63de14fcf20b96f17cea2acc39fe3c1f56abbc2b1a9e9c23
face8123fd log: [refactor] Use info level for init logs (MarcoFalke)
fa183761cb log: Remove function name from init logs (MarcoFalke)
Pull request description:
Many of the normal, and expected init logs, which are run once after startup use the deprecated alias of `LogInfo`.
Fix that by using `LogInfo` directly, which is a refactor, except for a few log lines that also have `__func__` removed.
(Also remove the unused trailing `\n` char while touching those logs)
ACKs for top commit:
stickies-v:
re-ACK face8123fd
fanquake:
ACK face8123fd
Tree-SHA512: 28c296129c9a31dff04f529c53db75057eae8a73fc7419e2f3068963dbb7b7fb9a457b2653f9120361fdb06ac97d1ee2be815c09ac659780dff01d7cd29f8480
fa1a14a13a fuzz: Reset chainman state in process_message(s) targets (MarcoFalke)
fa9a3de09b fuzz: DisableNextWrite (MarcoFalke)
aeeeeec9f7 fuzz: Reset dirty connman state in process_message(s) targets (MarcoFalke)
fa11eea405 fuzz: Avoid non-determinism in process_message(s) target (PeerMan) (MarcoFalke)
Pull request description:
`process_message(s)` are the least stable fuzz targets, according to OSS-Fuzz.
Tracking issue: https://github.com/bitcoin/bitcoin/issues/29018.
### Testing
Needs coverage compilation, as explained in `./contrib/devtools/README.md`. And then, using 32 threads:
```
cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/bld-cmake/ $PWD/../b-c-qa-assets/fuzz_corpora/ process_messages 32
```
Each commit can be reverted to see more non-determinism re-appear.
ACKs for top commit:
marcofleon:
ReACK fa1a14a13a
dergoegge:
reACK fa1a14a13a
Tree-SHA512: 37b5b6dbdde6a39b4f83dc31e92cffb4a62a4b8f5befbf17029d943d0b2fd506f4a0833570dcdbf79a90b42af9caca44e98e838b03213d6bc1c3ecb70a6bb135
This rule was originally introduced along with a very early proposal for
package relay as a way to verify that the "correct"
child-with-unconfirmed-parents package was provided for a transaction,
where correctness was defined as all of the transactions unconfirmed
parents. However, we are not planning to introduce a protocol where
peers would be asked to send these packages.
This rule has downsides: if a transaction has multiple parents but only
1 that requires package CPFP to be accepted, the current rule prevents
us from accepting that package. Even if the other parents are already in
mempool, the p2p logic will only submit the 1p1c package, which fails
this check. See the test in p2p_1p1c_network.py
06ab3a394a tests: speed up coins_tests by parallelizing (Anthony Towns)
Pull request description:
Updates the cmake logic to generate a separate test for each BOOST_FIXTURE_TEST_SUITE declaration in a file, and splits coins_tests.cpp into three separate suites so that they can be run in parallel. Also updates the convention enforced by test/lint/lint-tests.py.
ACKs for top commit:
l0rinc:
reACK 06ab3a394a
maflcko:
lgtm ACK 06ab3a394a
achow101:
ACK 06ab3a394a
Tree-SHA512: 940d9aa31dab60d1000b5f57d8dc4b2c5b4045c7e5c979ac407aba39f2285d53bc00c5e4d7bf2247551fd7e1c8681144e11fc8c005a874282c4c59bd362fb467
Rename GetAddresses to GetAddressesUnsafe to make it clearer that this
function should only be used in trusted contexts. This helps avoid
accidental privacy leaks by preventing the uncached version from being
used in non-trusted scenarios, like P2P.
Updates the cmake logic to generate a separate test for each
BOOST_FIXTURE_TEST_SUITE declaration in a file, and splits coins_tests.cpp
into three separate suites so that they can be run in parallel. Also
updates the convention enforced by test/lint/lint-tests.py.
31c4e77a25 test: fix ReadTopologicalSet unsigned integer overflow (ismaelsadeeq)
Pull request description:
This PR is a simple fix for a potential unsigned integer overflow in ReadTopologicalSet.
We obtain the value of `mask` from fuzz input, which can be the maximum representable value.
Adding 1 to it would then cause an overflow.
The fix skips the addition when the read value is already the maximum.
See https://github.com/bitcoin/bitcoin/pull/30605#discussion_r2215338569 for more context
ACKs for top commit:
maflcko:
lgtm ACK 31c4e77a25
Tree-SHA512: f58d7907f66a0de0ed8d4b1cad6a4971f65925a99f3c030537c21c4d84126b643257c65865242caf7d445b9cbb7a71a1816a9f870ab7520625c4c16cd41979cb
96da68a38f qa: functional test a transaction running into the legacy sigop limit (Antoine Poinsot)
367147954d qa: unit test standardness of inputs packed with legacy sigops (Antoine Poinsot)
5863315e33 policy: make pathological transactions packed with legacy sigops non-standard. (Antoine Poinsot)
Pull request description:
The Consensus Cleanup soft fork proposal includes a limit on the number of legacy signature
operations potentially executed when validating a transaction. If this change is to be implemented
here and activated by Bitcoin users in the future, we should make transactions that are not valid
according to the new rules non-standard first because it would otherwise be a trivial DoS to
potentially unupgraded miners after the soft fork activates.
ML post: https://gnusha.org/pi/bitcoindev/49dyqqkf5NqGlGdinp6SELIoxzE_ONh3UIj6-EB8S804Id5yROq-b1uGK8DUru66eIlWuhb5R3nhRRutwuYjemiuOOBS2FQ4KWDnEh0wLuA=@protonmail.com/T/#u
ACKs for top commit:
instagibbs:
reACK 96da68a38f
maflcko:
review ACK 96da68a38f🚋
achow101:
ACK 96da68a38f
glozow:
light code review ACK 96da68a38f, looks correct to me
Tree-SHA512: 106ffe62e48952affa31c5894a404a17a3b4ea8971815828166fba89069f757366129f7807205e8c6558beb75c6f67d8f9a41000be2f8cf95be3b1a02d87bfe9
This is required in the process_message(s) fuzz targets to avoid leaking
the next write time from one run to the next. Also, disable it
completely because it is not needed and due to leveldb-internal
non-determinism.
The PeerManager has several members, such as the FastRandomContext,
which need to be reset before every run to avoid leaking state from one
run into the next.
Also, style fixups in p2p_handshake.cpp, where this code is copied from.
This is meant to focus the usages to narrow the scope of the obfuscation optimization.
`Obfuscation::Xor` is mostly a move.
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Mechanical refactor of the low-level "xor" wording to signal the intent instead of the implementation used.
The renames are ordered by heaviest-hitting substitutions first, and were constructed such that after each replacement the code is still compilable.
-BEGIN VERIFY SCRIPT-
sed -i \
-e 's/\bGetObfuscateKey\b/GetObfuscation/g' \
-e 's/\bxor_key\b/obfuscation/g' \
-e 's/\bxor_pat\b/obfuscation/g' \
-e 's/\bm_xor_key\b/m_obfuscation/g' \
-e 's/\bm_xor\b/m_obfuscation/g' \
-e 's/\bobfuscate_key\b/m_obfuscation/g' \
-e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \
-e 's/\bSetXor(/SetObfuscation(/g' \
-e 's/\bdata_xor\b/obfuscation/g' \
-e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \
-e 's/\bobfuscate key\b/obfuscation key/g' \
$(git ls-files '*.cpp' '*.h')
-END VERIFY SCRIPT-
The two tests are doing different things - `xor_roundtrip_random_chunks` does black-box style property-based testing to validate that certain invariants hold - that deobfuscating an obfuscation results in the original message (higher level, it doesn't have to know about the implementation details).
The `xor_bytes_reference` test makes sure the optimized xor implementation behaves in every imaginable scenario exactly as the simplest possible obfuscation - with random chunks, random alignment, random data, random key.
Since we're touching the file, other related small refactors were also applied:
* `nullpt` typo fixed;
* manual byte-by-byte xor key creations were replaced with `_hex` factories;
* since we're only using 64 bit keys in production, smaller keys were changed to reflect real-world usage;
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Since 31 byte xor-keys are not used in the codebase, using the common size (8 bytes) makes the benchmarks more realistic.
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
This adds a large simulation fuzz test for all TxOrphanage public interface
functions, using a mix of comparison with expected behavior (in case it is
fully specified), and testing of properties exhibited otherwise.
This is preparation for the simulation fuzz test added in a later commit. Since
AddChildrenToWorkSet consumes randomness, there is no way for the simulator to
exactly predict its behavior. By returning the set of made-reconsiderable announcements
instead, the simulator can instead test that it is *a* valid choice, and then
apply it to its own data structures.
For the default number of peers (125), allows each to relay a default
descendant package (up to 25-1=24 can be missing inputs) of small (9
inputs or fewer) transactions out of order.
This limit also gives acceptable bounds for worst case LimitOrphans iterations.
Functional tests aren't changed to check for larger cap because it would
make the runtime too long.
Also deletes the now-unused DEFAULT_MAX_ORPHAN_TRANSACTIONS.
This is largely a reimplementation using boost::multi_index_container.
All the same public methods are available. It has an index by outpoint,
per-peer tracking, peer worksets, etc.
A few differences:
- Limits have changed: instead of a global limit of 100 unique orphans,
we have a maximum number of announcements (which can include duplicate
orphans) and a global memory limit which scales with the number of
peers.
- The maximum announcements limit is 100 to match the original limit,
but this is actually a stricter limit because the announcement count
is not de-duplicated.
- Eviction strategy: when global limits are reached, a per-peer limit
comes into play. While limits are exceeded, we choose the peer whose
“DoS score” (max usage / limit ratio for announcements and memory
limits) is highest and evict announcements by entry time, sorting
non-reconsiderable ones before reconsiderable ones. Since announcements
are unique by (wtxid, peer), as long as 1 announcement remains for a
transaction, it remains in the orphanage.
- This eviction strategy means no peer can influence the eviction of
another peer’s orphans.
- Also, since global limits are a multiple of per-peer limits, as long
as a peer does not exceed its limits, its orphans are protected from
eviction.
- Orphans no longer expire, since older announcements are generally
removed before newer ones.
- GetChildrenFromSamePeer returns the transactions from newest to
oldest.
Co-authored-by: Pieter Wuille <pieter@wuille.net>
Move towards a model where TxOrphanage is initialized with limits that
it remembers throughout its lifetime.
Remove the param. Limiting by number of unique orphans will be removed
in a later commit.
Now that -maxorphantx is gone, this does not change the node behavior.
The parameter is only used in tests.
This adds an `iters` parameter to DoWork(), which controls how much work it is
allowed to do right now.
Additionally, DoWork() won't stop at just getting everything ACCEPTABLE, but if
there is work budget left, will also attempt to get every cluster linearized
optimally.