c0642e558a [fuzz] fix latency score check in txorphan_protected (glozow)
3d4d4f0d92 scripted-diff: rename "ann" variables to "latency_score" (monlovesmango)
3b92448923 [doc] comment fixups for orphanage changes (glozow)
1384dbaf6d [config] emit warning for -maxorphantx, but allow it to be set (glozow)
b10c55b298 fix up TxOrphanage lower_bound sanity checks (glozow)
cfd71c6704 scripted-diff: rename TxOrphanage outpoints index (glozow)
edb97bb3f1 [logging] add logs for inner loop of LimitOrphans (glozow)
8a58d0e87d scripted-diff: rename OrphanTxBase to OrphanInfo (glozow)
cc50f2f0df [cleanup] replace TxOrphanage::Size() with CountUniqueOrphans (glozow)
ed24e01696 [optimization] Maintain at most 1 reconsiderable announcement per wtxid (Pieter Wuille)
af7402ccfa [refactor] make TxOrphanage keep itself trimmed (glozow)
d1fac25ff3 [doc] 31829 release note (glozow)
Pull request description:
Followup to #31829:
- Release notes
- Have the orphanage auto-trim itself whenever necessary (and test changes) https://github.com/bitcoin/bitcoin/pull/31829#discussion_r2169508690
- Reduce duplicate reconsiderations by keeping track of which txns are already reconsiderable so we only mark it for reconsideration for 1 peer at a time https://github.com/bitcoin/bitcoin/pull/31829#issuecomment-3001627814
- Rename `OrphanTxBase` to `OrphanInfo`
- Get rid of `Size()` method by replacing all calls with `CountUniqueOrphans`
- Rename outpoints index since they point to wtxids, not iterators https://github.com/bitcoin/bitcoin/pull/31829#discussion_r2205557613
- Add more logging in the `LimitOrphans` inner loop to make it easy to see which peers are being trimmed https://github.com/bitcoin/bitcoin/pull/31829#issuecomment-3074385460
ACKs for top commit:
sipa:
utACK c0642e558a
marcofleon:
Nice, ACK c0642e558a
Tree-SHA512: f298eae92cf906ed5e4f15a24eeffa7b9e620bcff457772cd77522dd9f0b3b183ffc976871b1b0e6fe93009e64877d518e53d4b9e186e0df58fc16d17f6de90a
-BEGIN VERIFY SCRIPT-
sed -i 's/max_global_ann/max_global_latency_score/g' src/node/txorphanage.cpp
sed -i 's/max_global_ann/max_global_latency_score/g' src/node/txorphanage.h
sed -i 's/max_global_ann/max_global_latency_score/g' src/test/orphanage_tests.cpp
sed -i 's/max_global_ann/max_global_latency_score/g' src/test/fuzz/txorphan.cpp
sed -i 's/max_global_ann/max_global_latency_score/g' src/bench/txorphanage.cpp
sed -i 's/max_ann/max_lat/g' src/node/txorphanage.cpp
-END VERIFY SCRIPT-
This introduces an invariant that TxOrphanageImpl never holds more than one
announcement with m_reconsider=true for a given wtxid. This avoids duplicate
work, both in the caller might otherwise reconsider the same transaction multiple
times before it is ready, and internally in AddChildrenToWorkSet, which might
otherwise iterate over all announcements multiple times.
It is redundant with -logsourcelocations and the log messages are
clearer without it.
Also, remove a double-space.
Also, add braces around `if` touched in the next commit.
This tiny behavior change requires a test fixup.
249889bee6 orphanage: avoid vtx iteration when no orphans (furszy)
41ad2be434 mempool: Avoid expensive loop in `removeForBlock` during IBD (Lőrinc)
Pull request description:
During Initial Block Download, the mempool is usually empty, but `CTxMemPool::removeForBlock` is still called for every connected block where we:
* iterate over every transaction in the block even though none will be found in the empty `mapTx`, always leaving `txs_removed_for_block` empty...
* which is pre-allocated regardless with `40 bytes * vtx.size()`, even though it will always remain empty.
Similarly to https://github.com/bitcoin/bitcoin/pull/32730#discussion_r2140691354, this change introduces a minor performance & memory optimization by only executing the loop if any of the affected mempool maps have any contents. The second commit is cherry-picked from there since it's related to this change as well.
ACKs for top commit:
optout21:
ACK 249889bee6
glozow:
ACK 249889bee6
ismaelsadeeq:
reACK 249889bee6
Tree-SHA512: 80d06ff1515164529cdc3ad21db3041bb5b2a1d4b72ba9e6884cdf40c5f1477fee7479944b8bca32a6f0bf27c4e5501fccd085f6041a2dbb101438629cfb9e4b
Mechanical refactor of the low-level "xor" wording to signal the intent instead of the implementation used.
The renames are ordered by heaviest-hitting substitutions first, and were constructed such that after each replacement the code is still compilable.
-BEGIN VERIFY SCRIPT-
sed -i \
-e 's/\bGetObfuscateKey\b/GetObfuscation/g' \
-e 's/\bxor_key\b/obfuscation/g' \
-e 's/\bxor_pat\b/obfuscation/g' \
-e 's/\bm_xor_key\b/m_obfuscation/g' \
-e 's/\bm_xor\b/m_obfuscation/g' \
-e 's/\bobfuscate_key\b/m_obfuscation/g' \
-e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \
-e 's/\bSetXor(/SetObfuscation(/g' \
-e 's/\bdata_xor\b/obfuscation/g' \
-e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \
-e 's/\bobfuscate key\b/obfuscation key/g' \
$(git ls-files '*.cpp' '*.h')
-END VERIFY SCRIPT-
Since 31 byte xor-keys are not used in the codebase, using the common size (8 bytes) makes the benchmarks more realistic.
Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
This is preparation for the simulation fuzz test added in a later commit. Since
AddChildrenToWorkSet consumes randomness, there is no way for the simulator to
exactly predict its behavior. By returning the set of made-reconsiderable announcements
instead, the simulator can instead test that it is *a* valid choice, and then
apply it to its own data structures.
For the default number of peers (125), allows each to relay a default
descendant package (up to 25-1=24 can be missing inputs) of small (9
inputs or fewer) transactions out of order.
This limit also gives acceptable bounds for worst case LimitOrphans iterations.
Functional tests aren't changed to check for larger cap because it would
make the runtime too long.
Also deletes the now-unused DEFAULT_MAX_ORPHAN_TRANSACTIONS.
This is largely a reimplementation using boost::multi_index_container.
All the same public methods are available. It has an index by outpoint,
per-peer tracking, peer worksets, etc.
A few differences:
- Limits have changed: instead of a global limit of 100 unique orphans,
we have a maximum number of announcements (which can include duplicate
orphans) and a global memory limit which scales with the number of
peers.
- The maximum announcements limit is 100 to match the original limit,
but this is actually a stricter limit because the announcement count
is not de-duplicated.
- Eviction strategy: when global limits are reached, a per-peer limit
comes into play. While limits are exceeded, we choose the peer whose
“DoS score” (max usage / limit ratio for announcements and memory
limits) is highest and evict announcements by entry time, sorting
non-reconsiderable ones before reconsiderable ones. Since announcements
are unique by (wtxid, peer), as long as 1 announcement remains for a
transaction, it remains in the orphanage.
- This eviction strategy means no peer can influence the eviction of
another peer’s orphans.
- Also, since global limits are a multiple of per-peer limits, as long
as a peer does not exceed its limits, its orphans are protected from
eviction.
- Orphans no longer expire, since older announcements are generally
removed before newer ones.
- GetChildrenFromSamePeer returns the transactions from newest to
oldest.
Co-authored-by: Pieter Wuille <pieter@wuille.net>
Move towards a model where TxOrphanage is initialized with limits that
it remembers throughout its lifetime.
Remove the param. Limiting by number of unique orphans will be removed
in a later commit.
Now that -maxorphantx is gone, this does not change the node behavior.
The parameter is only used in tests.
Expiry is going away in a later commit.
This is only an RPC change. Behavior of the orphanage does not change.
Note that getorphantxs is marked experimental.
a60f863d3e scripted-diff: Replace GenTxidVariant with GenTxid (marcofleon)
c8ba199598 Remove old GenTxid class (marcofleon)
072a198ea4 Convert remaining instances of GenTxid to GenTxidVariant (marcofleon)
1b528391c7 Convert `txrequest` to GenTxidVariant (marcofleon)
bde4579b07 Convert `txdownloadman_impl` to GenTxidVariant (marcofleon)
c876a892ec Replace GenTxid with Txid/Wtxid overloads in `txmempool` (marcofleon)
de858ce2be move-only: make GetInfo a private CTxMemPool member (stickies-v)
eee473d9f3 Convert `CompareInvMempoolOrder` to GenTxidVariant (marcofleon)
243553d590 refactor: replace get_iter_from_wtxid with GetIter(const Wtxid&) (stickies-v)
fcf92fd640 refactor: make CTxMemPool::GetIter strongly typed (marcofleon)
11d28f21bb Implement GenTxid as a variant (marcofleon)
Pull request description:
Part of the [type safety refactor](https://github.com/bitcoin/bitcoin/pull/32189).
This PR changes the GenTxid class to a variant, which holds both Txids and Wtxids. This provides compile-time type safety and eliminates the manual type check (bool m_is_wtxid). Variables that can be either a Txid or a Wtxid are now using the new GenTxid variant, instead of uint256.
ACKs for top commit:
w0xlt:
ACK a60f863d3e
dergoegge:
Code review ACK a60f863d3e
maflcko:
review ACK a60f863d3e🎽
theStack:
Code-review ACK a60f863d3e
Tree-SHA512: da9b73b7bdffee2eb9281a409205519ac330d3336094d17681896703fbca8099608782c9c85801e388e4d90af5af8abf1f34931f57bbbe6e9674d802d6066047
Now that we are storing `CTxMemPool::CompareDepthAndScore` parameters using
`std::variant` we have no portable zero-overhead way of accessing them,
so use `std::visit` and drop `bool wtxid` in-parameter.
Co-authored-by: stickies-v <stickies-v@protonmail.com>
fcfd3db563 remove RPCTimerInterface and RPCRunLater (Matthew Zipkin)
8a1765795f use WalletContext scheduler for walletpassphrase callback (Matthew Zipkin)
Pull request description:
This removes the dependency on libevent for events scheduled by RPC commands, like re-locking a wallet some time after decryption with walletpassphrase. Since walletpassphrase is currently the only RPC that does this, `RPCRunLater`, `RPCTimerInterface` and all related methods are left unused, and deleted in the second commit. Any future RPC that needs to execute a callback in the future can follow the pattern in this PR and just use a scheduler from node or wallet context.
This is an alternative approach to #32796, described in https://github.com/bitcoin/bitcoin/pull/32796#issuecomment-3014309449
ACKs for top commit:
fjahr:
Code Review ACK fcfd3db563
achow101:
ACK fcfd3db563
furszy:
ACK fcfd3db563
Tree-SHA512: 04f5e9c3f73f598c3d41d6e35bb59c64c7b93b03ad9fce3c40901733147ce7764f41f475fef1527d44af18f722759996a31ca83b48cb52153795d5022fecfd14
c10e382d2a flatfile: check whether the file has been closed successfully (Vasil Dimov)
4bb5dd78ea util: check that a file has been closed before ~AutoFile() is called (Vasil Dimov)
8bb34f07df Explicitly close all AutoFiles that have been written (Vasil Dimov)
a69c4098b2 rpc: take ownership of the file by WriteUTXOSnapshot() (Hodlinator)
Pull request description:
`fclose(3)` may fail to flush the previously written data to disk, thus a failing `fclose(3)` is as serious as a failing `fwrite(3)`.
Previously the code ignored `fclose(3)` failures. This PR improves that by changing all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error.
---
Other alternatives are:
1. `fflush(3)` after each write to the file (and throw if it fails from the `AutoFile::write()` method) and hope that `fclose(3)` will then always succeed. Assert that it succeeds from the destructor 🙄. Will hurt performance.
2. Throw nevertheless from the destructor. Exception within the exception in C++ I think results in terminating the program without a useful message.
3. (this is implemented in the latest incarnation of this PR) Redesign `AutoFile` so that its destructor cannot fail. Adjust _all_ its users 😭. For example, if the file has been written to, then require the callers to explicitly call the `AutoFile::fclose()` method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for `FILE*` which automatically closes the file when it goes out of scope and there are a lot of users of `AutoFile`.
4. Pass a new callback function to the `AutoFile` constructor which will be called from the destructor to handle `fclose()` errors, as described in https://github.com/bitcoin/bitcoin/pull/29307#issuecomment-2243842400. My thinking is that if that callback is going to only log a message, then we can log the message directly from the destructor without needing a callback. If the callback is going to do more complicated error handling then it is easier to do that at the call site by directly calling `AutoFile::fclose()` instead of getting the `AutoFile` object out of scope (so that its destructor is called) and inspecting for side effects done by the callback (e.g. set a variable to indicate a failed `fclose()`).
ACKs for top commit:
l0rinc:
ACK c10e382d2a
achow101:
ACK c10e382d2a
hodlinator:
re-ACK c10e382d2a
Tree-SHA512: 3994ca57e5b2b649fc84f24dad144173b7500fc0e914e06291d5c32fbbf8d2b1f8eae0040abd7a5f16095ddf4e11fe1636c6092f49058cda34f3eb2ee536d7ba
9341b5333a blockstorage: make block read hash checks explicit (Lőrinc)
2371b9f4ee test/bench: verify hash in `ComputeFilter` reads (Lőrinc)
5d235d50d6 net: assert block hash in `ProcessGetBlockData` and `ProcessMessage` (Lőrinc)
Pull request description:
A follow-up to https://github.com/bitcoin/bitcoin/pull/32487#discussion_r2094072165, after which validating the hash of a read block from disk doesn't incur the cost of calculating its hash anymore.
### Summary
This PR adds explicit checks that the read block header's hash matches the one we were expecting.
### Context
After the previous PR, validating a block's hash during read operations became essentially free. This PR leverages that by requiring callers to provide a block's expected hash (or `std::nullopt`), preventing silent failures caused by corrupted or mismatched data. Most `ReadBlock` usages were updated with expected hashes and now fail on mismatch.
### Changes
* added hash assertions in `ProcessGetBlockData` and `ProcessMessage` to validate that the block read from disk matches the expected hash;
* updated tests and benchmark to pass the correct block hash to `ReadBlock()`, ensuring the hash validation is tested - or none if we already expect PoW failure;
* removed the default value for `expected_hash`, requiring an explicit hash for all block reads.
### Why is the hash still optional (but no longer has a default value)
* for header-error tests, where the goal is to trigger failures early in the parsing process;
* for out-of-order orphan blocks, where the child hash isn't available before the initial disk read.
ACKs for top commit:
maflcko:
review ACK 9341b5333a🕙
achow101:
ACK 9341b5333a
hodlinator:
ACK 9341b5333a
janb84:
re ACK 9341b5333a
Tree-SHA512: cf1d4fff4c15e3f8898ec284929cb83d7e747125d4ee759e77d369f1716728e843ef98030be32c8d608956a96ae2fbefa0e801200c333b9eefd6c086ec032e1f
9f8e7b0b3b node: cap -dbcache to 1GiB on 32-bit architectures (Antoine Poinsot)
2c43b6adeb init: cap -maxmempool to 500 MB on 32-bit systems (Antoine Poinsot)
Pull request description:
32-bit architecture is limited to 4GiB of RAM, so it doesn't make sense to set a too high value. A too high value could cause an OOM unbeknownst to the user a while after startup as mempool / dbcache fills.
ACKs for top commit:
achow101:
ACK 9f8e7b0b3b
instagibbs:
utACK 9f8e7b0b3b
dergoegge:
Code review ACK 9f8e7b0b3b
glozow:
utACK 9f8e7b0b3b
Tree-SHA512: cc7541b2c0040fc21a43916caec464dfb443af808f4e85deffa1187448ffff6edb0d69f9ebdb43915d145b8b4694d8465afe548f88da53ccebc9ce4b7c34b735
a18e572328 test: more template verification tests (Sjors Provoost)
10c908808f test: move gbt proposal mode tests to new file (Sjors Provoost)
94959b8dee Add checkBlock to Mining interface (Sjors Provoost)
6077157531 ipc: drop BlockValidationState special handling (Sjors Provoost)
74690f4ed8 validation: refactor TestBlockValidity (Sjors Provoost)
Pull request description:
This PR adds the IPC equivalent of the `getblocktemplate` RPC in `proposal` mode.
In order to do so it has `TestBlockValidity` return error reasons as a string instead of `BlockValidationState`. This avoids complexity in IPC code for handling the latter struct.
The new Mining interface method is used in `miner_tests`.
It's not used by the `getblocktemplate` and `generateblock` RPC calls, see https://github.com/bitcoin/bitcoin/pull/31981#discussion_r2096473337
The `inconclusive-not-best-prevblk` check is moved from RPC
code to `TestBlockValidity`.
Test coverage is increased by `mining_template_verification.py`.
Superseedes #31564
## Background
### Verifying block templates (no PoW)
Stratum v2 allows miners to generate their own block template. Pools may wish (or need) to verify these templates. This typically involves comparing mempools, asking miners to providing missing transactions and then reconstructing the proposed block.[^0] This is not sufficient to ensure a proposed block is actually valid. In some schemes miners could take advantage of incomplete validation[^1].
The Stratum Reference Implementation (SRI), currently the only Stratum v2 implementation, collects all missing mempool transactions, but does not yet fully verify the block.[^2]. It could use the `getblocktemplate` RPC in `proposal` mode, but using IPC is more performant, as it avoids serialising up to 4 MB of transaction data as JSON.
(although SRI could use this PR, the Template Provider role doesn't need it, so this is _not_ part of #31098)
[^0]: https://github.com/stratum-mining/sv2-spec/blob/main/06-Job-Declaration-Protocol.md
[^1]: https://delvingbitcoin.org/t/pplns-with-job-declaration/1099/45?u=sjors
[^2]: https://github.com/stratum-mining/stratum/blob/v1.1.0/roles/jd-server/src/lib/job_declarator/message_handler.rs#L196
ACKs for top commit:
davidgumberg:
reACK a18e572328
achow101:
ACK a18e572328
TheCharlatan:
ACK a18e572328
ryanofsky:
Code review ACK a18e572328 just adding another NONFATAL_UNREACHABLE since last review
Tree-SHA512: 1a6c29f45a1666114f10f55aed155980b90104db27761c78aada4727ce3129e6ae7a522d90a56314bd767bd7944dfa46e85fb9f714370fc83e6a585be7b044f1
a201a99f8c thread-safety: fix annotations with REVERSE_LOCK (Cory Fields)
aeea5f0ec1 thread-safety: add missing lock annotation (Cory Fields)
832c57a534 thread-safety: modernize thread safety macros (Cory Fields)
Pull request description:
This is one of several PRs to cleanup/modernize our threading primitives.
While replacing the old critical section locks in the mining code with a `REVERSE_LOCK`, I noticed that our thread-safety annotations weren't hooked up to it. This PR gets `REVERSE_LOCK` working properly.
Firstly it modernizes the attributes as-recommended by the [clang docs](https://clang.llvm.org/docs/ThreadSafetyAnalysis.html) (ctrl+f for `USE_LOCK_STYLE_THREAD_SAFETY_ATTRIBUTES`). There's a subtle difference between the old `unlock_function` and new `release_capability`, where our `reverse_lock` only works with the latter. I believe this is an upstream bug. I've [reported and attempted a fix here](https://github.com/llvm/llvm-project/pull/139343), but either way it makes sense to me to modernize.
The second adds a missing annotation pointed out by a fixed `REVERSE_LOCK`. Because clang's thread-safety annotations aren't passed through a reference to `UniqueLock` as one may assume (see [here](https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#no-alias-analysis) for more details), `cs_main` has to be listed explicitly as a requirement.
The last commit actually fixes the `reverse_lock` by making it a `SCOPED_LOCK` and using the pattern [found in a clang test](https://github.com/llvm/llvm-project/blob/main/clang/test/SemaCXX/warn-thread-safety-analysis.cpp#L3126). Though the docs don't describe how to accomplish it, the functionality was added [in this commit](6a68efc959). Due to aliasing issues (see link above), in order to work correctly, the original mutex has to be passed along with the lock, so all existing `REVERSE_LOCK`s have been updated. To ensure that the mutexes actually match, a runtime assertion is added.
ACKs for top commit:
fjahr:
re-ACK a201a99f8c
davidgumberg:
reACK a201a99f8c
theuni:
Ok, done. Those last pushes can be ignored. ACKs on a201a99 are still fresh.
ryanofsky:
Code review ACK a201a99f8c. Just dropping 0065b9673db5da2994b0b07c1d50ebfb19af39d0 and fixing incorrect `reverse_lock::lockname` initialization since last review.
TheCharlatan:
Re-ACK a201a99f8c
Tree-SHA512: 2755fae0c41021976a1a633014a86d927f104ccbc8014c01c06dae89af363f92e5bc5d4276ad6d759302ac4679fe02a543758124d48318074db1c370989af7a7
Without proper annotations, clang thinks that mutexes are still held for the
duration of a reverse_lock. This could lead to subtle bugs as
EXCLUSIVE_LOCKS_REQUIRED(foo) passes when it shouldn't.
As mentioned in the docs [0], clang's thread-safety analyzer is unable to deal
with aliases of mutexes, so it is not possible to use the lock's copy of the
mutex for that purpose. Instead, the original mutex needs to be passed back to
the reverse_lock for the sake of thread-safety analysis, but it is not actually
used otherwise.
[0]: https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
There is no way to report a close error from `AutoFile` destructor.
Such an error could be serious if the file has been written to because
it may mean the file is now corrupted (same as if write fails).
So, change all users of `AutoFile` that use it to write data to
explicitly close the file and handle a possible error.
Comments are expanded.
Return BlockValidationState instead of passing a reference.
Lock Chainman mutex instead of cs_main.
Remove redundant chainparams and pindexPrev arguments.
Drop defaults for checking proof-of-work and merkle root.
The ContextualCheckBlockHeader check is moved to after CheckBlock,
which is more similar to normal validation where context-free checks
are done first.
Validation failure reasons are no longer printed through LogError(),
since it depends on the caller whether this implies an actual bug
in the node, or an externally sourced block that happens to be invalid.
When called from getblocktemplate, via BlockAssembler::CreateNewBlock(),
this method already throws an std::runtime_error if validation fails.
Additionally it moves the inconclusive-not-best-prevblk check from RPC
code to TestBlockValidity.
There is no behavior change when callling getblocktemplate with proposal.
Previously this would return a BIP22ValidationResult which can throw for
state.IsError(). But CheckBlock() and the functions it calls only use
state.IsValid().
The final assert is changed into Assume, with a LogError.
Co-authored-by: <Ryan Ofsky <ryan@ofsky.org>
Dropped the default expected_hash parameter from `ReadBlock()`.
In `blockmanager_flush_block_file` tests, we pass {} since the tests would already fail at PoW validation for corrupted blocks.
In `ChainstateManager::LoadExternalBlockFile`, we pass {} when processing child blocks because their hashes aren't known beforehand.