[p2p] overhaul TxOrphanage with smarter limits

This is largely a reimplementation using boost::multi_index_container.
All the same public methods are available. It has an index by outpoint,
per-peer tracking, peer worksets, etc.

A few differences:
- Limits have changed: instead of a global limit of 100 unique orphans,
  we have a maximum number of announcements (which can include duplicate
  orphans) and a global memory limit which scales with the number of
  peers.
- The maximum announcements limit is 100 to match the original limit,
  but this is actually a stricter limit because the announcement count
  is not de-duplicated.
- Eviction strategy: when global limits are reached, a per-peer limit
  comes into play. While limits are exceeded, we choose the peer whose
  “DoS score” (max usage / limit ratio for announcements and memory
  limits) is highest and evict announcements by entry time, sorting
  non-reconsiderable ones before reconsiderable ones. Since announcements
  are unique by (wtxid, peer), as long as 1 announcement remains for a
  transaction, it remains in the orphanage.
- This eviction strategy means no peer can influence the eviction of
  another peer’s orphans.
- Also, since global limits are a multiple of per-peer limits, as long
  as a peer does not exceed its limits, its orphans are protected from
  eviction.
- Orphans no longer expire, since older announcements are generally
  removed before newer ones.
- GetChildrenFromSamePeer returns the transactions from newest to
  oldest.
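
The eviction rule described in the list above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this commit: `OrphanageSketch`, `Announcement`, and every member name are hypothetical, a flat vector stands in for the multi-index container, the peer count is fixed at construction, each announcement counts as one unit toward the announcement limit, and the DoS score uses floating-point division for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

using NodeId = int64_t;
using Wtxid = int; // stand-in for the real 256-bit wtxid

// One entry per unique (wtxid, peer) pair.
struct Announcement {
    Wtxid wtxid;
    NodeId peer;
    bool reconsiderable;
    uint64_t sequence; // entry order; lower means older
    int64_t usage;     // weight of the announced orphan
};

class OrphanageSketch {
    int64_t m_max_global_ann;
    int64_t m_reserved_peer_usage;
    int64_t m_num_peers;
    uint64_t m_next_sequence{0};
    std::vector<Announcement> m_anns;

    // Assumption: global limits are per-peer limits multiplied by the peer count.
    int64_t MaxPeerAnn() const { return std::max<int64_t>(1, m_max_global_ann / m_num_peers); }
    int64_t MaxGlobalUsage() const { return m_reserved_peer_usage * m_num_peers; }

    // Global usage counts each unique orphan once, however many peers announced it.
    int64_t TotalUsage() const {
        std::map<Wtxid, int64_t> unique;
        for (const auto& a : m_anns) unique[a.wtxid] = a.usage;
        int64_t total{0};
        for (const auto& entry : unique) total += entry.second;
        return total;
    }

    // The larger of the peer's announcement ratio and memory ratio.
    double DosScore(NodeId peer) const {
        int64_t count{0}, usage{0};
        for (const auto& a : m_anns) {
            if (a.peer == peer) { ++count; usage += a.usage; }
        }
        return std::max(double(count) / MaxPeerAnn(), double(usage) / m_reserved_peer_usage);
    }

    bool OverLimit() const {
        return int64_t(m_anns.size()) > m_max_global_ann || TotalUsage() > MaxGlobalUsage();
    }

public:
    OrphanageSketch(int64_t max_global_ann, int64_t reserved_peer_usage, int64_t num_peers)
        : m_max_global_ann{max_global_ann}, m_reserved_peer_usage{reserved_peer_usage}, m_num_peers{num_peers}
    {
        assert(max_global_ann >= 0 && reserved_peer_usage > 0 && num_peers > 0);
    }

    bool HaveTxFromPeer(Wtxid wtxid, NodeId peer) const {
        return std::any_of(m_anns.begin(), m_anns.end(),
                           [&](const Announcement& a) { return a.wtxid == wtxid && a.peer == peer; });
    }

    void Add(Wtxid wtxid, NodeId peer, int64_t usage, bool reconsiderable = false) {
        if (HaveTxFromPeer(wtxid, peer)) return; // announcements are unique by (wtxid, peer)
        m_anns.push_back(Announcement{wtxid, peer, reconsiderable, m_next_sequence++, usage});
    }

    int64_t CountAnnouncements() const { return int64_t(m_anns.size()); }

    void Limit() {
        while (OverLimit()) {
            // Pick the peer with the highest DoS score.
            NodeId worst{m_anns.front().peer};
            double worst_score{-1.0};
            for (const auto& a : m_anns) {
                const double score{DosScore(a.peer)};
                if (score > worst_score) { worst_score = score; worst = a.peer; }
            }
            // Evict that peer's oldest announcement, non-reconsiderable ones first.
            auto victim = m_anns.end();
            for (auto it = m_anns.begin(); it != m_anns.end(); ++it) {
                if (it->peer != worst) continue;
                if (victim == m_anns.end() ||
                    std::tie(it->reconsiderable, it->sequence) < std::tie(victim->reconsiderable, victim->sequence)) {
                    victim = it;
                }
            }
            assert(victim != m_anns.end());
            m_anns.erase(victim);
        }
    }
};
```

With a global limit of 3 announcements and 2 peers in this sketch, if peer 1 announces four orphans and peer 2 re-announces one of them, `Limit()` evicts only from peer 1, and the shared orphan stays for as long as peer 2's announcement remains.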

Co-authored-by: Pieter Wuille <pieter@wuille.net>
glozow
2025-04-02 17:29:38 -04:00
parent 1a41e7962d
commit 067365d2a8
4 changed files with 614 additions and 338 deletions


@@ -188,6 +188,7 @@ bool TxDownloadManagerImpl::AddTxAnnouncement(NodeId peer, const GenTxid& gtxid,
 if (MaybeAddOrphanResolutionCandidate(unique_parents, *wtxid, peer, now)) {
 m_orphanage->AddAnnouncer(orphan_tx->GetWitnessHash(), peer);
+m_orphanage->LimitOrphans(m_opts.m_rng);
 }
 // Return even if the peer isn't an orphan resolution candidate. This would be caught by AlreadyHaveTx.
@@ -420,8 +421,6 @@ node::RejectedTxTodo TxDownloadManagerImpl::MempoolRejectedTx(const CTransaction
 m_txrequest.ForgetTxHash(tx.GetWitnessHash());
 // DoS prevention: do not allow m_orphanage to grow unbounded (see CVE-2012-3789)
-// Note that, if the orphanage reaches capacity, it's possible that we immediately evict
-// the transaction we just added.
 m_orphanage->LimitOrphans(m_opts.m_rng);
 } else {
 unique_parents.clear();

File diff suppressed because it is too large


@@ -16,17 +16,25 @@
 #include <set>
 namespace node {
-/** Expiration time for orphan transactions */
-static constexpr auto ORPHAN_TX_EXPIRE_TIME{20min};
-/** Minimum time between orphan transactions expire time checks */
-static constexpr auto ORPHAN_TX_EXPIRE_INTERVAL{5min};
+/** Default value for TxOrphanage::m_reserved_usage_per_peer. Helps limit the total amount of memory used by the orphanage. */
+static constexpr int64_t DEFAULT_RESERVED_ORPHAN_WEIGHT_PER_PEER{404'000};
+/** Default value for TxOrphanage::m_max_global_latency_score. Helps limit the maximum latency for operations like
+ * EraseForBlock and LimitOrphans. */
+static constexpr unsigned int DEFAULT_MAX_ORPHANAGE_LATENCY_SCORE{100};
-/** Default maximum number of orphan transactions kept in memory */
-static const uint32_t DEFAULT_MAX_ORPHAN_TRANSACTIONS{100};
 /** A class to track orphan transactions (failed on TX_MISSING_INPUTS)
- * Since we cannot distinguish orphans from bad transactions with
- * non-existent inputs, we heavily limit the number of orphans
- * we keep and the duration we keep them for.
+ * Since we cannot distinguish orphans from bad transactions with non-existent inputs, we heavily limit the amount of
+ * announcements (unique (NodeId, wtxid) pairs), the number of inputs, and size of the orphans stored (both individual
+ * and summed). We also try to prevent adversaries from churning this data structure: once global limits are reached, we
+ * continuously evict the oldest announcement (sorting non-reconsiderable orphans before reconsiderable ones) from the
+ * most resource-intensive peer until we are back within limits.
+ * - Peers can exceed their individual limits (e.g. because they are very useful transaction relay peers) as long as the
+ * global limits are not exceeded.
+ * - As long as the orphan has 1 announcer, it remains in the orphanage.
+ * - No peer can trigger the eviction of another peer's orphans.
+ * - Peers' orphans are effectively protected from eviction as long as they don't exceed their limits.
  * Not thread-safe. Requires external synchronization.
  */
 class TxOrphanage {
@@ -40,10 +48,11 @@ public:
 /** Peers added with AddTx or AddAnnouncer. */
 std::set<NodeId> announcers;
-/** Get the weight of this transaction, an approximation of its memory usage. */
-TxOrphanage::Usage GetUsage() const {
-return GetTransactionWeight(*tx);
-}
+// Constructor with moved announcers
+OrphanTxBase(CTransactionRef tx, std::set<NodeId>&& announcers) :
+tx(std::move(tx)),
+announcers(std::move(announcers))
+{}
 };
 virtual ~TxOrphanage() = default;
@@ -63,7 +72,7 @@ public:
 /** Check if a {tx, peer} exists in the orphanage.*/
 virtual bool HaveTxFromPeer(const Wtxid& wtxid, NodeId peer) const = 0;
-/** Extract a transaction from a peer's work set
+/** Extract a transaction from a peer's work set, and flip it back to non-reconsiderable.
  * Returns nullptr if there are no transactions to work on.
  * Otherwise returns the transaction reference, and removes
  * it from the work set.
@@ -81,7 +90,7 @@ public:
 /** Erase all orphans included in or invalidated by a new block */
 virtual void EraseForBlock(const CBlock& block) = 0;
-/** Limit the orphanage to DEFAULT_MAX_ORPHAN_TRANSACTIONS. */
+/** Limit the orphanage to MaxGlobalLatencyScore and MaxGlobalUsage. */
 virtual void LimitOrphans(FastRandomContext& rng) = 0;
 /** Add any orphans that list a particular tx as a parent into the from peer's work set */
@@ -106,16 +115,45 @@ public:
 /** Total usage (weight) of orphans for which this peer is an announcer. If an orphan has multiple
  * announcers, its weight will be accounted for in each PeerOrphanInfo, so the total of all
- * peers' UsageByPeer() may be larger than TotalOrphanUsage(). */
+ * peers' UsageByPeer() may be larger than TotalOrphanUsage(). Similarly, UsageByPeer() may be far higher than
+ * ReservedPeerUsage(), particularly if many peers have provided the same orphans. */
 virtual Usage UsageByPeer(NodeId peer) const = 0;
 /** Check consistency between PeerOrphanInfo and m_orphans. Recalculate counters and ensure they
  * match what is cached. */
 virtual void SanityCheck() const = 0;
+/** Number of announcements, i.e. total size of m_orphans. Ones for the same wtxid are not de-duplicated.
+ * Not the same as TotalLatencyScore(). */
+virtual Count CountAnnouncements() const = 0;
+/** Number of unique orphans (by wtxid). */
+virtual Count CountUniqueOrphans() const = 0;
+/** Number of orphans stored from this peer. */
+virtual Count AnnouncementsFromPeer(NodeId peer) const = 0;
+/** Latency score of transactions announced by this peer. */
+virtual Count LatencyScoreFromPeer(NodeId peer) const = 0;
+/** Get the maximum global latency score allowed */
+virtual Count MaxGlobalLatencyScore() const = 0;
+/** Get the total latency score of all orphans */
+virtual Count TotalLatencyScore() const = 0;
+/** Get the reserved usage per peer */
+virtual Usage ReservedPeerUsage() const = 0;
+/** Get the maximum latency score allowed per peer */
+virtual Count MaxPeerLatencyScore() const = 0;
+/** Get the maximum global usage allowed */
+virtual Usage MaxGlobalUsage() const = 0;
 };
 /** Create a new TxOrphanage instance */
-std::unique_ptr<TxOrphanage> MakeTxOrphanage() noexcept;
+std::unique_ptr<TxOrphanage> MakeTxOrphanage(TxOrphanage::Count max_global_ann, TxOrphanage::Usage reserved_peer_usage) noexcept;
 } // namespace node
 #endif // BITCOIN_NODE_TXORPHANAGE_H
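
The accounting described in the comments above (announcements versus unique orphans, and per-peer usage that double-counts shared orphans) can be illustrated with a small stand-alone model. This is only a sketch: `AccountingSketch` and its members are hypothetical and do not mirror the suppressed implementation file. The two free functions assume that the global usage limit is the per-peer reservation multiplied by the peer count and that the per-peer latency limit is the global limit divided by the peer count, which is one reading of "global limits are a multiple of per-peer limits" in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>

using NodeId = int64_t;
using Wtxid = int; // stand-in for the real 256-bit wtxid
using Usage = int64_t;
using Count = unsigned int;

// Values copied from the default constants in the header.
constexpr Usage RESERVED_WEIGHT_PER_PEER{404'000};
constexpr Count MAX_LATENCY_SCORE{100};

struct AccountingSketch {
    std::set<std::pair<Wtxid, NodeId>> announcements; // unique by (wtxid, peer)
    std::map<Wtxid, Usage> weights;                    // one entry per unique orphan

    void AddAnnouncement(Wtxid wtxid, NodeId peer, Usage weight) {
        assert(weight > 0);
        announcements.emplace(wtxid, peer);
        weights.emplace(wtxid, weight); // no-op if the orphan is already stored
    }

    // Not de-duplicated: the same orphan from two peers counts twice.
    Count CountAnnouncements() const { return static_cast<Count>(announcements.size()); }
    Count CountUniqueOrphans() const { return static_cast<Count>(weights.size()); }

    // Each unique orphan is counted once.
    Usage TotalOrphanUsage() const {
        Usage total{0};
        for (const auto& entry : weights) total += entry.second;
        return total;
    }

    // Each peer is charged the full weight of every orphan it announced.
    Usage UsageByPeer(NodeId peer) const {
        Usage total{0};
        for (const auto& ann : announcements) {
            if (ann.second == peer) total += weights.at(ann.first);
        }
        return total;
    }
};

// Assumed scaling of limits with the number of peers (at least one peer is always counted).
inline Usage MaxGlobalUsage(Count num_peers) { return RESERVED_WEIGHT_PER_PEER * std::max<Count>(1, num_peers); }
inline Count MaxPeerLatencyScore(Count num_peers) { return MAX_LATENCY_SCORE / std::max<Count>(1, num_peers); }
```

In this model, two peers announcing the same orphan produce two announcements but one unique orphan, so the sum of `UsageByPeer()` over all peers exceeds `TotalOrphanUsage()`.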


@@ -170,27 +170,6 @@ BOOST_AUTO_TEST_CASE(DoS_mapOrphans)
 expected_num_orphans -= 2;
 BOOST_CHECK(orphanage->Size() == expected_num_orphans);
 }
-// Test LimitOrphanTxSize() function, nothing should timeout:
-FastRandomContext rng{/*fDeterministic=*/true};
-orphanage->LimitOrphans(rng);
-BOOST_CHECK_EQUAL(orphanage->Size(), expected_num_orphans);
-// Add one more orphan, check timeout logic
-auto timeout_tx = MakeTransactionSpending(/*outpoints=*/{}, rng);
-orphanage->AddTx(timeout_tx, 0);
-expected_num_orphans += 1;
-BOOST_CHECK_EQUAL(orphanage->Size(), expected_num_orphans);
-// One second shy of expiration
-SetMockTime(now + node::ORPHAN_TX_EXPIRE_TIME - 1s);
-orphanage->LimitOrphans(rng);
-BOOST_CHECK_EQUAL(orphanage->Size(), expected_num_orphans);
-// Jump one more second, orphan should be timed out on limiting
-SetMockTime(now + node::ORPHAN_TX_EXPIRE_TIME);
-orphanage->LimitOrphans(rng);
-BOOST_CHECK_EQUAL(orphanage->Size(), 0);
 }
 BOOST_AUTO_TEST_CASE(same_txid_diff_witness)