clusterlin: include topological pot subsets automatically (optimization)

Automatically add topologically-valid subsets of the potential set pot
to inc. It can be proven that these must be part of the best reachable
topologically-valid set from that work item.

This is a crucial optimization that (apparently) reduces the maximum
number of iterations from ~2^(N-1) to ~sqrt(2^N).

Co-Authored-By: Suhas Daftuar <sdaftuar@gmail.com>
Pieter Wuille
2024-05-09 13:53:27 -04:00
parent e20fda77a2
commit 71f2629398
3 changed files with 57 additions and 6 deletions


@@ -647,7 +647,7 @@ public:
* be <= max_iterations. If strictly < max_iterations, the
* returned subset is optimal.
*
- * Complexity: O(N * min(max_iterations, 2^N)) where N=depgraph.TxCount().
+ * Complexity: possibly O(N * min(max_iterations, sqrt(2^N))) where N=depgraph.TxCount().
*/
std::pair<SetInfo<SetType>, uint64_t> FindCandidateSet(uint64_t max_iterations, SetInfo<SetType> best) noexcept
{
@@ -723,7 +723,8 @@ public:
}
/** Internal function to add an item to the queue of elements to explore if there are any
- * transactions left to split on, and to update best/imp.
+ * transactions left to split on, possibly improving it before doing so, and to update
+ * best/imp.
*
* - inc: the "inc" value for the new work item (must be topological).
* - und: the "und" value for the new work item ((inc | und) must be topological).
@@ -746,6 +747,28 @@ public:
pot.Set(m_sorted_depgraph, pos);
}
+ // The "jump ahead" optimization: whenever pot has a topologically-valid subset,
+ // that subset can be added to inc. Any subset of (pot - inc) has the property that
+ // its feerate exceeds that of any set compatible with this work item (superset of
+ // inc, subset of (inc | und)). Thus, if T is a topological subset of pot, and B is
+ // the best topologically-valid set compatible with this work item, and (T - B) is
+ // non-empty, then (T | B) is better than B and also topological. This is in
+ // contradiction with the assumption that B is best. Thus, (T - B) must be empty,
+ // or T must be a subset of B.
+ //
+ // See https://delvingbitcoin.org/t/how-to-linearize-your-cluster/303 section 2.4.
+ const auto init_inc = inc.transactions;
+ for (auto pos : pot.transactions - inc.transactions) {
+ // If the transaction's ancestors are a subset of pot, we can add it together
+ // with its ancestors to inc. Just update the transactions here; the feerate
+ // update happens below.
+ auto anc_todo = m_sorted_depgraph.Ancestors(pos) & m_todo;
+ if (anc_todo.IsSubsetOf(pot.transactions)) inc.transactions |= anc_todo;
+ }
+ // Finally update und and inc's feerate to account for the added transactions.
+ und -= inc.transactions;
+ inc.feerate += m_sorted_depgraph.FeeRate(inc.transactions - init_inc);
// If inc's feerate is better than best's, remember it as our new best.
if (inc.feerate > best.feerate) {
best = inc;
@@ -892,7 +915,7 @@ public:
* - A boolean indicating whether the result is guaranteed to be
* optimal.
*
- * Complexity: O(N * min(max_iterations + N, 2^N)) where N=depgraph.TxCount().
+ * Complexity: possibly O(N * min(max_iterations + N, sqrt(2^N))) where N=depgraph.TxCount().
*/
template<typename SetType>
std::pair<std::vector<ClusterIndex>, bool> Linearize(const DepGraph<SetType>& depgraph, uint64_t max_iterations, uint64_t rng_seed, Span<const ClusterIndex> old_linearization = {}) noexcept
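
The set manipulation in the "jump ahead" hunk above can be tried in isolation. The following is a minimal sketch only, not the code from this commit: it assumes a toy representation where a transaction set is a 32-bit mask, and the names TxSet and JumpAhead are hypothetical. It covers just the extension of inc; the und and feerate updates that follow in the real code are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy type: a transaction set is a bitmask over at most 32 transactions.
using TxSet = uint32_t;

// Sketch of the jump-ahead step (assumed helper, not part of the commit).
// ancestors[i] is the set of ancestors of transaction i, including i itself.
// todo is the set of transactions still to be linearized; pot and inc play the
// roles described in the diff comment. Returns inc extended with every
// transaction in pot whose remaining ancestors all lie within pot.
TxSet JumpAhead(const std::vector<TxSet>& ancestors, TxSet todo, TxSet pot, TxSet inc)
{
    assert((inc & ~pot) == 0); // inc is expected to be a subset of pot
    for (unsigned pos = 0; pos < ancestors.size(); ++pos) {
        const TxSet bit = TxSet{1} << pos;
        if (!(pot & bit) || (inc & bit)) continue; // only consider pot - inc
        // Ancestors of pos that have not been linearized yet.
        const TxSet anc_todo = ancestors[pos] & todo;
        // If they all lie in pot, they form a topologically-valid subset of pot.
        if ((anc_todo & ~pot) == 0) inc |= anc_todo;
    }
    return inc;
}
```

As an example, take four transactions where 1 spends 0 and 3 spends 2, so ancestors is {0b0001, 0b0011, 0b0100, 0b1100}. With todo = 0b1111, pot = 0b1011 and an empty inc, the call returns 0b0011: transactions 0 and 1 are pulled in, while 3 is skipped because its parent 2 is not in pot. If transaction 2 has already been linearized (todo = 0b1011), the same call returns 0b1011.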