Merge bitcoin/bitcoin#27877: wallet: Add CoinGrinder coin selection algorithm

13161ecf03 opt: Skip over barren combinations of tiny UTXOs (Murch)
b7672c7cdd opt: Skip checking max_weight separately (Murch)
1edd2baa37 opt: Cut if last addition was minimal weight (Murch)
5248e2a60d opt: Skip heavier UTXOs with same effective value (Murch)
9124c73742 opt: Tiebreak UTXOs by weight for CoinGrinder (Murch)
451be19dc1 opt: Skip evaluation of equivalent input sets (Murch)
407b1e3432 opt: Track remaining effective_value in lookahead (Murch)
5f84f3cc04 opt: Skip branches with worse weight (Murch)
d68bc74fb2 fuzz: Test optimality of CoinGrinder (Murch)
67df6c629a fuzz: Add CoinGrinder fuzz target (Murch)
1502231229 coinselection: Track whether CG completed (Murch)
7488acc646 test: Add coin_grinder_tests (Murch)
6cc9a46cd0 coinselection: Add CoinGrinder algorithm (Murch)
89d0956643 opt: Tie-break UTXO sort by waste for BnB (Murch)
aaee65823c doc: Document max_weight on BnB (Murch)

Pull request description:

  ***Please refer to the [topic on Delving Bitcoin](https://delvingbitcoin.org/t/gutterguard-and-coingrinder-simulation-results/279) discussing Gutter Guard/Coingrinder simulation results.***

  Adds a coin selection algorithm that minimizes the weight of the input set while creating change.

  Motivations
  ---

  - At high feerates, using unnecessary inputs can significantly increase the fees
  - Users are upset when fees are relatively large compared to the amount sent
  - Some users struggle to maintain a sufficient count of UTXOs in their wallet

  Approach
  ---

  So far, Bitcoin Core has used a balanced approach to coin selection, where it will generate multiple input set candidates using various coin selection algorithms and pick the least wasteful among their results, but not explicitly minimize the input set weight. Under some circumstances, we _do_ want to minimize the weight of the input set. Sometimes changeless solutions require many or heavy inputs, and there is not always a changeless solution for Branch and Bound to find in the first place. This can cause expensive transactions unnecessarily. Given a wallet with sufficient funds, `CoinGrinder` will pick the minimal-waste input set for a transaction with a change output. The current implementation only runs `CoinGrinder` at feerates over 3×long-term-feerate-estimate (by default 30 ṩ/vB), which may be a decent compromise between our goal to reduce costs for the users, but still permit transactions at lower feerates to naturally reduce the wallet’s UTXO pool to curb bloat.

  Trade-offs
  ---

  Simulations for my thesis on coin selection ([see Section 6.3.2.1 [PDF]](https://murch.one/erhardt2016coinselection.pdf)) suggest that minimizing the input set for all transactions tends to grind a wallet’s UTXO pool to dust (pun intended): an approach selecting inputs per coin-age-priority (in effect similar to “largest first selection”) on average produced a UTXO pool with 15× the UTXO count as Bitcoin Core’s Knapsack-based Coin Selection then (in 2016). Therefore, I do not recommend running `CoinGrinder` under all circumstances, but only at extreme feerates or when we have another good reason to minimize the input set for other reasons. In the long-term, we should introduce additional metrics to score different input set candidates, e.g. on basis of their privacy and wallet health impact, to pick from all our coin selection results, but until then, we may want to limit use of `CoinGrinder` in other ways.

ACKs for top commit:
  achow101:
    ACK 13161ecf03
  sr-gi:
    ACK [13161ec](13161ecf03)
  sipa:
    ACK 13161ecf03

Tree-SHA512: 895b08b2ebfd0b71127949b7dba27146a6d10700bf8590402b14f261e7b937f4e2e1b24ca46de440c35f19349043ed2eba4159dc2aa3edae57721384186dae40
This commit is contained in:
Ava Chow
2024-02-09 16:30:36 -05:00
5 changed files with 745 additions and 4 deletions

View File

@@ -11,6 +11,7 @@
#include <test/util/setup_common.h>
#include <wallet/coinselection.h>
#include <numeric>
#include <vector>
namespace wallet {
@@ -77,6 +78,144 @@ static SelectionResult ManualSelection(std::vector<COutput>& utxos, const CAmoun
// Returns true if the result contains an error and the message is not empty
static bool HasErrorMsg(const util::Result<SelectionResult>& res) { return !util::ErrorString(res).empty(); }
FUZZ_TARGET(coin_grinder)
{
FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
std::vector<COutput> utxo_pool;
const CAmount target{fuzzed_data_provider.ConsumeIntegralInRange<CAmount>(1, MAX_MONEY)};
FastRandomContext fast_random_context{ConsumeUInt256(fuzzed_data_provider)};
CoinSelectionParams coin_params{fast_random_context};
coin_params.m_subtract_fee_outputs = fuzzed_data_provider.ConsumeBool();
coin_params.m_long_term_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, /*max=*/COIN)};
coin_params.m_effective_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, /*max=*/COIN)};
coin_params.change_output_size = fuzzed_data_provider.ConsumeIntegralInRange<int>(10, 1000);
coin_params.change_spend_size = fuzzed_data_provider.ConsumeIntegralInRange<int>(10, 1000);
coin_params.m_cost_of_change= coin_params.m_effective_feerate.GetFee(coin_params.change_output_size) + coin_params.m_long_term_feerate.GetFee(coin_params.change_spend_size);
coin_params.m_change_fee = coin_params.m_effective_feerate.GetFee(coin_params.change_output_size);
// For other results to be comparable to SRD, we must align the change_target with SRDs hardcoded behavior
coin_params.m_min_change_target = CHANGE_LOWER + coin_params.m_change_fee;
// Create some coins
CAmount total_balance{0};
CAmount max_spendable{0};
int next_locktime{0};
LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000)
{
const int n_input{fuzzed_data_provider.ConsumeIntegralInRange<int>(0, 10)};
const int n_input_bytes{fuzzed_data_provider.ConsumeIntegralInRange<int>(41, 10000)};
const CAmount amount{fuzzed_data_provider.ConsumeIntegralInRange<CAmount>(1, MAX_MONEY)};
if (total_balance + amount >= MAX_MONEY) {
break;
}
AddCoin(amount, n_input, n_input_bytes, ++next_locktime, utxo_pool, coin_params.m_effective_feerate);
total_balance += amount;
CAmount eff_value = amount - coin_params.m_effective_feerate.GetFee(n_input_bytes);
max_spendable += eff_value;
}
std::vector<OutputGroup> group_pos;
GroupCoins(fuzzed_data_provider, utxo_pool, coin_params, /*positive_only=*/true, group_pos);
// Run coinselection algorithms
auto result_cg = CoinGrinder(group_pos, target, coin_params.m_min_change_target, MAX_STANDARD_TX_WEIGHT);
if (target + coin_params.m_min_change_target > max_spendable || HasErrorMsg(result_cg)) return; // We only need to compare algorithms if CoinGrinder has a solution
assert(result_cg);
if (!result_cg->GetAlgoCompleted()) return; // Bail out if CoinGrinder solution is not optimal
auto result_srd = SelectCoinsSRD(group_pos, target, coin_params.m_change_fee, fast_random_context, MAX_STANDARD_TX_WEIGHT);
if (result_srd && result_srd->GetChange(CHANGE_LOWER, coin_params.m_change_fee) > 0) { // exclude any srd solutions that dont have change, err on excluding
assert(result_srd->GetWeight() >= result_cg->GetWeight());
}
auto result_knapsack = KnapsackSolver(group_pos, target, coin_params.m_min_change_target, fast_random_context, MAX_STANDARD_TX_WEIGHT);
if (result_knapsack && result_knapsack->GetChange(CHANGE_LOWER, coin_params.m_change_fee) > 0) { // exclude any knapsack solutions that dont have change, err on excluding
assert(result_knapsack->GetWeight() >= result_cg->GetWeight());
}
}
FUZZ_TARGET(coin_grinder_is_optimal)
{
FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
FastRandomContext fast_random_context{ConsumeUInt256(fuzzed_data_provider)};
CoinSelectionParams coin_params{fast_random_context};
coin_params.m_subtract_fee_outputs = false;
// Set effective feerate up to MAX_MONEY sats per 1'000'000 vB (2'100'000'000 sat/vB = 21'000 BTC/kvB).
coin_params.m_effective_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, MAX_MONEY), 1'000'000};
coin_params.m_min_change_target = ConsumeMoney(fuzzed_data_provider);
// Create some coins
CAmount max_spendable{0};
int next_locktime{0};
static constexpr unsigned max_output_groups{16};
std::vector<OutputGroup> group_pos;
LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), max_output_groups)
{
// With maximum m_effective_feerate and n_input_bytes = 1'000'000, input_fee <= MAX_MONEY.
const int n_input_bytes{fuzzed_data_provider.ConsumeIntegralInRange<int>(1, 1'000'000)};
// Only make UTXOs with positive effective value
const CAmount input_fee = coin_params.m_effective_feerate.GetFee(n_input_bytes);
// Ensure that each UTXO has at least an effective value of 1 sat
const CAmount eff_value{fuzzed_data_provider.ConsumeIntegralInRange<CAmount>(1, MAX_MONEY - max_spendable - max_output_groups + group_pos.size())};
const CAmount amount{eff_value + input_fee};
std::vector<COutput> temp_utxo_pool;
AddCoin(amount, /*n_input=*/0, n_input_bytes, ++next_locktime, temp_utxo_pool, coin_params.m_effective_feerate);
max_spendable += eff_value;
auto output_group = OutputGroup(coin_params);
output_group.Insert(std::make_shared<COutput>(temp_utxo_pool.at(0)), /*ancestors=*/0, /*descendants=*/0);
group_pos.push_back(output_group);
}
size_t num_groups = group_pos.size();
assert(num_groups <= max_output_groups);
// Only choose targets below max_spendable
const CAmount target{fuzzed_data_provider.ConsumeIntegralInRange<CAmount>(1, std::max(CAmount{1}, max_spendable - coin_params.m_min_change_target))};
// Brute force optimal solution
CAmount best_amount{MAX_MONEY};
int best_weight{std::numeric_limits<int>::max()};
for (uint32_t pattern = 1; (pattern >> num_groups) == 0; ++pattern) {
CAmount subset_amount{0};
int subset_weight{0};
for (unsigned i = 0; i < num_groups; ++i) {
if ((pattern >> i) & 1) {
subset_amount += group_pos[i].GetSelectionAmount();
subset_weight += group_pos[i].m_weight;
}
}
if ((subset_amount >= target + coin_params.m_min_change_target) && (subset_weight < best_weight || (subset_weight == best_weight && subset_amount < best_amount))) {
best_weight = subset_weight;
best_amount = subset_amount;
}
}
if (best_weight < std::numeric_limits<int>::max()) {
// Sufficient funds and acceptable weight: CoinGrinder should find at least one solution
int high_max_weight = fuzzed_data_provider.ConsumeIntegralInRange<int>(best_weight, std::numeric_limits<int>::max());
auto result_cg = CoinGrinder(group_pos, target, coin_params.m_min_change_target, high_max_weight);
assert(result_cg);
assert(result_cg->GetWeight() <= high_max_weight);
assert(result_cg->GetSelectedEffectiveValue() >= target + coin_params.m_min_change_target);
assert(best_weight < result_cg->GetWeight() || (best_weight == result_cg->GetWeight() && best_amount <= result_cg->GetSelectedEffectiveValue()));
if (result_cg->GetAlgoCompleted()) {
// If CoinGrinder exhausted the search space, it must return the optimal solution
assert(best_weight == result_cg->GetWeight());
assert(best_amount == result_cg->GetSelectedEffectiveValue());
}
}
// CoinGrinder cannot ever find a better solution than the brute-forced best, or there is none in the first place
int low_max_weight = fuzzed_data_provider.ConsumeIntegralInRange<int>(0, best_weight - 1);
auto result_cg = CoinGrinder(group_pos, target, coin_params.m_min_change_target, low_max_weight);
// Max_weight should have been exceeded, or there were insufficient funds
assert(!result_cg);
}
FUZZ_TARGET(coinselection)
{
FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};