Merge bitcoin/bitcoin#24410: [kernel 2a/n] Split hashing/index GetUTXOStats codepaths, decouple from coinstatsindex

664a14ba7c coinstats: Move GetUTXOStats to rpc/blockchain (Carl Dong)
f100687566 kernel: Use ComputeUTXOStats in validation (Carl Dong)
faa52387e8 style-only: Rearrange using decls after scripted-diff (Carl Dong)
f329a9298c scripted-diff: Move src/kernel/coinstats to kernel:: (Carl Dong)
0e54456f04 Use only kernel/coinstats.h in index/coinstatsindex.h (Carl Dong)
80970985c9 coinstats: Split node/coinstats.h to kernel/coinstats.h (Carl Dong)
35f73ce4b2 coinstats: Move hasher codepath to kernel/coinstats (Carl Dong)
b7634fe02b Move logic from LookupUTXOStatsWithIndex to CoinStatsIndex::LookUpStats (Carl Dong)
1352e410a5 coinstats: Separate hasher/index lookup codepaths (Carl Dong)
524463daf6 coinstats: Return purely out-param CCoinsStats (Carl Dong)
46eb9fc56a coinstats: Extract index_requested in-member to in-param (Carl Dong)
a789f3f2b8 coinstats: Extract hash_type in-member to in-param (Carl Dong)
102294898d includes: Remove rpc/util.h -> node/coinstats.h (Carl Dong)
0848db9c35 fuzz: Remove useless GetUTXOStats fuzz case (Carl Dong)
52b1939993 kernel: Remove unnecessary blockfilter{index,}.cpp (Carl Dong)

Pull request description:

  Part of: #24303
  Depends on: #24322

  The `GetUTXOStats` function has 2 codepaths:
    - One which queries the `CoinStatsIndex` for the UTXO hash
    - One which actually performs the hashing

  For `libbitcoinkernel`, the only place where we call `GetUTXOStats` is in `PopulateAndValidateSnapshots`, which uses the `SHA256D` hash, and is therefore unable to use the `CoinStatsIndex` since that only provides `MuHash` hashes. Not that I think indices necessarily belong in `libbitcoinkernel` anyway.

  This PR separates these 2 aforementioned codepaths of `GetUTXOStats`, uses the hashing codepath in `PopulateAndValidateSnapshots`, and removes the need to link in `index/coinstatsindex.cpp` and `node/coinstats.cpp`.

  -----

  Logistically, this PR:
  - Extracts out the `index_requested` and `hash_type` members of `CoinStats`, which served as "in-params" to `GetUTXOStats` embedded within the `CoinStats` struct. This allows `CoinStats` to only consist of "out-param" members, and be returned by `GetUTXOStats` without needing to be an "in-out" param
  - Introduce the purely virtual `UTXOHashers` class, with 3 implementations: `SHA256DHasher`, `MuHashHasher`, and `NullHasher`. These replace the existing template-based polymorphism.
  - Split `GetUTXOStats` into:
      - `CalculateUTXOStatsWithHasher(UTXOHasher&, ...)`, and
      - `LookupUTXOStatsWithIndex(CoinStatsIndex&, ...)`
  - Use `CalculateUTXOStatsWithHasher` directly where appropriate (`src/validation.cpp` and `src/fuzz`)
  - Move `GetUTXOStats` to `rpc/blockchain`, which is the only place that depends on `GetUTXOStats`'s weird fallback behaviour
  - Move `LookupUTXOStatsWithIndex` to `index/coinstatsindex`

  Code organization:
  - `src/`
    - `kernel/` → only contains the hashing codepath
      - `coinstats.cpp` → hashing codepath implementations
      - `coinstats.h` → header for `kernel/coinstats.cpp`
    - `index/` → only contains the index codepath
      - `coinstatsindex.cpp` → index codepath implementations
      - `coinstatsindex.h`
    - `validation.cpp` → only uses the hashing codepath
    - `rpc/blockchain.cpp` → uses both the hashing and index codepath, old `GetUTXOStats` fallback logic moved here as static
    - `test/fuzz/coins_view.cpp` → only uses the hashing codepath

  TODOs:
  - [x] Commit messages could be fleshed out more

  Would love any feedback!

ACKs for top commit:
  laanwj:
    Code review ACK 664a14ba7c

Tree-SHA512: 18722c7bd279174d2d1881fec33ea04a9b261aae1c12e998cf434ef297d8ded47de69c526c8033a2ba7abc93ba3d2ff5faf4ce05e8888c725c31cf885ce3ef73
This commit is contained in:
laanwj
2022-05-24 13:55:57 +02:00
11 changed files with 145 additions and 131 deletions

View File

@@ -1,185 +0,0 @@
// Copyright (c) 2010 Satoshi Nakamoto
// Copyright (c) 2009-2021 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#include <node/coinstats.h>
#include <coins.h>
#include <crypto/muhash.h>
#include <hash.h>
#include <index/coinstatsindex.h>
#include <serialize.h>
#include <uint256.h>
#include <util/overflow.h>
#include <util/system.h>
#include <validation.h>
#include <map>
namespace node {
// Database-independent metric indicating the UTXO set size
uint64_t GetBogoSize(const CScript& script_pub_key)
{
return 32 /* txid */ +
4 /* vout index */ +
4 /* height + coinbase */ +
8 /* amount */ +
2 /* scriptPubKey len */ +
script_pub_key.size() /* scriptPubKey */;
}
CDataStream TxOutSer(const COutPoint& outpoint, const Coin& coin) {
CDataStream ss(SER_DISK, PROTOCOL_VERSION);
ss << outpoint;
ss << static_cast<uint32_t>(coin.nHeight * 2 + coin.fCoinBase);
ss << coin.out;
return ss;
}
//! Warning: be very careful when changing this! assumeutxo and UTXO snapshot
//! validation commitments are reliant on the hash constructed by this
//! function.
//!
//! If the construction of this hash is changed, it will invalidate
//! existing UTXO snapshots. This will not result in any kind of consensus
//! failure, but it will force clients that were expecting to make use of
//! assumeutxo to do traditional IBD instead.
//!
//! It is also possible, though very unlikely, that a change in this
//! construction could cause a previously invalid (and potentially malicious)
//! UTXO snapshot to be considered valid.
static void ApplyHash(CHashWriter& ss, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
{
for (auto it = outputs.begin(); it != outputs.end(); ++it) {
if (it == outputs.begin()) {
ss << hash;
ss << VARINT(it->second.nHeight * 2 + it->second.fCoinBase ? 1u : 0u);
}
ss << VARINT(it->first + 1);
ss << it->second.out.scriptPubKey;
ss << VARINT_MODE(it->second.out.nValue, VarIntMode::NONNEGATIVE_SIGNED);
if (it == std::prev(outputs.end())) {
ss << VARINT(0u);
}
}
}
static void ApplyHash(std::nullptr_t, const uint256& hash, const std::map<uint32_t, Coin>& outputs) {}
static void ApplyHash(MuHash3072& muhash, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
{
for (auto it = outputs.begin(); it != outputs.end(); ++it) {
COutPoint outpoint = COutPoint(hash, it->first);
Coin coin = it->second;
muhash.Insert(MakeUCharSpan(TxOutSer(outpoint, coin)));
}
}
static void ApplyStats(CCoinsStats& stats, const uint256& hash, const std::map<uint32_t, Coin>& outputs)
{
assert(!outputs.empty());
stats.nTransactions++;
for (auto it = outputs.begin(); it != outputs.end(); ++it) {
stats.nTransactionOutputs++;
if (stats.total_amount.has_value()) {
stats.total_amount = CheckedAdd(*stats.total_amount, it->second.out.nValue);
}
stats.nBogoSize += GetBogoSize(it->second.out.scriptPubKey);
}
}
//! Calculate statistics about the unspent transaction output set
template <typename T>
static bool GetUTXOStats(CCoinsView* view, BlockManager& blockman, CCoinsStats& stats, T hash_obj, const std::function<void()>& interruption_point, const CBlockIndex* pindex)
{
std::unique_ptr<CCoinsViewCursor> pcursor(view->Cursor());
assert(pcursor);
if (!pindex) {
LOCK(cs_main);
pindex = blockman.LookupBlockIndex(view->GetBestBlock());
}
stats.nHeight = Assert(pindex)->nHeight;
stats.hashBlock = pindex->GetBlockHash();
// Use CoinStatsIndex if it is requested and available and a hash_type of Muhash or None was requested
if ((stats.m_hash_type == CoinStatsHashType::MUHASH || stats.m_hash_type == CoinStatsHashType::NONE) && g_coin_stats_index && stats.index_requested) {
stats.index_used = true;
return g_coin_stats_index->LookUpStats(pindex, stats);
}
PrepareHash(hash_obj, stats);
uint256 prevkey;
std::map<uint32_t, Coin> outputs;
while (pcursor->Valid()) {
interruption_point();
COutPoint key;
Coin coin;
if (pcursor->GetKey(key) && pcursor->GetValue(coin)) {
if (!outputs.empty() && key.hash != prevkey) {
ApplyStats(stats, prevkey, outputs);
ApplyHash(hash_obj, prevkey, outputs);
outputs.clear();
}
prevkey = key.hash;
outputs[key.n] = std::move(coin);
stats.coins_count++;
} else {
return error("%s: unable to read value", __func__);
}
pcursor->Next();
}
if (!outputs.empty()) {
ApplyStats(stats, prevkey, outputs);
ApplyHash(hash_obj, prevkey, outputs);
}
FinalizeHash(hash_obj, stats);
stats.nDiskSize = view->EstimateSize();
return true;
}
bool GetUTXOStats(CCoinsView* view, BlockManager& blockman, CCoinsStats& stats, const std::function<void()>& interruption_point, const CBlockIndex* pindex)
{
switch (stats.m_hash_type) {
case(CoinStatsHashType::HASH_SERIALIZED): {
CHashWriter ss(SER_GETHASH, PROTOCOL_VERSION);
return GetUTXOStats(view, blockman, stats, ss, interruption_point, pindex);
}
case(CoinStatsHashType::MUHASH): {
MuHash3072 muhash;
return GetUTXOStats(view, blockman, stats, muhash, interruption_point, pindex);
}
case(CoinStatsHashType::NONE): {
return GetUTXOStats(view, blockman, stats, nullptr, interruption_point, pindex);
}
} // no default case, so the compiler can warn about missing cases
assert(false);
}
// The legacy hash serializes the hashBlock
static void PrepareHash(CHashWriter& ss, const CCoinsStats& stats)
{
ss << stats.hashBlock;
}
// MuHash does not need the prepare step
static void PrepareHash(MuHash3072& muhash, CCoinsStats& stats) {}
static void PrepareHash(std::nullptr_t, CCoinsStats& stats) {}
static void FinalizeHash(CHashWriter& ss, CCoinsStats& stats)
{
stats.hashSerialized = ss.GetHash();
}
static void FinalizeHash(MuHash3072& muhash, CCoinsStats& stats)
{
uint256 out;
muhash.Finalize(out);
stats.hashSerialized = out;
}
static void FinalizeHash(std::nullptr_t, CCoinsStats& stats) {}
} // namespace node

View File

@@ -1,84 +0,0 @@
// Copyright (c) 2010 Satoshi Nakamoto
// Copyright (c) 2009-2021 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#ifndef BITCOIN_NODE_COINSTATS_H
#define BITCOIN_NODE_COINSTATS_H
#include <chain.h>
#include <coins.h>
#include <consensus/amount.h>
#include <streams.h>
#include <uint256.h>
#include <cstdint>
#include <functional>
class CCoinsView;
namespace node {
class BlockManager;
} // namespace node
namespace node {
enum class CoinStatsHashType {
HASH_SERIALIZED,
MUHASH,
NONE,
};
struct CCoinsStats {
//! Which hash type to use
const CoinStatsHashType m_hash_type;
int nHeight{0};
uint256 hashBlock{};
uint64_t nTransactions{0};
uint64_t nTransactionOutputs{0};
uint64_t nBogoSize{0};
uint256 hashSerialized{};
uint64_t nDiskSize{0};
//! The total amount, or nullopt if an overflow occurred calculating it
std::optional<CAmount> total_amount{0};
//! The number of coins contained.
uint64_t coins_count{0};
//! Signals if the coinstatsindex should be used (when available).
bool index_requested{true};
//! Signals if the coinstatsindex was used to retrieve the statistics.
bool index_used{false};
// Following values are only available from coinstats index
//! Total cumulative amount of block subsidies up to and including this block
CAmount total_subsidy{0};
//! Total cumulative amount of unspendable coins up to and including this block
CAmount total_unspendable_amount{0};
//! Total cumulative amount of prevouts spent up to and including this block
CAmount total_prevout_spent_amount{0};
//! Total cumulative amount of outputs created up to and including this block
CAmount total_new_outputs_ex_coinbase_amount{0};
//! Total cumulative amount of coinbase outputs up to and including this block
CAmount total_coinbase_amount{0};
//! The unspendable coinbase amount from the genesis block
CAmount total_unspendables_genesis_block{0};
//! The two unspendable coinbase outputs total amount caused by BIP30
CAmount total_unspendables_bip30{0};
//! Total cumulative amount of outputs sent to unspendable scripts (OP_RETURN for example) up to and including this block
CAmount total_unspendables_scripts{0};
//! Total cumulative amount of coins lost due to unclaimed miner rewards up to and including this block
CAmount total_unspendables_unclaimed_rewards{0};
CCoinsStats(CoinStatsHashType hash_type) : m_hash_type(hash_type) {}
};
//! Calculate statistics about the unspent transaction output set
bool GetUTXOStats(CCoinsView* view, node::BlockManager& blockman, CCoinsStats& stats, const std::function<void()>& interruption_point = {}, const CBlockIndex* pindex = nullptr);
uint64_t GetBogoSize(const CScript& script_pub_key);
CDataStream TxOutSer(const COutPoint& outpoint, const Coin& coin);
} // namespace node
#endif // BITCOIN_NODE_COINSTATS_H