Merge bitcoin/bitcoin#35128: dbwrapper: avoid copying CDBIterator keys in GetKey()

5de2f97a05 dbwrapper: use `SpanReader` for iterator keys (Lőrinc)
f0e498af5c test: cover failed `CDBIterator::GetKey()` deserialization (Lőrinc)

Pull request description:

  ### Problem
  `CDBIterator::GetKey()` only deserializes the current LevelDB key once and `GetKeyImpl()` already exposes that key as a contiguous borrowed byte span, and `GetKey()` creates a fresh local reader and only performs immediate forward reads before returning.

  The copied `DataStream` currently insulates the iterator entry from a failed decode, so switching to a borrowed reader is only safe if a deserialization failure still returns false and leaves the same key/value readable afterward.

  > [!NOTE]
  > The same simplification does not apply to `GetValue()`, because that path deobfuscates the value bytes in place first and still needs an owning mutable buffer.

  ### Fix
  Add a preparatory test with an invalid reads and checks that the failed decode [does not consume](eb85cacd29/src/leveldb/include/leveldb/iterator.h (L60-L62)) the current iterator entry.
  Then switch `GetKey()` to `SpanReader` so the key bytes are read in place instead of being copied into a temporary `DataStream`.

  This keeps the same exception swallowing and `bool` return semantics while avoiding the extra allocation and copy.

  ### Context
  Related to https://github.com/bitcoin/bitcoin/pull/34483 and https://github.com/bitcoin/bitcoin/pull/35025

  ### Reproducer
  `gettxoutsetinfo` is ~10-12% faster for up-to-date blocks (run on SSD), see:

  <details><summary>2026-04-20 | gettxoutsetinfo | rpi5-8 | aarch64 | Cortex-A76 | 4 cores | 7.7Gi RAM | ext4 | SSD</summary>

  ```
  COMMITS="64a88c8c1edc7ee5cef623d9aa8179a239e27ce9 57dc0202ddb7b4cbdd521fb237a25fc4d7f28ddf"; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  mkdir -p "$LOG_DIR" && \
  (echo ""; for c in $COMMITS; do git cat-file -e "$c^{commit}" 2>/dev/null || git fetch -q origin "$c" || exit 1; git log -1 --pretty='%h %s' "$c" || exit 1; done) && \
  (echo "" && echo "$(date -I) | gettxoutsetinfo | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(df -T $BASE_DIR | awk 'NR==2{print $2}') | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
  hyperfine \
    --sort command \
    --runs 10 \
    --export-json "$BASE_DIR/gettxoutsetinfo-$(sed -E 's/([a-f0-9]{8})[a-f0-9]* ?/\1-/g;s/-$//'<<<"$COMMITS")-$(date +%s).json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall -9 bitcoind 2>/dev/null || true; rm -f $DATA_DIR/debug.log; git clean -fxd && git reset --hard {COMMIT} && \
      cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind bitcoin-cli -j$(nproc) && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -connect=0 -listen=0 -dnsseed=0 -coinstatsindex=0 -txindex=0 -blockfilterindex=0 -daemon -printtoconsole=0; \
      ./build/bin/bitcoin-cli -datadir=$DATA_DIR -rpcwait getblockcount >/dev/null" \
    --conclude "./build/bin/bitcoin-cli -datadir=$DATA_DIR stop 2>/dev/null || true; killall bitcoind 2>/dev/null || true; sleep 10; \
      grep -q 'Done loading' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
      cp $DATA_DIR/debug.log $LOG_DIR/gettxoutsetinfo-{COMMIT}-$(date +%s).log" \
    "./build/bin/bitcoin-cli -datadir=$DATA_DIR -rpcclienttimeout=0 -named gettxoutsetinfo hash_type='none' use_index='false' >/dev/null"

  64a88c8c1e Merge bitcoin/bitcoin#35096: kernel: align height parameters to int32_t in btck API
  57dc0202dd dbwrapper: use SpanReader for iterator keys

  Benchmark 1: ./build/bin/bitcoin-cli -datadir=/mnt/my_storage/BitcoinData -rpcclienttimeout=0 -named gettxoutsetinfo hash_type='none' use_index='false' >/dev/null (COMMIT = 64a88c8c1e)
    Time (mean ± σ):     109.002 s ±  3.091 s    [User: 0.003 s, System: 0.004 s]
    Range (min … max):   106.191 s … 113.608 s    10 runs

  Benchmark 2: ./build/bin/bitcoin-cli -datadir=/mnt/my_storage/BitcoinData -rpcclienttimeout=0 -named gettxoutsetinfo hash_type='none' use_index='false' >/dev/null (COMMIT = 57dc0202ddb7b4cbdd521fb237a25fc4d7f28ddf)
    Time (mean ± σ):     97.711 s ±  1.172 s    [User: 0.003 s, System: 0.004 s]
    Range (min … max):   96.651 s … 100.104 s    10 runs

  Relative speed comparison
          1.12 ±  0.03  ./build/bin/bitcoin-cli -datadir=/mnt/my_storage/BitcoinData -rpcclienttimeout=0 -named gettxoutsetinfo hash_type='none' use_index='false' >/dev/null (COMMIT = 64a88c8c1e)
          1.00          ./build/bin/bitcoin-cli -datadir=/mnt/my_storage/BitcoinData -rpcclienttimeout=0 -named gettxoutsetinfo hash_type='none' use_index='false' >/dev/null (COMMIT = 57dc0202ddb7b4cbdd521fb237a25fc4d7f28ddf)
  ```

  </details>

ACKs for top commit:
  achow101:
    ACK 5de2f97a05
  sedited:
    ACK 5de2f97a05
  andrewtoth:
    ACK 5de2f97a05
  optout21:
    ACK 5de2f97a05
  theStack:
    ACK 5de2f97a05

Tree-SHA512: 33b62149625b3ce2a378be9b4dffa361f11e324a2768e460c549b9b704efa78bf96ef5e24487d0cec82c18dafff6ba4571c06ad545684cf8738f38b9d21e9b0c
This commit is contained in:
Ava Chow
2026-04-23 11:50:44 -07:00
3 changed files with 8 additions and 2 deletions

View File

@@ -371,6 +371,8 @@ void CDBIterator::SeekImpl(std::span<const std::byte> key)
std::span<const std::byte> CDBIterator::GetKeyImpl() const
{
// The returned span borrows from the current iterator entry and is only
// valid until the iterator is advanced.
return MakeByteSpan(m_impl_iter->iter->key());
}

View File

@@ -154,7 +154,7 @@ public:
template<typename K> bool GetKey(K& key) {
try {
DataStream ssKey{GetKeyImpl()};
SpanReader ssKey{GetKeyImpl()};
ssKey >> key;
} catch (const std::exception&) {
return false;

View File

@@ -202,11 +202,15 @@ BOOST_AUTO_TEST_CASE(dbwrapper_iterator)
uint256 in2 = m_rng.rand256();
dbw.Write(key2, in2);
std::unique_ptr<CDBIterator> it(const_cast<CDBWrapper&>(dbw).NewIterator());
std::unique_ptr<CDBIterator> it(dbw.NewIterator());
// Be sure to seek past the obfuscation key (if it exists)
it->Seek(key);
// A failed key decode must not consume the current iterator entry.
uint16_t key_too_large{0};
BOOST_CHECK(!it->GetKey(key_too_large));
uint8_t key_res;
uint256 val_res;