Merge bitcoin/bitcoin#31551: [IBD] batch block reads/writes during AutoFile serialization

8d801e3efb optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock` (Lőrinc)
520965e293 optimization: bulk serialization reads in `UndoRead`, `ReadBlock` (Lőrinc)
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization (Lőrinc)
67fcc64802 log: unify error messages for (read/write)[undo]block (Lőrinc)
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant (Lőrinc)
6640dd52c9 Narrow scope of undofile write to avoid possible resource management issue (Lőrinc)
3197155f91 refactor: collect block read operations into try block (Lőrinc)
c77e3107b8 refactor: rename leftover WriteBlockBench (Lőrinc)

Pull request description:

  This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)

  ### Summary
  We can serialize the blocks and undos to any `Stream` which implements the appropriate read/write methods.
  `AutoFile` is one of these, writing the results "directly" to disk (through the OS file cache). Batching these operations in memory first and doing the disk reads/writes in larger chunks is measurably faster (likely because of fewer native fread/fwrite calls and less locking, as [observed](https://github.com/bitcoin/bitcoin/pull/28226#issuecomment-1666842501) by Martinus in a similar change).
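
  To make the batching idea concrete, here is a minimal sketch (not the PR's actual `AutoFile`/`BufferedWriter` types; the class name and buffer size are made up for illustration): many small serialization writes are staged in memory and handed to the OS as one `fwrite` call.

  ```cpp
  #include <cstddef>
  #include <cstdio>
  #include <span>
  #include <vector>

  // Toy sketch of batched writes: accumulate small serialized pieces in memory,
  // then hand them to the OS with a single fwrite() instead of one call per field.
  class ToyBufferedWriter
  {
      std::FILE* m_file;
      std::vector<std::byte> m_buf;

  public:
      explicit ToyBufferedWriter(std::FILE* file) : m_file{file} { m_buf.reserve(1 << 20); }

      void write(std::span<const std::byte> src)
      {
          m_buf.insert(m_buf.end(), src.begin(), src.end()); // no I/O yet, just append to the buffer
      }

      void flush()
      {
          if (!m_buf.empty()) {
              std::fwrite(m_buf.data(), 1, m_buf.size(), m_file); // one bulk write for the whole batch
              m_buf.clear();
          }
      }

      ~ToyBufferedWriter() { flush(); }
  };
  ```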

  ### Unlocking new optimization opportunities

  Buffered writes will also enable batched obfuscation calculations (implemented in https://github.com/bitcoin/bitcoin/pull/31144): currently we have to copy the write input's `std::span` to obfuscate it, whereas batching lets us apply the obfuscation directly to the internal buffer.
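
  As a rough illustration of that follow-up (a hypothetical helper, not the code in #31144): once a batch of writes sits in one contiguous buffer, the repeating XOR key can be applied in place over the whole buffer instead of over a copy of each caller-supplied span.

  ```cpp
  #include <array>
  #include <cstddef>
  #include <span>

  // Hypothetical helper: apply the repeating 8-byte XOR obfuscation key in place
  // over a whole buffered batch. key_offset is where in the key this batch starts,
  // since the file offset is generally not a multiple of the key length.
  void XorInPlace(std::span<std::byte> buf, const std::array<std::byte, 8>& key, std::size_t key_offset)
  {
      for (std::size_t i{0}; i < buf.size(); ++i) {
          buf[i] ^= key[(key_offset + i) % key.size()];
      }
  }
  ```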

  ### Measurements (micro benchmarks, full IBDs and reindexes)

  Microbenchmarks for `[Read|Write]BlockBench` show a ~**30%**/**68%** speedup with `macOS/Clang`, and ~**19%**/**24%** with `Linux/GCC` (the follow-up XOR batching improves these further):

  <details>
  <summary>macOS Sequoia - Clang 19.1.7</summary>

  > Before:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench`
  |        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench`

  > After:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |        1,738,683.04 |              575.15 |    0.2% |     11.04 | `ReadBlockBench`
  |        3,052,658.88 |              327.58 |    1.0% |     10.91 | `WriteBlockBench`

  </details>

  <details>
  <summary>Ubuntu 24 - GNU 13.3.0</summary>

  > Before:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench`
  |        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench`

  > After:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench`
  |        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench`

  </details>

  Two full IBD runs against master (compiled with GCC, where the gains seem more modest) for **888888** blocks (seeded from real nodes) indicate a ~**7%** total speedup.

  <details>
  <summary>Details</summary>

  ```bash
  COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
  STOP_HEIGHT=888888; DBCACHE=1000; \
  C_COMPILER=gcc; CXX_COMPILER=g++; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  (for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
  hyperfine \
    --sort 'command' \
    --runs 2 \
    --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
      cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
      cmake --build build -j$(nproc) --target bitcoind && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
    --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
    "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
  d2b72b1369 refactor: rename leftover WriteBlockBench
  652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`
  Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
    Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
    Range (min … max):   41277.786 s … 41778.421 s    2 runs

  Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
    Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
    Range (min … max):   38458.957 s … 39083.957 s    2 runs

  Relative speed comparison
          1.07 ±  0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
          1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
  ```

  </details>

ACKs for top commit:
  maflcko:
    re-ACK 8d801e3efb 🐦
  achow101:
    ACK 8d801e3efb
  ryanofsky:
    Code review ACK 8d801e3efb. Most notable change is switching from BufferedReader to ReadRawBlock for block reads, which makes sense, and there are also various cleanups in blockstorage and test code.
  hodlinator:
    re-ACK 8d801e3efb

Tree-SHA512: 24e1dee653b927b760c0ba3c69d1aba15fa5d9c4536ad11cfc2d70196ae16b9228ecc3056eef70923364257d72dc929882e73e69c6c426e28139d31299d08adc
Committed by Ava Chow on 2025-04-16 15:16:22 -07:00
7 changed files with 332 additions and 88 deletions

src/node/blockstorage.cpp

```diff
@@ -529,7 +529,7 @@ bool BlockManager::LoadBlockIndexDB(const std::optional<uint256>& snapshot_block
     }
     for (std::set<int>::iterator it = setBlkDataFiles.begin(); it != setBlkDataFiles.end(); it++) {
         FlatFilePos pos(*it, 0);
-        if (OpenBlockFile(pos, true).IsNull()) {
+        if (OpenBlockFile(pos, /*fReadOnly=*/true).IsNull()) {
             return false;
         }
     }
@@ -660,27 +660,30 @@ bool BlockManager::ReadBlockUndo(CBlockUndo& blockundo, const CBlockIndex& index
     const FlatFilePos pos{WITH_LOCK(::cs_main, return index.GetUndoPos())};
     // Open history file to read
-    AutoFile filein{OpenUndoFile(pos, true)};
-    if (filein.IsNull()) {
-        LogError("OpenUndoFile failed for %s", pos.ToString());
+    AutoFile file{OpenUndoFile(pos, true)};
+    if (file.IsNull()) {
+        LogError("OpenUndoFile failed for %s while reading block undo", pos.ToString());
         return false;
     }
+    BufferedReader filein{std::move(file)};
 
-    // Read block
-    uint256 hashChecksum;
-    HashVerifier verifier{filein}; // Use HashVerifier as reserializing may lose data, c.f. commit d342424301013ec47dc146a4beb49d5c9319d80a
     try {
+        // Read block
+        HashVerifier verifier{filein}; // Use HashVerifier, as reserializing may lose data, c.f. commit d3424243
         verifier << index.pprev->GetBlockHash();
         verifier >> blockundo;
-        filein >> hashChecksum;
-    } catch (const std::exception& e) {
-        LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
-        return false;
-    }
-    // Verify checksum
-    if (hashChecksum != verifier.GetHash()) {
-        LogError("%s: Checksum mismatch at %s\n", __func__, pos.ToString());
+        uint256 hashChecksum;
+        filein >> hashChecksum;
+        // Verify checksum
+        if (hashChecksum != verifier.GetHash()) {
+            LogError("Checksum mismatch at %s while reading block undo", pos.ToString());
+            return false;
+        }
+    } catch (const std::exception& e) {
+        LogError("Deserialize or I/O error - %s at %s while reading block undo", e.what(), pos.ToString());
         return false;
     }
@@ -931,30 +934,35 @@ bool BlockManager::WriteBlockUndo(const CBlockUndo& blockundo, BlockValidationSt
     // Write undo information to disk
     if (block.GetUndoPos().IsNull()) {
         FlatFilePos pos;
-        const unsigned int blockundo_size{static_cast<unsigned int>(GetSerializeSize(blockundo))};
+        const auto blockundo_size{static_cast<uint32_t>(GetSerializeSize(blockundo))};
         if (!FindUndoPos(state, block.nFile, pos, blockundo_size + UNDO_DATA_DISK_OVERHEAD)) {
-            LogError("FindUndoPos failed");
+            LogError("FindUndoPos failed for %s while writing block undo", pos.ToString());
             return false;
         }
-        // Open history file to append
-        AutoFile fileout{OpenUndoFile(pos)};
-        if (fileout.IsNull()) {
-            LogError("OpenUndoFile failed");
-            return FatalError(m_opts.notifications, state, _("Failed to write undo data."));
+        {
+            // Open history file to append
+            AutoFile file{OpenUndoFile(pos)};
+            if (file.IsNull()) {
+                LogError("OpenUndoFile failed for %s while writing block undo", pos.ToString());
+                return FatalError(m_opts.notifications, state, _("Failed to write undo data."));
+            }
+            BufferedWriter fileout{file};
+            // Write index header
+            fileout << GetParams().MessageStart() << blockundo_size;
+            pos.nPos += STORAGE_HEADER_BYTES;
+            {
+                // Calculate checksum
+                HashWriter hasher{};
+                hasher << block.pprev->GetBlockHash() << blockundo;
+                // Write undo data & checksum
+                fileout << blockundo << hasher.GetHash();
+            }
+            fileout.flush(); // Make sure `AutoFile`/`BufferedWriter` go out of scope before we call `FlushUndoFile`
         }
-        // Write index header
-        fileout << GetParams().MessageStart() << blockundo_size;
-        // Write undo data
-        pos.nPos += BLOCK_SERIALIZATION_HEADER_SIZE;
-        fileout << blockundo;
-        // Calculate & write checksum
-        HashWriter hasher{};
-        hasher << block.pprev->GetBlockHash();
-        hasher << blockundo;
-        fileout << hasher.GetHash();
         // rev files are written in block height order, whereas blk files are written as blocks come in (often out of order)
         // we want to flush the rev (undo) file once we've written the last block, which is indicated by the last height
         // in the block file info as below; note that this does not catch the case where the undo writes are keeping up
@@ -986,29 +994,28 @@ bool BlockManager::ReadBlock(CBlock& block, const FlatFilePos& pos) const
     block.SetNull();
-    // Open history file to read
-    AutoFile filein{OpenBlockFile(pos, true)};
-    if (filein.IsNull()) {
-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
+    std::vector<uint8_t> block_data;
+    if (!ReadRawBlock(block_data, pos)) {
         return false;
     }
-    // Read block
     try {
-        filein >> TX_WITH_WITNESS(block);
+        // Read block
+        SpanReader{block_data} >> TX_WITH_WITNESS(block);
     } catch (const std::exception& e) {
-        LogError("%s: Deserialize or I/O error - %s at %s\n", __func__, e.what(), pos.ToString());
+        LogError("Deserialize or I/O error - %s at %s while reading block", e.what(), pos.ToString());
         return false;
     }
     // Check the header
     if (!CheckProofOfWork(block.GetHash(), block.nBits, GetConsensus())) {
-        LogError("%s: Errors in block header at %s\n", __func__, pos.ToString());
+        LogError("Errors in block header at %s while reading block", pos.ToString());
         return false;
     }
     // Signet only: check block solution
     if (GetConsensus().signet_blocks && !CheckSignetBlockSolution(block, GetConsensus())) {
-        LogError("%s: Errors in block solution at %s\n", __func__, pos.ToString());
+        LogError("Errors in block solution at %s while reading block", pos.ToString());
         return false;
     }
@@ -1023,7 +1030,7 @@ bool BlockManager::ReadBlock(CBlock& block, const CBlockIndex& index) const
         return false;
     }
     if (block.GetHash() != index.GetBlockHash()) {
-        LogError("%s: GetHash() doesn't match index for %s at %s\n", __func__, index.ToString(), block_pos.ToString());
+        LogError("GetHash() doesn't match index for %s at %s while reading block", index.ToString(), block_pos.ToString());
         return false;
     }
     return true;
@@ -1031,17 +1038,16 @@ bool BlockManager::ReadBlock(CBlock& block, const CBlockIndex& index) const
 bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos& pos) const
 {
-    FlatFilePos hpos = pos;
-    // If nPos is less than 8 the pos is null and we don't have the block data
-    // Return early to prevent undefined behavior of unsigned int underflow
-    if (hpos.nPos < 8) {
-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
+    if (pos.nPos < STORAGE_HEADER_BYTES) {
+        // If nPos is less than STORAGE_HEADER_BYTES, we can't read the header that precedes the block data
+        // This would cause an unsigned integer underflow when trying to position the file cursor
+        // This can happen after pruning or default constructed positions
+        LogError("Failed for %s while reading raw block storage header", pos.ToString());
         return false;
     }
-    hpos.nPos -= 8; // Seek back 8 bytes for meta header
-    AutoFile filein{OpenBlockFile(hpos, true)};
+    AutoFile filein{OpenBlockFile({pos.nFile, pos.nPos - STORAGE_HEADER_BYTES}, /*fReadOnly=*/true)};
     if (filein.IsNull()) {
-        LogError("%s: OpenBlockFile failed for %s\n", __func__, pos.ToString());
+        LogError("OpenBlockFile failed for %s while reading raw block", pos.ToString());
         return false;
     }
@@ -1052,22 +1058,21 @@ bool BlockManager::ReadRawBlock(std::vector<uint8_t>& block, const FlatFilePos&
         filein >> blk_start >> blk_size;
         if (blk_start != GetParams().MessageStart()) {
-            LogError("%s: Block magic mismatch for %s: %s versus expected %s\n", __func__, pos.ToString(),
-                     HexStr(blk_start),
-                     HexStr(GetParams().MessageStart()));
+            LogError("Block magic mismatch for %s: %s versus expected %s while reading raw block",
+                     pos.ToString(), HexStr(blk_start), HexStr(GetParams().MessageStart()));
             return false;
         }
         if (blk_size > MAX_SIZE) {
-            LogError("%s: Block data is larger than maximum deserialization size for %s: %s versus %s\n", __func__, pos.ToString(),
-                     blk_size, MAX_SIZE);
+            LogError("Block data is larger than maximum deserialization size for %s: %s versus %s while reading raw block",
+                     pos.ToString(), blk_size, MAX_SIZE);
             return false;
         }
         block.resize(blk_size); // Zeroing of memory is intentional here
         filein.read(MakeWritableByteSpan(block));
     } catch (const std::exception& e) {
-        LogError("%s: Read from block file failed: %s for %s\n", __func__, e.what(), pos.ToString());
+        LogError("Read from block file failed: %s for %s while reading raw block", e.what(), pos.ToString());
         return false;
     }
@@ -1077,22 +1082,23 @@ FlatFilePos BlockManager::WriteBlock(const CBlock& block, int nHeight)
 FlatFilePos BlockManager::WriteBlock(const CBlock& block, int nHeight)
 {
     const unsigned int block_size{static_cast<unsigned int>(GetSerializeSize(TX_WITH_WITNESS(block)))};
-    FlatFilePos pos{FindNextBlockPos(block_size + BLOCK_SERIALIZATION_HEADER_SIZE, nHeight, block.GetBlockTime())};
+    FlatFilePos pos{FindNextBlockPos(block_size + STORAGE_HEADER_BYTES, nHeight, block.GetBlockTime())};
     if (pos.IsNull()) {
-        LogError("FindNextBlockPos failed");
+        LogError("FindNextBlockPos failed for %s while writing block", pos.ToString());
         return FlatFilePos();
     }
-    AutoFile fileout{OpenBlockFile(pos)};
-    if (fileout.IsNull()) {
-        LogError("OpenBlockFile failed");
+    AutoFile file{OpenBlockFile(pos, /*fReadOnly=*/false)};
+    if (file.IsNull()) {
+        LogError("OpenBlockFile failed for %s while writing block", pos.ToString());
         m_opts.notifications.fatalError(_("Failed to write block."));
         return FlatFilePos();
     }
+    BufferedWriter fileout{file};
     // Write index header
     fileout << GetParams().MessageStart() << block_size;
+    pos.nPos += STORAGE_HEADER_BYTES;
     // Write block
-    pos.nPos += BLOCK_SERIALIZATION_HEADER_SIZE;
     fileout << TX_WITH_WITNESS(block);
     return pos;
 }
@@ -1201,7 +1207,7 @@ void ImportBlocks(ChainstateManager& chainman, std::span<const fs::path> import_
             if (!fs::exists(chainman.m_blockman.GetBlockPosFilename(pos))) {
                 break; // No block files left to reindex
             }
-            AutoFile file{chainman.m_blockman.OpenBlockFile(pos, true)};
+            AutoFile file{chainman.m_blockman.OpenBlockFile(pos, /*fReadOnly=*/true)};
             if (file.IsNull()) {
                 break; // This error is logged in OpenBlockFile
             }
```

src/node/blockstorage.h

```diff
@@ -75,10 +75,10 @@ static const unsigned int UNDOFILE_CHUNK_SIZE = 0x100000; // 1 MiB
 static const unsigned int MAX_BLOCKFILE_SIZE = 0x8000000; // 128 MiB
 /** Size of header written by WriteBlock before a serialized CBlock (8 bytes) */
-static constexpr size_t BLOCK_SERIALIZATION_HEADER_SIZE{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};
+static constexpr uint32_t STORAGE_HEADER_BYTES{std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)};
 /** Total overhead when writing undo data: header (8 bytes) plus checksum (32 bytes) */
-static constexpr size_t UNDO_DATA_DISK_OVERHEAD{BLOCK_SERIALIZATION_HEADER_SIZE + uint256::size()};
+static constexpr uint32_t UNDO_DATA_DISK_OVERHEAD{STORAGE_HEADER_BYTES + uint256::size()};
 // Because validation code takes pointers to the map's CBlockIndex objects, if
 // we ever switch to another associative container, we need to either use a
@@ -164,7 +164,7 @@ private:
      * blockfile info, and checks if there is enough disk space to save the block.
      *
      * The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but also the size of
-     * separator fields (BLOCK_SERIALIZATION_HEADER_SIZE).
+     * separator fields (STORAGE_HEADER_BYTES).
      */
     [[nodiscard]] FlatFilePos FindNextBlockPos(unsigned int nAddSize, unsigned int nHeight, uint64_t nTime);
     [[nodiscard]] bool FlushChainstateBlockFile(int tip_height);
@@ -400,7 +400,7 @@ public:
     void UpdatePruneLock(const std::string& name, const PruneLockInfo& lock_info) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     /** Open a block file (blk?????.dat) */
-    AutoFile OpenBlockFile(const FlatFilePos& pos, bool fReadOnly = false) const;
+    AutoFile OpenBlockFile(const FlatFilePos& pos, bool fReadOnly) const;
     /** Translation to a filesystem path */
     fs::path GetBlockPosFilename(const FlatFilePos& pos) const;
```