Files
bitcoin/src/node
Ava Chow 33df4aebae Merge bitcoin/bitcoin#31551: [IBD] batch block reads/writes during AutoFile serialization
8d801e3efb optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock` (Lőrinc)
520965e293 optimization: bulk serialization reads in `UndoRead`, `ReadBlock` (Lőrinc)
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization (Lőrinc)
67fcc64802 log: unify error messages for (read/write)[undo]block (Lőrinc)
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant (Lőrinc)
6640dd52c9 Narrow scope of undofile write to avoid possible resource management issue (Lőrinc)
3197155f91 refactor: collect block read operations into try block (Lőrinc)
c77e3107b8 refactor: rename leftover WriteBlockBench (Lőrinc)

Pull request description:

  This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)

  ### Summary
  We can serialize the blocks and undos to any `Stream` which implements the appropriate read/write methods.
  `AutoFile` is one of these, writing the results "directly" to disk (through the OS file cache). Batching these in memory first and reading/writing these to disk is measurably faster (likely because of fewer native fread calls or less locking, as [observed](https://github.com/bitcoin/bitcoin/pull/28226#issuecomment-1666842501) by Martinus in a similar change).

  ### Unlocking new optimization opportunities

  Buffered writes will also enable batched obfuscation calculations (implemented in https://github.com/bitcoin/bitcoin/pull/31144) - especially since currently we need to copy the write input's std::span to do the obfuscation on it, and batching enables doing the operations on the internal buffer directly.

  ### Measurements (micro benchmarks, full IBDs and reindexes)

  Microbenchmarks for `[Read|Write]BlockBench` show a ~**30%**/**168%** speedup with `macOS/Clang`, and ~**19%**/**24%** with `Linux/GCC` (the follow-up XOR batching improves these further):

  <details>
  <summary>macOS Sequoia - Clang 19.1.7</summary>

  > Before:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |        2,271,441.67 |              440.25 |    0.1% |     11.00 | `ReadBlockBench`
  |        5,149,564.31 |              194.19 |    0.8% |     10.95 | `WriteBlockBench`

  > After:

  |               ns/op |                op/s |    err% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------:|:----------
  |        1,738,683.04 |              575.15 |    0.2% |     11.04 | `ReadBlockBench`
  |        3,052,658.88 |              327.58 |    1.0% |     10.91 | `WriteBlockBench`

  </details>

  <details>
  <summary>Ubuntu 24 - GNU 13.3.0</summary>

  > Before:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |        6,895,987.11 |              145.01 |    0.0% |   71,055,269.86 |   23,977,374.37 |  2.963 |   5,074,828.78 |    0.4% |     22.00 | `ReadBlockBench`
  |        5,152,973.58 |              194.06 |    2.2% |   19,350,886.41 |    8,784,539.75 |  2.203 |   3,079,335.21 |    0.4% |     23.18 | `WriteBlockBench`

  > After:

  |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |        5,771,882.71 |              173.25 |    0.0% |   65,741,889.82 |   20,453,232.33 |  3.214 |   3,971,321.75 |    0.3% |     22.01 | `ReadBlockBench`
  |        4,145,681.13 |              241.21 |    4.0% |   15,337,596.85 |    5,732,186.47 |  2.676 |   2,239,662.64 |    0.1% |     23.94 | `WriteBlockBench`

  </details>

  2 full IBD runs against master (compiled with GCC where the gains seem more modest) for **888888** blocks (seeded from real nodes) indicates a ~**7%** total speedup.

  <details>
  <summary>Details</summary>

  ```bash
  COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
  STOP_HEIGHT=888888; DBCACHE=1000; \
  C_COMPILER=gcc; CXX_COMPILER=g++; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  (for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
  hyperfine \
    --sort 'command' \
    --runs 2 \
    --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
      cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
      cmake --build build -j$(nproc) --target bitcoind && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
    --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
    "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
  d2b72b1369 refactor: rename leftover WriteBlockBench
  652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`
  Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
    Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
    Range (min … max):   41277.786 s … 41778.421 s    2 runs

  Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
    Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
    Range (min … max):   38458.957 s … 39083.957 s    2 runs

  Relative speed comparison
          1.07 ±  0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
          1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
  ```

  </details>

ACKs for top commit:
  maflcko:
    re-ACK 8d801e3efb 🐦
  achow101:
    ACK 8d801e3efb
  ryanofsky:
    Code review ACK 8d801e3efb. Most notable change is switching from BufferedReader to ReadRawBlock for block reads, which makes sense, and there are also various cleanups in blockstorage and test code.
  hodlinator:
    re-ACK 8d801e3efb

Tree-SHA512: 24e1dee653b927b760c0ba3c69d1aba15fa5d9c4536ad11cfc2d70196ae16b9228ecc3056eef70923364257d72dc929882e73e69c6c426e28139d31299d08adc
2025-04-16 15:16:22 -07:00
..
2025-03-13 11:13:13 +00:00
2024-06-18 18:47:51 +02:00
2024-09-19 07:33:02 -04:00
2024-06-13 11:20:49 +01:00

src/node/

The src/node/ directory contains code that needs to access node state (state in CChain, CBlockIndex, CCoinsView, CTxMemPool, and similar classes).

Code in src/node/ is meant to be segregated from code in src/wallet/ and src/qt/, to ensure wallet and GUI code changes don't interfere with node operation, to allow wallet and GUI code to run in separate processes, and to perhaps eventually allow wallet and GUI code to be maintained in separate source repositories.

As a rule of thumb, code in one of the src/node/, src/wallet/, or src/qt/ directories should avoid calling code in the other directories directly, and only invoke it indirectly through the more limited src/interfaces/ classes.

This directory is at the moment sparsely populated. Eventually more substantial files like src/validation.cpp and src/txmempool.cpp might be moved there.