Merge bitcoin/bitcoin#29975: blockstorage: Separate reindexing from saving new blocks

e41667b720 blockstorage: Don't move cursor backwards in UpdateBlockInfo (Ryan Ofsky)
17103637c6 blockstorage: Rename FindBlockPos and have it return a FlatFilePos (Martin Zumsande)
d9e477c4dc validation, blockstorage: Separate code paths for reindex and saving new blocks (Martin Zumsande)
064859bbad blockstorage: split up FindBlockPos function (Martin Zumsande)
fdae638e83 doc: Improve doc for functions involved in saving blocks to disk (Martin Zumsande)
0d114e3cb2 blockstorage: Add Assume for fKnown / snapshot chainstate (Martin Zumsande)

Pull request description:

  `SaveBlockToDisk` / `FindBlockPos` are used for two purposes, depending on whether they are called during reindexing (`dbp` set,  `fKnown = true`) or in the "normal" case when adding new blocks (`dbp == nullptr`,  `fKnown = false`).
  The actual tasks are quite different
  - In normal mode, preparations for saving a new block are made, which is then saved: find the correct position on disk (maybe skipping to a new blk file), check for available disk space, update the blockfile info db, save the block.
  - during reindex, most of this is not necessary (the block is already on disk after all), only the blockfile info needs to rebuilt because reindex wiped the leveldb it's saved in.

  Using one function with many conditional statements for this leads to code that is hard to read / understand and bug-prone:
  - many code paths in `FindBlockPos` are conditional on `fKnown` or `!fKnown`
  - It's not really clear what actually needs to be done during reindex (we don't need to "save a block to disk" or "find a block pos" as the function names suggest)
  - logic that should be applied to only one of the two modes is sometimes applied to both (see first commit, or #27039)

  #24858 and #27039 were recent bugs directly related to the differences between reindexing and normal mode, and in both cases the simple fix took a long time to be reviewed and merged.

  This PR proposes to clean this code up by splitting out the reindex logic into a separate function (`UpdateBlockInfo`) which will be called directly from validation. As a result, `SaveBlockToDisk` and `FindBlockPos` only need to cover the non-reindex logic.

ACKs for top commit:
  paplorinc:
    ACK e41667b720
  TheCharlatan:
    Re-ACK e41667b720
  ryanofsky:
    Code review ACK e41667b720. Just improvements to comments since last review.

Tree-SHA512: a14ff9a0facf6b1e3c1cd724a2d19a79a25d4b48de64398fdd172671532a472bc10a20cbb64ac3a3e55814dcc877d0597a3e1699cabc4f9d9a86b439b6eaba20
This commit is contained in:
Ryan Ofsky
2024-05-16 11:15:56 -04:00
5 changed files with 144 additions and 106 deletions

View File

@@ -35,20 +35,20 @@ BOOST_AUTO_TEST_CASE(blockmanager_find_block_pos)
};
BlockManager blockman{*Assert(m_node.shutdown), blockman_opts};
// simulate adding a genesis block normally
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, nullptr).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
// simulate what happens during reindex
// simulate a well-formed genesis block being found at offset 8 in the blk00000.dat file
// the block is found at offset 8 because there is an 8 byte serialization header
// consisting of 4 magic bytes + 4 length bytes before each block in a well-formed blk file.
FlatFilePos pos{0, BLOCK_SERIALIZATION_HEADER_SIZE};
BOOST_CHECK_EQUAL(blockman.SaveBlockToDisk(params->GenesisBlock(), 0, &pos).nPos, BLOCK_SERIALIZATION_HEADER_SIZE);
const FlatFilePos pos{0, BLOCK_SERIALIZATION_HEADER_SIZE};
blockman.UpdateBlockInfo(params->GenesisBlock(), 0, pos);
// now simulate what happens after reindex for the first new block processed
// the actual block contents don't matter, just that it's a block.
// verify that the write position is at offset 0x12d.
// this is a check to make sure that https://github.com/bitcoin/bitcoin/issues/21379 does not recur
// 8 bytes (for serialization header) + 285 (for serialized genesis block) = 293
// add another 8 bytes for the second block's serialization header and we get 293 + 8 = 301
FlatFilePos actual{blockman.SaveBlockToDisk(params->GenesisBlock(), 1, nullptr)};
FlatFilePos actual{blockman.SaveBlockToDisk(params->GenesisBlock(), 1)};
BOOST_CHECK_EQUAL(actual.nPos, BLOCK_SERIALIZATION_HEADER_SIZE + ::GetSerializeSize(TX_WITH_WITNESS(params->GenesisBlock())) + BLOCK_SERIALIZATION_HEADER_SIZE);
}
@@ -156,12 +156,11 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file)
// Blockstore is empty
BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), 0);
// Write the first block; dbp=nullptr means this block doesn't already have a disk
// location, so allocate a free location and write it there.
FlatFilePos pos1{blockman.SaveBlockToDisk(block1, /*nHeight=*/1, /*dbp=*/nullptr)};
// Write the first block to a new location.
FlatFilePos pos1{blockman.SaveBlockToDisk(block1, /*nHeight=*/1)};
// Write second block
FlatFilePos pos2{blockman.SaveBlockToDisk(block2, /*nHeight=*/2, /*dbp=*/nullptr)};
FlatFilePos pos2{blockman.SaveBlockToDisk(block2, /*nHeight=*/2)};
// Two blocks in the file
BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), (TEST_BLOCK_SIZE + BLOCK_SERIALIZATION_HEADER_SIZE) * 2);
@@ -181,22 +180,19 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file)
BOOST_CHECK_EQUAL(read_block.nVersion, 2);
}
// When FlatFilePos* dbp is given, SaveBlockToDisk() will not write or
// overwrite anything to the flat file block storage. It will, however,
// update the blockfile metadata. This is to facilitate reindexing
// when the user has the blocks on disk but the metadata is being rebuilt.
// During reindex, the flat file block storage will not be written to.
// UpdateBlockInfo will, however, update the blockfile metadata.
// Verify this behavior by attempting (and failing) to write block 3 data
// to block 2 location.
CBlockFileInfo* block_data = blockman.GetBlockFileInfo(0);
BOOST_CHECK_EQUAL(block_data->nBlocks, 2);
BOOST_CHECK(blockman.SaveBlockToDisk(block3, /*nHeight=*/3, /*dbp=*/&pos2) == pos2);
blockman.UpdateBlockInfo(block3, /*nHeight=*/3, /*pos=*/pos2);
// Metadata is updated...
BOOST_CHECK_EQUAL(block_data->nBlocks, 3);
// ...but there are still only two blocks in the file
BOOST_CHECK_EQUAL(blockman.CalculateCurrentUsage(), (TEST_BLOCK_SIZE + BLOCK_SERIALIZATION_HEADER_SIZE) * 2);
// Block 2 was not overwritten:
// SaveBlockToDisk() did not call WriteBlockToDisk() because `FlatFilePos* dbp` was non-null
blockman.ReadBlockFromDisk(read_block, pos2);
BOOST_CHECK_EQUAL(read_block.nVersion, 2);
}