bcb0cacac2reindex, log, test: fixes #21379 (mruddy) Pull request description: Fixes #21379. The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process. The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart. These additions to the blk files are non-fatal, but also not desirable. That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them. The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted). This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed: 1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`). 1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit: 1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block. 1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown. ACKs for top commit: LarryRuane: tested code-review ACKbcb0cacac2ryanofsky: Code review ACKbcb0cacac2. This is a disturbing bug with an easy fix which seems well-worth merging. mzumsande: ACKbcb0cacac2(reviewed code and did some testing, I agree that it fixes the bug). w0xlt: tACKbcb0cacac2Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
Bitcoin Core integration/staging tree
For an immediately usable, binary version of the Bitcoin Core software, see https://bitcoincore.org/en/download/.
What is Bitcoin Core?
Bitcoin Core connects to the Bitcoin peer-to-peer network to download and fully validate blocks and transactions. It also includes a wallet and graphical user interface, which can be optionally built.
Further information about Bitcoin Core is available in the doc folder.
License
Bitcoin Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.
Development Process
The master branch is regularly built (see doc/build-*.md for instructions) and tested, but it is not guaranteed to be
completely stable. Tags are created
regularly from release branches to indicate new official, stable release versions of Bitcoin Core.
The https://github.com/bitcoin-core/gui repository is used exclusively for the development of the GUI. Its master branch is identical in all monotree repositories. Release branches and tags do not exist, so please do not fork that repository unless it is for development reasons.
The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.
Testing
Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.
Automated Testing
Developers are strongly encouraged to write unit tests for new code, and to
submit new unit tests for old code. Unit tests can be compiled and run
(assuming they weren't disabled in configure) with: make check. Further details on running
and extending unit tests can be found in /src/test/README.md.
There are also regression and integration tests, written
in Python.
These tests can be run (if the test dependencies are installed) with: test/functional/test_runner.py
The CI (Continuous Integration) systems make sure that every pull request is built for Windows, Linux, and macOS, and that unit/sanity tests are run automatically.
Manual Quality Assurance (QA) Testing
Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.
Translations
Changes to translations as well as new translations can be submitted to Bitcoin Core's Transifex page.
Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.
Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.