Andrew Chow 5602cc7ccf
Merge bitcoin/bitcoin#16981: Improve runtime performance of --reindex
db929893ef0bc86ea2708cdbcf41152240cd7c73 Faster -reindex by initially deserializing only headers (Larry Ruane)
c72de9990ae8f1744006d9c852023b882d5ed80c util: add CBufferedFile::SkipTo() to move ahead in the stream (Larry Ruane)
48a68908ba3d5e077cda7bd1e908b923fbead824 Add LoadExternalBlockFile() benchmark (Larry Ruane)

Pull request description:

  ### Background
  During the first part of reindexing, `LoadExternalBlockFile()` sequentially reads raw blocks from the `blocks/blk00nnn.dat` files (rather than receiving them from peers, as with initial block download) and eventually adds all of them to the block index. When an individual block is initially read, it can't be immediately added unless all its ancestors have been added, which is rare (only about 8% of the time), because the blocks are not sorted by height. When the block can't be immediately added to the block index, its disk location is saved in a map so it can be added later. When its parent is later added to the block index, `LoadExternalBlockFile()` reads and deserializes the block from disk a second time and adds it to the block index. Most blocks (92%) get deserialized twice.

  ### This PR
  During the initial read, it's rarely useful to deserialize the entire block; only the header is needed to determine if the block can be added to the block index immediately. This change to `LoadExternalBlockFile()` initially deserializes only a block's header, then deserializes the entire block only if it can be added immediately. This reduces reindex time on mainnet by 7 hours on a Raspberry Pi, which translates to around a 25% reduction in the first part of reindexing (adding blocks to the index), and about a 6% reduction in overall reindex time.

  Summary: The performance gain is the result of deserializing each block only once, except its header which is deserialized twice, but the header is only 80 bytes.

ACKs for top commit:
  andrewtoth:
    ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
  achow101:
    ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
  aureleoules:
    ACK db929893ef0bc86ea2708cdbcf41152240cd7c73 - minor changes and new benchmark since last review
  theStack:
    re-ACK db929893ef0bc86ea2708cdbcf41152240cd7c73
  stickies-v:
    re-ACK db929893e

Tree-SHA512: 5a5377192c11edb5b662e18f511c9beb8f250bc88aeadf2f404c92c3232a7617bade50477ebf16c0602b9bd3b68306d3ee7615de58acfd8cae664d28bb7b0136
2022-11-15 19:23:39 -05:00
2022-10-15 18:43:07 +02:00
2022-11-01 20:50:51 +01:00
2021-09-07 06:12:53 +03:00
2022-01-03 04:48:41 +08:00
2021-09-09 19:53:12 +05:30
2022-07-30 09:05:07 +01:00
2022-08-23 16:57:46 -04:00

Bitcoin Core integration/staging tree

https://bitcoincore.org

For an immediately usable, binary version of the Bitcoin Core software, see https://bitcoincore.org/en/download/.

What is Bitcoin Core?

Bitcoin Core connects to the Bitcoin peer-to-peer network to download and fully validate blocks and transactions. It also includes a wallet and graphical user interface, which can be optionally built.

Further information about Bitcoin Core is available in the doc folder.

License

Bitcoin Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Development Process

The master branch is regularly built (see doc/build-*.md for instructions) and tested, but it is not guaranteed to be completely stable. Tags are created regularly from release branches to indicate new official, stable release versions of Bitcoin Core.

The https://github.com/bitcoin-core/gui repository is used exclusively for the development of the GUI. Its master branch is identical in all monotree repositories. Release branches and tags do not exist, so please do not fork that repository unless it is for development reasons.

The contribution workflow is described in CONTRIBUTING.md and useful hints for developers can be found in doc/developer-notes.md.

Testing

Testing and code review is the bottleneck for development; we get more pull requests than we can review and test on short notice. Please be patient and help out by testing other people's pull requests, and remember this is a security-critical project where any mistake might cost people lots of money.

Automated Testing

Developers are strongly encouraged to write unit tests for new code, and to submit new unit tests for old code. Unit tests can be compiled and run (assuming they weren't disabled in configure) with: make check. Further details on running and extending unit tests can be found in /src/test/README.md.

There are also regression and integration tests, written in Python. These tests can be run (if the test dependencies are installed) with: test/functional/test_runner.py

The CI (Continuous Integration) systems make sure that every pull request is built for Windows, Linux, and macOS, and that unit/sanity tests are run automatically.

Manual Quality Assurance (QA) Testing

Changes should be tested by somebody other than the developer who wrote the code. This is especially important for large or high-risk changes. It is useful to add a test plan to the pull request description if testing the changes is not straightforward.

Translations

Changes to translations as well as new translations can be submitted to Bitcoin Core's Transifex page.

Translations are periodically pulled from Transifex and merged into the git repository. See the translation process for details on how this works.

Important: We do not accept translation changes as GitHub pull requests because the next pull from Transifex would automatically overwrite them again.

Description
Bitcoin Core integration/staging tree
Readme 2.3 GiB
Languages
C++ 64.3%
Python 19.7%
C 12.1%
CMake 1.3%
Shell 0.9%
Other 1.6%