bitcoin

mirror of https://github.com/bitcoin/bitcoin.git synced 2026-01-20 07:09:15 +01:00

Author	SHA1	Message	Date
merge-script	a005fdff6c	Merge bitcoin/bitcoin#34074 : A few followups after introducing `/rest/blockpart/` endpoint `59b93f11e8` rest: print also HTTP response reason in case of an error (Roman Zeyde) `7fe94a0493` rest: add a test for unsupported `/blockpart/` request type (Roman Zeyde) `55d0d19b5c` rest: deduplicate `interface_rest.py` negative tests (Roman Zeyde) `89eb531024` rest: update release notes for `/blockpart/` endpoint (Roman Zeyde) `41118e17f8` blockstorage: simplify partial block read validation (Roman Zeyde) `599effdeab` rest: reformat `uri_prefixes` initializer list (Roman Zeyde) Pull request description: The commits below should resolve a few leftovers from #33657. ACKs for top commit: l0rinc: ACK `59b93f11e8` hodlinator: re-ACK `59b93f11e8` Tree-SHA512: ae45e08edd315018e11283b354fb32f9658f5829c956554dc662a81c2e16397def7c3700e6354e0a91ff03c850def35638a69ec2668b7c015d25d6fed42b92bb	2025-12-17 15:09:15 +00:00
merge-script	4f11ef058b	Merge bitcoin/bitcoin#30214 : refactor: Improve assumeutxo state representation `82be652e40` doc: Improve ChainstateManager documentation, use consistent terms (Ryan Ofsky) `af455dcb39` refactor: Simplify pruning functions (TheCharlatan) `ae85c495f1` refactor: Delete ChainstateManager::GetAll() method (Ryan Ofsky) `6a572dbda9` refactor: Add ChainstateManager::ActivateBestChains() method (Ryan Ofsky) `491d827d52` refactor: Add ChainstateManager::m_chainstates member (Ryan Ofsky) `e514fe6116` refactor: Delete ChainstateManager::SnapshotBlockhash() method (Ryan Ofsky) `ee35250683` refactor: Delete ChainstateManager::IsSnapshotValidated() method (Ryan Ofsky) `d9e82299fc` refactor: Delete ChainstateManager::IsSnapshotActive() method (Ryan Ofsky) `4dfe383912` refactor: Convert ChainstateRole enum to struct (Ryan Ofsky) `352ad27fc1` refactor: Add ChainstateManager::ValidatedChainstate() method (Ryan Ofsky) `a229cb9477` refactor: Add ChainstateManager::CurrentChainstate() method (Ryan Ofsky) `a9b7f5614c` refactor: Add Chainstate::StoragePath() method (Ryan Ofsky) `840bd2ef23` refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation (Ryan Ofsky) `1598a15aed` refactor: Deduplicate Chainstate activation code (Ryan Ofsky) `9fe927b6d6` refactor: Add Chainstate m_assumeutxo and m_target_utxohash members (Ryan Ofsky) `6082c84713` refactor: Add Chainstate::m_target_blockhash member (Ryan Ofsky) `de00e87548` test: Fix broken chainstatemanager_snapshot_init check (Ryan Ofsky) Pull request description: This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking `cs_main` for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart. The changes in this PR are just refactoring. 
They make `Chainstate` objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying `ChainstateManager`, and to determine whether a Chainstate is validated without basing it on inferences like `&cs != &ActiveChainstate()` or `GetAll().size() == 1`. The PR also tries to make assumeutxo terminology less confusing, using "current chainstate" to refer to the chainstate targeting the current network tip, and "historical chainstate" to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms "active chainstate," "usable chainstate," "disabled chainstate," "ibd chainstate," and "snapshot chainstate" which are confusing for various reasons. ACKs for top commit: maflcko: re-review ACK `82be652e40` 🕍 fjahr: re-ACK `82be652e40` sedited: Re-ACK `82be652e40` Tree-SHA512: 81c67abba9fc5bb170e32b7bf8a1e4f7b5592315b4ef720be916d5f1f5a7088c0c59cfb697744dd385552f58aa31ee36176bae6a6e465723e65861089a1252e5	2025-12-16 14:03:34 +00:00
Roman Zeyde	41118e17f8	blockstorage: simplify partial block read validation Use `SaturatingAdd` following https://github.com/bitcoin/bitcoin/pull/33657#discussion_r2610832092.	2025-12-14 10:44:12 +01:00
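The range check that this commit simplifies can be sketched as follows. This is only an illustration of the saturating-add idea in Python, not the C++ code from the commit: the function names, the 64-bit width, and the decision to reject empty ranges are assumptions made for the sketch.

```python
UINT64_MAX = 2**64 - 1  # assumed width; the integer type in the real code may differ


def saturating_add(a: int, b: int, max_value: int = UINT64_MAX) -> int:
    """Add two non-negative integers, clamping at max_value instead of wrapping."""
    return min(a + b, max_value)


def is_valid_part_range(block_size: int, offset: int, size: int) -> bool:
    """Return True if [offset, offset + size) is a non-empty range inside the block.

    Rejecting size == 0 is a choice made for this sketch.
    """
    if size == 0:
        return False
    # Because the sum saturates, a huge offset cannot wrap around and
    # appear to fit inside the block.
    return saturating_add(offset, size) <= block_size
```

With a wrapping addition, `offset = UINT64_MAX` and `size = 2` would sum to `1` and pass a naive `offset + size <= block_size` check; the clamped sum fails it, so a single comparison is enough.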
merge-script	938d7aacab	Merge bitcoin/bitcoin#33657 : rest: allow reading partial block data from storage `07135290c1` rest: allow reading partial block data from storage (Roman Zeyde) `4e2af1c065` blockstorage: allow reading partial block data from storage (Roman Zeyde) `f2fd1aa21c` blockstorage: return an error code from `ReadRawBlock()` (Roman Zeyde) Pull request description: It allows fetching specific transactions using an external index, following https://github.com/bitcoin/bitcoin/pull/32541#issuecomment-3267485313. Currently, electrs and other indexers map between an address/scripthash to the list of the relevant transactions. However, in order to fetch those transactions from bitcoind, electrs relies on reading the whole block and post-filtering for a specific transaction[^1]. Other indexers use a `txindex` to fetch a transaction using its txid [^2][^3][^4]. The above approach has significant storage and CPU overhead, since the `txid` is a pseudo-random 32-byte value. Also, mainnet `txindex` takes ~60GB today. This PR is adding support for using the transaction's position within its block to be able to fetch it directly using [REST API](https://github.com/bitcoin/bitcoin/blob/master/doc/REST-interface.md), using the following HTTP request: ``` GET /rest/blockpart/BLOCKHASH.bin?offset=OFFSET&size=SIZE ``` - The offsets' index can be encoded much more efficiently ([~1.3GB today](https://github.com/romanz/bindex-rs/pull/66#issuecomment-3508476436)). - Address history query performance can be tested on mainnet using [1BitcoinEaterAddressDontSendf59kuE](https://mempool.space/address/1BitcoinEaterAddressDontSendf59kuE) - assuming warm OS block cache, [it takes <1s to fetch 5200 txs, i.e. <0.2ms per tx](https://github.com/romanz/bindex-rs/pull/66#issuecomment-3508476436) with [bindex](https://github.com/romanz/bindex-rs). - Only binary and hex response formats are supported. 
[^1]: https://github.com/romanz/electrs/blob/master/doc/schema.md [^2]: https://github.com/Blockstream/electrs/blob/new-index/doc/schema.md#txstore [^3]: https://github.com/spesmilo/electrumx/blob/master/docs/HOWTO.rst#prerequisites [^4]: https://github.com/cculianu/Fulcrum/blob/master/README.md#requirements ACKs for top commit: maflcko: review ACK `07135290c1` 🏪 l0rinc: ACK `07135290c1` hodlinator: re-ACK `07135290c1` Tree-SHA512: bcce7bf4b9a3e5e920ab5a83e656f50d5d7840cdde6b7147d329cf578f8a2db555fc1aa5334e8ee64d5630d25839ece77a2cf421c6c3ac1fa379bb453163bd4f	2025-12-12 13:22:00 +00:00
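A client-side sketch of the request described above, using only the Python standard library. The helper names and the base URL (a local node with the REST interface enabled) are assumptions for illustration; only the path layout `/rest/blockpart/BLOCKHASH.bin?offset=OFFSET&size=SIZE` and the binary/hex formats come from the PR description.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def blockpart_url(base_url: str, block_hash: str, offset: int, size: int,
                  fmt: str = "bin") -> str:
    """Build a /rest/blockpart/ URL for a byte range inside a block."""
    if fmt not in ("bin", "hex"):
        raise ValueError("the PR describes only binary and hex response formats")
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    query = urlencode({"offset": offset, "size": size})
    return f"{base_url.rstrip('/')}/rest/blockpart/{block_hash}.{fmt}?{query}"


def fetch_blockpart(base_url: str, block_hash: str, offset: int, size: int) -> bytes:
    """Fetch the raw bytes of one block part (requires a reachable node)."""
    with urlopen(blockpart_url(base_url, block_hash, offset, size), timeout=10) as resp:
        return resp.read()
```

An external index that stores each transaction's offset and size within its block could pass those two numbers straight to `fetch_blockpart("http://127.0.0.1:8332", block_hash, offset, size)` rather than downloading and scanning the whole block.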
TheCharlatan	af455dcb39	refactor: Simplify pruning functions Move GetPruneRange from ChainstateManager to Chainstate.	2025-12-12 11:49:59 +01:00
Ryan Ofsky	ae85c495f1	refactor: Delete ChainstateManager::GetAll() method Just use m_chainstates array instead.	2025-12-12 06:49:59 -04:00
Ryan Ofsky	6a572dbda9	refactor: Add ChainstateManager::ActivateBestChains() method Deduplicate code looping over chainstate objects and calling ActivateBestChain() and avoid need for code outside ChainstateManager to use the GetAll() method.	2025-12-12 06:49:59 -04:00
Ryan Ofsky	4dfe383912	refactor: Convert ChainstateRole enum to struct Change ChainstateRole parameter passed to wallets and indexes. Wallets and indexes need to know whether chainstate is historical and whether it is fully validated. They should not be aware of the assumeutxo snapshot validation process.	2025-12-12 06:49:59 -04:00
Roman Zeyde	4e2af1c065	blockstorage: allow reading partial block data from storage It will allow fetching specific transactions using an external index, following https://github.com/bitcoin/bitcoin/pull/32541#issuecomment-3267485313. No logging takes place in case of an invalid offset/size (to avoid spamming the log), by using a new `ReadRawError::BadPartRange` error variant. Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> Co-authored-by: Lőrinc <pap.lorinc@gmail.com>	2025-12-11 18:54:55 +01:00
Roman Zeyde	f2fd1aa21c	blockstorage: return an error code from `ReadRawBlock()` It will enable different error handling flows for different error types. Also, `ReadRawBlockBench` performance has decreased due to no longer reusing a vector with an unchanging capacity - mirroring our production code behavior. Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> Co-authored-by: Lőrinc <pap.lorinc@gmail.com>	2025-12-11 18:54:55 +01:00
MarcoFalke	fa89f60e31	scripted-diff: LogPrintLevel(,BCLog::Level::,) -> LogError()/LogWarning() This is a minimal behavior change and changes log output from: [net:error] Something bad happened [net:warning] Something problematic happened to either [error] Something bad happened [warning] Something problematic happened or, when -loglevelalways=1 is enabled: [all:error] Something bad happened [all:warning] Something problematic happened Such a behavior change is desired, because all warning and error logs are written in the same style in the source code and they are logged in the same format for log consumers. -BEGIN VERIFY SCRIPT- sed --regexp-extended --in-place \ 's/LogPrintLevel\((BCLog::[^,]*), BCLog::Level::(Error|Warning), */Log\2(/g' \ $( git grep -l LogPrintLevel ':(exclude)src/test/logging_tests.cpp' ) -END VERIFY SCRIPT-	2025-12-09 10:44:33 +01:00
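The effect of the scripted substitution can be tried on a single line with Python's `re` module. The pattern below is a reading of the sed expression with the category group written as `[^,]*` and an unescaped `|`; treat that as an assumption, since the listing may not preserve the script character for character. The function name is hypothetical.

```python
import re

# Category argument is captured but dropped; the level name becomes the function suffix.
PATTERN = re.compile(r"LogPrintLevel\((BCLog::[^,]*), BCLog::Level::(Error|Warning), *")


def rewrite_log_call(line: str) -> str:
    """Rewrite LogPrintLevel(<category>, BCLog::Level::<Level>, ...) to Log<Level>(...)."""
    return PATTERN.sub(r"Log\2(", line)


before = 'LogPrintLevel(BCLog::NET, BCLog::Level::Error, "Something bad happened\\n");'
after = rewrite_log_call(before)
```

Calls that use other levels, such as `BCLog::Level::Debug`, do not match and are left unchanged.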
MarcoFalke	fa45a1503e	log: Use LogWarning for non-critical logs As per doc/developer-notes#logging, LogWarning should be used for severe problems that do not warrant shutting down the node	2025-11-27 14:33:59 +01:00
Andrew Toth	99d012ec80	refactor: return reference instead of pointer The return value of BlockManager::GetFirstBlock must always be non-null. This can be inferred from the implementation, which asserts that the return value is not null. A raw pointer should only be returned if the result may be null. In this case a reference is more appropriate.	2025-11-13 09:57:42 -05:00
merge-script	3789215f73	Merge bitcoin/bitcoin#33724 : refactor: Return uint64_t from GetSerializeSize `fa6c0bedd3` refactor: Return uint64_t from GetSerializeSize (MarcoFalke) `fad0c8680e` refactor: Use uint64_t over size_t for serialized-size values (MarcoFalke) `fa4f388fc9` refactor: Use fixed size ints over (un)signed ints for serialized values (MarcoFalke) `fa01f38e53` move-only: Move CBlockFileInfo to kernel namespace (MarcoFalke) `fa2bbc9e4c` refactor: [rpc] Remove cast when reporting serialized size (MarcoFalke) `fa364af89b` test: Remove outdated comment (MarcoFalke) Pull request description: Consensus code should arrive at the same conclusion, regardless of the architecture it runs on. Using architecture-specific types such as `size_t` can lead to issues, such as the low-severity [CVE-2025-46597](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-46597/). The CVE was already worked around, but it may be good to still fix the underlying issue. Fixes https://github.com/bitcoin/bitcoin/issues/33709 with a few refactors to use explicit fixed-sized integer types in serialization-size related code and concluding with a refactor to return `uint64_t` from `GetSerializeSize`. The refactors should not change any behavior, because the CVE was already worked around. ACKs for top commit: Crypt-iQ: crACK `fa6c0bedd3` l0rinc: ACK `fa6c0bedd3` laanwj: Code review ACK `fa6c0bedd3` Tree-SHA512: f45057bd86fb46011e4cb3edf0dc607057d72ed869fd6ad636562111ae80fea233b2fc45c34b02256331028359a9c3f4fa73e9b882b225bdc089d00becd0195e	2025-11-12 09:48:10 -05:00
Ava Chow	a4e96cae7d	Merge bitcoin/bitcoin#33042 : refactor: inline constant return values from `dbwrapper` write methods `743abbcbde` refactor: inline constant return value of `BlockTreeDB::WriteBatchSync` and `BlockManager::WriteBlockIndexDB` and `BlockTreeDB::WriteFlag` (Lőrinc) `e030240e90` refactor: inline constant return value of `CDBWrapper::Erase` and `BlockTreeDB::WriteReindexing` (Lőrinc) `cdab9480e9` refactor: inline constant return value of `CDBWrapper::Write` (Lőrinc) `d1847cf5b5` refactor: inline constant return value of `TxIndex::DB::WriteTxs` (Lőrinc) `50b63a5698` refactor: inline constant return value of `CDBWrapper::WriteBatch` (Lőrinc) Pull request description: Related to https://github.com/bitcoin/bitcoin/pull/31144#discussion_r2223587480 ### Summary `WriteBatch` always returns `true` - the errors are handled by throwing `dbwrapper_error` instead. ### Context This boolean return value of the `Write` methods is confusing because it's inconsistent with `CDBWrapper::Read`, which catches exceptions and returns a boolean to indicate success/failure. It's bad that `Read` returns and `Write` throws - but it's a lot worse that `Write` advertises a return value when it actually communicates errors through exceptions. ### Solution This PR removes the constant return values from write methods and inlines `true` at their call sites. Many upstream methods had boolean return values only because they were propagating these constants - those have been cleaned up as well. Methods that returned a constant `true` value that now return `void`: - `CDBWrapper::WriteBatch`, `CDBWrapper::Write`, `CDBWrapper::Erase` - `TxIndex::DB::WriteTxs` - `BlockTreeDB::WriteReindexing`, `BlockTreeDB::WriteBatchSync`, `BlockTreeDB::WriteFlag` - `BlockManager::WriteBlockIndexDB` ### Note `CCoinsView::BatchWrite` (and transitively `CCoinsViewCache::Flush` & `CCoinsViewCache::Sync`) were intentionally not changed here. 
While all implementations return `true`, the base `CCoinsView::BatchWrite` returns `false`. Changing this would cause `coins_view` tests to fail with: > terminating due to uncaught exception of type std::logic_error: Not all unspent flagged entries were cleared We can fix that in a follow-up PR. ACKs for top commit: achow101: ACK `743abbcbde` janb84: ACK `743abbcbde` TheCharlatan: ACK `743abbcbde` sipa: ACK `743abbcbde` Tree-SHA512: b2a550bff066216f1958d2dd9a7ef6a9949de518cc636f8ab9c670e0b7a330c1eb8c838e458a8629acb8ac980cea6616955cd84436a7b8ab9096f6d648073b1e	2025-11-10 09:15:24 -08:00
MarcoFalke	fa01f38e53	move-only: Move CBlockFileInfo to kernel namespace Also, move it to the blockstorage module, because it is only used inside that module. Can be reviewed with the git option --color-moved=dimmed-zebra	2025-10-28 16:08:44 +01:00
Lőrinc	743abbcbde	refactor: inline constant return value of `BlockTreeDB::WriteBatchSync` and `BlockManager::WriteBlockIndexDB` and `BlockTreeDB::WriteFlag`	2025-08-13 15:47:48 -07:00
Lőrinc	e030240e90	refactor: inline constant return value of `CDBWrapper::Erase` and `BlockTreeDB::WriteReindexing` Did both in this commit, since the return value of `WriteReindexing` was ignored anyway - which existed only because of the constant `Erase` being called	2025-08-13 15:47:48 -07:00
Lőrinc	cdab9480e9	refactor: inline constant return value of `CDBWrapper::Write`	2025-08-13 15:47:48 -07:00
Lőrinc	50b63a5698	refactor: inline constant return value of `CDBWrapper::WriteBatch` `WriteBatch` can only ever return `true` - its errors are reported by throwing `dbwrapper_error` instead. The boolean return value is quite confusing, especially since it looks symmetric with `CDBWrapper::Read`, which catches the exceptions and returns a boolean instead. We're removing the constant return value and inlining `true` for its usages.	2025-08-13 15:47:39 -07:00
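The asymmetry the commit message describes, a read that swallows errors and reports failure through its return value versus a write that reports failure only by throwing, can be illustrated with a toy key-value store. This is a Python analogy with hypothetical names, not a model of `CDBWrapper` itself.

```python
from typing import Dict, Optional


class DbWrapperError(Exception):
    """Stand-in for the exception type thrown on a failed write."""


class ToyDb:
    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}
        self.fail_writes = False  # test hook to simulate a storage failure

    def write(self, key: bytes, value: bytes) -> None:
        """Returns nothing: the only failure signal is the exception."""
        if self.fail_writes:
            raise DbWrapperError("write failed")
        self._data[key] = value

    def read(self, key: bytes) -> Optional[bytes]:
        """Catches lookup errors and reports failure through the return value."""
        try:
            return self._data[key]
        except KeyError:
            return None
```

A `write` that also returned `True` would invite callers to add an `if not db.write(...)` branch that can never run, which is the confusion the refactor removes.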
Sergi Delgado Segura	18524b072e	Make nSequenceId init value constants Make it easier to follow what the values mean without having to go over the comments, plus easier to maintain	2025-07-28 10:15:17 -04:00
Sergi Delgado Segura	8b91883a23	Set the same best tip on restart if two candidates have the same work Before this, if we had two (or more) same work tip candidates and restarted our node, it could be the case that the block set as tip after bootstrap didn't match the one before stopping. That's because the work and `nSequenceId` of both blocks will be the same (the latter is only kept in memory), so the active chain after restart would have depended on which tip candidate was loaded first. This makes sure that we are consistent over reboots.	2025-07-28 10:15:14 -04:00
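A rough sketch of why a distinguishing sequence value makes tip selection independent of load order. The constant names and values, the `Candidate` type, and the assumption that the previous tip can be identified at load time are all hypothetical; the commit message states only the goal (the same tip before and after a restart), not the mechanism.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical values: the previously active tip gets a lower (preferred) id
# than every other block loaded from disk.
SEQ_ID_PREVIOUS_BEST = 0
SEQ_ID_LOADED = 1


@dataclass(frozen=True)
class Candidate:
    name: str
    work: int
    sequence_id: int


def best_tip(candidates: Iterable[Candidate]) -> Candidate:
    """Prefer more work; on equal work prefer the lower sequence id."""
    return max(candidates, key=lambda c: (c.work, -c.sequence_id))


a = Candidate("a", work=10, sequence_id=SEQ_ID_LOADED)
b = Candidate("b", work=10, sequence_id=SEQ_ID_PREVIOUS_BEST)
```

If both candidates carried the same sequence id, `max` would return whichever one it saw first, which is the load-order dependence the commit removes.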
Sergi Delgado Segura	ab145cb3b4	Updates CBlockIndexWorkComparator outdated comment	2025-07-28 10:11:34 -04:00
MarcoFalke	face8123fd	log: [refactor] Use info level for init logs This refactor does not change behavior.	2025-07-25 09:50:50 +02:00
MarcoFalke	fa183761cb	log: Remove function name from init logs It is redundant with -logsourcelocations and the log messages are clearer without it. Also, remove a double-space. Also, add braces around `if` touched in the next commit. This tiny behavior change requires a test fixup.	2025-07-25 09:50:24 +02:00
Lőrinc	478d40afc6	refactor: encapsulate `vector`/`array` keys into `Obfuscation`	2025-07-16 14:33:07 -07:00
Lőrinc	0b8bec8aa6	scripted-diff: unify xor-vs-obfuscation nomenclature Mechanical refactor of the low-level "xor" wording to signal the intent instead of the implementation used. The renames are ordered by heaviest-hitting substitutions first, and were constructed such that after each replacement the code is still compilable. -BEGIN VERIFY SCRIPT- sed -i \ -e 's/\bGetObfuscateKey\b/GetObfuscation/g' \ -e 's/\bxor_key\b/obfuscation/g' \ -e 's/\bxor_pat\b/obfuscation/g' \ -e 's/\bm_xor_key\b/m_obfuscation/g' \ -e 's/\bm_xor\b/m_obfuscation/g' \ -e 's/\bobfuscate_key\b/m_obfuscation/g' \ -e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \ -e 's/\bSetXor(/SetObfuscation(/g' \ -e 's/\bdata_xor\b/obfuscation/g' \ -e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \ -e 's/\bobfuscate key\b/obfuscation key/g' \ $(git ls-files '*.cpp' '*.h') -END VERIFY SCRIPT-	2025-07-16 14:32:01 -07:00
Lőrinc	54ab0bd64c	refactor: commit to 8 byte obfuscation keys Since 31 byte xor-keys are not used in the codebase, using the common size (8 bytes) makes the benchmarks more realistic. Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-07-16 13:19:18 -07:00
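For readers unfamiliar with the obfuscation these commits refer to, a minimal sketch of XOR-ing data against a repeating 8-byte key. The function name and the `key_offset` parameter are assumptions for this illustration; the real `Obfuscation` class is C++ and its interface is not shown in the log.

```python
KEY_SIZE = 8  # the key length the commit above settles on


def obfuscate(data: bytes, key: bytes, key_offset: int = 0) -> bytes:
    """XOR data with a repeating key, starting at key_offset within the key cycle.

    XOR is its own inverse, so the same call also de-obfuscates.
    """
    if len(key) != KEY_SIZE:
        raise ValueError(f"key must be {KEY_SIZE} bytes")
    return bytes(b ^ key[(key_offset + i) % KEY_SIZE] for i, b in enumerate(data))


key = bytes(range(1, 9))
data = b"hello, block data"
round_trip = obfuscate(obfuscate(data, key), key)
```

Passing the position as `key_offset` lets a stream be processed in chunks: `obfuscate(data[:5], key, 0) + obfuscate(data[5:], key, 5)` gives the same bytes as obfuscating `data` in one call.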
Ava Chow	ea4285775e	Merge bitcoin/bitcoin#29307 : util: explicitly close all AutoFiles that have been written `c10e382d2a` flatfile: check whether the file has been closed successfully (Vasil Dimov) `4bb5dd78ea` util: check that a file has been closed before ~AutoFile() is called (Vasil Dimov) `8bb34f07df` Explicitly close all AutoFiles that have been written (Vasil Dimov) `a69c4098b2` rpc: take ownership of the file by WriteUTXOSnapshot() (Hodlinator) Pull request description: `fclose(3)` may fail to flush the previously written data to disk, thus a failing `fclose(3)` is as serious as a failing `fwrite(3)`. Previously the code ignored `fclose(3)` failures. This PR improves that by changing all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error. --- Other alternatives are: 1. `fflush(3)` after each write to the file (and throw if it fails from the `AutoFile::write()` method) and hope that `fclose(3)` will then always succeed. Assert that it succeeds from the destructor 🙄. Will hurt performance. 2. Throw nevertheless from the destructor. Exception within the exception in C++ I think results in terminating the program without a useful message. 3. (this is implemented in the latest incarnation of this PR) Redesign `AutoFile` so that its destructor cannot fail. Adjust _all_ its users 😭. For example, if the file has been written to, then require the callers to explicitly call the `AutoFile::fclose()` method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for `FILE*` which automatically closes the file when it goes out of scope and there are a lot of users of `AutoFile`. 4. Pass a new callback function to the `AutoFile` constructor which will be called from the destructor to handle `fclose()` errors, as described in https://github.com/bitcoin/bitcoin/pull/29307#issuecomment-2243842400. 
My thinking is that if that callback is going to only log a message, then we can log the message directly from the destructor without needing a callback. If the callback is going to do more complicated error handling then it is easier to do that at the call site by directly calling `AutoFile::fclose()` instead of getting the `AutoFile` object out of scope (so that its destructor is called) and inspecting for side effects done by the callback (e.g. set a variable to indicate a failed `fclose()`). ACKs for top commit: l0rinc: ACK `c10e382d2a` achow101: ACK `c10e382d2a` hodlinator: re-ACK `c10e382d2a` Tree-SHA512: 3994ca57e5b2b649fc84f24dad144173b7500fc0e914e06291d5c32fbbf8d2b1f8eae0040abd7a5f16095ddf4e11fe1636c6092f49058cda34f3eb2ee536d7ba	2025-07-03 15:37:44 -07:00
Vasil Dimov	8bb34f07df	Explicitly close all AutoFiles that have been written There is no way to report a close error from `AutoFile` destructor. Such an error could be serious if the file has been written to because it may mean the file is now corrupted (same as if write fails). So, change all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error.	2025-06-16 15:33:15 +02:00
Roman Zeyde	6ecb9fc65f	chore: use `std::vector<std::byte>` for `BlockManager::ReadRawBlock()`	2025-06-13 19:19:44 +03:00
Lőrinc	09ee8b7f27	node: avoid recomputing block hash in `ReadBlock` Eliminate one SHA‑256 double‑hash computation of the header per block read by reusing the hash for: * proof‑of‑work verification; * (optional) integrity check against the supplied hash.	2025-05-26 23:23:44 +02:00
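The saving described above amounts to computing the header hash once and using that one value for both checks. A sketch in Python with hypothetical names; the target comparison treats the hash as a little-endian 256-bit integer, and the real C++ code paths are not reproduced here.

```python
import hashlib
from typing import Optional


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for block header hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def check_header(header: bytes, target: int,
                 expected_hash: Optional[bytes] = None) -> bool:
    """Run the proof-of-work check and optional integrity check with one hash."""
    block_hash = double_sha256(header)  # computed once, reused below
    if expected_hash is not None and block_hash != expected_hash:
        return False  # integrity check against the supplied hash
    return int.from_bytes(block_hash, "little") <= target  # proof-of-work check


header = bytes(80)  # placeholder 80-byte header, not a real block
header_hash = double_sha256(header)
```

Hashing inside each check separately would cost two double-hash computations per block read; sharing `block_hash` removes one of them.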
fanquake	2b85d31bcc	refactor: starts/ends_with changes for clang-tidy 20	2025-04-22 13:16:54 +01:00
Lőrinc	8d801e3efb	optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock` Similarly to the serialization reads optimization, buffered writes will enable batched XOR calculations. This is especially beneficial since the current implementation requires copying the write input's `std::span` to perform obfuscation. Batching allows us to apply XOR operations on the internal buffer instead, reducing unnecessary data copying and improving performance. ------ > macOS Sequoia 15.3.1 > C++ compiler .......................... Clang 19.1.7 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=10000 Before: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 5,149,564.31 \| 194.19 \| 0.8% \| 10.95 \| `WriteBlockBench` After: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 2,990,564.63 \| 334.39 \| 1.5% \| 11.27 \| `WriteBlockBench` ------ > Ubuntu 24.04.2 LTS > C++ compiler .......................... 
GNU 13.3.0 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=20000 Before: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 5,152,973.58 \| 194.06 \| 2.2% \| 19,350,886.41 \| 8,784,539.75 \| 2.203 \| 3,079,335.21 \| 0.4% \| 23.18 \| `WriteBlockBench` After: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 4,145,681.13 \| 241.21 \| 4.0% \| 15,337,596.85 \| 5,732,186.47 \| 2.676 \| 2,239,662.64 \| 0.1% \| 23.94 \| `WriteBlockBench` Co-authored-by: Ryan Ofsky <ryan@ofsky.org> Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>	2025-04-14 12:04:06 +02:00
Lőrinc	520965e293	optimization: bulk serialization reads in `UndoRead`, `ReadBlock` The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later. Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention. Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly. Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality. The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance. Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns. ------ > macOS Sequoia 15.3.1 > C++ compiler .......................... 
Clang 19.1.7 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000 Before: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 2,271,441.67 \| 440.25 \| 0.1% \| 11.00 \| `ReadBlockBench` After: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 1,738,971.29 \| 575.05 \| 0.2% \| 10.97 \| `ReadBlockBench` ------ > Ubuntu 24.04.2 LTS > C++ compiler .......................... GNU 13.3.0 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000 Before: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 6,895,987.11 \| 145.01 \| 0.0% \| 71,055,269.86 \| 23,977,374.37 \| 2.963 \| 5,074,828.78 \| 0.4% \| 22.00 \| `ReadBlockBench` After: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 5,771,882.71 \| 173.25 \| 0.0% \| 65,741,889.82 \| 20,453,232.33 \| 3.214 \| 3,971,321.75 \| 0.3% \| 22.01 \| `ReadBlockBench` Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com> Co-authored-by: Ryan Ofsky <ryan@ofsky.org> Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com> Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>	2025-04-14 12:04:06 +02:00
Lőrinc	056cb3c0d2	refactor: clear up blockstorage/streams in preparation for optimization Made every OpenBlockFile#fReadOnly value explicit. Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES. Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts. Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists. Without this attribute, code with lifetime violations wouldn't trigger compiler warnings. Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-04-14 11:57:14 +02:00
Lőrinc	67fcc64802	log: unify error messages for (read/write)[undo]block Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-04-13 23:44:46 +02:00
Lőrinc	a4de160492	scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant Renames the constant to be less verbose and better reflect its purpose: it represents the size of the storage header that precedes serialized block data on disk, not to be confused with a block's own header. -BEGIN VERIFY SCRIPT- git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1 sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE') -END VERIFY SCRIPT-	2025-04-13 23:44:46 +02:00
Lőrinc	6640dd52c9	Narrow scope of undofile write to avoid possible resource management issue `AutoFile{OpenUndoFile(pos)}` was still in scope when `FlushUndoFile(pos.nFile)` was called, which could lead to file handle conflicts or other unexpected behavior. Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-04-13 23:44:46 +02:00
Lőrinc	3197155f91	refactor: collect block read operations into try block Reorganized error handling in block-related operations by grouping related operations together within the same scope. In `ReadBlockUndo()` and `ReadBlock()`, moved all deserialization operations, comments and checksum verification inside a single try/catch block for cleaner error handling. In `WriteBlockUndo()`, consolidated hash calculation and data writing operations within a common block to better express their logical relationship.	2025-04-13 23:44:44 +02:00
marcofleon	3c5d1a4681	Remove checkpoints The headers presync logic should be enough to prevent memory DoS using low-work headers. Therefore, we no longer have any use for checkpoints.	2025-03-13 11:13:13 +00:00
Ava Chow	601a6a6917	Merge bitcoin/bitcoin#30965 : kernel: Move block tree db open to block manager `0cdddeb224` kernel: Move block tree db open to BlockManager constructor (TheCharlatan) `7fbb1bc44b` kernel: Move block tree db open to block manager (TheCharlatan) `57ba59c0cd` refactor: Remove redundant reindex check (TheCharlatan) Pull request description: Before this change the block tree db was needlessly re-opened during startup when loading a completed snapshot. Improve this by letting the block manager open it on construction. This also simplifies the test code a bit. The change was initially motivated to make it easier for users of the kernel library to instantiate a BlockManager that may be used to read data from disk without loading the block index into a cache. ACKs for top commit: maflcko: re-ACK `0cdddeb224` 🏪 achow101: ACK `0cdddeb224` mzumsande: re-ACK `0cdddeb224` Tree-SHA512: fe3d557a725367e549e6a0659f64259cfef6aaa565ec867d9a177be0143ff18a2c4a20dd57e35e15f97cf870df476d88c05b03b6a7d9e8d51c568d9eda8947ef	2025-01-31 15:28:06 -05:00
Ava Chow	9ecc7af41f	Merge bitcoin/bitcoin#31674 : init: Lock blocksdir in addition to datadir `2656a5658c` tests: add a test for the new blocksdir lock (Cory Fields) `bdc0a68e67` init: lock blocksdir in addition to datadir (Cory Fields) `cabb2e5c24` refactor: introduce a more general LockDirectories for init (Cory Fields) `1db331ba76` init: allow a new xor key to be written if the blocksdir is newly created (Cory Fields) Pull request description: This probably should've been included in #12653 when `-blocksdir` was introduced. Credit TheCharlatan for noticing that it's missing. This guards against 2 processes running with separate datadirs but the same blocksdir. I didn't add `walletdir` as I assume sqlite has us covered there. It's not likely to happen currently, but may be more relevant in the future with applications using the kernel. Note that the kernel does not currently do any dir locking, but it should. ACKs for top commit: maflcko: review ACK `2656a5658c` 🏼 kevkevinpal: ACK [`2656a56`](`2656a5658c`) achow101: ACK `2656a5658c` tdb3: Code review and light test ACK `2656a5658c` Tree-SHA512: 3ba17dc670126adda104148e14d1322ea4f67d671c84aaa9c08c760ef778ca1936832c0dc843cd6367e09939f64c6f0a682b0fa23a5967e821b899dff1fff961	2025-01-24 18:15:00 -05:00
TheCharlatan	0cdddeb224	kernel: Move block tree db open to BlockManager constructor Make the block db open RAII style by calling it in the BlockManager constructor. Before this change the block tree db was needlessly re-opened during startup when loading a completed snapshot. Improve this by letting the block manager open it on construction. This also simplifies the test code a bit. The change was initially motivated to make it easier for users of the kernel library to instantiate a BlockManager that may be used to read data from disk without loading the block index into a cache.	2025-01-20 21:27:50 +01:00
Cory Fields	1db331ba76	init: allow a new xor key to be written if the blocksdir is newly created A subsequent commit will add a .lock file to this dir at startup, meaning that the blocksdir is never empty by the time the xor key is being read/written. Ignore all hidden files when determining if this is the first run.	2025-01-16 21:06:21 +00:00
Lőrinc	223081ece6	scripted-diff: rename block and undo functions for consistency Co-authored-by: Ryan Ofsky <ryan@ofsky.org> Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> -BEGIN VERIFY SCRIPT- grep -r -wE 'WriteBlock\|ReadRawBlock\|ReadBlock\|WriteBlockUndo\|ReadBlockUndo' $(git ls-files src/ ':!src/leveldb') && \ echo "Error: One or more target names already exist!" && exit 1 sed -i \ -e 's/\bSaveBlockToDisk/WriteBlock/g' \ -e 's/\bReadRawBlockFromDisk/ReadRawBlock/g' \ -e 's/\bReadBlockFromDisk/ReadBlock/g' \ -e 's/\bWriteUndoDataForBlock/WriteBlockUndo/g' \ -e 's/\bUndoReadFromDisk/ReadBlockUndo/g' \ $(git ls-files src/ ':!src/leveldb') -END VERIFY SCRIPT-	2025-01-09 15:17:02 +01:00
Lőrinc	baaa3b2846	refactor,blocks: remove costly asserts and modernize affected logs When the behavior was changed in a previous commit (caching `GetSerializeSize` and avoiding `AutoFile.tell`), (static)asserts were added to make sure the behavior was kept, so that reviewers and CI could validate it. We can safely remove them now. Logs were also slightly modernized since that was trivial to do. Co-authored-by: Anthony Towns <aj@erisian.com.au> Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>	2025-01-09 15:16:49 +01:00
Lőrinc	fa39f27a0f	refactor,blocks: deduplicate block's serialized size calculations For consistency `UNDO_DATA_DISK_OVERHEAD` was also extracted to avoid the constant's ambiguity. Asserts were added to help with the review - they are removed in the next commit. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>	2025-01-09 15:16:28 +01:00
Lőrinc	dfb2f9d004	refactor,blocks: inline `WriteBlockToDisk` Similarly, `WriteBlockToDisk` wasn't really extracting a meaningful subset of the `SaveBlockToDisk` functionality; it's tied closely to the only caller (needs the header size twice, recalculates the block's serialized size, returns from multiple branches, mutates a parameter). The inlined code should only differ in these parts (modernization will be done in other commits): * renamed `blockPos` to `pos` in `SaveBlockToDisk` to match the parameter name; * changed `return false` to `return FlatFilePos()`. Also removed remaining references to `SaveBlockToDisk`. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>	2025-01-09 13:24:53 +01:00
Lőrinc	42bc491465	refactor,blocks: inline `UndoWriteToDisk` `UndoWriteToDisk` wasn't really extracting a meaningful subset of the `WriteUndoDataForBlock` functionality; it's tied closely to the only caller (needs the header size twice, recalculates the undo data's serialized size, returns from multiple branches, modifies a parameter, needs documentation). The inlined code should only differ in these parts (modernization will be done in other commits): * renamed `_pos` to `pos` in `WriteUndoDataForBlock` to match the parameter name; * inlined `hashBlock` parameter usage into `hasher << block.pprev->GetBlockHash()`; * changed `return false` to `return FatalError`; * capitalized a comment. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>	2025-01-09 13:18:22 +01:00

230 Commits
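
Commit `1db331ba76` above describes ignoring hidden files when deciding whether the blocksdir is newly created, so that a `.lock` file written at startup does not make the directory look "used". The following is a minimal sketch of that idea only, written in Python rather than the project's C++. The helper name `is_first_run`, the rule "hidden means the name starts with a dot", and the file name `blk00000.dat` are assumptions for illustration and are not taken from the commit.

```python
import tempfile
from pathlib import Path


def is_first_run(blocks_dir: Path) -> bool:
    """Return True if blocks_dir contains nothing except hidden entries.

    Hypothetical helper: "hidden" is assumed to mean a name that starts
    with a dot, such as a `.lock` file.
    """
    return all(entry.name.startswith(".") for entry in blocks_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    blocks_dir = Path(tmp)

    # A lock file alone should not count as existing block data.
    (blocks_dir / ".lock").touch()
    fresh_with_lock = is_first_run(blocks_dir)

    # Any non-hidden file (illustrative name) means this is not a first run.
    (blocks_dir / "blk00000.dat").touch()
    fresh_with_block_file = is_first_run(blocks_dir)
```

In this sketch a directory holding only `.lock` is still treated as newly created, which is the property the commit message relies on for writing a new xor key.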