bitcoin

mirror of https://github.com/bitcoin/bitcoin.git synced 2026-01-19 14:53:43 +01:00

Author	SHA1	Message	Date
MarcoFalke	face8123fd	log: [refactor] Use info level for init logs This refactor does not change behavior.	2025-07-25 09:50:50 +02:00
MarcoFalke	fa183761cb	log: Remove function name from init logs It is redundant with -logsourcelocations and the log messages are clearer without it. Also, remove a double-space. Also, add braces around `if` touched in the next commit. This tiny behavior change requires a test fixup.	2025-07-25 09:50:24 +02:00
Lőrinc	478d40afc6	refactor: encapsulate `vector`/`array` keys into `Obfuscation`	2025-07-16 14:33:07 -07:00
Lőrinc	0b8bec8aa6	scripted-diff: unify xor-vs-obfuscation nomenclature Mechanical refactor of the low-level "xor" wording to signal the intent instead of the implementation used. The renames are ordered by heaviest-hitting substitutions first, and were constructed such that after each replacement the code is still compilable. -BEGIN VERIFY SCRIPT- sed -i \ -e 's/\bGetObfuscateKey\b/GetObfuscation/g' \ -e 's/\bxor_key\b/obfuscation/g' \ -e 's/\bxor_pat\b/obfuscation/g' \ -e 's/\bm_xor_key\b/m_obfuscation/g' \ -e 's/\bm_xor\b/m_obfuscation/g' \ -e 's/\bobfuscate_key\b/m_obfuscation/g' \ -e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \ -e 's/\bSetXor(/SetObfuscation(/g' \ -e 's/\bdata_xor\b/obfuscation/g' \ -e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \ -e 's/\bobfuscate key\b/obfuscation key/g' \ $(git ls-files '.cpp' '.h') -END VERIFY SCRIPT-	2025-07-16 14:32:01 -07:00
Lőrinc	54ab0bd64c	refactor: commit to 8 byte obfuscation keys Since 31 byte xor-keys are not used in the codebase, using the common size (8 bytes) makes the benchmarks more realistic. Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-07-16 13:19:18 -07:00
Ava Chow	ea4285775e	Merge bitcoin/bitcoin#29307 : util: explicitly close all AutoFiles that have been written `c10e382d2a` flatfile: check whether the file has been closed successfully (Vasil Dimov) `4bb5dd78ea` util: check that a file has been closed before ~AutoFile() is called (Vasil Dimov) `8bb34f07df` Explicitly close all AutoFiles that have been written (Vasil Dimov) `a69c4098b2` rpc: take ownership of the file by WriteUTXOSnapshot() (Hodlinator) Pull request description: `fclose(3)` may fail to flush the previously written data to disk, thus a failing `fclose(3)` is as serious as a failing `fwrite(3)`. Previously the code ignored `fclose(3)` failures. This PR improves that by changing all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error. --- Other alternatives are: 1. `fflush(3)` after each write to the file (and throw if it fails from the `AutoFile::write()` method) and hope that `fclose(3)` will then always succeed. Assert that it succeeds from the destructor 🙄. Will hurt performance. 2. Throw nevertheless from the destructor. Exception within the exception in C++ I think results in terminating the program without a useful message. 3. (this is implemented in the latest incarnation of this PR) Redesign `AutoFile` so that its destructor cannot fail. Adjust _all_ its users 😭. For example, if the file has been written to, then require the callers to explicitly call the `AutoFile::fclose()` method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for `FILE*` which automatically closes the file when it goes out of scope and there are a lot of users of `AutoFile`. 4. Pass a new callback function to the `AutoFile` constructor which will be called from the destructor to handle `fclose()` errors, as described in https://github.com/bitcoin/bitcoin/pull/29307#issuecomment-2243842400. 
My thinking is that if that callback is going to only log a message, then we can log the message directly from the destructor without needing a callback. If the callback is going to do more complicated error handling then it is easier to do that at the call site by directly calling `AutoFile::fclose()` instead of getting the `AutoFile` object out of scope (so that its destructor is called) and inspecting for side effects done by the callback (e.g. set a variable to indicate a failed `fclose()`). ACKs for top commit: l0rinc: ACK `c10e382d2a` achow101: ACK `c10e382d2a` hodlinator: re-ACK `c10e382d2a` Tree-SHA512: 3994ca57e5b2b649fc84f24dad144173b7500fc0e914e06291d5c32fbbf8d2b1f8eae0040abd7a5f16095ddf4e11fe1636c6092f49058cda34f3eb2ee536d7ba	2025-07-03 15:37:44 -07:00
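The approach described in option 3 above (close written files explicitly and check the result) can be sketched as follows. This is a minimal illustration, not Bitcoin Core's `AutoFile`: the `SketchFile` class, its `Close()` method and the `WriteAndClose()` helper are hypothetical names introduced here, and the destructor assert only mirrors the sanity check that the pull request description mentions.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a FILE* RAII wrapper (not Bitcoin Core's AutoFile).
class SketchFile
{
    std::FILE* m_file;
    bool m_was_written{false};

public:
    SketchFile(const std::string& path, const char* mode) : m_file{std::fopen(path.c_str(), mode)} {}
    SketchFile(const SketchFile&) = delete;
    SketchFile& operator=(const SketchFile&) = delete;
    ~SketchFile()
    {
        // A destructor cannot report a close error, so a file that was
        // written to must already have been closed explicitly by the caller.
        assert(!(m_file && m_was_written));
        if (m_file) std::fclose(m_file);
    }
    bool IsOpen() const { return m_file != nullptr; }
    bool Write(const std::string& data)
    {
        if (!m_file) return false;
        m_was_written = true;
        return std::fwrite(data.data(), 1, data.size(), m_file) == data.size();
    }
    // Returns 0 on success, like fclose(3). The handle is released either way.
    int Close()
    {
        if (!m_file) return 0;
        const int ret{std::fclose(m_file)};
        m_file = nullptr;
        return ret;
    }
};

// Treat a failed close as seriously as a failed write: buffered data may
// only reach the disk (or fail to) when the file is closed.
bool WriteAndClose(const std::string& path, const std::string& data)
{
    SketchFile file{path, "wb"};
    if (!file.IsOpen()) return false;
    const bool written{file.Write(data)};
    const bool closed{file.Close() == 0};
    return written && closed;
}
```

The caller sees one boolean that covers both the write and the close, so error handling stays at the call site instead of in the destructor.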
Vasil Dimov	8bb34f07df	Explicitly close all AutoFiles that have been written There is no way to report a close error from `AutoFile` destructor. Such an error could be serious if the file has been written to because it may mean the file is now corrupted (same as if write fails). So, change all users of `AutoFile` that use it to write data to explicitly close the file and handle a possible error.	2025-06-16 15:33:15 +02:00
Roman Zeyde	6ecb9fc65f	chore: use `std::vector<std::byte>` for `BlockManager::ReadRawBlock()`	2025-06-13 19:19:44 +03:00
Lőrinc	09ee8b7f27	node: avoid recomputing block hash in `ReadBlock` Eliminate one SHA‑256 double‑hash computation of the header per block read by reusing the hash for: * proof‑of‑work verification; * (optional) integrity check against the supplied hash.	2025-05-26 23:23:44 +02:00
fanquake	2b85d31bcc	refactor: starts/ends_with changes for clang-tidy 20	2025-04-22 13:16:54 +01:00
Lőrinc	8d801e3efb	optimization: bulk serialization writes in `WriteBlockUndo` and `WriteBlock` Similarly to the serialization reads optimization, buffered writes will enable batched XOR calculations. This is especially beneficial since the current implementation requires copying the write input's `std::span` to perform obfuscation. Batching allows us to apply XOR operations on the internal buffer instead, reducing unnecessary data copying and improving performance. ------ > macOS Sequoia 15.3.1 > C++ compiler .......................... Clang 19.1.7 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=10000 Before: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 5,149,564.31 \| 194.19 \| 0.8% \| 10.95 \| `WriteBlockBench` After: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 2,990,564.63 \| 334.39 \| 1.5% \| 11.27 \| `WriteBlockBench` ------ > Ubuntu 24.04.2 LTS > C++ compiler .......................... 
GNU 13.3.0 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='WriteBlockBench' -min-time=20000 Before: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 5,152,973.58 \| 194.06 \| 2.2% \| 19,350,886.41 \| 8,784,539.75 \| 2.203 \| 3,079,335.21 \| 0.4% \| 23.18 \| `WriteBlockBench` After: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 4,145,681.13 \| 241.21 \| 4.0% \| 15,337,596.85 \| 5,732,186.47 \| 2.676 \| 2,239,662.64 \| 0.1% \| 23.94 \| `WriteBlockBench` Co-authored-by: Ryan Ofsky <ryan@ofsky.org> Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>	2025-04-14 12:04:06 +02:00
Lőrinc	520965e293	optimization: bulk serialization reads in `UndoRead`, `ReadBlock` The obfuscation (XOR) operations are currently done byte-by-byte during serialization. Buffering the reads will enable batching the obfuscation operations later. Different operating systems handle file caching differently, so reading larger batches (and processing them from memory) is measurably faster, likely because of fewer native fread calls and reduced lock contention. Note that `ReadRawBlock` doesn't need buffering since it already reads the whole block directly. Unlike `ReadBlockUndo`, the new `ReadBlock` implementation delegates to `ReadRawBlock`, which uses more memory than a buffered alternative but results in slightly simpler code and a small performance increase (~0.4%). This approach also clearly documents that `ReadRawBlock` is a logical subset of `ReadBlock` functionality. The current implementation, which iterates over a fixed-size buffer, provides a more general alternative to Cory Fields' solution of reading the entire block size in advance. Buffer sizes were selected based on benchmarking to ensure the buffered reader produces performance similar to reading the whole block into memory. Smaller buffers were slower, while larger ones showed diminishing returns. ------ > macOS Sequoia 15.3.1 > C++ compiler .......................... 
Clang 19.1.7 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=10000 Before: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 2,271,441.67 \| 440.25 \| 0.1% \| 11.00 \| `ReadBlockBench` After: \| ns/op \| op/s \| err% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------:\|:---------- \| 1,738,971.29 \| 575.05 \| 0.2% \| 10.97 \| `ReadBlockBench` ------ > Ubuntu 24.04.2 LTS > C++ compiler .......................... GNU 13.3.0 > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ && cmake --build build -j$(nproc) && build/bin/bench_bitcoin -filter='ReadBlockBench' -min-time=20000 Before: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 6,895,987.11 \| 145.01 \| 0.0% \| 71,055,269.86 \| 23,977,374.37 \| 2.963 \| 5,074,828.78 \| 0.4% \| 22.00 \| `ReadBlockBench` After: \| ns/op \| op/s \| err% \| ins/op \| cyc/op \| IPC \| bra/op \| miss% \| total \| benchmark \|--------------------:\|--------------------:\|--------:\|----------------:\|----------------:\|-------:\|---------------:\|--------:\|----------:\|:---------- \| 5,771,882.71 \| 173.25 \| 0.0% \| 65,741,889.82 \| 20,453,232.33 \| 3.214 \| 3,971,321.75 \| 0.3% \| 22.01 \| `ReadBlockBench` Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com> Co-authored-by: Ryan Ofsky <ryan@ofsky.org> Co-authored-by: Martin Leitner-Ankerl <martin.ankerl@gmail.com> Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>	2025-04-14 12:04:06 +02:00
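The buffering idea in the commit above can be illustrated with a small reader that pulls fixed-size chunks from a file and de-obfuscates each chunk in memory. This is only a sketch under stated assumptions: `ReadDeobfuscated`, `ObfKey` and the default chunk size are hypothetical, and the sketch applies the XOR per chunk for illustration, whereas the commit itself only introduces the buffering that makes such batching possible later.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

using ObfKey = std::array<std::uint8_t, 8>; // hypothetical 8-byte key type

// Read `size` bytes from `file` into `out`, one chunk at a time.
// `start_offset` is assumed to be the file offset of the current read
// position, so that the rolling key stays aligned with the file.
bool ReadDeobfuscated(std::FILE* file, std::size_t start_offset, std::size_t size,
                      const ObfKey& key, std::vector<std::uint8_t>& out,
                      std::size_t chunk_size = 4096)
{
    assert(file != nullptr && chunk_size > 0);
    out.clear();
    out.reserve(size);
    std::vector<std::uint8_t> buf(chunk_size);
    std::size_t done{0};
    while (done < size) {
        const std::size_t want{std::min(chunk_size, size - done)};
        // One fread call per chunk instead of one per serialized field.
        if (std::fread(buf.data(), 1, want, file) != want) return false;
        // De-obfuscate the whole chunk in memory.
        for (std::size_t i{0}; i < want; ++i) {
            buf[i] ^= key[(start_offset + done + i) % key.size()];
        }
        out.insert(out.end(), buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(want));
        done += want;
    }
    return true;
}
```

The chunk size here is arbitrary; the commit message says the real buffer sizes were chosen by benchmarking.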
Lőrinc	056cb3c0d2	refactor: clear up blockstorage/streams in preparation for optimization Made every OpenBlockFile#fReadOnly value explicit. Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES. Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts. Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists. Without this attribute, code with lifetime violations wouldn't trigger compiler warnings. Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-04-14 11:57:14 +02:00
Lőrinc	67fcc64802	log: unify error messages for (read/write)[undo]block Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-04-13 23:44:46 +02:00
Lőrinc	a4de160492	scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant Renames the constant to be less verbose and better reflect its purpose: it represents the size of the storage header that precedes serialized block data on disk, not to be confused with a block's own header. -BEGIN VERIFY SCRIPT- git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1 sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE') -END VERIFY SCRIPT-	2025-04-13 23:44:46 +02:00
Lőrinc	6640dd52c9	Narrow scope of undofile write to avoid possible resource management issue `AutoFile{OpenUndoFile(pos)}` was still in scope when `FlushUndoFile(pos.nFile)` was called, which could lead to file handle conflicts or other unexpected behavior. Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>	2025-04-13 23:44:46 +02:00
Lőrinc	3197155f91	refactor: collect block read operations into try block Reorganized error handling in block-related operations by grouping related operations together within the same scope. In `ReadBlockUndo()` and `ReadBlock()`, moved all deserialization operations, comments and checksum verification inside a single try/catch block for cleaner error handling. In `WriteBlockUndo()`, consolidated hash calculation and data writing operations within a common block to better express their logical relationship.	2025-04-13 23:44:44 +02:00
marcofleon	3c5d1a4681	Remove checkpoints The headers presync logic should be enough to prevent memory DoS using low-work headers. Therefore, we no longer have any use for checkpoints.	2025-03-13 11:13:13 +00:00
Ava Chow	601a6a6917	Merge bitcoin/bitcoin#30965 : kernel: Move block tree db open to block manager `0cdddeb224` kernel: Move block tree db open to BlockManager constructor (TheCharlatan) `7fbb1bc44b` kernel: Move block tree db open to block manager (TheCharlatan) `57ba59c0cd` refactor: Remove redundant reindex check (TheCharlatan) Pull request description: Before this change the block tree db was needlessly re-opened during startup when loading a completed snapshot. Improve this by letting the block manager open it on construction. This also simplifies the test code a bit. The change was initially motivated to make it easier for users of the kernel library to instantiate a BlockManager that may be used to read data from disk without loading the block index into a cache. ACKs for top commit: maflcko: re-ACK `0cdddeb224` 🏪 achow101: ACK `0cdddeb224` mzumsande: re-ACK `0cdddeb224` Tree-SHA512: fe3d557a725367e549e6a0659f64259cfef6aaa565ec867d9a177be0143ff18a2c4a20dd57e35e15f97cf870df476d88c05b03b6a7d9e8d51c568d9eda8947ef	2025-01-31 15:28:06 -05:00
Ava Chow	9ecc7af41f	Merge bitcoin/bitcoin#31674 : init: Lock blocksdir in addition to datadir `2656a5658c` tests: add a test for the new blocksdir lock (Cory Fields) `bdc0a68e67` init: lock blocksdir in addition to datadir (Cory Fields) `cabb2e5c24` refactor: introduce a more general LockDirectories for init (Cory Fields) `1db331ba76` init: allow a new xor key to be written if the blocksdir is newly created (Cory Fields) Pull request description: This probably should've been included in #12653 when `-blocksdir` was introduced. Credit TheCharlatan for noticing that it's missing. This guards against 2 processes running with separate datadirs but the same blocksdir. I didn't add `walletdir` as I assume sqlite has us covered there. It's not likely to happen currently, but may be more relevant in the future with applications using the kernel. Note that the kernel does not currently do any dir locking, but it should. ACKs for top commit: maflcko: review ACK `2656a5658c` 🏼 kevkevinpal: ACK [`2656a56`](`2656a5658c`) achow101: ACK `2656a5658c` tdb3: Code review and light test ACK `2656a5658c` Tree-SHA512: 3ba17dc670126adda104148e14d1322ea4f67d671c84aaa9c08c760ef778ca1936832c0dc843cd6367e09939f64c6f0a682b0fa23a5967e821b899dff1fff961	2025-01-24 18:15:00 -05:00
TheCharlatan	0cdddeb224	kernel: Move block tree db open to BlockManager constructor Make the block db open RAII style by calling it in the BlockManager constructor. Before this change the block tree db was needlessly re-opened during startup when loading a completed snapshot. Improve this by letting the block manager open it on construction. This also simplifies the test code a bit. The change was initially motivated to make it easier for users of the kernel library to instantiate a BlockManager that may be used to read data from disk without loading the block index into a cache.	2025-01-20 21:27:50 +01:00
Cory Fields	1db331ba76	init: allow a new xor key to be written if the blocksdir is newly created A subsequent commit will add a .lock file to this dir at startup, meaning that the blocksdir is never empty by the time the xor key is being read/written. Ignore all hidden files when determining if this is the first run.	2025-01-16 21:06:21 +00:00
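The "ignore all hidden files" rule in the commit above can be sketched with `std::filesystem`. This is a hedged illustration, not the actual init code: `IsFreshDir` is a hypothetical helper, and "hidden" is assumed here to mean a name that starts with a dot (such as a `.lock` file).

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: a directory counts as fresh (first run) if it holds
// no entries other than hidden ones, i.e. names starting with '.'.
bool IsFreshDir(const fs::path& dir)
{
    assert(fs::is_directory(dir));
    for (const auto& entry : fs::directory_iterator{dir}) {
        const std::string name{entry.path().filename().string()};
        if (!name.empty() && name[0] == '.') continue; // skip hidden entries
        return false; // found a visible entry, so this is not the first run
    }
    return true;
}
```

With a check like this, a lock file created earlier during startup does not prevent a new key from being written on first run.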
Lőrinc	223081ece6	scripted-diff: rename block and undo functions for consistency Co-authored-by: Ryan Ofsky <ryan@ofsky.org> Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com> -BEGIN VERIFY SCRIPT- grep -r -wE 'WriteBlock\|ReadRawBlock\|ReadBlock\|WriteBlockUndo\|ReadBlockUndo' $(git ls-files src/ ':!src/leveldb') && \ echo "Error: One or more target names already exist!" && exit 1 sed -i \ -e 's/\bSaveBlockToDisk/WriteBlock/g' \ -e 's/\bReadRawBlockFromDisk/ReadRawBlock/g' \ -e 's/\bReadBlockFromDisk/ReadBlock/g' \ -e 's/\bWriteUndoDataForBlock/WriteBlockUndo/g' \ -e 's/\bUndoReadFromDisk/ReadBlockUndo/g' \ $(git ls-files src/ ':!src/leveldb') -END VERIFY SCRIPT-	2025-01-09 15:17:02 +01:00
Lőrinc	baaa3b2846	refactor,blocks: remove costly asserts and modernize affected logs When the behavior was changed in a previous commit (caching `GetSerializeSize` and avoiding `AutoFile.tell`), (static)asserts were added to make sure the behavior was kept, so that reviewers and CI could validate it. We can safely remove them now. Logs were also slightly modernized since that was trivial to do. Co-authored-by: Anthony Towns <aj@erisian.com.au> Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>	2025-01-09 15:16:49 +01:00
Lőrinc	fa39f27a0f	refactor,blocks: deduplicate block's serialized size calculations For consistency `UNDO_DATA_DISK_OVERHEAD` was also extracted to avoid the constant's ambiguity. Asserts were added to help with the review - they are removed in the next commit. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>	2025-01-09 15:16:28 +01:00
Lőrinc	dfb2f9d004	refactor,blocks: inline `WriteBlockToDisk` Similarly, `WriteBlockToDisk` wasn't really extracting a meaningful subset of the `SaveBlockToDisk` functionality; it's tied closely to its only caller (needs the header size twice, recalculates the block's serialized size, returns from multiple branches, mutates a parameter). The inlined code should only differ in these parts (modernization will be done in other commits): * renamed `blockPos` to `pos` in `SaveBlockToDisk` to match the parameter name; * changed `return false` to `return FlatFilePos()`. Also removed remaining references to `SaveBlockToDisk`. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>	2025-01-09 13:24:53 +01:00
Lőrinc	42bc491465	refactor,blocks: inline `UndoWriteToDisk` `UndoWriteToDisk` wasn't really extracting a meaningful subset of the `WriteUndoDataForBlock` functionality; it's tied closely to its only caller (needs the header size twice, recalculates the undo data's serialized size, returns from multiple branches, modifies a parameter, needs documentation). The inlined code should only differ in these parts (modernization will be done in other commits): * renamed `_pos` to `pos` in `WriteUndoDataForBlock` to match the parameter name; * inlined `hashBlock` parameter usage into `hasher << block.pprev->GetBlockHash()`; * changed `return false` to `return FatalError`; * capitalized a comment. Co-authored-by: Ryan Ofsky <ryan@ofsky.org>	2025-01-09 13:18:22 +01:00
Pieter Wuille	67a3d59076	streams: remove unused code	2024-09-19 07:33:02 -04:00
Pieter Wuille	e624a9bef1	streams: cache file position within AutoFile	2024-09-13 07:35:41 -04:00
Ava Chow	d4b5553849	Merge bitcoin/bitcoin#30742 : kernel: Use spans instead of vectors for passing block headers to validation functions `a2955f0979` validation: Use span for ImportBlocks paths (TheCharlatan) `20515ea3f5` validation: Use span for CalculateClaimedHeadersWork (TheCharlatan) `52575e96e7` validation: Use span for ProcessNewBlockHeaders (TheCharlatan) Pull request description: Makes it friendlier for potential future users of the kernel library if they do not store the headers in a std::vector, but can guarantee contiguous memory. Take this opportunity to also change the argument of ImportBlocks previously taking a `std::vector` to a `std::span`. ACKs for top commit: stickies-v: re-ACK `a2955f0979` - no changes except further walking the ~file~ path of modernizing variable names. maflcko: ACK `a2955f0979` 🕑 achow101: ACK `a2955f0979` danielabrozzoni: ACK `a2955f0979` Tree-SHA512: 8b07f4ad26e270b65600d1968cd78847b85caca5bfbb83fd9860389f26656b1d9a40b85e0990339f50403d18cedcd2456990054f3b8b0bedce943e50222d2709	2024-09-03 15:40:40 -04:00
TheCharlatan	a2955f0979	validation: Use span for ImportBlocks paths Makes it friendlier for potential future users of the kernel library if they do not store the headers in a std::vector, but can guarantee contiguous memory.	2024-08-30 12:39:46 +02:00
MarcoFalke	3333415890	scripted-diff: LogPrint -> LogDebug -BEGIN VERIFY SCRIPT- sed -i 's/\<LogPrint\>/LogDebug/g' $( git grep -l '\<LogPrint\>' -- ./contrib/ ./src/ ./test/ ':(exclude)src/logging.h' ) -END VERIFY SCRIPT-	2024-08-29 13:49:57 +02:00
stickies-v	2925bd537c	refactor: use c++20 std::views::reverse instead of reverse_iterator.h Use std::ranges::views::reverse instead of the implementation in reverse_iterator.h, and remove it as it is no longer used.	2024-08-06 00:23:38 +01:00
Ava Chow	949b673472	Merge bitcoin/bitcoin#28052 : blockstorage: XOR blocksdir .dat files `fa895c7283` mingw: Document mode wbx workaround (MarcoFalke) `fa359255fe` Add -blocksxor boolean option (MarcoFalke) `fa7f7ac040` Return XOR AutoFile from BlockManager::OpenFile() (MarcoFalke) Pull request description: Currently the *.dat files in the blocksdir store the data received from remote peers as-is. This may be problematic when a program other than Bitcoin Core tries to interpret them by accident. For example, an anti-virus program or other program may scan them and move them into quarantine, or delete them, or corrupt them. This may cause Bitcoin Core to fail a reorg, or fail to reply to block requests (via P2P, RPC, REST, ...). Fix this, similar to https://github.com/bitcoin/bitcoin/pull/6650, by rolling a random XOR pattern over the dat files when writing or reading them. Obviously this can only protect against programs that accidentally and unintentionally are trying to mess with the dat files. Any program that intentionally wants to mess with the dat files can still trivially do so. The XOR pattern is only applied when the blocksdir is freshly created, and there is an option to disable it (on creation), so that people can disable it, if needed. ACKs for top commit: achow101: ACK `fa895c7283` TheCharlatan: Re-ACK `fa895c7283` hodlinator: ACK `fa895c7283` Tree-SHA512: c92a6a717da83bc33a9b8671a779eeefde2c63b192362ba1d71e6535ee31d08e2802b74acc908345197de9daac6930e4771595ee25b09acd5a67f7ea34854720	2024-08-05 17:52:42 -04:00
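The "rolling XOR pattern" described in the pull request above can be sketched as below. This is a minimal illustration under assumptions, not the code from the pull request: `XorInPlace` is a hypothetical name, and the key is modelled as a plain byte vector. Indexing the key by file offset keeps the pattern consistent no matter how the file is split into reads and writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// XOR `data` in place with a repeating `key`. `file_offset` is the position
// of data[0] within the file, so any slice of the file can be processed
// independently and still line up with the same key bytes.
void XorInPlace(std::vector<std::uint8_t>& data, const std::vector<std::uint8_t>& key,
                std::size_t file_offset)
{
    assert(!key.empty()); // an empty key would make the modulo below undefined
    for (std::size_t i{0}; i < data.size(); ++i) {
        data[i] ^= key[(file_offset + i) % key.size()];
    }
}
```

Applying the function twice with the same key and offset restores the original bytes, which is why the same routine can serve both writing and reading. As the description notes, this only guards against accidental interpretation of the files by other programs, not against intentional tampering.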
Fabian Jahr	bf0efb4fc7	scripted-diff: Modernize naming of nChainTx and nTxCount -BEGIN VERIFY SCRIPT- sed -i 's/nChainTx/m_chain_tx_count/g' $(git grep -l 'nChainTx' ./src) sed -i 's/nTxCount/tx_count/g' $(git grep -l 'nTxCount' ./src) -END VERIFY SCRIPT-	2024-08-04 14:24:43 +02:00
MarcoFalke	fa895c7283	mingw: Document mode wbx workaround	2024-07-26 17:31:15 +02:00
MarcoFalke	fa359255fe	Add -blocksxor boolean option	2024-07-26 17:30:53 +02:00
MarcoFalke	fa7f7ac040	Return XOR AutoFile from BlockManager::Open*File() This is a refactor, because the XOR key is empty.	2024-07-26 12:28:59 +02:00
TheCharlatan	7aa8994c6f	refactor: Add FlatFileSeq member variables in BlockManager Instead of constructing a new class every time a file operation is done, construct them once for each of the undo and block file when a new BlockManager is created. In future, this might make it easier to introduce an abstract block store.	2024-07-24 09:39:35 +02:00
Ryan Ofsky	8426e018bf	Merge bitcoin/bitcoin#30428 : log: LogError with FlatFilePos in UndoReadFromDisk `fa14e1d9d5` log: Fix __func__ in LogError in blockstorage module (MarcoFalke) `fad59a2f0f` log: LogError with FlatFilePos in UndoReadFromDisk (MarcoFalke) `aaaa3323f3` refactor: Mark IsBlockPruned const (MarcoFalke) Pull request description: These errors should never happen in normal operation. If they do, knowing the `FlatFilePos` may be useful to determine if data corruption happened. Also, handle the error `pos.IsNull()` as part of `OpenUndoFile`, because it may as well have happened due to data corruption. This mirrors the `LogError` behavior from `ReadBlockFromDisk`. Also, two other fixup commits in this module. ACKs for top commit: kevkevinpal: ACK [`fa14e1d`](`fa14e1d9d5`) tdb3: cr and light test ACK `fa14e1d9d5` ryanofsky: Code review ACK `fa14e1d9d5`. This should make logging clearer and more consistent Tree-SHA512: abb492a919b4796698d1de0a7874c8eae355422b992aa80dcd6b59c2de1ee0d2949f62b3cf649cd62892976fee640358f7522867ed9d48a595d6f8f4e619df50	2024-07-15 13:42:53 -04:00
MarcoFalke	fa14e1d9d5	log: Fix __func__ in LogError in blockstorage module These errors should never happen. However, when they do happen, it is useful to log the correct error location (function name). For example, this fixes an incorrect "ConnectBlock()" in "WriteUndoDataForBlock".	2024-07-11 16:34:43 +02:00
MarcoFalke	fad59a2f0f	log: LogError with FlatFilePos in UndoReadFromDisk These errors should never happen in normal operation. If they do, knowing the FlatFilePos may be useful to determine if data corruption happened. Also, handle the error pos.IsNull() as part of OpenUndoFile, because it may as well have happened due to data corruption. This mirrors the LogError behavior from ReadBlockFromDisk.	2024-07-11 16:22:31 +02:00
MarcoFalke	aaaa3323f3	refactor: Mark IsBlockPruned const Member fields are used read-only in this method.	2024-07-11 15:39:19 +02:00
Ava Chow	f4849f6922	Merge bitcoin/bitcoin#29668 : prune, rpc: Check undo data when finding pruneheight `8789dc8f31` doc: Add note to getblockfrompeer on missing undo data (Fabian Jahr) `4a1975008b` rpc: Make pruneheight also reflect undo data presence (Fabian Jahr) `96b4facc91` refactor, blockstorage: Generalize GetFirstStoredBlock (Fabian Jahr) Pull request description: The function `GetFirstStoredBlock()` helps us find the first block for which we have data. So far this function only looked for a block with `BLOCK_HAVE_DATA`. However, this doesn't mean that we also have the undo data of that block, and undo data might be required for what a user would like to do with those blocks. One example of how this might happen is if some blocks were fetched using the `getblockfrompeer` RPC. Blocks fetched from a peer will have data but no undo data. The first commit here allows `GetFirstStoredBlock()` to check for undo data as well by passing a parameter. This alone is useful for #29553 and I would use it there. In the second commit I am applying the undo check to the RPCs that report `pruneheight` to the user. I find this much more intuitive because I think the user expects to be able to do all operations on blocks up until the `pruneheight` but that is not the case if undo data is missing. I personally ran into this once before and now again when testing for assumeutxo when I had used `getblockfrompeer`. The following commit adds test coverage for this change of behavior. The last commit adds a note in the docs of `getblockfrompeer` that undo data will not be available. ACKs for top commit: achow101: ACK `8789dc8f31` furszy: Code review ACK `8789dc8f31`. stickies-v: ACK `8789dc8f31` Tree-SHA512: 90ae8bdd07a496ade579aa25240609c61c9ed173ad38d30533f6c631fe674e5a41727478ade69ca4b71a571ad94c9da4b33ebba6b5d8821109313c2de3bdfb3d	2024-07-10 15:27:05 -04:00
Fabian Jahr	96b4facc91	refactor, blockstorage: Generalize GetFirstStoredBlock GetFirstStoredBlock is generalized to check for any data status with a status mask that needs to be passed as a parameter. To reflect this the function is also renamed to GetFirstBlock. Co-authored-by: stickies-v <stickies-v@protonmail.com>	2024-06-21 15:00:16 +02:00
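The status-mask generalization described above can be sketched as a backwards walk over a linked block index. The types and names here (`IndexEntry`, `HAVE_DATA`, `HAVE_UNDO`, `FirstBlockWithStatus`) are simplified, hypothetical stand-ins for illustration and do not reproduce the signature of `GetFirstBlock`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified block index entry.
struct IndexEntry {
    const IndexEntry* prev{nullptr};
    std::uint32_t status{0};
    int height{0};
};

constexpr std::uint32_t HAVE_DATA{1U << 0}; // illustrative flag values
constexpr std::uint32_t HAVE_UNDO{1U << 1};

// Starting at `upper`, walk towards genesis while every bit of `status_mask`
// is set, and return the earliest entry of that unbroken run.
const IndexEntry& FirstBlockWithStatus(const IndexEntry& upper, std::uint32_t status_mask)
{
    assert((upper.status & status_mask) == status_mask);
    const IndexEntry* last{&upper};
    while (last->prev && (last->prev->status & status_mask) == status_mask) {
        last = last->prev;
    }
    return *last;
}
```

Passing `HAVE_DATA | HAVE_UNDO` instead of `HAVE_DATA` alone gives the stricter answer that the pull request wants for `pruneheight`: a block fetched with `getblockfrompeer` has data but no undo data, so it ends the run under the combined mask.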
Ryan Ofsky	f68cba29b3	blockman: Replace m_reindexing with m_blockfiles_indexed This is just a mechanical change, renaming and inverting the meaning of the indexing variable. "m_blockfiles_indexed" is a more straightforward name for this variable because this variable just indicates whether or not <datadir>/blocks/blk?????.dat files have been indexed in the <datadir>/blocks/index LevelDB database. The name "m_reindexing" was more confusing: it could be true even if -reindex was not specified, and false when it was specified. Also, the previous name unnecessarily required thinking about the whole reindexing process just to understand simple checks in validation code about whether blocks were indexed. The motivation for this change is to follow up on previous commits, moving away from having multiple variables called "reindex" internally, and instead naming variables individually after what they do and represent.	2024-06-07 19:18:46 +02:00
Ava Chow	058af75874	Merge bitcoin/bitcoin#29817 : kernel: De-globalize fReindex `b47bd95920` kernel: De-globalize fReindex (TheCharlatan) Pull request description: fReindex is one of the last remaining globals exposed by the kernel library, so move it into the blockstorage class to reduce the amount of global mutable state and make the kernel library a bit less awkward to use. --- This pull request is part of the [libbitcoinkernel project](https://github.com/bitcoin/bitcoin/issues/27587). ACKs for top commit: achow101: ACK `b47bd95920` ryanofsky: Code review ACK `b47bd95920`. I rereviewed the whole PR, but the only change since last review was reverting the bugfix https://github.com/bitcoin/bitcoin/pull/29817#discussion_r1578327024 and make the change a pure refactoring. mzumsande: Code Review ACK `b47bd95920` stickies-v: ACK `b47bd95920` Tree-SHA512: f7399d01f93bc0c0c7428fe95d19b9d29b4ed00a4f1deabca78fb0c4fecb434ec971e890feecb105938b5247c926850b1b7b4a4a9caa333a061e40777d0c8463	2024-05-17 15:50:56 -04:00
Ryan Ofsky	2f53f2273d	Merge bitcoin/bitcoin#29975 : blockstorage: Separate reindexing from saving new blocks `e41667b720` blockstorage: Don't move cursor backwards in UpdateBlockInfo (Ryan Ofsky) `17103637c6` blockstorage: Rename FindBlockPos and have it return a FlatFilePos (Martin Zumsande) `d9e477c4dc` validation, blockstorage: Separate code paths for reindex and saving new blocks (Martin Zumsande) `064859bbad` blockstorage: split up FindBlockPos function (Martin Zumsande) `fdae638e83` doc: Improve doc for functions involved in saving blocks to disk (Martin Zumsande) `0d114e3cb2` blockstorage: Add Assume for fKnown / snapshot chainstate (Martin Zumsande) Pull request description: `SaveBlockToDisk` / `FindBlockPos` are used for two purposes, depending on whether they are called during reindexing (`dbp` set, `fKnown = true`) or in the "normal" case when adding new blocks (`dbp == nullptr`, `fKnown = false`). The actual tasks are quite different - In normal mode, preparations for saving a new block are made, which is then saved: find the correct position on disk (maybe skipping to a new blk file), check for available disk space, update the blockfile info db, save the block. - during reindex, most of this is not necessary (the block is already on disk after all), only the blockfile info needs to be rebuilt because reindex wiped the leveldb it's saved in.
Using one function with many conditional statements for this leads to code that is hard to read / understand and bug-prone: - many code paths in `FindBlockPos` are conditional on `fKnown` or `!fKnown` - It's not really clear what actually needs to be done during reindex (we don't need to "save a block to disk" or "find a block pos" as the function names suggest) - logic that should be applied to only one of the two modes is sometimes applied to both (see first commit, or #27039) #24858 and #27039 were recent bugs directly related to the differences between reindexing and normal mode, and in both cases the simple fix took a long time to be reviewed and merged. This PR proposes to clean this code up by splitting out the reindex logic into a separate function (`UpdateBlockInfo`) which will be called directly from validation. As a result, `SaveBlockToDisk` and `FindBlockPos` only need to cover the non-reindex logic. ACKs for top commit: paplorinc: ACK `e41667b720` TheCharlatan: Re-ACK `e41667b720` ryanofsky: Code review ACK `e41667b720`. Just improvements to comments since last review. Tree-SHA512: a14ff9a0facf6b1e3c1cd724a2d19a79a25d4b48de64398fdd172671532a472bc10a20cbb64ac3a3e55814dcc877d0597a3e1699cabc4f9d9a86b439b6eaba20	2024-05-16 11:16:08 -04:00
TheCharlatan	b47bd95920	kernel: De-globalize fReindex fReindex is one of the last remaining globals exposed by the kernel library, so move it into the blockstorage class to reduce the amount of global mutable state and make the kernel library a bit less awkward to use.	2024-05-16 11:28:46 +02:00
Ryan Ofsky	e41667b720	blockstorage: Don't move cursor backwards in UpdateBlockInfo Previously, it was possible to move the cursor back to an older file if blocks were encountered out of order during reindex. This would mean that MaxBlockfileNum() would be incorrect, and a wrong DB_LAST_BLOCK could be written to disk. This improves the logic by only ever moving the cursor forward (if possible) but not backwards. Co-authored-by: Martin Zumsande <mzumsande@gmail.com>	2024-05-14 14:54:27 -04:00
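The forward-only rule from the commit above can be sketched in a few lines. The `Cursor` struct and `AdvanceCursor` function are hypothetical names for illustration; the real `UpdateBlockInfo` does considerably more than this.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical cursor tracking the highest block file number seen so far.
struct Cursor {
    int file_num{0};
};

// Move the cursor to `block_file_num` only if that is further ahead.
// Blocks encountered out of order therefore never move it backwards.
void AdvanceCursor(Cursor& cursor, int block_file_num)
{
    assert(block_file_num >= 0);
    cursor.file_num = std::max(cursor.file_num, block_file_num);
}
```

Because the cursor is monotonic, a value derived from it (such as the highest block file number) stays correct even when an older file is visited after a newer one.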

207 Commits