Commit Graph

130 Commits

Author SHA1 Message Date
TheCharlatan
d69a582e72 kernel: Remove some unnecessary non-kernel includes
Specifically gets rid of batchpriority, chainparams, script/sign.h and
system includes.

Also take the opportunity of cleaning up the headers for the effected
files and adding them to the iwyu-enforced set.
2025-12-21 10:24:09 +01:00
merge-script
7f295e1d9b Merge bitcoin/bitcoin#34084: scripted-diff: [doc] Unify stale copyright headers
fa4cb13b52 test: [doc] Manually unify stale headers (MarcoFalke)
fa5f297748 scripted-diff: [doc] Unify stale copyright headers (MarcoFalke)

Pull request description:

  Historically, the upper year range in file headers was bumped manually
  or with a script.

  This has many issues:

  * The script is causing churn. See for example commit 306ccd4, or
    drive-by first-time contributions bumping them one-by-one. (A few from
    this year: https://github.com/bitcoin/bitcoin/pull/32008,
    https://github.com/bitcoin/bitcoin/pull/31642,
    https://github.com/bitcoin/bitcoin/pull/32963, ...)
  * Some, or likely most, upper year values were wrong. Reasons for
    incorrect dates could be code moves, cherry-picks, or simply bugs in
    the script.
  * The upper range is not needed for anything.
  * Anyone who wants to find the initial file creation date, or file
    history, can use `git log` or `git blame` to get more accurate
    results.
  * Many places are already using the `-present` suffix, with the meaning
    that the upper range is omitted.

  To fix all issues, this bumps the upper range of the copyright headers
  to `-present`.

  Further notes:

  * Obviously, the yearly 4-line bump commit for the build system (c.f.
    b537a2c02a) is fine and will remain.
  * For new code, the date range can be fully omitted, as it is done
    already by some developers. Obviously, developers are free to pick
    whatever style they want. One can list the commits for each style.
  * For example, to list all commits that use `-present`:
    `git log --format='%an (%ae) [%h: %s]' -S 'present The Bitcoin'`.
  * Alternatively, to list all commits that use no range at all:
    `git log --format='%an (%ae) [%h: %s]' -S '(c) The Bitcoin'`.

  <!--
  * The lower range can be wrong as well, so it could be omitted as well,
    but this is left for a follow-up. A previous attempt was in
    https://github.com/bitcoin/bitcoin/pull/26817.

ACKs for top commit:
  l0rinc:
    ACK fa4cb13b52
  rkrux:
    re-ACK fa4cb13b52
  janb84:
    ACK fa4cb13b52

Tree-SHA512: e5132781bdc4417d1e2922809b27ef4cf0abb37ffb68c65aab8a5391d3c917b61a18928ec2ec2c75ef5184cb79a5b8c8290d63e949220dbeab3bd2c0dfbdc4c5
2025-12-19 16:56:02 +00:00
MarcoFalke
fa5f297748 scripted-diff: [doc] Unify stale copyright headers
-BEGIN VERIFY SCRIPT-

 sed --in-place --regexp-extended \
   's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' \
   $( git grep -l 'The Bitcoin Core developers' -- ':(exclude)COPYING' ':(exclude)src/ipc/libmultiprocess' ':(exclude)src/minisketch' )

-END VERIFY SCRIPT-
2025-12-16 22:21:15 +01:00
Martin Zumsande
c011e3aa54 test: Wrap validation functions with TestChainstateManager
This allows to access them in the fuzz test in the next commit
without making them public.

Co-authored-by: TheCharlatan <seb.kung@gmail.com>
2025-12-16 11:25:46 -05:00
merge-script
4f11ef058b Merge bitcoin/bitcoin#30214: refactor: Improve assumeutxo state representation
82be652e40 doc: Improve ChainstateManager documentation, use consistent terms (Ryan Ofsky)
af455dcb39 refactor: Simplify pruning functions (TheCharlatan)
ae85c495f1 refactor: Delete ChainstateManager::GetAll() method (Ryan Ofsky)
6a572dbda9 refactor: Add ChainstateManager::ActivateBestChains() method (Ryan Ofsky)
491d827d52 refactor: Add ChainstateManager::m_chainstates member (Ryan Ofsky)
e514fe6116 refactor: Delete ChainstateManager::SnapshotBlockhash() method (Ryan Ofsky)
ee35250683 refactor: Delete ChainstateManager::IsSnapshotValidated() method (Ryan Ofsky)
d9e82299fc refactor: Delete ChainstateManager::IsSnapshotActive() method (Ryan Ofsky)
4dfe383912 refactor: Convert ChainstateRole enum to struct (Ryan Ofsky)
352ad27fc1 refactor: Add ChainstateManager::ValidatedChainstate() method (Ryan Ofsky)
a229cb9477 refactor: Add ChainstateManager::CurrentChainstate() method (Ryan Ofsky)
a9b7f5614c refactor: Add Chainstate::StoragePath() method (Ryan Ofsky)
840bd2ef23 refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation (Ryan Ofsky)
1598a15aed refactor: Deduplicate Chainstate activation code (Ryan Ofsky)
9fe927b6d6 refactor: Add Chainstate m_assumeutxo and m_target_utxohash members (Ryan Ofsky)
6082c84713 refactor: Add Chainstate::m_target_blockhash member (Ryan Ofsky)
de00e87548 test: Fix broken chainstatemanager_snapshot_init check (Ryan Ofsky)

Pull request description:

  This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking `cs_main` for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart.

  The changes in this PR are just refactoring. They make `Chainstate` objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying `ChainstateManager`, and to determine whether a Chainstate is validated without basing it on inferences like `&cs != &ActiveChainstate()` or `GetAll().size() == 1`.

  The PR also tries to make assumeutxo terminology less confusing, using "current chainstate" to refer to the chainstate targeting the current network tip, and "historical chainstate" to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms "active chainstate," "usable chainstate," "disabled chainstate," "ibd chainstate," and "snapshot chainstate" which are confusing for various reasons.

ACKs for top commit:
  maflcko:
    re-review ACK 82be652e40 🕍
  fjahr:
    re-ACK 82be652e40
  sedited:
    Re-ACK 82be652e40

Tree-SHA512: 81c67abba9fc5bb170e32b7bf8a1e4f7b5592315b4ef720be916d5f1f5a7088c0c59cfb697744dd385552f58aa31ee36176bae6a6e465723e65861089a1252e5
2025-12-16 14:03:34 +00:00
TheCharlatan
af455dcb39 refactor: Simplify pruning functions
Move GetPruneRange from ChainstateManager to Chainstate.
2025-12-12 11:49:59 +01:00
Roman Zeyde
4e2af1c065 blockstorage: allow reading partial block data from storage
It will allow fetching specific transactions using an external index,
following https://github.com/bitcoin/bitcoin/pull/32541#issuecomment-3267485313.

No logging takes place in case of an invalid offset/size (to avoid spamming the log),
by using a new `ReadRawError::BadPartRange` error variant.

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
2025-12-11 18:54:55 +01:00
Roman Zeyde
f2fd1aa21c blockstorage: return an error code from ReadRawBlock()
It will enable different error handling flows for different error types.

Also, `ReadRawBlockBench` performance has decreased due to no longer reusing a vector
with an unchanging capacity - mirroring our production code behavior.

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
2025-12-11 18:54:55 +01:00
Andrew Toth
99d012ec80 refactor: return reference instead of pointer
The return value of BlockManager::GetFirstBlock must always be non-null. This
can be inferred by the implementation, which has an assertion that the return
value is not null. A raw pointer should only be returned if the result may be
null. In this case a reference is more appropriate.
2025-11-13 09:57:42 -05:00
Andrew Toth
f743e6c5dd refactor: add missing LIFETIMEBOUND annotation for parameter
The BlockManager::GetFirstBlock lower_block parameter can have its lifetime
extended by the return parameter. In the case where lower_block is returned,
its lifetime will be bound to the return value. A LIFETIMEBOUND annotation is
appropriate here.
2025-11-13 09:57:42 -05:00
Andrew Toth
141117f5e8 refactor: remove incorrect LIFETIMEBOUND annotations
The return value of CheckBlockDataAvailability does not extend the lifetime of
the input parameters, nor does BlockManager instance retain references to the
parameters. The LIFETIMEBOUND annotations are misleading here since the lifetime
of the parameters are not extended past the method call.
2025-11-13 09:37:55 -05:00
merge-script
3789215f73 Merge bitcoin/bitcoin#33724: refactor: Return uint64_t from GetSerializeSize
fa6c0bedd3 refactor: Return uint64_t from GetSerializeSize (MarcoFalke)
fad0c8680e refactor: Use uint64_t over size_t for serialized-size values (MarcoFalke)
fa4f388fc9 refactor: Use fixed size ints over (un)signed ints for serialized values (MarcoFalke)
fa01f38e53 move-only: Move CBlockFileInfo to kernel namespace (MarcoFalke)
fa2bbc9e4c refactor: [rpc] Remove cast when reporting serialized size (MarcoFalke)
fa364af89b test: Remove outdated comment (MarcoFalke)

Pull request description:

  Consensus code should arrive at the same conclusion, regardless of the architecture it runs on. Using architecture-specific types such as `size_t` can lead to issues, such as the low-severity [CVE-2025-46597](https://bitcoincore.org/en/2025/10/24/disclose-cve-2025-46597/).

  The CVE was already worked around, but it may be good to still fix the underlying issue.

  Fixes https://github.com/bitcoin/bitcoin/issues/33709 with a few refactors to use explicit fixed-sized integer types in serialization-size related code and concluding with a refactor to return `uint64_t` from `GetSerializeSize`. The refactors should not change any behavior, because the CVE was already worked around.

ACKs for top commit:
  Crypt-iQ:
    crACK fa6c0bedd3
  l0rinc:
    ACK fa6c0bedd3
  laanwj:
    Code review ACK fa6c0bedd3

Tree-SHA512: f45057bd86fb46011e4cb3edf0dc607057d72ed869fd6ad636562111ae80fea233b2fc45c34b02256331028359a9c3f4fa73e9b882b225bdc089d00becd0195e
2025-11-12 09:48:10 -05:00
MarcoFalke
fa4f388fc9 refactor: Use fixed size ints over (un)signed ints for serialized values
Bitcoin Core already assumes that 'unsigned int' means uint32_t and
'signed int' means int32_t. See src/compat/assumptions.h. Also, any
serialized integral value must be of a fixed size.

So make the fixed size explicit in this documenting refactor, which does
not change the behavior on any platform.
2025-10-30 17:51:38 +01:00
MarcoFalke
fa01f38e53 move-only: Move CBlockFileInfo to kernel namespace
Also, move it to the blockstorage module, because it is only used inside
that module.

Can be reviewed with the git option --color-moved=dimmed-zebra
2025-10-28 16:08:44 +01:00
Lőrinc
743abbcbde refactor: inline constant return value of BlockTreeDB::WriteBatchSync and BlockManager::WriteBlockIndexDB and BlockTreeDB::WriteFlag 2025-08-13 15:47:48 -07:00
Lőrinc
e030240e90 refactor: inline constant return value of CDBWrapper::Erase and BlockTreeDB::WriteReindexing
Did both in this commit, since the return value of `WriteReindexing` was ignored anyway - which existed only because of the constant `Erase` being called
2025-08-13 15:47:48 -07:00
Lőrinc
478d40afc6 refactor: encapsulate vector/array keys into Obfuscation 2025-07-16 14:33:07 -07:00
Lőrinc
0b8bec8aa6 scripted-diff: unify xor-vs-obfuscation nomenclature
Mechanical refactor of the low-level "xor" wording to signal the intent instead of the implementation used.
The renames are ordered by heaviest-hitting substitutions first, and were constructed such that after each replacement the code is still compilable.

-BEGIN VERIFY SCRIPT-
sed -i \
  -e 's/\bGetObfuscateKey\b/GetObfuscation/g' \
  -e 's/\bxor_key\b/obfuscation/g' \
  -e 's/\bxor_pat\b/obfuscation/g' \
  -e 's/\bm_xor_key\b/m_obfuscation/g' \
  -e 's/\bm_xor\b/m_obfuscation/g' \
  -e 's/\bobfuscate_key\b/m_obfuscation/g' \
  -e 's/\bOBFUSCATE_KEY_KEY\b/OBFUSCATION_KEY_KEY/g' \
  -e 's/\bSetXor(/SetObfuscation(/g' \
  -e 's/\bdata_xor\b/obfuscation/g' \
  -e 's/\bCreateObfuscateKey\b/CreateObfuscation/g' \
  -e 's/\bobfuscate key\b/obfuscation key/g' \
  $(git ls-files '*.cpp' '*.h')
-END VERIFY SCRIPT-
2025-07-16 14:32:01 -07:00
Ava Chow
319ff58bbd Merge bitcoin/bitcoin#32638: blocks: force hash validations on disk read
9341b5333a blockstorage: make block read hash checks explicit (Lőrinc)
2371b9f4ee test/bench: verify hash in `ComputeFilter` reads (Lőrinc)
5d235d50d6 net: assert block hash in `ProcessGetBlockData` and `ProcessMessage` (Lőrinc)

Pull request description:

  A follow-up to https://github.com/bitcoin/bitcoin/pull/32487#discussion_r2094072165, after which validating the hash of a read block from disk doesn't incur the cost of calculating its hash anymore.

  ### Summary

  This PR adds explicit checks that the read block header's hash matches the one we were expecting.

  ### Context

  After the previous PR, validating a block's hash during read operations became essentially free. This PR leverages that by requiring callers to provide a block's expected hash (or `std::nullopt`), preventing silent failures caused by corrupted or mismatched data. Most `ReadBlock` usages were updated with expected hashes and now fail on mismatch.

  ### Changes

  * added hash assertions in `ProcessGetBlockData` and `ProcessMessage` to validate that the block read from disk matches the expected hash;
  * updated tests and benchmark to pass the correct block hash to `ReadBlock()`, ensuring the hash validation is tested - or none if we already expect PoW failure;
  * removed the default value for `expected_hash`, requiring an explicit hash for all block reads.

  ### Why is the hash still optional (but no longer has a default value)

  * for header-error tests, where the goal is to trigger failures early in the parsing process;
  * for out-of-order orphan blocks, where the child hash isn't available before the initial disk read.

ACKs for top commit:
  maflcko:
    review ACK 9341b5333a 🕙
  achow101:
    ACK 9341b5333a
  hodlinator:
    ACK 9341b5333a
  janb84:
    re ACK 9341b5333a

Tree-SHA512: cf1d4fff4c15e3f8898ec284929cb83d7e747125d4ee759e77d369f1716728e843ef98030be32c8d608956a96ae2fbefa0e801200c333b9eefd6c086ec032e1f
2025-06-27 13:28:26 -07:00
Roman Zeyde
6ecb9fc65f chore: use std::vector<std::byte> for BlockManager::ReadRawBlock() 2025-06-13 19:19:44 +03:00
Lőrinc
9341b5333a blockstorage: make block read hash checks explicit
Dropped the default expected_hash parameter from `ReadBlock()`.

In `blockmanager_flush_block_file` tests, we pass {} since the tests would already fail at PoW validation for corrupted blocks.

In `ChainstateManager::LoadExternalBlockFile`, we pass {} when processing child blocks because their hashes aren't known beforehand.
2025-06-13 12:32:56 +02:00
Lőrinc
09ee8b7f27 node: avoid recomputing block hash in ReadBlock
Eliminate one SHA‑256 double‑hash computation of the header per block read by reusing the hash for:
* proof‑of‑work verification;
* (optional) integrity check against the supplied hash.
2025-05-26 23:23:44 +02:00
Lőrinc
056cb3c0d2 refactor: clear up blockstorage/streams in preparation for optimization
Made every OpenBlockFile#fReadOnly value explicit.

Replaced hard-coded values in ReadRawBlock with STORAGE_HEADER_BYTES.
Changed `STORAGE_HEADER_BYTES` and `UNDO_DATA_DISK_OVERHEAD` to `uint32_t` to avoid casts.

Also added `LIFETIMEBOUND` to the `AutoFile` parameter of `BufferedFile`, which stores a reference to the underlying `AutoFile`, allowing Clang to emit warnings if the referenced `AutoFile` might be destroyed while `BufferedFile` still exists.
Without this attribute, code with lifetime violations wouldn't trigger compiler warnings.

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
2025-04-14 11:57:14 +02:00
Lőrinc
a4de160492 scripted-diff: shorten BLOCK_SERIALIZATION_HEADER_SIZE constant
Renames the constant to be less verbose and better reflect its purpose:
it represents the size of the storage header that precedes serialized block data on disk,
not to be confused with a block's own header.

-BEGIN VERIFY SCRIPT-
git grep -q "STORAGE_HEADER_BYTES" $(git ls-files) && echo "Error: Target name STORAGE_HEADER_BYTES already exists in the codebase" && exit 1
sed -i 's/BLOCK_SERIALIZATION_HEADER_SIZE/STORAGE_HEADER_BYTES/g' $(git grep -l 'BLOCK_SERIALIZATION_HEADER_SIZE')
-END VERIFY SCRIPT-
2025-04-13 23:44:46 +02:00
marcofleon
3c5d1a4681 Remove checkpoints
The headers presync logic should be enough to prevent memory DoS using
low-work headers. Therefore, we no longer have any use for checkpoints.
2025-03-13 11:13:13 +00:00
Lőrinc
223081ece6 scripted-diff: rename block and undo functions for consistency
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>

-BEGIN VERIFY SCRIPT-
grep -r -wE 'WriteBlock|ReadRawBlock|ReadBlock|WriteBlockUndo|ReadBlockUndo' $(git ls-files src/ ':!src/leveldb') && \
    echo "Error: One or more target names already exist!" && exit 1
sed -i \
    -e 's/\bSaveBlockToDisk/WriteBlock/g' \
    -e 's/\bReadRawBlockFromDisk/ReadRawBlock/g' \
    -e 's/\bReadBlockFromDisk/ReadBlock/g' \
    -e 's/\bWriteUndoDataForBlock/WriteBlockUndo/g' \
    -e 's/\bUndoReadFromDisk/ReadBlockUndo/g' \
    $(git ls-files src/ ':!src/leveldb')
-END VERIFY SCRIPT-
2025-01-09 15:17:02 +01:00
Lőrinc
fa39f27a0f refactor,blocks: deduplicate block's serialized size calculations
For consistency `UNDO_DATA_DISK_OVERHEAD` was also extracted to avoid the constant's ambiguity.
Asserts were added to help with the review - they are removed in the next commit.

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
2025-01-09 15:16:28 +01:00
Lőrinc
dfb2f9d004 refactor,blocks: inline WriteBlockToDisk
Similarly, `WriteBlockToDisk` wasn't really extracting a meaningful subset of the `SaveBlockToDisk` functionality, it's tied closely to the only caller (needs the header size twice, recalculated block serializes size, returns multiple branches, mutates parameter).

The inlined code should only differ in these parts (modernization will be done in other commits):
* renamed `blockPos` to `pos` in `SaveBlockToDisk` to match the parameter name;
* changed `return false` to `return FlatFilePos()`.

Also removed remaining references to `SaveBlockToDisk`.

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
2025-01-09 13:24:53 +01:00
Lőrinc
42bc491465 refactor,blocks: inline UndoWriteToDisk
`UndoWriteToDisk` wasn't really extracting a meaningful subset of the `WriteUndoDataForBlock` functionality, it's tied closely to the only caller (needs the header size twice, recalculated undo serializes size, returns multiple branches, modifies parameter, needs documentation).

The inlined code should only differ in these parts (modernization will be done in other commits):
* renamed `_pos` to `pos` in `WriteUndoDataForBlock` to match the parameter name;
* inlined `hashBlock` parameter usage into `hasher << block.pprev->GetBlockHash()`;
* changed `return false` to `return FatalError`;
* capitalize comment.

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
2025-01-09 13:18:22 +01:00
Sjors Provoost
37946c0aaf Set notifications m_tip_block in LoadChainTip()
Ensure KernelNotifications m_tip_block is set even if no new block arrives.

Additionally, have node init always wait for this to happen.
2024-12-06 14:24:21 +07:00
TheCharlatan
a2955f0979 validation: Use span for ImportBlocks paths
Makes it friendlier for potential future users of the kernel library if
they do not store the headers in a std::vector, but can guarantee
contiguous memory.
2024-08-30 12:39:46 +02:00
MarcoFalke
fa7f7ac040 Return XOR AutoFile from BlockManager::Open*File()
This is a refactor, because the XOR key is empty.
2024-07-26 12:28:59 +02:00
TheCharlatan
7aa8994c6f refactor: Add FlatFileSeq member variables in BlockManager
Instead of constructing a new class every time a file operation is done,
construct them once for each of the undo and block file when a new
BlockManager is created.

In future, this might make it easier to introduce an abstract block
store.
2024-07-24 09:39:35 +02:00
Ryan Ofsky
8426e018bf Merge bitcoin/bitcoin#30428: log: LogError with FlatFilePos in UndoReadFromDisk
fa14e1d9d5 log: Fix __func__ in LogError in blockstorage module (MarcoFalke)
fad59a2f0f log: LogError with FlatFilePos in UndoReadFromDisk (MarcoFalke)
aaaa3323f3 refactor: Mark IsBlockPruned const (MarcoFalke)

Pull request description:

  These errors should never happen in normal operation. If they do,
  knowing the `FlatFilePos` may be useful to determine if data corruption
  happened. Also, handle the error `pos.IsNull()` as part of `OpenUndoFile`,
  because it may as well have happened due to data corruption.

  This mirrors the `LogError` behavior from `ReadBlockFromDisk`.

  Also, two other fixup commits in this module.

ACKs for top commit:
  kevkevinpal:
    ACK [fa14e1d](fa14e1d9d5)
  tdb3:
    cr and light test ACK fa14e1d9d5
  ryanofsky:
    Code review ACK fa14e1d9d5. This should make logging clearer and more consistent

Tree-SHA512: abb492a919b4796698d1de0a7874c8eae355422b992aa80dcd6b59c2de1ee0d2949f62b3cf649cd62892976fee640358f7522867ed9d48a595d6f8f4e619df50
2024-07-15 13:42:53 -04:00
MarcoFalke
aaaa3323f3 refactor: Mark IsBlockPruned const
Member fields are used read-only in this method.
2024-07-11 15:39:19 +02:00
Ava Chow
f4849f6922 Merge bitcoin/bitcoin#29668: prune, rpc: Check undo data when finding pruneheight
8789dc8f31 doc: Add note to getblockfrompeer on missing undo data (Fabian Jahr)
4a1975008b rpc: Make pruneheight also reflect undo data presence (Fabian Jahr)
96b4facc91 refactor, blockstorage: Generalize GetFirstStoredBlock (Fabian Jahr)

Pull request description:

  The function `GetFirstStoredBlock()` helps us find the first block for which we have data. So far this function only looked for a block with `BLOCK_HAVE_DATA`. However, this doesn't mean that we also have the undo data of that block, and undo data might be required for what a user would like to do with those blocks. One example of how this might happen is if some blocks were fetched using the `getblockfrompeer` RPC. Blocks fetched from a peer will have data but no undo data.

  The first commit here allows `GetFirstStoredBlock()` to check for undo data as well by passing a parameter. This alone is useful for #29553 and I would use it there.

  In the second commit I am applying the undo check to the RPCs that report `pruneheight` to the user. I find this much more intuitive because I think the user expects to be able to do all operations on blocks up until the `pruneheight` but that is not the case if undo data is missing. I personally ran into this once before and now again when testing for assumeutxo when I had used `getblockfrompeer`. The following commit adds test coverage for this change of behavior.

  The last commit adds a note in the docs of `getblockfrompeer` that undo data will not be available.

ACKs for top commit:
  achow101:
    ACK 8789dc8f31
  furszy:
    Code review ACK 8789dc8f31.
  stickies-v:
    ACK 8789dc8f31

Tree-SHA512: 90ae8bdd07a496ade579aa25240609c61c9ed173ad38d30533f6c631fe674e5a41727478ade69ca4b71a571ad94c9da4b33ebba6b5d8821109313c2de3bdfb3d
2024-07-10 15:27:05 -04:00
Fabian Jahr
96b4facc91 refactor, blockstorage: Generalize GetFirstStoredBlock
GetFirstStoredBlock is generalized to check for any data status with a
status mask that needs to be passed as a parameter. To reflect this the
function is also renamed to GetFirstBlock.

Co-authored-by: stickies-v <stickies-v@protonmail.com>
2024-06-21 15:00:16 +02:00
Ryan Ofsky
f68cba29b3 blockman: Replace m_reindexing with m_blockfiles_indexed
This is a just a mechanical change, renaming and inverting the meaning
of the indexing variable.

"m_blockfiles_indexed" is a more straightforward name for this variable
because this variable just indicates whether or not
<datadir>/blocks/blk?????.dat files have been indexed in the
<datadir>/blocks/index LevelDB database. The name "m_reindexing" was
more confusing, it could be true even if -reindex was not specified, and
false when it was specified. Also, the previous name unnecessarily
required thinking about the whole reindexing process just to understand
simple checks in validation code about whether blocks were indexed.

The motivation for this change is to follow up on previous commits,
moving away from having multiple variables called "reindex" internally,
and instead naming variables individually after what they do and
represent.
2024-06-07 19:18:46 +02:00
Ryan Ofsky
804f09dfa1 kernel: Add less confusing reindex options
Drop confusing kernel options:

  BlockManagerOpts::reindex
  ChainstateLoadOptions::reindex
  ChainstateLoadOptions::reindex_chainstate

Replacing them with more straightforward options:

  ChainstateLoadOptions::wipe_block_tree_db
  ChainstateLoadOptions::wipe_chainstate_db

Having two options called "reindex" which did slightly different things
was needlessly confusing (one option wiped the block tree database, and
the other caused block files to be rescanned). Also the previous set of
options did not allow rebuilding the block database without also
rebuilding the chainstate database, when it should be possible to do
those independently.
2024-06-07 19:17:11 +02:00
Ava Chow
058af75874 Merge bitcoin/bitcoin#29817: kernel: De-globalize fReindex
b47bd95920 kernel: De-globalize fReindex (TheCharlatan)

Pull request description:

  fReindex is one of the last remaining globals exposed by the kernel library, so move it into the blockstorage class to reduce the amount of global mutable state and make the kernel library a bit less awkward to use.

  ---

  This pull request is part of the [libbitcoinkernel project](https://github.com/bitcoin/bitcoin/issues/27587).

ACKs for top commit:
  achow101:
    ACK b47bd95920
  ryanofsky:
    Code review ACK b47bd95920. I rereviewed the whole PR, but the only change since last review was reverting the bugfix https://github.com/bitcoin/bitcoin/pull/29817#discussion_r1578327024 and make the change a pure refactoring.
  mzumsande:
    Code Review ACK b47bd95920
  stickies-v:
    ACK b47bd95920

Tree-SHA512: f7399d01f93bc0c0c7428fe95d19b9d29b4ed00a4f1deabca78fb0c4fecb434ec971e890feecb105938b5247c926850b1b7b4a4a9caa333a061e40777d0c8463
2024-05-17 15:50:56 -04:00
TheCharlatan
b47bd95920 kernel: De-globalize fReindex
fReindex is one of the last remaining globals exposed by the kernel
library, so move it into the blockstorage class to reduce the amount of
global mutable state and make the kernel library a bit less awkward to
use.
2024-05-16 11:28:46 +02:00
Martin Zumsande
17103637c6 blockstorage: Rename FindBlockPos and have it return a FlatFilePos
The new name reflects that it is no longer called with existing blocks
for which the position is already known.

Returning a FlatFilePos directly simplifies the interface.
2024-05-14 14:54:27 -04:00
Martin Zumsande
d9e477c4dc validation, blockstorage: Separate code paths for reindex and saving new blocks
By calling SaveBlockToDisk only when we actually want to save a new
block to disk. In the reindex case, we now call UpdateBlockInfo
directly from validation.

This commit doesn't change behavior.
2024-05-14 14:54:27 -04:00
Martin Zumsande
064859bbad blockstorage: split up FindBlockPos function
FindBlockPos does different things depending on whether the block is known
or not, as shown by the fact that much of the existing code is conditional on fKnown set or not.

If the block position is known (during reindex) the function only updates the block info
statistics. It doesn't actually find a block position in this case.

This commit removes fKnown and splits up these two code paths by introducing a separate function
for the reindex case when the block position is known.
It doesn't change behavior.
2024-05-14 14:54:26 -04:00
Martin Zumsande
fdae638e83 doc: Improve doc for functions involved in saving blocks to disk
In particular, document the flat file positions expected and
returned by functions better.

Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
2024-05-14 13:49:34 -04:00
MarcoFalke
fa604eb6cf refactor: Use reference instead of pointer in IsBlockPruned
This makes it harder to pass nullptr and cause issues such as
dde7ac5c70
2023-12-07 12:02:54 +01:00
Anthony Towns
bbd4646a2e blockstorage: switch from CAutoFile to AutoFile
Also bump includes per suggestions from iwyu.
2023-11-18 03:01:03 +10:00
MarcoFalke
fac36b94ef refactor: Remove CBlockFileInfo::SetNull 2023-10-20 16:29:02 +02:00
James O'Beirne
7fcd21544a blockstorage: segment normal/assumedvalid blockfiles
When using an assumedvalid (snapshot) chainstate along with a background
chainstate, we are syncing two very different regions of the chain
simultaneously. If we use the same blockfile space for both of these
syncs, wildly different height blocks will be stored alongside one
another, making pruning ineffective.

This change implements a separate blockfile cursor for the assumedvalid
chainstate when one is in use.
2023-09-30 06:40:17 -04:00
James O'Beirne
4c3b8ca35c validation: populate nChainTx value for assumedvalid chainstates
Use the expected AssumeutxoData in order to bootstrap nChainTx values
for assumedvalid blockindex entries in the snapshot chainstate. This
is necessary because nChainTx is normally built up from nTx values,
which are populated using blockdata which the snapshot chainstate
does not yet have.
2023-09-30 06:40:17 -04:00