CCoinsViewCache::BatchWrite always returns true if called from a backed
cache, so just return void instead. Also return void from ::Sync and
::Flush.
This allows for dropping a FatalError condition and simplifying some
dead error handling code a bit.
Since we now no longer exercise the "error path" when returning from
`CCoinsView::BatchWrite`, make the method clear the cache instead. This
should only be exercised by tests and not change production behaviour.
This might slightly improve the coins_view fuzz test's ability to
generate better coverage.
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
0ac969cddf validation: don't reallocate cache for short-lived CCoinsViewCache (Lőrinc)
c8f5e446dc coins: reduce lookups in dbcache layer propagation (Lőrinc)
Pull request description:
This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)
### Summary
Previously, when the parent coins cache had no entry and the child did, `BatchWrite` performed a find followed by `try_emplace`, which resulted in multiple `SipHash` computations and bucket traversals on the common insert path.
On a different path, these caches were recreated needlessly for every block connection.
### Fix for double fetch
This change uses a single leading `try_emplace` and branches on the returned `inserted` flag. In the `FRESH && SPENT` case (not used in production, only exercised by tests), we erase the just-inserted placeholder (which is constant time with no rehash anyway). Semantics are unchanged for all valid parent/child state combinations.
This change is a minimal version of [bitcoin/bitcoin@`723c49b` (#32128)](723c49b63b) and draws simplification ideas [bitcoin/bitcoin@`ae76ec7` (#30673)](ae76ec7bcf) and https://github.com/bitcoin/bitcoin/pull/30326.
### Fix for temporary cache recreation
Related to parent cache propagation, the second commit makes it possible to avoid destructuring-recreating-destructuring of these short-live parent caches created for each new block.
A few temporary `CCoinsViewCache`'s are destructed right after the `Flush()`, therefore it is not necessary to call `ReallocateCache` to recreate them right before they're killed anyway.
This change was based on a subset of https://github.com/bitcoin/bitcoin/pull/28945, the original authors and relevant commenters were added as coauthors to this version.
-----
Reindex-chainstate indicates ~1% speedup.
<details>
<summary>Details</summary>
```python
COMMITS="647cdb4f7e8041affed887e2325ee03a91078bb1 0b0c3293ffd75afb27dadc0b28426b40132a8c6b"; \
STOP=909090; DBCACHE=4500; \
CC=gcc; CXX=g++; \
BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
(echo ""; for c in $COMMITS; do git fetch -q origin $c && git log -1 --pretty='%h %s' $c || exit 1; done; echo "") && \
hyperfine \
--sort command \
--runs 2 \
--export-json "$BASE_DIR/rdx-$(sed -E 's/(\w{8})\w+ ?/\1-/g;s/-$//'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
--parameter-list COMMIT ${COMMITS// /,} \
--prepare "killall bitcoind 2>/dev/null; rm -f $DATA_DIR/debug.log; git checkout {COMMIT}; git clean -fxd; git reset --hard && \
cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DENABLE_IPC=OFF && ninja -C build bitcoind && \
./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20" \
--cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
"COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0"
647cdb4f7e Merge bitcoin/bitcoin#33311: net: Quiet down logging when router doesn't support natpmp/pcp
0b0c3293ff validation: don't reallocate cache for short-lived CCoinsViewCache
Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4f7e)
Time (mean ± σ): 16233.508 s ± 9.501 s [User: 19064.578 s, System: 951.672 s]
Range (min … max): 16226.790 s … 16240.226 s 2 runs
Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b)
Time (mean ± σ): 16039.626 s ± 17.284 s [User: 18870.130 s, System: 950.722 s]
Range (min … max): 16027.405 s … 16051.848 s 2 runs
Relative speed comparison
1.01 ± 0.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4f7e)
1.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b)
```
</details>
ACKs for top commit:
optout21:
utACK 0ac969cddf
achow101:
ACK 0ac969cddf
andrewtoth:
utACK 0ac969cddf
sedited:
ACK 0ac969cddf
Tree-SHA512: 9fcc3f1a8314368576a4fba96ca72665527eaa3a97964ab5b39491757f3527147d134f79a5c3456f76c1330c7ef862989d23f764236c5e2563be89a81c1cee47
A few temporary `CCoinsViewCache`'s are destructed right after the `Flush()`, therefore it is not necessary to call `ReallocateCache` to recreate them right before they're killed anyway.
* `Flush()` - retains existing functionality;
* `Flush(/*will_reuse_cache=*/false)` - skips destruction and reallocation of the parent cache since it will soon go out of scope anyway;
For the `will_reuse_cache` parameter we want to see exactly which ones will reallocate memory and which won't - since both can be valid usages.
This change was based on a subset of https://github.com/bitcoin/bitcoin/pull/28945.
Co-authored-by: Martin Ankerl <martin.ankerl@gmail.com>
`EmplaceCoinInternalDANGER()` incremented `cachedCoinsUsage` even when `try_emplace` did not insert (duplicate key), inflating the counter.
This is mostly reachable in tests today since `AssumeUTXO` does not overwrite.
Increment only on successful insert, and capture `coin.DynamicMemoryUsage()` before the move so accounting uses the correct value.
Fuzz: add an `EmplaceCoinInternalDANGER` path to exercise insert-only accounting.
Unit test: emplace two different coins at the same outpoint (with different `DynamicMemoryUsage()`), verify `SelfTest()` passes and `AccessCoin(outpoint)` returns the first coin.
Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
Co-authored-by: w0xlt <woltx@protonmail.com>
Move the `cachedCoinsUsage` subtract in `AddCoin()` to after the `possible_overwrite` check.
Previously a throw before assignment decremented the counter without changing the entry, which corrupted accounting and later underflowed.
In `Flush()`, reset `cachedCoinsUsage` to `0` only when `BatchWrite()` succeeds and `cacheCoins` is actually cleared. In production `BatchWrite()` returns `true`, so this mostly affects tests. On failure, leave the counter unchanged to keep it in sync with the cache.
The existing `Flush()` workaround in fuzzing was also removed now that the source of the problem was fixed, so the fuzzer no longer needs `coins_view_cache.Flush()` to realign `cachedCoinsUsage` after an exception.
Replace the prior `expected_code_path` tracking with direct assertions. The role of the variable was to verify that code execution follows only expected paths, either successful addition, or if it's an exception, the message is verified and checked that overwrite was disallowed.
With these changes the counter stays consistent across success and exception paths, so we can finally remove the `UBSan` suppressions for `CCoinsViewCache` that were masking the issue.
Included a unit test as well, attempting to add a different coin to the same outpoint without allowing overwrites and make sure it throws.
We use `SelfTest()` to validates accounting, and check that the cache remains usable.
Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
Co-authored-by: w0xlt <woltx@protonmail.com>
When a coin is spent via `SpendCoin()`, `cachedCoinsUsage` is already decremented and the coin's `scriptPubKey` is cleared, so `DynamicMemoryUsage()` is `0`.
`CoinsViewCacheCursor::NextAndMaybeErase()` was subtracting usage again when erasing spent entries.
Replace it with an assert that documents spent coins have zero dynamic memory usage by the time the cursor encounters them.
Remove the now-unnecessary `usage` reference from the cursor's constructor and member variables.
`using script_verify_flags = uint32_t` allows implicit conversion to
and from int, so replace it with a class to have the compiler ensure we
use the correct type. Provide from_int and as_int to allow for explicit
conversions when desired.
Introduces the type `script_verify_flag_name` for the individual flag
name enumeration.
It reuses the logic from the `coins_view` target, except it uses an
in-memory CCoinsViewDB as the backend.
Note `CCoinsViewDB` will assert the best block hash is never null, so we
slightly modify the coins_view fuzz logic to take care of this.
CCoinsCacheEntry provided general access to its internal flags state, even though in reality it could only be clean, fresh, dirty or fresh|dirty.
After it got dirtied we couldn't set the state back to clean by AddFlags(0) - tests were explicitly checking against that.
This commit cleans up the public interface to make this distinction cleaner and invalid behavior impossible instead of just checked by tests.
This includes the removal of redundant `inline` qualifiers (we're inside a struct).
Also renamed `self` to `pair` to simplify the upcoming commits.
Also modernized `EmplaceCoinInternalDANGER` since it was already modified.
Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
It is incorrect to assert that `cache.HaveCoin()` will always be `true`
if `backend.HaveCoin()` is. The coin could well have been marked as
spent in the cache but not yet flushed, in which case `cache.HaveCoin()`
would return `false`.
Note this was never hit because `exists_using_have_coin_in_backend` is
currently never `true` (it's the default implementation of `CCoinsView`.
However this might change if we were to add a target where the backend
is a `CCoinsViewDB`.
This is a follow-up to previous commits moving the chain constants out
of chainparamsbase.
The script removes the chainparamsbase header in all files where it is
included, but not used. This is done by filtering against all defined
symbols of the header as well as its respective .cpp file.
The kernel chainparams now no longer relies on chainparamsbase.
-BEGIN VERIFY SCRIPT-
sed -i '/#include <chainparamsbase.h>/d' $( git grep -l 'chainparamsbase.h' | xargs grep -L 'CBaseChainParams\|CreateBaseChainParams\|SetupChainParamsBaseOptions\|BaseParams\|SelectBaseParams\|chainparamsbase.cpp' )
-END VERIFY SCRIPT-
In my benchmarks, using this pool allocator for CCoinsMap gives about
20% faster `-reindex-chainstate` with -dbcache=5000 with practically the
same memory usage. The change in max RSS changed was 0.3%.
The `validation_flush_tests` tests need to be updated because
memory allocation is now done in large pools instead of one node at a
time, so the limits need to be updated accordingly.
In the GetUTXOStats fuzz case, GetUTXOStats is always called with a
CCoinsViewCache. Which is guaranteed to throw a std::logic_error when
its ::Cursor() method is called on the first line of GetUTXOStats.
In the fuzz case, we basically catch this logic error and declare
victory if we caught it.
There is no point to fuzzing this deterministic logic.
Confirmed with IWYU that the node/coinstats.h #include is no longer
necessary.
BlockManager is a large data structure, and cs_main is not required to
take its address or access every part of it. Individual BlockManager
fields and methods which do require cs_main like m_block_index and
LookupBlockIndex are already annotated separately, and these other
annotations describe locking requirements more accurately and do a
better job enforcing thread safety.
Since cs_main is not needed to access the address of the m_block object,
this commit drops cs_main LOCK calls which were added pointlessly to
satisfy this annotation in the past.
Co-authored-by: Carl Dong <contact@carldong.me>
Blindly chose a cap of 10000 iterations for every loop, except for
the two in script_ops.cpp and scriptnum_ops.cpp which appeared to
(sometimes) be deserializing individual bytes; capped those to one
million to ensure that sometimes we try working with massive scripts.
There was also one fuzzer-controlled loop in timedata.cpp which was
already capped, so I left that alone.
git grep 'while (fuzz' should now run clean except for timedata.cpp
fa621ededd refactor: Pass script verify flags as uint32_t (MarcoFalke)
Pull request description:
The flags are cast to unsigned in the interpreter anyway, so avoid the confusion (and fuzz crashes) by just passing them as unsigned from the beginning.
Also, the flags are often inverted bit-wise with the `~` operator, which also works on signed integers, but might cause confusion as the sign bit is flipped.
Fixes#22233
ACKs for top commit:
theStack:
Concept and code review ACK fa621ededd
kristapsk:
ACK fa621ededd
jonatack:
ACK fa621ededd
Tree-SHA512: ea0720f32f823fa7f075309978672aa39773c6019d12b6c1c9d611fc1983a76115b7fe2a28d50814673bb6415c311ccc05b99d6e871575fb6900faf75ed17769
-BEGIN VERIFY SCRIPT-
# Rename
sed -i -e 's/MakeFuzzingContext/MakeNoLogFileContext/g' $(git grep -l MakeFuzzingContext)
# Bump the copyright of touched files in this scripted diff to avoid touching them again later
./contrib/devtools/copyright_header.py update ./src/test/fuzz/
-END VERIFY SCRIPT-