This should avoid having to include interfaces/chain.h from a kernel
module. interfaces/chain.h in turn includes a bunch of non-kernel
headers that break the desired library topology and might introduce
entanglement regressions.
Specifically, this gets rid of batchpriority, chainparams, script/sign.h
and system includes.
Also take the opportunity to clean up the headers for the affected
files and add them to the iwyu-enforced set.
fa336053aa Move ci_exec to the Python script (MarcoFalke)
fa83555d16 ci: Require rsync to pass (MarcoFalke)
eeee02ea53 ci: Untangle CI_EXEC bash function (MarcoFalke)
fa21fd1dc2 ci: Move macos snippet under DANGER_RUN_CI_ON_HOST (MarcoFalke)
fa37559ac5 ci: Document the retry script in PATH (MarcoFalke)
666675e95f ci: Move folder creation and docker kill to Python script (MarcoFalke)
Pull request description:
The remaining `ci/test/02_run_container.sh` is fine, but has a bunch of shellcheck SC2086 word splitting violations.
This is fine currently, because the only place that needed them had additional escaping, and all other commands happened to split fine on spaces.
However, this may change in the future. So fix it now, by rewriting it in Python, which is recommended in the dev notes.
ACKs for top commit:
frankomosh:
Code Review ACK [fa33605](fa336053aa)
m3dwards:
ACK fa336053aa
Tree-SHA512: 472decb13edca75566dffe49b9b3f554ab977fa60ec7902d5a060fe53381aee8606a10ff0c990a62ee2454dc6d9430cc064f58320b9043070b7bf08845413bf4
fad6118586 test: Fix "typo" in written invalid content (MarcoFalke)
fab085c15f contrib: Use text=True in subprocess over manual encoding handling (MarcoFalke)
fa71c15f86 scripted-diff: Bump copyright headers after encoding changes (MarcoFalke)
fae612424b contrib: Remove confusing and redundant encoding from IO (MarcoFalke)
fa7d72bd1b lint: Drop check to enforce encoding to be specified in Python scripts (MarcoFalke)
faf39d8539 test: Clarify that Python UTF-8 mode is the default today for most systems (MarcoFalke)
fa83e3a81d lint: Do not allow locale dependent shell scripts (MarcoFalke)
Pull request description:
Historically, there was an attempt via `test/lint/lint-python-utf8-encoding.py` to enforce explicit UTF8 in every Python IO statement (`open`, `subprocess`, ...). However, the lint check has many problems:
* The check is incomplete and many IO statements lack the explicit UTF8 specification.
* It was added at a time when some systems were not UTF8 by default.
* The check is brittle, as it depends on a fragile regex.
In theory, now that the minimum Python version is 3.10 (since commit 2123c94448), the check could be replaced by `PYTHONWARNDEFAULTENCODING=1` from https://docs.python.org/3/whatsnew/3.10.html#optional-encodingwarning-and-encoding-locale-option. However, this comes with many other problems:
* All our Python scripts already assume and require UTF8 to be set externally. On almost all modern systems, this is already the default. Some Windows versions do not have UTF8 by default and require `PYTHONUTF8=1` to be set for the tests to run already today (with or without the changes in this pull). Also, the CI and many other Bash scripts force UTF8 via `LC_ALL`. Finally, Python 3.15 will likely enable UTF8 on *all* systems by default, per https://peps.python.org/pep-0686/#abstract.
* So adding UTF8 to every single IO call is redundant, verbose, and confusing, given that it is the expected default.
So fix all issues, by:
* Removing the `test/lint/lint-python-utf8-encoding.py` check.
* Removing the encoding on the individual IO calls.
* Clarifying the existing docs around the existing UTF8 requirement and assumption.
Obviously, every IO call is still free to specify UTF8 or any other encoding explicitly, if there is a documented need for it in the future.
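To see which defaults the surrounding environment actually provides, a script can inspect the running interpreter. The following is a minimal sketch and not part of this pull; the helper name `describe_text_io_defaults` is hypothetical.

```python
import locale
import sys


def describe_text_io_defaults() -> dict:
    """Report the encoding-related defaults of the running interpreter."""
    return {
        # 1 when UTF-8 mode is on (e.g. via PYTHONUTF8=1 or -X utf8), else 0.
        "utf8_mode": sys.flags.utf8_mode,
        # Encoding that text-mode IO falls back to without encoding=.
        "preferred": locale.getpreferredencoding(False),
        # Encoding of sys.stdout, if it is a text stream.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for key, value in describe_text_io_defaults().items():
        print(f"{key}: {value}")
```

Running it once with and once without `PYTHONUTF8=1` shows whether a given system already provides the UTF8 default that the scripts assume.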
ACKs for top commit:
theStack:
re-ACK fad6118586
laanwj:
Re-ACK fad6118586
Tree-SHA512: 78025ea3508597d2299490347614f0ee3e4c66e3ba559ff50e498045a9c8bbd92f3a5ced18719d8fcebbd1e47bdbb56a0c85a5b73b425adb0ea4f02fe69c3149
2e27bd9c3a ci: Add Windows + UCRT jobs for cross-compiling and native testing (Hennadii Stepanov)
bd130db994 ci: Rename items specific to Windows + MSVCRT (Hennadii Stepanov)
Pull request description:
This PR is part of the ongoing effort to migrate to the modern UCRT runtime for cross-compiled Windows binaries, including release builds.
For more details about this migration, see:
- https://github.com/bitcoin/bitcoin/issues/30210
- https://github.com/bitcoin/bitcoin/pull/33593
MSVCRT-related CI jobs should be removed from the CI framework once the migration to UCRT is complete.
ACKs for top commit:
maflcko:
review ACK 2e27bd9c3a 🖊
fanquake:
ACK 2e27bd9c3a
Tree-SHA512: 222ca5e54646bcce9db6e20191d5891e988274e18b2f30085de6435a3b288a9d0fc414e8f76342e275ae58ee6603f751933d1faa8bdff446edf2695091f8ca4c
Starting with Python 3.11, Python's gzip might delegate to zlib.
Depending on the OS, e.g. Ubuntu vs Fedora, the underlying zlib
implementation might differ, resulting in different output.
For now, or until a better solution exists, disable compression. This
results in the SDK increasing in size to ~157 MB, which is not
unreasonable to regain determinism (and would be significantly worse
without the previous commit).
See: https://docs.python.org/3/library/gzip.html#gzip.compress
Co-authored-by: stickies-v <stickies-v@protonmail.com>
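As an illustration of the idea (not necessarily the exact call used in this commit), compression can be turned off in Python's gzip module by passing compresslevel=0. The helper name `gzip_store` is hypothetical, and setting mtime=0 is an additional assumption to keep the current time out of the header.

```python
import gzip


def gzip_store(data: bytes) -> bytes:
    """Wrap data in a gzip container without compressing it."""
    # compresslevel=0 asks for no compression; mtime=0 keeps the
    # current time out of the gzip header.
    return gzip.compress(data, compresslevel=0, mtime=0)


if __name__ == "__main__":
    payload = b"example payload " * 64
    first = gzip_store(payload)
    second = gzip_store(payload)
    # Same input, same output within one environment.
    assert first == second
    assert gzip.decompress(first) == payload
    # The stored output is slightly larger than the input.
    print(len(payload), len(first))
```

Whether the output is byte-identical across different zlib implementations still has to be checked on the target systems; the sketch only shows how to request no compression.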
The encoding arg is confusing, because it is not applied consistently
for all IO.
Also, it is useless, as the majority of files are ASCII encoded, which
are fine to encode and decode with any mode.
Moreover, UTF-8 is already required for most scripts to work properly,
so setting the encoding twice is redundant.
So remove the encoding from most IO. It would be fine to remove from all
IO, however I kept it for two files:
* contrib/asmap/asmap-tool.py: This specifically looks for utf-8
encoding errors, so it makes sense to specify the utf-8 encoding
explicitly.
* test/functional/test_framework/test_node.py: Reading the debug log in
text mode specifically counts the utf-8 characters (not bytes), so it
makes sense to specify the utf-8 encoding explicitly.
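The difference between counting characters and counting bytes can be shown with a small sketch. The helper name `count_chars_and_bytes` is hypothetical and only serves as an illustration.

```python
def count_chars_and_bytes(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


if __name__ == "__main__":
    # ASCII: one byte per character, so both counts agree.
    print(count_chars_and_bytes("abc"))
    # Non-ASCII: U+00E9 is one character, but two bytes in UTF-8.
    print(count_chars_and_bytes("\u00e9"))
```

For pure ASCII content both numbers agree, which is why the explicit encoding only matters where non-ASCII content is expected.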
The Bash script was acceptable, but CI_EXEC_CMD_PREFIX was a single
string, relying on brittle word splitting that the shellcheck SC2086
would warn about.
So just fix that by moving everything to the Python script and deleting
the Bash script.
This also removes the need to export the CI_CONTAINER_ID env var.
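The difference between a single command string and an argument list can be sketched as follows. All names and paths here are hypothetical and are not taken from the CI scripts.

```python
import shlex


def build_command(prefix: list[str], args: list[str]) -> list[str]:
    """Join a command prefix and its arguments without re-splitting either."""
    return [*prefix, *args]


if __name__ == "__main__":
    # A single string splits on every space, including those inside the path.
    as_string = "env PATH=/tmp/dir with space:/usr/bin"
    print(as_string.split())
    # A list keeps each argument intact, whatever characters it contains.
    prefix = ["env", "PATH=/tmp/dir with space:/usr/bin"]
    cmd = build_command(prefix, ["ls", "/some dir"])
    print(cmd)
    # shlex.join shows how the list would have to be quoted in a shell.
    print(shlex.join(cmd))
```

Such a list can be passed to `subprocess.run` directly, so no shell and no word splitting is involved.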
In theory one could run the CI without the rsync package installed, and
with DANGER_RUN_CI_ON_HOST=1. However, this seems to be an edge case.
Simply requiring rsync to be installed is less code and avoids brittle
edge cases around rsync failures.
It contains a large `bash -c` string, which is hard to parse. So pull
out components:
* CI_EXEC is only called with absolute folders as args, so the `cd` is
not needed in CI_EXEC. It is only needed to specify the working dir of
running the tests in 03_test_script.sh, so move it there.
* The PATH modification is only needed after commit
4756114e50 to check that depends does
work properly, even when the PATH contains a space.
* This also allows dropping the `bash -c` and using the proper and safer
"$@" to forward args without the risk of word splitting.
This move-only refactor clarifies that macos assumes and requires
DANGER_RUN_CI_ON_HOST.
So move the snippet under the condition for self-documenting code.
Can be reviewed with the git options:
--color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
The `retry` script is required for CI_RETRY_EXE and there are two ways
to put it into PATH:
* When running in a container engine, by copying it into /usr/bin
* When running without a container engine, by prepending its location to PATH
The option was fine, but now that there is a dedicated Alpine Linux
task, which uses BusyBox, it seems redundant.
(See: ci/test/00_setup_env_native_alpine_musl.sh)
So remove the USE_BUSY_BOX option, along with the BINS_SCRATCH_DIR env
var.
Also, enable pipefail in the ci/test/00_setup_env.sh script, while
touching it.
55555db055 doc: Add missing --platform=linux to docker build command (MarcoFalke)
fa0ce4c148 ci: Re-enable LINT_CI_SANITY_CHECK_COMMIT_SIG (MarcoFalke)
faa0973de2 ci: [refactor] Rename CIRRUS_PR env var to LINT_CI_IS_PR (MarcoFalke)
fa1dacaebe ci: Move lint exec snippet to stand-alone py file (MarcoFalke)
Pull request description:
The sanity check to check the last few merge commit signatures on the main branch was accidentally and silently disabled while moving from the `cirrus-ci.com` platform to the GHA platform.
So fix that by re-enabling it.
Also, this contains a few other lint cleanup commits.
ACKs for top commit:
janb84:
re ACK 55555db055
willcl-ark:
ACK 55555db055
Tree-SHA512: e623dc88035ee4d1c6a8efa5fad33c35cface87f54e78c7ebfe5d468d28d8d8097150344d276f90f8ed52a89e61609ce95380476ea0151b50f73ad5919233933
552eb90071 doc: CI - Describe qemu-user-static usage (Hodlinator)
2afbbddee5 doc: CI - Clarify how important `env -i` is and why (Hodlinator)
Pull request description:
Should at least partially fix #31199
ACKs for top commit:
maflcko:
lgtm ACK 552eb90071
janb84:
ACK 552eb90071
Tree-SHA512: 45807a61d805646384c8162501f432537b7e655aa01434766ffb90ea47da9532387a76fcccac7fe208ad77f4ea5573f60b9be09e1235b9493eaa8795e1d7fbdd
fae83611b8 ci: [refactor] Use --preset=dev-mode in mac_native task (MarcoFalke)
fadb67b4b4 ci: [refactor] Base nowallet task on --preset=dev-mode (MarcoFalke)
6666980e86 ci: Enable bitcoin-chainstate and test_bitcoin-qt in win64 task (MarcoFalke)
faff7b2312 ci: Enable experimental kernel stuff in i686 task (MarcoFalke)
fa1632eecf ci: Enable experimental kernel stuff in mac-cross tasks (MarcoFalke)
fad10ff7c9 ci: Enable experimental kernel stuff in armhf task (MarcoFalke)
fa9d67c13d ci: Enable experimental kernel stuff in Alpine task (MarcoFalke)
fab3fb8302 ci: Enable experimental kernel stuff in s390x task (MarcoFalke)
fa7da8a646 ci: Enable experimental kernel stuff in valgrind task (MarcoFalke)
fa9c2973d6 ci: Enable experimental kernel stuff in TSan task (MarcoFalke)
fad30d4395 ci: Enable experimental kernel stuff in MSan task (MarcoFalke)
Pull request description:
Most of the CI tasks have a long list of stuff that they enable. This makes it hard to see what each CI task is actually running.
Also, most of the CI tasks should probably mimic the `dev-mode` CMake preset and run on as much stuff as possible. Usually, changing the `dev-mode` comes with changing those CI tasks as well in the same commit, which is verbose.
Fix both issues, by basing most CI tasks on the `dev-mode`. In the future, this makes it easier to change the `dev-mode` in a single place. If CI tasks explicitly disable something, it will be listed explicitly in them.
As a side-effect this will enable the kernel stuff for some CI tasks that did not have it enabled, which seems desirable.
ACKs for top commit:
TheCharlatan:
Nice, ACK fae83611b8
janb84:
ACK fae83611b8
hebasto:
ACK fae83611b8, I have reviewed the code and it looks OK.
Tree-SHA512: 58d9d553437b57362e9ec0766bd202482435f263d3f4c6ee7020c5e1e5ba69f8c064630423424f9d754254a66981e670b964a5aee58ef87f30b7d775642255be
With the move from cirrus-ci to GHA, the CIRRUS_REPO_FULL_NAME env var
was always unset, never triggering the sanity check.
Fix this by introducing a new vendor-agnostic env var and setting it
properly.
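The failure mode can be sketched as follows. The compared values ("bitcoin/bitcoin" and "1") and both function names are assumptions for illustration; only the variable names are taken from the commits.

```python
def old_gate(env: dict[str, str]) -> bool:
    """Gate on a provider-specific variable (comparison value is assumed)."""
    return env.get("CIRRUS_REPO_FULL_NAME") == "bitcoin/bitcoin"


def new_gate(env: dict[str, str]) -> bool:
    """Gate on a vendor-agnostic variable (the value "1" is assumed)."""
    return env.get("LINT_CI_SANITY_CHECK_COMMIT_SIG") == "1"


if __name__ == "__main__":
    # On the new platform the old variable is simply unset, so the old
    # gate is False and the check is skipped without any error.
    gha_env = {"GITHUB_REPOSITORY": "bitcoin/bitcoin"}
    print(old_gate(gha_env))
    # The new variable has to be set explicitly by the CI config.
    gha_env["LINT_CI_SANITY_CHECK_COMMIT_SIG"] = "1"
    print(new_gate(gha_env))
```

Because an unset variable compares unequal instead of raising an error, the check was disabled silently.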
The CIRRUS_PR env var was cirrus-specific and using a provider-agnostic
name makes more sense.
Also, enable pipefail, while touching this file.
This refactor is needed for the next commit.
a3ac59a431 ci: Enable experimental kernel stuff in ASan task (MarcoFalke)
5b89956eeb kernel: Allow null arguments for serialized data (TheCharlatan)
Pull request description:
An empty span constructed from an empty vector may have a null data pointer depending on the implementation. Remove the BITCOINKERNEL_ARG_NONNULL requirement for these arguments and instead handle such null arguments in the implementation.
Also cherry-picked from #33845 to show that the CI task is passing now.
ACKs for top commit:
yuvicc:
Code review ACK a3ac59a431
maflcko:
review ACK a3ac59a431🥈
laanwj:
code review ACK a3ac59a431
Tree-SHA512: 629e463796f2f057df5be8e8981a45751c578ed0021be731c1d57fe849a539fe38b0a445914b0fc48f32f0408ad6d566984bd7f3a68797fcfdf1c6889e316a08
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
test_bitcoin-qt ..................... ON
IPC and USDT remain explicitly disabled.
40dcbf580d build: add -Wtrailing-whitespace=any (fanquake)
d7659cd7e6 build: add -Wleading-whitespace=spaces (fanquake)
d86650220a cmake: Disable `-Wtrailing-whitespace` warnings for RCC-generated files (Hennadii Stepanov)
aabc5ca6ed cmake: Switch from AUTORCC to `qt6_add_resources` (Hennadii Stepanov)
25ae14c339 subprocess: replace tab with space (fanquake)
0c2b9dadd5 scripted-diff: remove whitespace in sha256_sse4.cpp (fanquake)
4da084fbc9 scripted-diff: change whitespace to spaces in univalue (fanquake)
e6caf150b3 ci: add moreutils to lint job (fanquake)
Pull request description:
GCC 15 now has options to turn leading & trailing whitespace into compile failures: https://gcc.gnu.org/gcc-15/changes.html#c-family. Fix the few cases of leading tabs, and trailing whitespace, and then enable `-Wleading-whitespace` and `-Wtrailing-whitespace`.
We currently get PRs that are opened with various whitespace issues, e.g. #33822, so turning that into a compile-time failure, where possible, seems useful to avoid a CI roundtrip.
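To find such cases before compiling, a rough approximation of the two checks can be scripted. This sketch is not GCC's implementation and the exact rules of the warnings may differ; the helper name `find_whitespace_issues` is hypothetical.

```python
def find_whitespace_issues(source: str) -> list[tuple[int, str]]:
    """Return (line number, issue) pairs for a source text."""
    issues = []
    for number, line in enumerate(source.split("\n"), start=1):
        # Leading whitespace is everything before the first non-blank char.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            issues.append((number, "tab in leading whitespace"))
        if line != line.rstrip(" \t"):
            issues.append((number, "trailing whitespace"))
    return issues


if __name__ == "__main__":
    sample = "int a;\n\tint b;\nint c; \n"
    for number, issue in find_whitespace_issues(sample):
        print(f"line {number}: {issue}")
```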
ACKs for top commit:
ajtowns:
utACK 40dcbf580d
hebasto:
re-ACK 40dcbf580d.
Tree-SHA512: a128001ab2abb41cd6d249dcf46be4167ebd608d6b0f1452212a3ec9a383747bea623ab0382ec7bc0ac7a232a47cca5174e1cd73d4eda6751aa3cb2365ad2ede
fa9f29a4a7 doc: Recommend latest Debian stable or Ubuntu LTS (MarcoFalke)
fa1711ee0d doc: Add GCC-12 min release notes (MarcoFalke)
faa8be75c9 ci: Enable experimental kernel stuff in G++-12 task (previous releases) (MarcoFalke)
fabce97b30 test: Remove gccbug_90348 test case (MarcoFalke)
fa3854e432 test: Remove unused fs::create_directories test (MarcoFalke)
fa9dacdbde util: [refactor] Remove unused create_directories workaround (MarcoFalke)
fa807f78ae build: Bump g++ minimum supported version to 12 (MarcoFalke)
Pull request description:
All supported operating systems that previously came with at least g++-11, also come with at least g++-12, so bumping the minimum should be fine.
For reference:
* https://packages.ubuntu.com/jammy/g++-12
* https://packages.ubuntu.com/noble/g++ (g++-13)
* https://packages.debian.org/bookworm/g++ (g++-12)
* FreeBSD Ports ship a recent GCC
* RHEL-based 8 and 9 ship with g++-14 via appstream (`dnf install gcc-toolset-14` -> `/opt/rh/gcc-toolset-14/`)
* RHEL-based 10 ships with g++ (14 by default)
* OpenSuse Leap and Tumbleweed ship with g++ 15 https://software.opensuse.org/package/gcc15-c++
Obviously, downloading pre-compiled releases or compiling previous release branches is unaffected by this change.
ACKs for top commit:
janb84:
re-ACK fa9f29a4a7
TheCharlatan:
Re-ACK fa9f29a4a7
hebasto:
ACK fa9f29a4a7.
Tree-SHA512: ce14ecf78ccfe4f221dcbc9147dcfc00c0512b23a6fcda5ba71b62b4f5d39a5139f083d035113f189bfbd396d485e1ebc626a9a16b6fa0b74fd95aed2041c841
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
IPC remains explicitly disabled.
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
USDT remains explicitly disabled.
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
The GUI and USDT remain disabled explicitly.
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
The GUI remains disabled explicitly.
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
The GUI remains disabled explicitly.
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
Also, shorten the name, for a less cluttered web view.
Base the task on --preset=dev-mode to ensure maximal coverage and add
the following:
bitcoin-chainstate (experimental) ... ON
libbitcoinkernel (experimental) ..... ON
kernel-test (experimental) .......... ON
It does not make sense to use a pointer, when a reference is more
appropriate, especially given that nullptr has been ruled out.
This also allows removing the CI workaround to avoid warnings:
```
C++ compiler .......................... GNU 13.0.0, /bin/x86_64-w64-mingw32-g++-posix
...
/ci_container_base/src/test/blockmanager_tests.cpp: In member function ‘void blockmanager_tests::blockmanager_scan_unlink_already_pruned_files::test_method()’:
/ci_container_base/src/test/blockmanager_tests.cpp:63:17: error: possibly dangling reference to a temporary [-Werror=dangling-reference]
63 | const auto& chainman = Assert(m_node.chainman);
| ^~~~~~~~
In file included from /ci_container_base/src/streams.h:13,
from /ci_container_base/src/dbwrapper.h:11,
from /ci_container_base/src/node/blockstorage.h:10,
from /ci_container_base/src/test/blockmanager_tests.cpp:8:
/ci_container_base/src/util/check.h:116:49: note: the temporary was destroyed at the end of the full expression ‘inline_assertion_check<true, std::unique_ptr<ChainstateManager>&>(((blockmanager_tests::blockmanager_scan_unlink_already_pruned_files*)this)->blockmanager_tests::blockmanager_scan_unlink_already_pruned_files::<anonymous>.TestChain100Setup::<anonymous>.TestingSetup::<anonymous>.ChainTestingSetup::<anonymous>.BasicTestingSetup::m_node.node::NodeContext::chainman, std::source_location{(& *.Lsrc_loc27)}, std::basic_string_view<char>(((const char*)"m_node.chainman")))’
116 | #define Assert(val) inline_assertion_check<true>(val, std::source_location::current(), #val)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/ci_container_base/src/test/blockmanager_tests.cpp:63:28: note: in expansion of macro ‘Assert’
63 | const auto& chainman = Assert(m_node.chainman);
| ^~~~~~
cc1plus: all warnings being treated as errors
gmake[2]: Leaving directory '/ci_container_base/ci/scratch/build-x86_64-w64-mingw32'
gmake[2]: *** [src/test/CMakeFiles/test_bitcoin.dir/build.make:382: src/test/CMakeFiles/test_bitcoin.dir/blockmanager_tests.cpp.obj] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:1810: src/test/CMakeFiles/test_bitcoin.dir/all] Error 2
gmake[1]: Leaving directory '/ci_container_base/ci/scratch/build-x86_64-w64-mingw32'
gmake: *** [Makefile:146: all] Error 2
```
This false-positive warning is also fixed in later GCC versions.
See also https://godbolt.org/z/fjc6be65M
fad6efd3be refactor: Use STR_INTERNAL_BUG macro where possible (MarcoFalke)
fada379589 doc: Remove unused bugprone-lambda-function-name suppression (MarcoFalke)
fae1d99651 refactor: Use const reference to std::source_location (MarcoFalke)
fa5fbcd615 util: Allow Assert() in contexts without __func__ (MarcoFalke)
Pull request description:
Without this, compile warnings could be hit about `__func__` being only valid inside functions.
```
warning: predefined identifier is only valid inside function [-Wpredefined-identifier-outside-function] note: expanded from macro Assert
115 | #define Assert(val) inline_assertion_check<true>(val, __FILE__, __LINE__, __func__, #val)
| ^
```
Ref https://github.com/bitcoin/bitcoin/pull/32740#discussion_r2486258473
This also introduces a slight behaviour change, because `std::source_location::function_name` usually includes the entire function signature instead of just the name.
ACKs for top commit:
l0rinc:
Code review ACK fad6efd3be
stickies-v:
ACK fad6efd3be
hodlinator:
re-ACK fad6efd3be
Tree-SHA512: e78a2d812d5ae22e45c93db1661dafbcd22ef209b3d8d8d5f2ac514e92fd19a17c3f0a5db2ef5e7748aa2083b10c0465326eb36812e6a80e238972facd2c7e98
Performance likely does not matter here, but from a perspective of
code readability, a const reference should be preferred for read-only
access.
So use it here.
This requires setting -Wno-error=dangling-reference for GCC 13.1
compilations, but this false-positive is fixed in later GCC versions.
See also https://godbolt.org/z/fjc6be65M
6c7a34f3b0 kernel: Add Purpose section to header documentation (TheCharlatan)
7e9f00bcc1 kernel: Allowing reducing exports (TheCharlatan)
7990463b10 kernel: Add pure kernel bitcoin-chainstate (TheCharlatan)
36ec9a3ea2 Kernel: Add functions for working with outpoints (TheCharlatan)
5eec7fa96a kernel: Add block hash type and block tree utility functions to C header (TheCharlatan)
f5d5d1213c kernel: Add function to read block undo data from disk to C header (TheCharlatan)
09d0f62638 kernel: Add functions to read block from disk to C header (TheCharlatan)
a263a4caf2 kernel: Add function for copying block data to C header (TheCharlatan)
b30e15f432 kernel: Add functions for the block validation state to C header (TheCharlatan)
aa262da7bc kernel: Add validation interface to C header (TheCharlatan)
d27e27758d kernel: Add interrupt function to C header (TheCharlatan)
1976b13be9 kernel: Add import blocks function to C header (TheCharlatan)
a747ca1f51 kernel: Add chainstate load options for in-memory dbs in C header (TheCharlatan)
070e77732c kernel: Add options for reindexing in C header (TheCharlatan)
ad80abc73d kernel: Add block validation to C header (TheCharlatan)
cb1590b05e kernel: Add chainstate loading when instantiating a ChainstateManager (TheCharlatan)
e2c1bd3d71 kernel: Add chainstate manager option for setting worker threads (TheCharlatan)
65571c36a2 kernel: Add chainstate manager object to C header (TheCharlatan)
c62f657ba3 kernel: Add notifications context option to C header (TheCharlatan)
9e1bac4585 kernel: Add chain params context option to C header (TheCharlatan)
337ea860df kernel: Add kernel library context object (TheCharlatan)
28d679bad9 kernel: Add logging to kernel library C header (TheCharlatan)
2cf136dec4 kernel: Introduce initial kernel C header API (TheCharlatan)
Pull request description:
This is a first attempt at introducing a C header for the libbitcoinkernel library that may be used by external applications for interfacing with Bitcoin Core's validation logic. It currently is limited to operations on blocks. This is a conscious choice, since it already offers a lot of powerful functionality, but sits just on the cusp of still being reviewable scope-wise while giving some pointers on what the rest of the API could look like.
The current design was informed by the development of some tools using the C header:
* A re-implementation (part of this pull request) of [bitcoin-chainstate](https://github.com/bitcoin/bitcoin/blob/master/src/bitcoin-chainstate.cpp).
* A re-implementation of the python [block linearize](https://github.com/bitcoin/bitcoin/tree/master/contrib/linearize) scripts: https://github.com/TheCharlatan/bitcoin/tree/kernelLinearize
* A silent payment scanner: https://github.com/josibake/silent-payments-scanner
* An electrs index builder: https://github.com/josibake/electrs/commits/electrs-kernel-integration
* A rust bitcoin node: https://github.com/TheCharlatan/kernel-node
* A reindexer: https://github.com/TheCharlatan/bitcoin/tree/kernelApi_Reindexer
The library has also been used by other developers already:
* A historical block analysis tool: https://github.com/ismaelsadeeq/mining-analysis
* A swiftsync hints generator: https://github.com/theStack/swiftsync-hints-gen
* Fast script validation in floresta: https://github.com/vinteumorg/Floresta/pull/456
* A swiftsync node implementation: https://github.com/2140-dev/swiftsync/tree/master/node
Next to the C++ header also made available in this pull request, bindings for other languages are available here:
* Rust: https://github.com/TheCharlatan/rust-bitcoinkernel
* Python: https://github.com/stickies-v/py-bitcoinkernel
* Go: https://github.com/stringintech/go-bitcoinkernel
* Java: https://github.com/yuvicc/java-bitcoinkernel
The rust bindings include unit and fuzz tests for the API.
The header currently exposes logic for enabling the following functionality:
* Feature-parity with the now deprecated libbitcoin-consensus
* Optimized sha256 implementations that were not available to previous users of libbitcoin-consensus thanks to a static kernel context
* Full support for logging as well as control over categories and severity
* Feature parity with the existing experimental bitcoin-chainstate
* Traversing the block index as well as using block index entries for reading block and undo data.
* Running the chainstate in memory
* Reindexing (both full and chainstate-only)
* Interrupting long-running functions
The pull request introduces a new kernel-only test binary that purely relies on the kernel C header and the C++ standard library. This is intentionally done to show its capabilities without relying on other code inside the project. This may be relaxed to include some of the existing utilities, or even be merged into the existing test suite.
The complete docs for the API as well as some usage examples are hosted on [thecharlatan.ch/kernel-docs](https://thecharlatan.ch/kernel-docs/index.html). The docs are generated from the following repository (which also holds the examples): [github.com/TheCharlatan/kernel-docs](https://github.com/TheCharlatan/kernel-docs).
#### How can I review this PR?
Scrutinize the commit messages, run the tests, write your own little applications using the library, let your favorite code sanitizer loose on it, hook it up to your fuzzing infrastructure, profile the difference between the existing bitcoin-chainstate and the bitcoin-chainstate introduced here, be nitty on the documentation, police the C interface, opine on your own API design philosophy.
To get a feeling for the API, read through the tests, or one of the examples.
To configure this PR for making the shared library and the bitcoin-chainstate and test_kernel utilities available:
```
cmake -B build -DBUILD_KERNEL_LIB=ON -DBUILD_UTIL_CHAINSTATE=ON
```
Once compiled, the library is part of the build artifacts that can be installed with:
```
cmake --install build
```
#### Why a C header (and not a C++ header)?
* Shipping a shared library with a C++ header is hard, because of name mangling and an unstable ABI.
* Mature and well-supported tooling for integrating C exists for nearly every popular language.
* C offers a reasonably stable ABI.
Also see https://github.com/bitcoin/bitcoin/pull/30595#issuecomment-2285719575.
#### What about versioning?
The header and library are still experimental and I would expect this to remain so for some time, so best not to worry about versioning yet.
#### Potential future additions
In the future, the C header could be expanded to support (some of these have been roughly implemented):
* Handling transactions, block headers, coins cache, utxo set, meta data, and the mempool
* Adapters for an abstract coins store
* Adapters for an abstract block store
* Adapters for an abstract block tree store
* Allocators and buffers for more efficient memory usage
* An "[io-less](https://sans-io.readthedocs.io/how-to-sans-io.html)" interface
* Hooks for an external mempool, or external policy rules
#### Current drawbacks
* For external applications to read the block index of an existing Bitcoin Core node, Bitcoin Core needs to shut down first, since leveldb does not support reading across multiple processes. Other than migrating away from leveldb, there does not seem to be a solution for this problem. Such a migration is implemented in #32427.
* The fatal error handling through the notifications is awkward. This is partly improved through #29642.
* Handling shared pointers in the interfaces is unfortunate. They make ownership and freeing of the resources fuzzy and poison the interfaces with additional types and complexity. However, they seem to be an artifact of the current code that interfaces with the validation engine. The validation engine itself does not seem to make extensive use of these shared pointers.
* If multiple instances of the same type of objects are used, there is no mechanism for distinguishing the log messages produced by each of them. A potential solution is #30342.
* The background leveldb compaction thread may not finish in time leading to a non-clean exit. There seems to be nothing we can do about this, outside of patching leveldb.
ACKs for top commit:
alexanderwiederin:
re-ACK 6c7a34f3b0
stringintech:
re-ACK 6c7a34f
laanwj:
Code review ACK 6c7a34f3b0
ismaelsadeeq:
reACK 6c7a34f3b0👾
fanquake:
ACK 6c7a34f3b0 - soon we'll be running bitcoin (kernel)
Tree-SHA512: ffe7d4581facb7017d06da8b685b81f4b5e4840576e878bb6845595021730eab808d8f9780ed0eb0d2b57f2647c85dcb36b6325180caaac469eaf339f7258030