Install pycapnp on all (active) CI hosts which have IPC enabled and
run the functional tests.
Except for previous_releases, which uses an older version of pip
that doesn't support --break-system-packages.
With this change, tests can specify `self.extra_init = [{ipcbind: True}]` to
start a node listening on an IPC socket, instead of needing to choose which
node binary to invoke and what `self.extra_args=[["-ipcbind=..."]]` value to
pass to it.
This eliminates boilerplate code that #30437 (interface_ipc_mining.py), #32297
(interface_ipc_cli.py), and #33201 (interface_ipc.py) previously needed in
their test setup.
Set new `BitcoinTestFramework.binary_paths.bitcoin_bin` property with path to
the `bitcoin` wrapper binary. This allows new tests for `bitcoin-mine` in
#30437 and `bitcoin-cli` in #32297 to find the `bitcoin` binary and call
`bitcoin -m` to start nodes with IPC support. This way the new tests can run
whenever the ENABLE_IPC build option is enabled, instead of only running when
the `BITCOIN_CMD` environment variable is set to `bitcoin -m`.
61ec8866c6 [doc] archive v29.1 release notes (glozow)
Pull request description:
Copied from https://github.com/bitcoin/bitcoin/blob/v29.1/doc/release-notes.md
This is needed for announcement links and so we can see historical release notes in master.
ACKs for top commit:
l0rinc:
review ACK 61ec8866c6
Tree-SHA512: da9692c8cd8de54e848caab19da41975e8e75049b4fd3e1c6475ee86bf9947132597ceb4bf2e217710a73178b54c05b8f27668c67da202ba5fb1799b582fb15d
c9d5f211c1 depends: strip when installing qt (fanquake)
Pull request description:
Otherwise we end up with ~1.5GB of binaries (Linux) when `DEBUG=1`. This isn't great generally, but is worse in the CI, where disk may be limited (#33293).
ACKs for top commit:
TheCharlatan:
ACK c9d5f211c1
hebasto:
ACK c9d5f211c1.
Tree-SHA512: bf83e0d8c41c64aaa6d841e24c4f25bbe33034ae54a32f34ca14aca59eaa1a004809d48acf171414ed43b99f7d3d1f4b973aee0b272475bd7cc2ca708718b8da
4f1a4cbccd net: Quiet down logging when router doesn't support natpmp/pcp (laanwj)
Pull request description:
When the router doesn't support natpmp and PCP, one would normally expect the UDP packet to be ignored, and hit a timeout. This logs a message that is already in the debug category. However, there's also the case in which sending a UDP packet causes an ICMP response (type 3, code 3 "port unreachable"). This is returned to user space as "connection refused" (despite UDP having no concept of connections).
Move the warnings from `Send` and `Recv` to debug level too, to reduce log spam in that case.
Closes #33301.
ACKs for top commit:
willcl-ark:
utACK 4f1a4cbccd
sipa:
utACK 4f1a4cbccd
davidgumberg:
Tested ACK 4f1a4cbccd
achow101:
ACK 4f1a4cbccd
darosior:
utACK 4f1a4cbccd
mzumsande:
utACK 4f1a4cbccd
Tree-SHA512: 2c99a5679720482ece47af33616b6b207509fb58ba1962a1c2d30f8d0e68554f8f5ef25224313d93f4c5a1cc702183fcf8e6119abc411209c9884119ef680aad
The warnings are false positive and have been fixed upstream.
See: https://github.com/capnproto/capnproto/pull/2334.
This change disables the `UndefinedBinaryOperatorResult` clang-tidy
check for source files generated by the `mpgen` tool.
When the router doesn't support natpmp and PCP, one would normally expect
the UDP packet to be ignored, and hit a timeout. This logs a warning
that is already in the debug category. However, there's also the case in
which sending a UDP packet causes an ICMP response. This is returned to
user space as "connection refused" (despite UDP having no concept of
connections).
Move the warnings from `Send` and `Recv` to debug level too, to reduce
log spam in that case.
Closes #33301.
fae610d858 ci: Remove redundant RUN_UNIT_TESTS_SEQUENTIAL (MarcoFalke)
Pull request description:
`RUN_UNIT_TESTS_SEQUENTIAL` is useful to detect cases where global state is left dirty in the test process and leads to subsequent unit test cases failing. However, one CI task is sufficient to catch this.
As there already is one, add docs there and remove this env var (and extra logic).
ACKs for top commit:
fanquake:
ACK fae610d858
Tree-SHA512: b7ace1257d039f144cb0acb08d5d19d641028464517e6a2468e248ed79b2511512dc904867dacd66157b7483ec8041c95cce00f8ce3c89f3a2c3bb47939d7ff9
88db09bafe net: handle multi-part netlink responses (willcl-ark)
42e99ad773 net: skip non-route netlink responses (willcl-ark)
57ce645f05 net: filter for default routes in netlink responses (willcl-ark)
Pull request description:
...for default route in pcp pinholing.
Currently we only make a single recv call, which truncates results from large routing tables, or when the kernel splits the message into multiple responses (which may happen with `NLM_F_DUMP`).
We also do not filter on the default route. For IPv6, this led to selecting the first route with an `RTA_GATEWAY` attribute, often a non-default route instead of the actual default. This caused PCP port mapping failures because the wrong gateway was used.
Fix both issues by adding multi-part handling of responses and filter for the default route.
Limit responses to ~ 1MB to prevent any router-based DoS.
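As a rough illustration of the default-route filter described above (a sketch only, not the code from this PR), the following Python parses one route message payload and returns the `RTA_GATEWAY` attribute only when the route's destination prefix length is zero. It assumes the Linux `struct rtmsg` and `struct rtattr` layouts; the function name `default_gateway` is hypothetical.

```python
import struct

# Assumed Linux layouts: struct rtmsg (12 bytes) and struct rtattr (4 bytes).
RTMSG = struct.Struct("=BBBBBBBBI")  # family, dst_len, src_len, tos, table, protocol, scope, type, flags
RTATTR = struct.Struct("=HH")        # rta_len, rta_type
RTA_GATEWAY = 5


def rta_align(length):
    """Round an attribute length up to the 4-byte netlink alignment."""
    return (length + 3) & ~3


def default_gateway(route_payload):
    """Return the raw gateway bytes of a default route, or None otherwise."""
    if len(route_payload) < RTMSG.size:
        return None
    _family, dst_len, *_rest = RTMSG.unpack_from(route_payload, 0)
    if dst_len != 0:
        # A non-zero prefix length means this is not the default route.
        return None
    offset = RTMSG.size
    while offset + RTATTR.size <= len(route_payload):
        rta_len, rta_type = RTATTR.unpack_from(route_payload, offset)
        if rta_len < RTATTR.size or offset + rta_len > len(route_payload):
            return None  # malformed attribute
        if rta_type == RTA_GATEWAY:
            return route_payload[offset + RTATTR.size:offset + rta_len]
        offset += rta_align(rta_len)
    return None
```

Without the `dst_len` check, the first route carrying a gateway attribute would be returned, which matches the IPv6 failure described in this PR.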
ACKs for top commit:
achow101:
ACK 88db09bafe
davidgumberg:
Code Review re-ACK 88db09b
Sjors:
re-utACK 88db09bafe
Tree-SHA512: ea5948edebfad5896a487a61737aa5af99f529fad3cf3da68dced456266948238a7143383847e79a7bb90134e023eb173c25116d8eb80ff57fa4c4a0377ca1ed
af4156ab75 build: set ENABLE_IPC to OFF when fuzzing (fanquake)
Pull request description:
A `BUILD_FOR_FUZZING` build will currently fail to configure, with missing `capnp`.
ACKs for top commit:
Crypt-iQ:
tACK af4156ab75
marcofleon:
ACK af4156ab75
dergoegge:
utACK af4156ab75
janb84:
ACK af4156ab75
Tree-SHA512: e3c5238cb5823116a958502eab84ee72a94cac0853fc3908ef97b6b6dc037db27806be0726f321d70ab706c37924dec526b46a3a46ea3f3f3684ce48da46a803
Handle multi-part netlink responses to prevent truncated results from
large routing tables.
Previously, we only made a single recv call, which led to incomplete
results when the kernel split the message into multiple responses (which
happens frequently with NLM_F_DUMP).
Also guard against a potential hanging issue where the code would
indefinitely wait for NLMSG_DONE for non-multi-part responses by
detecting the NLM_F_MULTI flag and only continue waiting when necessary.
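The receive loop described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the C++ code from this commit: `read_all_messages`, `pack_message` and `MAX_RESPONSE_BYTES` are hypothetical names, the header layout is the Linux `struct nlmsghdr`, and `recv` is any callable returning one datagram of bytes.

```python
import struct

NLMSG_HDR = struct.Struct("=IHHII")  # nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
NLMSG_DONE = 3
NLM_F_MULTI = 0x2
MAX_RESPONSE_BYTES = 1 << 20  # hypothetical cap on the total response size


def nlmsg_align(length):
    """Round a message length up to the 4-byte netlink alignment."""
    return (length + 3) & ~3


def pack_message(msg_type, flags, payload):
    """Build one aligned netlink message (used here only to make test data)."""
    length = NLMSG_HDR.size + len(payload)
    raw = NLMSG_HDR.pack(length, msg_type, flags, 0, 0) + payload
    return raw + b"\x00" * (nlmsg_align(length) - length)


def read_all_messages(recv):
    """Call recv() until the reply is complete; return [(type, payload), ...]."""
    messages = []
    total = 0
    while True:
        data = recv()
        if not data:
            return messages
        total += len(data)
        if total > MAX_RESPONSE_BYTES:
            raise ValueError("netlink response too large")
        offset = 0
        multi = False
        while offset + NLMSG_HDR.size <= len(data):
            length, msg_type, flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
            if length < NLMSG_HDR.size or offset + length > len(data):
                raise ValueError("malformed netlink message")
            if msg_type == NLMSG_DONE:
                return messages
            multi = multi or bool(flags & NLM_F_MULTI)
            messages.append((msg_type, data[offset + NLMSG_HDR.size:offset + length]))
            offset += nlmsg_align(length)
        if not multi:
            # Single-part reply: no NLMSG_DONE will follow, so stop here.
            return messages
```

The `multi` check is what avoids the hang: a reply without `NLM_F_MULTI` ends after the first datagram instead of waiting for an `NLMSG_DONE` that never arrives.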
7270839af4 doc: truc packages allow sub min feerate transactions (Pol Espinasa)
Pull request description:
Fixes https://github.com/bitcoin/bitcoin/issues/32067
Some policy documentation is outdated since TRUC. This PR aims to update the documentation to the actual policy state.
ACKs for top commit:
w0xlt:
reACK 7270839af4
glozow:
ACK 7270839af4
Tree-SHA512: 1272e7acc76c76d7e145cdd07827ece31253dba4b99b9a22fc986fcd538830e46392fda877736cb496f3e53a0abcb9d8403d439bb1da63b88da7f8b6f17b6c8b
3c5da69a23 ci: remove un-needed lint_run*.sh files (willcl-ark)
2aa288efdd ci: fix annoying docker warning (will)
dd1c5903e8 ci: add ccache hit-rate warning when < 75% (will)
f427284483 doc: Detail configuration of hosted CI runners (will)
3f339e99e0 ci: dynamically match makejobs with cores (will)
4393ffdd83 ci: remove .cirrus.yml (will)
bc41848d00 ci: port lint (will)
d290a8e6ea ci: port msan-depends (will)
9bbae61e3b ci: port tsan-depends (will)
bf7d536452 ci: port tidy (will)
549074bc64 ci: port centos-depends-gui (will)
58e38c3a04 ci: port previous-releases-depends-debug (will)
341196d75c ci: port fuzzer-address-undefined-integer-nodepends (will)
f2068f26c1 ci: port no-IPC-i686-DEBUG (will)
2a00b12d73 ci: port nowallet-libbitcoinkernel (will)
9c2514de53 ci: port mac-cross-gui-notests (will)
2c990d84a3 ci: force reinstall of kernel headers in asan (will)
884251441b ci: update asan-lsan-ubsan (will)
f253031cb8 ci: port arm 32-bit job (will)
04e7bfbceb ci: update windows-cross job (will)
cc1735d777 ci: add job to determine runner type (will)
020069e6b7 ci: add Cirrus cache host (will)
9c2b96e0d0 ci: have base install run in right dir (will)
18f6be09d0 ci: use docker build cache arg directly (will)
94a0932547 ci: use buildx in ci (will)
fdf64e5532 ci: add configure-docker action (will)
33ba073df7 ci: add REPO_USE_CIRRUS_RUNNERS (will)
b232b0fa5e ci: add caching actions (will)
b8fcc9fcbc ci: add configure environment action (will)
Pull request description:
This changeset migrates all current self-hosted CI jobs over to hosted [Cirrus Runners](https://cirrus-runners.app/).
These runners cost a flat rate of $150/month, and we qualify for an open source discount of 50%. Therefore they are $75/month/runner.
One "runner" should more accurately be thought of in terms of the number of vCPU you are purchasing: https://cirrus-runners.app/pricing/ or in terms of "concurrency", where 1 runner gets you 1.0 concurrency.
e.g. a Linux x86 Runner gets you 16 vCPU (1.0 concurrency) and 64GB RAM to be provisioned as you choose, amongst one or more jobs.
Cirrus Runners currently only support Linux (x86 and Arm64) and MacOS (Arm64).
This changeset does **not** move the existing Github Actions native MacOS runners away from being run on Github's infrastructure. This could be a follow up optimisation.
Runs from this changeset using Cirrus Runners can be found at: https://github.com/testing-cirrus-runners/bitcoin2/actions which shows an uncached run on master ([CI#1](https://github.com/testing-cirrus-runners/bitcoin2/actions/runs/16298637161)), an outside pull request ([CI#3](https://github.com/testing-cirrus-runners/bitcoin2/actions/runs/16303305483?pr=1)) and an updated push to master ([CI#4](https://github.com/testing-cirrus-runners/bitcoin2/actions/runs/16304182527)).
These workflows were run on 10 runners, and we would recommend purchasing a similar number for our CI in this repo to achieve the speed and concurrency we expect.
We include some optional performance commits, but these could be split out and made into followups or dropped entirely.
## Benefits
### Maintenance
As we are not self-hosting, nobody needs to maintain servers, disks etc.
### Bus factor
Currently we have a very small number of people with the know-how working on server setup and maintenance. This setup fixes that so that "anyone" familiar with GitHub-style CI systems can work on it.
### Scaling
These do _not_ "auto-scale"/have "unlimited concurrency" like some solutions, but if we want more workers/cpu to increase parallelism or increase the runner size of certain jobs for a speed-up we can simply buy more concurrency using the web interface.
### Speed
Runtimes approximate current runtimes pretty well, with some jobs being faster.
Caching improvements on pull request (re-runs) are left as future optimisations from the current changeset (see below).
### GitHub workflow syntax
With a migration to the more-commonly-used GitHub workflow syntax, migration to other providers in the future is often as simple as a one-line change (and installing a new GitHub app to the repo).
If we decide to self-host again, then we can also self-host GitHub runners (using https://github.com/actions/runner) and maintain new GH-style CI syntax.
### Reporting
GitHub workflows provide nicer built-in reporting directly on the "Checks" page of a pr. This includes more-detailed action reporting, and a host of pretty nice integrated features, such as [Workflow Commands](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions) for creating annotations that can print messages during runs. See for example at the bottom of this window where we report `ccache` hitrate, if it was below 90%: https://github.com/testing-cirrus-runners/bitcoin/actions/runs/16163449125?pr=1
These could be added conditionally into our CI scripts to report interesting or other information.
## Costs
### Financial
Relative to competitors Cirrus runners are cheap for the hosted CI-world. However these are likely more expensive than our current setup, or a well-configured (new) self-hosted setup.
If we started with 10 runners to be shared amongst all migrated jobs, this would total $750/mo = $9000/yr.
Note that we are not trying to compete here on cost directly.
### Dependencies
We would be dependent on Cirrus infra.
## Forks
- Forks should be able to run CI without paid Cirrus runners. This behaviour is achieved through a rather verbose `runs-on:` directive.
- This directive hardcodes the main repo (unfortunately you cannot use the `env` github context in this field in particular, for some reason).
- This directive also allows for a fork to patch the `runs-on:` field in the ci.yml file if they want to use Cirrus Runners too.
- The workflow otherwise will fallback to the GitHub free runners on forks.
- This cirrus cache action transparently falls back to github actions cache when not running on cirrus, so forks will get some free github caching (10GB per repo).
All jobs work on forks, but will run (slowly) on GitHub native free hosted runners, instead of Cirrus runners. They will also suffer from poor cache hit-rates, but there's nothing that can be done about that, and the situation is an improvement on today.
## Migration process
The main org should also, in addition to pulling code changes:
1. Permit the actions `docker/setup-buildx-action@v3` and `docker/login-action@v3` to be run in this repo.
## Caching
For the number of CI jobs we have, cache usage on GitHub would be an issue as GH only provides 10GB of cache space, **per repo**. However cirrus provides [10 GB per runner](https://cirrus-runners.app/setup/#speeding-up-the-cache), which scales better with the number of runners.
The `cirruslabs/action/[restore|save]` action we use here redirects this to Cirrus' own cache and is both faster and larger.
If a user is running CI on a fork, the cirrus cache falls back transparently to the GitHub default cache without error.
### ccache, depends-sources, built-depends
- Cached as blobs via `cirruslabs/actions/cache` action.
- Current implementation:
- On `push`: restores and saves caches.
- On `pull_request`: restores but does **not** save caches.
This means a new pull request should hit a _pretty relevant_ cache.
Old pull requests **which are not being rebased on master** may suffer from lower cache hit-rate.
If we save caches on all pull request runs we run the risk of evicting recent (and more relevant) cache blobs.
It may be possible in a future optimisation to widen this to save on pull request runs too, but it will also depend on how many runners we provision and what cache churn rates are like in the main repo.
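The restore/save policy above can be summarised in a few lines of Python. This is only a sketch of the rule as described, not the workflow configuration itself; `cache_steps` is a hypothetical helper and the event names are the GitHub workflow trigger names mentioned in the list.

```python
def cache_steps(event_name):
    """Return which cache steps run for a given workflow trigger."""
    restore = event_name in ("push", "pull_request")
    save = event_name == "push"  # pull requests never write, to avoid evicting newer blobs
    return {"restore": restore, "save": save}


for event in ("push", "pull_request"):
    print(event, cache_steps(event))
```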
### Docker build layer caching
- Cached using the `gha` cache backend
- These cache blobs compete for space with `ccache`, `depends-sources` and `depends-built` caches
- `gha` cache allows `--cache-from` to be used from pull requests, which does not work using a registry cache type (technically we could use a public read-only token to get this working, but that feels wrong)
This backend does network i/o and so is marginally slower than our current disk i/o cache.
## But what about... `x`?
We have tested many other providers, including [Runs-on](https://runs-on.com/), [Buildjet](https://buildjet.com/), [WarpBuild](https://www.warpbuild.com/), and GitHub hosted runners (and investigated even more). But they all fall short in one way or another.
- Runs-On and Buildjet (and others) require installing GH apps with much too-liberal permissions (e.g. `Administration: Read|Write`) for our use-case.
- GitHub hosted runners suffer from all of high costs, lower speed, small cache, and the requirement for a GitHub Teams subscription.
- WarpBuild seems to be simply too expensive.
## TODO:
To complete migration from self-hosted to hosted for this repo, the backport branches `27.x`, `28.x` and `29.x` would also need their CI ported, but these are left for followups to this change (and pending review/changes here first).
-----
Work and experimentation undertaken with m3dwards
ACKs for top commit:
maflcko:
re-ACK 3c5da69a23 🏗
m3dwards:
ACK 3c5da69a23
achow101:
ACK 3c5da69a23
janb84:
re ACK 3c5da69a23
Tree-SHA512: 9f7f2dddf1a5eebc56b4101663283d4219d189cda6054dba760f1288bed9e6ed3f2fa029a5caedc76c31b1271ea0a0cb0967a796086360d8f5be8277379b6397
2885bd0e1c doc: unify `datacarriersize` warning with release notes (Lőrinc)
Pull request description:
Follow-up to https://github.com/bitcoin/bitcoin/pull/32406
---
The [release notes](a189d63618/doc/release-notes-32406.md (L1)) claim
> [...] marked as deprecated and are expected to be removed in a future release
but the [warning itself](2885bd0e1c/src/init.cpp (L907)) claims
> [...] marked as deprecated. They **will** be removed in a future version.
To be less aggressive (since some have objected against this version online) - and to unify the deprecation warning with the release notes - I have changed the warning to communicate our expectation in a friendlier way.
ACKs for top commit:
cedwies:
ACK 2885bd0
ryanofsky:
Code review ACK 2885bd0e1c. I don't think it is good for the release notes and the runtime warning message to say two different things. I'd also be happy if release notes were updated to match the runtime warning, instead of vice versa. Whatever is more accurate is better.
ajtowns:
ACK 2885bd0e1c
kevkevinpal:
ACK [2885bd0](2885bd0e1c)
achow101:
ACK 2885bd0e1c
janb84:
ACK 2885bd0e1c
Zero-1729:
crACK 2885bd0e1c
jonatack:
ACK 2885bd0e1c
hodlinator:
ACK 2885bd0e1c
w0xlt:
ACK 2885bd0e1c
optout21:
ACK 2885bd0e1c
Tree-SHA512: a9d2a64ab96b3dd7f3a1a29622930054fd5c56e573bc96330f4ef3327dc024b21b3fbc8a698d17aea7c76f57f0c2ccd6403b2df344ae2f69c645ceb8b6fa54a5
ci/lint_run.sh: Only used in .cirrus.yml. Refer to test/lint/README.md on how to run locally.
ci/lint_run_all.sh: Only used in .cirrus.yml for stale re-runs of old pull request tasks.
Docker currently warns that we are missing a default value.
To silence the warning, set the default to scratch, which will error if an
appropriate image tag is not passed in.
Previously jobs were running on a large multi-core server where 10 jobs
as default made sense (or may even have been on the low side).
Using hosted runners with fixed (and lower) numbers of vCPUs we should
adapt compilation to match the number of cpus we have dynamically.
This is cross-platform compatible with macos and linux only.
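A minimal sketch of the idea, in Python rather than the CI's own scripts: pick a CPU-count command per platform and derive the job count from its output. The command choices (`nproc` on Linux, `sysctl -n hw.logicalcpu` on macOS) and the names `cpu_count_command`, `detect_cpus` and `makejobs_flag` are assumptions for illustration, not necessarily what this commit uses.

```python
import platform
import subprocess


def cpu_count_command(system):
    """Return the command that prints the CPU count on the given platform."""
    if system == "Linux":
        return ["nproc"]
    if system == "Darwin":
        return ["sysctl", "-n", "hw.logicalcpu"]
    raise ValueError(f"unsupported platform: {system}")


def makejobs_flag(cpus):
    """Build the -jN flag, never going below one job."""
    return f"-j{max(1, cpus)}"


def detect_cpus():
    """Run the platform's command and parse its output as an integer."""
    command = cpu_count_command(platform.system())
    output = subprocess.run(command, capture_output=True, text=True, check=True).stdout
    return int(output.strip())
```

On a supported host, `makejobs_flag(detect_cpus())` would yield the value to use in place of a fixed job count.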