9c5775c331 addrman: cap the `max_pct` to not exceed the maximum number of addresses (brunoerg)
Pull request description:
Fixes#31234
This PR fixes a bad alloc issue in `GetAddresses` by capping the value `max_pct`. In practice, values greater than 100 should be treated as 100 since it's the percentage of addresses to return. Also, it limites the value `max_pct` in connman target to exercise values between 0 and 100.
ACKs for top commit:
adamandrews1:
Code Review ACK 9c5775c331
marcofleon:
Tested ACK 9c5775c331. Reproduced the crash on master and checked that this fixed it. The checks added to `GetAddr_` look reasonable.
mzumsande:
Code Review ACK 9c5775c331
vasild:
ACK 9c5775c331
Tree-SHA512: 2957ae561ccc37df71f43c1863216d2e563522ea70b9a4baee6990e0b4a1ddadccabdcb9115c131a9a57480367b5ebdd03e0e3d4c8583792e2b7d1911a0a06d3
01960c53c7 fuzz: make FuzzedDataProvider usage deterministic (Martin Leitner-Ankerl)
Pull request description:
There exist many usages of `fuzzed_data_provider` where it is evaluated directly in the function call.
Unfortunately, [the order of evaluation of function arguments is unspecified](https://en.cppreference.com/w/cpp/language/eval_order), and a simple example shows that it can differ e.g. between clang++ and g++: https://godbolt.org/z/jooMezWWY
When the evaluation order is not consistent, the same fuzzing/random input will produce different output, which is bad for coverage/reproducibility. This PR fixes all these cases I have found where unspecified evaluation order could be a problem.
Finding these has been manual work; I grepped the sourcecode for these patterns, and looked at each usage individually. So there is a chance I missed some.
* `fuzzed_data_provider`
* `.Consume`
* `>Consume`
* `.rand`
I first discovered this in https://github.com/bitcoin/bitcoin/pull/29013#discussion_r1420236394. Note that there is a possibility that due to this fix the evaluation order is now different in many cases than when the fuzzing corpus has been created. If that is the case, the fuzzing corpus will have worse coverage than before.
Update: In list-initialization the order of evaluation is well defined, so e.g. usages in `initializer_list` or constructors that use `{...}` is ok.
ACKs for top commit:
achow101:
ACK 01960c53c7
vasild:
ACK 01960c53c7
ismaelsadeeq:
ACK 01960c53c7
Tree-SHA512: e56d087f6f4bf79c90b972a5f0c6908d1784b3cfbb8130b6b450d5ca7d116c5a791df506b869a23bce930b2a6977558e1fb5115bb4e061969cc40f568077a1ad
There exist many usages of `fuzzed_data_provider` where it is evaluated directly in the function call.
Unfortunately, the order of evaluation of function arguments is unspecified. This means it can differ
between compilers/version/optimization levels etc. But when the evaluation order changes, the same
fuzzing input will produce different output, which is bad for coverage/reproducibility.
This PR fixes all these cases where by moving multiple calls to `fuzzed_data_provider` out of the
function arguments.
3ea54e5db7 net: Add continuous ASMap health check logging (Fabian Jahr)
28d7e55dff test: Add tests for unfiltered GetAddr usage (Fabian Jahr)
b8843d37ae fuzz: Let fuzzers use filter options in GetAddr/GetAddresses (Fabian Jahr)
e16f420547 net: Optionally include terrible addresses in GetAddr results (Fabian Jahr)
Pull request description:
There are certain statistics we can collect by running all our known clearnet addresses against the ASMap file. This could show issues with a maliciously manipulated file or with an old file that has decayed with time.
This is just a proof of concept for now. My idea currently is to run the analysis once per day and print the results to logs if an ASMap file is used.
ACKs for top commit:
achow101:
ACK 3ea54e5db7
mzumsande:
ACK 3ea54e5db7
brunoerg:
crACK 3ea54e5db7
Tree-SHA512: 777acbfac43cc43ce4a0a3612434e4ddbc65f59ae8ffc9e24f21de09011bccb297f0599cbaa82bcf40ef68e5af582c4e98556379db7ceff7d9f97574a1cf8e09
`CConnman::GetAddedNodeInfo` is used both to get a list of addresses to manually connect to
in `CConnman::ThreadOpenAddedConnections`, and to report about manually added connections in
`getaddednodeinfo`. In both cases, all addresses added to `m_added_nodes` are returned, however
the nodes we are already connected to are only relevant to the latter, in the former they are
actively discarded.
Parametrizes `CConnman::GetAddedNodeInfo` so we can ask for only addresses we are not connected to,
to avoid passing useless information around.
This is a follow-up to previous commits moving the chain constants out
of chainparamsbase.
The script removes the chainparamsbase header in all files where it is
included, but not used. This is done by filtering against all defined
symbols of the header as well as its respective .cpp file.
The kernel chainparams now no longer relies on chainparamsbase.
-BEGIN VERIFY SCRIPT-
sed -i '/#include <chainparamsbase.h>/d' $( git grep -l 'chainparamsbase.h' | xargs grep -L 'CBaseChainParams\|CreateBaseChainParams\|SetupChainParamsBaseOptions\|BaseParams\|SelectBaseParams\|chainparamsbase.cpp' )
-END VERIFY SCRIPT-
This is an extraction of ArgsManager related functions from util/system
into their own common file.
Config file related functions are moved to common/config.cpp.
The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager. The ArgsManager belongs
into the common library, since the kernel library should not depend on
it. See doc/design/libraries.md for more information on this rationale.
9b8dcb25b5 [net processing] Rename PoissonNextSendInbound to NextInvToInbounds (John Newbery)
ea99f5d01e [net processing] Move PoissonNextSendInbound to PeerManager (John Newbery)
bb060746df scripted-diff: replace PoissonNextSend with GetExponentialRand (John Newbery)
03cfa1b603 [refactor] Use uint64_t and std namespace in PoissonNextSend (John Newbery)
9e64d69bf7 [move] Move PoissonNextSend to src/random and update comment (John Newbery)
Pull request description:
`PoissonNextSend` and `PoissonNextSendInbound` are used in the p2p code to obfuscate various regularly occurring processes, in order to make it harder for others to get timing-based information deterministically.
The naming of these functions has been confusing to several people (including myself, see also #23347) because the resulting random timestamps don't follow a Poisson distribution but an exponential distribution (related to events in a Poisson process, hence the name). This PR
- moves `PoissonNextSend()` out of `net` to `random` and renames it to `GetExponentialRand()`
- moves `PoissonNextSendInbound()` out of `CConnman` to `PeerManager` and renames it to `NextInvToInbounds()`
- adds documentation for these functions
This is work by jnewbery - due to him being less active currently, I opened the PR and will address feedback.
ACKs for top commit:
jnewbery:
ACK 9b8dcb25b5
hebasto:
ACK 9b8dcb25b5, I have reviewed the code and it looks OK, I agree it can be merged.
theStack:
ACK 9b8dcb25b5📊
Tree-SHA512: 85c366c994e7147f9981fe863fb9838502643fa61ffd32d55a43feef96a38b79a5daa2c4d38ce01074897cc95fa40c76779816edad53f5265b81b05c3a1f4f50
Make it possible to override from the command line (without recompiling)
the addrman check ratio in non-addrman fuzz tests (connman and
deserialize) instead of hardcoding it to 0:
```
FUZZ=connman ./src/test/fuzz/fuzz --checkaddrman=5
```
fa551b3bdd Remove GetAdjustedTime from init.cpp (MarcoFalke)
fa815f8473 Replace addrman.h include with forward decl in net.h (MarcoFalke)
Pull request description:
It seems confusing to call `GetAdjustedTime` there, because no offset could have been retrieved from the network at this point. Even if connman was started, `timedata` needs at least 5 peer connections to calculate an offset.
Fix the confusion by replacing `GetAdjustedTime` with `GetTime`, which does not change behavior.
Also:
* Replace magic number with `MAX_FUTURE_BLOCK_TIME` to clarify the context
* Add test, which passes both on current master and this pull request
* An unrelated refactoring commit, happy to drop
ACKs for top commit:
dongcarl:
Code Review ACK fa551b3bdd, noticed the exact same thing here: e073634c37
mzumsande:
Code Review ACK fa551b3bdd
jnewbery:
Code review ACK fa551b3bdd
shaavan:
ACK fa551b3bdd
theStack:
Code-review ACK fa551b3bdd
Tree-SHA512: 15807a0e943e3e8d8c5250c8f6d7b56afb26002b1e290bf93636a2c747f27e78f01f1de04ce1a83d6339e27284c69c43e077a8467545c4078746f4c1ecb1164d
Blindly chose a cap of 10000 iterations for every loop, except for
the two in script_ops.cpp and scriptnum_ops.cpp which appeared to
(sometimes) be deserializing individual bytes; capped those to one
million to ensure that sometimes we try working with massive scripts.
There was also one fuzzer-controlled loop in timedata.cpp which was
already capped, so I left that alone.
git grep 'while (fuzz' should now run clean except for timedata.cpp
Currently addrman consistency checks are a compile time option, and are not
enabled in our CI. It's unlikely anyone is running these consistency checks.
Make them a runtime option instead, where users can enable addrman
consistency checks every n operations (similar to mempool tests). Update
the addrman unit tests to do internal consistency checks every 100
operations (checking on every operations causes the test runtime to
increase by several seconds).
Also assert on a failed addrman consistency check to terminate program
execution.
0829516d1f [refactor] Remove unused ForEachNodeThen() template (John Newbery)
09cc66c00e scripted-diff: rename address relay fields (John Newbery)
76568a3351 [net processing] Move addr relay data and logic into net processing (John Newbery)
caba7ae8a5 [net processing] Make RelayAddress() a member function of PeerManagerImpl (John Newbery)
86acc96469 [net processing] Take NodeId instead of CNode* as originator for RelayAddress() (John Newbery)
Pull request description:
This continues the work of moving application layer data into net_processing, by moving all addr data into the new Peer object added in #19607.
For motivation, see #19398.
ACKs for top commit:
laanwj:
Code review ACK 0829516d1f
mzumsande:
ACK 0829516d1f, reviewed the code and ran tests.
sipa:
utACK 0829516d1f
hebasto:
re-ACK 0829516d1f
Tree-SHA512: efe0410fac288637f203eb37d1999910791e345872d37e1bd5cde50e25bb3cb1c369ab86b3a166ffd5e06ee72e4508aa2c46d658be6a54e20b4f220d2f57d0a6
-BEGIN VERIFY SCRIPT-
# Rename
sed -i -e 's/MakeFuzzingContext/MakeNoLogFileContext/g' $(git grep -l MakeFuzzingContext)
# Bump the copyright of touched files in this scripted diff to avoid touching them again later
./contrib/devtools/copyright_header.py update ./src/test/fuzz/
-END VERIFY SCRIPT-
Previously, the {Basic,}TestingSetup for fuzzers were set up in many ways:
1. Calling InitializeFuzzingContext, which implicitly constructs a static
const BasicTestingSetup
2. Directly constructing a static const BasicTestingSetup in the initialize_*
function
3. Directly constructing a static TestingSetup and reproducing the
initialization arguments (I'm assuming because
InitializeFuzzingContext only initializes a BasicTestingSetup)
The new, relatively-simple MakeFuzzingContext function allows us to
consolidate these methods of initialization by being flexible enough to
be used in all situations. It:
1. Is templated so that we can choose to initialize any of
the *TestingSetup classes
2. Has sane defaults which are often used in fuzzers but are also
easily overridable
3. Returns a unique_ptr, explicitly transferring ownership to the caller
to deal with according to its situation