4fec726c4d refactor: Simplify Interpret asmap function (Fabian Jahr)
79e97d45c1 doc: Add more extensive docs to asmap implementation (Fabian Jahr)
cf4943fdcd refactor: Use span instead of vector for data in util/asmap (Fabian Jahr)
385c34a052 refactor: Unify asmap version calculation and naming (Fabian Jahr)
fa41fc6a1a refactor: Operate on bytes instead of bits in Asmap code (Fabian Jahr)
Pull request description:
This is a second slice carved out of #28792. It contains the following changes that are crucial for the embedding of asmap data which is added the following PR in the series (probably this will remain in #28792).
The changes are:
- Modernizes and simplifies the asmap code by operating on `std::byte` instead of bits
- Unifies asmap version calculation and naming (previously it was called version and checksum interchangeably)
- Operate on a `span` rather than a vector in the asmap internal to prevent holding the asmap data in memory twice
- Add more extensive documentation to the asmap implementation
- Unify asmap casing in implemetation function names
The first three commits were already part of #28792, the others are new.
The documentation commit came out of feedback gathered at the latest CoreDev. The primary input for the documentation was the documentation that already existed in the Python implementation (`contrib/asmap/asmap.py`) but there are several other comments as well. Please note: I have also asked several LLMs to provide suggestions on how to explain pieces of the implementation and better demonstrate how the parts work together. I have copied bits and pieces that I liked but everything has been edited further by me and obviously all mistakes here are my own.
ACKs for top commit:
hodlinator:
re-ACK 4fec726c4d
sipa:
ACK 4fec726c4d
sedited:
Re-ACK 4fec726c4d
Tree-SHA512: 950a591c3fcc9ddb28fcfdc3164ad3fbd325fa5004533c4a8b670fbf8b956060a0daeedd1fc2fced1f761ac49cd992b79cabe12ef46bc60b2559a7a613d0e166
Verify that addresses announcing themselves (addr == source) are exempt
from time penalties, while addresses announced by others receive the
expected penalty.
This prevents holding the asmap data in memory twice.
The version hash changes due to spans being serialized without their size-prefix (unlike vectors).
This change allows to drop brittle sizeof calls in favor of the
std::span::size method.
Other improvements include:
* Use of a namespace to mark test and bench data
* Use of the modern std::byte
* Drop of a no longer used std::vector copy and the bench/data module
There are no changes to behavior. Changes in this commit are all additions, and
are easiest to review using "git diff -U0 --word-diff-regex=." options.
Motivation for this change is to keep util functions with really generic names
like "Split" and "Join" out of the global namespace so it is easier to see
where these functions are defined, and so they don't interfere with function
overloading, especially since the util library is a dependency of the kernel
library and intended to be used with external code.
3ea54e5db7 net: Add continuous ASMap health check logging (Fabian Jahr)
28d7e55dff test: Add tests for unfiltered GetAddr usage (Fabian Jahr)
b8843d37ae fuzz: Let fuzzers use filter options in GetAddr/GetAddresses (Fabian Jahr)
e16f420547 net: Optionally include terrible addresses in GetAddr results (Fabian Jahr)
Pull request description:
There are certain statistics we can collect by running all our known clearnet addresses against the ASMap file. This could show issues with a maliciously manipulated file or with an old file that has decayed with time.
This is just a proof of concept for now. My idea currently is to run the analysis once per day and print the results to logs if an ASMap file is used.
ACKs for top commit:
achow101:
ACK 3ea54e5db7
mzumsande:
ACK 3ea54e5db7
brunoerg:
crACK 3ea54e5db7
Tree-SHA512: 777acbfac43cc43ce4a0a3612434e4ddbc65f59ae8ffc9e24f21de09011bccb297f0599cbaa82bcf40ef68e5af582c4e98556379db7ceff7d9f97574a1cf8e09
This also cleans up the addrman (de)serialization code paths to only
allow `Disk` serialization. Some unit tests previously forced a
`Network` serialization, which does not make sense, because Bitcoin Core
in production will always `Disk` serialize.
This cleanup idea was suggested by Pieter Wuille and implemented by Anthony
Towns.
Co-authored-by: Pieter Wuille <pieter@wuille.net>
Co-authored-by: Anthony Towns <aj@erisian.com.au>
if an addr matching the network requirements is only on the new table and
select is invoked with new_only = false, ensure that the code selects the new
table.
in order to test this case, we use a non deterministic addrman. this means we
cannot have more than one address in any addrman table, or risk sporadic
failures when the second address happens to conflict.
if the code chose a table at random, the test would fail 50% of the time
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
this adds coverage for the 7 different cases of which table should be selected
when the network is specified. the different cases are the result of new_only
being true or false and whether there are network addresses on both, neither,
or one of new vs tried tables. the only case not covered is when new_only is
false and the only network addresses are on the new table.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
c9d548c91f net: remove CService::ToStringPort() (Vasil Dimov)
fd4f0f41e9 gui: simplify OptionsDialog::updateDefaultProxyNets() (Vasil Dimov)
96c791dd20 net: remove CService::ToString() use ToStringAddrPort() instead (Vasil Dimov)
944a9de08a net: remove CNetAddr::ToString() and use ToStringAddr() instead (Vasil Dimov)
043b9de59a scripted-diff: rename ToStringIP[Port]() to ToStringAddr[Port]() (Vasil Dimov)
Pull request description:
Before this PR we had the somewhat confusing combination of methods:
`CNetAddr::ToStringIP()`
`CNetAddr::ToString()` (duplicate of the above)
`CService::ToStringIPPort()`
`CService::ToString()` (duplicate of the above, overrides a non-virtual method from `CNetAddr`)
`CService::ToStringPort()`
Avoid [overriding non-virtual methods](https://github.com/bitcoin/bitcoin/pull/25349/#issuecomment-1185226396).
"IP" stands for "Internet Protocol" and while sometimes "IP addresses" are called just "IPs", it is incorrect to call Tor or I2P addresses "IPs". Thus use "Addr" instead of "IP".
Change the above to:
`CNetAddr::ToStringAddr()`
`CService::ToStringAddrPort()`
The changes touch a lot of files, but are mostly mechanical.
ACKs for top commit:
sipa:
utACK c9d548c91f
achow101:
ACK c9d548c91f
jonatack:
re-ACK c9d548c91f only change since my previous reviews is rebase, but as a sanity check rebased to current master and at each commit quickly re-reviewed and re-verified clean build and green unit tests
LarryRuane:
ACK c9d548c91f
Tree-SHA512: 633fb044bdecf9f551b5e3314c385bf10e2b78e8027dc51ec324b66b018da35e5b01f3fbe6295bbc455ea1bcd1a3629de1918d28de510693afaf6a52693f2157
Now that Size() performs internal consistency checks,
it will rightfully fail (and assert) when dealing with
a corrupted AddrMan. Therefore remove this check.
The functionality of the old size() is covered by the new Size()
when no arguments are specified, so this does not change behavior.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
For now, the new functionality will be used in the context of
querying fixed seeds. Other possible applications for
future changes is the use in the context of making automatic
connections to specific networks, or making more detailed info
about addrman accessible via rpc.
Both methods do the same thing, so simplify to having just one.
`ToString()` is too generic in this case and it is unclear what it does,
given that there are similar methods:
`ToStringAddr()` (inherited from `CNetAddr`),
`ToStringPort()` and
`ToStringAddrPort()`.
In addrman unit tests, make it possible to override the check ratio from
the command line, without recompiling:
```
test_bitcoin --run_test="addrman_tests/*" -- -checkaddrman=1
```
Also, make the arguments of the constructor of `AddrManTest` the
same as the arguments of `AddrMan`.
ea4c9fd4ab test: Cover eviction by timeout in addrman_evictionworks (Martin Zumsande)
4f1bb467b5 test: Add test for multiplicity in addrman new tables (Martin Zumsande)
e880bb7836 test: Add test for updating addrman entries (Martin Zumsande)
f02eee8c87 test: introduce utility function to retrieve an addrman (Martin Zumsande)
f0e5efb824 test: Remove unused AddrManTest class (Martin Zumsande)
b696d7870b test: Remove tests for internal helper functions (Martin Zumsande)
0538520091 test: use AddrMan instead of AddrManTest where possible (Martin Zumsande)
1c65d427bb test: Inline SimConnFail function (Martin Zumsande)
5b7aac34f2 test: delete unused GetBucketAndEntry function (Amiti Uttarwar)
2ba1e74e59 test: Update addrman_serialization unit test to use AddrMan's interface (Amiti Uttarwar)
dad5f76021 addrman: Introduce a test-only function to lookup addresses (Amiti Uttarwar)
Pull request description:
This PR (joint work with Amiti Uttarwar) changes the addrman unit tests such that they only use the public `AddrMan` interface:
This has the advantage that the tests are less implementation-dependent, i.e. it would be possible to rewrite the internal addrman implementation (as drafted [here](https://github.com/sipa/bitcoin/tree/202106_multiindex_addrman) for using a multiindex) without having to adjust the tests.
This includes the following steps:
* Adding a test-only function `FindAddressEntry()` to the public addrman interface which returns info about an address in addrman (e.g. bucket, position, whethe the address is in new or tried). Obviously we want to do this sparingly, but I think a single test-only function is ok (which could also be useful elsewhere, e.g. in fuzz tests).
* Removal of the `AddrManTest` subclass which would reach into AddrMan's internals, using `AddrMan` instead
* Removal of tests for internal helper functions that are not publicly exposed (these are still tested indirectly via the public functions calling them).
* Additional tests for previously untested features such as multiplicity in the new tables, that can be tested with the help of `FindAddressEntry()`.
All in all, this PR increases the unit test coverage of AddrMan by a bit.
ACKs for top commit:
jnewbery:
ACK ea4c9fd4ab
josibake:
reACK ea4c9fd4ab
Tree-SHA512: c2d4ec8bdc62ffd6055ddcd37dea85ec08c76889e9e417e8d7c62a96cf68a8bcbe8c67bec3344d91fa7d3c499f6d9f810962da1dddd38e70966186b10b8ab447
The logic of these functions is already covered by existing unit tests
using publicly exposed functions of the interface.
Therefore, removing them does not decrease test coverage.
Switches to AddrMan for tests that use no features of AddrManTest.
Also removes unusued AddrManTest variables
Co-Authored-By: Amiti Uttarwar <amiti@uttarwar.org>