fa0677d131 refactor: Use SpanReader over DataStream (MarcoFalke)
fad3eb3956 refactor: Use SpanReader over DataStream (MarcoFalke)
fa06e26764 refactor: [qt] Use SpanReader to avoid two vector copies (MarcoFalke)
fabd4d2e2e refactor: Avoid UB in SpanReader::ignore (MarcoFalke)
fa20bc2ec2 refactor: Use empty() over eof() in the streams interface (MarcoFalke)
fa879db735 test: Read debug log for self-checking comment (MarcoFalke)
Pull request description:
This changes all places, where possible, to use SpanReader over DataStream. This makes the code easier to read and reason about, because `SpanReader` can never write data. Also, the code should be minimally faster, because it avoids a full redundant copy of the whole vector of bytes.
ACKs for top commit:
stickies-v:
re-ACK fa0677d131
achow101:
ACK fa0677d131
janb84:
re ACK fa0677d131
sipa:
crACK fa0677d131
Tree-SHA512: 1d9f43fc6e71d481cf7b8f8457f479745ee331734649e9e2c2ab00ce5d317112796c77afc328612ed004e65ac5c16fc92279d760cfb012cfddce9098c4af810f
This refactor does not change behavior. However, it avoids a vector
copy, which can lead to a minimal speed-up of 1%-5%, depending on the
call-site. This is mostly relevant for the fuzz tests and utils that
read large blobs of data (like a full block).
4fec726c4d refactor: Simplify Interpret asmap function (Fabian Jahr)
79e97d45c1 doc: Add more extensive docs to asmap implementation (Fabian Jahr)
cf4943fdcd refactor: Use span instead of vector for data in util/asmap (Fabian Jahr)
385c34a052 refactor: Unify asmap version calculation and naming (Fabian Jahr)
fa41fc6a1a refactor: Operate on bytes instead of bits in Asmap code (Fabian Jahr)
Pull request description:
This is a second slice carved out of #28792. It contains the following changes that are crucial for the embedding of asmap data which is added the following PR in the series (probably this will remain in #28792).
The changes are:
- Modernizes and simplifies the asmap code by operating on `std::byte` instead of bits
- Unifies asmap version calculation and naming (previously it was called version and checksum interchangeably)
- Operate on a `span` rather than a vector in the asmap internal to prevent holding the asmap data in memory twice
- Add more extensive documentation to the asmap implementation
- Unify asmap casing in implemetation function names
The first three commits were already part of #28792, the others are new.
The documentation commit came out of feedback gathered at the latest CoreDev. The primary input for the documentation was the documentation that already existed in the Python implementation (`contrib/asmap/asmap.py`) but there are several other comments as well. Please note: I have also asked several LLMs to provide suggestions on how to explain pieces of the implementation and better demonstrate how the parts work together. I have copied bits and pieces that I liked but everything has been edited further by me and obviously all mistakes here are my own.
ACKs for top commit:
hodlinator:
re-ACK 4fec726c4d
sipa:
ACK 4fec726c4d
sedited:
Re-ACK 4fec726c4d
Tree-SHA512: 950a591c3fcc9ddb28fcfdc3164ad3fbd325fa5004533c4a8b670fbf8b956060a0daeedd1fc2fced1f761ac49cd992b79cabe12ef46bc60b2559a7a613d0e166
This prevents holding the asmap data in memory twice.
The version hash changes due to spans being serialized without their size-prefix (unlike vectors).
"tor" as a network specification was deprecated in 60dc8e4208 in favor
of "onion" and this commit removes it and updates the relevant test.
Co-authored-by: Mara van der Laan <126646+laanwj@users.noreply.github.com>
6967e8e8ab add more bad p2p ports (Jameson Lopp)
Pull request description:
Add a few more ports used by extremely well adopted services that require authentication and really ought not be used by bitcoin nodes for p2p traffic.
ACKs for top commit:
Sjors:
utACK 6967e8e8ab
l0rinc:
ACK 6967e8e8ab
glozow:
ACK 6967e8e8ab
Tree-SHA512: bbe86aef2be9727338712ded8f90227f5d12f633ab5d324c8907c01173945d1c4d9899e05565f78688842bbf5ebb010d22173969e4168ea08d4e33f01fe9569d
It does not make sense and it is rejected by other parsers as well:
>>> ipaddress.ip_network("1.2.3.0/+24")
ValueError: '1.2.3.0/+24' does not appear to be an IPv4 or IPv6 network
fb3e812277 p2p: return `CSubNet` in `LookupSubNet` (brunoerg)
Pull request description:
Analyzing the usage of `LookupSubNet`, noticed that most cases uses check if the subnet is valid by calling `subnet.IsValid()`, and the boolean returned by `LookupSubNet` hasn't been used so much, see:
29d540b7ad/src/httpserver.cpp (L172-L174)29d540b7ad/src/net_permissions.cpp (L114-L116)
It makes sense to return `CSubNet` instead of `bool`.
ACKs for top commit:
achow101:
ACK fb3e812277
vasild:
ACK fb3e812277
theStack:
Code-review ACK fb3e812277
stickies-v:
Concept ACK, but Approach ~0 (for now). Reviewed the code (fb3e812277) and it all looks good to me.
Tree-SHA512: ba50d6bd5d58dfdbe1ce1faebd80dd8cf8c92ac53ef33519860b83399afffab482d5658cb6921b849d7a3df6d5cea911412850e08f3f4e27f7af510fbde4b254
This also cleans up the addrman (de)serialization code paths to only
allow `Disk` serialization. Some unit tests previously forced a
`Network` serialization, which does not make sense, because Bitcoin Core
in production will always `Disk` serialize.
This cleanup idea was suggested by Pieter Wuille and implemented by Anthony
Towns.
Co-authored-by: Pieter Wuille <pieter@wuille.net>
Co-authored-by: Anthony Towns <aj@erisian.com.au>
c9d548c91f net: remove CService::ToStringPort() (Vasil Dimov)
fd4f0f41e9 gui: simplify OptionsDialog::updateDefaultProxyNets() (Vasil Dimov)
96c791dd20 net: remove CService::ToString() use ToStringAddrPort() instead (Vasil Dimov)
944a9de08a net: remove CNetAddr::ToString() and use ToStringAddr() instead (Vasil Dimov)
043b9de59a scripted-diff: rename ToStringIP[Port]() to ToStringAddr[Port]() (Vasil Dimov)
Pull request description:
Before this PR we had the somewhat confusing combination of methods:
`CNetAddr::ToStringIP()`
`CNetAddr::ToString()` (duplicate of the above)
`CService::ToStringIPPort()`
`CService::ToString()` (duplicate of the above, overrides a non-virtual method from `CNetAddr`)
`CService::ToStringPort()`
Avoid [overriding non-virtual methods](https://github.com/bitcoin/bitcoin/pull/25349/#issuecomment-1185226396).
"IP" stands for "Internet Protocol" and while sometimes "IP addresses" are called just "IPs", it is incorrect to call Tor or I2P addresses "IPs". Thus use "Addr" instead of "IP".
Change the above to:
`CNetAddr::ToStringAddr()`
`CService::ToStringAddrPort()`
The changes touch a lot of files, but are mostly mechanical.
ACKs for top commit:
sipa:
utACK c9d548c91f
achow101:
ACK c9d548c91f
jonatack:
re-ACK c9d548c91f only change since my previous reviews is rebase, but as a sanity check rebased to current master and at each commit quickly re-reviewed and re-verified clean build and green unit tests
LarryRuane:
ACK c9d548c91f
Tree-SHA512: 633fb044bdecf9f551b5e3314c385bf10e2b78e8027dc51ec324b66b018da35e5b01f3fbe6295bbc455ea1bcd1a3629de1918d28de510693afaf6a52693f2157
Both methods do the same thing, so simplify to having just one.
`ToString()` is too generic in this case and it is unclear what it does,
given that there are similar methods:
`ToStringAddr()` (inherited from `CNetAddr`),
`ToStringPort()` and
`ToStringAddrPort()`.
Both methods do the same thing, so simplify to having just one.
Further, `CService` inherits `CNetAddr` and `CService::ToString()`
overrides `CNetAddr::ToString()` but the latter is not virtual which
may be confusing. Avoid such a confusion by not having non-virtual
methods with the same names in inheritance.
Forward the validation of the port from `ParseUInt16(...)`.
Consider port 0 as invalid.
Add suitable test for the `SplitHostPort` function.
Add doxygen description to the `SplitHostPort` function.
By default, for mainnet, the p2p listening port is 8333. Bitcoin Core
has a strong preference for only connecting to nodes that listen on that
port.
Remove that preference because connections over clearnet that involve
port 8333 make it easy to detect, analyze, block or divert Bitcoin p2p
traffic before the connection is even established (at TCP SYN time).
For further justification see the OP of:
https://github.com/bitcoin/bitcoin/pull/23306
- drop redundant PF_ permission flags prefixes
- drop ALL_CAPS naming per https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Renum-caps
- rename IsImplicit to Implicit
-BEGIN VERIFY SCRIPT-
s() { git grep -l "$1" src | xargs sed -i "s/$1/$2/g"; }
s 'PF_NONE' 'None'
s 'PF_BLOOMFILTER' 'BloomFilter'
s 'PF_RELAY' 'Relay'
s 'PF_FORCERELAY' 'ForceRelay'
s 'PF_DOWNLOAD' 'Download'
s 'PF_NOBAN' 'NoBan'
s 'PF_MEMPOOL' 'Mempool'
s 'PF_ADDR' 'Addr'
s 'PF_ISIMPLICIT' 'Implicit'
s 'PF_ALL' 'All'
-END VERIFY SCRIPT-
-BEGIN VERIFY SCRIPT-
s() { git grep -l "$1" -- 'src' ':!src/net_permissions.h' | xargs sed -i -E "s/([^:])$1/\1NetPermissionFlags::$1/"; }
s 'PF_NONE'
s 'PF_BLOOMFILTER'
s 'PF_RELAY'
s 'PF_FORCERELAY'
s 'PF_DOWNLOAD'
s 'PF_NOBAN'
s 'PF_MEMPOOL'
s 'PF_ADDR'
s 'PF_ISIMPLICIT'
s 'PF_ALL'
-END VERIFY SCRIPT-
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
to clarify/test the relationship and NetPermissions operations
involving the NetPermissionFlags PF_NOBAN and PF_DOWNLOAD.
Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
Allow creation of valid `CSubNet` objects of non-IP networks and only
match the single address they were created from (like /32 for IPv4 or
/128 for IPv6).
This fixes a deficiency in `CConnman::DisconnectNode(const CNetAddr& addr)`
and in `BanMan` which assume that creating a subnet from any address
using the `CSubNet(CNetAddr)` constructor would later match that address
only. Before this change a non-IP subnet would be invalid and would not
match any address.
ecc6cf1a3b test: fix creation of std::string objects with \0s (Vasil Dimov)
Pull request description:
A string literal `"abc"` contains a terminating `\0`, so that is 4
bytes. There is no need to write `"abc\0"` unless two terminating
`\0`s are necessary.
`std::string` objects do not internally contain a terminating `\0`, so
`std::string("abc")` creates a string with size 3 and is the same as
`std::string("abc", 3)`.
In `"\01"` the `01` part is interpreted as one number (1) and that is
the same as `"\1"` which is a string like `{1, 0}` whereas `"\0z"` is a
string like `{0, 'z', 0}`. To create a string like `{0, '1', 0}` one
must use `"\0" "1"`.
Adjust the tests accordingly.
ACKs for top commit:
laanwj:
ACK ecc6cf1a3b
practicalswift:
ACK ecc6cf1a3b modulo happily green CI
Tree-SHA512: 5eb489e8533a4199a9324b92f7280041552379731ebf7dfee169f70d5458e20e29b36f8bfaee6f201f48ab2b9d1d0fc4bdf8d6e4c58d6102f399cfbea54a219e
A string literal `"abc"` contains a terminating `\0`, so that is 4
bytes. There is no need to write `"abc\0"` unless two terminating
`\0`s are necessary.
`std::string` objects do not internally contain a terminating `\0`, so
`std::string("abc")` creates a string with size 3 and is the same as
`std::string("abc", 3)`.
In `"\01"` the `01` part is interpreted as one number (1) and that is
the same as `"\1"` which is a string like `{1, 0}` whereas `"\0z"` is a
string like `{0, 'z', 0}`. To create a string like `{0, '1', 0}` one
must use `"\0" "1"`.
Adjust the tests accordingly.
Change the serialization of `CAddrMan` to serialize its addresses
in ADDRv2/BIP155 format by default. Introduce a new `CAddrMan` format
version (3).
Add support for ADDRv2 format in `CAddress` (un)serialization.
Co-authored-by: Carl Dong <contact@carldong.me>
Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes,
not being able to store larger addresses (e.g. TORv3) and encoded
smaller ones as 16-byte IPv6 addresses.
Change its type to `prevector`, so that it can hold larger addresses and
do not disguise non-IPv6 addresses as IPv6. So the IPv4 address
`1.2.3.4` is now encoded as `01020304` instead of
`00000000000000000000FFFF01020304`.
Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or
"IP address" (TOR addresses are not IP addresses).
In order to preserve backward compatibility with serialization (where
e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`)
introduce `CNetAddr` dedicated legacy serialize/unserialize methods.
Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16
bytes, but use the first 4 for IPv4 (not the last 4). Only allow
subnetting for IPv4 and IPv6.
Co-authored-by: Carl Dong <contact@carldong.me>
A netmask that contains 1-bits after 0-bits (the 1-bits are not
contiguous on the left side) is invalid [1] [2].
The code before this PR used to parse and accept such
non-left-contiguous netmasks. However, a coming change that will alter
`CNetAddr::ip` to have flexible size would make juggling with such
netmasks more difficult, thus drop support for those.
[1] https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks
[2] https://tools.ietf.org/html/rfc4632#section-5.1
3bd67ba5a4 Test addr response caching (Gleb Naumenko)
cf1569e074 Add addr permission flag enabling non-cached addr sharing (Gleb Naumenko)
acd6135b43 Cache responses to addr requests (Gleb Naumenko)
7cc0e8101f Remove useless 2500 limit on AddrMan queries (Gleb Naumenko)
ded742bc5b Move filtering banned addrs inside GetAddresses() (Gleb Naumenko)
Pull request description:
This is a very simple code change with a big p2p privacy benefit.
It’s currently trivial to scrape any reachable node’s AddrMan (a database of all nodes known to them along with the timestamps).
We do have a limit of one GETADDR per connection, but a spy can disconnect and reconnect even from the same IP, and send GETADDR again and again.
Since we respond with 1,000 random records at most, depending on the AddrMan size it takes probably up to 100 requests for an spy to make sure they scraped (almost) everything.
I even have a script for that. It is totally doable within couple minutes.
Then, with some extra protocol knowledge a spy can infer the direct peers of the victim, and other topological stuff.
I suggest to cache responses to GETADDR on a daily basis, so that an attacker gets at most 1,000 records per day, and can’t track the changes in real time. I will be following up with more improvements to addr relay privacy, but this one alone is a very effective. And simple!
I doubt any of the real software does *reconnect to get new addrs from a given peer*, so we shouldn’t be cutting anyone.
I also believe it doesn’t have any negative implications on the overall topology quality. And the records being “outdated” for at most a day doesn’t break any honest assumptions either.
ACKs for top commit:
jnewbery:
reACK 3bd67ba5a4
promag:
Code review ACK 3bd67ba5a4.
ariard:
Code Review ACK 3bd67ba
Tree-SHA512: dfa5d03205c2424e40a3f8a41af9306227e1ca18beead3b3dda44aa2a082175bb1c6d929dbc7ea8e48e01aed0d50f0d54491caa1147471a2b72a46c3ca06b66f
Before this change, we would analyze the contents of `CNetAddr::ip[16]`
in order to tell which type is an address. Change this by introducing a
new member `CNetAddr::m_net` that explicitly tells the type of the
address.
This is necessary because in BIP155 we will not be able to tell the
address type by just looking at its raw representation (e.g. both TORv3
and I2P are "seemingly random" 32 bytes).
As a side effect of this change we no longer need to store IPv4
addresses encoded as IPv6 addresses - we can store them in proper 4
bytes (will be done in a separate commit). Also the code gets
somewhat simplified - instead of
`memcmp(ip, pchIPv4, sizeof(pchIPv4)) == 0` we can use
`m_net == NET_IPV4`.
Co-authored-by: Carl Dong <contact@carldong.me>
- Move the decision whether to translate an error message to where it is
defined. This simplifies call sites: no more `InitError(Untranslated(...))`.
- Make all functions in `util/error.h` consistently return a
`bilingual_str`. We've decided to use this as error message type so
let's roll with it.
This has no functional changes: no messages are changed, no new
translation messages are defined.