b0c706795c Remove unreliable seed from chainparams.cpp, and the associated README (SatsAndSports)
Pull request description:
The DNS seed `dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us.` is not returning a representative sample of bitcoin nodes. It currently returns nothing later than 28.1.0, breaching the policy.
This PR removes that seed from the list of DNS seeds
### Rationale
The [policy for seeds](https://github.com/bitcoin/bitcoin/blob/master/doc/dnsseed-policy.md) includes this:
> The DNS seed results must consist exclusively of fairly selected and functioning Bitcoin nodes from the public network
A number of comments below, in response to this PR, include apparent breaches of this policy: [1](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3458071231) [2](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3457655364), [3](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3457712557), in particular the first linked comment ([1](https://github.com/bitcoin/bitcoin/pull/33723#issuecomment-3458071231)) comparing the distribution at this seed to other seeds. This seed is not including anything later than 28.2.0, breaching this policy.
To ensure the policy is followed, and the seeds include a representative sample of Bitcoin nodes, this PR removes this seed from the list
### Data
I ran this:
```
# Get some ip address from that seed:
# Repeated multiple times, to get many different IPs:
dig +short dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us >> dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us
# For each distinct ip gathered from the seed, get basic info about the node, including it's User Agent string:
cat dnsseed.bitcoin.dashjr-list-of-p2p-nodes.us | sort -u | while read ip; do echo ===; echo $ip; nmap -p 8333 --script bitcoin-info "$ip"; done > seed_versions.txt
```
and then summarized the agents with `egrep 'User Agent' seed_versions.txt | sort | uniq -c` and got:
```
1 User Agent: /Satoshi:22.0.0/
1 User Agent: /Satoshi:22.1.0/
5 User Agent: /Satoshi:24.0.1/
1 User Agent: /Satoshi:25.1.0/
30 User Agent: /Satoshi:27.0.0/
1 User Agent: /Satoshi:27.1.0/
1 User Agent: /Satoshi:27.1.0/Knots:20240801/
1 User Agent: /Satoshi:28.0.0/
7 User Agent: /Satoshi:28.1.0/
2 User Agent: /Satoshi:28.1.0/Knots:20250305/
```
ACKs for top commit:
l0rinc:
reACK b0c706795c
delta1:
reACK b0c706795c
Crypt-iQ:
crACK b0c706795c
laanwj:
ACK b0c706795c
murchandamus:
ACK b0c706795c
RandyMcMillan:
ACK b0c7067
wiz:
ACK b0c706795c
dergoegge:
ACK b0c706795c
stickies-v:
re-ACK b0c706795c
mzumsande:
ACK b0c706795c
instagibbs:
ACK b0c706795c
Tree-SHA512: 7230b8dd24560ce6f8247e2e82ae7846ded8b91e230c59cc3643da3f5b9c12b5f025c1bb14490c19ca55f3794e81ce08106b31b3bf883d5c2dced05017123ac4
Historically, the headers have been bumped some time after a file has
been touched. Do it now to avoid having to touch them again in the
future for that reason.
-BEGIN VERIFY SCRIPT-
sed -i --regexp-extended 's;( 20[0-2][0-9])(-20[0-2][0-9])? The Bitcoin Core developers;\1-present The Bitcoin Core developers;g' $( git show --pretty="" --name-only HEAD~0 )
-END VERIFY SCRIPT-
The encoding arg is confusing, because it is not applied consistently
for all IO.
Also, it is useless, as the majority of files are ASCII encoded, which
are fine to encode and decode with any mode.
Moreover, UTF-8 is already required for most scripts to work properly,
so setting the encoding twice is redundant.
So remove the encoding from most IO. It would be fine to remove from all
IO, however I kept it for two files:
* contrib/asmap/asmap-tool.py: This specifically looks for utf-8
encoding errors, so it makes sense to sepecify the utf-8 encoding
explicitly.
* test/functional/test_framework/test_node.py: Reading the debug log in
text mode specifically counts the utf-8 characters (not bytes), so it
makes sense to specify the utf-8 encoding explicitly.
We shouldn't have | at the end of the last clause, as this will make it match
the empty string too (so effectively everything starting with Satoshi: matches).
While doing this, put the | at the beginning of every line of regex rather than
the end, to make it easier to update in the future without accidentally running
into this problem again.
Regenerate mainnet seeds from new sources without the need for hardcoded
data. Result has 512 nodes from each network type except cjdns, for
which only eight nodes were found that match the seed node criteria.
Pull additional nodes from virtu's crawler. Data includes sufficient
Onion and I2P nodes to align the uptime requirements for these networks
to that of clearnet nodes (i.e., 50%). Data also includes more than
three times the number of CJDNS nodes currently hardcoded into
nodes_main_manual.txt, so hardcoded nodes becomes obsolete.
The seeders now produce onion and i2p seeds, so there is no need to keep these
in the manual list.
Although should also be produced, there are not enough
good ones detected by the seeder, so we keep the manual seeds for them.
The crawlers are not guaranteed to output nodes in a random order, so
shuffle the ips list after parsing to break any biasing that may be
caused by the output order.
Update the user agent regex to match all 3 digits of the version number,
not just the first 2 digits.
Also updates it to include 24.2, 25.2, 26.1, 27.0, 27.1, 27.99, 28.0 and
28.99.
fa52e13ee8 test: Remove struct.pack from almost all places (MarcoFalke)
fa826db477 scripted-diff: test: Use int.to_bytes over struct packing (MarcoFalke)
faf2a975ad test: Use int.to_bytes over struct packing (MarcoFalke)
faf3cd659a test: Normalize struct.pack format (MarcoFalke)
Pull request description:
`struct.pack` has many issues:
* The format string consists of characters that may be confusing and may need to be looked up in the documentation, as opposed to using easy to understand self-documenting code.
This lead to many test bugs, which weren't hit, which is fine, but still confusing. Ref: https://github.com/bitcoin/bitcoin/pull/29400, https://github.com/bitcoin/bitcoin/pull/29399, https://github.com/bitcoin/bitcoin/pull/29363, fa3886b7c6, ...
Fix all issues by using the built-in `int` helpers `to_bytes` via a scripted diff.
Review notes:
* For `struct.pack` and `int.to_bytes` the error behavior is the same, although the error messages are not identical.
* Two `struct.pack` remain. One for float serialization in a C++ code comment, and one for native serialization.
ACKs for top commit:
achow101:
ACK fa52e13ee8
rkrux:
tACK [fa52e13](fa52e13ee8)
theStack:
Code-review ACK fa52e13ee8
Tree-SHA512: ee80d935b68ae43d1654b047e84ceb80abbd20306df35cca848b3f7874634b518560ddcbc7e714e2e7a19241e153dee64556dc4701287ae811e26e4f8c57fe3e
6abe772a17 contrib: Add asmap-tool (Fabian Jahr)
Pull request description:
This adds `asmap.py` and `asmap-tool.py` from sipa's `nextgen` branch: https://github.com/sipa/asmap/tree/nextgen
The motivation is that we should maintain the tooling for de- and encoding asmap files within the bitcoin core repository because it is not possible to use an asmap file that is not encoded.
We already had an earlier version of `asmap.py` within the seeds contrib tools. The newer version only had a small amount of changes and is still compatible, so the old version is removed from contrib/seeds and the new version is made available to `makeseeds.py`.
ACKs for top commit:
virtu:
ACK [6abe772](6abe772a17)
0xB10C:
ACK 6abe772a17
achow101:
ACK 6abe772a17
brunoerg:
ACK 6abe772a17
Tree-SHA512: cc2a82ffa4eb46fa0ce4ca769dd82f8d0d2f37fc3652aa748eeb060e1142f9da4035008fe89433e2fd524a4dc153b7b9c085748944b49137b37009b0c0be8afb
Since Python 3.9, type hinting has become a little less awkward, as for
collection types one doesn't need to import the corresponding
capitalized types (`Dict`, `List`, `Set`, `Tuple`, ...) anymore, but can
use the built-in types directly. [1] [2]
This commit applies the replacement for all Python scripts (i.e. in the
contrib and test folders) for the basic types:
- typing.Dict -> dict
- typing.List -> list
- typing.Set -> set
- typing.Tuple -> tuple
[1] https://docs.python.org/3.9/whatsnew/3.9.html#type-hinting-generics-in-standard-collections
[2] https://peps.python.org/pep-0585/#implementation for a list of type
a2bef805c1 kernel: update m_assumed_* chain params for 25.x (fanquake)
4128e01dba kernel: update chainTxData for 25.x (fanquake)
00b2b114b4 kernel: update nMinimumChainWork & defaultAssumeValid for 25.x (fanquake)
07fcc0a82c doc: update references to kernel/chainparams.cpp (fanquake)
Pull request description:
Update chainparams pre `25.x` branch off.
Co-Author in the commits as a PR (#27223) had previously been opened too-early to do the same.
Note: Remember that some variance is expected in the `m_assumed_*` sizes.
ACKs for top commit:
achow101:
ACK a2bef805c1
josibake:
ACK a2bef805c1
gruve-p:
ACK a2bef805c1
dergoegge:
ACK a2bef805c1 on the new mainnet params
Tree-SHA512: 0b19c2ef15c6b15863d6a560a1053ee223057c7bfb617ffd3400b1734cee8f75bc6fd7f04d8f8e3f5af6220659a1987951a1b36945d6fe17d06972004fd62610
3cc989da5c Fix checking bad dns seeds without casting (Yusuf Sahin HAMZA)
Pull request description:
- Since seed lines comes with `str` type, comparing `good` column directly with **0** (`int` type) in the if statement was not working at all. This is fixed by casting `int` type to the values in the `good` column of seeds text file.
- Lines that starts with comment in the seeds text file are now ignored.
- If statement for checking bad seeds are moved to the top of the `parseline` function as if a seed is bad; there is no point of going forward from there.
Since this bug-fix eliminates bad seeds over **550k** in the first place, in my case; particular job for parsing all seeds speed is up by **600%** and whole script's speed is up by **%30**.
Note that **stats** in the terminal are not going to include bad seeds after this fix, which would be the same if this bug were never there before.
ACKs for top commit:
achow101:
ACK 3cc989da5c
jonatack:
ACK 3cc989da5c
Tree-SHA512: 13c82681de4d72de07293f0b7f09721ad8514a2ad99b0584d1c94fa5f2818821df2000944f9514d6a222a5dccc82856d16c8c05aa36d905cfa7d4610c629fd38