66 Commits

Author SHA1 Message Date
TheCharlatan
23cc8ddff4 util: move HexStr and HexDigit from util to crypto
Move HexStr and HexDigit functions from util to crypto. The crypto library does
not actually use these functions, but the consensus library does. The consensus
and util libraries are not allowed to depend on each other, but are allowed to
depend on the crypto library, so the crypto library is a reasonable place to put these.

The consensus library uses HexStr and HexDigit in script.cpp, transaction.cpp,
and uint256.cpp.

The util library does not use HexStr but does use HexDigit in strencodings.cpp
to parse integers.
2024-05-16 17:16:08 +02:00
Ava Chow
c38157b9b9 Merge bitcoin/bitcoin#29606: refactor: Reserve memory for ToLower/ToUpper conversions
6f2f4a4d09 Reserve memory for ToLower/ToUpper conversions (Lőrinc)

Pull request description:

  Similarly to https://github.com/bitcoin/bitcoin/pull/29458, we're preallocating the result string based on the input string's length.
  The methods were already [covered by tests](https://github.com/bitcoin/bitcoin/blob/master/src/test/util_tests.cpp#L1250-L1276).

ACKs for top commit:
  tdb3:
    ACK for 6f2f4a4d09
  maflcko:
    lgtm ACK 6f2f4a4d09
  achow101:
    ACK 6f2f4a4d09
  Empact:
    Code Review ACK 6f2f4a4d09
  stickies-v:
    ACK 6f2f4a4d09

Tree-SHA512: e3ba7af77decdc73272d804c94fef0b11028a85f3c0ea1ed6386672611b1c35fce151f02e64f5bb5acb5ba506aaa54577719b07925b9cc745143cf5c7e5eb262
2024-03-13 08:18:06 -04:00
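The preallocation described in c38157b9b9 can be sketched as follows. This is a minimal illustration, not the code from the commit: `ToLowerChar` and `ToLowerSketch` are hypothetical names, and ASCII-only case mapping is an assumption.

```cpp
#include <string>
#include <string_view>

// ASCII-only, locale-independent lowercase of one character (hypothetical helper).
constexpr char ToLowerChar(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>((c - 'A') + 'a') : c;
}

// Hypothetical stand-in for the string overload: reserve once, then append.
std::string ToLowerSketch(std::string_view str)
{
    std::string result;
    result.reserve(str.size()); // output length always equals input length
    for (const char ch : str) result += ToLowerChar(ch);
    return result;
}
```

Because the result has exactly as many characters as the input, a single `reserve` call removes every reallocation from the loop.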
Lőrinc
6f2f4a4d09 Reserve memory for ToLower/ToUpper conversions 2024-03-08 23:06:22 +01:00
Lőrinc
a19235c14b Preallocate result in TryParseHex to avoid resizing
Running `make && ./src/bench/bench_bitcoin -filter=HexParse` a few times results in:
```
|           ns/base16 |            base16/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|                0.68 |    1,465,555,976.27 |    0.8% |      0.01 | `HexParse`
|                0.68 |    1,472,962,920.18 |    0.3% |      0.01 | `HexParse`
|                0.68 |    1,476,159,423.00 |    0.3% |      0.01 | `HexParse`
```
2024-02-28 17:23:54 +00:00
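A rough sketch of the idea behind a19235c14b: two hex characters decode to one byte, so the result vector can be sized before the loop starts. The names `HexDigitSketch` and `TryParseHexSketch` are hypothetical, and this simplified version assumes input without whitespace, which may differ from the real function.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Returns 0..15 for a hex digit, -1 otherwise (hypothetical helper).
constexpr int HexDigitSketch(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Simplified assumption: no whitespace handling, even length required.
std::optional<std::vector<uint8_t>> TryParseHexSketch(std::string_view str)
{
    if (str.size() % 2 != 0) return std::nullopt;
    std::vector<uint8_t> out;
    out.reserve(str.size() / 2); // two hex characters form one byte
    for (std::size_t i = 0; i < str.size(); i += 2) {
        const int hi = HexDigitSketch(str[i]);
        const int lo = HexDigitSketch(str[i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        out.push_back(static_cast<uint8_t>((hi << 4) | lo));
    }
    return out;
}
```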
MarcoFalke
fa8481b05f util: Work around ParseHex gcc cross compiler bug 2023-03-07 11:33:42 +01:00
MarcoFalke
faab273e06 util: Return empty vector on invalid hex encoding 2023-02-27 13:39:55 +01:00
Hennadii Stepanov
306ccd4927 scripted-diff: Bump copyright headers
-BEGIN VERIFY SCRIPT-
./contrib/devtools/copyright_header.py update ./
-END VERIFY SCRIPT-

Commits of previous years:
- 2021: f47dda2c58
- 2020: fa0074e2d8
- 2019: aaaaad6ac9
2022-12-24 23:49:50 +00:00
amadeuszpawlik
f8387c4234 Validate port value in SplitHostPort
Forward the validation of the port from `ParseUInt16(...)`.
Consider port 0 as invalid.
Add suitable test for the `SplitHostPort` function.
Add doxygen description to the `SplitHostPort` function.
2022-10-05 19:24:04 +02:00
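The port rule from f8387c4234 (parse as a 16-bit unsigned integer, then treat 0 as invalid) might look like the sketch below. `ParsePortSketch` is a hypothetical helper written for illustration; it is not the body of `SplitHostPort`, and using `std::from_chars` here is an assumption.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the port only if the whole string is a
// valid uint16_t and the value is not 0.
std::optional<uint16_t> ParsePortSketch(std::string_view s)
{
    uint16_t port{0};
    const char* const end = s.data() + s.size();
    const auto res = std::from_chars(s.data(), end, port);
    if (res.ec != std::errc{} || res.ptr != end) return std::nullopt; // not a clean uint16_t
    if (port == 0) return std::nullopt;                               // port 0 is invalid
    return port;
}
```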
MacroFake
fa875349e2 Fix iwyu 2022-08-20 09:33:01 +02:00
fanquake
07f2c25d04 refactor: add most of src/util to iwyu
These files change infrequently, and not much header shuffling is required.

We don't add everything in src/util/ yet, because IWYU makes some
dubious suggestions, which I'm going to follow up with upstream.
2022-07-08 11:06:01 +01:00
laanwj
0cd1a2eff9 Merge bitcoin/bitcoin#23595: util: Add ParseHex<std::byte>() helper
facd1fb911 refactor: Use Span of std::byte in CExtKey::SetSeed (MarcoFalke)
fae1006019 util: Add ParseHex<std::byte>() helper (MarcoFalke)
fabdf81983 test: Add test for embedded null in hex string (MarcoFalke)

Pull request description:

  This adds the hex->`std::byte` helper after the `std::byte`->hex helper was added in commit 9394964f6b

ACKs for top commit:
  pk-b2:
    ACK facd1fb911
  laanwj:
    Code review ACK facd1fb911

Tree-SHA512: e2329fbdea2e580bd1618caab31f5d0e59c245a028e1236662858e621929818870b76ab6834f7ac6a46d7874dfec63f498380ad99da6efe4218f720a60e859be
2022-05-20 10:47:30 +02:00
laanwj
fe6a299fc0 Merge bitcoin/bitcoin#24852: util: optimize HexStr
5e61532e72 util: optimizes HexStr (Martin Leitner-Ankerl)
4e2b99f72a bench: Adds a benchmark for HexStr (Martin Leitner-Ankerl)
67c8411c37 test: Adds a test for HexStr that checks all 256 bytes (Martin Leitner-Ankerl)

Pull request description:

  In my benchmark, this rewrite improves runtime 27% (g++) to 46% (clang++) for the benchmark `HexStrBench`:

  g++ 11.2.0
  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.94 |    1,061,381,310.36 |    0.7% |           12.00 |            3.01 |  3.990 |           1.00 |    0.0% |      0.01 | `HexStrBench` master
  |                0.68 |    1,465,366,544.25 |    1.7% |            6.00 |            2.16 |  2.778 |           1.00 |    0.0% |      0.01 | `HexStrBench` branch

  clang++ 13.0.1
  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.80 |    1,244,713,415.92 |    0.9% |           10.00 |            2.56 |  3.913 |           0.50 |    0.0% |      0.01 | `HexStrBench` master
  |                0.43 |    2,324,188,940.72 |    0.2% |            4.00 |            1.37 |  2.914 |           0.25 |    0.0% |      0.01 | `HexStrBench` branch

  Note that the idea for this change comes from denis2342 in #23364. This is a rewrite so no unaligned accesses occur.

  Also, the lookup table is now calculated at compile time, which hopefully makes the code a bit easier to review.

ACKs for top commit:
  laanwj:
    Code review ACK 5e61532e72
  aureleoules:
    tACK 5e61532e72.
  theStack:
    ACK 5e61532e72 🚤

Tree-SHA512: 40b53d5908332473ef24918d3a80ad1292b60566c02585fa548eb4c3189754971be5a70325f4968fce6d714df898b52d9357aba14d4753a8c70e6ffd273a2319
2022-05-04 20:36:09 +02:00
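The compile-time lookup table mentioned in fe6a299fc0 could be sketched like this. All names are hypothetical, the input type `std::vector<uint8_t>` is a simplification, and the use of a two-byte `std::memcpy` to avoid unaligned accesses is an assumption about how the rewrite achieves that goal.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using ByteAsHex = std::array<char, 2>;

// Builds the 256-entry byte -> two-hex-characters table at compile time.
constexpr std::array<ByteAsHex, 256> MakeByteToHexTable()
{
    constexpr char hexmap[] = "0123456789abcdef";
    std::array<ByteAsHex, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i][0] = hexmap[i >> 4];
        table[i][1] = hexmap[i & 15];
    }
    return table;
}

// Hypothetical stand-in for HexStr: one table lookup per input byte.
std::string HexStrSketch(const std::vector<uint8_t>& bytes)
{
    static constexpr auto table = MakeByteToHexTable();
    std::string out(bytes.size() * 2, '\0');
    char* it = out.data();
    for (const uint8_t v : bytes) {
        std::memcpy(it, table[v].data(), 2); // copies both characters, no alignment requirement
        it += 2;
    }
    return out;
}
```

Each byte costs one table lookup and one two-byte copy, which is consistent with the lower instruction count per byte reported in the benchmark tables.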
MarcoFalke
fae1006019 util: Add ParseHex<std::byte>() helper 2022-04-27 19:53:17 +02:00
MarcoFalke
fabdf81983 test: Add test for embedded null in hex string
Also, fix style in the corresponding function. The style change can be
reviewed with "--word-diff-regex=."
2022-04-27 19:18:20 +02:00
Pieter Wuille
e7d2fbda63 Use std::string_view throughout util strencodings/string 2022-04-27 14:13:39 +02:00
Pieter Wuille
8ffbd1412d Make DecodeBase{32,64} take string_view arguments 2022-04-27 14:12:55 +02:00
Pieter Wuille
78f3ac51b7 Make DecodeBase{32,64} return optional instead of taking bool* 2022-04-27 14:12:55 +02:00
Pieter Wuille
a65931e3ce Make DecodeBase{32,64} always return vector, not string
Base32/base64 are mechanisms for encoding binary data. That they'd
decode to a string is just bizarre. The fact that they'd do that
based on the type of input arguments even more so.
2022-04-27 14:12:55 +02:00
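To illustrate the interface direction of a65931e3ce and the neighbouring commits (binary output as a vector, failure reported through `std::optional`), here is a self-contained sketch. `DecodeBase64Sketch` is a hypothetical function, not the project's decoder, and it assumes padded standard-alphabet input.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Maps a base64 character to its 6-bit value, or -1 if outside the alphabet.
constexpr int Base64Value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Hypothetical decoder: returns bytes on success, std::nullopt on invalid input.
std::optional<std::vector<unsigned char>> DecodeBase64Sketch(std::string_view str)
{
    if (str.size() % 4 != 0) return std::nullopt; // assumption: padded input only
    std::size_t pad = 0;
    while (pad < 2 && pad < str.size() && str[str.size() - 1 - pad] == '=') ++pad;
    const std::string_view body = str.substr(0, str.size() - pad);

    std::vector<unsigned char> out;
    out.reserve(body.size() * 3 / 4);
    uint32_t acc = 0;
    int bits = 0;
    for (const char c : body) {
        const int v = Base64Value(c);
        if (v < 0) return std::nullopt;
        acc = (acc << 6) | static_cast<uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<unsigned char>((acc >> bits) & 0xFF));
        }
    }
    // Any leftover bits must be zero, otherwise the encoding is not canonical.
    if ((acc & ((1u << bits) - 1)) != 0) return std::nullopt;
    return out;
}
```

With this shape a caller cannot silently ignore a decoding failure, because the bytes are only reachable through the optional.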
Pieter Wuille
a4377a0843 Reject incorrect base64 in HTTP auth
In addition, to make sure that no call site ignores the invalid
decoding status, make the pf_invalid argument mandatory.
2022-04-27 14:12:55 +02:00
Pieter Wuille
d648b5120b Make SanitizeString use string_view 2022-04-27 14:12:55 +02:00
Pieter Wuille
963bc9b576 Make IsHexNumber use string_view 2022-04-27 14:12:55 +02:00
Pieter Wuille
40062997f2 Make IsHex use string_view 2022-04-27 14:12:55 +02:00
Pieter Wuille
c1d165a8c2 Make ParseHex use string_view 2022-04-27 14:12:55 +02:00
Martin Leitner-Ankerl
5e61532e72 util: optimizes HexStr
In my benchmark, this rewrite improves runtime 27% (g++) to 46% (clang++) for the benchmark `HexStrBench`:

g++ 11.2.0
|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                0.94 |    1,061,381,310.36 |    0.7% |           12.00 |            3.01 |  3.990 |           1.00 |    0.0% |      0.01 | `HexStrBench` master
|                0.68 |    1,465,366,544.25 |    1.7% |            6.00 |            2.16 |  2.778 |           1.00 |    0.0% |      0.01 | `HexStrBench` branch

clang++ 13.0.1
|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                0.80 |    1,244,713,415.92 |    0.9% |           10.00 |            2.56 |  3.913 |           0.50 |    0.0% |      0.01 | `HexStrBench` master
|                0.43 |    2,324,188,940.72 |    0.2% |            4.00 |            1.37 |  2.914 |           0.25 |    0.0% |      0.01 | `HexStrBench` branch

Note that the idea for this change comes from denis2342 in PR 23364. This is a rewrite so no unaligned accesses occur.

Also, the lookup table is now calculated at compile time, which hopefully makes the code a bit easier to review.
2022-04-17 14:29:52 +02:00
fanquake
243a9c3925 Merge bitcoin/bitcoin#24297: Fix unintended unsigned integer overflow in strencodings
fac9fe5d05 Fix unintended unsigned integer overflow in strencodings (MarcoFalke)

Pull request description:

  This fixes two issues for strings that start with a colon and only have one colon:

  * `fMultiColon` is incorrectly set to `true`
  * There is an unsigned integer overflow `colon - 1` (`0 - 1`)

  Neither issue matters, as the result is discarded. Though, it makes sense to still fix the issue for clarity and to avoid sanitizer issues in the function.

ACKs for top commit:
  laanwj:
    Code review ACK fac9fe5d05
  shaavan:
    Code Review ACK fac9fe5d05

Tree-SHA512: e71c21a0b617abf241e561ce6b90b963e2d5e2f77bd9547ce47209a1a94b454384391f86ef5d35fedd4f4df19add3896bb3d61fed396ebba8e864e3eeb75ed59
2022-02-10 07:17:32 +00:00
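The two issues listed in 243a9c3925 share one cause, shown in the sketch below. The function names are hypothetical and the logic is reconstructed from the pull request description only: when the single colon sits at index 0, `colon - 1` wraps around to `npos`, so the second search scans the whole string and finds the same colon again.

```cpp
#include <cstddef>
#include <string_view>

// Pre-fix logic as described: wraps to npos when colon == 0.
bool HasMultipleColonsBuggy(std::string_view in)
{
    const std::size_t colon = in.find_last_of(':');
    const bool have_colon = colon != in.npos;
    return have_colon && in.find_last_of(':', colon - 1) != in.npos;
}

// Guarded variant: never evaluates `colon - 1` when colon == 0.
bool HasMultipleColonsFixed(std::string_view in)
{
    const std::size_t colon = in.find_last_of(':');
    const bool have_colon = colon != in.npos;
    return have_colon && colon != 0 && in.find_last_of(':', colon - 1) != in.npos;
}
```

For the input `":8333"` the unguarded version reports multiple colons, while the guarded one does not.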
MarcoFalke
fa2f7d0059 fuzz: Avoid unsigned integer overflow in FormatParagraph 2022-02-09 14:38:22 +01:00
MarcoFalke
fac9fe5d05 Fix unintended unsigned integer overflow in strencodings 2022-02-09 13:24:55 +01:00
Hennadii Stepanov
f47dda2c58 scripted-diff: Bump copyright headers
-BEGIN VERIFY SCRIPT-
./contrib/devtools/copyright_header.py update ./
-END VERIFY SCRIPT-

Commits of previous years:
* 2020: fa0074e2d8
* 2019: aaaaad6ac9
2021-12-30 19:36:57 +02:00
MarcoFalke
fa5865a9e3 Reduce size of strencodings decode tables 2021-12-13 09:58:20 +01:00
MarcoFalke
fad6761cf7 Fix implicit integer sign changes in strencodings 2021-12-13 09:57:33 +01:00
MarcoFalke
9394964f6b Merge bitcoin/bitcoin#23451: span: Add std::byte helpers
faa3ec2304 span: Add std::byte helpers (MarcoFalke)
fa18038f51 refactor: Use ignore helper when unserializing an invalid pubkey (MarcoFalke)
fabe18d0b3 Use value_type in CDataStream where possible (MarcoFalke)

Pull request description:

  This adds (currently unused) span std::byte helpers, so that they can be used in new code.

  The refactors are also required for https://github.com/bitcoin/bitcoin/pull/23438, but they are split up because the other pull doesn't compile with msvc right now.

  The third commit is not needed for the other pull, but still nice.

ACKs for top commit:
  klementtan:
    reACK  faa3ec2. Verified that all the new `std::byte` helper functions are tested.
  laanwj:
    Code review ACK faa3ec2304

Tree-SHA512: b1f6af39f03ea4dfebf20d4a8538fa993a6104e7fc92ddf0c4606a7efc3ca9a8c1a4741d98a1418569c11bb9ce9258bf0c0c06d93d85ed7e208902a2db04e407
2021-11-24 11:04:37 +01:00
Douglas Chimento
21b58f430f util: ParseByteUnits - Parse a string with suffix unit [k|K|m|M|g|G|t|T]
A convenience utility for human readable arguments/config e.g. -maxuploadtarget=500g
2021-11-17 12:47:30 +02:00
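A sketch of suffix parsing in the spirit of 21b58f430f follows. `ParseByteUnitsSketch` is a hypothetical function with a simplified signature. The mapping of lowercase suffixes to powers of 1000 and uppercase suffixes to powers of 1024 is an assumption of this sketch; the commit message only lists the accepted suffix characters.

```cpp
#include <charconv>
#include <cstdint>
#include <limits>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: "500g" -> 500 * 10^9, "1K" -> 1024, "42" -> 42.
std::optional<uint64_t> ParseByteUnitsSketch(std::string_view str)
{
    if (str.empty()) return std::nullopt;
    uint64_t multiplier = 1;
    switch (str.back()) { // assumption: lowercase is decimal, uppercase is binary
    case 'k': multiplier = 1'000ULL; break;
    case 'K': multiplier = 1ULL << 10; break;
    case 'm': multiplier = 1'000'000ULL; break;
    case 'M': multiplier = 1ULL << 20; break;
    case 'g': multiplier = 1'000'000'000ULL; break;
    case 'G': multiplier = 1ULL << 30; break;
    case 't': multiplier = 1'000'000'000'000ULL; break;
    case 'T': multiplier = 1ULL << 40; break;
    default: break; // no suffix: plain byte count
    }
    if (multiplier != 1) str.remove_suffix(1);

    uint64_t value = 0;
    const char* const end = str.data() + str.size();
    const auto res = std::from_chars(str.data(), end, value);
    if (res.ec != std::errc{} || res.ptr != end) return std::nullopt;
    // Reject results that would overflow uint64_t.
    if (value > std::numeric_limits<uint64_t>::max() / multiplier) return std::nullopt;
    return value * multiplier;
}
```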
MarcoFalke
faa3ec2304 span: Add std::byte helpers
Also, add Span<std::byte> interface to strencodings.
2021-11-09 17:42:13 +01:00
MarcoFalke
42fedb4acd Merge bitcoin/bitcoin#23156: refactor: Remove unused ParsePrechecks and ParseDouble
fa9d72a794 Remove unused ParseDouble and ParsePrechecks (MarcoFalke)
fa3cd28535 refactor: Remove unused ParsePrechecks from ParseIntegral (MarcoFalke)

Pull request description:

  All of the `ParsePrechecks` are already done by `ToIntegral`, so remove them from `ParseIntegral`.

  Also:
  * Remove redundant `{}`. See https://github.com/bitcoin/bitcoin/pull/20457#discussion_r720116866
  * Add missing failing c-string test case
  * Add missing failing test cases for non-int32_t integral types

ACKs for top commit:
  laanwj:
    Code review ACK fa9d72a794, good find on ParseDouble not being used at all, and testing for behavior of embedded NULL characters is always a good thing.
  practicalswift:
    cr ACK fa9d72a794

Tree-SHA512: 3d654dcaebbf312dd57e54241f9aa6d35b1d1d213c37e4c6b8b9a69bcbe8267a397474a8b86b57740fbdd8e3d03b4cdb6a189a9eb8e05cd38035dab195410aa7
2021-10-04 15:06:37 +02:00
MarcoFalke
fa9d72a794 Remove unused ParseDouble and ParsePrechecks 2021-10-04 09:46:17 +02:00
MarcoFalke
fa3cd28535 refactor: Remove unused ParsePrechecks from ParseIntegral
Also:
* Remove redundant {} from return statement
* Add missing failing c-string test case and "-" and "+" strings
* Add missing failing test cases for non-int32_t integral types
2021-10-01 18:05:33 +02:00
practicalswift
4343f114cc Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
test: Add test cases for LocaleIndependentAtoi

fuzz: Assert legacy atoi(s) == LocaleIndependentAtoi<int>(s)

fuzz: Assert legacy atoi64(s) == LocaleIndependentAtoi<int64_t>(s)
2021-09-30 14:21:17 +00:00
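An `atoi`-like wrapper over `std::from_chars`, as named in 4343f114cc, might be sketched as below. `AtoiSketch` is a hypothetical name. The handling of leading whitespace and `+` emulates `atoi`; returning 0 on every failure, including out-of-range input, is a simplification chosen for this sketch, since the commit message does not say how overflow is treated.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>
#include <type_traits>

template <typename T>
T AtoiSketch(std::string_view s)
{
    static_assert(std::is_integral_v<T>);
    // atoi skips leading whitespace; from_chars does not.
    while (!s.empty() && (s.front() == ' ' || (s.front() >= '\t' && s.front() <= '\r'))) {
        s.remove_prefix(1);
    }
    // from_chars rejects a leading '+', so strip one (but do not accept "+-").
    if (!s.empty() && s.front() == '+') {
        if (s.size() >= 2 && s[1] == '-') return 0;
        s.remove_prefix(1);
    }
    T result{0};
    const auto res = std::from_chars(s.data(), s.data() + s.size(), result);
    if (res.ec != std::errc{}) return 0; // simplification: 0 on any failure
    return result;                       // trailing garbage is ignored, like atoi
}
```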
practicalswift
4747db8761 util: Introduce ToIntegral<T>(const std::string&) for locale independent parsing using std::from_chars(…) (C++17)
util: Avoid locale dependent functions strtol/strtoll/strtoul/strtoull in ParseInt32/ParseInt64/ParseUInt32/ParseUInt64

fuzz: Assert equivalence between new and old Parse{Int,Uint}{8,32,64} functions

test: Add unit tests for ToIntegral<T>(const std::string&)
2021-09-18 04:31:24 +00:00
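The strict parsing introduced by 4747db8761 can be approximated with the sketch below. `ToIntegralSketch` is a hypothetical name, and it takes a `std::string_view` rather than the `const std::string&` in the commit title. The key point is that `std::from_chars` ignores the locale, and the result is accepted only if the entire input was consumed.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>
#include <type_traits>

template <typename T>
std::optional<T> ToIntegralSketch(std::string_view str)
{
    static_assert(std::is_integral_v<T>);
    T result{};
    const char* const end = str.data() + str.size();
    const auto res = std::from_chars(str.data(), end, result);
    // Fail on out-of-range values, non-numeric input, and trailing characters.
    if (res.ec != std::errc{} || res.ptr != end) return std::nullopt;
    return result;
}
```

Leading whitespace, a leading `+`, and trailing characters all produce `std::nullopt`, which is the behaviour separate pre-checks would otherwise have to enforce.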
W. J. van der Laan
1ed859e90e Merge bitcoin/bitcoin#21173: util: faster HexStr => 13% faster blockToJSON
74bf850ac4 faster HexStr => 13% faster blockToJSON (Martin Ankerl)

Pull request description:

  `std::string`'s push_back is rather slow because it needs to check & update the string size. For
  `HexStr` the output string size is already easily known, so we can initially create the string with
  the correct size and then just assign the data.

  `HexStr` is heavily used in `blockToJSON`, so this change is a noticeable benefit. Benchmark on an i7-8700 @3.2GHz:

  * 71,315,461.00 ns/op master
  * 62,842,490.00 ns/op this commit

  So this little change makes `blockToJSON` about 13% faster.

ACKs for top commit:
  laanwj:
    Code review ACK 74bf850ac4
  theStack:
    re-ACK 74bf850ac4

Tree-SHA512: fc99105123edc11f4e40ed77aea80cf7f32e49c53369aa364b38395dcb48575e15040b0489ed30d0fe857c032a04e225c33e9d95cdfa109a3cb5a6ec9a972415
2021-05-19 10:07:53 +02:00
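The technique described in 1ed859e90e, creating the string at its final size and assigning into it instead of calling `push_back`, is sketched here. `HexStrPresized` is a hypothetical name and the `std::vector<uint8_t>` parameter is a simplification.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical variant of HexStr that never grows the string inside the loop.
std::string HexStrPresized(const std::vector<uint8_t>& data)
{
    static constexpr char hexmap[] = "0123456789abcdef";
    std::string out(data.size() * 2, '\0'); // final size is known up front
    std::size_t pos = 0;
    for (const uint8_t v : data) {
        out[pos++] = hexmap[v >> 4]; // high nibble
        out[pos++] = hexmap[v & 15]; // low nibble
    }
    return out;
}
```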
Jon Atack
6f09c0f6b5 util: add missing braces and apply clang format to SplitHostPort() 2021-03-16 19:52:35 +01:00
Jon Atack
2875a764f7 util: add ParseUInt16(), use it in SplitHostPort() 2021-03-16 19:52:33 +01:00
Jon Atack
6423c8175f p2p, refactor: pass and use uint16_t CService::port as uint16_t 2021-03-16 19:52:31 +01:00
Martin Ankerl
74bf850ac4 faster HexStr => 13% faster blockToJSON
`std::string`'s push_back is rather slow because it needs to check & update the string size. For
`HexStr` the output string size is already easily known, so we can initially create the string with
the correct size and then just assign the data.

`HexStr` is heavily used in `blockToJSON`, so this change is a noticeable benefit. Benchmark on an i7-8700 @3.2GHz:

* 71,315,461.00 ns/op master
* 62,842,490.00 ns/op this commit

So this little change makes `blockToJSON` about 13% faster.
2021-02-16 07:33:55 +01:00
practicalswift
4848e71107 scripted-diff: Use [[nodiscard]] (C++17) instead of NODISCARD
-BEGIN VERIFY SCRIPT-
sed -i "s/NODISCARD/[[nodiscard]]/g" $(git grep -l "NODISCARD" ":(exclude)src/bench/nanobench.h" ":(exclude)src/attributes.h")
-END VERIFY SCRIPT-
2020-11-26 09:05:59 +00:00
Vasil Dimov
7be6ff6187 net: recognize TORv3/I2P/CJDNS networks
Recognizing addresses from those networks allows us to accept and gossip
them, even though we don't know how to connect to them (yet).

Co-authored-by: eriknylund <erik@daychanged.com>
2020-09-21 10:13:34 +02:00
Sebastian Falbesoner
e2aa1a585a util: make EncodeBase64 consume Spans 2020-08-25 18:52:57 +02:00
Sebastian Falbesoner
2bc207190e util: make EncodeBase32 consume Spans 2020-08-25 18:52:51 +02:00
MarcoFalke
8d6224fefe Merge #19628: net: change CNetAddr::ip to have flexible size
102867c587 net: change CNetAddr::ip to have flexible size (Vasil Dimov)
1ea57ad674 net: don't accept non-left-contiguous netmasks (Vasil Dimov)

Pull request description:

  (chopped off from #19031 to ease review)

  Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes,
  not being able to store larger addresses (e.g. TORv3) and encoded
  smaller ones as 16-byte IPv6 addresses.

  Change its type to `prevector`, so that it can hold larger addresses and
  does not disguise non-IPv6 addresses as IPv6. So the IPv4 address
  `1.2.3.4` is now encoded as `01020304` instead of
  `00000000000000000000FFFF01020304`.

  Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or
  "IP address" (TOR addresses are not IP addresses).

  In order to preserve backward compatibility with serialization (where
  e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`)
  introduce `CNetAddr` dedicated legacy serialize/unserialize methods.

  Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16
  bytes, but use the first 4 for IPv4 (not the last 4). Do not accept
  invalid netmasks that have 0-bits followed by 1-bits and only allow
  subnetting for IPv4 and IPv6.

  Co-authored-by: Carl Dong <contact@carldong.me>

ACKs for top commit:
  sipa:
    utACK 102867c587
  MarcoFalke:
    Concept ACK 102867c587
  ryanofsky:
    Code review ACK 102867c587. Just many suggested updates since last review. Thanks for following up on everything!
  jonatack:
    re-ACK 102867c587 diff review, code review, build/tests/running bitcoind with ipv4/ipv6/onion peers
  kallewoof:
    ACK 102867c587

Tree-SHA512: d60bf716cecf8d3e8146d2f90f897ebe956befb16f711a24cfe680024c5afc758fb9e4a0a22066b42f7630d52cf916318bedbcbc069ae07092d5250a11e8f762
2020-08-25 18:10:25 +02:00
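The netmask rule from 8d6224fefe (reject masks that have 0-bits followed by 1-bits) can be checked bit by bit, as in this sketch. `IsLeftContiguousMaskSketch` is a hypothetical helper, not the check used in `CSubNet`.

```cpp
#include <cstdint>
#include <vector>

// Returns true if the mask is a run of 1-bits followed only by 0-bits,
// reading from the most significant bit of the first byte.
bool IsLeftContiguousMaskSketch(const std::vector<uint8_t>& mask)
{
    bool seen_zero = false;
    for (const uint8_t byte : mask) {
        for (int bit = 7; bit >= 0; --bit) {
            const bool one = ((byte >> bit) & 1) != 0;
            if (one && seen_zero) return false; // a 1-bit after a 0-bit
            if (!one) seen_zero = true;
        }
    }
    return true;
}
```

For example, `255.255.240.0` passes, while `255.0.255.0` is rejected.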
Vasil Dimov
102867c587 net: change CNetAddr::ip to have flexible size
Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes,
not being able to store larger addresses (e.g. TORv3) and encoded
smaller ones as 16-byte IPv6 addresses.

Change its type to `prevector`, so that it can hold larger addresses and
does not disguise non-IPv6 addresses as IPv6. So the IPv4 address
`1.2.3.4` is now encoded as `01020304` instead of
`00000000000000000000FFFF01020304`.

Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or
"IP address" (TOR addresses are not IP addresses).

In order to preserve backward compatibility with serialization (where
e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`)
introduce `CNetAddr` dedicated legacy serialize/unserialize methods.

Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16
bytes, but use the first 4 for IPv4 (not the last 4). Only allow
subnetting for IPv4 and IPv6.

Co-authored-by: Carl Dong <contact@carldong.me>
2020-08-24 21:50:59 +02:00
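The legacy form mentioned in 102867c587, where `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`, is the 4-byte address behind a 12-byte IPv4-mapped prefix. A minimal sketch, with hypothetical type and function names:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

using IPv4Bytes = std::array<uint8_t, 4>;
using LegacyBytes = std::array<uint8_t, 16>;

// Hypothetical helper: embeds a 4-byte IPv4 address into the 16-byte
// legacy layout (ten 0x00 bytes, then 0xFF 0xFF, then the address).
LegacyBytes ToLegacyIPv6Sketch(const IPv4Bytes& ipv4)
{
    LegacyBytes out{}; // zero-initialized
    out[10] = 0xFF;
    out[11] = 0xFF;
    std::copy(ipv4.begin(), ipv4.end(), out.begin() + 12);
    return out;
}
```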
Sebastian Falbesoner
71e0f07e9c util: remove unused c-string variant of atoi64() 2020-08-17 17:56:59 +02:00