bitcoin/src/util at eb137184482cea8a2a7cdfcd3c73ec6ae5fdd0d6

highperfocused/bitcoin

mirror of https://github.com/bitcoin/bitcoin.git synced 2026-03-12 00:26:03 +01:00

Lőrinc 248b6a27c3 optimization: peel align-head and unroll body to 64 bytes

Benchmarks indicated that obfuscating multiple bytes at a time already gives an order-of-magnitude speed-up, but:
* GCC still emitted scalar code;
* Clang’s auto-vectorized loop ran on the slow unaligned-load path.

The fix consists of:
* peeling the misaligned head, so that the hot loop starts at an 8-byte-aligned address;
* `std::assume_aligned<8>`, which tells the optimizer that the alignment promise holds (required to keep Apple Clang happy);
* manually unrolling the body to 64 bytes, which enables GCC to auto-vectorize.

Note that the `target.size() > KEY_SIZE` condition is only an optimization: the aligned and unaligned loops work without it as well, which is why the alignment calculation still contains `std::min`.

> C++ compiler .......................... GNU 14.2.0

|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                0.03 |   32,464,658,919.11 |    0.0% |            0.50 |            0.11 |  4.474 |           0.08 |    0.0% |      5.29 | `ObfuscationBench`

> C++ compiler .......................... Clang 20.1.7

|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|                0.02 |   41,231,547,045.17 |    0.0% |            0.30 |            0.09 |  3.463 |           0.02 |    0.0% |      5.47 | `ObfuscationBench`

Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>

2025-07-16 14:37:19 -07:00

any.h

…

asmap.cpp

…

asmap.h

…

batchpriority.cpp

…

batchpriority.h

…

bip32.cpp

refactor: Use ToIntegral in ParseHDKeypath

2025-05-15 19:33:58 +02:00

bip32.h

…

bitdeque.h

…

bitset.h

util: use explicit cast in MultiIntBitSet::Fill()

2024-12-05 16:55:36 +01:00

byte_units.h

util: fix compiler warning about deprecated space before _MiB

2025-01-20 14:32:20 +01:00

bytevectorhash.cpp

…

bytevectorhash.h

…

chaintype.cpp

…

chaintype.h

…

check.cpp

fuzz: enable running fuzz test cases in Debug mode

2025-04-22 17:11:24 +10:00

check.h

fuzz: enable running fuzz test cases in Debug mode

2025-04-22 17:11:24 +10:00

CMakeLists.txt

Merge bitcoin/bitcoin#31375 : multiprocess: Add bitcoin wrapper executable

2025-05-27 12:38:19 -07:00

epochguard.h

…

exception.cpp

…

exception.h

…

exec.cpp

util: Add cross-platform ExecVp and GetExePath functions

2025-05-12 13:49:17 -05:00

exec.h

util: Add cross-platform ExecVp and GetExePath functions

2025-05-12 13:49:17 -05:00

fastrange.h

…

feefrac.cpp

scripted-diff: Use std::span over Span

2025-03-12 19:45:37 +01:00

feefrac.h

refactor: Sort includes of touched source files

2025-06-03 19:56:55 +02:00

fs_helpers.cpp

fs: remove _POSIX_C_SOURCE defining

2025-05-21 15:58:11 +01:00

fs_helpers.h

…

fs.cpp

util: Remove fsbridge::get_filesystem_error_message()

2025-04-30 10:41:34 +01:00

fs.h

util: Remove fsbridge::get_filesystem_error_message()

2025-04-30 10:41:34 +01:00

golombrice.h

…

hash_type.h

…

hasher.cpp

scripted-diff: Bump copyright headers after std::span changes

2025-03-12 19:46:54 +01:00

hasher.h

scripted-diff: Bump copyright headers after std::span changes

2025-03-12 19:46:54 +01:00

insert.h

…

macros.h

…

moneystr.cpp

…

moneystr.h

…

obfuscation.h

optimization: peel align-head and unroll body to 64 bytes

2025-07-16 14:37:19 -07:00

overflow.h

scripted-diff: modernize outdated trait patterns - values

2025-02-21 10:43:01 +01:00

overloaded.h

…

rbf.cpp

…

rbf.h

…

readwritefile.cpp

…

readwritefile.h

…

result.h

…

serfloat.cpp

…

serfloat.h

…

signalinterrupt.cpp

…

signalinterrupt.h

…

sock.cpp

scripted-diff: Bump copyright headers after std::span changes

2025-03-12 19:46:54 +01:00

sock.h

scripted-diff: Bump copyright headers after std::span changes

2025-03-12 19:46:54 +01:00

strencodings.cpp

refactor: Remove unused Parse(U)Int*

2025-05-19 17:16:13 +02:00

strencodings.h

doc: Remove ParseInt mentions in documentation

2025-05-20 06:50:50 +02:00

string.cpp

…

string.h

refactor: starts/ends_with changes for clang-tidy 20

2025-04-22 13:16:54 +01:00

subprocess.h

subprocess: Don't add an extra whitespace at end of Windows command line

2025-05-20 12:10:10 +01:00

syserror.cpp

scripted-diff: drop config/ subdir for bitcoin-config.h, rename to bitcoin-build-config.h

2024-10-10 12:22:12 +02:00

syserror.h

…

task_runner.h

…

thread.cpp

…

thread.h

…

threadinterrupt.cpp

…

threadinterrupt.h

…

threadnames.cpp

build: replace header checks with __has_include

2025-05-02 16:41:04 +01:00

threadnames.h

…

time.cpp

Add SetMockTime for time_point types

2025-07-09 13:57:54 +02:00

time.h

Add SetMockTime for time_point types

2025-07-09 13:57:54 +02:00

tokenpipe.cpp

scripted-diff: Bump copyright headers after include changes

2025-06-03 15:13:57 +02:00

tokenpipe.h

…

trace.h

tracing: only prepare tracepoint args if attached

2024-10-28 14:27:47 +01:00

transaction_identifier.h

scripted-diff: Replace GenTxidVariant with GenTxid

2025-07-08 20:00:51 +01:00

translation.h

refactor: Introduce struct to hold a runtime format string

2025-01-15 12:16:08 +01:00

types.h

…

ui_change_type.h

…

vecdeque.h

…

vector.h

scripted-diff: modernize outdated trait patterns - types

2025-02-21 10:41:27 +01:00