Merge bitcoin/bitcoin#31144: [IBD] multi-byte block obfuscation

248b6a27c3 optimization: peel align-head and unroll body to 64 bytes (Lőrinc)
e7114fc6dc optimization: migrate fixed-size obfuscation from `std::vector<std::byte>` to `uint64_t` (Lőrinc)
478d40afc6 refactor: encapsulate `vector`/`array` keys into `Obfuscation` (Lőrinc)
377aab8e5a refactor: move `util::Xor` to `Obfuscation().Xor` (Lőrinc)
fa5d296e3b refactor: prepare mempool_persist for obfuscation key change (Lőrinc)
6bbf2d9311 refactor: prepare `DBWrapper` for obfuscation key change (Lőrinc)
0b8bec8aa6 scripted-diff: unify xor-vs-obfuscation nomenclature (Lőrinc)
972697976c bench: make ObfuscationBench more representative (Lőrinc)
618a30e326 test: compare util::Xor with randomized inputs against simple impl (Lőrinc)
a5141cd39e test: make sure dbwrapper obfuscation key is never obfuscated (Lőrinc)
54ab0bd64c refactor: commit to 8 byte obfuscation keys (Lőrinc)
7aa557a37b random: add fixed-size `std::array` generation (Lőrinc)

Pull request description:

  This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](https://github.com/bitcoin/bitcoin/pull/32043)

  ### Summary

  Current block obfuscations are done byte-by-byte, this PR batches them to 64 bit primitives to speed up obfuscating bigger memory batches.
  This is especially relevant now that https://github.com/bitcoin/bitcoin/pull/31551 was merged, having bigger obfuscatable chunks.

  Since this obfuscation is optional, the speedup measured here depends on whether it's a [random value](https://github.com/bitcoin/bitcoin/pull/31144#issuecomment-2523295114) or [completely turned off](https://github.com/bitcoin/bitcoin/pull/31144#issuecomment-2519764142) (i.e. XOR-ing with 0).

  ### Changes in testing, benchmarking and implementation

  * Added new tests comparing randomized inputs against a trivial implementation and performing roundtrip checks with random chunks.
  * Migrated `std::vector<std::byte>(8)` keys to plain `uint64_t`;
  * Process unaligned bytes separately and unroll body to 64 bytes.

  ### Assembly

  Memory alignment is enforced by a small peel-loop (`std::memcpy` is optimized out on tested platform), with an `std::assume_aligned<8>` check, see the Godbolt listing at https://godbolt.org/z/59EMv7h6Y for details

  <details>
  <summary>Details</summary>

  Target & Compiler | Stride (per hot-loop iter) | Main operation(s) in loop | Effective XORs / iter
  -- | -- | -- | --
  Clang x86-64 (trunk) | 64 bytes | 4 × movdqu → pxor → store | 8 × 64-bit
  GCC x86-64 (trunk) | 64 bytes | 4 × movdqu/pxor sequence, enabled by 8-way unroll | 8 × 64-bit
  GCC RV32 (trunk) | 8 bytes | copy 8 B to temp → 2 × 32-bit XOR → copy back | 1 × 64-bit (as 2 × 32-bit)
  GCC s390x (big-endian 14.2) | 64 bytes | 8 × XC (mem-mem 8-B XOR) with key cached on stack | 8 × 64-bit

  </details>

  ### Endianness

  The only endianness issue was with bit rotation, intended to realign the key if obfuscation halted before full key consumption.
  Elsewhere, memory is read, processed, and written back in the same endianness, preserving byte order.
  Since CI lacks a big-endian machine, testing was done locally via Docker.
  <details>
  <summary>Details</summary>

  ```bash
  brew install podman pigz
  softwareupdate --install-rosetta
  podman machine init
  podman machine start
  docker run --platform linux/s390x -it ubuntu:latest /bin/bash
    apt update && apt install -y git build-essential cmake ccache pkg-config libevent-dev libboost-dev libssl-dev libsqlite3-dev python3 && \
    cd /mnt && git clone --depth=1 https://github.com/bitcoin/bitcoin.git && cd bitcoin && git remote add l0rinc https://github.com/l0rinc/bitcoin.git && git fetch --all && git checkout l0rinc/optimize-xor && \
    cmake -B build && cmake --build build --target test_bitcoin -j$(nproc) && \
    ./build/bin/test_bitcoin --run_test=streams_tests
  ```

  </details>

  ### Measurements (micro benchmarks and full IBDs)

  > cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=gcc/clang -DCMAKE_CXX_COMPILER=g++/clang++ && \
    cmake --build build -j$(nproc) && \
    build/bin/bench_bitcoin -filter='ObfuscationBench' -min-time=5000

  <details>
  <summary>GNU 14.2.0</summary>

  > Before:

  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.84 |    1,184,138,235.64 |    0.0% |            9.01 |            3.03 |  2.971 |           1.00 |    0.1% |      5.50 | `ObfuscationBench`

  > After (first optimizing commit):

  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.04 |   28,365,698,819.44 |    0.0% |            0.34 |            0.13 |  2.714 |           0.07 |    0.0% |      5.33 | `ObfuscationBench`

  > and (second optimizing commit):

  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.03 |   32,464,658,919.11 |    0.0% |            0.50 |            0.11 |  4.474 |           0.08 |    0.0% |      5.29 | `ObfuscationBench`

  </details>

  <details>
  <summary>Clang 20.1.7</summary>

  > Before:

  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.89 |    1,124,087,330.23 |    0.1% |            6.52 |            3.20 |  2.041 |           0.50 |    0.2% |      5.50 | `ObfuscationBench`

  > After (first optimizing commit):

  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.08 |   13,012,464,203.00 |    0.0% |            0.65 |            0.28 |  2.338 |           0.13 |    0.8% |      5.50 | `ObfuscationBench`

  > and (second optimizing commit):

  |             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
  |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
  |                0.02 |   41,231,547,045.17 |    0.0% |            0.30 |            0.09 |  3.463 |           0.02 |    0.0% |      5.47 | `ObfuscationBench`

  </details>

  i.e. 27.4x faster obfuscation with GCC, 36.7x faster with Clang

  For other benchmark speedups see  https://corecheck.dev/bitcoin/bitcoin/pulls/31144

  ------

  Running an IBD until 888888 blocks reveals a 4% speedup.

  <details>
  <summary>Details</summary>

  SSD:

  ```bash
  COMMITS="8324a00bd4a6a5291c841f2d01162d8a014ddb02 5ddfd31b4158a89b0007cfb2be970c03d9278525"; \
  STOP_HEIGHT=888888; DBCACHE=1000; \
  CC=gcc; CXX=g++; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  (for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
  hyperfine \
    --sort 'command' \
    --runs 1 \
    --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$CC.json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
      cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF && \
      cmake --build build -j$(nproc) --target bitcoind && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
    --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
    "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
  ```

  > 8324a00bd4 test: Compare util::Xor with randomized inputs against simple impl
  > 5ddfd31b41 optimization: Xor 64 bits together instead of byte-by-byte

  ```python
  Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 8324a00bd4a6a5291c841f2d01162d8a014ddb02)
    Time (abs ≡):        25033.413 s               [User: 33953.984 s, System: 2613.604 s]

  Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 5ddfd31b4158a89b0007cfb2be970c03d9278525)
    Time (abs ≡):        24110.710 s               [User: 33389.536 s, System: 2660.292 s]

  Relative speed comparison
          1.04          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 8324a00bd4a6a5291c841f2d01162d8a014ddb02)
          1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 5ddfd31b4158a89b0007cfb2be970c03d9278525)
  ```

  > HDD:

  ```bash
  COMMITS="71eb6eaa740ad0b28737e90e59b89a8e951d90d9 46854038e7984b599d25640de26d4680e62caba7"; \
  STOP_HEIGHT=888888; DBCACHE=4500; \
  CC=gcc; CXX=g++; \
  BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
  (for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
  hyperfine \
    --sort 'command' \
    --runs 2 \
    --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$CC.json" \
    --parameter-list COMMIT ${COMMITS// /,} \
    --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
      cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF && cmake --build build -j$(nproc) --target bitcoind && \
      ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
    --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
    "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
  ```

  > 71eb6eaa74 test: compare util::Xor with randomized inputs against simple impl
  > 46854038e7 optimization: migrate fixed-size obfuscation from `std::vector<std::byte>` to `uint64_t`

  ```python
  Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 71eb6eaa740ad0b28737e90e59b89a8e951d90d9)
    Time (mean ± σ):     37676.293 s ± 83.100 s    [User: 36900.535 s, System: 2220.382 s]
    Range (min … max):   37617.533 s … 37735.053 s    2 runs

  Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 46854038e7984b599d25640de26d4680e62caba7)
    Time (mean ± σ):     36181.287 s ± 195.248 s    [User: 34962.822 s, System: 1988.614 s]
    Range (min … max):   36043.226 s … 36319.349 s    2 runs

  Relative speed comparison
          1.04 ±  0.01  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 71eb6eaa740ad0b28737e90e59b89a8e951d90d9)
          1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=4500 -blocksonly -printtoconsole=0 (COMMIT = 46854038e7984b599d25640de26d4680e62caba7)
  ```

  </details>

ACKs for top commit:
  achow101:
    ACK 248b6a27c3
  maflcko:
    review ACK 248b6a27c3 🎻
  ryanofsky:
    Code review ACK 248b6a27c3. Looks good! Thanks for adapting this and considering all the suggestions. I did leave more comments below but non are important and this looks good as-is

Tree-SHA512: ef541cd8a1f1dc504613c4eaa708202e32ae5ac86f9c875e03bcdd6357121f6af0860ef83d513c473efa5445b701e59439d416effae1085a559716b0fd45ecd6
This commit is contained in:
Ava Chow
2025-07-18 22:17:11 -07:00
16 changed files with 363 additions and 187 deletions

View File

@@ -9,39 +9,62 @@
#include <util/string.h>
#include <memory>
#include <ranges>
#include <boost/test/unit_test.hpp>
using util::ToString;
// Test if a string consists entirely of null characters
static bool is_null_key(const std::vector<unsigned char>& key) {
bool isnull = true;
for (unsigned int i = 0; i < key.size(); i++)
isnull &= (key[i] == '\x00');
return isnull;
}
BOOST_FIXTURE_TEST_SUITE(dbwrapper_tests, BasicTestingSetup)
BOOST_AUTO_TEST_CASE(dbwrapper)
{
// Perform tests both obfuscated and non-obfuscated.
for (const bool obfuscate : {false, true}) {
fs::path ph = m_args.GetDataDirBase() / (obfuscate ? "dbwrapper_obfuscate_true" : "dbwrapper_obfuscate_false");
CDBWrapper dbw({.path = ph, .cache_bytes = 1 << 20, .memory_only = true, .wipe_data = false, .obfuscate = obfuscate});
uint8_t key{'k'};
uint256 in = m_rng.rand256();
uint256 res;
constexpr size_t CACHE_SIZE{1_MiB};
const fs::path path{m_args.GetDataDirBase() / "dbwrapper"};
// Ensure that we're doing real obfuscation when obfuscate=true
BOOST_CHECK(obfuscate != is_null_key(dbwrapper_private::GetObfuscateKey(dbw)));
Obfuscation obfuscation;
std::vector<std::pair<uint8_t, uint256>> key_values{};
BOOST_CHECK(dbw.Write(key, in));
BOOST_CHECK(dbw.Read(key, res));
BOOST_CHECK_EQUAL(res.ToString(), in.ToString());
// Write values
{
CDBWrapper dbw{{.path = path, .cache_bytes = CACHE_SIZE, .wipe_data = true, .obfuscate = obfuscate}};
BOOST_CHECK_EQUAL(obfuscate, !dbw.IsEmpty());
// Ensure that we're doing real obfuscation when obfuscate=true
obfuscation = dbwrapper_private::GetObfuscation(dbw);
BOOST_CHECK_EQUAL(obfuscate, dbwrapper_private::GetObfuscation(dbw));
for (uint8_t k{0}; k < 10; ++k) {
uint8_t key{k};
uint256 value{m_rng.rand256()};
BOOST_CHECK(dbw.Write(key, value));
key_values.emplace_back(key, value);
}
}
// Verify that the obfuscation key is never obfuscated
{
CDBWrapper dbw{{.path = path, .cache_bytes = CACHE_SIZE, .obfuscate = false}};
BOOST_CHECK_EQUAL(obfuscation, dbwrapper_private::GetObfuscation(dbw));
}
// Read back the values
{
CDBWrapper dbw{{.path = path, .cache_bytes = CACHE_SIZE, .obfuscate = obfuscate}};
// Ensure obfuscation is read back correctly
BOOST_CHECK_EQUAL(obfuscation, dbwrapper_private::GetObfuscation(dbw));
BOOST_CHECK_EQUAL(obfuscate, dbwrapper_private::GetObfuscation(dbw));
// Verify all written values
for (const auto& [key, expected_value] : key_values) {
uint256 read_value{};
BOOST_CHECK(dbw.Read(key, read_value));
BOOST_CHECK_EQUAL(read_value, expected_value);
}
}
}
}
@@ -57,7 +80,7 @@ BOOST_AUTO_TEST_CASE(dbwrapper_basic_data)
bool res_bool;
// Ensure that we're doing real obfuscation when obfuscate=true
BOOST_CHECK(obfuscate != is_null_key(dbwrapper_private::GetObfuscateKey(dbw)));
BOOST_CHECK_EQUAL(obfuscate, dbwrapper_private::GetObfuscation(dbw));
//Simulate block raw data - "b + block hash"
std::string key_block = "b" + m_rng.rand256().ToString();
@@ -116,13 +139,13 @@ BOOST_AUTO_TEST_CASE(dbwrapper_basic_data)
std::string file_option_tag = "F";
uint8_t filename_length = m_rng.randbits(8);
std::string filename = "randomfilename";
std::string key_file_option = strprintf("%s%01x%s", file_option_tag,filename_length,filename);
std::string key_file_option = strprintf("%s%01x%s", file_option_tag, filename_length, filename);
bool in_file_bool = m_rng.randbool();
BOOST_CHECK(dbw.Write(key_file_option, in_file_bool));
BOOST_CHECK(dbw.Read(key_file_option, res_bool));
BOOST_CHECK_EQUAL(res_bool, in_file_bool);
}
}
}
// Test batch operations
@@ -231,8 +254,8 @@ BOOST_AUTO_TEST_CASE(existing_data_no_obfuscate)
BOOST_CHECK(odbw.Read(key, res2));
BOOST_CHECK_EQUAL(res2.ToString(), in.ToString());
BOOST_CHECK(!odbw.IsEmpty()); // There should be existing data
BOOST_CHECK(is_null_key(dbwrapper_private::GetObfuscateKey(odbw))); // The key should be an empty string
BOOST_CHECK(!odbw.IsEmpty());
BOOST_CHECK(!dbwrapper_private::GetObfuscation(odbw)); // The key should be an empty string
uint256 in2 = m_rng.rand256();
uint256 res3;
@@ -269,7 +292,7 @@ BOOST_AUTO_TEST_CASE(existing_data_reindex)
// Check that the key/val we wrote with unobfuscated wrapper doesn't exist
uint256 res2;
BOOST_CHECK(!odbw.Read(key, res2));
BOOST_CHECK(!is_null_key(dbwrapper_private::GetObfuscateKey(odbw)));
BOOST_CHECK(dbwrapper_private::GetObfuscation(odbw));
uint256 in2 = m_rng.rand256();
uint256 res3;

View File

@@ -4,9 +4,10 @@
#include <span.h>
#include <streams.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/fuzz/fuzz.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/fuzz/util.h>
#include <util/obfuscation.h>
#include <array>
#include <cstddef>
@@ -18,9 +19,10 @@ FUZZ_TARGET(autofile)
{
FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
FuzzedFileProvider fuzzed_file_provider{fuzzed_data_provider};
const auto key_bytes{ConsumeFixedLengthByteVector<std::byte>(fuzzed_data_provider, Obfuscation::KEY_SIZE)};
AutoFile auto_file{
fuzzed_file_provider.open(),
ConsumeRandomLengthByteVector<std::byte>(fuzzed_data_provider),
Obfuscation{std::span{key_bytes}.first<Obfuscation::KEY_SIZE>()},
};
LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 100)
{

View File

@@ -4,9 +4,10 @@
#include <span.h>
#include <streams.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/fuzz/fuzz.h>
#include <test/fuzz/FuzzedDataProvider.h>
#include <test/fuzz/util.h>
#include <util/obfuscation.h>
#include <array>
#include <cstddef>
@@ -20,9 +21,10 @@ FUZZ_TARGET(buffered_file)
FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
FuzzedFileProvider fuzzed_file_provider{fuzzed_data_provider};
std::optional<BufferedFile> opt_buffered_file;
const auto key_bytes{ConsumeFixedLengthByteVector<std::byte>(fuzzed_data_provider, Obfuscation::KEY_SIZE)};
AutoFile fuzzed_file{
fuzzed_file_provider.open(),
ConsumeRandomLengthByteVector<std::byte>(fuzzed_data_provider),
Obfuscation{std::span{key_bytes}.first<Obfuscation::KEY_SIZE>()},
};
try {
auto n_buf_size = fuzzed_data_provider.ConsumeIntegralInRange<uint64_t>(0, 4096);

View File

@@ -58,7 +58,7 @@ BOOST_AUTO_TEST_CASE(fastrandom_tests_deterministic)
BOOST_CHECK_EQUAL(ctx1.rand32(), ctx2.rand32());
BOOST_CHECK_EQUAL(ctx1.rand64(), ctx2.rand64());
BOOST_CHECK_EQUAL(ctx1.randbits(3), ctx2.randbits(3));
BOOST_CHECK(ctx1.randbytes(17) == ctx2.randbytes(17));
BOOST_CHECK(std::ranges::equal(ctx1.randbytes<std::byte>(17), ctx2.randbytes<17>())); // check vector/array behavior symmetry
BOOST_CHECK(ctx1.rand256() == ctx2.rand256());
BOOST_CHECK_EQUAL(ctx1.randbits(7), ctx2.randbits(7));
BOOST_CHECK(ctx1.randbytes(128) == ctx2.randbytes(128));

View File

@@ -8,27 +8,120 @@
#include <test/util/random.h>
#include <test/util/setup_common.h>
#include <util/fs.h>
#include <util/obfuscation.h>
#include <util/strencodings.h>
#include <boost/test/unit_test.hpp>
using namespace std::string_literals;
using namespace util::hex_literals;
BOOST_FIXTURE_TEST_SUITE(streams_tests, BasicTestingSetup)
// Test that obfuscation can be properly reverted even with random chunk sizes.
BOOST_AUTO_TEST_CASE(xor_roundtrip_random_chunks)
{
auto apply_random_xor_chunks{[&](std::span<std::byte> target, const Obfuscation& obfuscation) {
for (size_t offset{0}; offset < target.size();) {
const size_t chunk_size{1 + m_rng.randrange(target.size() - offset)};
obfuscation(target.subspan(offset, chunk_size), offset);
offset += chunk_size;
}
}};
for (size_t test{0}; test < 100; ++test) {
const size_t write_size{1 + m_rng.randrange(100U)};
const std::vector original{m_rng.randbytes<std::byte>(write_size)};
std::vector roundtrip{original};
const auto key_bytes{m_rng.randbool() ? m_rng.randbytes<Obfuscation::KEY_SIZE>() : std::array<std::byte, Obfuscation::KEY_SIZE>{}};
const Obfuscation obfuscation{key_bytes};
apply_random_xor_chunks(roundtrip, obfuscation);
const bool key_all_zeros{std::ranges::all_of(
std::span{key_bytes}.first(std::min(write_size, Obfuscation::KEY_SIZE)), [](auto b) { return b == std::byte{0}; })};
BOOST_CHECK(key_all_zeros ? original == roundtrip : original != roundtrip);
apply_random_xor_chunks(roundtrip, obfuscation);
BOOST_CHECK(original == roundtrip);
}
}
// Compares optimized obfuscation against a trivial, byte-by-byte reference implementation
// with random offsets to ensure proper handling of key wrapping.
BOOST_AUTO_TEST_CASE(xor_bytes_reference)
{
auto expected_xor{[](std::span<std::byte> target, std::span<const std::byte, Obfuscation::KEY_SIZE> obfuscation, size_t key_offset) {
for (auto& b : target) {
b ^= obfuscation[key_offset++ % obfuscation.size()];
}
}};
for (size_t test{0}; test < 100; ++test) {
const size_t write_size{1 + m_rng.randrange(100U)};
const size_t key_offset{m_rng.randrange(3 * Obfuscation::KEY_SIZE)}; // Make sure the key can wrap around
const size_t write_offset{std::min(write_size, m_rng.randrange(Obfuscation::KEY_SIZE * 2))}; // Write unaligned data
const auto key_bytes{m_rng.randbool() ? m_rng.randbytes<Obfuscation::KEY_SIZE>() : std::array<std::byte, Obfuscation::KEY_SIZE>{}};
const Obfuscation obfuscation{key_bytes};
std::vector expected{m_rng.randbytes<std::byte>(write_size)};
std::vector actual{expected};
expected_xor(std::span{expected}.subspan(write_offset), key_bytes, key_offset);
obfuscation(std::span{actual}.subspan(write_offset), key_offset);
BOOST_CHECK_EQUAL_COLLECTIONS(expected.begin(), expected.end(), actual.begin(), actual.end());
}
}
BOOST_AUTO_TEST_CASE(obfuscation_hexkey)
{
const auto key_bytes{m_rng.randbytes<Obfuscation::KEY_SIZE>()};
const Obfuscation obfuscation{key_bytes};
BOOST_CHECK_EQUAL(obfuscation.HexKey(), HexStr(key_bytes));
}
BOOST_AUTO_TEST_CASE(obfuscation_serialize)
{
const Obfuscation original{m_rng.randbytes<Obfuscation::KEY_SIZE>()};
// Serialization
DataStream ds;
ds << original;
BOOST_CHECK_EQUAL(ds.size(), 1 + Obfuscation::KEY_SIZE); // serialized as a vector
// Deserialization
Obfuscation recovered{};
ds >> recovered;
BOOST_CHECK_EQUAL(recovered.HexKey(), original.HexKey());
}
BOOST_AUTO_TEST_CASE(obfuscation_empty)
{
const Obfuscation null_obf{};
BOOST_CHECK(!null_obf);
const Obfuscation non_null_obf{"ff00ff00ff00ff00"_hex};
BOOST_CHECK(non_null_obf);
}
BOOST_AUTO_TEST_CASE(xor_file)
{
fs::path xor_path{m_args.GetDataDirBase() / "test_xor.bin"};
auto raw_file{[&](const auto& mode) { return fsbridge::fopen(xor_path, mode); }};
const std::vector<uint8_t> test1{1, 2, 3};
const std::vector<uint8_t> test2{4, 5};
const std::vector<std::byte> xor_pat{std::byte{0xff}, std::byte{0x00}};
const Obfuscation obfuscation{"ff00ff00ff00ff00"_hex};
{
// Check errors for missing file
AutoFile xor_file{raw_file("rb"), xor_pat};
BOOST_CHECK_EXCEPTION(xor_file << std::byte{}, std::ios_base::failure, HasReason{"AutoFile::write: file handle is nullpt"});
BOOST_CHECK_EXCEPTION(xor_file >> std::byte{}, std::ios_base::failure, HasReason{"AutoFile::read: file handle is nullpt"});
BOOST_CHECK_EXCEPTION(xor_file.ignore(1), std::ios_base::failure, HasReason{"AutoFile::ignore: file handle is nullpt"});
AutoFile xor_file{raw_file("rb"), obfuscation};
BOOST_CHECK_EXCEPTION(xor_file << std::byte{}, std::ios_base::failure, HasReason{"AutoFile::write: file handle is nullptr"});
BOOST_CHECK_EXCEPTION(xor_file >> std::byte{}, std::ios_base::failure, HasReason{"AutoFile::read: file handle is nullptr"});
BOOST_CHECK_EXCEPTION(xor_file.ignore(1), std::ios_base::failure, HasReason{"AutoFile::ignore: file handle is nullptr"});
}
{
#ifdef __MINGW64__
@@ -37,7 +130,7 @@ BOOST_AUTO_TEST_CASE(xor_file)
#else
const char* mode = "wbx";
#endif
AutoFile xor_file{raw_file(mode), xor_pat};
AutoFile xor_file{raw_file(mode), obfuscation};
xor_file << test1 << test2;
BOOST_REQUIRE_EQUAL(xor_file.fclose(), 0);
}
@@ -51,7 +144,7 @@ BOOST_AUTO_TEST_CASE(xor_file)
BOOST_CHECK_EXCEPTION(non_xor_file.ignore(1), std::ios_base::failure, HasReason{"AutoFile::ignore: end of file"});
}
{
AutoFile xor_file{raw_file("rb"), xor_pat};
AutoFile xor_file{raw_file("rb"), obfuscation};
std::vector<std::byte> read1, read2;
xor_file >> read1 >> read2;
BOOST_CHECK_EQUAL(HexStr(read1), HexStr(test1));
@@ -60,7 +153,7 @@ BOOST_AUTO_TEST_CASE(xor_file)
BOOST_CHECK_EXCEPTION(xor_file >> std::byte{}, std::ios_base::failure, HasReason{"AutoFile::read: end of file"});
}
{
AutoFile xor_file{raw_file("rb"), xor_pat};
AutoFile xor_file{raw_file("rb"), obfuscation};
std::vector<std::byte> read2;
// Check that ignore works
xor_file.ignore(4);
@@ -76,7 +169,7 @@ BOOST_AUTO_TEST_CASE(streams_vector_writer)
{
unsigned char a(1);
unsigned char b(2);
unsigned char bytes[] = { 3, 4, 5, 6 };
unsigned char bytes[] = {3, 4, 5, 6};
std::vector<unsigned char> vch;
// Each test runs twice. Serializing a second time at the same starting
@@ -223,34 +316,26 @@ BOOST_AUTO_TEST_CASE(bitstream_reader_writer)
BOOST_AUTO_TEST_CASE(streams_serializedata_xor)
{
std::vector<std::byte> in;
// Degenerate case
{
DataStream ds{in};
ds.Xor({0x00, 0x00});
DataStream ds{};
Obfuscation{}(ds);
BOOST_CHECK_EQUAL(""s, ds.str());
}
in.push_back(std::byte{0x0f});
in.push_back(std::byte{0xf0});
// Single character key
{
DataStream ds{in};
ds.Xor({0xff});
const Obfuscation obfuscation{"ffffffffffffffff"_hex};
DataStream ds{"0ff0"_hex};
obfuscation(ds);
BOOST_CHECK_EQUAL("\xf0\x0f"s, ds.str());
}
// Multi character key
in.clear();
in.push_back(std::byte{0xf0});
in.push_back(std::byte{0x0f});
{
DataStream ds{in};
ds.Xor({0xff, 0x0f});
const Obfuscation obfuscation{"ff0fff0fff0fff0f"_hex};
DataStream ds{"f00f"_hex};
obfuscation(ds);
BOOST_CHECK_EQUAL("\x0f\x00"s, ds.str());
}
}
@@ -563,7 +648,7 @@ BOOST_AUTO_TEST_CASE(buffered_reader_matches_autofile_random_content)
const FlatFilePos pos{0, 0};
const FlatFileSeq test_file{m_args.GetDataDirBase(), "buffered_file_test_random", node::BLOCKFILE_CHUNK_SIZE};
const std::vector obfuscation{m_rng.randbytes<std::byte>(8)};
const Obfuscation obfuscation{m_rng.randbytes<Obfuscation::KEY_SIZE>()};
// Write out the file with random content
{
@@ -618,7 +703,7 @@ BOOST_AUTO_TEST_CASE(buffered_writer_matches_autofile_random_content)
const FlatFileSeq test_buffered{m_args.GetDataDirBase(), "buffered_write_test", node::BLOCKFILE_CHUNK_SIZE};
const FlatFileSeq test_direct{m_args.GetDataDirBase(), "direct_write_test", node::BLOCKFILE_CHUNK_SIZE};
const std::vector obfuscation{m_rng.randbytes<std::byte>(8)};
const Obfuscation obfuscation{m_rng.randbytes<Obfuscation::KEY_SIZE>()};
{
DataBuffer test_data{m_rng.randbytes<std::byte>(file_size)};