Merge bitcoin/bitcoin#25325: Add pool based memory resource

9f947fc3d4 Use PoolAllocator for CCoinsMap (Martin Leitner-Ankerl)
5e4ac5abf5 Call ReallocateCache() on each Flush() (Martin Leitner-Ankerl)
1afca6b663 Add PoolResource fuzzer (Martin Leitner-Ankerl)
e19943f049 Calculate memory usage correctly for unordered_maps that use PoolAllocator (Martin Leitner-Ankerl)
b8401c3281 Add pool based memory resource & allocator (Martin Leitner-Ankerl)

Pull request description:

  A memory resource similar to `std::pmr::unsynchronized_pool_resource`, but optimized for node-based containers. The goal is to be able to cache more coins with the same memory usage, and allocate/deallocate faster.

  This is a reimplementation of #22702. The goal was to implement it in a way that is simpler to review & test

  * There is now a generic `PoolResource` for allocating/deallocating memory. This has practically the same API as `std::pmr::memory_resource`. (Unfortunately I cannot use std::pmr because libc++ simply doesn't implement that API).
  * Thanks to sipa there is now a fuzzer for PoolResource! On a fast machine I ran it for ~770 million executions without finding any issue.

  * The estimation of the correct node size is now gone, PoolResource now has multiple pools and just needs to be created large enough to have space for the unordered_map nodes.

  I ran benchmarks with #22702, mergebase, and this PR. Frequency locked Intel i7-8700, clang++ 13.0.1 to reindex up to block 690000.

  ```sh
  bitcoind -dbcache=5000 -assumevalid=00000000000000000002a23d6df20eecec15b21d32c75833cce28f113de888b7 -reindex-chainstate -printtoconsole=0 -stopatheight=690000
  ```

  The performance is practically identical with #22702, just 0.4% slower. It's ~21% faster than master:

  ![Progress in Million Transactions over Time(2)](https://user-images.githubusercontent.com/14386/173288685-91952ade-f304-4825-8bfb-0725a71ca17b.png)

  ![Size of Cache in MiB over Time](https://user-images.githubusercontent.com/14386/173291421-e6b410be-ac77-479b-ad24-5fafcebf81eb.png)
  Note that on cache drops mergebase's memory doesnt go so far down because it does not free the `CCoinsMap` bucket array.

  ![Size of Cache in Million tx over Time(1)](https://user-images.githubusercontent.com/14386/173288703-a80c9c9e-93c8-4a16-9df8-610c89c61cc4.png)

ACKs for top commit:
  LarryRuane:
    ACK 9f947fc3d4
  achow101:
    re-ACK 9f947fc3d4
  john-moffett:
    ACK 9f947fc3d4
  jonatack:
    re-ACK 9f947fc3d4

Tree-SHA512: 48caf57d1775875a612b54388ef64c53952cd48741cacfe20d89049f2fb35301b5c28e69264b7d659a3ca33d4c714d47bafad6fd547c4075f08b45acc87c0f45
This commit is contained in:
Andrew Chow
2023-04-20 16:11:17 -04:00
16 changed files with 1004 additions and 24 deletions

50
src/bench/pool.cpp Normal file
View File

@@ -0,0 +1,50 @@
// Copyright (c) 2022 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.
#include <bench/bench.h>
#include <support/allocators/pool.h>
#include <unordered_map>
template <typename Map>
void BenchFillClearMap(benchmark::Bench& bench, Map& map)
{
size_t batch_size = 5000;
// make sure each iteration of the benchmark contains exactly 5000 inserts and one clear.
// do this at least 10 times so we get reasonable accurate results
bench.batch(batch_size).minEpochIterations(10).run([&] {
auto rng = ankerl::nanobench::Rng(1234);
for (size_t i = 0; i < batch_size; ++i) {
map[rng()];
}
map.clear();
});
}
static void PoolAllocator_StdUnorderedMap(benchmark::Bench& bench)
{
auto map = std::unordered_map<uint64_t, uint64_t>();
BenchFillClearMap(bench, map);
}
static void PoolAllocator_StdUnorderedMapWithPoolResource(benchmark::Bench& bench)
{
using Map = std::unordered_map<uint64_t,
uint64_t,
std::hash<uint64_t>,
std::equal_to<uint64_t>,
PoolAllocator<std::pair<const uint64_t, uint64_t>,
sizeof(std::pair<const uint64_t, uint64_t>) + 4 * sizeof(void*),
alignof(void*)>>;
// make sure the resource supports large enough pools to hold the node. We do this by adding the size of a few pointers to it.
auto pool_resource = Map::allocator_type::ResourceType();
auto map = Map{0, std::hash<uint64_t>{}, std::equal_to<uint64_t>{}, &pool_resource};
BenchFillClearMap(bench, map);
}
BENCHMARK(PoolAllocator_StdUnorderedMap, benchmark::PriorityLevel::HIGH);
BENCHMARK(PoolAllocator_StdUnorderedMapWithPoolResource, benchmark::PriorityLevel::HIGH);