b2ea365648 txgraph: Add Get{Ancestors,Descendants}Union functions (feature) (Pieter Wuille)
54bceddd3a txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation) (Pieter Wuille)
aded047019 txgraph: Add CountDistinctClusters function (feature) (Pieter Wuille)
b685d322c9 txgraph: Add DoWork function (feature) (Pieter Wuille)
295a1ca8bb txgraph: Expose ability to compare transactions (feature) (Pieter Wuille)
22c68cd153 txgraph: Allow Refs to outlive the TxGraph (feature) (Pieter Wuille)
82fa3573e1 txgraph: Destroying Ref means removing transaction (feature) (Pieter Wuille)
6b037ceddf txgraph: Cache oversizedness of graphs (optimization) (Pieter Wuille)
8c70688965 txgraph: Add staging support (feature) (Pieter Wuille)
c99c7300b4 txgraph: Abstract out ClearLocator (refactor) (Pieter Wuille)
34aa3da5ad txgraph: Group per-graph data in ClusterSet (refactor) (Pieter Wuille)
36dd5edca5 txgraph: Special-case removal of tail of cluster (Optimization) (Pieter Wuille)
5801e0fb2b txgraph: Delay chunking while sub-acceptable (optimization) (Pieter Wuille)
57f5499882 txgraph: Avoid looking up the same child cluster repeatedly (optimization) (Pieter Wuille)
1171953ac6 txgraph: Avoid representative lookup for each dependency (optimization) (Pieter Wuille)
64f69ec8c3 txgraph: Make max cluster count configurable and "oversize" state (feature) (Pieter Wuille)
1d27b74c8e txgraph: Add GetChunkFeerate function (feature) (Pieter Wuille)
c80aecc24d txgraph: Avoid per-group vectors for clusters & dependencies (optimization) (Pieter Wuille)
ee57e93099 txgraph: Add internal sanity check function (tests) (Pieter Wuille)
05abf336f9 txgraph: Add simulation fuzz test (tests) (Pieter Wuille)
8ad3ed2681 txgraph: Add initial version (feature) (Pieter Wuille)
6eab3b2d73 feefrac: Introduce tagged wrappers to distinguish vsize/WU rates (Pieter Wuille)
d449773899 scripted-diff: (refactor) ClusterIndex -> DepGraphIndex (Pieter Wuille)
bfeb69f6e0 clusterlin: Make IsAcyclic() a DepGraph member function (Pieter Wuille)
0aa874a357 clusterlin: Add FixLinearization function + fuzz test (Pieter Wuille)

Pull request description:

Part of cluster mempool: #30289.

### 1. Overview

This introduces the `TxGraph` class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about `CTransaction`, inputs, outputs, txids, wtxids, prioritization, validity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and to write simulation-based tests for (which are included in this PR).

### 2. Interface

The interface can be largely categorized into:

* Mutation functions:
  * `AddTransaction` (add a new transaction with specified feerate, and get a `Ref` object back to identify it).
  * `RemoveTransaction` (given a `Ref` object, remove the transaction).
  * `AddDependency` (given two `Ref` objects, add a dependency between them).
  * `SetTransactionFee` (modify the fee associated with a `Ref` object).
* Inspector functions:
  * `GetAncestors` (get the ancestor set in the form of `Ref*` pointers)
  * `GetAncestorsUnion` (like above, but for the union of ancestors of multiple `Ref*` pointers)
  * `GetDescendants` (get the descendant set in the form of `Ref*` pointers)
  * `GetDescendantsUnion` (like above, but for the union of descendants of multiple `Ref*` pointers)
  * `GetCluster` (get the connected component set in the form of `Ref*` pointers, in the order they would be mined).
  * `GetIndividualFeerate` (get the feerate of a transaction)
  * `GetChunkFeerate` (get the mining score of a transaction)
  * `CountDistinctClusters` (count the number of distinct clusters a list of `Ref`s belong to)
* Staging functions:
  * `StartStaging` (make all future mutations operate on a proposed transaction graph)
  * `CommitStaging` (apply all the changes that are staged)
  * `AbortStaging` (discard all the changes that are staged)
* Miscellaneous functions:
  * `DoWork` (do queued-up computations now, so that future operations are fast)

The `TxGraph::Ref` type, used as a "handle" on transactions in the graph, can be inherited from. The idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), `CTxMempoolEntry` will inherit from it, and all actually used `Ref` objects will be `CTxMempoolEntry`s. With that, the mempool code can just cast any `Ref*` returned by txgraph to `CTxMempoolEntry*`.

### 3. Implementation

Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the `cluster_linearize.h` algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing needed to reach fully instantiated and acceptable cluster linearizations, in order:

1. `ApplyRemovals` (take batches of to-be-removed transactions and translate them to "holes" in the corresponding Clusters/DepGraphs).
2. `SplitAll` (creating holes in Clusters may cause them to break apart into smaller connected components, so turn them into separate Clusters/linearizations).
3. `GroupClusters` (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
4. `ApplyDependencies` (actually merge Clusters as precomputed by `GroupClusters`, and add the dependencies between them).
5. `MakeAcceptable` (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).

### 4. Future work

This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:

* The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
* Mining interface (ability to iterate transactions quickly in mining score order) (see #31444).
* Eviction interface (reverse of mining order, plus memory usage accounting) (see #31444).
* Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see #31553).
* Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.

Then there are further improvements possible which would not block other work:

* Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
* The ability to have background thread(s) for improving cluster linearizations.

ACKs for top commit:
  instagibbs: reACK b2ea365648
  ajtowns: reACK b2ea365648
  ismaelsadeeq: reACK b2ea365648 🚀
  glozow: ACK b2ea365648

Tree-SHA512: 0f86f73d37651fe47d469db1384503bbd1237b4556e5d50b1d0a3dd27754792d6fc3481f77a201cf2ed36c6ca76e0e44c30e175d112aacb53dfdb9e11d8abc6b
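
The lazy handling of mutations described in the Implementation section can be illustrated with a toy model. The sketch below is not the `TxGraph` interface or its implementation: `ToyGraph`, its integer handles, and `PendingCount` are hypothetical names chosen for illustration, and there are no fees, linearizations, or staging. It only shows the idea of queueing removals and dependencies and applying them in a batch (removals first, then dependencies) when an inspector is called.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Toy model only: NOT the TxGraph interface or implementation.
class ToyGraph {
    std::vector<bool> m_alive;                  // whether each transaction still exists
    std::vector<std::vector<int>> m_links;      // undirected parent<->child links
    std::vector<int> m_to_remove;               // queued removals
    std::vector<std::pair<int, int>> m_to_add;  // queued (parent, child) dependencies

    // Apply all queued mutations in one batch: removals first, then dependencies.
    void ApplyPending() {
        for (int idx : m_to_remove) {
            m_alive[idx] = false;
            // Drop the links pointing back at the removed transaction.
            for (int other : m_links[idx]) {
                auto& v = m_links[other];
                v.erase(std::remove(v.begin(), v.end(), idx), v.end());
            }
            m_links[idx].clear();
        }
        m_to_remove.clear();
        for (const auto& dep : m_to_add) {
            // Skip self-dependencies and dependencies on removed transactions.
            if (dep.first == dep.second) continue;
            if (!m_alive[dep.first] || !m_alive[dep.second]) continue;
            m_links[dep.first].push_back(dep.second);
            m_links[dep.second].push_back(dep.first);
        }
        m_to_add.clear();
    }

public:
    // Mutations only append to queues (or storage); no graph work happens here.
    int AddTransaction() {
        m_alive.push_back(true);
        m_links.emplace_back();
        return static_cast<int>(m_alive.size()) - 1;
    }
    void RemoveTransaction(int idx) { m_to_remove.push_back(idx); }
    void AddDependency(int parent, int child) { m_to_add.emplace_back(parent, child); }
    std::size_t PendingCount() const { return m_to_remove.size() + m_to_add.size(); }

    // Inspector: applies pending work, then returns the connected component
    // containing idx, sorted by handle (empty if idx was removed).
    std::vector<int> GetCluster(int idx) {
        ApplyPending();
        std::vector<int> out;
        if (!m_alive[idx]) return out;
        std::vector<bool> seen(m_alive.size(), false);
        std::queue<int> todo;
        todo.push(idx);
        seen[idx] = true;
        while (!todo.empty()) {
            const int cur = todo.front();
            todo.pop();
            out.push_back(cur);
            for (int next : m_links[cur]) {
                if (!seen[next]) {
                    seen[next] = true;
                    todo.push(next);
                }
            }
        }
        std::sort(out.begin(), out.end());
        return out;
    }
};
```

In this toy, removing the middle transaction of a three-transaction chain leaves two single-transaction components, which is the situation `SplitAll` handles in the description above.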
Unit tests
The sources in this directory are unit test cases. Boost includes a unit testing framework, and since Bitcoin Core already uses Boost, it makes sense to simply use this framework rather than require developers to configure some other framework (we want as few impediments to creating unit tests as possible).
The build system is set up to compile an executable called test_bitcoin
that runs all of the unit tests. The main source file for the test library is found in
util/setup_common.cpp.
The examples in this document assume the build directory is named
build. You'll need to adapt them if you named it differently.
Compiling/running unit tests
Unit tests will be automatically compiled if dependencies were met during the generation of the Bitcoin Core build system and tests weren't explicitly disabled.
The unit tests can be run with ctest --test-dir build, which includes unit
tests from subtrees.
Run build/bin/test_bitcoin --list_content for the full list of tests.
To run the unit tests manually, launch build/bin/test_bitcoin. To recompile
after a test file was modified, run cmake --build build and then run the test again. If you
modify a non-test file, use cmake --build build --target test_bitcoin to recompile only what's needed
to run the unit tests.
To add more unit tests, add BOOST_AUTO_TEST_CASE functions to the existing
.cpp files in the test/ directory or add new .cpp files that
implement new BOOST_AUTO_TEST_SUITE sections.
To run the GUI unit tests manually, launch build/bin/test_bitcoin-qt.
To add more GUI unit tests, add them to the src/qt/test/ directory and
the src/qt/test/test_main.cpp file.
Running individual tests
The test_bitcoin runner accepts command line arguments from the Boost
framework. To see the list of arguments that may be passed, run:
build/bin/test_bitcoin --help
For example, to run only the tests in the getarg_tests file, with full logging:
build/bin/test_bitcoin --log_level=all --run_test=getarg_tests
or
build/bin/test_bitcoin -l all -t getarg_tests
or to run only the doubledash test in getarg_tests:
build/bin/test_bitcoin --run_test=getarg_tests/doubledash
The --log_level= (or -l) argument controls the verbosity of the test output.
The test_bitcoin runner also accepts some of the command line arguments accepted by
bitcoind. Use -- to separate these sets of arguments:
build/bin/test_bitcoin --log_level=all --run_test=getarg_tests -- -printtoconsole=1
The -printtoconsole=1 after the two dashes sends debug logging, which
normally goes only to debug.log within the data directory, to the
standard terminal output as well.
Running test_bitcoin creates a temporary working (data) directory with a randomly
generated pathname within test_common bitcoin/, which in turn is within
the system's temporary directory (see
temp_directory_path).
This data directory looks like a simplified form of the standard bitcoind data
directory. Its content will vary depending on the test, but it will always
have a debug.log file, for example.
The location of the temporary data directory can be specified with the
-testdatadir option. This can make debugging easier. The directory
path used is the argument path appended with
/test_common bitcoin/<test-name>/datadir.
The directory path is created if necessary.
Specifying this argument also causes the data directory
not to be removed after the last test. This is useful for looking at
what the test wrote to debug.log after it completes, for example.
(The directory is removed at the start of the next test run,
so no leftover state is used.)
$ build/bin/test_bitcoin --run_test=getarg_tests/doubledash -- -testdatadir=/somewhere/mydatadir
Test directory (will not be deleted): "/somewhere/mydatadir/test_common bitcoin/getarg_tests/doubledash/datadir"
Running 1 test case...
*** No errors detected
$ ls -l '/somewhere/mydatadir/test_common bitcoin/getarg_tests/doubledash/datadir'
total 8
drwxrwxr-x 2 admin admin 4096 Nov 27 22:45 blocks
-rw-rw-r-- 1 admin admin 1003 Nov 27 22:45 debug.log
If you run an entire test suite, such as --run_test=getarg_tests, or all the test suites
(by not specifying --run_test), a separate directory
will be created for each individual test.
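
The path layout used with -testdatadir can be sketched as a small C++17 helper. `TestDataDir` is a hypothetical function written for illustration, not code from Bitcoin Core; it only mirrors the rule stated in this section, namely that the argument path is extended with test_common bitcoin/<test-name>/datadir.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of Bitcoin Core): build the data directory path
// from the -testdatadir argument and the full test name (suite/case).
fs::path TestDataDir(const fs::path& testdatadir_arg, const std::string& test_name)
{
    // "test_common bitcoin" contains a space, so quote the result in a shell.
    return testdatadir_arg / "test_common bitcoin" / test_name / "datadir";
}
```

For the arguments /somewhere/mydatadir and getarg_tests/doubledash, this gives the directory shown in the example session above.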
Adding test cases
To add a new unit test file to our test suite, you need
to add the file to either src/test/CMakeLists.txt or
src/wallet/test/CMakeLists.txt for wallet-related tests. The pattern is to create
one test file for each class or source file for which you want to create
unit tests. The file naming convention is <source_filename>_tests.cpp
and such files should wrap their tests in a test suite
called <source_filename>_tests. For an example of this pattern,
see uint256_tests.cpp.
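
As a rough illustration of this pattern, the sketch below shows what a new test file might look like. The file name example_tests.cpp and the `Reverse` function are hypothetical. The first two lines are an assumption made so the sketch builds on its own with header-only Boost.Test; a file compiled into an existing runner such as test_bitcoin would normally include only <boost/test/unit_test.hpp>, because the runner already supplies the test module and main function.

```cpp
// example_tests.cpp: hypothetical file following the <source_filename>_tests.cpp convention.

// Assumption for standalone use only: let header-only Boost.Test provide main().
#define BOOST_TEST_MODULE example_tests_standalone
#include <boost/test/included/unit_test.hpp>

#include <string>

// Hypothetical function under test; in practice this would come from the
// source file the tests are named after.
static std::string Reverse(const std::string& s)
{
    return std::string(s.rbegin(), s.rend());
}

// The suite name matches the file name without its extension.
BOOST_AUTO_TEST_SUITE(example_tests)

BOOST_AUTO_TEST_CASE(reverse_basic)
{
    BOOST_CHECK_EQUAL(Reverse("abc"), "cba");
    BOOST_CHECK(Reverse("").empty());
}

BOOST_AUTO_TEST_SUITE_END()
```

With this naming, the single case could be selected with --run_test=example_tests/reverse_basic, in the same way as getarg_tests/doubledash above.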
Logging and debugging in unit tests
ctest --test-dir build will write to the log file build/Testing/Temporary/LastTest.log. You can
additionally use the --output-on-failure option to display logs of the failed tests automatically
on failure. For running individual tests verbosely, refer to the
"Running individual tests" section above.
To write to logs from unit tests you need to use specific message methods
provided by Boost. The simplest is BOOST_TEST_MESSAGE.
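
A minimal sketch of BOOST_TEST_MESSAGE in use follows. The test name and the value are made up for illustration, and the first two lines are again an assumption so that the sketch builds standalone with header-only Boost.Test.

```cpp
// Assumption for standalone use only: let header-only Boost.Test provide main().
#define BOOST_TEST_MODULE logging_sketch
#include <boost/test/included/unit_test.hpp>

BOOST_AUTO_TEST_CASE(message_example)
{
    const int value{6 * 7};
    // Written to the test log only at a sufficiently verbose log level,
    // for example --log_level=all (or -l all).
    BOOST_TEST_MESSAGE("computed value = " << value);
    BOOST_CHECK_EQUAL(value, 42);
}
```

The message is built with stream-style `<<`, so values can be appended without manual string formatting.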
For debugging you can launch the test_bitcoin executable with gdb or lldb and
start debugging, just like you would with any other program:
gdb build/bin/test_bitcoin
Segmentation faults
If you hit a segmentation fault during a test run, you can diagnose where the fault
is happening by running gdb ./build/bin/test_bitcoin and then using the bt command
within gdb.
Another tool that can be used to resolve segmentation faults is valgrind.
If for whatever reason you want to produce a core dump file for this fault, you can do
that as well. By default, the Boost test runner will intercept system errors and not
produce a core file. To bypass this, add --catch_system_errors=no to the
test_bitcoin arguments and ensure that your ulimits are set properly (e.g. ulimit -c unlimited).
Running the tests and hitting a segmentation fault should now produce a file called core
(on Linux platforms, the file name will likely depend on the contents of
/proc/sys/kernel/core_pattern).
You can then explore the core dump using:
gdb build/bin/test_bitcoin core
(gdb) bt # produce a backtrace for where a segfault occurred