Do merkle root and txid duplicates check simultaneously

Move the txid duplicates check into BuildMerkleTree, where it can be done
much more efficiently (without needing to build a full txid set to detect
duplicates).

The previous version (using the std::set<uint256> to detect duplicates) was
also slightly too weak. A block mined with actual duplicate transactions
(which is invalid, due to the inputs of the duplicated transactions being
seen as double spends) would trigger the duplicates logic, resulting in the
block not being stored on disk, and rerequested. This change fixes that by
only triggering in the case of duplicated transactions that can actually
result in an identical merkle root.
This commit is contained in:
Pieter Wuille
2014-09-16 00:30:05 +02:00
parent 7a04f3d708
commit 584a358997
3 changed files with 57 additions and 17 deletions

View File

@@ -224,29 +224,66 @@ uint256 CBlockHeader::GetHash() const
return Hash(BEGIN(nVersion), END(nNonce));
}
uint256 CBlock::BuildMerkleTree() const
uint256 CBlock::BuildMerkleTree(bool* fMutated) const
{
// WARNING! If you're reading this because you're learning about crypto
// and/or designing a new system that will use merkle trees, keep in mind
// that the following merkle tree algorithm has a serious flaw related to
// duplicate txids, resulting in a vulnerability. (CVE-2012-2459) Bitcoin
// has since worked around the flaw, but for new applications you should
// use something different; don't just copy-and-paste this code without
// understanding the problem first.
/* WARNING! If you're reading this because you're learning about crypto
and/or designing a new system that will use merkle trees, keep in mind
that the following merkle tree algorithm has a serious flaw related to
duplicate txids, resulting in a vulnerability (CVE-2012-2459).
The reason is that if the number of hashes in the list at a given time
is odd, the last one is duplicated before computing the next level (which
is unusual in Merkle trees). This results in certain sequences of
transactions leading to the same merkle root. For example, these two
trees:
A A
/ \ / \
B C B C
/ \ | / \ / \
D E F D E F F
/ \ / \ / \ / \ / \ / \ / \
1 2 3 4 5 6 1 2 3 4 5 6 5 6
for transaction lists [1,2,3,4,5,6] and [1,2,3,4,5,6,5,6] (where 5 and
6 are repeated) result in the same root hash A (because the hash of both
of (F) and (F,F) is C).
The vulnerability results from being able to send a block with such a
transaction list, with the same merkle root, and the same block hash as
the original without duplication, resulting in failed validation. If the
receiving node proceeds to mark that block as permanently invalid
however, it will fail to accept further unmodified (and thus potentially
valid) versions of the same block. We defend against this by detecting
the case where we would hash two identical hashes at the end of the list
together, and treating that identically to the block having an invalid
merkle root. Assuming no double-SHA256 collisions, this will detect all
known ways of changing the transactions without affecting the merkle
root.
*/
vMerkleTree.clear();
vMerkleTree.reserve(vtx.size() * 2 + 16); // Safe upper bound for the number of total nodes.
BOOST_FOREACH(const CTransaction& tx, vtx)
vMerkleTree.push_back(tx.GetHash());
int j = 0;
bool mutated = false;
for (int nSize = vtx.size(); nSize > 1; nSize = (nSize + 1) / 2)
{
for (int i = 0; i < nSize; i += 2)
{
int i2 = std::min(i+1, nSize-1);
if (i2 == i + 1 && i2 + 1 == nSize && vMerkleTree[j+i] == vMerkleTree[j+i2]) {
// Two identical hashes at the end of the list at a particular level.
mutated = true;
}
vMerkleTree.push_back(Hash(BEGIN(vMerkleTree[j+i]), END(vMerkleTree[j+i]),
BEGIN(vMerkleTree[j+i2]), END(vMerkleTree[j+i2])));
}
j += nSize;
}
if (fMutated) {
*fMutated = mutated;
}
return (vMerkleTree.empty() ? 0 : vMerkleTree.back());
}