diff --git a/bip-0158.mediawiki b/bip-0158.mediawiki
index bf2e856e..2062c6ed 100644
--- a/bip-0158.mediawiki
+++ b/bip-0158.mediawiki
@@ -65,11 +65,10 @@ For each block, compact filters are derived containing sets of items associated
 with the block (eg. addresses sent to, outpoints spent, etc.). A set of such
 data objects is compressed into a probabilistic structure called a
 ''Golomb-coded set'' (GCS), which matches all items in the set with probability
-1, and matches other items with probability 2^(-P) for some
-integer parameter P. We also introduce parameter M
-which allows filter to uniquely tune the range that items are hashed onto
-before compressing. Each defined filter also selects distinct parameters for P
-and M.
+1, and matches other items with probability 1/M for some
+integer parameter M. The encoding is also parameterized by
+P, the bit length of the remainder code. Each defined filter
+specifies values for P and M.

 At a high level, a GCS is constructed from a set of N items by:
 # hashing all items to 64-bit integers in the range [0, N * M)
@@ -88,8 +87,8 @@ one is able to select both Parameters independently, then more optimal values
 can be selected<ref>https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845</ref>.
 Set membership queries against the hash outputs will have a false positive rate
-of 2^(-P). To avoid integer overflow, the
-number of items N MUST be <2^32 and M MUST be <2^32.
+of 1/M. To avoid integer overflow, the number of items N
+MUST be <2^32 and M MUST be <2^32.
 The items are first passed through the pseudorandom function ''SipHash'', which
 takes a 128-bit key k and a variable-sized byte vector and produces
@@ -189,9 +188,10 @@ golomb_decode(stream, P: uint) -> uint64:

 ==== Set Construction ====

-A GCS is constructed from three parameters:
+A GCS is constructed from four parameters:
 * L, a vector of N raw items
-* P, which determines the false positive rate
+* P, the bit parameter of the Golomb-Rice coding
+* M, the inverse of the target false positive rate
 * k, the 128-bit key used to randomize the SipHash outputs

 The result is a byte vector with a minimum size of N * (P + 1)
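
Review note: the split between P and M that this change introduces can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the patch. Keyed BLAKE2b truncated to 64 bits stands in for SipHash, which the Python standard library does not expose. The multiply-and-shift mapping onto [0, N * M) is one plausible way to hash into that range. The bit stream is kept as a plain list of 0/1 integers rather than a packed byte vector. The names hash_to_range, build_hashed_set, encode_set and decode_set are hypothetical.

```python
import hashlib


def hash_to_range(item: bytes, f: int, key: bytes) -> int:
    """Map an item onto [0, f) using a keyed 64-bit hash."""
    # ASSUMPTION: keyed BLAKE2b is only a stand-in for SipHash here.
    digest = hashlib.blake2b(item, digest_size=8, key=key).digest()
    h = int.from_bytes(digest, "big")
    # Multiply-and-shift sends a uniform 64-bit value into [0, f).
    return (h * f) >> 64


def build_hashed_set(items, m: int, key: bytes):
    """Hash N items into [0, N * M) and return the sorted values."""
    f = len(items) * m
    return sorted(hash_to_range(item, f, key) for item in items)


def golomb_encode(bits: list, x: int, p: int) -> None:
    """Append the Golomb-Rice code of x: unary quotient, then P-bit remainder."""
    bits.extend([1] * (x >> p))  # quotient in unary
    bits.append(0)               # unary terminator
    bits.extend((x >> i) & 1 for i in range(p - 1, -1, -1))  # remainder, MSB first


def golomb_decode(bits, p: int) -> int:
    """Read one Golomb-Rice coded value from an iterator of bits."""
    q = 0
    while next(bits) == 1:
        q += 1
    r = 0
    for _ in range(p):
        r = (r << 1) | next(bits)
    return (q << p) | r


def encode_set(values, p: int) -> list:
    """Encode sorted values as Golomb-Rice coded successive differences."""
    bits, last = [], 0
    for v in values:
        golomb_encode(bits, v - last, p)
        last = v
    return bits


def decode_set(bits, n: int, p: int) -> list:
    """Recover n sorted values from the bit list produced by encode_set."""
    stream, last, out = iter(bits), 0, []
    for _ in range(n):
        last += golomb_decode(stream, p)
        out.append(last)
    return out
```

In this sketch M only sets the size of the hash range, and therefore the chance that a non-member lands on an occupied value (about 1/M), while P only sets how many remainder bits each difference costs. Every encoded item takes at least P + 1 bits, which matches the minimum size of N * (P + 1) stated in the last context line.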