['''How can progress be guaranteed in a backwards-compatible way?''' In order to guarantee progress, it must be ensured that no deadlock occurs, i.e., no state is reached in which each party waits for the other party indefinitely. For example, any upgrade that adheres to the following conditions will guarantee progress:
* The initiator must start by sending at least as many bytes as necessary to mismatch the magic/version 12 bytes prefix.
* The responder must start sending after having received at least one byte that mismatches that 12-byte prefix.
@@ -157,19 +157,19 @@ All data on the wire after the garbage terminators takes the form of encrypted p
Each packet consists of:
* A 3-byte encrypted '''length''' field, encoding the length of the '''contents''' (between ''0'' and ''224-1'']['''Is ''224-1'' bytes sufficient as maximum content size?''' The current Bitcoin P2P protocol has no messages which support more than 4000000 bytes of application payload. By supporting up to ''224-1'' we can accommodate future evolutions needing more than 4 times that value. Hypothetical protocol changes that have even more data to exchange than that should probably use multiple separate messages anyway, because of the per-peer receive buffer sizes involved, and the inability to start processing a message before it is fully received. Of course, future versions of the transport protocol could change the size of the length field, if this were really needed.], inclusive).
* An authenticated encryption of the '''plaintext''', which consists of:
-** A 1-byte '''header''' which consists of transport layer protocol flags. Currently only the highest bit is defined as the '''ignore bit'''. The other bits are ignored, but this may change in future versions['''Why is the header a part of the plaintext and not included alongside the length field?''' The packet length field is the minimum information that must be available before we can leverage the standard RFC8439 AEAD. Any other data, including metadata like the header being in the content encryption makes it easier to reason about the protocol security w.r.t. data being used before it is authenticated. If the ignore bit was not part of the content, another mechanism would be needed to authenticate it; for example, it could be fed as AAD to the AEAD cipher. We feel the complexity of such an approach outweighs the benefit of saving one byte per message.].
+** A 1-byte '''header''' which consists of transport layer protocol flags. Currently, only the highest bit is defined as the '''ignore bit'''. The other bits are ignored, but this may change in future versions['''Why is the header a part of the plaintext and not included alongside the length field?''' The packet length field is the minimum information that must be available before we can leverage the standard RFC8439 AEAD. Any other data, including metadata like the header being in the content encryption makes it easier to reason about the protocol security w.r.t. data being used before it is authenticated. If the ignore bit was not part of the content, another mechanism would be needed to authenticate it; for example, it could be fed as AAD to the AEAD cipher. We feel the complexity of such an approach outweighs the benefit of saving one byte per message.].
** The variable-length '''contents'''.
-The encryption of the plaintext uses '''[https://en.wikipedia.org/wiki/ChaCha20-Poly1305 ChaCha20Poly1305]'''['''Why is ChaCha20Poly1305 chosen as basis for packet encryption?''' It is a very widely used authenticated encryption cipher (used amongst others in SSH, TLS 1.2, TLS 1.3, [https://en.wikipedia.org/wiki/QUIC QUIC], Noise, and [https://www.wireguard.com/protocol/ WireGuard]; in the latter it is currently even the only supported cipher), with very good performance in general purpose software implementations. While AES-based ciphers (including the winners in the [https://competitions.cr.yp.to/caesar.html CAESAR] competition in non-lightweight categories) perform significantly better on systems with AES hardware acceleration, they are also significantly slower in pure software implementations. We choose to optimize for the weakest hardware.], an [https://en.wikipedia.org/wiki/Authenticated_encryption authenticated encryption with associated data] (AEAD) cipher specified in [https://datatracker.ietf.org/doc/html/rfc8439 RFC 8439]. Every packet's plaintext is treated as a separate AEAD message, with a different nonce for each.
+The encryption of the plaintext uses '''[https://en.wikipedia.org/wiki/ChaCha20-Poly1305 ChaCha20Poly1305]'''['''Why is ChaCha20Poly1305 chosen as the basis for packet encryption?''' It is a very widely used authenticated encryption cipher (used among others in SSH, TLS 1.2, TLS 1.3, [https://en.wikipedia.org/wiki/QUIC QUIC], Noise, and [https://www.wireguard.com/protocol/ WireGuard]; in the latter it is currently even the only supported cipher), with very good performance in general purpose software implementations. While AES-based ciphers (including the winners in the [https://competitions.cr.yp.to/caesar.html CAESAR] competition in non-lightweight categories) perform significantly better on systems with AES hardware acceleration, they are also significantly slower in pure software implementations. We choose to optimize for the weakest hardware.], an [https://en.wikipedia.org/wiki/Authenticated_encryption authenticated encryption with associated data] (AEAD) cipher specified in [https://datatracker.ietf.org/doc/html/rfc8439 RFC 8439]. Every packet's plaintext is treated as a separate AEAD message, with a different nonce for each.
The length must be dealt with specially, as it is needed to determine packet boundaries before the whole packet is received and authenticated. As we want a stream that is pseudorandom to a passive attacker, it still needs encryption. We use unauthenticated['''Why is the length encryption not separately authenticated?''' Informally, the relevant security goal we aim for is to hide the number of packets and their lengths (i.e., the packet boundaries) against a passive attacker that receives the bytestream without timing or fragmentation information. (A formal definition can be found for example in [https://himsen.github.io/pdf/thesis.pdf Hansen 2016 (Definition 22)] under the name "boundary hiding against chosen-plaintext attacks (BH-CPA)".) However, we do not aim to hide packet boundaries against active attackers because active attackers can always exploit the fact that the Bitcoin P2P protocol is largely query-response based: they can trickle the bytes on the stream one-by-one unmodified and observe when a response comes (see [https://himsen.github.io/pdf/thesis.pdf Hansen 2016 (Section 3.9)] for a in-depth discussion). With that in mind, we accept that an active (non-MitM) attacker is able to figure out some information about packet boundaries by flipping certain bits in the unauthenticated length field, and observing the other side disconnecting immediately or later. Thus, we choose to use unauthenticated encryption for the length data, which is sufficient to achieve boundary hiding against passive attackers, and saves 16 bytes of bandwidth per packet.] '''ChaCha20''' encryption for this, with an independent key. Note that the plaintext length is still implicitly authenticated by the encryption of the plaintext, but this can only be verified after receiving the whole packet. This design is inspired by that of the ChaCha20Poly1305 cipher suite in [http://bxr.su/OpenBSD/usr.bin/ssh/PROTOCOL.chacha20poly1305 OpenSSH].['''How does packet encryption differ from the OpenSSH design?''' The differences are:
* The length field is only 3 bytes instead of 4, as that is sufficient for our purposes.
* Length encryption keeps drawing pseudorandom bytes from the same ChaCha20 cipher for multiple packets, rather than incrementing the nonce for every packet.
* The Poly1305 authentication tag only covers the encrypted plaintext, and not the encrypted length field. This means that plaintext encryption uses the standard ChaCha20Poly1305 construction without any modifications, maximizing applicability of analysis and review of that cipher. The length encryption can be seen as a separate layer, using a separate key, and thus cannot affect any of the confidentiality or integrity guarantees of the plaintext encryption. On the other hand, this change w.r.t. OpenSSH also does not worsen any properties, as incorrect lengths will still trigger authentication failure for the overall packet (the plaintext length is implicitly authenticated by ChaCha20Poly1305).
-* A hash step is performed every 224]['''How was the rekeying interval 224 chosen?''' Assuming a node sends only ping messages every 20 minutes (the timeout interval for post-[https://github.com/bitcoin/bips/blob/master/bip-0031.mediawiki BIP31] connections) on a connection, the node will transmit 224 packets in about 3.11 days. This means ''soft rekeying'' after a fixed number of packets automatically translates to an upper-bound of time interval for rekeying, while being much simpler to coordinate than an actual time-based rekeying regime. At the same time, doing it once every 224 messages is sufficiently infrequent that it has only negligible impact on performance. Furthermore, 224 times 3 bytes (the number of bytes consumed by each length encryption) is 672, which is a multiple of 64 minus 32. This means that at the end of 224 length encryptions, exactly 32 bytes of keystream data remain that can be used as next key.] messages to rekey the the encryption ciphers, in order to provide forward security.
+* A hash step is performed every 224['''How was the rekeying interval 224 chosen?''' Assuming a node sends only ping messages every 20 minutes (the timeout interval for post-[https://github.com/bitcoin/bips/blob/master/bip-0031.mediawiki BIP31] connections) on a connection, the node will transmit 224 packets in about 3.11 days. This means ''soft rekeying'' after a fixed number of packets automatically translates to an upper-bound of time interval for rekeying, while being much simpler to coordinate than an actual time-based rekeying regime. At the same time, doing it once every 224 messages is sufficiently infrequent that it has only negligible impact on performance. Furthermore, 224 times 3 bytes (the number of bytes consumed by each length encryption) is 672, which is a multiple of 64 minus 32. This means that at the end of 224 length encryptions, exactly 32 bytes of keystream data remain that can be used as next key.] messages to rekey the encryption ciphers, in order to provide forward security.
Because only fixed-length chunks (3-byte length fields) are encrypted, we do not need to treat all length chunks as separate messages. Instead, a single cipher (with the same nonce) is used for multiple consecutive length fields. This avoids wasting 61 pseudorandom bytes per packet, and makes the cost of having a separate cipher for length encryption negligible.['''Is it acceptable to use a less standard construction for length encryption?''' The fact that multiple (non-overlapping) bytes generated by a single ChaCha20 cipher are used for the encryption of multiple consecutive length fields is uncommon. We feel the performance cost gained by this deviation is worth it (especially for small packets, which are very common in Bitcoin's P2P protocol), given the low guarantees that are feasible for length encryption in the first place, and the result is still sufficient to provide pseudorandomness from the view of passive attackers. For plaintext encryption, we independently use a very standard construction, as the stakes for confidentiality and integrity there are much higher.]
-In order to provide forward security['''What value does forward security provide?''' Re-keying ensures [https://eprint.iacr.org/2001/035.pdf forward secrecy within a session], i.e., an attacker compromising the current session secrets cannot derive past encryption keys in the same session.]['''Why have a cipher with forward secrecy but no periodical refresh of the ECDH key exchange?''' Our cipher ratchets encryption keys forward in order to protect messages encrypted under ''past'' encryption keys. In contrast, re-performing ECDH key exchange would protect messages encrypted under ''future'' encryption keys, i.e., it would re-establish security after the attacker had compromised one of the peers ''temporarily'' (e.g., the attacker obtains a memory dump). We do not believe protecting against that is a priority: an attacker that, for whatever reason, is capable of an attack that reveals encryption keys (or other session secrets) of a peer once is likely capable of performing the same attack again after peers have re-performed the ECDH key exchange. Thus, we do not believe the benefits of re-performing key exchange outweigh the additional complexity that comes with the necessary coordination between the peers. We note that the initiator could choose to close and re-open the entire connection in order to force a refresh of the ECDH key exchange, but that introduces other issues: a connection slot needs to be kept open at the responder side, it is not cryptographically guaranteed that really the same initiator will use it, and the observable TCP reset and handshake may create a detectable pattern.], the encryption keys for both plaintext and length encryption are cycled every 224 messages, by switching to a new key that is generated by the key stream using the old key.
+In order to provide forward security['''What value does forward security provide?''' Re-keying ensures [https://eprint.iacr.org/2001/035.pdf forward secrecy within a session], i.e., an attacker compromising the current session secrets cannot derive past encryption keys in the same session.]['''Why have a cipher with forward secrecy but no periodical refresh of the ECDH key exchange?''' Our cipher ratchets encryption keys forward in order to protect messages encrypted under ''past'' encryption keys. In contrast, re-performing ECDH key exchange would protect messages encrypted under ''future'' encryption keys, i.e., it would re-establish security after the attacker had compromised one of the peers ''temporarily'' (e.g., the attacker obtains a memory dump). We do not believe protecting against that is a priority: an attacker that, for whatever reason, is capable of an attack that reveals encryption keys (or other session secrets) of a peer once is likely capable of performing the same attack again after peers have re-performed the ECDH key exchange. Thus, we do not believe the benefits of re-performing key exchange outweigh the additional complexity that comes with the necessary coordination between the peers. We note that the initiator could choose to close and re-open the entire connection to force a refresh of the ECDH key exchange, but that introduces other issues: a connection slot needs to be kept open at the responder side, it is not cryptographically guaranteed that really the same initiator will use it, and the observable TCP reset and handshake may create a detectable pattern.], the encryption keys for both plaintext and length encryption are cycled every 224 messages, by switching to a new key that is generated by the key stream using the old key.
==== Handshake: key exchange and version negotiation ====
@@ -242,19 +242,19 @@ To define the needed functions, we first introduce a helper function, matching t
** For every ''x'' in ''{u + 4Y2, (-X/Y - u)/2, (X/Y - u)/2}'' (all ''mod p''; the order matters):
*** If ''lift_x(x)'' succeeds, return ''x''. There is at least one such ''x''.
-To find encodings of a given X coordinate ''x'', we first need the inverse of ''XSwiftEC''. The function ''XSwiftECInv(x, u, case)'' either returns ''t'' such that ''XSwiftEC(u, t) = x'', or ''None''. The ''case'' variable is an integer in range 0 to 7 inclusive, which selects which of the up to 8 valid such ''t'' values to return:
+To find encodings of a given X coordinate ''x'', we first need the inverse of ''XSwiftEC''. The function ''XSwiftECInv(x, u, case)'' either returns ''t'' such that ''XSwiftEC(u, t) = x'', or ''None''. The ''case'' variable is an integer in range ''0..7'', which selects which of the up to 8 valid such ''t'' values to return:
* ''XSwiftECInv(x, u, case)'':
** If ''case & 2 = 0'':
*** If ''lift_x(-x - u)'' succeeds, return ''None''.
*** Let ''v = x''.
*** Let ''s = -(u3 + 7)/(u2 + uv + v2) (mod p)''.
-** If ''case & 2 = 2'':
+** Else (''case & 2 = 2''):
*** Let ''s = x - u (mod p)''.
*** If ''s = 0'', return ''None''.
*** Let ''r'' be the square root of ''-s(4(u3 + 7) + 3u2s) (mod p).''['''How to compute a square root mod ''p''?''' Due to the structure of ''p'', a candidate for the square root of ''a'' mod ''p'' can be computed as ''x = a(p+1)/4 mod p''. If ''a'' is not a square mod ''p'', this formula returns the square root of ''-a mod p'' instead, so it is necessary to verify that ''x2 mod p = a''. If that is the case ''-x mod p'' is a solution too, but we define "the" square root to be equal to that expression (the square root will therefore always be a square itself, as ''(p+1)/4'' is even). This algorithm is a specialization of the [https://en.wikipedia.org/wiki/Tonelli%E2%80%93Shanks_algorithm Tonelli-Shanks algorithm].] Return ''None'' if it does not exist.
-** If ''case & 1 = 1'' and ''r = 0'', return ''None''.
-*** Let ''v = (-u + r/s)/2''.
+*** If ''case & 1 = 1'' and ''r = 0'', return ''None''.
+*** Let ''v = (r/s - u)/2''.
** Let ''w'' be the square root of ''s (mod p)''. Return ''None'' if it does not exist.
** If ''case & 5 = 0'', return ''-w(u(1 - c)/2 + v)''.
** If ''case & 5 = 1'', return ''w(u(1 + c)/2 + v)''.
@@ -357,7 +357,7 @@ def respond_v2_handshake(peer, garbage_len):
use_v1_protocol()
-Upon receiving the encoded responder public key, the initiator derives the shared ECDH secret and instantiates the encrypted transport. It then sends the derived 16-byte initiator_garbage_terminator
followed by an authenticated, encrypted packet with empty contents['''Does the content of the garbage authentication packet need to be empty?''' The receiver ignores the content of the garbage authentication packet, so its content can be anything, and it can in principle be used as a shaping mechanism too. There is however no need for that, as immediately afterwards the initiator can start using decoy packets as (much more flexible) shaping mechanism instead.] to authenticate the garbage, and its own version packet. It then receives the responder's garbage and garbage authentication packet (delimited by the garbage terminator), and checks if the garbage is authenticated correctly. The responder performs very similar steps, but includes the earlier received prefix bytes in the public key. As mentioned before, the encrypted packets for the '''version negotiation phase''' can be piggybacked with the garbage authentication packet to minimize roundtrips.
+Upon receiving the encoded responder public key, the initiator derives the shared ECDH secret and instantiates the encrypted transport. It then sends the derived 16-byte initiator_garbage_terminator
followed by an authenticated, encrypted packet with empty contents['''Does the content of the garbage authentication packet need to be empty?''' The receiver ignores the content of the garbage authentication packet, so its content can be anything, and it can in principle be used as a shaping mechanism too. There is however no need for that, as immediately afterward the initiator can start using decoy packets as (a much more flexible) shaping mechanism instead.] to authenticate the garbage, and its own version packet. It then receives the responder's garbage and garbage authentication packet (delimited by the garbage terminator), and checks if the garbage is authenticated correctly. The responder performs very similar steps but includes the earlier received prefix bytes in the public key. As mentioned before, the encrypted packets for the '''version negotiation phase''' can be piggybacked with the garbage authentication packet to minimize roundtrips.
def complete_handshake(peer, initiating):
@@ -405,7 +405,7 @@ These will be used for plaintext encryption and length encryption, respectively.
To provide re-keying every 224 packets, we specify two wrappers.
-The first is '''FSChaCha20Poly1305''', which represents a ChaCha20Poly1305 AEAD, which automatically changes the nonce after every message, and rekeys every 224 messages by encrypting 32 zero bytes['''Why is rekeying implemented in terms of an invocation of the AEAD?''' This means the FSChaCha20Poly1305 wrapper can be thought of as a pure layer around the ChaCha20Poly1305 AEAD. Actual implementations can take advantage of the fact that this formulation is equivalent to using byte 64 through 95 of the keystream output of the underlying ChaCha20 cipher as new key, avoiding the need for Poly1305 in the process.], and using the first 32 bytes of the result. Each message will be used for one packet. Note that in our protocol, any FSChaCha20Poly1305 instance is always either exclusively encryption or exclusively decryption, as separate instances are used for each direction of the protocol. The nonce used for a message is composed of the 32-bit little endian encoding of the number of messages with the current key, followed by the 64-bit little endian encoding of the number of rekeyings performed. For rekeying, the first 32-bit integer is set to ''0xffffffff''.
+The first is '''FSChaCha20Poly1305''', which represents a ChaCha20Poly1305 AEAD, which automatically changes the nonce after every message, and rekeys every 224 messages by encrypting 32 zero bytes['''Why is rekeying implemented in terms of an invocation of the AEAD?''' This means the FSChaCha20Poly1305 wrapper can be thought of as a pure layer around the ChaCha20Poly1305 AEAD. Actual implementations can take advantage of the fact that this formulation is equivalent to using byte 64 through 95 of the keystream output of the underlying ChaCha20 cipher as new key, avoiding the need for Poly1305 in the process.], and using the first 32 bytes of the result. Each message will be used for one packet. Note that in our protocol, any FSChaCha20Poly1305 instance is always either exclusively encryption or exclusively decryption, as separate instances are used for each direction of the protocol. The nonce used for a message is composed of the 32-bit little-endian encoding of the number of messages with the current key, followed by the 64-bit little-endian encoding of the number of rekeyings performed. For rekeying, the first 32-bit integer is set to ''0xffffffff''.
REKEY_INTERVAL = 224
@@ -437,7 +437,7 @@ class FSChaCha20Poly1305:
return self.crypt(aad, plaintext, False)
-The second is '''FSChaCha20''', a (single) stream cipher which is used for the lengths of all packets. Encryption and decryption are identical here, so a single function crypt
is exposed. It XORs the input with bytes generated using the ChaCha20 block function, rekeying every 224 chunks using the next 32 bytes of the block function output as new key. A ''chunk'' refers here to a single invocation of crypt
. As explained before, the same cipher is used for 224 consecutive chunks, to avoid wasting cipher output. The nonce used for these batches of 224 chunks is composed of 4 zero bytes followed by the 64-bit little endian encoding of the number of rekeyings performed. The block counter is reset to 0 after every rekeying.
+The second is '''FSChaCha20''', a (single) stream cipher which is used for the lengths of all packets. Encryption and decryption are identical here, so a single function crypt
is exposed. It XORs the input with bytes generated using the ChaCha20 block function, rekeying every 224 chunks using the next 32 bytes of the block function output as new key. A ''chunk'' refers here to a single invocation of crypt
. As explained before, the same cipher is used for 224 consecutive chunks, to avoid wasting cipher output. The nonce used for these batches of 224 chunks is composed of 4 zero bytes followed by the 64-bit little-endian encoding of the number of rekeyings performed. The block counter is reset to 0 after every rekeying.
class FSChaCha20:
@@ -515,12 +515,12 @@ v2 Bitcoin P2P transport layer packets use the encrypted message structure shown
{|class="wikitable"
! Field !! Size in bytes !! Comments
|-
-| message_type
|| ''1..13'' || either a one byte ID or an ASCII string prefixed with a length byte
+| message_type
|| 1 or 13 || either a one byte ID in range ''1..255'' or b'\x00'
followed by a 12-byte ASCII message type (as in the v1 P2P protocol)
|-
| message_payload
|| message_length
|| message payload
|}
-If the first byte of message_type
is in the range ''1..12'', it is interpreted as the number of ASCII bytes that follow for the message type. If it is in the range ''13..255'', it is interpreted as a message type ID. This structure results in smaller messages than the v1 protocol as most messages sent/received will have a message type ID.['''How do the length between v1 and v2 compare?''' For messages that use the 1-byte short message type ID, v2 packets use 3 bytes less per message than v1.]
+If the first byte of message_type
is b'\x00'
, the following 12 bytes are interpreted as an ASCII message type (as in the v1 P2P protocol), trailing padded with b'\x00'
as necessary. If the first byte of message_type
is in the range ''1..255'', it is interpreted as a message type ID. This structure results in smaller messages than the v1 protocol, as most messages sent/received will have a message type ID. We recommend reserving 1-byte type IDs for message types that are sent more than once per direction per connection.['''How do the lengths between v1 and v2 compare?''' For messages that use the 1-byte short message type ID, v2 packets use 3 bytes less per message than v1.]['''Why not allow variable length long message type IDs?''' Allowing for variable length long IDs reduces the available 1-byte ID space by 12 (to encode the length itself) and incentivizes less descriptive message types. In addition, limiting message types to fixed lengths of 1 or 13 hampers traffic analysis.]
The following table lists currently defined message type IDs:
@@ -533,50 +533,38 @@ The following table lists currently defined message type IDs:
!3
|-
!+0
-|(undefined)||(1 byte string)||(2 byte string)||(3 byte string)
+|(12 bytes follow)||ADDR
||BLOCK
||BLOCKTXN
|-
!+4
-|(4 byte string)||(5 byte string)||(6 byte string)||(7 byte string)
-|-
-!+8
-|(8 byte string)||(9 byte string)||(10 byte string)||(11 byte string)
-|-
-!+12
-|(12 byte string)||ADDR
||BLOCK
||BLOCKTXN
-|-
-!+16
|CMPCTBLOCK
||FEEFILTER
||FILTERADD
||FILTERCLEAR
|-
+!+8
+|FILTERLOAD
||GETBLOCKS
||GETBLOCKTXN
||GETDATA
+|-
+!+12
+|GETHEADERS
||HEADERS
||INV
||MEMPOOL
+|-
+!+16
+|MERKLEBLOCK
||NOTFOUND
||PING
||PONG
+|-
!+20
-|FILTERLOAD
||GETADDR
||GETBLOCKS
||GETBLOCKTXN
+|SENDCMPCT
||TX
||GETCFILTERS
||CFILTER
|-
!+24
-|GETDATA
||GETHEADERS
||HEADERS
||INV
-|-
-!+28
-|MEMPOOL
||MERKLEBLOCK
||NOTFOUND
||PING
-|-
-!+32
-|PONG
||SENDCMPCT
||SENDHEADERS
||TX
-|-
-!+36
-|VERACK
||VERSION
||GETCFILTERS
||CFILTER
-|-
-!+40
|GETCFHEADERS
||CFHEADERS
||GETCFCHECKPT
||CFCHECKPT
|-
-!+44
-|WTXIDRELAY
||ADDRV2
||SENDADDRV2
||SENDTXRCNCL
+!+28
+|ADDRV2
||REQRECON
||SKETCH
||REQSKETCHEXT
|-
-!+48
-|REQRECON
||SKETCH
||REQSKETCHEXT
||RECONCILDIFF
+!+32
+|RECONCILDIFF
|-
-!≥52
+!≥33
|| colspan="4" | (undefined)
|}
-The message types may be updated separately after BIP finalization.
+Additional message types may be added separately after BIP finalization.
=== Signaling specification ===
==== Signaling v2 support ====
@@ -586,8 +574,8 @@ Peers supporting the v2 transport protocol signal support by advertising the