diff --git a/bip-taproot.mediawiki b/bip-taproot.mediawiki new file mode 100644 index 00000000..2e8b43b1 --- /dev/null +++ b/bip-taproot.mediawiki @@ -0,0 +1,285 @@ +
+  BIP: bip-taproot
+  Layer: Consensus (soft fork)
+  Title: Taproot: SegWit version 1 output spending rules
+  Author: Pieter Wuille
+  Comments-Summary: No comments yet.
+  Comments-URI:
+  Status: Draft
+  Type: Standards Track
+  Created:
+  License: BSD-3-Clause
+
+==Introduction==
+
+===Abstract===
+
+This document proposes a new SegWit version 1 output type, with spending rules based on Taproot, Schnorr signatures, and Merkle branches.
+
+===Copyright===
+
+This document is licensed under the 3-clause BSD license.
+
+===Motivation===
+
+A number of related ideas for improving Bitcoin's scripting capabilities have been previously proposed: Schnorr signatures (bip-schnorr), Merkle branches ("MAST", [https://github.com/bitcoin/bips/blob/master/bip-0114.mediawiki BIP114], [https://github.com/bitcoin/bips/blob/master/bip-0117.mediawiki BIP117]), new sighash modes ([https://github.com/bitcoin/bips/blob/master/bip-0118.mediawiki BIP118]), new opcodes like CHECKSIGFROMSTACK, [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-January/015614.html Taproot], [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-February/015700.html Graftroot], [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-July/016249.html G'root], and [https://bitcointalk.org/index.php?topic=1377298.0 cross-input aggregation].
+
+Combining all these ideas in a single proposal would be an extensive change, be hard to review, and likely miss new discoveries that otherwise could have been made along the way. Some of these ideas are also less mature than others. On the other hand, separating them all into independent proposals would reduce the efficiency and privacy gains to be had, and complicate analysis of their interactions. It seems preferable to focus on one goal set at a time, and combine interacting technologies to achieve them.
+
+==Design==
+
+This proposal focuses on improvements to privacy, efficiency, and flexibility of Bitcoin's smart contracts, subject to two restrictions:
+* Not adding any new strong security assumptions
+* Not combining into the proposal any functionality which could be simply implemented independently.
+
+Specifically, it seeks to minimize how much information about the spendability conditions of a transaction output is revealed on chain at creation or spending time. To avoid reducing the effectiveness of future improvements a number of upgrade mechanisms are also included, as well as fixes for minor but long-standing issues.
+
+As a result we choose this combination of technologies:
+* '''Merkle branches''' let us only reveal the actually executed part of the script to the blockchain, as opposed to all possible ways a script can be executed. Among the various known mechanisms for implementing this, one where the Merkle tree becomes part of the script's structure directly maximizes the space savings, so that approach is chosen.
+* '''Taproot''' on top of that lets us merge the traditionally separate pay-to-pubkey and pay-to-scripthash policies, making all outputs spendable by either a key or (optionally) a script, and indistinguishable from each other. As long as the key-based spending path is used for spending, it is not revealed whether a script path was permitted as well, resulting in space savings and an increase in scripting privacy at spending time.
+* Taproot's advantages become apparent under the assumption that most applications involve outputs that could be spent by all parties agreeing. That's where '''Schnorr''' signatures come in, as they permit [https://eprint.iacr.org/2018/068 key aggregation]: a public key can be constructed from multiple participant public keys, and requires cooperation between all participants to sign for. Such multi-party public keys and signatures are indistinguishable from their single-party equivalents. This means that under this Taproot assumption, the all-parties-agree case can be handled using the key-based spending path, which is both private and efficient using Taproot. This can be generalized to arbitrary M-of-N policies, as Schnorr signatures support threshold signing, at the cost of more complex setup protocols.
+* As Schnorr signatures also permit '''batch validation''', allowing multiple signatures to be validated together more efficiently than validating each one independently, we make sure all parts of the design are compatible with this.
+* Where unused bits appear as a result of the above changes, they are reserved for mechanisms for '''future extensions'''. As a result, every script in the Merkle tree has an associated version such that new script versions can be introduced with a soft fork while remaining compatible with bip-taproot. Additionally, future soft forks can make use of the currently unused <code>annex</code> in the witness (see [[#Rationale]]).
+* While the core semantics of the '''signature hashing algorithm''' are not changed, a number of improvements are included in this proposal. The new signature hashing algorithm fixes the verification capabilities of offline signing devices by including amount and scriptPubKey in the digest, avoids unnecessary hashing, introduces '''tagged hashes''' and defines a default sighash byte.
+
+Not included in this proposal are additional features like new sighash modes or opcodes that can be included with no loss in effectiveness as a future extension. Also not included is cross-input aggregation, as it [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-March/015838.html interacts] in complex ways with upgrade mechanisms and solutions to that are still [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-October/016461.html in flux].
+
+== Specification ==
+
+This section specifies the Taproot consensus rules. Validity is defined by exclusion: a block or transaction is valid if no condition exists that marks it failed.
+
+The notation below follows that of bip-schnorr.
+
+=== Tagged hashes ===
+
+Cryptographic hash functions are used for multiple purposes in the specification below and in Bitcoin in general. To make sure hashes used in one context can't be reinterpreted in another one, all hash functions are tweaked with a context-dependent tag name, in such a way that collisions across contexts can be assumed to be infeasible.
+
+In the text below, ''hashtag(m)'' is a shorthand for ''SHA256(SHA256(tag) || SHA256(tag) || m)'', where ''tag'' is a UTF-8 encoded tag name.
+* So far, nowhere in the Bitcoin protocol are hashes used where the input of SHA256 starts with two (non-double) SHA256 hashes, making collisions with existing uses of hash functions infeasible.
+* Because the prefix ''SHA256(tag) || SHA256(tag)'' is a 64-byte long context-specific constant, optimized implementations are possible (identical to SHA256 itself, but with a modified initial state).
+* Using SHA256 of the tag name itself is reasonably simple and efficient for implementations that don't choose to use the optimization above.
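
The tagged hash construction above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name <code>tagged_hash</code> is an assumption of this sketch, and the tag strings are taken from the ''hashTapLeaf'' and ''hashTapBranch'' names used later in this document.

```python
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    """SHA256(SHA256(tag) || SHA256(tag) || msg), with tag encoded as UTF-8."""
    tag_hash = hashlib.sha256(tag.encode("utf-8")).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()

# The same message hashed under two different tags gives unrelated digests.
leaf_digest = tagged_hash("TapLeaf", b"example")
branch_digest = tagged_hash("TapBranch", b"example")
```

An implementation that wants the midstate optimization would precompute the SHA256 state after the 64-byte prefix once per tag; the sketch above does not do that.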
+
+=== Script validation rules ===
+
+A Taproot output is a SegWit output (native or P2SH-nested, see [https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki BIP141]) with version number 1, and a 33-byte witness program whose first byte is 0 or 1.
+The following rules only apply when such an output is being spent. Any other outputs, including version 1 outputs with lengths other than 33 bytes, or with a first byte different from 0 or 1, remain unencumbered.
+
+* Let ''u'' be the 33-byte array containing the witness program (second push in scriptPubKey or P2SH redeemScript).
+* Let ''Q = point(byte(2 + u[0]) || u[1:33])''<ref>'''Why is the public key directly included in the output?''' While typical earlier constructions store a hash of a script or a public key in the output, this is rather wasteful when a public key is always involved. To guarantee batch verifiability, ''Q'' must be known to every verifier, and thus only revealing its hash as an output would imply adding an additional 33 bytes to the witness. Furthermore, to maintain [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2016-January/012198.html 128-bit collision security] for outputs, a 256-bit hash would be required anyway, which is comparable in size (and thus in cost for senders) to revealing the public key directly. While the usage of public key hashes is often said to protect against ECDLP breaks or quantum computers, this protection is very weak at best: transactions are not protected while being confirmed, and a very [https://twitter.com/pwuille/status/1108097835365339136 large portion] of the currency's supply is not under such protection regardless. Actual resistance to such systems can be introduced by relying on different cryptographic assumptions, but this proposal focuses on improvements that do not change the security model. Note that P2SH-wrapped outputs only have 80-bit collision security. This is considered low, and is relevant whenever the output includes data from more than a single party (public keys, hashes, ...).</ref>. If this is not a valid point on the curve, fail.
+* Fail if the witness stack has 0 elements.
+* If there are at least two witness elements, and the first byte of the last element is 0x50<ref>'''Why is the first byte of the annex <code>0x50</code>?''' Like the <code>0xc0</code>-<code>0xc1</code> constants, <code>0x50</code> is chosen as it could not be confused with a valid P2WPKH or P2WSH spending. As the control block's initial byte's lowest bit is used to indicate the public key's Y oddness, each script version needs two subsequent byte values that are both not yet used in P2WPKH or P2WSH spending. To indicate the annex, only an "unpaired" available byte is necessary like <code>0x50</code>. This choice maximizes the available options for future script versions.</ref>, this last element is called ''annex'' ''a''<ref>'''What is the purpose of the annex?''' The annex is a reserved space for future extensions, such as indicating the validation costs of computationally expensive new opcodes in a way that is recognizable without knowing the outputs being spent. Until the meaning of this field is defined by another softfork, users SHOULD NOT include <code>annex</code> in transactions, or it may lead to PERMANENT FUND LOSS.</ref> and is removed from the witness stack. The annex (or the lack thereof) is always covered by the transaction digest and contributes to transaction weight, but is otherwise ignored during taproot validation.
+* If there is exactly one element left in the witness stack, key path spending is used:
+** The single witness stack element is interpreted as the signature and must be valid (see the next section) for the public key ''Q'' and taproot transaction digest (to be defined hereinafter) as message. Fail if it is not. Otherwise pass.
+* If there are at least two witness elements left, script path spending is used:
+** Call the second-to-last stack element ''s'', the script.
+** The last stack element is called the control block ''c'', and must have length ''33 + 32m'', for a value of ''m'' that is an integer between 0 and 32, inclusive. Fail if it does not have such a length.
+** Let ''P = point(byte(2 + (c[0] & 1)) || c[1:33])''<ref>'''What is the purpose of the first byte of the control block?''' The first byte of the control block has three distinct functions:
+* The low bit is used to denote the oddness of the Y coordinate of the ''P'' point.
+* By keeping the top two bits set to true, it can be guaranteed that scripts can be recognized without knowledge of the UTXO being spent, simplifying analysis. This is because such values cannot occur as first byte of the final stack element in either P2WPKH or P2WSH spends.
+* The remaining five bits are used for introducing new script versions that are not observable unless actually executed.
+</ref>. Fail if this point is not on the curve.
+** Let ''l = c[0] & 0xfe'', the leaf version.
+** Let ''k<sub>0</sub> = hashTapLeaf(l || compact_size(size of s) || s)''; also call it the ''tapleaf hash''.
+** For ''j'' in ''[0,1,...,m-1]'':
+*** Let ''e<sub>j</sub> = c[33+32j:65+32j]''.
+*** Let ''k<sub>j+1</sub>'' depend on whether ''k<sub>j</sub> < e<sub>j</sub>'' (lexicographically)<ref>'''Why are child elements sorted before hashing in the Merkle tree?''' By doing so, it is not necessary to reveal the left/right directions along with the hashes in revealed Merkle branches. This is possible because we do not actually care about the position of specific scripts in the tree; only that they are actually committed to.</ref>:
+**** If ''k<sub>j</sub> < e<sub>j</sub>'': ''k<sub>j+1</sub> = hashTapBranch(k<sub>j</sub> || e<sub>j</sub>)''<ref>'''Why not use a more efficient hash construction for inner Merkle nodes?''' The chosen construction does require two invocations of the SHA256 compression functions, one of which can be avoided in theory (see BIP98). However, it seems preferable to stick to constructions that can be implemented using standard cryptographic primitives, both for implementation simplicity and analyzability. If necessary, a significant part of the second compression function can be optimized out by [https://github.com/bitcoin/bitcoin/pull/13191 specialization] for 64-byte inputs.</ref>.
+**** If ''k<sub>j</sub> ≥ e<sub>j</sub>'': ''k<sub>j+1</sub> = hashTapBranch(e<sub>j</sub> || k<sub>j</sub>)''.
+** Let ''t = hashTapTweak(bytes(P) || k<sub>m</sub>) = hashTapTweak(2 + (c[0] & 1) || c[1:33] || k<sub>m</sub>)''.
+** If ''t ≥ 0xFFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141'' (order of secp256k1), fail.
+** If ''Q ≠ P + int(t)G'', fail.
+** Execute the script, according to the applicable script rules<ref>'''What are the applicable script rules in script path spends?''' Bip-tapscript specifies validity rules that apply if the leaf version is ''0xc0'', but future proposals can introduce rules for other leaf versions.</ref>, using the witness stack elements excluding the script ''s'', the control block ''c'', and the annex ''a'' if present, as initial stack.
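
The hashing part of the script path rules above can be sketched as follows. This is a non-authoritative illustration: it derives the leaf version and the tweak ''t'' from a script and control block, and deliberately leaves out the elliptic curve steps (decoding ''P'' and checking ''Q = P + int(t)G''). The function names are assumptions of this sketch, and ''int(t)'' is taken to be big-endian as in bip-schnorr.

```python
import hashlib

# Order of the secp256k1 group, as given in the rules above.
SECP256K1_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def tagged_hash(tag: str, msg: bytes) -> bytes:
    tag_hash = hashlib.sha256(tag.encode("utf-8")).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()

def compact_size(n: int) -> bytes:
    """Bitcoin's variable-length integer encoding."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")

def tweak_from_control_block(script: bytes, control: bytes):
    """Return (leaf_version, t), or None if one of the hashing-related checks fails."""
    # The control block must have length 33 + 32m with 0 <= m <= 32.
    if len(control) < 33 or (len(control) - 33) % 32 != 0:
        return None
    m = (len(control) - 33) // 32
    if m > 32:
        return None
    leaf_version = control[0] & 0xFE
    k = tagged_hash("TapLeaf", bytes([leaf_version]) + compact_size(len(script)) + script)
    for j in range(m):
        e = control[33 + 32 * j:65 + 32 * j]
        # bytes comparison in Python is lexicographic, matching the rule above.
        k = tagged_hash("TapBranch", k + e if k < e else e + k)
    p_bytes = bytes([2 + (control[0] & 1)]) + control[1:33]
    t = int.from_bytes(tagged_hash("TapTweak", p_bytes + k), "big")
    if t >= SECP256K1_ORDER:
        return None
    return leaf_version, t
```

A full validator would additionally decode ''P'' and ''Q'' as curve points and compare ''Q'' with ''P + int(t)G'' before executing the script.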
+
+''Q'' is referred to as ''taproot output key'' and ''P'' as ''taproot internal key''.
+
+=== Signature validation rules ===
+
+The following rules apply:
+
+* If the signature is not 64<ref>'''Why permit two signature lengths?''' By making the most common type of <code>hash_type</code> implicit, a byte can often be saved.</ref> or 65 bytes, fail.
+* If the signature size is 65 bytes:
+** If the final byte is not a valid <code>hash_type</code> (defined hereinafter), fail.
+** If the final byte is <code>0x00</code>, fail<ref>'''Why can the <code>hash_type</code> not be <code>0x00</code> in 65-byte signatures?''' Permitting that would enable malleating 64-byte signatures into 65-byte ones, resulting in a different fee rate than the creator intended.</ref>.
+** If the first 64 bytes are not a valid signature according to bip-schnorr for the public key and message set to the transaction digest with <code>hash_type</code> set as the final byte, fail.
+* If the signature size is 64 bytes:
+** If it is not a valid signature according to bip-schnorr for the public key and the <code>hash_type = 0x00</code> transaction digest as message, fail.
+* Otherwise the signature is valid.
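
The length and <code>hash_type</code> handling above can be sketched as a small parsing step that runs before the actual bip-schnorr verification. This is an illustrative sketch only: the function name is hypothetical, the set of accepted values is the one listed in the <code>hash_type</code> section of this document, and the <code>SIGHASH_SINGLE</code> "corresponding output" check and the signature verification itself are not modeled.

```python
# hash_type values accepted by this proposal (0x00 only as the implicit default).
VALID_HASH_TYPES = {0x00, 0x01, 0x02, 0x03, 0x81, 0x82, 0x83}

def split_signature(sig: bytes):
    """Return (64-byte signature, hash_type), or None if the encoding is invalid."""
    if len(sig) == 64:
        # 64-byte signatures use the implicit default hash_type 0x00.
        return sig, 0x00
    if len(sig) == 65:
        hash_type = sig[64]
        # An explicit 0x00 is rejected so 64-byte signatures cannot be malleated
        # into 65-byte ones.
        if hash_type == 0x00 or hash_type not in VALID_HASH_TYPES:
            return None
        return sig[:64], hash_type
    return None
```

The returned pair would then be passed to the transaction digest computation and to the bip-schnorr verifier.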
+
+==== hash_type ====
+
+<code>hash_type</code> is an 8-bit unsigned value. The <code>SIGHASH</code> encodings from the legacy script system are used, including <code>SIGHASH_ALL</code>, <code>SIGHASH_NONE</code>, <code>SIGHASH_SINGLE</code>, and <code>SIGHASH_ANYONECANPAY</code>.
+
+The following uses of <code>hash_type</code> are invalid, and fail execution:
+
+* Using <code>SIGHASH_SINGLE</code> without a "corresponding output" (an output with the same index as the input being verified).
+* Using any <code>hash_type</code> value that is not <code>0x00</code>, <code>0x01</code>, <code>0x02</code>, <code>0x03</code>, <code>0x81</code>, <code>0x82</code>, or <code>0x83</code><ref>'''Why reject unknown <code>hash_type</code> values?''' By doing so, it is easier to reason about the worst case amount of signature hashing an implementation with adequate caching must perform.</ref>.
+* The signature has 65 bytes, and <code>hash_type</code> is <code>0x00</code>.
+
+==== Transaction digest ====
+
+As the message for signature verification, the transaction digest is ''hashTapSighash'' of the following values (sizes in bytes), serialized. Numerical values of 2, 4, or 8 bytes are encoded in little-endian.
+
+* Control:
+** <code>epoch</code> (1): always 0.<ref>'''What's the purpose of the epoch?''' The <code>epoch</code> can be increased to allow securely creating new transaction digest algorithms with large changes to the structure or interpretation of <code>hash_type</code> if needed.</ref>
+** <code>hash_type</code> (1).
+* Transaction data:
+** <code>nVersion</code> (4): the <code>nVersion</code> of the transaction.
+** <code>nLockTime</code> (4): the <code>nLockTime</code> of the transaction.
+** If the <code>SIGHASH_ANYONECANPAY</code> flag is not set:
+*** <code>sha_prevouts</code> (32): the SHA256 of the serialization of all input outpoints.
+*** <code>sha_amounts</code> (32): the SHA256 of the serialization of all input amounts.
+*** <code>sha_sequences</code> (32): the SHA256 of the serialization of all input <code>nSequence</code>.
+** If both the <code>SIGHASH_NONE</code> and <code>SIGHASH_SINGLE</code> flags are not set:
+*** <code>sha_outputs</code> (32): the SHA256 of the serialization of all outputs in <code>CTxOut</code> format.
+* Data about this input:
+** <code>spend_type</code> (1):
+*** Bit-0 is set if the <code>scriptPubKey</code> being spent is P2SH (opposed to "native segwit").
+*** Bit-1 is set if an annex is present (the original witness stack has two or more witness elements, and the first byte of the last element is <code>0x50</code>).
+*** The other bits are unset.
+** <code>scriptPubKey</code> (24 or 36): <code>scriptPubKey</code> of the previous output spent by this input, serialized as script inside <code>CTxOut</code>. The size is 24-byte for P2SH-embedded segwit, or 36-byte for native segwit.
+** If the <code>SIGHASH_ANYONECANPAY</code> flag is set:
+*** <code>outpoint</code> (36): the <code>COutPoint</code> of this input (32-byte hash + 4-byte little-endian).
+*** <code>amount</code> (8): value of the previous output spent by this input.
+*** <code>nSequence</code> (4): <code>nSequence</code> of this input.
+** If the <code>SIGHASH_ANYONECANPAY</code> flag is not set:
+*** <code>input_index</code> (2): index of this input in the transaction input vector. Index of the first input is 0.
+** If the bit-1 of <code>spend_type</code> is set:
+*** <code>sha_annex</code> (32): the SHA256 of (compact_size(size of annex) || annex).
+* Data about this output:
+** If the <code>SIGHASH_SINGLE</code> flag is set:
+*** <code>sha_single_output</code> (32): the SHA256 of the corresponding output in <code>CTxOut</code> format.
+The total number of bytes hashed is at most ''209''<ref>'''What is the number of bytes hashed for the signature hash?''' The total size of the input to ''hashTapSighash'' (excluding the initial 64-byte hash tag) can be computed using the following formula: ''177 - is_anyonecanpay * 50 - is_none * 32 - is_p2sh_spending * 12 + has_annex * 32''.</ref>.
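
The size formula can be cross-checked against the field list by simply adding up the field sizes. The sketch below does that in Python; both function names are assumptions made for illustration, and the <code>hash_type</code> bit layout (low two bits 2 for <code>SIGHASH_NONE</code>, 3 for <code>SIGHASH_SINGLE</code>, bit <code>0x80</code> for <code>SIGHASH_ANYONECANPAY</code>) follows the legacy encodings referenced above.

```python
def digest_size_by_formula(is_anyonecanpay, is_none, is_p2sh_spending, has_annex):
    """Size formula quoted in the text (excluding the 64-byte tag prefix)."""
    return 177 - is_anyonecanpay * 50 - is_none * 32 - is_p2sh_spending * 12 + has_annex * 32

def digest_size_by_fields(hash_type, is_p2sh_spending, has_annex):
    """Sum of the serialized field sizes listed in the transaction digest section."""
    anyonecanpay = bool(hash_type & 0x80)
    output_mode = hash_type & 0x03          # 2 = SIGHASH_NONE, 3 = SIGHASH_SINGLE
    size = 1 + 1 + 4 + 4                    # epoch, hash_type, nVersion, nLockTime
    if not anyonecanpay:
        size += 32 * 3                      # sha_prevouts, sha_amounts, sha_sequences
    if output_mode not in (2, 3):
        size += 32                          # sha_outputs
    size += 1                               # spend_type
    size += 24 if is_p2sh_spending else 36  # scriptPubKey
    if anyonecanpay:
        size += 36 + 8 + 4                  # outpoint, amount, nSequence
    else:
        size += 2                           # input_index
    if has_annex:
        size += 32                          # sha_annex
    if output_mode == 3:
        size += 32                          # sha_single_output
    return size

# Largest case: default hash_type, native segwit output, annex present.
largest = digest_size_by_fields(0x00, is_p2sh_spending=False, has_annex=True)
```

Note that <code>SIGHASH_SINGLE</code> does not change the total: it drops <code>sha_outputs</code> but adds <code>sha_single_output</code>, which is why only <code>is_none</code> appears in the formula.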
+
+In summary, the semantics of the BIP143 sighash types remain unchanged, except the following:
+# The way and order of serialization is changed.<ref>'''Why is the serialization in the transaction digest changed?''' Hashes that go into the digest and the digest itself are now computed with a single SHA256 invocation instead of double SHA256. There is no expected security improvement by doubling SHA256 because this only protects against length-extension attacks against SHA256 which are not a concern for transaction digests because there is no secret data. Therefore doubling SHA256 is a waste of resources. The digest computation now follows a logical order with transaction level data first, then input data and output data. This allows efficiently caching the transaction part of the digest across different inputs using the SHA256 midstate. Additionally, digest computation avoids unnecessary hashing, as opposed to BIP143 digests in which parts may be set to zero before hashing them. Despite that, collisions are made impossible by committing to the length of the data (implicit in <code>hash_type</code> and <code>spend_type</code>) before the variable length data.</ref>
+# The digest commits to the <code>scriptPubKey</code><ref>'''Why does the transaction digest commit to the <code>scriptPubKey</code>?''' This prevents lying to offline signing devices about the type of output being spent, even when the actually executed script (<code>scriptCode</code> in BIP143) is correct. Without committing to the <code>scriptPubKey</code> an attacker can fool the device into overpaying fees by asking it to sign for a P2SH wrapped segwit output but actually using it to spend a native segwit output.</ref>.
+# If the <code>SIGHASH_ANYONECANPAY</code> flag is not set, the digest commits to the amounts of ''all'' transaction inputs.<ref>'''Why does the transaction digest commit to the amounts of all transaction inputs?''' This eliminates the possibility to lie to offline signing devices about the fee of a transaction.</ref>
+# The digest commits to all input <code>nSequence</code> if <code>SIGHASH_NONE</code> or <code>SIGHASH_SINGLE</code> are set (unless <code>SIGHASH_ANYONECANPAY</code> is set as well).<ref>'''Why does the transaction digest commit to all input <code>nSequence</code> if <code>SIGHASH_SINGLE</code> or <code>SIGHASH_NONE</code> are set?''' Because setting them already makes the digest commit to the <code>prevouts</code> part of all transaction inputs, it is not useful to treat the <code>nSequence</code> any different. Moreover, this change makes <code>nSequence</code> consistent with the view that <code>SIGHASH_SINGLE</code> and <code>SIGHASH_NONE</code> only modify the digest with respect to transaction outputs and not inputs.</ref>
+# The digest commits to taproot-specific data <code>epoch</code>, <code>spend_type</code> and <code>annex</code> (if present).
+
+== Constructing and spending Taproot outputs ==
+
+This section discusses how to construct and spend Taproot outputs. It only affects wallet software that chooses to implement receiving and spending,
+and is not consensus critical in any way.
+
+Conceptually, every Taproot output corresponds to a combination of a single public key condition (the internal key), and zero or more general conditions encoded in scripts organized in a tree.
+Satisfying any of these conditions is sufficient to spend the output.
+
+'''Initial steps''' The first step is determining what the internal key and the organization of the rest of the scripts should be. The specifics are likely application dependent, but here are some general guidelines:
+* When deciding between scripts with conditionals (<code>OP_IF</code> etc.) and splitting them up into multiple scripts (each corresponding to one execution path through the original script), it is generally preferable to pick the latter.
+* When a single condition requires signatures with multiple keys, key aggregation techniques like MuSig can be used to combine them into a single key. The details are out of scope for this document, but note that this may complicate the signing procedure.
+* If one or more of the spending conditions consist of just a single key (after aggregation), the most likely one should be made the internal key. If no such condition exists, it may be worthwhile adding one that consists of an aggregation of all keys participating in all scripts combined; effectively adding an "everyone agrees" branch. If that is unacceptable, pick as internal key a point with unknown discrete logarithm (TODO).
+* The remaining scripts should be organized into the leaves of a binary tree. This can be a balanced tree if each of the conditions these scripts correspond to is equally likely. If probabilities for each condition are known, consider constructing the tree as a Huffman tree.
+
+'''Computing the output script''' Once the spending conditions are split into an internal key <code>internal_pubkey</code> and a binary tree whose leaves are (leaf_version, script) tuples, the following Python3 algorithm can be used to compute the output script. In the code below, <code>ser_script</code> prefixes its input with a CCompactSize-encoded length, and public key objects have methods <code>get_bytes</code> to get their compressed encoding (see bip-schnorr) and <code>tweak_add</code> to add a multiple of the secp256k1 generator to it (similar to BIP32's derivation).
+
+<code>taproot_output_script</code> returns a byte array with the scriptPubKey. It can be P2SH wrapped if desired (see BIP141).
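
As a rough sketch of the tree hashing involved in this step, the following Python code computes the Merkle root, the per-leaf Merkle paths (as they would appear in a control block), and the tweak for a given internal key encoding. It is an assumption-laden illustration rather than the algorithm referred to above: the names <code>taproot_tree_helper</code> and <code>taproot_tweak</code>, and the convention that leaves are tuples while inner nodes are two-element lists, are choices made for this sketch. The elliptic curve step (<code>tweak_add</code>) and the scriptPubKey encoding are omitted.

```python
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    tag_hash = hashlib.sha256(tag.encode("utf-8")).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()

def ser_script(script: bytes) -> bytes:
    """Prefix the script with its compact-size encoded length."""
    n = len(script)
    if n < 0xFD:
        return bytes([n]) + script
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little") + script
    return b"\xfe" + n.to_bytes(4, "little") + script

def taproot_tree_helper(tree):
    """Return ([((leaf_version, script), merkle_path), ...], node_hash)."""
    if isinstance(tree, tuple):
        leaf_version, script = tree
        h = tagged_hash("TapLeaf", bytes([leaf_version]) + ser_script(script))
        return [((leaf_version, script), b"")], h
    left, left_h = taproot_tree_helper(tree[0])
    right, right_h = taproot_tree_helper(tree[1])
    # Children are sorted before hashing, so no left/right flag is needed.
    first, second = sorted((left_h, right_h))
    node_h = tagged_hash("TapBranch", first + second)
    # Each leaf's path gains the hash of its sibling subtree at this level.
    entries = [(leaf, path + right_h) for leaf, path in left]
    entries += [(leaf, path + left_h) for leaf, path in right]
    return entries, node_h

def taproot_tweak(internal_pubkey_bytes: bytes, tree) -> int:
    """Tweak t for a 33-byte compressed internal key and a script tree."""
    _, root = taproot_tree_helper(tree)
    return int.from_bytes(tagged_hash("TapTweak", internal_pubkey_bytes + root), "big")
```

For example, <code>[(0xc0, script_a), [(0xc0, script_b), (0xc0, script_c)]]</code> describes a tree in which <code>script_a</code> sits one level deep and the other two scripts sit two levels deep, so their Merkle paths are 32 and 64 bytes long respectively.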
+
+[[File:bip-taproot/tree.png|frame|This diagram shows the hashing structure to obtain the tweak from an internal key ''P'' and a Merkle tree consisting of 3 script leaves.]]
+
+'''Spending using the internal key''' A Taproot output can be spent with the private key corresponding to the <code>internal_pubkey</code>. To do so, the witness stack must consist of a single element: a bip-schnorr signature on the signature hash as defined above, made with the private key tweaked by the same <code>t</code> as in the above snippet. See the code below:
+
+This assumes a <code>tweak_add</code> method on private keys, and a <code>sighash</code> function to compute the signature hash as defined above (for simplicity, the snippet above ignores passing information like the transaction, the input position, P2SH or not, ... to the sighashing code).
+
+'''Spending using one of the scripts''' A Taproot output can be spent by satisfying any of the scripts used in its construction. To do so, a witness stack consisting of the script's inputs, plus the script itself and the control block, is necessary. See the code below:
+
+
+  BIP: bip-tapscript
+  Layer: Consensus (soft fork)
+  Title: Validation of Taproot Scripts
+  Author: Pieter Wuille
+  Comments-Summary: No comments yet.
+  Comments-URI:
+  Status: Draft
+  Type: Standards Track
+  Created:
+  License: BSD-3-Clause
+
+==Introduction==
+
+===Abstract===
+
+This document specifies the semantics of the initial scripting system under bip-taproot.
+
+===Copyright===
+
+This document is licensed under the 3-clause BSD license.
+
+===Motivation===
+
+Bip-taproot proposes improvements to just the script structure, but some of its goals are incompatible with the semantics of certain opcodes within the scripting language itself.
+While it is possible to deal with these in separate optional improvements, their impact is not guaranteed unless they are addressed simultaneously with bip-taproot itself.
+
+Specifically, the goal is making '''Schnorr signatures''', '''batch validation''', and '''signature hash''' improvements available to spends that use the script system as well.
+
+==Design==
+
+In order to achieve these goals, signature opcodes <code>OP_CHECKSIG</code> and <code>OP_CHECKSIGVERIFY</code> are modified to verify Schnorr signatures as specified in bip-schnorr and to use a new transaction digest based on the taproot transaction digest.
+The tapscript transaction digest also simplifies <code>OP_CODESEPARATOR</code> handling and makes it more efficient.
+
+The inefficient <code>OP_CHECKMULTISIG</code> and <code>OP_CHECKMULTISIGVERIFY</code> opcodes are disabled.
+Instead, a new opcode <code>OP_CHECKSIGADD</code> is introduced to allow creating the same multisignature policies in a batch-verifiable way.
+Tapscript uses a new, simpler signature opcode limit fixing complicated interactions with transaction weight.
+Furthermore, a potential malleability vector is eliminated by requiring MINIMALIF.
+
+Tapscript can be upgraded through soft forks by defining unknown key types, for example to add new <code>hash_types</code> or signature algorithms.
+Additionally, the new tapscript <code>OP_SUCCESS</code> opcodes allow introducing new opcodes more cleanly than through <code>OP_NOP</code>.
+
+==Specification==
+
+The rules below only apply when validating a transaction input for which all of the conditions below are true:
+* The transaction output is a '''segregated witness spend''' (i.e., either the scriptPubKey or BIP16 redeemScript is a witness program as defined in BIP141).
+* It is a '''taproot spend''' as defined in bip-taproot (i.e., the witness version is 1, the witness program is 33 bytes, and the first of those is 0x00 or 0x01).
+* It is a '''script path spend''' as defined in bip-taproot (i.e., after removing the optional annex from the witness stack, two or more stack elements remain).
+* The leaf version is ''0xc0'' (i.e. the first byte of the last witness element after removing the optional annex is ''0xc0'' or ''0xc1'')<ref>'''How is the ''0xc0'' constant chosen?''' Following the guidelines in bip-taproot, by choosing a value having the two top bits set, tapscript spends are identifiable even without access to the UTXO being spent.</ref>, marking it as a '''tapscript spend'''.
+
+Validation of such inputs must be equivalent to performing the following steps in the specified order.
+# If the input is invalid due to BIP16, BIP141, or bip-taproot, fail.
+# The script as defined in bip-taproot (i.e., the penultimate witness stack element after removing the optional annex) is called the '''tapscript''' and is decoded into opcodes, one by one:
+## If any opcode numbered ''80, 98, 126-129, 131-134, 137-138, 141-142, 149-153, 187-254'' is encountered, validation succeeds (none of the rules below apply). This is true even if later bytes in the tapscript would fail to decode otherwise. These opcodes are renamed to <code>OP_SUCCESS80</code>, ..., <code>OP_SUCCESS254</code>, and collectively known as <code>OP_SUCCESSx</code><ref>'''<code>OP_SUCCESSx</code>''' <code>OP_SUCCESSx</code> is a mechanism to upgrade the Script system. Using an <code>OP_SUCCESSx</code> before its meaning is defined by a softfork is insecure and leads to fund loss. The inclusion of <code>OP_SUCCESSx</code> in a script will pass it unconditionally. It precedes any script execution rules to avoid the difficulties in specifying various edge cases, for example: <code>OP_SUCCESSx</code> being the 202nd opcode, <code>OP_SUCCESSx</code> after too many signature opcodes, or even scripts with conditionals lacking <code>OP_ENDIF</code>. The mere existence of an <code>OP_SUCCESSx</code> anywhere in the script will guarantee a pass for all such cases. <code>OP_SUCCESSx</code> are similar to the <code>OP_RETURN</code> in very early bitcoin versions (v0.1 up to and including v0.3.5). The original <code>OP_RETURN</code> terminates script execution immediately, and returns pass or fail based on the top stack element at the moment of termination. This was one of the major design flaws in the original bitcoin protocol as it permitted unconditional third party theft by placing an <code>OP_RETURN</code> in <code>scriptSig</code>. This is not a concern in the present proposal since it is not possible for a third party to inject an <code>OP_SUCCESSx</code> to the validation process, as the <code>OP_SUCCESSx</code> is part of the script (and thus committed to by the taproot output), implying the consent of the coin owner. <code>OP_SUCCESSx</code> can be used for a variety of upgrade possibilities:
+* An <code>OP_SUCCESSx</code> could be turned into a functional opcode through a softfork. Unlike <code>OP_NOPx</code>-derived opcodes which only have read-only access to the stack, <code>OP_SUCCESSx</code> may also write to the stack. Any rule changes to an <code>OP_SUCCESSx</code>-containing script may only turn a valid script into an invalid one, and this is always achievable with softforks.
+* Since <code>OP_SUCCESSx</code> precedes the size check of initial stack and push opcodes, an <code>OP_SUCCESSx</code>-derived opcode requiring stack elements bigger than 520 bytes may lift the limit in a softfork.
+* <code>OP_SUCCESSx</code> may also redefine the behavior of existing opcodes so they could work together with the new opcode. For example, if an <code>OP_SUCCESSx</code>-derived opcode works with 64-bit integers, it may also allow the existing arithmetic opcodes in the ''same script'' to do the same.
+* Given that <code>OP_SUCCESSx</code> even causes potentially unparseable scripts to pass, it can be used to introduce multi-byte opcodes, or even a completely new scripting language when prefixed with a specific <code>OP_SUCCESSx</code> opcode.</ref>.
+## If any push opcode fails to decode because it would extend past the end of the tapscript, fail.
+# If the size of any element in the '''initial stack''' as defined in bip-taproot (i.e., the witness stack after removing both the optional annex and the two last stack elements after that) is bigger than 520 bytes, fail.
+# If the tapscript is bigger than 10000 bytes, fail.
+# The tapscript is executed according to the rules in the following section, with the initial stack as input.
+## If execution fails for any reason (including the 201 non-push opcode limit), fail.
+## If the execution results in anything but exactly one element on the stack which evaluates to true with <code>CastToBool()</code>, fail.
+# If this step is reached without encountering a failure, validation succeeds.
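
The opcode decoding pass in step 2 can be sketched as a single scan over the tapscript bytes. The following Python code is a hedged illustration, not a reference implementation: the function names and the three string return values are assumptions of this sketch, and the push opcode encodings (direct pushes <code>0x01</code>-<code>0x4b</code>, then <code>OP_PUSHDATA1</code>, <code>OP_PUSHDATA2</code>, <code>OP_PUSHDATA4</code> with little-endian lengths) are those of the existing script system.

```python
def is_op_success(op: int) -> bool:
    """Opcode numbers renamed to OP_SUCCESSx, as listed in step 2."""
    return (op in (80, 98) or 126 <= op <= 129 or 131 <= op <= 134
            or 137 <= op <= 138 or 141 <= op <= 142 or 149 <= op <= 153
            or 187 <= op <= 254)

def prescan_tapscript(script: bytes) -> str:
    """Return "success", "fail", or "continue" (proceed to the later steps)."""
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if is_op_success(op):
            # Succeeds even if later bytes would fail to decode.
            return "success"
        if 0x01 <= op <= 0x4E:
            if op <= 0x4B:
                n = op                              # direct push of op bytes
            else:
                width = {0x4C: 1, 0x4D: 2, 0x4E: 4}[op]
                if i + width > len(script):
                    return "fail"                   # truncated length field
                n = int.from_bytes(script[i:i + width], "little")
                i += width
            if i + n > len(script):
                return "fail"                       # push runs past the end
            i += n                                  # pushed data is skipped
    return "continue"
```

Because pushed data is skipped rather than decoded, a byte such as <code>0x50</code> inside a push does not trigger the success rule; only a byte in opcode position does.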
+
+===Script execution===
+
+The execution rules for tapscript are based on those for P2WSH according to BIP141, including the OP_CHECKLOCKTIMEVERIFY
and OP_CHECKSEQUENCEVERIFY
opcodes defined in BIP65 and BIP112, but with the following modifications:
+* '''Disabled script opcodes''' The following script opcodes are disabled in tapscript: OP_CHECKMULTISIG
and OP_CHECKMULTISIGVERIFY
. The disabled opcodes behave in the same way as OP_RETURN
, by failing and terminating the script immediately when executed, and being ignored when found in unexecuted branch. While being ignored, they are still counted towards the 201 non-push opcodes limit.
+* '''Consensus-enforced MINIMALIF''' The MINIMALIF rules, which are only a standardness rule in P2WSH, are consensus enforced in tapscript. This means that the input argument to the OP_IF and OP_NOTIF opcodes must be either exactly 0 (the empty vector) or exactly 1 (the one-byte vector with value 1). '''Why make MINIMALIF consensus?''' This makes it considerably easier to write non-malleable scripts that take branch information from the stack.
+* '''OP_SUCCESSx opcodes''' As listed above, some opcodes are renamed to OP_SUCCESSx, and make the script unconditionally valid.
+* '''Signature opcodes'''. The OP_CHECKSIG and OP_CHECKSIGVERIFY opcodes are modified to operate on Schnorr signatures (see bip-schnorr) instead of ECDSA, and a new opcode OP_CHECKSIGADD is added.
+** The opcode 186 (0xba) is named OP_CHECKSIGADD. '''OP_CHECKSIGADD''' This opcode is added to compensate for the loss of OP_CHECKMULTISIG-like opcodes, which are incompatible with batch verification. OP_CHECKSIGADD is functionally equivalent to OP_ROT OP_SWAP OP_CHECKSIG OP_ADD, but is only counted as one opcode towards the 201 non-push opcodes limit. All CScriptNum-related behaviours of OP_ADD are also applicable to OP_CHECKSIGADD. '''Comparison of CHECKMULTISIG and CHECKSIG''' A CHECKMULTISIG script m pubkey_1 ... pubkey_n n CHECKMULTISIG with witness 0 signature_1 ... signature_m can be rewritten as script pubkey_1 CHECKSIG ... pubkey_n CHECKSIGADD m NUMEQUAL with witness w_n ... w_1. Every witness element w_i is either a signature corresponding to the public key with the same index or an empty vector. A similar CHECKMULTISIGVERIFY script can be translated to bip-tapscript by replacing NUMEQUAL with NUMEQUALVERIFY. Alternatively, an m-of-n multisig policy can be implemented by splitting the script into several leaves of the Merkle tree, each implementing an m-of-m policy using CHECKSIGVERIFY ... CHECKSIGVERIFY CHECKSIG. If the setting allows the participants to interactively collaborate while signing, multisig policies can be realized with [https://eprint.iacr.org/2018/068 MuSig] for m-of-m and with [http://cacr.uwaterloo.ca/techreports/2001/corr2001-13.ps threshold signatures] using verifiable secret sharing for m-of-n.
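The CHECKSIG/CHECKSIGADD multisig pattern described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not consensus code: <code>checksigadd_multisig</code> and the <code>verify</code> callback are hypothetical names, the running counter stands in for the CScriptNum kept on the stack, and the last witness element is assumed to end up on top of the stack so that it is paired with the first public key in the script.

```python
from typing import Callable, List


def checksigadd_multisig(
    witness: List[bytes],
    pubkeys: List[bytes],
    m: int,
    verify: Callable[[bytes, bytes], bool],
) -> bool:
    """Simulate: pubkey_1 CHECKSIG pubkey_2 CHECKSIGADD ... pubkey_n CHECKSIGADD m NUMEQUAL."""
    if len(witness) != len(pubkeys):
        return False  # one witness element (signature or empty vector) per key
    stack = list(witness)  # the last witness element is the top of the stack
    count = 0
    for pubkey in pubkeys:
        signature = stack.pop()
        if signature == b"":
            continue  # empty vector: the counter is left unchanged
        if not verify(signature, pubkey):
            return False  # a non-empty invalid signature fails the script
        count += 1
    return count == m  # NUMEQUAL against the threshold m
```

Note that the final comparison is an equality: supplying more than m valid signatures makes this script fail, so signers provide exactly m signatures and empty vectors for the remaining keys.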
+
+===Rules for signature opcodes===
+
+The following rules apply to OP_CHECKSIG, OP_CHECKSIGVERIFY, and OP_CHECKSIGADD.
+
+* For OP_CHECKSIGVERIFY and OP_CHECKSIG, the public key (top element) and a signature (second to top element) are popped from the stack.
+** If fewer than 2 elements are on the stack, the script MUST fail and terminate immediately.
+* For OP_CHECKSIGADD, the public key (top element), a CScriptNum n (second to top element), and a signature (third to top element) are popped from the stack.
+** If fewer than 3 elements are on the stack, the script MUST fail and terminate immediately.
+** If n is larger than 4 bytes, the script MUST fail and terminate immediately.
+* If the public key size is zero, the script MUST fail and terminate immediately.
+* If the first byte of the public key is 0x04, 0x06, or 0x07, the script MUST fail and terminate immediately regardless of the public key size.
+* If the first byte of the public key is 0x02 or 0x03, it is considered to be a public key as described in bip-schnorr:
+** If the public key is not 33 bytes, the script MUST fail and terminate immediately.
+** If the signature is not the empty vector, the signature is validated according to the bip-taproot signature validation rules against the public key, with the tapscript transaction digest (defined below) as the message. Validation failure MUST cause the script to fail and terminate immediately.
+* If the first byte of the public key is not 0x02, 0x03, 0x04, 0x06, or 0x07, the public key is of an ''unknown public key type'' and no actual signature verification is applied. During script execution of signature opcodes they behave exactly as known public key types except that signature validation is considered to be successful. '''Unknown public key types''' allow adding new signature validation rules through softforks. A softfork could add actual signature validation which either passes or makes the script fail and terminate immediately. This way, new SIGHASH modes can be added, as well as [https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-December/016549.html NOINPUT-tagged public keys] and a public key constant which is replaced by the taproot internal key for signature validation.
+* If the script did not fail and terminate before this step, regardless of the public key type:
+** If the signature is the empty vector:
+*** For OP_CHECKSIGVERIFY, the script MUST fail and terminate immediately.
+*** For OP_CHECKSIG, an empty vector is pushed onto the stack, and execution continues with the next opcode.
+*** For OP_CHECKSIGADD, a CScriptNum with value n is pushed onto the stack, and execution continues with the next opcode.
+** If the signature is not the empty vector, the sigops_passed counter is incremented (see further).
+*** For OP_CHECKSIGVERIFY, execution continues without any further changes to the stack.
+*** For OP_CHECKSIG, a 1-byte value 0x01 is pushed onto the stack.
+*** For OP_CHECKSIGADD, a CScriptNum with value of n + 1 is pushed onto the stack.
+
+These opcodes count toward the 201 non-push opcodes limit.
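The rules above can be condensed into a single dispatch routine. The following Python sketch is illustrative only: <code>exec_sigop</code>, <code>ScriptFailure</code>, the <code>counters</code> dictionary, and the <code>verify</code> callback (standing in for Schnorr verification against the transaction digest) are hypothetical names, and the CScriptNum helpers are simplified (for instance, they do not enforce minimal encoding).

```python
from typing import Callable, Dict, List


class ScriptFailure(Exception):
    """The script must fail and terminate immediately."""


def decode_num(data: bytes) -> int:
    """Decode a little-endian sign-magnitude number of at most 4 bytes."""
    if len(data) > 4:
        raise ScriptFailure("n is larger than 4 bytes")
    if not data:
        return 0
    value = int.from_bytes(data, "little")
    if data[-1] & 0x80:  # sign bit set: clear it and negate
        return -(value & ~(0x80 << (8 * (len(data) - 1))))
    return value


def encode_num(value: int) -> bytes:
    """Encode an integer as a little-endian sign-magnitude byte vector."""
    if value == 0:
        return b""
    negative, magnitude, out = value < 0, abs(value), bytearray()
    while magnitude:
        out.append(magnitude & 0xFF)
        magnitude >>= 8
    if out[-1] & 0x80:
        out.append(0x80 if negative else 0x00)
    elif negative:
        out[-1] |= 0x80
    return bytes(out)


def exec_sigop(opcode: str, stack: List[bytes], counters: Dict[str, int],
               verify: Callable[[bytes, bytes], bool]) -> None:
    """Apply the signature opcode rules to the stack in place."""
    is_add = opcode == "OP_CHECKSIGADD"
    if len(stack) < (3 if is_add else 2):
        raise ScriptFailure("not enough stack elements")
    pubkey = stack.pop()
    n = decode_num(stack.pop()) if is_add else 0
    signature = stack.pop()
    if len(pubkey) == 0 or pubkey[0] in (0x04, 0x06, 0x07):
        raise ScriptFailure("invalid public key")
    if pubkey[0] in (0x02, 0x03):
        if len(pubkey) != 33:
            raise ScriptFailure("public key is not 33 bytes")
        if signature and not verify(signature, pubkey):
            raise ScriptFailure("signature validation failed")
    # Any other first byte is an unknown public key type: validation is
    # considered successful without checking the signature.
    if not signature:
        if opcode == "OP_CHECKSIGVERIFY":
            raise ScriptFailure("empty signature")
        stack.append(encode_num(n) if is_add else b"")
        return
    counters["sigops_passed"] += 1
    if opcode == "OP_CHECKSIG":
        stack.append(b"\x01")
    elif is_add:
        stack.append(encode_num(n + 1))
```

An empty signature never reaches the verifier and never increments <code>sigops_passed</code>, which is what lets a CHECKSIGADD chain skip unused keys cheaply.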
+
+===Transaction digest===
+
+The transaction digest, which serves as the message for signature verification in the signature opcodes, has the same definition as in bip-taproot, except for the following:
+
+The one-byte spend_type has a different value, specifically at bit-2:
+* Bit-0 is set if the scriptPubKey being spent is P2SH (as opposed to "native segwit").
+* Bit-1 is set if an annex is present (the original witness stack has at least two witness elements, and the first byte of the last element is 0x50).
+* Bit-2 is set.
+* The other bits are unset.
+
+As additional pieces of data, added at the end of the input to the ''hashTapSighash'' function:
+* tapleaf_hash (32): the tapleaf hash as defined in bip-taproot
+* key_version (1): a constant value 0x02 representing the current version of public keys in the tapscript signature opcode execution.
+* codeseparator_position (2): the opcode position of the last executed OP_CODESEPARATOR before the currently executed signature opcode, with the value in little endian (or 0xffff if none executed). The first opcode in a script has a position of 0. A multi-byte push opcode is counted as one opcode, regardless of the size of data being pushed.
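The tapscript-specific parts of the digest input can be sketched as follows. This is an illustrative Python fragment, not the full ''hashTapSighash'' computation: the function names are hypothetical, and only the <code>spend_type</code> byte and the 35 bytes appended at the end of the hash input are modelled.

```python
KEY_VERSION = 0x02          # constant from this draft
NO_CODESEPARATOR = 0xFFFF   # value used when no OP_CODESEPARATOR was executed


def tapscript_spend_type(is_p2sh: bool, has_annex: bool) -> int:
    """Build the one-byte spend_type for a tapscript spend."""
    spend_type = 0x04  # bit-2 is always set for tapscript
    if is_p2sh:
        spend_type |= 0x01  # bit-0: scriptPubKey being spent is P2SH
    if has_annex:
        spend_type |= 0x02  # bit-1: an annex is present
    return spend_type


def tapscript_digest_suffix(tapleaf_hash: bytes,
                            codeseparator_position: int = NO_CODESEPARATOR) -> bytes:
    """Return tapleaf_hash (32) || key_version (1) || codeseparator_position (2, LE)."""
    if len(tapleaf_hash) != 32:
        raise ValueError("tapleaf_hash must be 32 bytes")
    if not 0 <= codeseparator_position <= 0xFFFF:
        raise ValueError("codeseparator_position must fit in 2 bytes")
    return (tapleaf_hash
            + bytes([KEY_VERSION])
            + codeseparator_position.to_bytes(2, "little"))
```

Because the position field comes last, a verifier handling several OP_CODESEPARATORs in one script only has to rehash the final two bytes on top of a cached midstate.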
+
+The total number of bytes hashed is at most ''244''. '''What is the number of bytes hashed for the signature hash?''' The total size of the input to ''hashTapSighash'' (excluding the initial 64-byte hash tag) can be computed using the following formula: ''212 - is_anyonecanpay * 50 - is_none * 32 - is_p2sh_spending * 12 + has_annex * 32''.
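As a quick check of the size formula, the following Python sketch evaluates it over every flag combination. The function name is hypothetical; the constants are taken directly from the formula above.

```python
from itertools import product


def tapscript_sighash_input_size(is_anyonecanpay: bool, is_none: bool,
                                 is_p2sh_spending: bool, has_annex: bool) -> int:
    """Size in bytes of the hashTapSighash input, excluding the 64-byte hash tag."""
    return (212
            - is_anyonecanpay * 50
            - is_none * 32
            - is_p2sh_spending * 12
            + has_annex * 32)


# Enumerate all 16 flag combinations to find the largest possible input.
largest = max(tapscript_sighash_input_size(*flags)
              for flags in product([False, True], repeat=4))
```

The maximum is reached when only the annex flag is set, giving 212 + 32 = 244 bytes.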
+
+In summary, the semantics of the BIP143 sighash types remain unchanged, except the following:
+# The exceptions mentioned in bip-taproot.
+# The digest commits to taproot-specific data key_version. '''Why does the transaction digest commit to the key_version?''' This is for future extensions that define unknown public key types, making sure signatures can't be moved from one key type to another. This value is intended to be set equal to the first byte of the public key, after masking out flags like the oddness of the Y coordinate.
+# The digest commits to the executed script through the tapleaf_hash which includes the leaf version and script instead of scriptCode. This implies that this commitment is unaffected by OP_CODESEPARATOR.
+# The digest commits to the opcode position of the last executed OP_CODESEPARATOR. '''Why does the transaction digest commit to the position of the last executed OP_CODESEPARATOR?''' This allows continuing to use OP_CODESEPARATOR to sign the executed path of the script. Because the codeseparator_position is the last input to the digest, the SHA256 midstate can be efficiently cached for multiple OP_CODESEPARATORs in a single script. In contrast, the BIP143 handling of OP_CODESEPARATOR is to commit to the executed script only from the last executed OP_CODESEPARATOR onwards, which requires unnecessary rehashing of the script. It should be noted that the one known OP_CODESEPARATOR use case, saving a second public key push in a script by sharing the first one between two code branches, can most likely be expressed even more cheaply by moving each branch into a separate taproot leaf.
+
+===Signature opcodes limitation===
+
+In addition to the 201 non-push opcodes limit, the use of signature opcodes is subject to further limitations.
+
+* input_witness_weight is defined as the size of the serialized input witness associated to a particular transaction input. As defined in BIP141, a serialized input witness includes CCompactSize tags indicating the number of elements and size of each element, and contents of each element. input_witness_weight is the total size of the said CCompactSize tags and element contents.
+* sigops_passed is defined as the total number of successfully executed signature opcodes, which have non-zero signature size and do not fail and terminate the script. For the avoidance of doubt, passing signature opcodes with unknown type public key and non-zero size signature are also counted towards sigops_passed.
+* If 50 * (sigops_passed - 1) is greater than input_witness_weight, the script MUST fail and terminate immediately.
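The limit can be sketched in Python as follows. The helper names are hypothetical, and the compact-size lengths assume the standard Bitcoin variable-length integer encoding (1, 3, 5, or 9 bytes).

```python
from typing import List


def compact_size_len(value: int) -> int:
    """Length in bytes of the compact-size encoding of a non-negative integer."""
    if value < 253:
        return 1
    if value <= 0xFFFF:
        return 3
    if value <= 0xFFFFFFFF:
        return 5
    return 9


def input_witness_weight(witness: List[bytes]) -> int:
    """Size of the serialized input witness: element count tag plus tagged elements."""
    total = compact_size_len(len(witness))
    for element in witness:
        total += compact_size_len(len(element)) + len(element)
    return total


def exceeds_sigop_limit(sigops_passed: int, witness_weight: int) -> bool:
    """True when the script must fail because too many signature opcodes passed."""
    return 50 * (sigops_passed - 1) > witness_weight
```

For example, under these assumptions a witness holding a single 64-byte signature serializes to 1 + 1 + 64 = 66 bytes, which is enough budget for two passing signature opcodes but not three.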
+
+This rule limits worst-case validation costs in tapscript, similar to the ''sigops limit'' that only applies to legacy and P2WSH scripts. '''The tapscript sigop limit''' The signature opcode limit protects against scripts which are slow to verify due to excessively many signature operations. In tapscript the number of signature opcodes does not count towards the BIP141 or legacy sigop limit. The old sigop limit makes transaction selection in block construction unnecessarily difficult because it is a second constraint in addition to weight. Instead, the number of tapscript signature opcodes is limited by witness weight. Additionally, the limit applies to the transaction input instead of the block and only actually executed signature opcodes are counted. Tapscript execution allows one signature opcode per 50 witness weight units plus one free signature opcode. The tapscript signature opcode limit allows new signature opcodes like CHECKSIGFROMSTACK to be counted towards the limit through a soft fork. Even if new opcodes are introduced in the future which change normal script cost, there is no need to stuff the witness with meaningless data. In that case the taproot annex can be used to add weight to the witness without increasing the actual witness size.
+'''Parameter choice of the sigop limit''' Regular witnesses are unaffected by the limit as their weight is composed of public key and (SIGHASH_ALL) signature pairs with ''34 + 65'' weight units each (which includes a 1 weight unit CCompactSize tag). This is also the case if public keys are reused in the script because a signature's weight alone is 65 or 66 weight units. However, the limit increases the fees of abnormal scripts with duplicate signatures (and public keys) by requiring additional weight. The weight per sigop factor 50 corresponds to the ratio of BIP141 block limits: 4 mega weight units divided by 80,000 sigops. The "free" signature opcode permitted by the limit exists to account for the weight of the non-witness parts of the transaction input.
+
+==Rationale==
+
+