This converts the executeOpcode function defined on the engine to accept
an opcode and data slice instead of a parsed opcode as a step towards
removing the parsed opcode struct and associated supporting code altogether.
It also updates all callers accordingly.
This refactors the script engine to store and step through raw scripts
by making using of the new zero-allocation script tokenizer as opposed
to the less efficient method of storing and stepping through parsed
opcodes. It also improves several aspects while refactoring such as
optimizing the disassembly trace, showing all scripts in the trace in
the case of execution failure, and providing additional comments
describing the purpose of each field in the engine.
It should be noted that this is a step towards removing the parsed
opcode struct and associated supporting code altogether, however, in
order to ease the review process, this retains the struct and all
function signatures for opcode execution which make use of an individual
parsed opcode. Those will be updated in future commits.
The following is an overview of the changes:
- Modify internal engine scripts slice to use raw scripts instead of
parsed opcodes
- Introduce a tokenizer to the engine to track the current script
- Remove no longer needed script offset parameter from the engine since
that is tracked by the tokenizer
- Add an opcode index counter for disassembly purposes to the engine
- Update check for valid program counter to only consider the script
index
- Update tests for bad program counter accordingly
- Rework the NewEngine function
- Store the raw scripts
- Setup the initial tokenizer
- Explicitly check against version 0 instead of DefaultScriptVersion
which would break consensus if changed
- Check the scripts parse according to version 0 semantics to retain
current consensus rules
- Improve comments throughout
- Rework the Step function
- Use the tokenizer and raw scripts
- Create a parsed opcode on the fly for now to retain existing
opcode execution function signatures
- Improve comments throughout
- Update the Execute function
- Explicitly check against version 0 instead of DefaultScriptVersion
which would break consensus if changed
- Improve the disassembly tracing in the case of error
- Update the CheckErrorCondition function
- Modify clean stack error message to make sense in all cases
- Improve the comments
- Update the DisasmPC and DisasmScript functions on the engine
- Use the tokenizer
- Optimize construction via the use of strings.Builder
- Modify the subScript function to return the raw script bytes since the
parsed opcodes are no longer stored
- Update the various signature checking opcodes to use the raw opcode
data removal and signature hash calculation functions since the
subscript is now a raw script
- opcodeCheckSig
- opcodeCheckMultiSig
- opcodeCheckSigAlt
This converts the engine's current program counter disasembly to make
use of the standalone disassembly function to remove the dependency on
the parsed opcode struct.
It also updates the tests accordingly.
This converts the checkMinimalDataPush function defined on a parsed
opcode to a standalone function which accepts an opcode and data slice
instead in order to make it more flexible for raw script analysis.
It also updates all callers accordingly.
This converts the isConditional function defined on a parsed opcode to a
standalone function named isOpcodeConditional which accepts an opcode as
a byte instead in order to make it more flexible for raw script
analysis.
It also updates all callers accordingly.
This converts the alwaysIllegal function defined on a parsed opcode to a
standalone function named isOpcodeAlwaysIllegal which accepts an opcode
as a byte instead in order to make it more flexible for raw script
analysis.
It also updates all callers accordingly.
This converts the isDisabled function defined on a parsed opcode to a
standalone function which accepts an opcode as a byte instead in order
to make it more flexible for raw script analysis.
It also updates all callers accordingly.
This renames the canonicalPush function to isCanonicalPush and converts
it to accept an opcode as a byte and the associate data as a byte slice
instead of the internal parse opcode data struct in order to make it
more flexible for raw script analysis.
It also updates all callers and tests accordingly.
This moves the check for non push-only pay-to-script-hash signature
scripts before the script parsing logic when creating a new engine
instance to avoid the extra overhead in the error case.
This modifies the check for whether or not a pay-to-script-hash
signature script is a push only script to make use of the new and more
efficient raw script function.
Also, since the script will have already been checked further above when
the ScriptVerifySigPushOnly flags is set, avoid checking it again in
that case.
Backport of af67951b9a66df3aac1bf3d6376af0730287bbf2
This converts the IsMultisigScript function to make use of the new
tokenizer instead of the far less efficient parseScript thereby
significantly optimizing the function.
In order to accomplish this, it introduces two new functions. The first
one is named extractMultisigScriptDetails and works with the raw script
bytes to simultaneously determine if the script is a multisignature
script, and in the case it is, extract and return the relevant details.
The second new function is named isMultisigScript and is defined in
terms of the former.
The extract function accepts the script version, raw script bytes, and a
flag to determine whether or not the public keys should also be
extracted. The flag is provided because extracting pubkeys results in
an allocation that the caller might wish to avoid.
The extract function approach was chosen because it is common for
callers to want to only extract relevant details from a script if the
script is of the specific type. Extracting those details requires
performing the exact same checks to ensure the script is of the correct
type, so it is more efficient to combine the two into one and define the
type determination in terms of the result so long as the extraction does
not require allocations.
It is important to note that this new implementation intentionally has a
semantic difference from the existing implementation in that it will now
correctly identify a multisig script with zero pubkeys whereas
previously it incorrectly required at least one pubkey. This change is
acceptable because the function only deals with standardness rather than
consensus rules.
Finally, this also deprecates the isMultiSig function that requires
opcodes in favor of the new functions and deprecates the error return on
the export IsMultisigScript function since it really does not make sense
given the purpose of the function.
The following is a before and after comparison of analyzing both a large
script that is not a multisig script and a 1-of-2 multisig public key
script:
benchmark old ns/op new ns/op delta
BenchmarkIsMultisigScriptLarge-8 64166 5.52 -99.99%
BenchmarkIsMultisigScript-8 630 59.4 -90.57%
benchmark old allocs new allocs delta
BenchmarkIsMultisigScriptLarge-8 1 0 -100.00%
BenchmarkIsMultisigScript-8 1 0 -100.00%
benchmark old bytes new bytes delta
BenchmarkIsMultisigScriptLarge-8 311299 0 -100.00%
BenchmarkIsMultisigScript-8 2304 0 -100.00%
This cleans up the code for handling the checksig and checkmultisig
opcode strict signatures to explicitly call out any semantics that are
likely not obvious and improve readability.
It also introduce new distinct errors for each condition which can
result in a signature being rejected due to not following the strict
encoding requirements and updates reference test adaptor accordingly.
This commit adds verification of the post-segwit standardness
requirement that all pubkeys involved in checks operations MUST be
serialized as compressed public keys. A new ScriptFlag has been added
to guard this behavior when executing scripts.
This commit modifies the op-code execution for OP_IF and OP_NOTIF to
enforce the additional “minimal if” constraints which require the
top-stack item when the op codes are encountered to be either an empty
vector, or exactly [0x01].
This commit implements full witness program validation for the
currently defined version 0 witness programs. This includes validation
logic for nested p2sh, p2wsh, and p2wkh. Additionally, when in witness
validation mode, an additional set of constrains are enforced such as
using the new sighash digest algorithm and enforcing clean stack
behavior within witness programs.
This commit fixes an off-by-one error which is only manifested by the
new behavior of OP_CODESEPARATOR within sig hashes triggered by the
segwit behavior. The current behavior within the Script VM
(txscript.Engine) is known to be fully correct to the extent that it has
been verified. However, once segwit activates a consensus divergence
would emerge due to *when* the program counter was incremented in the
previous code (pre-this-commit).
Currently (pre-segwit) when calculating the pre-image to a transaction
sighash for signature verification, *all* instances of OP_CODESEPARATOR
are removed from the subScript being signed before generating the final
sighash. SegWit has additional nerfed the behavior of OP_CODESEPARATOR
by no longer removing them (and starting after the last instance), but
instead simply starting the subScript to be directly *after* the last
instance of an OP_CODESEPARATOR within the pkScript.
Due to this new behavior, without this commit, an off-by-one error
(which only matters post-segwit), would cause txscript to generate an
incorrect subScript since the instance of OP_CODESEPARATOR would remain
as part of the subScript instead of being sliced off as the new behavior
dictates. The off-by-one error itself is manifested due to a slight
divergence in txscript.Engine’s logic compared to Bitcoin Core. In
Bitcoin Core script verification is as follows: first the next op-code
is fetched, then program counter is incremented, and finally the op-code
itself is executed. Before this commit, btcd flipped the order
of the last two steps, executing the op-code *before* the program
counter was incremented.
This commit fixes the post-segwit consensus divergence by incrementing
the program-counter *before* the next op-code is executed. It is
important to note that this divergence is only significant post-segwit,
meaning that txscript.Engine is still consensus compliant independent of
this commit.
This commit introduces a series of internal and external helper
functions which enable the txscript package to be aware of the new
standard script templates introduced as part of BIP0141. The two new
standard script templates recognized are pay-to-witness-key-hash
(P2WKH) and pay-to-witness-script-hash (P2WSH).
This commit implements most of BIP0143 by adding logic to implement the
new sighash calculation, signing, and additionally introduces the
HashCache optimization which eliminates the O(N^2) computational
complexity for the SIGHASH_ALL sighash type.
The HashCache struct is the equivalent to the existing SigCache struct,
but for caching the reusable midstate for transactions which are
spending segwitty outputs.
ScriptVerifyNullFail defines that signatures must be empty if a
CHECKSIG or CHECKMULTISIG operation fails.
This commit also enables ScriptVerifyNullFail at the mempool policy
level.
This converts the majority of script errors from generic errors created
via errors.New and fmt.Errorf to use a concrete type that implements the
error interface with an error code and description.
This allows callers to programmatically detect the type of error via
type assertions and an error code while still allowing the errors to
provide more context.
For example, instead of just having an error the reads "disabled opcode"
as would happen prior to these changes when a disabled opcode is
encountered, the error will now read "attempt to execute disabled opcode
OP_FOO".
While it was previously possible to programmatically detect many errors
due to them being exported, they provided no additional context and
there were also various instances that were just returning errors
created on the spot which callers could not reliably detect without
resorting to looking at the actual error message, which is nearly always
bad practice.
Also, while here, export the MaxStackSize and MaxScriptSize constants
since they can be useful for consumers of the package and perform some
minor cleanup of some of the tests.
See https://github.com/bitcoin/bips/blob/master/bip-0065.mediawiki for
more information.
This commit mimics Bitcoin Core commit bc60b2b4b401f0adff5b8b9678903ff8feb5867b
and includes additional tests from Bitcoin Core commit
cb54d17355864fa08826d6511a0d7692b21ef2c9
Introduce an ECDSA signature verification into btcd in order to
mitigate a certain DoS attack and as a performance optimization.
The benefits of SigCache are two fold. Firstly, usage of SigCache
mitigates a DoS attack wherein an attacker causes a victim's client to
hang due to worst-case behavior triggered while processing attacker
crafted invalid transactions. A detailed description of the mitigated
DoS attack can be found here: https://bitslog.wordpress.com/2013/01/23/fixed-bitcoin-vulnerability-explanation-why-the-signature-cache-is-a-dos-protection/
Secondly, usage of the SigCache introduces a signature verification
optimization which speeds up the validation of transactions within a
block, if they've already been seen and verified within the mempool.
The server itself manages the sigCache instance. The blockManager and
txMempool respectively now receive pointers to the created sigCache
instance. All read (sig triplet existence) operations on the sigCache
will not block unless a separate goroutine is adding an entry (writing)
to the sigCache. GetBlockTemplate generation now also utilizes the
sigCache in order to avoid unnecessarily double checking signatures
when generating a template after previously accepting a txn to the
mempool. Consequently, the CPU miner now also employs the same
optimization.
The maximum number of entries for the sigCache has been introduced as a
config parameter in order to allow users to configure the amount of
memory consumed by this new additional caching.
This commit moves all code related to standard scripts into a separate
file named standard.go as well as the associated tests into
standard_test.go. Since the code in address.go and address_test.go is
only related to standard scripts, it has been combined into the new
files and the old files deleted.
The intent here is to make it clear that the code in standard.go is not
related to consensus.
This commit implements a new type, named scriptNum, for handling all
numeric values used in scripts and converts the code over to make use of
it. This is being done for a few of reasons.
First, the consensus rules for handling numeric values in the scripts
require special handling with subtle semantics. By encapsulating those
details into a type specifically dedicated to that purpose, it
simplifies the code and generally helps prevent improper usage.
Second, the new type is quite a bit more efficient than big.Ints which
are designed to be arbitrarily large and thus involve a lot of heap
allocations and additional multi-precision bookkeeping. Because this
new type is based on an int64, it allows the numbers to be stack
allocated thereby eliminating a lot of GC and also eliminates the extra
multi-precision arithmetic bookkeeping.
The use of an int64 is possible because the consensus rules dictate that
when data is interpreted as a number, it is limited to an int32 even
though results outside of this range are allowed so long as they are not
interpreted as integers again themselves. Thus, the maximum possible
result comes from multiplying a max int32 by itself which safely fits
into an int64 and can then still appropriately provide the serialization
of the larger number as required by consensus.
Finally, it more closely resembles the implementation used by Bitcoin
Core and thus makes is easier to compare the behavior between the two
implementations.
This commit also includes a full suite of tests with 100% coverage of
the semantics of the new type.
This commit contains a lot of cleanup on the txscript code to make it
more consistent with the code throughout the rest of the project. It
doesn't change any operational logic.
The following is an overview of the changes:
- Add a significant number of comments throughout in order to better
explain what the code is doing
- Fix several comment typos
- Move a couple of constants only used by the engine to engine.go
- Move a variable only used by the engine to engine.go
- Fix a couple of format specifiers in the test prints
- Reorder functions so they're defined before/closer to use
- Make the code lint clean with the exception of the opcode definitions
This commit moves the opcode execution logic from the opcode type to the
engine type because execution of an opcode modifies the engine state
(primarily the main and alternate data stacks) as opposed to the state
of the opcode. Making the engine the receiver more clearly indicates
this fact.
This commit unexports the Stack type since it is only intended to be
used internally during script execution. Further, the engine exposes
the {G,S}etStack and {G,S}etAltStack functions which return the items as
a slice of byte slices ([][]byte) for caller access while stepping.
This commit improves the way the conditional execution stack is handled in
a few ways.
First, the current execution state is now pushed onto the end of the slice
rather than the front of it. This has been done because it results in
fewer allocations and is therefore more efficient.
Second, the need for allocating and setting an initial true in the
conditional stack has been eliminated. The vast majority of scripts don't
contain any conditionals, so there is no reason to allocate a slice when
it isn't needed.
Third, a new function has been added to the engine to determine if the
current conditional branch is executing named isBranchExecuting which
handles the fact the conditional execution stack can now be empty and
improves the readability of the code.
Finally, it removes a couple of TODOs which I have verified do not apply.
Rather than storing a separate bool for whether or not each flag is set in
every script engine instance, store the flags and check if the relevant
flag is set from each specific location.
This reduces the memory needed by each script engine instance and means
future flags will not require new fields.
This commit renames the Script type to Engine to better reflect its
purpose. It also renames the NewScript function to NewEngine to match.
This is being done because name Script for the engine is confusing since
it implies it is an actual script rather than the execution environment
for the script. It also paves the way for eventually supplying a
ParsedScript type which will be less likely to be confused with the
execution environment.
While moving the code, some additional variable names and comments have
been updated to better match the style used throughout the rest of the
code base. In addition, an attempt has been made to use consistent naming
of the engine as 'vm' instead of using different variables names as it was
previously.
Finally, the relevant engine code has been moved into a new file named
engine.go and related tests moved to engine_test.go.