This refactors the script engine to store and step through raw scripts
by making using of the new zero-allocation script tokenizer as opposed
to the less efficient method of storing and stepping through parsed
opcodes. It also improves several aspects while refactoring such as
optimizing the disassembly trace, showing all scripts in the trace in
the case of execution failure, and providing additional comments
describing the purpose of each field in the engine.
It should be noted that this is a step towards removing the parsed
opcode struct and associated supporting code altogether, however, in
order to ease the review process, this retains the struct and all
function signatures for opcode execution which make use of an individual
parsed opcode. Those will be updated in future commits.
The following is an overview of the changes:
- Modify internal engine scripts slice to use raw scripts instead of
parsed opcodes
- Introduce a tokenizer to the engine to track the current script
- Remove no longer needed script offset parameter from the engine since
that is tracked by the tokenizer
- Add an opcode index counter for disassembly purposes to the engine
- Update check for valid program counter to only consider the script
index
- Update tests for bad program counter accordingly
- Rework the NewEngine function
- Store the raw scripts
- Setup the initial tokenizer
- Explicitly check against version 0 instead of DefaultScriptVersion
which would break consensus if changed
- Check the scripts parse according to version 0 semantics to retain
current consensus rules
- Improve comments throughout
- Rework the Step function
- Use the tokenizer and raw scripts
- Create a parsed opcode on the fly for now to retain existing
opcode execution function signatures
- Improve comments throughout
- Update the Execute function
- Explicitly check against version 0 instead of DefaultScriptVersion
which would break consensus if changed
- Improve the disassembly tracing in the case of error
- Update the CheckErrorCondition function
- Modify clean stack error message to make sense in all cases
- Improve the comments
- Update the DisasmPC and DisasmScript functions on the engine
- Use the tokenizer
- Optimize construction via the use of strings.Builder
- Modify the subScript function to return the raw script bytes since the
parsed opcodes are no longer stored
- Update the various signature checking opcodes to use the raw opcode
data removal and signature hash calculation functions since the
subscript is now a raw script
- opcodeCheckSig
- opcodeCheckMultiSig
- opcodeCheckSigAlt
This converts the engine's current program counter disasembly to make
use of the standalone disassembly function to remove the dependency on
the parsed opcode struct.
It also updates the tests accordingly.
This converts the checkMinimalDataPush function defined on a parsed
opcode to a standalone function which accepts an opcode and data slice
instead in order to make it more flexible for raw script analysis.
It also updates all callers accordingly.
This converts the isConditional function defined on a parsed opcode to a
standalone function named isOpcodeConditional which accepts an opcode as
a byte instead in order to make it more flexible for raw script
analysis.
It also updates all callers accordingly.
This converts the alwaysIllegal function defined on a parsed opcode to a
standalone function named isOpcodeAlwaysIllegal which accepts an opcode
as a byte instead in order to make it more flexible for raw script
analysis.
It also updates all callers accordingly.
This converts the isDisabled function defined on a parsed opcode to a
standalone function which accepts an opcode as a byte instead in order
to make it more flexible for raw script analysis.
It also updates all callers accordingly.
This introduces a new function named calcSignatureHashRaw which accepts
the raw script bytes to calculate the script hash versus requiring the
parsed opcode only to unparse them later in order to make it more
flexible for working with raw scripts.
Since there are several places in the rest of the code that currently
only have access to the parsed opcodes, this modifies the existing
calcSignatureHash to first unparse the script before calling the new
function.
Backport of decred/dcrd:f306a72a16eaabfb7054a26f9d9f850b87b00279
This converts the DisasmString function to make use of the new
zero-allocation script tokenizer instead of the far less efficient
parseScript thereby significantly optimizing the function.
In order to facilitate this, the opcode disassembly functionality is
split into a separate function called disasmOpcode that accepts the
opcode struct and data independently as opposed to requiring a parsed
opcode. The new function also accepts a pointer to a string builder so
the disassembly can be more efficiently be built.
While here, the comment is modified to explicitly call out the script
version semantics.
The following is a before and after comparison of a large script:
benchmark old ns/op new ns/op delta
BenchmarkDisasmString-8 102902 40124 -61.01%
benchmark old allocs new allocs delta
BenchmarkDisasmString-8 46 51 +10.87%
benchmark old bytes new bytes delta
BenchmarkDisasmString-8 389324 130552 -66.47%
- create benchmarks to measure allocations
- add test for benchmark input
- create a low alloc parseScriptTemplate
- refactor parsing logic for a single opcode
This commit adds verification of the post-segwit standardness
requirement that all pubkeys involved in checks operations MUST be
serialized as compressed public keys. A new ScriptFlag has been added
to guard this behavior when executing scripts.
This commit modifies the op-code execution for OP_IF and OP_NOTIF to
enforce the additional “minimal if” constraints which require the
top-stack item when the op codes are encountered to be either an empty
vector, or exactly [0x01].
This commit implements full witness program validation for the
currently defined version 0 witness programs. This includes validation
logic for nested p2sh, p2wsh, and p2wkh. Additionally, when in witness
validation mode, an additional set of constrains are enforced such as
using the new sighash digest algorithm and enforcing clean stack
behavior within witness programs.
This commit implements most of BIP0143 by adding logic to implement the
new sighash calculation, signing, and additionally introduces the
HashCache optimization which eliminates the O(N^2) computational
complexity for the SIGHASH_ALL sighash type.
The HashCache struct is the equivalent to the existing SigCache struct,
but for caching the reusable midstate for transactions which are
spending segwitty outputs.
Now that glide is used for version management and a specific commit of
the upstream repository can be locked it is no longer necessary to
maintain a fork of the package specifically to keep a stable dependency.
While here, update the glide dependency for btcutil as well since it was
switched to use the upstream path as well.
ScriptVerifyNullFail defines that signatures must be empty if a
CHECKSIG or CHECKMULTISIG operation fails.
This commit also enables ScriptVerifyNullFail at the mempool policy
level.
This converts the majority of script errors from generic errors created
via errors.New and fmt.Errorf to use a concrete type that implements the
error interface with an error code and description.
This allows callers to programmatically detect the type of error via
type assertions and an error code while still allowing the errors to
provide more context.
For example, instead of just having an error the reads "disabled opcode"
as would happen prior to these changes when a disabled opcode is
encountered, the error will now read "attempt to execute disabled opcode
OP_FOO".
While it was previously possible to programmatically detect many errors
due to them being exported, they provided no additional context and
there were also various instances that were just returning errors
created on the spot which callers could not reliably detect without
resorting to looking at the actual error message, which is nearly always
bad practice.
Also, while here, export the MaxStackSize and MaxScriptSize constants
since they can be useful for consumers of the package and perform some
minor cleanup of some of the tests.
The CSV consensus rules dictate that the opcode fails when the
transaction version is not at least version 2, however that only applies
if the disable flag is not set in the sequence.
This is not an issue at the current time because we do not yet enforce
CSV at a consensus level, however, I noticed this discrepancy when doing
a thorough audit of the CSV paths due to the ongoing work to add full
consensus-enforced CSV support.
As a result, this must be merged prior to enabling consensus enforcement
for CSV or it would open up the potential for a hard fork.
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up. In particular,
this updates the code to make use of the new chainhash package.
Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.
The following is an overview of the changes:
- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
- wire.BlockHeader.BlockSha -> BlockHash
- wire.MsgBlock.BlockSha -> BlockHash
- wire.MsgBlock.TxShas -> TxHashes
- wire.MsgTx.TxSha -> TxHash
- blockchain.ShaHashToBig -> HashToBig
- peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
instead
- Update for function name changes in other dependent packages such as
btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
See https://github.com/bitcoin/bips/blob/master/bip-0065.mediawiki for
more information.
This commit mimics Bitcoin Core commit bc60b2b4b401f0adff5b8b9678903ff8feb5867b
and includes additional tests from Bitcoin Core commit
cb54d17355864fa08826d6511a0d7692b21ef2c9
Introduce an ECDSA signature verification into btcd in order to
mitigate a certain DoS attack and as a performance optimization.
The benefits of SigCache are two fold. Firstly, usage of SigCache
mitigates a DoS attack wherein an attacker causes a victim's client to
hang due to worst-case behavior triggered while processing attacker
crafted invalid transactions. A detailed description of the mitigated
DoS attack can be found here: https://bitslog.wordpress.com/2013/01/23/fixed-bitcoin-vulnerability-explanation-why-the-signature-cache-is-a-dos-protection/
Secondly, usage of the SigCache introduces a signature verification
optimization which speeds up the validation of transactions within a
block, if they've already been seen and verified within the mempool.
The server itself manages the sigCache instance. The blockManager and
txMempool respectively now receive pointers to the created sigCache
instance. All read (sig triplet existence) operations on the sigCache
will not block unless a separate goroutine is adding an entry (writing)
to the sigCache. GetBlockTemplate generation now also utilizes the
sigCache in order to avoid unnecessarily double checking signatures
when generating a template after previously accepting a txn to the
mempool. Consequently, the CPU miner now also employs the same
optimization.
The maximum number of entries for the sigCache has been introduced as a
config parameter in order to allow users to configure the amount of
memory consumed by this new additional caching.
This commit contains fixes from the results of a thorough audit of
txscript to find any cases of script evaluation which doesn't match the
required consensus behavior. These conditions are fairly obscure and
highly unlikely to happen in any real scripts, but they could have
nevertheless been used by a clever attacker with malicious intent to
cause a fork.
Test cases which exercise these conditions have been added to the
reference tests and will contributed upstream to improve the quality for
the entire ecosystem.
Unlike OP_IF and OP_NOTIF which interpret the top stack item as a
number, OP_IFDUP interprets it as a boolean. This has important
consequences because numbers are imited to int32s while booleans can be
an arbitrary number of bytes.
The offending script was found and reported by Jonas Nick through the
use of fuzzing.
This commit implements a new type, named scriptNum, for handling all
numeric values used in scripts and converts the code over to make use of
it. This is being done for a few of reasons.
First, the consensus rules for handling numeric values in the scripts
require special handling with subtle semantics. By encapsulating those
details into a type specifically dedicated to that purpose, it
simplifies the code and generally helps prevent improper usage.
Second, the new type is quite a bit more efficient than big.Ints which
are designed to be arbitrarily large and thus involve a lot of heap
allocations and additional multi-precision bookkeeping. Because this
new type is based on an int64, it allows the numbers to be stack
allocated thereby eliminating a lot of GC and also eliminates the extra
multi-precision arithmetic bookkeeping.
The use of an int64 is possible because the consensus rules dictate that
when data is interpreted as a number, it is limited to an int32 even
though results outside of this range are allowed so long as they are not
interpreted as integers again themselves. Thus, the maximum possible
result comes from multiplying a max int32 by itself which safely fits
into an int64 and can then still appropriately provide the serialization
of the larger number as required by consensus.
Finally, it more closely resembles the implementation used by Bitcoin
Core and thus makes is easier to compare the behavior between the two
implementations.
This commit also includes a full suite of tests with 100% coverage of
the semantics of the new type.
This commit contains a lot of cleanup on the txscript code to make it
more consistent with the code throughout the rest of the project. It
doesn't change any operational logic.
The following is an overview of the changes:
- Add a significant number of comments throughout in order to better
explain what the code is doing
- Fix several comment typos
- Move a couple of constants only used by the engine to engine.go
- Move a variable only used by the engine to engine.go
- Fix a couple of format specifiers in the test prints
- Reorder functions so they're defined before/closer to use
- Make the code lint clean with the exception of the opcode definitions
- Remove all redundant opcode tests in favor of the JSON-based tests
in the data directory.
- Remove duplicate stack nip test
- Add new tests to data/script_invalid.json to exercise additional
negative error paths
- Remove old unneeded pubkey trace code from opcodeCheckSig
- Simplify and improve the disassembly print function
- Add new tests to directly test all individual opcode disassembly
- Add new tests to directly test opcode disabled function which does not
get invoked during ordinary execution
- Improve test coverage of opcode.go
This commit moves the opcode execution logic from the opcode type to the
engine type because execution of an opcode modifies the engine state
(primarily the main and alternate data stacks) as opposed to the state
of the opcode. Making the engine the receiver more clearly indicates
this fact.
This commit very slightly optimizes the cryptographic hashing performed
by the script opcodes by calling the hash sum routines directly (for
those that support it) rather than allocating a new generic hash.Hash
hasher instance for them.
This commit improves the way the conditional execution stack is handled in
a few ways.
First, the current execution state is now pushed onto the end of the slice
rather than the front of it. This has been done because it results in
fewer allocations and is therefore more efficient.
Second, the need for allocating and setting an initial true in the
conditional stack has been eliminated. The vast majority of scripts don't
contain any conditionals, so there is no reason to allocate a slice when
it isn't needed.
Third, a new function has been added to the engine to determine if the
current conditional branch is executing named isBranchExecuting which
handles the fact the conditional execution stack can now be empty and
improves the readability of the code.
Finally, it removes a couple of TODOs which I have verified do not apply.
This commit exports a new map named OpcodeByName which can be used to
lookup an opcode value given a human-readable opcode name.
It also modifies the test function which does short form parsing to use
the new map instead of the internal array.
Closes#267.
This commit converts the opcode map to an array to improve performance.
Benchmark of executing a standard p2pk transaction:
New: BenchmarkExecute 2000 784349 ns/op
Old: BenchmarkExecute 2000 792600 ns/op
The time is dominated by the signature checking as expected, however there
is still an increase in speed.
This commit modifies the definition of the opcodes to their hex
counterparts rather than decimal since it is far more common to see
scripts in hex. This makes it easier when manually looking at script
dumps to correlate opcodes. However, since there are also cases where it
is useful to see the decimal value of the opcode, the decimal value has
been left as a comment. Obviously converting the numbers is trivial, but
it is handy when looking at the opcode definitions to already have it
there.
In addition, it syncs the opcodes with the latest Bitcoin Core internal
opcodes for completeness and modifies the tests accordingly.
Rather than storing a separate bool for whether or not each flag is set in
every script engine instance, store the flags and check if the relevant
flag is set from each specific location.
This reduces the memory needed by each script engine instance and means
future flags will not require new fields.
This commit renames the Script type to Engine to better reflect its
purpose. It also renames the NewScript function to NewEngine to match.
This is being done because name Script for the engine is confusing since
it implies it is an actual script rather than the execution environment
for the script. It also paves the way for eventually supplying a
ParsedScript type which will be less likely to be confused with the
execution environment.
While moving the code, some additional variable names and comments have
been updated to better match the style used throughout the rest of the
code base. In addition, an attempt has been made to use consistent naming
of the engine as 'vm' instead of using different variables names as it was
previously.
Finally, the relevant engine code has been moved into a new file named
engine.go and related tests moved to engine_test.go.
The ScriptVerifyMinimalData enforces that all push operations use the
minimal data push required. This is part of BIP0062.
This commit mimics Bitcoin Core commit
698c6abb25c1fbbc7fa4ba46b60e9f17d97332ef
Remove ScriptCanonicalSignatures and use the new
ScriptVerifyDERSignatures flag. The ScriptVerifyDERSignatures
flag accomplishes the same functionality.
This commit adds two new verification flags to txscript named
ScriptVerifyStrictEncoding and ScriptVerifyDerSignatures.
The ScriptVerifyStrictEncoding flag enforces signature scripts
and public keys to follow the strict encoding requirements.
The ScriptVerifyDerSignatures flag enforces signature scripts
to follow the strict encoding requirements.
These flags mimic Bitcoin Core's SCRIPT_VERIFY_STRICTENC and
SCRIPT_VERIFY_DERSIG flags and brings the Bitcoin Core test scripts up
to date.
This commit corrects a case in the OP_CHECKMULTISIG handling where it was
possible to improperly validate a transaction that had a combination of
valid and malformed signatures.
It also adds a new test to ensure this case is properly handled and nukes
a superfluous comment.
Fixes#293.
This commit contains the entire btcscript repository along with several
changes needed to move all of the files into the txscript directory in
order to prepare it for merging. This does NOT update btcd or any of the
other packages to use the new location as that will be done separately.
- All import paths in the old btcscript test files have been changed to the
new location
- All references to btcscript as the package name have been chagned to
txscript
This is ongoing work toward #214.