The previous script validation logic entailed starting up a hard-coded
number of goroutines to process the transaction scripts in parallel. In
particular, one goroutine (up to 8 max) was started per transaction in a
block and another one was started for each input script pair in the
each transaction. This resulted in 64 goroutines simultaneously running
scripts and verifying cryptographic signatures. This could easily lead to
the overall system feeling sluggish.
Further the previous design could also result in bursty behavior since the
number of inputs to a transaction as well as its complexity can vary
widely between transactions. For example, starting 2 goroutines (one to
process the transaction and one for actual script pair validation) to
verify a transaction with a single input was not desirable.
Finally, the previous design validated all transactions and inputs
regardless of a failure in one of the other scripts. This really didn't
have a big impact since it's quite rare that blocks with invalid
verifications are being processed, but it was a potential way DoS vector.
This commit changes the logic in a few ways to improve things:
- The max number of validation goroutines is now based on the number of
cores in the system
- All transaction inputs from all transactions in the block are collated
into a single list which is fed through the aforementioned validation
goroutines
- The validation CPU usage is much more consistent due to the collation of
inputs
- A validation error in any goroutine immediately stops validation of all
remaining inputs
- The errors have been improved to include context about what tx script
pair failed as opposed to showing the information as a warning
This closesconformal/btcd#59.
It is not necessary to do all of the transaction validation on
blocks if they have been confirmed to be in the block chain leading
up to the final checkpoint in a given blockschain.
This algorithm fetches block headers from the peer, then once it has
established the full blockchain connection, it requests blocks.
Any blocks before the final checkpoint pass true for fastAdd on
btcchain operation, which causes it to do less valiation on the block.
This commit modifies the tests to setup a chain instance backed by the new
memory database backend for btcdb. This allows the tests to avoid
creating and cleaning up files and also allows the tests to run faster
since it can all happen in memory.
The chainSetup function has also been changed to provide logic to switch
on the database type to allow for easy changing of the backend to a
different database type as needed. For example, it could be useful to
provide extra testing against new database backends.
Rather than defining CheckBlockSanity as a member of a BlockChain
instance, define it at the root level so it is truly context free as
intended. In order to make it context free, the proof of work limit is
now a required parameter.
Profiling showed the duplicate transaction input check was taking around
6% of the total CheckTransactionSanity processing time. This was largely
due to using fmt.Sprintf to generate the map key.
This commit modifies the check instead to use the actual output as a map
key.
The following benchmark results show the difference:
Before: BenchmarkOldDuplicatInputCheck 100000 21787 ns/op
After: BenchmarkNewDuplicatInputCheck 2000000 937 ns/op
Closes#2
This commit modifies the ValidateTransactionScripts API to accept the
recently added ScriptFlags from btcscript. This provides flexibility to
the caller to choose validation behavior based on those new flags.
This commit modifies the main processing loop for orphan dependencies
(orphans that are processed due to their parents showing up) to use an
index based for loop over range. Since the Go range statement does not
reevaluate on every iteration, it was previously possible under certain
circumstances for the slice to be changed out from under the range
statement while processing the orphan blocks.
Rather than fetching the hash of each block individually 2k+ times, make
use of the FetchHeightRange function so all of the most recent hashes can
be fetched at once.
This commit modifies the code to make use of the new btcd APIs that allow
fetching of transaction lists which either do or do not include fully
spent transactions. It is more efficient to avoid fetching fully spent
transactions from the database when they aren't needed.
This commit modifies the errors that result from missing expected input
transactions to a RuleError. This allows the caller to detect a block was
rejected due to a rule violation as opposed to an unexpected error.
This commit adds a quick check to the transaction store fetch code which
simply returns an empty store if no hashes were requested rather than
bothering the db with an empty list.
This commit modifies the transaction lookup code to use a set instead of a
slice (list). This allows the lookup to automatically prevent duplicate
requests to the database.
Previously, the code simply added every referenced transaction to a list
without checking for duplicates, which led to multiple requests against
the database for the same transaction. It also meant the request list
could grow quite large with all of the duplicates using far more memory
than required.
While the end result was accurate, operating that way is not as efficient
as only requesting unique transactions.
This commit implents a basic infrastructure to be used throughout the
tests for creating a new chain instance that is ready to have tests run
against it. It also returns a teardown function the caller can use to
clean up after it is done testing. This paves the way for adding more
tests.
The original thought was that chain would also house the transaction
memory pool, but that ultimately was decided against. As a result,
it only makes sense to query chain for blocks rather than generic
inventory.
This commit corrects the reading of the serialized height in coinbase
transactions for block height of version 2 or greater. On mainnet, the
serialized height is always 3 bytes and will continue to be so for
something like another ~159 years, so there was no issue with mainnet.
However on testnet, there are some version 2 blocks which are low enough
in the chain to only take 2 bytes to serialize.
In addition, this commit adds a full tests for the relavant function
including negative tests and variable length serialized lengths for block
heights.
Closes#1.
btcdb was changed a while back to not insert the genesis block by default.
This commit modifies the reorg test to insert it as required so not all
blocks are orphans.
Rather than using a channel for notifictions, use a callback instead.
There are several issues that arise with async notifications via a
channel, so use a callback instead. The caller can always make the
notifications async by using channels in the provided callback if
needed.
This commit modifies CheckInputTransactions to ensure that not only must a
transaction exist in the transaction store, it must also not have any
errors associated with it or be nil.
Several of the functions require a map of contextual transaction data to
use as a source for referenced transactions. This commit exports the
underlying TxData type and creates a new type TxStore, which is a map of
points to the under TxData. In addition, this commit exposes a new
function, FetchTransactionStore, which returns a transaction store
(TxStore) containing all of the transactions referenced by the passed
transaction, as well as the existing transaction if it already exists.
This paves the way for subsequent commits which will expose some of the
functions which depend on this transaction store.
This commit modifies the code to choose sane defaults for the backing
arrays for slices that involve a lot of appends such block locators, hash
processing, and needed transactions. This is an optimization to avoid
the overhead of growing the backing arrays and copying the data multiple
times in the most common case. This also prevents a leak in Go GC which
will likely ultimatley be fixed, but the efficiecy gains alone are worth
the change.
After discussion with others and thinking about the notification channel
some more, we've decided to leave it up to the caller to quickly handle
notifications. While it is true that notification should be handled
quickly to not block the chain processing code unnecessarily, launching a
goroutine in chain means the notifications are no longer necessarily in
order. Also, if the caller is not properly handling the notifications,
the goroutines end up sicking around forever. By leaving it up to the
caller to quickly handle the notification or launch a goroutine as
necessary for the caller, it provides the flexibility to ensure proper
notification ordering as well as control over other things such as
how to handle backpressure.
Rather than sending the root of the an orphan chain when sending an orphan
notification, send the hash of the orphan block itself. The caller can
then call the GetOrphanRoot function with the hash to get the root of the
orphan chain as needed. This is being changed since it's cleaner and more
detministic sending the hash for the orphan block that was just processed
rather than sending a possibly different hash depending on whether there
is an orphan chain or not.