This commit resolves an issue where the block node index was forcing
entire blocks to be kept in memory thereby forcing excessive memory usage.
For example, prior to this change, the memory usage could consume upwards
of 1.5GB while importing bootstrap.dat via the addblock utility. With
this change the entire import takes <150MB. This also has the same memory
reduction to btcd since it uses the same code path.
Previously there was only a function to get the latest checkpoint. This
commit exposes a new function named Checkpoints which returns a slice of
checkpoints ordered by height for the active network or nil when
checkpoints are disabled.
The recent addition of the fast add path to support headers first was not
running the block node index pruning code which removes unneeded block
nodes from memory. This resulted in higher memory usage than needed in
fast add mode.
Previously the code was only adding a new block node as a child in the
inernal node index for the cases it extended the main chain or a side
chain and not for a node which caused a reorg. This resulted in the block
node pruning code not clearing the parent link of reorged nodes which
ultimately led to a sanity check error accordingly.
This commit resolves the issue by ensuring new block nodes are added as
children of their respective parents in all cases.
Closes#4.
This commit changes the node index creation to use block headers instead
of full blocks. This speeds up the initial node index generation since it
doesn't require loading a bunch of full blocks at startup.
This commit modifies local variables that are used for more convenient
access to a block's header to use pointers. This avoids copying the
header multiple times.
Previously the code performed a database query for every checkpoint (going
backwards) to find the latest known checkpoint on every block. This was
particularly noticabled near the beginning of the block chain when there
are still several checkpoints that haven't been reached yet.
This commit changes the logic to cache the latest known checkpoint while
keeping track of when it needs to be updated once a new later known
checkpoint has been reached.
While here, also add a log message when a checkpoint has been reached and
verified.
The previous script validation logic entailed starting up a hard-coded
number of goroutines to process the transaction scripts in parallel. In
particular, one goroutine (up to 8 max) was started per transaction in a
block and another one was started for each input script pair in the
each transaction. This resulted in 64 goroutines simultaneously running
scripts and verifying cryptographic signatures. This could easily lead to
the overall system feeling sluggish.
Further the previous design could also result in bursty behavior since the
number of inputs to a transaction as well as its complexity can vary
widely between transactions. For example, starting 2 goroutines (one to
process the transaction and one for actual script pair validation) to
verify a transaction with a single input was not desirable.
Finally, the previous design validated all transactions and inputs
regardless of a failure in one of the other scripts. This really didn't
have a big impact since it's quite rare that blocks with invalid
verifications are being processed, but it was a potential way DoS vector.
This commit changes the logic in a few ways to improve things:
- The max number of validation goroutines is now based on the number of
cores in the system
- All transaction inputs from all transactions in the block are collated
into a single list which is fed through the aforementioned validation
goroutines
- The validation CPU usage is much more consistent due to the collation of
inputs
- A validation error in any goroutine immediately stops validation of all
remaining inputs
- The errors have been improved to include context about what tx script
pair failed as opposed to showing the information as a warning
This closesconformal/btcd#59.
It is not necessary to do all of the transaction validation on
blocks if they have been confirmed to be in the block chain leading
up to the final checkpoint in a given blockschain.
This algorithm fetches block headers from the peer, then once it has
established the full blockchain connection, it requests blocks.
Any blocks before the final checkpoint pass true for fastAdd on
btcchain operation, which causes it to do less valiation on the block.
This commit modifies the tests to setup a chain instance backed by the new
memory database backend for btcdb. This allows the tests to avoid
creating and cleaning up files and also allows the tests to run faster
since it can all happen in memory.
The chainSetup function has also been changed to provide logic to switch
on the database type to allow for easy changing of the backend to a
different database type as needed. For example, it could be useful to
provide extra testing against new database backends.
Rather than defining CheckBlockSanity as a member of a BlockChain
instance, define it at the root level so it is truly context free as
intended. In order to make it context free, the proof of work limit is
now a required parameter.
Profiling showed the duplicate transaction input check was taking around
6% of the total CheckTransactionSanity processing time. This was largely
due to using fmt.Sprintf to generate the map key.
This commit modifies the check instead to use the actual output as a map
key.
The following benchmark results show the difference:
Before: BenchmarkOldDuplicatInputCheck 100000 21787 ns/op
After: BenchmarkNewDuplicatInputCheck 2000000 937 ns/op
Closes#2
This commit modifies the ValidateTransactionScripts API to accept the
recently added ScriptFlags from btcscript. This provides flexibility to
the caller to choose validation behavior based on those new flags.
This commit modifies the main processing loop for orphan dependencies
(orphans that are processed due to their parents showing up) to use an
index based for loop over range. Since the Go range statement does not
reevaluate on every iteration, it was previously possible under certain
circumstances for the slice to be changed out from under the range
statement while processing the orphan blocks.
Rather than fetching the hash of each block individually 2k+ times, make
use of the FetchHeightRange function so all of the most recent hashes can
be fetched at once.
This commit modifies the code to make use of the new btcd APIs that allow
fetching of transaction lists which either do or do not include fully
spent transactions. It is more efficient to avoid fetching fully spent
transactions from the database when they aren't needed.
This commit modifies the errors that result from missing expected input
transactions to a RuleError. This allows the caller to detect a block was
rejected due to a rule violation as opposed to an unexpected error.
This commit adds a quick check to the transaction store fetch code which
simply returns an empty store if no hashes were requested rather than
bothering the db with an empty list.
This commit modifies the transaction lookup code to use a set instead of a
slice (list). This allows the lookup to automatically prevent duplicate
requests to the database.
Previously, the code simply added every referenced transaction to a list
without checking for duplicates, which led to multiple requests against
the database for the same transaction. It also meant the request list
could grow quite large with all of the duplicates using far more memory
than required.
While the end result was accurate, operating that way is not as efficient
as only requesting unique transactions.
This commit implents a basic infrastructure to be used throughout the
tests for creating a new chain instance that is ready to have tests run
against it. It also returns a teardown function the caller can use to
clean up after it is done testing. This paves the way for adding more
tests.
The original thought was that chain would also house the transaction
memory pool, but that ultimately was decided against. As a result,
it only makes sense to query chain for blocks rather than generic
inventory.
This commit corrects the reading of the serialized height in coinbase
transactions for block height of version 2 or greater. On mainnet, the
serialized height is always 3 bytes and will continue to be so for
something like another ~159 years, so there was no issue with mainnet.
However on testnet, there are some version 2 blocks which are low enough
in the chain to only take 2 bytes to serialize.
In addition, this commit adds a full tests for the relavant function
including negative tests and variable length serialized lengths for block
heights.
Closes#1.
btcdb was changed a while back to not insert the genesis block by default.
This commit modifies the reorg test to insert it as required so not all
blocks are orphans.