In this commit, we publicly export the spentTxOut struct and all its
attributes. This is the first in a set of commits to optimize the
existing address index by using the spend journal rather than manually
re-creating the utxoViewPoint each time.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level. This was inspired by similar recent
changes in Bitcoin Core.
The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.
The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.
However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.
While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal. The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either. Consequently, this also removes the version field of the
gettxout RPC result.
The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.
Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience. The data has already been converted and updated in the
commit.
An overview of the changes are as follows:
- Remove transaction version from both spent and unspent output entries
- Update utxo serialization format to exclude the version
- Modify the spend journal serialization format
- The old version field is now reserved and always stores zero and
ignores it when reading
- This allows old entries to be used by new code without having to
migrate the entire spend journal
- Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
transaction with all remaining utxos
- Optimize for memory usage with an eye towards a utxo cache
- Combine details such as whether the txout was contained in a
coinbase, is spent, and is modified into a single packed field of
bit flags
- Align entry fields to eliminate extra padding since ultimately
there will be a lot of these in memory
- Introduce a free list for serializing an outpoint to the database
key format to significantly reduce pressure on the GC
- Update all related functions that previously dealt with transaction
hashes to accept outpoints instead
- Update all callers accordingly
- Only add individually requested outputs from the mempool when
constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
information with every spent txout
- Introduce code to handle fetching the missing information from
another utxo from the same transaction in the event an old style
entry is encountered
- Make use of a database cursor with seek to do this much more
efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
- Store versions of the utxoset and spend journal buckets
- Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
longer apply, and add new ones for the new expected behavior
- Convert old tests for the legacy utxo format deserialization code to
test the new function that is used during upgrade
- Update the utxostore test data and add function that was used to
convert it
- Introduce a few new functions on UtxoViewpoint
- AddTxOut for adding an individual txout versus all of them
- addTxOut to handle the common code between the new AddTxOut and
existing AddTxOuts
- RemoveEntry for removing an individual txout
- fetchEntryByHash for fetching any remaining utxo for a given
transaction hash
Each node in the block index records some flags about its validation
state. This is just stored in memory for now, but can save effort if
attempting to reconnect a block that failed validation or was
disconnected.
This was only used to test block proposals, which has been changed to
instead use CheckConnectBlockTemplate. The flag complicated the
implementation of some chain processing routines and would be
difficult to implement with headers-first syncing.
This renames CheckConnectBlock to CheckConnectBlockTemplate and
modifies it to be easily consumable by the getblocktemplate RPC
handler. Performs full block validation now instead of partial
validation.
- Remove inMainChain from block nodes since that can now be efficiently
determined by using the chain view
- Track the best chain via a chain view instead of a single block node
- Use the tip of the best chain view everywhere bestNode was used
- Update chain view tip instead of updating best node
- Change reorg logic to use more efficient chain view fork finding logic
- Change block locator code over to use more efficient chain view logic
- Remove now unused block-index-based block locator code
- Move BlockLocator definition to chain.go
- Move BlockLocatorFromHash and LatestBlockLocator to chain.go
- Update both to use more efficient chain view logic
- Rework IsCheckpointCandidate to use block index and chain view
- Optimize MainChainHasBlock to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Removed error return since it can no longer fail
- Optimize BlockHeightByHash to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Removed error return since it can no longer fail
- Optimize BlockHashByHeight to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Removed error return since it can no longer fail
- Optimize HeightRange to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Optimize BlockByHeight to use chain view for main chain check
- Optimize BlockByHash to use chain view for main chain check
This removes the DisableVerify function and related state since nothing
uses it anymore since the command line option was removed. It is a
remnant of initial development.
This modifies the code that determines the most recently known
checkpoint to take advantage of recent changes which make the entire
block index available in memory by only storing a reference to the
specific node in the index that represents the latest known checkpoint.
Previously, the entire block was stored and new checkpoints required
loading it from the database.
This reworks the block index code such that it loads all of the headers
in the main chain at startup and constructs the full block index
accordingly.
Since the full index from the current best tip all the way back to the
genesis block is now guaranteed to be in memory, this also removes all
code related to dynamically loading the nodes and updates some of the
logic to take advantage of the fact traversing the block index can
longer potentially fail. There are also more optimizations and
simplifications that can be made in the future as a result of this.
Due to removing all of the extra overhead of tracking the dynamic state,
and ensuring the block node structs are aligned to eliminate extra
padding, the end result of a fully populated block index now takes quite
a bit less memory than the previous dynamically loaded version.
The main downside is that it now takes a while to start whereas it was
nearly instant before, however, it is much better to provide more
efficient runtime operation since that is its ultimate purpose and the
benefits far outweigh this downside.
Some benefits are:
- Since every block node is in memory, the recent code which
reconstructs headers from block nodes means that all headers can
always be served from memory which is important since the majority of
the network has moved to header-based semantics
- Several of the error paths can be removed since they are no longer
necessary
- It is no longer expensive to calculate CSV sequence locks or median
times of blocks way in the past
- It will be possible to create much more efficient iteration and
simplified views of the overall index
- The entire threshold state database cache can be removed since it is
cheap to construct it from the full block index as needed
An overview of the logic changes are as follows:
- Move AncestorNode from blockIndex to blockNode and greatly simplify
since it no longer has to deal with the possibility of dynamically
loading nodes and related failures
- Rename RelativeNode to RelativeAncestor, move to blockNode, and
redefine in terms of AncestorNode
- Move CalcPastMedianTime from blockIndex to blockNode and remove no
longer necessary test for nil
- Change calcSequenceLock to use Ancestor instead of RelativeAncestor
since it reads more clearly
This replaces the ErrDoubleSpend and ErrMissingTx error codes with a
single error code named ErrMissingTxOut and updates the relevant errors
and expected test results accordingly.
Once upon a time, the code relied on a transaction index, so it was able
to definitively differentiate between a transaction output that
legitimately did not exist and one that had already been spent.
However, since the code now uses a pruned utxoset, it is no longer
possible to reliably differentiate since once all outputs of a
transaction are spent, it is removed from the utxoset completely.
Consequently, a missing transaction could be either because the
transaction never existed or because it is fully spent.
This commit updates the new segwit validation logic within block
validation to be guarded by an initial check to the version bits state
before conditionally enforcing the logic based off of the state.
This commit implements the new block validation rules as defined by
BIP0141. The new rules include the constraints that if a block has
transactions with witness data, then there MUST be a commitment within
the conies transaction to the root of a new merkle tree which commits
to the wtxid of all transactions. Additionally, rather than limiting
the size of a block by size in bytes, blocks are now limited by their
total weight unit. Similarly, a newly define “sig op cost” is now used
to limit the signature validation cost of transactions found within
blocks.
This commit implements the flag activation portion of BIP 0147. The
verification behavior triggered by the NULLDUMMY script verification
flag has been present within btcd for some time, however it wasn’t
activated by default.
With this commit, once segwit has activated, the ScriptStrictMultiSig
will also be activated within the Script VM. Additionally, the
ScriptStrictMultiSig is now a standard script verification flag which
is used unconditionally within the mempool.
This commit modifies the existing block validation logic to examine the
current version bits state of the CSV soft-fork, enforcing the new
validation rules (BIPs 68, 112, and 113) accordingly based on the
current `ThesholdState`.
This simplifies the code based on the recommendations of the gosimple
lint tool.
Also, it increases the deadline for the linters to run to 10 minutes and
reduces the number of threads that is uses. This is being done because
the Travis environment has become increasingly slower and it also seems
to be hampered by too many threads running concurrently.
This modifies the blockNode and BestState structs in the blockchain
package to store hashes directly instead of pointers to them and updates
callers to deal with the API change in the exported BestState struct.
In general, the preferred approach for hashes moving forward is to store
hash values in complex data structures, particularly those that will be
used for cache entries, and accept pointers to hashes in arguments to
functions.
Some of the reasoning behind making this change is:
- It is generally preferred to avoid storing pointers to data in cache
objects since doing so can easily lead to storing interior pointers
into other structs that then can't be GC'd
- Keeping the hash values directly in the block node provides better
cache locality
This refactors the block index logic into a separate struct and
introduces an individual lock for it so it can be queried independent of
the chain lock.
This modifies the block nodes used in the blockchain package for keeping
track of the block index to use int64 for the timestamps instead of
time.Time.
This is being done because a time.Time takes 24 bytes while an int64
only takes 8 and the plan is to eventually move the entire block index
into memory instead of the current dynamically-loaded version, so
cutting the number of bytes used for the timestamp by a third is highly
desirable.
Also, the consensus code requires working with unix-style timestamps
anyways, so switching over to them in the block node does not seem
unreasonable.
Finally, this does not go so far as to change all of the time.Time
references, particularly those that are in the public API, so it is
purely an internal change.
This modifies the code to only enforce the fairly expensive BIP0030
(duplicate transcactions) checks when the chain has not yet reached the
BIP0034 activation height (and is not one of the 2 special historical
blocks that break the rule and prompted BIP0034 to being with) since
that BIP made it impossible to create duplicate coinbases and thus
removed the possibility of creating transactions that overwrite older
ones.
This is a rather large optimization because the check is expensive due
to involving a ton of cache misses in the utxoset. For example, the
following are times it took to perform the BIP0030 check on blocks
425490 - 425502 and a system with a relatively old Hitachi spinner HDD:
block 425490: 674.5857ms
block 425491: 726.5923ms
block 425492: 827.6051ms
block 425493: 680.0863ms
block 425494: 722.0917ms
block 425495: 700.0889ms
block 425496: 647.5823ms
block 425497: 445.0565ms
block 425498: 602.5765ms
block 425499: 375.0476ms
block 425500: 771.0979ms
block 425501: 461.5586ms
block 425502: 603.0766ms
As can be seen from these numbers, this reduces the block validation
time by an average of just over half a second for the given
representative data set and hardware.
Signed-off-by: Dave Collins <davec@conformal.com>
Now that all softforking is done via BIP0009 versionbits, replace the
old isMajorityVersion deployment mechanism with hard coded historical
block heights at which they became active.
Since the activation heights vary per network, this adds new parameters
to the chaincfg.Params struct for them and sets the correct heights at
which each softfork became active on each chain.
It should be noted that this is a technically hard fork since the
behavior of alternate chain history is different with these hard-coded
activation heights as opposed to the old isMajorityVersion code. In
particular, an alternate chain history could activate one of the soft
forks earlier than these hard-coded heights which means the old code
would reject blocks which violate the new soft fork rules whereas this
new code would not.
However, all of the soft forks this refers to were activated so far in
the chain history there is there is no way a reorg that long could
happen and checkpoints reject alternate chains before the most recent
checkpoint anyways. Furthermore, the same change was made in Bitcoin
Core so this needs to be changed to be consistent anyways.
This commit introduces the concept of “sequence locks” borrowed from
Bitcoin Core for converting an input’s relative time-locks to an
absolute value based on a particular block for input maturity
evaluation.
A sequence lock is computed as the most distant maturity height/time
amongst all the referenced outputs within a particular transaction.
A transaction with sequence locks activated within any of its inputs
can *only* be included within a block if from the point-of-view of that
block either the time-based or height-based maturity for all referenced
inputs has been met.
A transaction with sequence locks can only be accepted to the mempool
iff from the point-of-view of the *next* (yet to be found block) all
referenced inputs within the transaction are mature.
This modifies the ExtractCoinbaseHeight function to recognize small
canonically serialized block heights in coinbase scripts of blocks
higher than version 2.
This allows regression test chains in which blocks encode the serialized
height in the coinbase starting from block 1.
This moves several of the chain constants to the Params struct in the
chaincfg package which is intended for that purpose. This is mostly a
backport of the same modifications made in Decred along with a few
additional things cleaned up.
The following is an overview of the changes:
- Comment all fields in the Params struct definition
- Add locals to BlockChain instance for the calculated values based on
the provided chain params
- Rename the following param fields:
- SubsidyHalvingInterval -> SubsidyReductionInterval
- ResetMinDifficulty -> ReduceMinDifficulty
- Add new Param fields:
- CoinbaseMaturity
- TargetTimePerBlock
- TargetTimespan
- BlocksPerRetarget
- RetargetAdjustmentFactor
- MinDiffReductionTime
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up. In particular,
this updates the code to make use of the new chainhash package.
Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.
The following is an overview of the changes:
- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
- wire.BlockHeader.BlockSha -> BlockHash
- wire.MsgBlock.BlockSha -> BlockHash
- wire.MsgBlock.TxShas -> TxHashes
- wire.MsgTx.TxSha -> TxHash
- blockchain.ShaHashToBig -> HashToBig
- peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
instead
- Update for function name changes in other dependent packages such as
btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
This commit is the first stage of several that are planned to convert
the blockchain package into a concurrent safe package that will
ultimately allow support for multi-peer download and concurrent chain
processing. The goal is to update btcd proper after each step so it can
take advantage of the enhancements as they are developed.
In addition to the aforementioned benefit, this staged approach has been
chosen since it is absolutely critical to maintain consensus.
Separating the changes into several stages makes it easier for reviewers
to logically follow what is happening and therefore helps prevent
consensus bugs. Naturally there are significant automated tests to help
prevent consensus issues as well.
The main focus of this stage is to convert the blockchain package to use
the new database interface and implement the chain-related functionality
which it no longer handles. It also aims to improve efficiency in
various areas by making use of the new database and chain capabilities.
The following is an overview of the chain changes:
- Update to use the new database interface
- Add chain-related functionality that the old database used to handle
- Main chain structure and state
- Transaction spend tracking
- Implement a new pruned unspent transaction output (utxo) set
- Provides efficient direct access to the unspent transaction outputs
- Uses a domain specific compression algorithm that understands the
standard transaction scripts in order to significantly compress them
- Removes reliance on the transaction index and paves the way toward
eventually enabling block pruning
- Modify the New function to accept a Config struct instead of
inidividual parameters
- Replace the old TxStore type with a new UtxoViewpoint type that makes
use of the new pruned utxo set
- Convert code to treat the new UtxoViewpoint as a rolling view that is
used between connects and disconnects to improve efficiency
- Make best chain state always set when the chain instance is created
- Remove now unnecessary logic for dealing with unset best state
- Make all exported functions concurrent safe
- Currently using a single chain state lock as it provides a straight
forward and easy to review path forward however this can be improved
with more fine grained locking
- Optimize various cases where full blocks were being loaded when only
the header is needed to help reduce the I/O load
- Add the ability for callers to get a snapshot of the current best
chain stats in a concurrent safe fashion
- Does not block callers while new blocks are being processed
- Make error messages that reference transaction outputs consistently
use <transaction hash>:<output index>
- Introduce a new AssertError type an convert internal consistency
checks to use it
- Update tests and examples to reflect the changes
- Add a full suite of tests to ensure correct functionality of the new
code
The following is an overview of the btcd changes:
- Update to use the new database and chain interfaces
- Temporarily remove all code related to the transaction index
- Temporarily remove all code related to the address index
- Convert all code that uses transaction stores to use the new utxo
view
- Rework several calls that required the block manager for safe
concurrency to use the chain package directly now that it is
concurrent safe
- Change all calls to obtain the best hash to use the new best state
snapshot capability from the chain package
- Remove workaround for limits on fetching height ranges since the new
database interface no longer imposes them
- Correct the gettxout RPC handler to return the best chain hash as
opposed the hash the txout was found in
- Optimize various RPC handlers:
- Change several of the RPC handlers to use the new chain snapshot
capability to avoid needlessly loading data
- Update several handlers to use new functionality to avoid accessing
the block manager so they are able to return the data without
blocking when the server is busy processing blocks
- Update non-verbose getblock to avoid deserialization and
serialization overhead
- Update getblockheader to request the block height directly from
chain and only load the header
- Update getdifficulty to use the new cached data from chain
- Update getmininginfo to use the new cached data from chain
- Update non-verbose getrawtransaction to avoid deserialization and
serialization overhead
- Update gettxout to use the new utxo store versus loading
full transactions using the transaction index
The following is an overview of the utility changes:
- Update addblock to use the new database and chain interfaces
- Update findcheckpoint to use the new database and chain interfaces
- Remove the dropafter utility which is no longer supported
NOTE: The transaction index and address index will be reimplemented in
another commit.
Introduce an ECDSA signature verification into btcd in order to
mitigate a certain DoS attack and as a performance optimization.
The benefits of SigCache are two fold. Firstly, usage of SigCache
mitigates a DoS attack wherein an attacker causes a victim's client to
hang due to worst-case behavior triggered while processing attacker
crafted invalid transactions. A detailed description of the mitigated
DoS attack can be found here: https://bitslog.wordpress.com/2013/01/23/fixed-bitcoin-vulnerability-explanation-why-the-signature-cache-is-a-dos-protection/
Secondly, usage of the SigCache introduces a signature verification
optimization which speeds up the validation of transactions within a
block, if they've already been seen and verified within the mempool.
The server itself manages the sigCache instance. The blockManager and
txMempool respectively now receive pointers to the created sigCache
instance. All read (sig triplet existence) operations on the sigCache
will not block unless a separate goroutine is adding an entry (writing)
to the sigCache. GetBlockTemplate generation now also utilizes the
sigCache in order to avoid unnecessarily double checking signatures
when generating a template after previously accepting a txn to the
mempool. Consequently, the CPU miner now also employs the same
optimization.
The maximum number of entries for the sigCache has been introduced as a
config parameter in order to allow users to configure the amount of
memory consumed by this new additional caching.
This commit converts all block height references to int32 instead of
int64. The current target block production rate is 10 mins per block
which means it will take roughly 40,800 years to reach the maximum
height an int32 affords. Even if the target rate were lowered to one
block per minute, it would still take roughly another 4,080 years to
reach the maximum.
In the mean time, there is no reason to use a larger type which results
in higher memory and disk space usage. However, for now, in order to
avoid having to reserialize a bunch of database information, the heights
are still serialized to the database as 8-byte uint64s.
This is being mainly being done in preparation for further upcoming
infrastructure changes which will use the smaller and more efficient
4-byte serialization in the database as well.
This change moves IsFinalizedTransaction to txscript and also changes
the first argument to take a wire.MsgTx instead of btcutil.Tx. This
is needed for an upcoming diff in which txscript will require
IsFinalizedTransaction and we do not want to import the btcd/blockchain.
This commit refactors the consensus rule checks for block headers and
blocks in the blockchain package into separate functions. These changes
contain no modifications to consensus rules and the code still passes all
block consensus tests. It is only a refactoring.
This is being done to help pave the way toward supporting concurrent
downloads. While the package already supports headers-first mode up
through the latest checkpoint through the use of the BFFastAdd flag and
hard-coded checkpoints, it currently only works when downloading from a
single peer. In order to support concurrent downloads from multiple
peers, the ability for the caller to do things such as independently
checking a block header (both context-free and full-context checks) will
be needed.
There are several more changes that will be necessary to support
concurrent downloads as well, such as making the package concurrent safe,
modifying it to make use of the new database API, etc. Those changes are
planned for future commits.
In order to avoid prior situations of stalled syncs due to
outdated peer height data, we now update block heights up peers in
real-time as we learn of their announced
blocks.
Updates happen when:
* A peer sends us an orphan block. We update based on
the height embedded in the scriptSig for the coinbase tx
* When a peer sends us an inv for a block we already know
of
* When peers announce new blocks. Subsequent
announcements that lost the announcement race are
recognized and peer heights are updated accordingly
Additionally, the `getpeerinfo` command has been modified
to include both the starting height, and current height of
connected peers.
Docs have been updated with `getpeerinfo` extension.