In this commit, we publicly export the spentTxOut struct and all its
attributes. This is the first in a set of commits to optimize the
existing address index by using the spend journal rather than manually
re-creating the utxoViewPoint each time.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level. This was inspired by similar recent
changes in Bitcoin Core.
The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.
The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.
However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.
While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal. The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either. Consequently, this also removes the version field of the
gettxout RPC result.
The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.
Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience. The data has already been converted and updated in the
commit.
An overview of the changes are as follows:
- Remove transaction version from both spent and unspent output entries
- Update utxo serialization format to exclude the version
- Modify the spend journal serialization format
- The old version field is now reserved and always stores zero and
ignores it when reading
- This allows old entries to be used by new code without having to
migrate the entire spend journal
- Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
transaction with all remaining utxos
- Optimize for memory usage with an eye towards a utxo cache
- Combine details such as whether the txout was contained in a
coinbase, is spent, and is modified into a single packed field of
bit flags
- Align entry fields to eliminate extra padding since ultimately
there will be a lot of these in memory
- Introduce a free list for serializing an outpoint to the database
key format to significantly reduce pressure on the GC
- Update all related functions that previously dealt with transaction
hashes to accept outpoints instead
- Update all callers accordingly
- Only add individually requested outputs from the mempool when
constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
information with every spent txout
- Introduce code to handle fetching the missing information from
another utxo from the same transaction in the event an old style
entry is encountered
- Make use of a database cursor with seek to do this much more
efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
- Store versions of the utxoset and spend journal buckets
- Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
longer apply, and add new ones for the new expected behavior
- Convert old tests for the legacy utxo format deserialization code to
test the new function that is used during upgrade
- Update the utxostore test data and add function that was used to
convert it
- Introduce a few new functions on UtxoViewpoint
- AddTxOut for adding an individual txout versus all of them
- addTxOut to handle the common code between the new AddTxOut and
existing AddTxOuts
- RemoveEntry for removing an individual txout
- fetchEntryByHash for fetching any remaining utxo for a given
transaction hash
This corrects the assertion in the decodeSpentTxOut function so it does
not improperly cause a panic when unwinding transactions during a reorg
under certain circumstances. In particular, the provided transaction
version that is passed when a stxo entry does not exist is now -1 in
order to properly distinguish it from the zero value.
It also updates the tests accordingly.
This was discovered by the reorg on testnet from block
00000000000018c58c2d2816f03dac327d975a18af6edf1a369df67ecddaf816 to
0000000000001c1161a367156465cc6226e9f862d9c585f94db5779fdf5455ff.
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up. In particular,
this updates the code to make use of the new chainhash package.
Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.
The following is an overview of the changes:
- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
- wire.BlockHeader.BlockSha -> BlockHash
- wire.MsgBlock.BlockSha -> BlockHash
- wire.MsgBlock.TxShas -> TxHashes
- wire.MsgTx.TxSha -> TxHash
- blockchain.ShaHashToBig -> HashToBig
- peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
instead
- Update for function name changes in other dependent packages such as
btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
This commit is the first stage of several that are planned to convert
the blockchain package into a concurrent safe package that will
ultimately allow support for multi-peer download and concurrent chain
processing. The goal is to update btcd proper after each step so it can
take advantage of the enhancements as they are developed.
In addition to the aforementioned benefit, this staged approach has been
chosen since it is absolutely critical to maintain consensus.
Separating the changes into several stages makes it easier for reviewers
to logically follow what is happening and therefore helps prevent
consensus bugs. Naturally there are significant automated tests to help
prevent consensus issues as well.
The main focus of this stage is to convert the blockchain package to use
the new database interface and implement the chain-related functionality
which it no longer handles. It also aims to improve efficiency in
various areas by making use of the new database and chain capabilities.
The following is an overview of the chain changes:
- Update to use the new database interface
- Add chain-related functionality that the old database used to handle
- Main chain structure and state
- Transaction spend tracking
- Implement a new pruned unspent transaction output (utxo) set
- Provides efficient direct access to the unspent transaction outputs
- Uses a domain specific compression algorithm that understands the
standard transaction scripts in order to significantly compress them
- Removes reliance on the transaction index and paves the way toward
eventually enabling block pruning
- Modify the New function to accept a Config struct instead of
inidividual parameters
- Replace the old TxStore type with a new UtxoViewpoint type that makes
use of the new pruned utxo set
- Convert code to treat the new UtxoViewpoint as a rolling view that is
used between connects and disconnects to improve efficiency
- Make best chain state always set when the chain instance is created
- Remove now unnecessary logic for dealing with unset best state
- Make all exported functions concurrent safe
- Currently using a single chain state lock as it provides a straight
forward and easy to review path forward however this can be improved
with more fine grained locking
- Optimize various cases where full blocks were being loaded when only
the header is needed to help reduce the I/O load
- Add the ability for callers to get a snapshot of the current best
chain stats in a concurrent safe fashion
- Does not block callers while new blocks are being processed
- Make error messages that reference transaction outputs consistently
use <transaction hash>:<output index>
- Introduce a new AssertError type an convert internal consistency
checks to use it
- Update tests and examples to reflect the changes
- Add a full suite of tests to ensure correct functionality of the new
code
The following is an overview of the btcd changes:
- Update to use the new database and chain interfaces
- Temporarily remove all code related to the transaction index
- Temporarily remove all code related to the address index
- Convert all code that uses transaction stores to use the new utxo
view
- Rework several calls that required the block manager for safe
concurrency to use the chain package directly now that it is
concurrent safe
- Change all calls to obtain the best hash to use the new best state
snapshot capability from the chain package
- Remove workaround for limits on fetching height ranges since the new
database interface no longer imposes them
- Correct the gettxout RPC handler to return the best chain hash as
opposed the hash the txout was found in
- Optimize various RPC handlers:
- Change several of the RPC handlers to use the new chain snapshot
capability to avoid needlessly loading data
- Update several handlers to use new functionality to avoid accessing
the block manager so they are able to return the data without
blocking when the server is busy processing blocks
- Update non-verbose getblock to avoid deserialization and
serialization overhead
- Update getblockheader to request the block height directly from
chain and only load the header
- Update getdifficulty to use the new cached data from chain
- Update getmininginfo to use the new cached data from chain
- Update non-verbose getrawtransaction to avoid deserialization and
serialization overhead
- Update gettxout to use the new utxo store versus loading
full transactions using the transaction index
The following is an overview of the utility changes:
- Update addblock to use the new database and chain interfaces
- Update findcheckpoint to use the new database and chain interfaces
- Remove the dropafter utility which is no longer supported
NOTE: The transaction index and address index will be reimplemented in
another commit.