This reworks the block index code such that it loads all of the headers
in the main chain at startup and constructs the full block index
accordingly.
Since the full index from the current best tip all the way back to the
genesis block is now guaranteed to be in memory, this also removes all
code related to dynamically loading the nodes and updates some of the
logic to take advantage of the fact traversing the block index can
longer potentially fail. There are also more optimizations and
simplifications that can be made in the future as a result of this.
Due to removing all of the extra overhead of tracking the dynamic state,
and ensuring the block node structs are aligned to eliminate extra
padding, the end result of a fully populated block index now takes quite
a bit less memory than the previous dynamically loaded version.
The main downside is that it now takes a while to start whereas it was
nearly instant before, however, it is much better to provide more
efficient runtime operation since that is its ultimate purpose and the
benefits far outweigh this downside.
Some benefits are:
- Since every block node is in memory, the recent code which
reconstructs headers from block nodes means that all headers can
always be served from memory which is important since the majority of
the network has moved to header-based semantics
- Several of the error paths can be removed since they are no longer
necessary
- It is no longer expensive to calculate CSV sequence locks or median
times of blocks way in the past
- It will be possible to create much more efficient iteration and
simplified views of the overall index
- The entire threshold state database cache can be removed since it is
cheap to construct it from the full block index as needed
An overview of the logic changes are as follows:
- Move AncestorNode from blockIndex to blockNode and greatly simplify
since it no longer has to deal with the possibility of dynamically
loading nodes and related failures
- Rename RelativeNode to RelativeAncestor, move to blockNode, and
redefine in terms of AncestorNode
- Move CalcPastMedianTime from blockIndex to blockNode and remove no
longer necessary test for nil
- Change calcSequenceLock to use Ancestor instead of RelativeAncestor
since it reads more clearly
This modifies the blockNode and BestState structs in the blockchain
package to store hashes directly instead of pointers to them and updates
callers to deal with the API change in the exported BestState struct.
In general, the preferred approach for hashes moving forward is to store
hash values in complex data structures, particularly those that will be
used for cache entries, and accept pointers to hashes in arguments to
functions.
Some of the reasoning behind making this change is:
- It is generally preferred to avoid storing pointers to data in cache
objects since doing so can easily lead to storing interior pointers
into other structs that then can't be GC'd
- Keeping the hash values directly in the block node provides better
cache locality
This refactors the block index logic into a separate struct and
introduces an individual lock for it so it can be queried independent of
the chain lock.
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up. In particular,
this updates the code to make use of the new chainhash package.
Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.
The following is an overview of the changes:
- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
- wire.BlockHeader.BlockSha -> BlockHash
- wire.MsgBlock.BlockSha -> BlockHash
- wire.MsgBlock.TxShas -> TxHashes
- wire.MsgTx.TxSha -> TxHash
- blockchain.ShaHashToBig -> HashToBig
- peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
instead
- Update for function name changes in other dependent packages such as
btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
This commit is the first stage of several that are planned to convert
the blockchain package into a concurrent safe package that will
ultimately allow support for multi-peer download and concurrent chain
processing. The goal is to update btcd proper after each step so it can
take advantage of the enhancements as they are developed.
In addition to the aforementioned benefit, this staged approach has been
chosen since it is absolutely critical to maintain consensus.
Separating the changes into several stages makes it easier for reviewers
to logically follow what is happening and therefore helps prevent
consensus bugs. Naturally there are significant automated tests to help
prevent consensus issues as well.
The main focus of this stage is to convert the blockchain package to use
the new database interface and implement the chain-related functionality
which it no longer handles. It also aims to improve efficiency in
various areas by making use of the new database and chain capabilities.
The following is an overview of the chain changes:
- Update to use the new database interface
- Add chain-related functionality that the old database used to handle
- Main chain structure and state
- Transaction spend tracking
- Implement a new pruned unspent transaction output (utxo) set
- Provides efficient direct access to the unspent transaction outputs
- Uses a domain specific compression algorithm that understands the
standard transaction scripts in order to significantly compress them
- Removes reliance on the transaction index and paves the way toward
eventually enabling block pruning
- Modify the New function to accept a Config struct instead of
inidividual parameters
- Replace the old TxStore type with a new UtxoViewpoint type that makes
use of the new pruned utxo set
- Convert code to treat the new UtxoViewpoint as a rolling view that is
used between connects and disconnects to improve efficiency
- Make best chain state always set when the chain instance is created
- Remove now unnecessary logic for dealing with unset best state
- Make all exported functions concurrent safe
- Currently using a single chain state lock as it provides a straight
forward and easy to review path forward however this can be improved
with more fine grained locking
- Optimize various cases where full blocks were being loaded when only
the header is needed to help reduce the I/O load
- Add the ability for callers to get a snapshot of the current best
chain stats in a concurrent safe fashion
- Does not block callers while new blocks are being processed
- Make error messages that reference transaction outputs consistently
use <transaction hash>:<output index>
- Introduce a new AssertError type an convert internal consistency
checks to use it
- Update tests and examples to reflect the changes
- Add a full suite of tests to ensure correct functionality of the new
code
The following is an overview of the btcd changes:
- Update to use the new database and chain interfaces
- Temporarily remove all code related to the transaction index
- Temporarily remove all code related to the address index
- Convert all code that uses transaction stores to use the new utxo
view
- Rework several calls that required the block manager for safe
concurrency to use the chain package directly now that it is
concurrent safe
- Change all calls to obtain the best hash to use the new best state
snapshot capability from the chain package
- Remove workaround for limits on fetching height ranges since the new
database interface no longer imposes them
- Correct the gettxout RPC handler to return the best chain hash as
opposed the hash the txout was found in
- Optimize various RPC handlers:
- Change several of the RPC handlers to use the new chain snapshot
capability to avoid needlessly loading data
- Update several handlers to use new functionality to avoid accessing
the block manager so they are able to return the data without
blocking when the server is busy processing blocks
- Update non-verbose getblock to avoid deserialization and
serialization overhead
- Update getblockheader to request the block height directly from
chain and only load the header
- Update getdifficulty to use the new cached data from chain
- Update getmininginfo to use the new cached data from chain
- Update non-verbose getrawtransaction to avoid deserialization and
serialization overhead
- Update gettxout to use the new utxo store versus loading
full transactions using the transaction index
The following is an overview of the utility changes:
- Update addblock to use the new database and chain interfaces
- Update findcheckpoint to use the new database and chain interfaces
- Remove the dropafter utility which is no longer supported
NOTE: The transaction index and address index will be reimplemented in
another commit.
This commit converts all block height references to int32 instead of
int64. The current target block production rate is 10 mins per block
which means it will take roughly 40,800 years to reach the maximum
height an int32 affords. Even if the target rate were lowered to one
block per minute, it would still take roughly another 4,080 years to
reach the maximum.
In the mean time, there is no reason to use a larger type which results
in higher memory and disk space usage. However, for now, in order to
avoid having to reserialize a bunch of database information, the heights
are still serialized to the database as 8-byte uint64s.
This is being mainly being done in preparation for further upcoming
infrastructure changes which will use the smaller and more efficient
4-byte serialization in the database as well.
This commit contains the entire btcchain repository along with several
changes needed to move all of the files into the blockchain directory in
order to prepare it for merging. This does NOT update btcd or any of the
other packages to use the new location as that will be done separately.
- All import paths in the old btcchain test files have been changed to
the new location
- All references to btcchain as the package name have been changed to
blockchain