Commit graph

31 commits

Author SHA1 Message Date
Mikael Lindlof b11bf582c5 Improve chain state init efficiency
Remove unnecessary slice of all block indexes and
remove DB iteration over all block indexes that
used to determined the size of the slice.
2020-06-08 12:53:00 -04:00
Olaoluwa Osuntokun 4b13e79691
blockchain: in initChainState mark ancestors of best tip as valid if not marked
In this commit, we add an additional consistency check within the
`initChainState` method.  It has been observed that at times, a block
wil lbe written to disk (as it's valid), but then the block index isn't
updated to reflect this. This can cause btcd to fail to do things like
serve cfheaders for valid blocks.

To partially remedy this, when we're loading in the index, we assume
that all ancestors of the current chain tip are valid, and mark them as
such. At the very end, we'll flush the index to ensure the state is
fully consistent. Typically this will be a noop, as only dirty elements
are flushed.
2018-08-08 17:50:00 -07:00
Olaoluwa Osuntokun e4d82bd6e2 blockchain: publicly export spentTxOut and all attributes
In this commit, we publicly export the spentTxOut struct and all its
attributes. This is the first in a set of commits to optimize the
existing address index by using the spend journal rather than manually
re-creating the utxoViewPoint each time.
2018-05-30 20:46:12 -07:00
Olaoluwa Osuntokun 4bd5b1a43a blockchain: add new FetchSpendJournal method 2018-05-30 20:46:12 -07:00
Dave Collins a59ac5b18f
multi: Rework utxoset/view to use outpoints.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level.  This was inspired by similar recent
changes in Bitcoin Core.

The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.

The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.

However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.

While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal.  The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either.  Consequently, this also removes the version field of the
gettxout RPC result.

The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.

Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience.  The data has already been converted and updated in the
commit.

An overview of the changes are as follows:

- Remove transaction version from both spent and unspent output entries
  - Update utxo serialization format to exclude the version
  - Modify the spend journal serialization format
    - The old version field is now reserved and always stores zero and
      ignores it when reading
    - This allows old entries to be used by new code without having to
      migrate the entire spend journal
  - Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
  transaction with all remaining utxos
  - Optimize for memory usage with an eye towards a utxo cache
    - Combine details such as whether the txout was contained in a
      coinbase, is spent, and is modified into a single packed field of
      bit flags
    - Align entry fields to eliminate extra padding since ultimately
      there will be a lot of these in memory
    - Introduce a free list for serializing an outpoint to the database
      key format to significantly reduce pressure on the GC
  - Update all related functions that previously dealt with transaction
    hashes to accept outpoints instead
  - Update all callers accordingly
  - Only add individually requested outputs from the mempool when
    constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
  information with every spent txout
  - Introduce code to handle fetching the missing information from
    another utxo from the same transaction in the event an old style
    entry is encountered
    - Make use of a database cursor with seek to do this much more
      efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
  only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
  - Store versions of the utxoset and spend journal buckets
  - Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
  longer apply, and add new ones for the new expected behavior
  - Convert old tests for the legacy utxo format deserialization code to
    test the new function that is used during upgrade
  - Update the utxostore test data and add function that was used to
    convert it
- Introduce a few new functions on UtxoViewpoint
  - AddTxOut for adding an individual txout versus all of them
  - addTxOut to handle the common code between the new AddTxOut and
    existing AddTxOuts
  - RemoveEntry for removing an individual txout
  - fetchEntryByHash for fetching any remaining utxo for a given
    transaction hash
2018-05-27 03:07:41 -05:00
Jim Posen 0ef0d8c59b blockchain: Initialize database with genesis block.
This fixes a bug introduced by #1066.
2018-02-09 11:34:30 -05:00
Jim Posen 52cddc19cd blockchain: Persist block status changes to disk.
The block index now tracks the set of dirty block nodes with status
changes that haven't been persisted and flushes the changes to the DB
at the appropriate times.
2018-01-28 23:34:56 -06:00
Jim Posen 31444f5890 blockchain: Add parent to blockNode constructor. 2018-01-28 23:34:56 -06:00
Jim Posen a4d22d8384 blockchain: Delete block headers from old block index bucket.
As part of the migration to the new block index bucket, also delete
the redundant data from the old bucket.
2018-01-28 23:34:56 -06:00
Jim Posen 74fb6e56da blockchain: Database migration to populate block index bucket.
This creates a migration function that populates the block index
bucket using data from the ffldb block index bucket if it does not
exist.
2018-01-28 23:34:56 -06:00
Jim Posen 6315cea70c blockchain: Load all block headers into block index on init.
Currently only the blocks in the active chain are loaded into the
block index on initialization. This instead iterates over the entire
block index bucket in LevelDB and loads all nodes.
2018-01-28 23:34:56 -06:00
Jim Posen 175fd940bb blockchain: Store block headers in bucket managed by chainio.
The bucket contains block headers keyed by the block height encoded as
big-endian concatenated with the block hash. This allows block headers
to be fetched from the DB in height order with a cursor.
2018-01-28 23:34:56 -06:00
Jim Posen c7588cbf76 blockchain: NodeStatus & Set/UnsetStatusFlags methods on blockIndex.
These method allows safe concurrent access to reading and modifying
block node statuses. When block statuses get persisted in a later
change, the setter methods can be used to mark block nodes as dirty.
2017-10-23 04:33:15 -05:00
Jim Posen e1ef2f899b blockchain: Track block validation status in block index.
Each node in the block index records some flags about its validation
state. This is just stored in memory for now, but can save effort if
attempting to reconnect a block that failed validation or was
disconnected.
2017-10-23 04:33:15 -05:00
Dave Collins 20910511e9
blockchain: Refactor to use new chain view.
- Remove inMainChain from block nodes since that can now be efficiently
  determined by using the chain view
- Track the best chain via a chain view instead of a single block node
  - Use the tip of the best chain view everywhere bestNode was used
  - Update chain view tip instead of updating best node
- Change reorg logic to use more efficient chain view fork finding logic
- Change block locator code over to use more efficient chain view logic
  - Remove now unused block-index-based block locator code
  - Move BlockLocator definition to chain.go
  - Move BlockLocatorFromHash and LatestBlockLocator to chain.go
    - Update both to use more efficient chain view logic
- Rework IsCheckpointCandidate to use block index and chain view
- Optimize MainChainHasBlock to use chain view instead of hitting db
  - Move to chain.go since it no longer involves database I/O
  - Removed error return since it can no longer fail
- Optimize BlockHeightByHash to use chain view instead of hitting db
  - Move to chain.go since it no longer involves database I/O
  - Removed error return since it can no longer fail
- Optimize BlockHashByHeight to use chain view instead of hitting db
  - Move to chain.go since it no longer involves database I/O
  - Removed error return since it can no longer fail
- Optimize HeightRange to use chain view instead of hitting db
  - Move to chain.go since it no longer involves database I/O
- Optimize BlockByHeight to use chain view for main chain check
- Optimize BlockByHash to use chain view for main chain check
2017-08-23 23:43:37 -05:00
Dave Collins 349450379f
blockchain: Remove threshold state db cache.
This completely removes the threshold state database caching code since
it can very quickly be calculated at startup now that the entire block
index is loaded first.
2017-08-15 16:55:40 -05:00
Dave Collins 296fa0a5a0
blockchain: Convert to full block index in mem.
This reworks the block index code such that it loads all of the headers
in the main chain at startup and constructs the full block index
accordingly.

Since the full index from the current best tip all the way back to the
genesis block is now guaranteed to be in memory, this also removes all
code related to dynamically loading the nodes and updates some of the
logic to take advantage of the fact traversing the block index can
longer potentially fail.  There are also more optimizations and
simplifications that can be made in the future as a result of this.

Due to removing all of the extra overhead of tracking the dynamic state,
and ensuring the block node structs are aligned to eliminate extra
padding, the end result of a fully populated block index now takes quite
a bit less memory than the previous dynamically loaded version.

The main downside is that it now takes a while to start whereas it was
nearly instant before, however, it is much better to provide more
efficient runtime operation since that is its ultimate purpose and the
benefits far outweigh this downside.

Some benefits are:

- Since every block node is in memory, the recent code which
  reconstructs headers from block nodes means that all headers can
  always be served from memory which is important since the majority of
  the network has moved to header-based semantics
- Several of the error paths can be removed since they are no longer
  necessary
- It is no longer expensive to calculate CSV sequence locks or median
  times of blocks way in the past
- It will be possible to create much more efficient iteration and
  simplified views of the overall index
- The entire threshold state database cache can be removed since it is
  cheap to construct it from the full block index as needed

An overview of the logic changes are as follows:

- Move AncestorNode from blockIndex to blockNode and greatly simplify
  since it no longer has to deal with the possibility of dynamically
  loading nodes and related failures
- Rename RelativeNode to RelativeAncestor, move to blockNode, and
  redefine in terms of AncestorNode
- Move CalcPastMedianTime from blockIndex to blockNode and remove no
  longer necessary test for nil
- Change calcSequenceLock to use Ancestor instead of RelativeAncestor
  since it reads more clearly
2017-08-15 15:42:34 -05:00
Olaoluwa Osuntokun d0768abcc4 BIP0141+blockchain: implement segwit block validation rules
This commit implements the new block validation rules as defined by
BIP0141. The new rules include the constraints that if a block has
transactions with witness data, then there MUST be a commitment within
the conies transaction to the root of a new merkle tree which commits
to the wtxid of all transactions. Additionally, rather than limiting
the size of a block by size in bytes, blocks are now limited by their
total weight unit. Similarly, a newly define “sig op cost” is now used
to limit the signature validation cost of transactions found within
blocks.
2017-08-13 23:17:40 -05:00
Dave Collins 9f0e66ee3f
blockchain: Correct invalid assertion.
This corrects the assertion in the decodeSpentTxOut function so it does
not improperly cause a panic when unwinding transactions during a reorg
under certain circumstances.  In particular, the provided transaction
version that is passed when a stxo entry does not exist is now -1 in
order to properly distinguish it from the zero value.

It also updates the tests accordingly.

This was discovered by the reorg on testnet from block
00000000000018c58c2d2816f03dac327d975a18af6edf1a369df67ecddaf816 to
0000000000001c1161a367156465cc6226e9f862d9c585f94db5779fdf5455ff.
2017-07-01 02:25:41 -05:00
Dave Collins d06c0bb181
blockchain: Use hash values in structs.
This modifies the blockNode and BestState structs in the blockchain
package to store hashes directly instead of pointers to them and updates
callers to deal with the API change in the exported BestState struct.

In general, the preferred approach for hashes moving forward is to store
hash values in complex data structures, particularly those that will be
used for cache entries, and accept pointers to hashes in arguments to
functions.

Some of the reasoning behind making this change is:

- It is generally preferred to avoid storing pointers to data in cache
  objects since doing so can easily lead to storing interior pointers
  into other structs that then can't be GC'd
- Keeping the hash values directly in the block node provides better
  cache locality
2017-02-03 11:36:33 -06:00
Dave Collins 1ecfea4928
blockchain: Refactor main block index logic.
This refactors the block index logic into a separate struct and
introduces an individual lock for it so it can be queried independent of
the chain lock.
2017-02-01 13:14:41 -06:00
Dave Collins 5ffd552214
blockchain: Use int64 timestamps in block nodes.
This modifies the block nodes used in the blockchain package for keeping
track of the block index to use int64 for the timestamps instead of
time.Time.

This is being done because a time.Time takes 24 bytes while an int64
only takes 8 and the plan is to eventually move the entire block index
into memory instead of the current dynamically-loaded version, so
cutting the number of bytes used for the timestamp by a third is highly
desirable.

Also, the consensus code requires working with unix-style timestamps
anyways, so switching over to them in the block node does not seem
unreasonable.

Finally, this does not go so far as to change all of the time.Time
references, particularly those that are in the public API, so it is
purely an internal change.
2017-01-31 17:51:29 -06:00
Dave Collins 712399c0db
blockchain: Correct startup threshold state warns.
The thresholdState and deploymentState functions expect the block node
for the block prior to which the threshold state is calculated, however
the startup code which checked the threshold states was using the
current best node instead of its parent.

While here, also update the comments and rename a couple of variables to
help make this fact more clear.
2016-12-03 13:42:26 -06:00
Dave Collins c440584efc
Implement infrastructure for BIP0009.
This commit adds all of the infrastructure needed to support BIP0009
soft forks.

The following is an overview of the changes:

- Add new configuration options to the chaincfg package which allows the
  rule deployments to be defined per chain
- Implement code to calculate the threshold state as required by BIP0009
  - Use threshold state caches that are stored to the database in order
    to accelerate startup time
  - Remove caches that are invalid due to definition changes in the
    params including additions, deletions, and changes to existing
    entries
- Detect and warn when a new unknown rule is about to activate or has
  been activated in the block connection code
- Detect and warn when 50% of the last 100 blocks have unexpected
  versions.
- Remove the latest block version from wire since it no longer applies
- Add a version parameter to the wire.NewBlockHeader function since the
  default is no longer available
- Update the miner block template generation code to use the calculated
  block version based on the currently defined rule deployments and
  their threshold states as of the previous block
- Add tests for new error type
- Add tests for threshold state cache
2016-11-30 13:51:59 -06:00
Dave Collins c57c18c8e2
blockchain: Add median time to state snapshot.
This adds a new field to the best chain state snapshot for the
calculated past median time as returned by the calcPastMedianTime
function.  This is useful since it provides fast access to it without
having to acquire the chain lock which is needed to recalculate it.

This will ultimately allow the associated exported function to be
removed since it only exists to be able to calculate this exact value,
however this commit only introduces the new field in order to keep the
changes minimal.
2016-08-23 01:29:11 -05:00
Dave Collins bd4e64d1d4 chainhash: Abstract hash logic to new package. (#729)
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up.  In particular,
this updates the code to make use of the new chainhash package.

Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.

The following is an overview of the changes:

- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
  - wire.BlockHeader.BlockSha -> BlockHash
  - wire.MsgBlock.BlockSha -> BlockHash
  - wire.MsgBlock.TxShas -> TxHashes
  - wire.MsgTx.TxSha -> TxHash
  - blockchain.ShaHashToBig -> HashToBig
  - peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
  instead
- Update for function name changes in other dependent packages such as
  btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
2016-08-08 14:04:33 -05:00
Dave Collins de4fb24389 blockchain: Remove unused root field. (#680)
This removes the root field and all references to it from the BlockChain
since it is no longer required.

It was previously required because the chain state was not initialized
when the instance was created.  However, that is no longer the case, so
there is no reason to keep it around any longer.
2016-04-25 16:17:29 -05:00
Dave Collins 5a1e77bd2d blockchain: Remove unneeded unspentness size check. (#665)
The current code is needlessly checking the number of bytes needed to
serialize the unspentness bitmap in the utxo against a maximum value
that could never be returned because the function takes a uint32 output
index which is treated as a bit offset, and converts it bytes, which
will necessarily be less than a max uint32.

This check also causes a compile error on arm where native integers are
32 bits.

This simply removes the unneeded check.
2016-04-13 21:43:16 -05:00
Dave Collins b580cdb7d3 database: Replace with new version.
This commit removes the old database package, moves the new package into
its place, and updates all imports accordingly.
2016-04-12 14:55:15 -05:00
Dave Collins 7c174620f7 indexers: Implement optional tx/address indexes.
This introduces a new indexing infrastructure for supporting optional
indexes using the new database and blockchain infrastructure along with
two concrete indexer implementations which provide both a
transaction-by-hash and a transaction-by-address index.

The new infrastructure is mostly separated into a package named indexers
which is housed under the blockchain package.  In order to support this,
a new interface named IndexManager has been introduced in the blockchain
package which provides methods to be notified when the chain has been
initialized and when blocks are connected and disconnected from the main
chain.  A concrete implementation of an index manager is provided by the
new indexers package.

The new indexers package also provides a new interface named Indexer
which allows the index manager to manage concrete index implementations
which conform to the interface.

The following is high level overview of the main index infrastructure
changes:

- Define a new IndexManager interface in the blockchain package and
  modify the package to make use of the interface when specified
- Create a new indexers package
  - Provides an Index interface which allows concrete indexes to plugin
    to an index manager
  - Provides a concrete IndexManager implementation
    - Handles the lifecycle of all indexes it manages
    - Tracks the index tips
    - Handles catching up disabled indexes that have been reenabled
    - Handles reorgs while the index was disabled
    - Invokes the appropriate methods for all managed indexes to allow
      them to index and deindex the blocks and transactions
  - Implement a transaction-by-hash index
    - Makes use of internal block IDs to save a significant amount of
      space and indexing costs over the old transaction index format
  - Implement a transaction-by-address index
    - Makes use of a leveling scheme in order to provide a good tradeoff
      between space required and indexing costs
- Supports enabling and disabling indexes at will
- Support the ability to drop indexes if they are no longer desired

The following is an overview of the btcd changes:

- Add a new index logging subsystem
- Add new options --txindex and --addrindex in order to enable the
  optional indexes
  - NOTE: The transaction index will automatically be enabled when the
    address index is enabled because it depends on it
- Add new options --droptxindex and --dropaddrindex to allow the indexes
  to be removed
  - NOTE: The address index will also be removed when the transaction
    index is dropped because it depends on it
- Update getrawtransactions RPC to make use of the transaction index
- Reimplement the searchrawtransaction RPC that makes use of the address
  index
- Update sample-btcd.conf to include sample usage for the new optional
  index flags
2016-04-11 17:16:42 -05:00
Dave Collins 491acd4ca6 blockchain: Rework to use new db interface.
This commit is the first stage of several that are planned to convert
the blockchain package into a concurrent safe package that will
ultimately allow support for multi-peer download and concurrent chain
processing.  The goal is to update btcd proper after each step so it can
take advantage of the enhancements as they are developed.

In addition to the aforementioned benefit, this staged approach has been
chosen since it is absolutely critical to maintain consensus.
Separating the changes into several stages makes it easier for reviewers
to logically follow what is happening and therefore helps prevent
consensus bugs.  Naturally there are significant automated tests to help
prevent consensus issues as well.

The main focus of this stage is to convert the blockchain package to use
the new database interface and implement the chain-related functionality
which it no longer handles.  It also aims to improve efficiency in
various areas by making use of the new database and chain capabilities.

The following is an overview of the chain changes:

- Update to use the new database interface
- Add chain-related functionality that the old database used to handle
  - Main chain structure and state
  - Transaction spend tracking
- Implement a new pruned unspent transaction output (utxo) set
  - Provides efficient direct access to the unspent transaction outputs
  - Uses a domain specific compression algorithm that understands the
    standard transaction scripts in order to significantly compress them
  - Removes reliance on the transaction index and paves the way toward
    eventually enabling block pruning
- Modify the New function to accept a Config struct instead of
  inidividual parameters
- Replace the old TxStore type with a new UtxoViewpoint type that makes
  use of the new pruned utxo set
- Convert code to treat the new UtxoViewpoint as a rolling view that is
  used between connects and disconnects to improve efficiency
- Make best chain state always set when the chain instance is created
  - Remove now unnecessary logic for dealing with unset best state
- Make all exported functions concurrent safe
  - Currently using a single chain state lock as it provides a straight
    forward and easy to review path forward however this can be improved
    with more fine grained locking
- Optimize various cases where full blocks were being loaded when only
  the header is needed to help reduce the I/O load
- Add the ability for callers to get a snapshot of the current best
  chain stats in a concurrent safe fashion
  - Does not block callers while new blocks are being processed
- Make error messages that reference transaction outputs consistently
  use <transaction hash>:<output index>
- Introduce a new AssertError type an convert internal consistency
  checks to use it
- Update tests and examples to reflect the changes
- Add a full suite of tests to ensure correct functionality of the new
  code

The following is an overview of the btcd changes:

- Update to use the new database and chain interfaces
- Temporarily remove all code related to the transaction index
- Temporarily remove all code related to the address index
- Convert all code that uses transaction stores to use the new utxo
  view
- Rework several calls that required the block manager for safe
  concurrency to use the chain package directly now that it is
  concurrent safe
- Change all calls to obtain the best hash to use the new best state
  snapshot capability from the chain package
- Remove workaround for limits on fetching height ranges since the new
  database interface no longer imposes them
- Correct the gettxout RPC handler to return the best chain hash as
  opposed the hash the txout was found in
- Optimize various RPC handlers:
  - Change several of the RPC handlers to use the new chain snapshot
    capability to avoid needlessly loading data
  - Update several handlers to use new functionality to avoid accessing
    the block manager so they are able to return the data without
    blocking when the server is busy processing blocks
  - Update non-verbose getblock to avoid deserialization and
    serialization overhead
  - Update getblockheader to request the block height directly from
    chain and only load the header
  - Update getdifficulty to use the new cached data from chain
  - Update getmininginfo to use the new cached data from chain
  - Update non-verbose getrawtransaction to avoid deserialization and
    serialization overhead
  - Update gettxout to use the new utxo store versus loading
    full transactions using the transaction index

The following is an overview of the utility changes:
- Update addblock to use the new database and chain interfaces
- Update findcheckpoint to use the new database and chain interfaces
- Remove the dropafter utility which is no longer supported

NOTE: The transaction index and address index will be reimplemented in
another commit.
2016-04-11 16:47:27 -05:00