Commit graph

9 commits

Author SHA1 Message Date
Olaoluwa Osuntokun
a26e2634fa
blockchain/indexers: update indexing to use stxos instead of utxo view for blocks
In this commit, we update all the indexers to use the stxo set for a
particular block rather than the utxo view for the block. We do this as
we can eliminate a large number of random reads for each block, and can
instead deserialize a single instance of all the outputs spent in that
block and feed in the prev input scripts to each indexer.
2018-05-30 20:47:40 -07:00
Dave Collins
a59ac5b18f
multi: Rework utxoset/view to use outpoints.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level.  This was inspired by similar recent
changes in Bitcoin Core.

The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.

The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.

However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.

While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal.  The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either.  Consequently, this also removes the version field of the
gettxout RPC result.

The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.

Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience.  The data has already been converted and updated in the
commit.

An overview of the changes are as follows:

- Remove transaction version from both spent and unspent output entries
  - Update utxo serialization format to exclude the version
  - Modify the spend journal serialization format
    - The old version field is now reserved and always stores zero and
      ignores it when reading
    - This allows old entries to be used by new code without having to
      migrate the entire spend journal
  - Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
  transaction with all remaining utxos
  - Optimize for memory usage with an eye towards a utxo cache
    - Combine details such as whether the txout was contained in a
      coinbase, is spent, and is modified into a single packed field of
      bit flags
    - Align entry fields to eliminate extra padding since ultimately
      there will be a lot of these in memory
    - Introduce a free list for serializing an outpoint to the database
      key format to significantly reduce pressure on the GC
  - Update all related functions that previously dealt with transaction
    hashes to accept outpoints instead
  - Update all callers accordingly
  - Only add individually requested outputs from the mempool when
    constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
  information with every spent txout
  - Introduce code to handle fetching the missing information from
    another utxo from the same transaction in the event an old style
    entry is encountered
    - Make use of a database cursor with seek to do this much more
      efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
  only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
  - Store versions of the utxoset and spend journal buckets
  - Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
  longer apply, and add new ones for the new expected behavior
  - Convert old tests for the legacy utxo format deserialization code to
    test the new function that is used during upgrade
  - Update the utxostore test data and add function that was used to
    convert it
- Introduce a few new functions on UtxoViewpoint
  - AddTxOut for adding an individual txout versus all of them
  - addTxOut to handle the common code between the new AddTxOut and
    existing AddTxOuts
  - RemoveEntry for removing an individual txout
  - fetchEntryByHash for fetching any remaining utxo for a given
    transaction hash
2018-05-27 03:07:41 -05:00
Nicola 'tekNico' Larosa
11fcd83963 btcd/multi: fix a number of typos in comments. 2018-01-25 23:23:59 -06:00
Dave Collins
8c883d1fca
blockchain/indexers: Allow interrupts.
This propagates the interrupt channel through to blockchain and the
indexers so that it is possible to interrupt long-running operations
such as catching up indexes.
2017-09-05 11:02:46 -05:00
Olaoluwa Osuntokun
137aabd631 blockchain/indexers: add addrindex support for p2wkh and p2wsh 2017-08-13 23:17:40 -05:00
Dave Collins
14b51fc5f8
multi: Correct misspellings detected by misspell. 2016-10-28 09:43:38 -05:00
Dave Collins
bd4e64d1d4 chainhash: Abstract hash logic to new package. (#729)
This is mostly a backport of some of the same modifications made in
Decred along with a few additional things cleaned up.  In particular,
this updates the code to make use of the new chainhash package.

Also, since this required API changes anyways and the hash algorithm is
no longer tied specifically to SHA, all other functions throughout the
code base which had "Sha" in their name have been changed to Hash so
they are not incorrectly implying the hash algorithm.

The following is an overview of the changes:

- Remove the wire.ShaHash type
- Update all references to wire.ShaHash to the new chainhash.Hash type
- Rename the following functions and update all references:
  - wire.BlockHeader.BlockSha -> BlockHash
  - wire.MsgBlock.BlockSha -> BlockHash
  - wire.MsgBlock.TxShas -> TxHashes
  - wire.MsgTx.TxSha -> TxHash
  - blockchain.ShaHashToBig -> HashToBig
  - peer.ShaFunc -> peer.HashFunc
- Rename all variables that included sha in their name to include hash
  instead
- Update for function name changes in other dependent packages such as
  btcutil
- Update copyright dates on all modified files
- Update glide.lock file to use the required version of btcutil
2016-08-08 14:04:33 -05:00
Dave Collins
b580cdb7d3 database: Replace with new version.
This commit removes the old database package, moves the new package into
its place, and updates all imports accordingly.
2016-04-12 14:55:15 -05:00
Dave Collins
7c174620f7 indexers: Implement optional tx/address indexes.
This introduces a new indexing infrastructure for supporting optional
indexes using the new database and blockchain infrastructure along with
two concrete indexer implementations which provide both a
transaction-by-hash and a transaction-by-address index.

The new infrastructure is mostly separated into a package named indexers
which is housed under the blockchain package.  In order to support this,
a new interface named IndexManager has been introduced in the blockchain
package which provides methods to be notified when the chain has been
initialized and when blocks are connected and disconnected from the main
chain.  A concrete implementation of an index manager is provided by the
new indexers package.

The new indexers package also provides a new interface named Indexer
which allows the index manager to manage concrete index implementations
which conform to the interface.

The following is high level overview of the main index infrastructure
changes:

- Define a new IndexManager interface in the blockchain package and
  modify the package to make use of the interface when specified
- Create a new indexers package
  - Provides an Index interface which allows concrete indexes to plugin
    to an index manager
  - Provides a concrete IndexManager implementation
    - Handles the lifecycle of all indexes it manages
    - Tracks the index tips
    - Handles catching up disabled indexes that have been reenabled
    - Handles reorgs while the index was disabled
    - Invokes the appropriate methods for all managed indexes to allow
      them to index and deindex the blocks and transactions
  - Implement a transaction-by-hash index
    - Makes use of internal block IDs to save a significant amount of
      space and indexing costs over the old transaction index format
  - Implement a transaction-by-address index
    - Makes use of a leveling scheme in order to provide a good tradeoff
      between space required and indexing costs
- Supports enabling and disabling indexes at will
- Support the ability to drop indexes if they are no longer desired

The following is an overview of the btcd changes:

- Add a new index logging subsystem
- Add new options --txindex and --addrindex in order to enable the
  optional indexes
  - NOTE: The transaction index will automatically be enabled when the
    address index is enabled because it depends on it
- Add new options --droptxindex and --dropaddrindex to allow the indexes
  to be removed
  - NOTE: The address index will also be removed when the transaction
    index is dropped because it depends on it
- Update getrawtransactions RPC to make use of the transaction index
- Reimplement the searchrawtransaction RPC that makes use of the address
  index
- Update sample-btcd.conf to include sample usage for the new optional
  index flags
2016-04-11 17:16:42 -05:00