The status management of index does need some refactoring.
For now, we just manually clear the statusValid in every occurance
of statusValidateFailed being set.
Co-authored-by: Roy Lee <roylee17@gmail.com>
The current methods to add to a UtxoViewpoint don't allow for a situation where
we have only UTXO data but not a whole transaction. This commit allows
contstruction of a UtxoEntry without requiring a full MsgTx.
AddTxOut() and AddTxOuts() both require a whole transaction, including the inputs,
which are only used in order to calculate the txid. In some situations, such as
with use of the utreexo accumulator, we only have the utxo data but not the
transaction which created it.
For reference, utreexo's initial usage of the blockchain.NewUtxoEntry() function is at
https://github.com/mit-dci/utreexo/pull/135/files#diff-3f7b8f9991ea957f1f4ad9f5a95415f0R96
In this commit, we fix an existing bug in the re-org catch up logic for
the `IndexManager`. Before this commit, we would assign the block to the
_local_ scope rather than the outer scope. As a result, we would never
properly bisect the chain to find the fork point to be able to reconcile
the index state to the main chain only after a re-org occurs.
Fixes#1261
In this commit, we add an additional consistency check within the
`initChainState` method. It has been observed that at times, a block
wil lbe written to disk (as it's valid), but then the block index isn't
updated to reflect this. This can cause btcd to fail to do things like
serve cfheaders for valid blocks.
To partially remedy this, when we're loading in the index, we assume
that all ancestors of the current chain tip are valid, and mark them as
such. At the very end, we'll flush the index to ensure the state is
fully consistent. Typically this will be a noop, as only dirty elements
are flushed.
backport of https://github.com/decred/dcrd/pull/1273
Notable difference being that btcd mainline currenlty
doesn't have a blockchain/blockindex_test.go file, so
those changes are omitted.
Great work @davecgh :)
This generalizes the reorganizeChain function in the blockchain package
to allow it to rewind the chain in the case no attach nodes are provided
or extend the chain in the case no detach nodes are provided.
It also adds several assertions to ensure the assumptions about the
state hold and cleans up the handling of setting invalid ancestor nodes
in the case of a failed block validation.
In this commit, we update all the indexers to use the stxo set for a
particular block rather than the utxo view for the block. We do this as
we can eliminate a large number of random reads for each block, and can
instead deserialize a single instance of all the outputs spent in that
block and feed in the prev input scripts to each indexer.
In this commit, we update the IndexManager interface to use spent txos
rather than the unspent output set for a particualr block. We do this in
order to improve the performance of the current address index which
requires reconstructing the utxo view from the PoV of that new block. In
practice, this is very slow as we need to perform a series of random
reads in order to reconstruct the utxo set. Instead, we can use the set
of SpentTxOut's for that block as this already contains the previous
output scripts which is what all of the current indexers really need.
In this commit, we publicly export the spentTxOut struct and all its
attributes. This is the first in a set of commits to optimize the
existing address index by using the spend journal rather than manually
re-creating the utxoViewPoint each time.
This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level. This was inspired by similar recent
changes in Bitcoin Core.
The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.
The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.
However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.
While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal. The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either. Consequently, this also removes the version field of the
gettxout RPC result.
The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.
Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience. The data has already been converted and updated in the
commit.
An overview of the changes are as follows:
- Remove transaction version from both spent and unspent output entries
- Update utxo serialization format to exclude the version
- Modify the spend journal serialization format
- The old version field is now reserved and always stores zero and
ignores it when reading
- This allows old entries to be used by new code without having to
migrate the entire spend journal
- Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
transaction with all remaining utxos
- Optimize for memory usage with an eye towards a utxo cache
- Combine details such as whether the txout was contained in a
coinbase, is spent, and is modified into a single packed field of
bit flags
- Align entry fields to eliminate extra padding since ultimately
there will be a lot of these in memory
- Introduce a free list for serializing an outpoint to the database
key format to significantly reduce pressure on the GC
- Update all related functions that previously dealt with transaction
hashes to accept outpoints instead
- Update all callers accordingly
- Only add individually requested outputs from the mempool when
constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
information with every spent txout
- Introduce code to handle fetching the missing information from
another utxo from the same transaction in the event an old style
entry is encountered
- Make use of a database cursor with seek to do this much more
efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
- Store versions of the utxoset and spend journal buckets
- Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
longer apply, and add new ones for the new expected behavior
- Convert old tests for the legacy utxo format deserialization code to
test the new function that is used during upgrade
- Update the utxostore test data and add function that was used to
convert it
- Introduce a few new functions on UtxoViewpoint
- AddTxOut for adding an individual txout versus all of them
- addTxOut to handle the common code between the new AddTxOut and
existing AddTxOuts
- RemoveEntry for removing an individual txout
- fetchEntryByHash for fetching any remaining utxo for a given
transaction hash
The index will hold three types of entries for each filter type, block
pair: filter, header, and hash. Since they all have similar methods
and implementations, refactor to reduce duplication.