This modifies the utxoset in the database and related UtxoViewpoint to
store and work with unspent transaction outputs on a per-output basis
instead of at a transaction level. This was inspired by similar recent
changes in Bitcoin Core.
The primary motivation is to simplify the code, pave the way for a
utxo cache, and generally focus on optimizing runtime performance.
The tradeoff is that this approach does somewhat increase the size of
the serialized utxoset since it means that the transaction hash is
duplicated for each output as a part of the key and some additional
details such as whether the containing transaction is a coinbase and the
block height it was a part of are duplicated in each output.
However, in practice, the size difference isn't all that large, disk
space is relatively cheap, certainly cheaper than memory, and it is much
more important to provide more efficient runtime operation since that is
the ultimate purpose of the daemon.
While performing this conversion, it also simplifies the code to remove
the transaction version information from the utxoset as well as the
spend journal. The logic for only serializing it under certain
circumstances is complicated and it isn't actually used anywhere aside
from the gettxout RPC where it also isn't used by anything important
either. Consequently, this also removes the version field of the
gettxout RPC result.
The utxos in the database are automatically migrated to the new format
with this commit and it is possible to interrupt and resume the
migration process.
Finally, it also updates the tests for the new format and adds a new
function to the tests to convert the old test data to the new format for
convenience. The data has already been converted and updated in the
commit.
An overview of the changes are as follows:
- Remove transaction version from both spent and unspent output entries
- Update utxo serialization format to exclude the version
- Modify the spend journal serialization format
- The old version field is now reserved and always stores zero and
ignores it when reading
- This allows old entries to be used by new code without having to
migrate the entire spend journal
- Remove version field from gettxout RPC result
- Convert UtxoEntry to represent a specific utxo instead of a
transaction with all remaining utxos
- Optimize for memory usage with an eye towards a utxo cache
- Combine details such as whether the txout was contained in a
coinbase, is spent, and is modified into a single packed field of
bit flags
- Align entry fields to eliminate extra padding since ultimately
there will be a lot of these in memory
- Introduce a free list for serializing an outpoint to the database
key format to significantly reduce pressure on the GC
- Update all related functions that previously dealt with transaction
hashes to accept outpoints instead
- Update all callers accordingly
- Only add individually requested outputs from the mempool when
constructing a mempool view
- Modify the spend journal to always store the block height and coinbase
information with every spent txout
- Introduce code to handle fetching the missing information from
another utxo from the same transaction in the event an old style
entry is encountered
- Make use of a database cursor with seek to do this much more
efficiently than testing every possible output
- Always decompress data loaded from the database now that a utxo entry
only consists of a specific output
- Introduce upgrade code to migrate the utxo set to the new format
- Store versions of the utxoset and spend journal buckets
- Allow migration process to be interrupted and resumed
- Update all tests to expect the correct encodings, remove tests that no
longer apply, and add new ones for the new expected behavior
- Convert old tests for the legacy utxo format deserialization code to
test the new function that is used during upgrade
- Update the utxostore test data and add function that was used to
convert it
- Introduce a few new functions on UtxoViewpoint
- AddTxOut for adding an individual txout versus all of them
- addTxOut to handle the common code between the new AddTxOut and
existing AddTxOuts
- RemoveEntry for removing an individual txout
- fetchEntryByHash for fetching any remaining utxo for a given
transaction hash
new txs that it observes. The block manager alerts the fee estimator
of new and orphaned blocks.
Check for invalid state and recreate FeeEstimator if necessary.
This removes the standardness check to reject transactions with a lock
time greater than a maxint32 because the old bitcoind nodes which it was
designed to protect against are no longer valid for other reasons and
thus there are no longer any of them on the network to worry about.
This commit implements the new “weight” metric introduced as part of
the segwit soft-fork. Post-fork activation, rather than limiting the
size of blocks and transactions based purely on serialized size, a new
metric “weight” will instead be used as a way to more accurately
reflect the costs of a tx/block on the system. With blocks constrained
by weight, the maximum block-size increases to ~4MB.
This allows a caller-provided tag to be associated with orphan
transactions. This is useful since the caller can use the tag for
purposes such as keeping track of which peers orphans were first seen
from.
Also, since a parameter is required now anyways, it associates the peer
ID with processed transactions from remote peers.
This move the export for MinHighPriority from the mempool package to the
mining package. This should have been done when the priority
calculation code was moved to the mining package.
This commit adds a new option to the mempool’s policy configuration
which determines which transaction versions should be accepted as
standard.
The default version set by the policy within the server is 2; this
allows accepting transactions which have version 2 enabled in order to
utilize the new sequence locks feature.
This moves the priority-related code from the mempool package to the
mining package and also exports a new constant named UnminedHeight which
takes the place of the old unexported mempoolHeight.
Even though the mempool makes use of the priority code to make decisions
about what it will accept, priority really has to do with mining since
it influences which transactions will end up into a block. This change
also has the side effect of being a step towards enabling separation of
the mining code into its own package which, as previously mentioned,
needs access to the priority calculation code as well.
Finally, the mempoolHeight variable was poorly named since what it
really represents is a transaction that has not been mined into a block
yet. Renaming the variable to more accurately reflect its purpose makes
it clear that it belongs in the mining package which also needs the
definition now as well since the priority calculation code relies on it.
This will also benefit an outstanding PR which needs access to the same
value.
This implements orphan expiration in the mempool such that any orphans
that have not had their ancestors materialize within 15 minutes of their
initial arrival time will be evicted which in turn will remove any other
orphans that attempted to redeem it.
In order to perform the evictions with reasonable efficiency, an
opportunistic scan interval of 5 minutes is used. That is to say that
there is not a hard deadline on the scan interval and instead it runs
when a new orphan is added to the pool if enough time has passed.
The following is an example of running this code against the main
network for around 30 minutes:
23:05:34 2016-10-24 [DBG] TXMP: Expired 3 orphans (remaining: 254)
23:10:38 2016-10-24 [DBG] TXMP: Expired 112 orphans (remaining: 231)
23:15:43 2016-10-24 [DBG] TXMP: Expired 95 orphans (remaining: 206)
23:20:44 2016-10-24 [DBG] TXMP: Expired 90 orphans (remaining: 191)
23:25:51 2016-10-24 [DBG] TXMP: Expired 71 orphans (remaining: 191)
23:30:55 2016-10-24 [DBG] TXMP: Expired 70 orphans (remaining: 105)
23:36:19 2016-10-24 [DBG] TXMP: Expired 55 orphans (remaining: 107)
As can be seen from the above, without orphan expiration on this data
set, the orphan pool would have grown an additional 496 entries.
This modifies the way orphan removal and processing is done to more
aggressively remove orphans that can no longer be valid due to other
transactions being added or removed from the primary transaction pool.
The net effect of these changes is that orphan pool will typically be
much smaller which greatly improves its effectiveness. Previously, it
would typically quickly reach the max allowed worst-case usage and
effectively stay there forever.
The following is a summary of the changes:
- Modify the map that tracks which orphans redeem a given transaction to
instead track by the specific outpoints that are redeemed
- Modify the various orphan removal and processing functions to accept
the full transaction rather than just its hash
- Introduce a new flag on removeOrphans which specifies whether or not
to remove the transactions that redeem the orphan being removed as
well which is necessary since only some paths require it
- Add a new function named removeOrphanDoubleSpends that is invoked
whenever a transaction is added to the main pool and thus the outputs
they spent become concrete spends
- Introduce a new flag on maybeAcceptTransaction which specifies whether
or not duplicate orphans should be rejected since only some paths
require it
- Modify processOrphans as follows:
- Make use of the modified map
- Use newly available flags and logic work more strictly work with tx
chains
- Recursively remove any orphans that also redeem any outputs redeemed
by the accepted transactions
- Several new tests to ensure proper functionality
- Removing an orphan that doesn't exist is removed both when there is
another orphan that redeems it and when there is not
- Removing orphans works properly with orphan chains per the new
remove redeemers flag
- Removal of multi-input orphans that double spend an output when a
concrete redeemer enters the transaction pool
This optimizes the way in which the mempool oprhan map is limited in the
same way the server block manager maps were previously optimized.
Previously the code would read a cryptographically random value large
enough to construct a hash, find the first entry larger than that value,
and evict it.
That approach is quite inefficient and could easily become a
bottleneck when processing transactions due to the need to read from a
source such as /dev/urandom and all of the subsequent hash comparisons.
Luckily, strong cryptographic randomness is not needed here. The primary
intent of limiting the maps is to control memory usage with a secondary
concern of making it difficult for adversaries to force eviction of
specific entries.
Consequently, this changes the code to make use of the pseudorandom
iteration order of Go's maps along with the preimage resistance of the
hashing function to provide the desired functionality. It has
previously been discussed that the specific pseudorandom iteration order
is not guaranteed by the Go spec even though in practice that is how it
is implemented. This is not a concern however because even if the
specific compiler doesn't implement that, the preimage resistance of the
hashing function alone is enough.
The following is a before and after comparison of the function for both
speed and memory allocations:
benchmark old ns/op new ns/op delta
----------------------------------------------------------------
BenchmarkLimitNumOrphans 3727 243 -93.48%
benchmark old allocs new allocs delta
-----------------------------------------------------------------
BenchmarkLimitNumOrphans 4 0 -100.00%
This renames the mempool.Config.RelayNonStd option to AcceptNonStd which
more accurately describes its behavior since the mempool was refactored
into a separate package.
The reasoning for this change is that the mempool is not responsible for
relaying transactions (nor should it be). Its job is to maintain a pool
of unmined transactions that are validated according to consensus and
policy configuration options which are then used to provide a source of
transactions that need to be mined.
Instead, it is the server that is responsible for relaying transactions.
While it is true that the current server code currently only relays txns
that were accepted to the mempool, this does not necessarily have to
be the case. It would be entirely possible (and perhaps even a good
idea as something do in the future), to separate the relay policy from
the mempool acceptance policy (and thus indirectly the mining policy).
This coincides with the mempool only, policy change which enforces
transaction finality according to the median-time-past rather than
blockheader timestamps. The behavior is pre-cursor to full blown BIP
113 consensus deployment, and subsequent activation.
As a result, the TimeSource field in the mempoolConfig is no longer
needed so it has been removed. Additionally, checkTransactionStandard has been
modified to instead take a time.Time as the mempool is no longer explicitly
dependant on a Chain instance.
This commit adds an additional closure function to the mempool’s config
which computes the median time past from the point of view of the best
node in the chain. The mempool test harness has also been updated to allow
setting a mock median time past for testing purposes.
In addition to increasing the testability of the mempool, this commit
should also speed up transaction and block validation for BIP 113 as
the MTP no longer needs to be re-calculated each time from scratch.
This modifies the config for the new mempool package such that it takes
a callback function to obtain the best chain height instead of requiring
a fully initialized blockchain.BlockChain instance.
This will make it much easier to test the mempool since the tests will
be able to provide their own height function to test various
functionality without having create and manipulate full blocks and chain
instances.
This does the minimum work necessary to refactor the mempool code into
its own package. The idea is that separating this code into its own
package will greatly improve its testability, allow independent
benchmarking and profiling, and open up some interesting opportunities
for future development related to the memory pool.
There are likely some areas related to policy that could be further
refactored, however it is better to do that in future commits in order
to keep the changeset as small as possible during this refactor.
Overview of the major changes:
- Create the new package
- Move several files into the new package:
- mempool.go -> mempool/mempool.go
- mempoolerror.go -> mempool/error.go
- policy.go -> mempool/policy.go
- policy_test.go -> mempool/policy_test.go
- Update mempool logging to use the new mempool package logger
- Rename mempoolPolicy to Policy (so it's now mempool.Policy)
- Rename mempoolConfig to Config (so it's now mempool.Config)
- Rename mempoolTxDesc to TxDesc (so it's now mempool.TxDesc)
- Rename txMemPool to TxPool (so it's now mempool.TxPool)
- Move defaultBlockPrioritySize to the new package and export it
- Export DefaultMinRelayTxFee from the mempool package
- Export the CalcPriority function from the mempool package
- Introduce a new RawMempoolVerbose function on the TxPool and update
the RPC server to use it
- Update all references to the mempool to use the package.
- Add a skeleton README.md