The block index now tracks the set of dirty block nodes with status
changes that haven't been persisted and flushes the changes to the DB
at the appropriate times.
Currently only the blocks in the active chain are loaded into the
block index on initialization. This instead iterates over the entire
block index bucket in LevelDB and loads all nodes.
The bucket contains block headers keyed by the block height encoded as
big-endian concatenated with the block hash. This allows block headers
to be fetched from the DB in height order with a cursor.
These method allows safe concurrent access to reading and modifying
block node statuses. When block statuses get persisted in a later
change, the setter methods can be used to mark block nodes as dirty.
Each node in the block index records some flags about its validation
state. This is just stored in memory for now, but can save effort if
attempting to reconnect a block that failed validation or was
disconnected.
This was only used to test block proposals, which has been changed to
instead use CheckConnectBlockTemplate. The flag complicated the
implementation of some chain processing routines and would be
difficult to implement with headers-first syncing.
This renames CheckConnectBlock to CheckConnectBlockTemplate and
modifies it to be easily consumable by the getblocktemplate RPC
handler. Performs full block validation now instead of partial
validation.
This propagates the interrupt channel through to blockchain and the
indexers so that it is possible to interrupt long-running operations
such as catching up indexes.
This refactors the code that locates blocks (inventory discovery) out of
server and into blockchain where it can make use of the new much more
efficient chain view and more easily be tested. As an aside, it really
belongs in blockchain anyways since it's purely dealing with the block
index and best chain.
Since the majority of the network has moved to header-based semantics,
this also provides an additional optimization to allow headers to be
located directly versus needing to first discover the hashes and then
fetch the headers.
The new functions are named LocateBlocks and LocateHeaders. The former
returns a slice of located hashes and the latter returns a slice of
located headers.
Finally, it also updates the RPC server getheaders call and related
plumbing to use the new LocateHeaders function.
A comprehensive suite of tests is provided to ensure both functions
behave correctly for both correct and incorrect block locators.
- Remove inMainChain from block nodes since that can now be efficiently
determined by using the chain view
- Track the best chain via a chain view instead of a single block node
- Use the tip of the best chain view everywhere bestNode was used
- Update chain view tip instead of updating best node
- Change reorg logic to use more efficient chain view fork finding logic
- Change block locator code over to use more efficient chain view logic
- Remove now unused block-index-based block locator code
- Move BlockLocator definition to chain.go
- Move BlockLocatorFromHash and LatestBlockLocator to chain.go
- Update both to use more efficient chain view logic
- Rework IsCheckpointCandidate to use block index and chain view
- Optimize MainChainHasBlock to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Removed error return since it can no longer fail
- Optimize BlockHeightByHash to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Removed error return since it can no longer fail
- Optimize BlockHashByHeight to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Removed error return since it can no longer fail
- Optimize HeightRange to use chain view instead of hitting db
- Move to chain.go since it no longer involves database I/O
- Optimize BlockByHeight to use chain view for main chain check
- Optimize BlockByHash to use chain view for main chain check
This removes the DisableVerify function and related state since nothing
uses it anymore since the command line option was removed. It is a
remnant of initial development.
This exposes the ability to more efficiently create a block locator from
a chain view for a given block node by using their ability to do O(1)
lookups.
It also adds tests to ensure the behavior is correct.
This significantly optimizes and simplifies the generation of block
locators by making use of the fact that all block nodes are now in
memory and therefore it is no longer necessary to consult the database
for the hashes or worry about issues related to dynamic loading of nodes.
Also, it slightly modifies the algorithm so that the doubling doesn't
start for one additional iteration in order to mirror other prominent
clients on the network. Due to the way block locators are used, this
does not change any semantics in terms of requesting and locating
blocks.
Finally, the semantics of BlockLocatorFromHash have been changed to
return a locator for the current tip in the case the hash is unknown.
This is far preferable since only including the passed block hash, when
it isn't known, could end up leading to causing a redownload of the
entire chain under certain circumstances.
Putting the test code in the same package makes it easier for forks
since they don't have to change the import paths as much and it also
gets rid of the need for internal_test.go to bridge.
While here, remove the reorganization test since it is much better
handled by the full block tests and is no longer needed and do some
light cleanup on a few other tests.
The full block tests had to remain in the separate test package since it
is a circular dependency otherwise. This did require duplicating some
of the chain setup code, but given the other benefits this is
acceptable.
This introduces the concept of a synthetic block chain that can be used
in the tests to avoid needing setup a full blown chain instance with a
database and generate valid blocks and converts the sequence lock tests
in TestCalcSequenceLock to use it.
Not only does this speed up the test execution time, but it allows the
dependency on rpctest to be removed which will allow the sequence locks
tests to be consolidated into the main package without creating a
circular dependency.
This modifies the function to set the tip in the new chainview code to
bulk copy existing nodes when it needs to expand the cap rather than
simply creating a new empty slice and allowing the walk code below it to
repopulate it. This is a nice optimization since, in practice, most of
the time expanding the cap is only required when the active chain is
being extended after having run for a while which means the end result
is that it will be able to bulk copy all the nodes and just add the most
recent one versus having to walk them all and add them back.
Also, while here expand the tests for setting the tip to ensure the
nodes contained in the resulting view are correct after forcing the
resizes and correct a bug they exposed where changing between a
longer-shorter-longer chain where the longer chain is the same chain
could result in not populating the view correctly.
Finally, update the fake nodes generated by the tests to use a
nonce generated by a deterministic prng in order to ensure the hashes of
all fake nodes are unique, but reproducible.
This implements a new type in the blockchain package that takes
advantage of the fact that all block nodes are now in memory to provide
a flat view of a specific chain of blocks (a specific branch of the
overall block tree) from a given tip all the way back to the genesis
block along with several convenience functions such as efficiently
comparing two views, quickly finding the fork point (if any) between two
views, and O(1) lookup of the node at a specific height.
The view is not currently used, but the intent is that the code will be
refactored to make use of these views to simplify and optimize several
areas such as best chain selection and reorg logic and finding successor
nodes. They will also greatly simplify the process of disconnecting the
download logic from the connection logic.
A comprehensive suite of tests is provided to ensure the chain views
behave correctly.
This cleans up the test in TestCalcSequenceLock in the following ways:
- Use calculated values instead of magic value so it is easier to update
the tests as needed
- Make tests match the comments
- Change comments to be more consistent and fix some grammar errors
- Set mempool flag for unconfirmed tx tests since they are intended to
mimic transactions in the mempool
This modifies the code that determines the most recently known
checkpoint to take advantage of recent changes which make the entire
block index available in memory by only storing a reference to the
specific node in the index that represents the latest known checkpoint.
Previously, the entire block was stored and new checkpoints required
loading it from the database.
This completely removes the threshold state database caching code since
it can very quickly be calculated at startup now that the entire block
index is loaded first.
This reworks the block index code such that it loads all of the headers
in the main chain at startup and constructs the full block index
accordingly.
Since the full index from the current best tip all the way back to the
genesis block is now guaranteed to be in memory, this also removes all
code related to dynamically loading the nodes and updates some of the
logic to take advantage of the fact traversing the block index can
longer potentially fail. There are also more optimizations and
simplifications that can be made in the future as a result of this.
Due to removing all of the extra overhead of tracking the dynamic state,
and ensuring the block node structs are aligned to eliminate extra
padding, the end result of a fully populated block index now takes quite
a bit less memory than the previous dynamically loaded version.
The main downside is that it now takes a while to start whereas it was
nearly instant before, however, it is much better to provide more
efficient runtime operation since that is its ultimate purpose and the
benefits far outweigh this downside.
Some benefits are:
- Since every block node is in memory, the recent code which
reconstructs headers from block nodes means that all headers can
always be served from memory which is important since the majority of
the network has moved to header-based semantics
- Several of the error paths can be removed since they are no longer
necessary
- It is no longer expensive to calculate CSV sequence locks or median
times of blocks way in the past
- It will be possible to create much more efficient iteration and
simplified views of the overall index
- The entire threshold state database cache can be removed since it is
cheap to construct it from the full block index as needed
An overview of the logic changes are as follows:
- Move AncestorNode from blockIndex to blockNode and greatly simplify
since it no longer has to deal with the possibility of dynamically
loading nodes and related failures
- Rename RelativeNode to RelativeAncestor, move to blockNode, and
redefine in terms of AncestorNode
- Move CalcPastMedianTime from blockIndex to blockNode and remove no
longer necessary test for nil
- Change calcSequenceLock to use Ancestor instead of RelativeAncestor
since it reads more clearly
This takes care of a few minor nits on the recently merged subscribe
code. In particular:
- Avoid extra unnecessary allocation on notifications slice
- Avoid defer overhead on notification mutex in sendNotifications
- Make test function comment start with the name of the function per
Effective Go guidelines
- Use constant for number of subscribers in test
- Don't exceed column 80 in test print
The BlockChain struct emits notifications for various events, but
it is only possible to register one listener. This changes the
interface and implementations to allow multiple listeners.
This replaces the ErrDoubleSpend and ErrMissingTx error codes with a
single error code named ErrMissingTxOut and updates the relevant errors
and expected test results accordingly.
Once upon a time, the code relied on a transaction index, so it was able
to definitively differentiate between a transaction output that
legitimately did not exist and one that had already been spent.
However, since the code now uses a pruned utxoset, it is no longer
possible to reliably differentiate since once all outputs of a
transaction are spent, it is removed from the utxoset completely.
Consequently, a missing transaction could be either because the
transaction never existed or because it is fully spent.
This commit updates the new segwit validation logic within block
validation to be guarded by an initial check to the version bits state
before conditionally enforcing the logic based off of the state.
This commit implements the new block validation rules as defined by
BIP0141. The new rules include the constraints that if a block has
transactions with witness data, then there MUST be a commitment within
the conies transaction to the root of a new merkle tree which commits
to the wtxid of all transactions. Additionally, rather than limiting
the size of a block by size in bytes, blocks are now limited by their
total weight unit. Similarly, a newly define “sig op cost” is now used
to limit the signature validation cost of transactions found within
blocks.
This commit implements the new “weight” metric introduced as part of
the segwit soft-fork. Post-fork activation, rather than limiting the
size of blocks and transactions based purely on serialized size, a new
metric “weight” will instead be used as a way to more accurately
reflect the costs of a tx/block on the system. With blocks constrained
by weight, the maximum block-size increases to ~4MB.
This commit implements the flag activation portion of BIP 0147. The
verification behavior triggered by the NULLDUMMY script verification
flag has been present within btcd for some time, however it wasn’t
activated by default.
With this commit, once segwit has activated, the ScriptStrictMultiSig
will also be activated within the Script VM. Additionally, the
ScriptStrictMultiSig is now a standard script verification flag which
is used unconditionally within the mempool.
This commit modifies the logic within the block manager and service to
preferentially fetch transactions and blocks which include witness data
from fully upgraded peers.
Once the initial version handshake has completed, the server now tracks
which of the connected peers are witness enabled (they advertise
SFNodeWitness). From then on, if a peer is witness enabled, then btcd
will always request full witness data when fetching
transactions/blocks.
This corrects the assertion in the decodeSpentTxOut function so it does
not improperly cause a panic when unwinding transactions during a reorg
under certain circumstances. In particular, the provided transaction
version that is passed when a stxo entry does not exist is now -1 in
order to properly distinguish it from the zero value.
It also updates the tests accordingly.
This was discovered by the reorg on testnet from block
00000000000018c58c2d2816f03dac327d975a18af6edf1a369df67ecddaf816 to
0000000000001c1161a367156465cc6226e9f862d9c585f94db5779fdf5455ff.
The btclog package has been changed to defining its own logging
interface (rather than seelog's) and provides a default implementation
for callers to use.
There are two primary advantages to the new logger implementation.
First, all log messages are created before the call returns. Compared
to seelog, this prevents data races when mutable variables are logged.
Second, the new logger does not implement any kind of artifical rate
limiting (what seelog refers to as "adaptive logging"). Log messages
are outputted as soon as possible and the application will appear to
perform much better when watching standard output.
Because log rotation is not a feature of the btclog logging
implementation, it is handled by the main package by importing a file
rotation package that provides an io.Reader interface for creating
output to a rotating file output. The rotator has been configured
with the same defaults that btcd previously used in the seelog config
(10MB file limits with maximum of 3 rolls) but now compresses newly
created roll files. Due to the high compressibility of log text, the
compressed files typically reduce to around 15-30% of the original
10MB file.
The github markdown interpreter has been changed such that it no longer
allows spaces in between the brackets and parenthesis of links and now
requires a newline in between anchors and other formatting. This
updates all of the markdown files accordingly.
While here, it also corrects a couple of inconsistencies in some of the
README.md files.
This commit modifies the existing block validation logic to examine the
current version bits state of the CSV soft-fork, enforcing the new
validation rules (BIPs 68, 112, and 113) accordingly based on the
current `ThesholdState`.
This simplifies the code based on the recommendations of the gosimple
lint tool.
Also, it increases the deadline for the linters to run to 10 minutes and
reduces the number of threads that is uses. This is being done because
the Travis environment has become increasingly slower and it also seems
to be hampered by too many threads running concurrently.
This modifies the blockNode and BestState structs in the blockchain
package to store hashes directly instead of pointers to them and updates
callers to deal with the API change in the exported BestState struct.
In general, the preferred approach for hashes moving forward is to store
hash values in complex data structures, particularly those that will be
used for cache entries, and accept pointers to hashes in arguments to
functions.
Some of the reasoning behind making this change is:
- It is generally preferred to avoid storing pointers to data in cache
objects since doing so can easily lead to storing interior pointers
into other structs that then can't be GC'd
- Keeping the hash values directly in the block node provides better
cache locality
Since the code base is currently in the process of changing over to
decouple download and connection logic, but not all of the necessary
parts are updated yet, ensure blocks that are in the database, but do
not have an associated main chain block index entry, are treated as if
they do not exist for the purposes of chain connection and selection
logic.
This refactors the block index logic into a separate struct and
introduces an individual lock for it so it can be queried independent of
the chain lock.
This modifies the block node structure to include a couple of extra
fields needed to be able to reconstruct the block header from a node,
and exposes a new function from chain to fetch the block headers which
takes advantage of the new functionality to reconstruct the headers from
memory when possible. Finally, it updates both the p2p and RPC servers
to make use of the new function.
This is useful since many of the block header fields need to be kept in
order to form the block index anyways and storing the extra fields means
the database does not have to be consulted when headers are requested if
the associated node is still in memory.
The following timings show representative performance gains as measured
from one system:
new: Time to fetch 100000 headers: 59ms
old: Time to fetch 100000 headers: 4783ms
This removes the CalcPastMedianTime since it is now exposed much more
efficiently via the MedianTime field of the BestState snapshot returned
from the BestSnapshot function.
This modifies the block nodes used in the blockchain package for keeping
track of the block index to use int64 for the timestamps instead of
time.Time.
This is being done because a time.Time takes 24 bytes while an int64
only takes 8 and the plan is to eventually move the entire block index
into memory instead of the current dynamically-loaded version, so
cutting the number of bytes used for the timestamp by a third is highly
desirable.
Also, the consensus code requires working with unix-style timestamps
anyways, so switching over to them in the block node does not seem
unreasonable.
Finally, this does not go so far as to change all of the time.Time
references, particularly those that are in the public API, so it is
purely an internal change.
This modifies the blockchain code to store all blocks that have passed
proof-of-work and contextual validity tests in the database even if they
may ultimately fail to connect.
This eliminates the need to store those blocks in memory, allows them to
be available as orphans later even if they were never part of the main
chain, and helps pave the way toward being able to separate the download
logic from the connection logic.
This contains a bit of cleanup and additional logic to improve the
recently-added ability to specify additional checkpoints via the
--addcheckpoint option.
In particular:
- Improve error messages in the checkpoint parsing
- Correct the mergeCheckpoints function to weed out duplicate height
checkpoints while using the most-recently provided one as described by
its comment
- Add an assertion to blockchain.New that the provided checkpoints are
sorted as required
- Keep comments to 80 columns and use two spaces after periods in them to
be consistent with the rest of the code base
- Make the entry in doc.go match the actual btcd -h output
The thresholdState and deploymentState functions expect the block node
for the block prior to which the threshold state is calculated, however
the startup code which checked the threshold states was using the
current best node instead of its parent.
While here, also update the comments and rename a couple of variables to
help make this fact more clear.
This modifies the code to only enforce the fairly expensive BIP0030
(duplicate transcactions) checks when the chain has not yet reached the
BIP0034 activation height (and is not one of the 2 special historical
blocks that break the rule and prompted BIP0034 to being with) since
that BIP made it impossible to create duplicate coinbases and thus
removed the possibility of creating transactions that overwrite older
ones.
This is a rather large optimization because the check is expensive due
to involving a ton of cache misses in the utxoset. For example, the
following are times it took to perform the BIP0030 check on blocks
425490 - 425502 and a system with a relatively old Hitachi spinner HDD:
block 425490: 674.5857ms
block 425491: 726.5923ms
block 425492: 827.6051ms
block 425493: 680.0863ms
block 425494: 722.0917ms
block 425495: 700.0889ms
block 425496: 647.5823ms
block 425497: 445.0565ms
block 425498: 602.5765ms
block 425499: 375.0476ms
block 425500: 771.0979ms
block 425501: 461.5586ms
block 425502: 603.0766ms
As can be seen from these numbers, this reduces the block validation
time by an average of just over half a second for the given
representative data set and hardware.
Signed-off-by: Dave Collins <davec@conformal.com>
Now that all softforking is done via BIP0009 versionbits, replace the
old isMajorityVersion deployment mechanism with hard coded historical
block heights at which they became active.
Since the activation heights vary per network, this adds new parameters
to the chaincfg.Params struct for them and sets the correct heights at
which each softfork became active on each chain.
It should be noted that this is a technically hard fork since the
behavior of alternate chain history is different with these hard-coded
activation heights as opposed to the old isMajorityVersion code. In
particular, an alternate chain history could activate one of the soft
forks earlier than these hard-coded heights which means the old code
would reject blocks which violate the new soft fork rules whereas this
new code would not.
However, all of the soft forks this refers to were activated so far in
the chain history there is there is no way a reorg that long could
happen and checkpoints reject alternate chains before the most recent
checkpoint anyways. Furthermore, the same change was made in Bitcoin
Core so this needs to be changed to be consistent anyways.
This commit adds all of the infrastructure needed to support BIP0009
soft forks.
The following is an overview of the changes:
- Add new configuration options to the chaincfg package which allows the
rule deployments to be defined per chain
- Implement code to calculate the threshold state as required by BIP0009
- Use threshold state caches that are stored to the database in order
to accelerate startup time
- Remove caches that are invalid due to definition changes in the
params including additions, deletions, and changes to existing
entries
- Detect and warn when a new unknown rule is about to activate or has
been activated in the block connection code
- Detect and warn when 50% of the last 100 blocks have unexpected
versions.
- Remove the latest block version from wire since it no longer applies
- Add a version parameter to the wire.NewBlockHeader function since the
default is no longer available
- Update the miner block template generation code to use the calculated
block version based on the currently defined rule deployments and
their threshold states as of the previous block
- Add tests for new error type
- Add tests for threshold state cache