This commit improves how the headers-first mode works in several ways.
The previous headers-first code was an initial implementation that did not
have all of the bells and whistles and a few less than ideal
characteristics. This commit improves the heaers-first code to resolve
the issues discussed next.
- The previous code only used headers-first mode when starting out from
block height 0 rather than allowing it to work starting at any height
before the final checkpoint. This means if you stopped the chain
download at any point before the final checkpoint and restarted, it
would not resume and you therefore would not have the benefit of the
faster processing offered by headers-first mode.
- Previously all headers (even those after the final checkpoint) were
downloaded and only the final checkpoint was verified. This resulted in
the following issues:
- As the block chain grew, increasingly larger numbers of headers were
downloaded and kept in memory
- If the node the node serving up the headers was serving an invalid
chain, it wouldn't be detected until downloading a large number of
headers
- When an invalid checkpoint was detected, no action was taken to
recover which meant the chain download would essentially be stalled
- The headers were kept in memory even though they didn't need to be as
merely keeping track of the hashes and heights is enough to provde they
properly link together and checkpoints match
- There was no logging when headers were being downloaded so it could
appear like nothing was happening
- Duplicate requests for the same headers weren't being filtered which
meant is was possible to inadvertently download the same headers twice
only to throw them away.
This commit resolves these issues with the following changes:
- The current height is now examined at startup and prior each sync peer
selection to allow it to resume headers-first mode starting from the
known height to the next checkpoint
- All checkpoints are now verified and the headers are only downloaded
from the current known block height up to the next checkpoint. This has
several desirable properties:
- The amount of memory required is bounded by the maximum distance
between to checkpoints rather than the entire length of the chain
- A node serving up an invalid chain is detected very quickly and with
little work
- When an invalid checkpoint is detected, the headers are simply
discarded and the peer is disconnected for serving an invalid chain
- When the sync peer disconnets, all current headers are thrown away
and, due to the new aforementioned resume code, when a new sync peer
is selected, headers-first mode will continue from the last known good
block
- In addition to reduced memory usage from only keeping information about
headers between two checkpoints, the only information now kept in memory
about the headers is the hash and height rather than the entire header
- There is now logging information about what is happening with headers
- Duplicate header requests are now filtered
Previously the logging function which reports on progress was called for
every block, regardless of whether it was an orphan or not. This could be
confusing since it could show a different number of blocks processed as
compared to the old versus new heights reported (orphans do not add to the
block height since they aren't extending the main chain). Further, the
database had to be consulted for the latest block since the block we just
processed might not be the latest one if it was an orphan. This is quite
a bit more time conusming than it should've been for progress reporting.
This commit modifies that to only include non-orphan blocks. As a result,
the latest height shown will match the number of blocks processed (even
when there are orphans) and the additional block lookup from the database
is avoided.
This commit adds the btcdb memdb backend as a supported database type.
Note that users will NOT want to run in this mode because, being memory
only, it obviously does not persist the database when shutdown.
It is being added for testing purposes to help prevent constant abuse to
developer's hard drive when churning the block database multiple times a
day.
This commit changes a couple of sections which deal with large lists of
inventory vectors to use the new size hint functions recently added to
btcwire. This allows a bit more efficiency since the size of the list is
known up front and we can therefore avoid dynamically growing the backing
array several times. This also helps avoid a Go bug that leaks memory on
appends and GC churn.
This commit modifies the new valid peer message to display the useragent.
Previously this information was only available by setting the PEER
subsystem debuglevel to debug or lower.
This was prompted by #64.
This commit does some housekeeping on peer.go to make the code more
consistent, correct a few comments, and add new comments to explain the
peer data flow. A couple of examples are variables not using the standard
Go style (camelCase) and comments that don't match the style of other
comments.
The regression test does not work properly with the new headers-first
download approach, so force the old inv-based block download for
regression test mode.
It is not necessary to do all of the transaction validation on
blocks if they have been confirmed to be in the block chain leading
up to the final checkpoint in a given blockschain.
This algorithm fetches block headers from the peer, then once it has
established the full blockchain connection, it requests blocks.
Any blocks before the final checkpoint pass true for fastAdd on
btcchain operation, which causes it to do less valiation on the block.
Also, make every subsystem within btcd use its own logger instance so each
subsystem can have its own level specified independent of the others.
This is work towards #48.
- Lock the mempool when removing transactions during a notification as
intended
- When generating the inventory vectors to serve on a mempool request,
recheck the memory pool for each hash since it's possible another thread
could have removed an entry after the initial query for available
hashes
- When a block is connected, remove any transactions which are now double
spends as a result of the newly connected transactions
This commit modifies the transaction memory pool handling so that it does
not relay resurrected transactions. The other peers on the network will
also be reorganizing to the same block, so they already know about them.
This change allows wallet to record all transactions in a block before
receving the new block notification, and then process them all
together when the blockconnected notification arrives.
This commit updates btcd to work with the new btcchain APIs which now
accept btcutil.Tx instead of raw btcwire.MsgTx. It also modifies the
transaction memory pool to store btcutil.Tx.
This is part of the ongoing transaction hash optimization effort noted in
conformal/btcd#25.
This change allows btcwallet to keep a pool of transactions that have
not yet been mined into a block, notifying wallet when transactions
are mined, as well as introducing a new way to send the
btcd:blockconnected notification with wallet-specific information as
part of the same notification. When a transaction is sent using the
RPC call 'sendrawtransaction', a notification request will be
automatically registered with the connected wallet (if using
websockets) to notify the wallet when the transaction first appears in
a block.
To perform this notification, and to avoid requiring wallets from
waiting for seperate mined tx notifications (and resend after a
timeout) or from sending an additional tx mined request for every tx
in the pool after each new block, the blockconnected notification is
now created seperately for each wallet. If the notified wallet has
sent a transaction, an additional JSON field "minedtxs" will include
an array of transaction IDs that the wallet has created and which are
included in the new block.
This new unique blockconnected notification can also be used for
additional notifications that may happen each new block in the future,
and to cut down on existing notification handlers in btcwallet, such
as for transactions to a watched address.
If we don't hear from a peer for 5 minutes, we disconnect them. To keep
traffic flowing we send a ping every 2 minutes if we have not send any
other message that should get a reply.
This change adds additional http listeners for websocket connections
on "/wallet". Websockets are used to provide asynchronous messaging
between wallet daemons (i.e. btcwallet) and btcd as they allow an easy
way for btcd to provide instant notifications (instead of a wallet
polling for updates) and multiple replies to a single request.
Standard RPC commands sent over a websocket connection are handled
just like RPC, returning the same results, the only difference being
that the connection is async. In cases where the standard RPC
commands fall short of wallet daemons requests, and to request
notifications for addresses and events, extension JSON methods are
used.
Multiple wallets can be connected to the same btcd, and replies to
websocket requests and notifications are properly routed back to the
original requesting wallet.
Due to the nature of turning a synchronous protocol asynchronous, it
is highly recommended to use the JSON id field as a type of sequence
number, so replies from btcd can be routed back to the proper handler
in a wallet daemon.
This commit adds code to properly respond to getdata requests for
transactions by fetching them from the transaction pool. Previously, we
advertised newly available transactions, but the code to respond with the
actual transaction was not written yet.
Also, fix a couple of comments and make the pushTxMsg and pushBlockMsg
functions consistent.
This commit is a first pass at improving the logging. It changes a number
of things to improve the readability of the output. The biggest addition
is message summaries for each message type when using the debug logging
level.
There is sitll more to do here such as allowing the level of each
subsystem to be independently specified, syslog support, and allowing the
logging level to be changed run-time.
This commit provides a new flag, --nocheckpoints, to disable built-in
checkpoints.
Checkpoints are used for a number of things such a ensuring
the block chain being downloaded matches various known good blocks,
allowing quicker verification on old blocks since scripts don't have to be
executed, and preventing forks from old blocks, etc.
The block manager handles inventory messges to know which inventory should
be requested based on what is already known and what is already in flight.
So, this commit adds logic to ask the transaction memory pool if the
transaction is already known before requesting it and tracks pending
requests into an in-flight transaction map owned by the block manager.
It also moves the transaction processing into the block manager so the
in-flight map can be properly cleaned.
Also, the loops which only remove a single element and break or return
don't need the extra logic for iteration since they don't continue
iteration after removal.
It is not safe to remove an item from a container/list while iterating the
list without first saving the next pointer since removing the item nils
the internal list element's next pointer.
Rather than showing all errors from ProcessBlock as a failure, check if
the error is a RuleError meaning the block was rejected as opposed to
something actually going wrong and log it accordingly.
This commit is a rather large one which implements transaction pool and
relay according to the protocol rules of the reference implementation.
It makes use of btcchain to ensure the transactions are valid for the
block chain and includes several stricter checks which determine if they
are "standard" or not before admitting them into the pool and relaying
them.
There are still a few TODOs around the more strict rules which determine
which transactions are willing to be mined, but the core checks which
are imperative (everything except the all of the "standard" checks really)
to operate as a good citizen on the bitcoin network are in place.
We originally wanted to also not fetch orphan parents in this commit, however,
I have discovered that if you are doing a main sync from a peer, if it
sends you an orphan you must fetch it, else you ahven't fetched
everything it told you about and thus it will nto send you end more invs
from the main sync.
So we always fetch orphan parents, but we still don't fetch from
non-sync peers (all invs from them will be unsolicited). Seems to fix some hangs
with multiple peers.
The "official" regression test tool intentionally sends some unrequested
duplicate blocks to ensure the chain handling code does not fail when
trying to insert them. This commit adds an exception to the block manager
which typically disconnects peers that send unrequested blocks (they are
misbehaving if they do this) for regression test mode.