Rather than having a separate query channel for the block manager, use the
same channel so the block handler acts as a pure FIFO queue. This
prevents possible starvation of query related messages.
ok @owainga
This change modifies the RPC server's notifiation manager from a
struct with requests, protected by a mutux, to two goroutines. The
first maintains a queue of all notifications and control requests
(registering/unregistering notifications), while the second reads from
the queue and processes notifications and requests one at a time.
Previously, to prevent slowing down block and mempool processing, each
notification would be handled by spawning a new goroutine. This lead
to cases where notifications would end up being sent to clients in a
different order than they were created. Adding a queue keeps the
order of notifications originating from the same goroutine, while also
not slowing down processing while waiting for notifications to be
processed and sent.
ok @davecgh
This commit refactors the entire websocket client code to resolve several
issues with the previous implementation. Note that this commit does not
change the public API for websockets. It only consists of internal
improvements.
The following is the major issues which have been addressed:
- A slow websocket client could impede notifications to all clients
- Long-running operations such as rescans would block all other requests
until it had completed
- The above two points taken together could lead to apparant hangs since
the client doing the rescan would eventually run out of channel buffer
and block the entire group of clients until the rescan completed
- Disconnecting a websocket during certain operations could lead to a hang
- Stopping the rpc server with operations under way could lead to a hang
- There were no limits to the number of websocket clients that could
connect
The following is a summary of the major changes:
- The websocket code has been split into two entities: a
connection/notification manager and a websocket client
- The new connection/notification manager acts as the entry point from
the rest of the subsystems to feed data which potentially needs to
notify clients
- Each websocket client now has its own instance of the new websocket
client type which controls its own lifecycle
- The data flow has been completely redesigned to closely resemble the
peer data flow
- Each websocket now has its own long-lived goroutines for input, output,
and queuing of notifications
- Notifications use the new notification queue goroutine along with
queueing to ensure they dont't block on stalled or slow peers
- There is a new infrastructure for asynchronously executing long-running
commands such as a rescan while still allowing the faster operations to
continue to be serviced by the same client
- Since long-running operations now run asynchronously, they have been
limited to one at a time
- Added a limit of 10 websocket clients. This is hard coded for now, but
will be made configurable in the future
Taken together these changes make the code far easier to reason about and
update as well solve the aforementioned issues.
Further optimizations to improve performance are possible in regards to
the way the connection/notification manager works, however this commit
already contains a ton of changes, so they are being left for another
time.
Changed mempool.MaybeAcceptTransaction to accept an additional parameter
to differentiate betwee new transactions and those added from
disconnected blocks.
Added new fields to requestContexts to indicate which clients want to
receive all new transaction notifications.
Added NotifyForNewTx to rpcServer to deliver approriate transaction
notification.
Rather than using a dedicated channel for the sync peer request and reply,
use a single query channel that accepts a query type as well as a reply
channel. This will allow other queries to be added in the future without
the various queries being racy.
This commit improves how the headers-first mode works in several ways.
The previous headers-first code was an initial implementation that did not
have all of the bells and whistles and a few less than ideal
characteristics. This commit improves the heaers-first code to resolve
the issues discussed next.
- The previous code only used headers-first mode when starting out from
block height 0 rather than allowing it to work starting at any height
before the final checkpoint. This means if you stopped the chain
download at any point before the final checkpoint and restarted, it
would not resume and you therefore would not have the benefit of the
faster processing offered by headers-first mode.
- Previously all headers (even those after the final checkpoint) were
downloaded and only the final checkpoint was verified. This resulted in
the following issues:
- As the block chain grew, increasingly larger numbers of headers were
downloaded and kept in memory
- If the node the node serving up the headers was serving an invalid
chain, it wouldn't be detected until downloading a large number of
headers
- When an invalid checkpoint was detected, no action was taken to
recover which meant the chain download would essentially be stalled
- The headers were kept in memory even though they didn't need to be as
merely keeping track of the hashes and heights is enough to provde they
properly link together and checkpoints match
- There was no logging when headers were being downloaded so it could
appear like nothing was happening
- Duplicate requests for the same headers weren't being filtered which
meant is was possible to inadvertently download the same headers twice
only to throw them away.
This commit resolves these issues with the following changes:
- The current height is now examined at startup and prior each sync peer
selection to allow it to resume headers-first mode starting from the
known height to the next checkpoint
- All checkpoints are now verified and the headers are only downloaded
from the current known block height up to the next checkpoint. This has
several desirable properties:
- The amount of memory required is bounded by the maximum distance
between to checkpoints rather than the entire length of the chain
- A node serving up an invalid chain is detected very quickly and with
little work
- When an invalid checkpoint is detected, the headers are simply
discarded and the peer is disconnected for serving an invalid chain
- When the sync peer disconnets, all current headers are thrown away
and, due to the new aforementioned resume code, when a new sync peer
is selected, headers-first mode will continue from the last known good
block
- In addition to reduced memory usage from only keeping information about
headers between two checkpoints, the only information now kept in memory
about the headers is the hash and height rather than the entire header
- There is now logging information about what is happening with headers
- Duplicate header requests are now filtered
Previously the logging function which reports on progress was called for
every block, regardless of whether it was an orphan or not. This could be
confusing since it could show a different number of blocks processed as
compared to the old versus new heights reported (orphans do not add to the
block height since they aren't extending the main chain). Further, the
database had to be consulted for the latest block since the block we just
processed might not be the latest one if it was an orphan. This is quite
a bit more time conusming than it should've been for progress reporting.
This commit modifies that to only include non-orphan blocks. As a result,
the latest height shown will match the number of blocks processed (even
when there are orphans) and the additional block lookup from the database
is avoided.
This commit adds the btcdb memdb backend as a supported database type.
Note that users will NOT want to run in this mode because, being memory
only, it obviously does not persist the database when shutdown.
It is being added for testing purposes to help prevent constant abuse to
developer's hard drive when churning the block database multiple times a
day.
This commit changes a couple of sections which deal with large lists of
inventory vectors to use the new size hint functions recently added to
btcwire. This allows a bit more efficiency since the size of the list is
known up front and we can therefore avoid dynamically growing the backing
array several times. This also helps avoid a Go bug that leaks memory on
appends and GC churn.
This commit modifies the new valid peer message to display the useragent.
Previously this information was only available by setting the PEER
subsystem debuglevel to debug or lower.
This was prompted by #64.
This commit does some housekeeping on peer.go to make the code more
consistent, correct a few comments, and add new comments to explain the
peer data flow. A couple of examples are variables not using the standard
Go style (camelCase) and comments that don't match the style of other
comments.
The regression test does not work properly with the new headers-first
download approach, so force the old inv-based block download for
regression test mode.
It is not necessary to do all of the transaction validation on
blocks if they have been confirmed to be in the block chain leading
up to the final checkpoint in a given blockschain.
This algorithm fetches block headers from the peer, then once it has
established the full blockchain connection, it requests blocks.
Any blocks before the final checkpoint pass true for fastAdd on
btcchain operation, which causes it to do less valiation on the block.
Also, make every subsystem within btcd use its own logger instance so each
subsystem can have its own level specified independent of the others.
This is work towards #48.
- Lock the mempool when removing transactions during a notification as
intended
- When generating the inventory vectors to serve on a mempool request,
recheck the memory pool for each hash since it's possible another thread
could have removed an entry after the initial query for available
hashes
- When a block is connected, remove any transactions which are now double
spends as a result of the newly connected transactions
This commit modifies the transaction memory pool handling so that it does
not relay resurrected transactions. The other peers on the network will
also be reorganizing to the same block, so they already know about them.
This change allows wallet to record all transactions in a block before
receving the new block notification, and then process them all
together when the blockconnected notification arrives.
This commit updates btcd to work with the new btcchain APIs which now
accept btcutil.Tx instead of raw btcwire.MsgTx. It also modifies the
transaction memory pool to store btcutil.Tx.
This is part of the ongoing transaction hash optimization effort noted in
conformal/btcd#25.
This change allows btcwallet to keep a pool of transactions that have
not yet been mined into a block, notifying wallet when transactions
are mined, as well as introducing a new way to send the
btcd:blockconnected notification with wallet-specific information as
part of the same notification. When a transaction is sent using the
RPC call 'sendrawtransaction', a notification request will be
automatically registered with the connected wallet (if using
websockets) to notify the wallet when the transaction first appears in
a block.
To perform this notification, and to avoid requiring wallets from
waiting for seperate mined tx notifications (and resend after a
timeout) or from sending an additional tx mined request for every tx
in the pool after each new block, the blockconnected notification is
now created seperately for each wallet. If the notified wallet has
sent a transaction, an additional JSON field "minedtxs" will include
an array of transaction IDs that the wallet has created and which are
included in the new block.
This new unique blockconnected notification can also be used for
additional notifications that may happen each new block in the future,
and to cut down on existing notification handlers in btcwallet, such
as for transactions to a watched address.
If we don't hear from a peer for 5 minutes, we disconnect them. To keep
traffic flowing we send a ping every 2 minutes if we have not send any
other message that should get a reply.
This change adds additional http listeners for websocket connections
on "/wallet". Websockets are used to provide asynchronous messaging
between wallet daemons (i.e. btcwallet) and btcd as they allow an easy
way for btcd to provide instant notifications (instead of a wallet
polling for updates) and multiple replies to a single request.
Standard RPC commands sent over a websocket connection are handled
just like RPC, returning the same results, the only difference being
that the connection is async. In cases where the standard RPC
commands fall short of wallet daemons requests, and to request
notifications for addresses and events, extension JSON methods are
used.
Multiple wallets can be connected to the same btcd, and replies to
websocket requests and notifications are properly routed back to the
original requesting wallet.
Due to the nature of turning a synchronous protocol asynchronous, it
is highly recommended to use the JSON id field as a type of sequence
number, so replies from btcd can be routed back to the proper handler
in a wallet daemon.
This commit adds code to properly respond to getdata requests for
transactions by fetching them from the transaction pool. Previously, we
advertised newly available transactions, but the code to respond with the
actual transaction was not written yet.
Also, fix a couple of comments and make the pushTxMsg and pushBlockMsg
functions consistent.
This commit is a first pass at improving the logging. It changes a number
of things to improve the readability of the output. The biggest addition
is message summaries for each message type when using the debug logging
level.
There is sitll more to do here such as allowing the level of each
subsystem to be independently specified, syslog support, and allowing the
logging level to be changed run-time.
This commit provides a new flag, --nocheckpoints, to disable built-in
checkpoints.
Checkpoints are used for a number of things such a ensuring
the block chain being downloaded matches various known good blocks,
allowing quicker verification on old blocks since scripts don't have to be
executed, and preventing forks from old blocks, etc.
The block manager handles inventory messges to know which inventory should
be requested based on what is already known and what is already in flight.
So, this commit adds logic to ask the transaction memory pool if the
transaction is already known before requesting it and tracks pending
requests into an in-flight transaction map owned by the block manager.
It also moves the transaction processing into the block manager so the
in-flight map can be properly cleaned.
Also, the loops which only remove a single element and break or return
don't need the extra logic for iteration since they don't continue
iteration after removal.
It is not safe to remove an item from a container/list while iterating the
list without first saving the next pointer since removing the item nils
the internal list element's next pointer.
Rather than showing all errors from ProcessBlock as a failure, check if
the error is a RuleError meaning the block was rejected as opposed to
something actually going wrong and log it accordingly.
This commit is a rather large one which implements transaction pool and
relay according to the protocol rules of the reference implementation.
It makes use of btcchain to ensure the transactions are valid for the
block chain and includes several stricter checks which determine if they
are "standard" or not before admitting them into the pool and relaying
them.
There are still a few TODOs around the more strict rules which determine
which transactions are willing to be mined, but the core checks which
are imperative (everything except the all of the "standard" checks really)
to operate as a good citizen on the bitcoin network are in place.
We originally wanted to also not fetch orphan parents in this commit, however,
I have discovered that if you are doing a main sync from a peer, if it
sends you an orphan you must fetch it, else you ahven't fetched
everything it told you about and thus it will nto send you end more invs
from the main sync.
So we always fetch orphan parents, but we still don't fetch from
non-sync peers (all invs from them will be unsolicited). Seems to fix some hangs
with multiple peers.
The "official" regression test tool intentionally sends some unrequested
duplicate blocks to ensure the chain handling code does not fail when
trying to insert them. This commit adds an exception to the block manager
which typically disconnects peers that send unrequested blocks (they are
misbehaving if they do this) for regression test mode.
Really, it would be nice to pass an interface{} into chain to be given
to us when the callback calls, it would avoid the awkward sidchanneling
through the map and should actually be more efffieint (pointer passing >
hashtable insert, lookup, then remove).
Rather than having all of the various places that print peer figure out
the direction and form the string, centralize it by implementing the
Stringer interface on the peer.
Chain is not concurrency safe, so we move the chainNotifySink handling
into the main blockmanager goroutine. Due to a possible deadlock if the
buffer is filled this still has to be a single channel that isn't linked
to the other ones. There is a possible starvation issue where the main
msgChan gets selected more often than the notification sink, but until
chain is concurrency safe this is rather unavoidable.
Only log errors for most cases if the peer is persisent (and thus requested).
Only log by default after version exchange, and after losing a peer that had
completed version exchange. Make most other messages debug.
Use this information so that we do not request a block per peer we got
an inv for it, makes multi peer much quieter and rather more bandwidth
efficient.
In order to remove a number of possible races we combine blockhandling
an synchandler and use one channel for all messages. This ensures that
all messages from a single peer will be recieved in order. It also
removes the need for a lot of locking between the peer removal code and
the block/inv handlers.
Previously a new goroutine was launched for each notification in order to
avoid blocking chain from continuing while the notification is being
processed. This approach had a couple of issues.
First, since goroutines are not guaranteed to execute in any given order,
the notifications were no longer handled in the same order as they were
sent. For the current code, this is not a problem, but upcoming code that
handles a transaction memory pool, the order needs to be correct.
Second, goroutines are relatively cheap, but it's still quite a bit of
overhead to launch 3-4 goroutines per block.
This commit modifies the handling code to have a single sink executing in
a separate goroutine. The main handler then adds the notifications to a
queue which is processed by the sink. This approach retains the
non-blocking behavior of the previous approach, but also keeps the order
correct and, as an additional benefit, is also more efficient.
This removes a horrible case of reach-around from per into the guts of
the blockmaanger to frob the chain. Soon, when we try to deduplicate the
fetching of blocks from multiple peers this will need decisions made in
a central point.
Discussed at length with davec.
- Remove leftover debug log prints
- Increment waitgroup outside of goroutine
- Various comment and log message consistency
- Combine peer setup and newPeer -> newInboundPeer
- Save and load peers.json to/from cfg.DataDir
- Only claim addrmgr needs more addresses when it has less than 1000
- Add warning if unkown peer on orphan block.
The regression test mode is special in that the 'official' block test
suite requires an empty database to work properly. Rather than having to
manual go delete it before each test, add code to automatically delete the
old regression test database when in regression test mode.
This commit modifies the way the data paths are handled. Since there will
ultimately be more data associated with each network than just the block
database, the data path has been modified to be "namespaced" based on the
network. This allows all data associated with a specific network to
simply use the data path without having to worry about conflicts with data
from other networks.
In addition, this commit renames the block database to "blocks" plus a
suffix which denotes the database type. This prevents issues that would
otherwise arise if the user decides to use a different database type and
a file/folder with the same name already eixsts but is of the old database
type. For most users this won't matter, but it does provide nice
properties for testing and development as well since it makes it easy to
go back and forth between database types.
This commit also includes code to upgrade the old database paths to the
new ones so the change is seamless for the user.
Finally, bump the version to 0.2.0.
This change paves the way for saving more than just the block database to
the filesystem (such as address manager data, index data, etc) where the
name "dbdir" no longer makes sense.
This commit changes the code so that all calls to .Add on waitgroups
happen before the associated goroutines are launched. Doing this after
the goroutine could technically cause a race where the goroutine started
and finished before the main goroutine has a chance to increment the
counter. In our particular case none of the goroutines exit quickly
enough for this to be an issue, but nevertheless the correct way should be
used.
This commit adds support for relaying blocks between peers. It keeps
track of inventory that has either already been advertised to remote peers
or advertised by remote peers using a size-limited most recently used
cache. This helps avoid relaying inventory the peer already knows as
much as possible while not allowing rogue peers to eat up arbitrary
amounts of memory with bogus inventory.
This commit significantly reworks the fetching code to interop better with
bitcoind. In particular, when an inventory message is sent, and the
remote peer requests the final block, the remote peer sends the current
end of the main chain to signal that there are more blocks to get.
Previously this code was automatically requesting more blocks when the
number of in-flight blocks was under a certain threshold. The original
approach does help alleviate delays in the "request final, wait for
orphan, request more" round trip, but due to the aforementioned mechanism,
it leads to double requests and other subtle issues.
This commit modifies the input message handler so that when a remote peer
sends a block, no further messages from that peer are accepted until the
block has been fully processed and therefore known good or bad. This
helps prevent a malicious peer from queueing up a bunch of bad blocks
before disconnecting (or being disconnected) and wasting memory.
Additionally, this behavior is depended on by at least the block
acceptance test tool as the reference implementation processes blocks in
the same thread and therefore blocks further messages until the block has
been fully processed as well.
Previously, the genesis block was only inserted when the database was
created, but it's possible due to rollback that the database is created
and the genesis block insert gets rolled back if the app is existed too
quickly. This commit modifies the logic to test the need for the genesis
block any time the database is loaded and insert it if necessary.