The use of mocktime in test logic means that comparisons between
GetTime() and GetTimeMicros()/1000000 are unreliable since the former
can use mocktime values while the latter always gets the system clock;
this changes the networking code's inactivity checks to consistently
use the system clock for inactivity comparisons.
Also remove some hacks from setmocktime() that are no longer needed,
now that we're using the system clock for nLastSend and nLastRecv.
Technically cs_sendProcessing is entirely useless now because it
is only ever taken on the one MessageHandler thread, but because
there may be multiple of those in the future, it is left in place
cs_vSend is used for two purposes - to lock the datastructures used
to queue messages to place on the wire and to only call
SendMessages once at a time per-node. I believe SendMessages used
to access some of the vSendMsg stuff, but it doesn't anymore, so
these locks do not need to be on the same mutex, and also make
deadlocking much more likely.
e60360e net: remove cs_vRecvMsg (Cory Fields)
991955e net: add a flag to indicate when a node's send buffer is full (Cory Fields)
c6e8a9b net: add a flag to indicate when a node's process queue is full (Cory Fields)
4d712e3 net: add a new message queue for the message processor (Cory Fields)
c5a8b1b net: rework the way that the messagehandler sleeps (Cory Fields)
c72cc88 net: remove useless comments (Cory Fields)
ef7b5ec net: Add a simple function for waking the message handler (Cory Fields)
f5c36d1 net: record bytes written before notifying the message processor (Cory Fields)
60befa3 net: handle message accounting in ReceiveMsgBytes (Cory Fields)
56212e2 net: set message deserialization version when it's actually time to deserialize (Cory Fields)
0e973d9 net: remove redundant max sendbuffer size check (Cory Fields)
6042587 net: wait until the node is destroyed to delete its recv buffer (Cory Fields)
f6315e0 net: only disconnect if fDisconnect has been set (Cory Fields)
5b4a8ac net: make GetReceiveFloodSize public (Cory Fields)
e5bcd9c net: make vRecvMsg a list so that we can use splice() (Cory Fields)
53ad9a1 net: fix typo causing the wrong receive buffer size (Cory Fields)
vRecvMsg is now only touched by the socket handler thread.
The accounting vars (nRecvBytes/nLastRecv/mapRecvBytesPerMsgCmd) are also
only used by the socket handler thread, with the exception of queries from
rpc/gui. These accesses are not threadsafe, but they never were. This needs to
be addressed separately.
Also, update comment describing data flow
Similar to the recv flag, but this one indicates whether or not the net's send
buffer is full.
The socket handler checks the send queue when a new message is added and pauses
if necessary, and possibly unpauses after each message is drained from its buffer.
Messages are dumped very quickly from the socket handler to the processor, so
it's the depth of the processing queue that's interesting.
The socket handler checks the process queue's size during the brief message
hand-off and pauses if necessary, and the processor possibly unpauses each time
a message is popped off of its queue.
In order to sleep accurately, the message handler needs to know if _any_ node
has more processing that it should do before the entire thread sleeps.
Rather than returning a value that represents whether ProcessMessages
encountered a message that should trigger a disconnnect, interpret the return
value as whether or not that node has more work to do.
Also, use a global fProcessWake value that can be set by other threads,
which takes precedence (for one cycle) over the messagehandler's decision.
Note that the previous behavior was to only process one message per loop
(except in the case of a bad checksum or invalid header). That was changed in
PR #3180.
The only change here in that regard is that the current node now falls to the
back of the processing queue for the bad checksum/invalid header cases.
Previously addnodes were in competition with outbound connections
for access to the eight outbound slots.
One result of this is that frequently a node with several addnode
configured peers would end up connected to none of them, because
while the addnode loop was in its two minute sleep the automatic
connection logic would fill any free slots with random peers.
This is particularly unwelcome to users trying to maintain links
to specific nodes for fast block relay or purposes.
Another result is that a group of nine or more nodes which are
have addnode configured towards each other can become partitioned
from the public network.
This commit introduces a new limit of eight connections just for
addnode peers which is not subject to any of the other connection
limitations (including maxconnections).
The choice of eight is sufficient so that under no condition would
a user find themselves connected to fewer addnoded peers than
previously. It is also low enough that users who are confused
about the significance of more connections and have gotten too
copy-and-paste happy will not consume more than twice the slot
usage of a typical user.
Any additional load on the network resulting from this will likely
be offset by a reduction in users applying even more wasteful
workaround for the prior behavior.
The retry delays are reduced to avoid nodes sitting around without
their added peers up, but are still sufficient to prevent overly
aggressive repeated connections. The reduced delays also make
the system much more responsive to the addnode RPC.
Ban-disconnects are also exempted for peers added via addnode since
the outbound addnode logic ignores bans. Previously it would ban
an addnode then immediately reconnect to it.
A minor change was also made to CSemaphoreGrant so that it is
possible to re-acquire via an object whos grant was moved.
Surprisingly this hasn't been causing me any issues while testing, probably
because it requires lots of large blocks to be flying around.
Send/Recv corks need tests!
- Drop the interruption point directly after the pnode allocation. This would
be leaky if hit.
- Rearrange thread creation so that the socket handler comes first
This fixes one of the last major layer violations in the networking stack.
The network side is no longer in charge of message serialization, so it is now
decoupled from Bitcoin structures. Only the header is serialized and attached
to the payload.
The changes here are dense and subtle, but hopefully all is more explicit
than before.
- CConnman is now in charge of sending data rather than the nodes themselves.
This is necessary because many decisions need to be made with all nodes in
mind, and a model that requires the nodes calling up to their manager quickly
turns to spaghetti.
- The per-node-serializer (ssSend) has been replaced with a (quasi-)const
send-version. Since the send version for serialization can only change once
per connection, we now explicitly tag messages with INIT_PROTO_VERSION if
they are sent before the handshake. With this done, there's no need to lock
for access to nSendVersion.
Also, a new stream is used for each message, so there's no need to lock
during the serialization process.
- This takes care of accounting for optimistic sends, so the
nOptimisticBytesWritten hack can be removed.
- -dropmessagestest and -fuzzmessagestest have not been preserved, as I suspect
they haven't been used in years.
nMaxInbound might very well be 0 or -1, if the user prefers to keep
a small number of maxconnections.
Note: nMaxInbound of -1 means that the user set maxconnections
to 8 or less, but we still want to keep an additional slot for
the feeler connection.
Add getNetworkActive()/setNetworkActive() method to client model.
Send network active status through NotifyNetworkActiveChanged.
Indicate in tool tip of gui status bar network indicator whether network activity is disabled.
Indicate in debug window whether network activity is disabled and add button to allow user to toggle network activity state.
Added the function SetNetworkActive() which when called with argument set to false disconnects all nodes and sets the flag fNetworkActive to false. As long as this flag is false no new connections are attempted and no incoming connections are accepted. Network activity is reenabled by calling the function with argument true.
4630479 Make dnsseed's definition of acute need include relevant services. (Gregory Maxwell)
9583477 Be more aggressive in connecting to peers with relevant services. (Gregory Maxwell)
We normally prefer to connect to peers offering the relevant services.
If we're not connected to enough peers with relevant services, we
probably don't know about them and could use dnsseed's help.
Only allow skipping relevant services until there are four outbound
connections up.
This avoids quickly filling up with peers lacking the relevant
services when addrman has few or none of them.
There are only a few uses of `insecure_random` outside the tests.
This PR replaces uses of insecure_random (and its accompanying global
state) in the core code with an FastRandomContext that is automatically
seeded on creation.
This is meant to be used for inner loops. The FastRandomContext
can be in the outer scope, or the class itself, then rand32() is used
inside the loop. Useful e.g. for pushing addresses in CNode or the fee
rounding, or randomization for coin selection.
As a context is created per purpose, thus it gets rid of
cross-thread unprotected shared usage of a single set of globals, this
should also get rid of the potential race conditions.
- I'd say TxMempool::check is not called enough to warrant using a special
fast random context, this is switched to GetRand() (open for
discussion...)
- The use of `insecure_rand` in ConnectThroughProxy has been replaced by
an atomic integer counter. The only goal here is to have a different
credentials pair for each connection to go on a different Tor circuit,
it does not need to be random nor unpredictable.
- To avoid having a FastRandomContext on every CNode, the context is
passed into PushAddress as appropriate.
There remains an insecure_random for test usage in `test_random.h`.
In principle, the checksums of P2P packets are simply 4-byte blobs which
are the first four bytes of SHA256(SHA256(payload)).
Currently they are handled as little-endian 32-bit integers half of the
time, as blobs the other half, sometimes copying the one to the other,
resulting in somewhat confused code.
This PR changes the handling to be consistent both at packet creation
and receiving, making it (I think) easier to understand.
After #8594 the addrFrom sent by a node is not used anymore at all,
so don't bother sending it.
Also mitigates the privacy issue in (#8616). It doesn't completely solve
the issue as GetLocalAddress is also called in AdvertiseLocal, but at
least when advertising addresses it stands out less as *our* address.
This was broken by 63cafa6329.
Note that while this fixes the settings, it doesn't fix the actual usage of
-maxuploadtarget completely, as there is currently a bug in the
nOptimisticBytesWritten accounting that causes a delayed response if the target
is reached. That bug will be addressed separately.
In the case of (for example) an already-running bitcoind, the shutdown sequence
begins before CConnman has been created, leading to a null-pointer dereference
when g_connman->Stop() is called.
Instead, Just let the CConnman dtor take care of stopping.
CConnman then passes the current best height into CNode at creation time.
This way CConnman/CNode have no dependency on main for height, and the signals
only move in one direction.
This also helps to prevent identity leakage a tiny bit. Before this change, an
attacker could theoretically make 2 connections on different interfaces. They
would connect fully on one, and only establish the initial connection on the
other. Once they receive a new block, they would relay it to your first
connection, and immediately commence the version handshake on the second. Since
the new block height is reflected immediately, they could attempt to learn
whether the two connections were correlated.
This is, of course, incredibly unlikely to work due to the small timings
involved and receipt from other senders. But it doesn't hurt to lock-in
nBestHeight at the time of connection, rather than letting the remote choose
the time.
This behavior seems to have been quite racy and broken.
Move nLocalHostNonce into CNode, and check received nonces against all
non-fully-connected nodes. If there's a match, assume we've connected
to ourself.