This commit removes the TxnCount field from the BlockHeader type and
updates the tests accordingly. Note that this change does not affect the
actual wire protocol encoding in any way.
The reason the field has been removed is it really doesn't belong there
even though the wire protocol wiki entry on the official bitcoin wiki
implies it does. The implication is an artifact from the way the
reference implementation serializes headers (MsgHeaders) messages. It
includes the transaction count, which is naturally always 0 for headers,
along with every header. However, in reality, a block header does not
include the transaction count. This can be evidenced by looking at how a
block hash is calculated. It is only up to and including the Nonce field
(a total of 80 bytes).
From an API standpoint, having the field as part of the BlockHeader type
results in several odd cases.
For example, the transaction count for MsgBlocks (the only place that
actually has a real transaction count since MsgHeaders does not) is
available by taking the len of the Transactions slice. As such, having
the extra field in the BlockHeader is really a useless field that could
potentially get out of sync and cause the encode to fail.
Another example is related to deserializing a block header from the
database in order to serve it in response to a getheaders (MsgGetheaders)
request. If a block header is assumed to have the transaction count as a
part of it, then derserializing a block header not only consumes more than
the 80 bytes that actually comprise the header as stated above, but you
then need to change the transaction count to 0 before sending the headers
(MsgHeaders) message. So, not only are you reading and deserializing more
bytes than needed, but worse, you generally have to make a copy of it so
you can change the transaction count without busting cached headers.
This is part 1 of #13.
Several of the bitcoin data structures contain variable length entries,
many of which have well-defined maximum limits. However, there are still
a few cases, such as variable length strings and number of transactions
which don't have clearly defined maximum limits. Instead they are only
limited by the maximum size of a message.
In order to efficiently decode messages, space is pre-allocated for the
slices which hold these variable length pieces as to avoid needing to
dynamically grow the backing arrays. Due to this however, it was
previously possible to claim extremely high slice lengths which exceed
available memory (or maximum allowed slice lengths).
This commit imposes limits to all of these cases based on calculating
the maximum possible number of elements that could fit into a message
and using those as sane upper limits.
The variable length string case was found (and tests added to hit it) by
drahn@ which prompted an audit to find all cases.
Several of the messages store the parts that have a variable number of
elements as slices. This commit modifies the code to choose sane defaults
for the backing arrays for the slices so when the entries are actually
appended, a lot of the overhead of growing the backing arrays and copying
the data multiple times is avoided.
Along the same lines, when decoding messages, the actual size is known and
now is pre-allocated instead of dynamically growing the backing array
thereby avoiding some overhead.
Both of these depend on the serialized bytes which are dependent on the
version field in the block/transaction. They must be independent of the
protocol version so there is no need to require it.
This commit introduces two new functions for MsgBlock and MsgTx named
Serialize and Deserialize. The functions provide a stable mechanism for
serializing and deserializing blocks and transactions to and from disk
without having to worry about the protocol version. Instead these
functions use the Version fields in the blocks and transactions.
These new functions differ from BtcEncode and BtcDecode in that the latter
functions are intended to encode/decode blocks and transaction from the
wire which technically can differ depending on the protocol version and
don't even really need to use the same format as the stored data.
Currently, there is no difference between the two, and due to how
intertwined they are in the reference implementaiton, they may not ever
diverge, but there is a difference and the goal for btcwire is to provide
a stable API that is flexible enough to deal with encoding changes.
Although you can technically get at this value via the MaxPayloadLength
function on a block, it is less overhead for any consumers that need to
know the value to simply export it directly.
This commit changes MsgBlock to enforce a 1MB max payload per the spec.
Previously it was only limited to the max overall message size. While
here, also enforce max payloads per message type (instead of only the max
overall message payload) when writing messages.
The functions for generating transaction and block hashes contained a few
error checks for conditions which could never fail without run-time
panics. This commit removes those superfluous checks and adds explanatory
comments.