This commit adds a new function named SerializeSize to the public API for
MsgBlock which can be used to determine how many bytes the serialized data would
take without having to actually serialize it. In addition, it makes the
exported BlockVersion an untyped constant and changes the block and tx
versions to signed integers to more closely match the protocol.
Finally, this commit also adds tests for the new function.
The following benchmark shows the difference between using the new
function to get the serialized size of a typical block and serializing
the block into a temporary buffer and taking its length:
Buffer: BenchmarkBlockSerializeSizeBuffer 200000 27050 ns/op
New: BenchmarkBlockSerializeSizeNew 100000000 34 ns/op
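As a usage sketch (the github.com/conformal/btcwire import path and the
surrounding helper are assumptions, not part of this change), the new
function lets a caller preallocate the output buffer before serializing:

    package example

    import (
        "bytes"

        "github.com/conformal/btcwire"
    )

    // serializeBlock preallocates the output buffer using the new
    // SerializeSize function and then serializes the block into it.
    // Sketch only; blk is assumed to be a fully populated block.
    func serializeBlock(blk *btcwire.MsgBlock) ([]byte, error) {
        buf := bytes.NewBuffer(make([]byte, 0, blk.SerializeSize()))
        if err := blk.Serialize(buf); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }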
Closes #19.
Rather than using bytes.NewBuffer, which is a read/write entity
(io.ReadWriter), use bytes.NewReader, which is a read-only entity
(io.Reader), in all cases where it is possible. Benchmarking shows it's
slightly faster, and it's also technically more accurate since it ensures
the data is read-only.
There are a few cases where bytes.NewBuffer must still be used since a
buffer with a known length is required for those instances.
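A sketch of the pattern (the helper and the conformal/btcwire import path
are illustrative; the decode path only needs an io.Reader, so a
bytes.Reader suffices):

    package example

    import (
        "bytes"

        "github.com/conformal/btcwire"
    )

    // decodeTx shows the reader swap: the serialized bytes only ever need
    // to be read, so the read-only bytes.Reader is used instead of a
    // read/write bytes.Buffer.
    func decodeTx(serialized []byte, pver uint32) (*btcwire.MsgTx, error) {
        var tx btcwire.MsgTx

        // Previously: rbuf := bytes.NewBuffer(serialized)
        rbuf := bytes.NewReader(serialized)
        if err := tx.BtcDecode(rbuf, pver); err != nil {
            return nil, err
        }
        return &tx, nil
    }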
This commit removes the TxnCount field from the BlockHeader type and
updates the tests accordingly. Note that this change does not affect the
actual wire protocol encoding in any way.
The reason the field has been removed is that it really doesn't belong there
even though the wire protocol wiki entry on the official bitcoin wiki
implies it does. The implication is an artifact from the way the
reference implementation serializes headers (MsgHeaders) messages. It
includes the transaction count, which is naturally always 0 for headers,
along with every header. However, in reality, a block header does not
include the transaction count. This can be evidenced by looking at how a
block hash is calculated: the hash only covers the header data up to and
including the Nonce field (a total of 80 bytes).
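To make that concrete, here is a sketch of what the hash commits to (field
names follow the btcwire BlockHeader; the manual encoding and hashing here
are purely illustrative, not the package's own BlockSha code):

    package example

    import (
        "bytes"
        "crypto/sha256"
        "encoding/binary"

        "github.com/conformal/btcwire"
    )

    // headerHash double-SHA256s only the 80 header bytes: Version,
    // PrevBlock, MerkleRoot, Timestamp, Bits, and Nonce.  No transaction
    // count is part of the hashed data.
    func headerHash(h *btcwire.BlockHeader) [32]byte {
        var buf bytes.Buffer
        binary.Write(&buf, binary.LittleEndian, uint32(h.Version))
        buf.Write(h.PrevBlock.Bytes())
        buf.Write(h.MerkleRoot.Bytes())
        binary.Write(&buf, binary.LittleEndian, uint32(h.Timestamp.Unix()))
        binary.Write(&buf, binary.LittleEndian, h.Bits)
        binary.Write(&buf, binary.LittleEndian, h.Nonce)

        // buf.Len() is exactly 80 at this point.
        first := sha256.Sum256(buf.Bytes())
        return sha256.Sum256(first[:])
    }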
From an API standpoint, having the field as part of the BlockHeader type
results in several odd cases.
For example, the transaction count for MsgBlocks (the only place that
actually has a real transaction count since MsgHeaders does not) is
available by taking the len of the Transactions slice. As such, the extra
field in the BlockHeader is really useless and could potentially get out of
sync with the actual number of transactions and cause the encode to fail.
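In code, the count the encoder actually needs comes straight from the
slice (continuing the sketch package above):

    // txnCount derives the block's transaction count from the
    // Transactions slice itself -- the only value the wire encoding of a
    // MsgBlock ever needs.
    func txnCount(blk *btcwire.MsgBlock) uint64 {
        return uint64(len(blk.Transactions))
    }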
Another example is related to deserializing a block header from the
database in order to serve it in response to a getheaders (MsgGetHeaders)
request. If a block header is assumed to have the transaction count as a
part of it, then deserializing a block header not only consumes more than
the 80 bytes that actually comprise the header as stated above, but you
then need to change the transaction count to 0 before sending the headers
(MsgHeaders) message. So, not only are you reading and deserializing more
bytes than needed, but worse, you generally have to make a copy of it so
you can change the transaction count without busting cached headers.
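Without the field, serving headers is a straight read of the 80 stored
bytes. A sketch (it assumes BlockHeader.Deserialize and
MsgHeaders.AddBlockHeader as well as a storedHeaders slice holding the raw
80-byte headers; none of that is part of this change):

    package example

    import (
        "bytes"

        "github.com/conformal/btcwire"
    )

    // serveHeaders builds a headers (MsgHeaders) response from raw
    // 80-byte headers pulled from the database.  With no transaction
    // count on BlockHeader there is nothing to zero out and no need to
    // copy cached headers before sending them.
    func serveHeaders(storedHeaders [][]byte) (*btcwire.MsgHeaders, error) {
        msg := btcwire.NewMsgHeaders()
        for _, serialized := range storedHeaders {
            var hdr btcwire.BlockHeader
            if err := hdr.Deserialize(bytes.NewReader(serialized)); err != nil {
                return nil, err
            }
            if err := msg.AddBlockHeader(&hdr); err != nil {
                return nil, err
            }
        }
        return msg, nil
    }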
This is part 1 of #13.
Both of these depend on the serialized bytes, which are determined by the
version field in the block/transaction. They must therefore be independent
of the protocol version, so there is no need to require it.
The go vet command complains about untagged struct initializers when
defining a ShaHash directly. This appears to be a limitation where go vet
does not suppress the warning for named types whose underlying type is a
fixed-size byte array like it does for a plain fixed-size byte array
definition.
This commit simply modifies the tests to use a fixed-size byte array
definition cast to a ShaHash to overcome the go vet limitation.
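A sketch of the pattern (the hash bytes below are placeholders and the
conversion assumes ShaHash is defined over a HashSize byte array):

    package example

    import "github.com/conformal/btcwire"

    // Writing btcwire.ShaHash{0x01, 0x02, ...} directly trips go vet's
    // untagged initializer warning, so the tests build a plain fixed-size
    // byte array and cast it instead.  Placeholder bytes only.
    var testHash = btcwire.ShaHash([btcwire.HashSize]byte{
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
    })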
This commit changes MsgBlock to enforce a 1MB max payload per the spec.
Previously it was only limited to the max overall message size. While
here, also enforce max payloads per message type (instead of only the max
overall message payload) when writing messages.
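A sketch of the write-side check (the helper and error text are
illustrative; MaxPayloadLength and Command come from the btcwire Message
interface):

    package example

    import (
        "fmt"

        "github.com/conformal/btcwire"
    )

    // checkPayload rejects a payload that exceeds the limit for its
    // specific message type rather than only the overall maximum message
    // payload.  For MsgBlock that per-type limit is 1MB per the spec.
    func checkPayload(msg btcwire.Message, payload []byte, pver uint32) error {
        mpl := msg.MaxPayloadLength(pver)
        if uint32(len(payload)) > mpl {
            return fmt.Errorf("payload is %d bytes, which exceeds the "+
                "max of %d for command [%s]", len(payload), mpl,
                msg.Command())
        }
        return nil
    }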
This commit corrects the tests so that the main API functions are tested
against the latest protocol version, but the TxSha and BlockSha functions
are run against the specific protocol version used to encode the test
data. This will help future-proof the tests against protocol changes.
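The shape of that split, as a sketch with hypothetical test-table fields
(encoding through the public API always uses the latest
btcwire.ProtocolVersion, while anything tied to the stored test bytes uses
the version they were generated with):

    package example

    import (
        "bytes"

        "github.com/conformal/btcwire"
    )

    // blockTest is a hypothetical test-table entry: the expected wire
    // bytes carry the protocol version they were encoded with so
    // version-sensitive checks can use it.
    type blockTest struct {
        block   *btcwire.MsgBlock // block under test
        encoded []byte            // expected wire encoding
        pver    uint32            // protocol version encoded was created with
    }

    // encodeLatest exercises the main API against the latest protocol
    // version regardless of the version the test bytes were encoded with.
    func encodeLatest(test *blockTest) ([]byte, error) {
        var buf bytes.Buffer
        if err := test.block.BtcEncode(&buf, btcwire.ProtocolVersion); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }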