Commit graph

9 commits

Author SHA1 Message Date
Dave Collins
6229e35835 wire: Further reduce transaction allocs.
This commit drastically reduces the number of allocations needed to
deserialize a transaction and its scripts by using the combination of a
free list for initially deserializing the individual scripts along with
copying them into a single contiguous byte slice after the final size is
known and modifying each script in the transaction to point to its
location within the contiguous blob.

The end result is only a single allocation that holds all of the scripts
for a transaction regardless of the total number of scripts it has.

The script free list allows a maximum of 12,500 items with each buffer
being 512 bytes.  This implies it will have a peak usage of 6.1MB.  The
values were chosen based on profiling data and a desire to allow at
least 100 scripts per transaction to be simultaneously deserialized by
125 peers.

Also, while optimizing, decode directly into the existing previous
outpoint structure of each transaction input in order to avoid the extra
allocation per input that is otherwise caused when the local escapes to
the heap.

The following is a before and after comparison of the allocations
with the benchmarks that did not change removed:

benchmark              old allocs     new allocs     delta
-----------------------------------------------------------
ReadTxOut              1              0              -100.00%
ReadTxIn               2              0              -100.00%
DeserializeTxSmall     7              5              -28.57%
DeserializeTxLarge     11146          6              -99.95%
2016-06-03 17:09:14 -05:00
Dave Collins
2adfb3b56a wire: Reduce allocs with contiguous slices.
The current code involves a ton of small allocations which is harsh on
the garbage collector and in turn causes a lot of addition runtime
overhead both in terms of additional memory and processing time.

In order to improve the situation, this drasticially reduces the number
of allocations by creating contiguous slices of objects and
deserializing into them.  Since the final data structures consist of
slices of pointers to the objects, they are constructed by pointing them
into the appropriate offset of the contiguous slice.

This could be improved upon even further by converting all of the data
structures provided the wire package to be slices of contiguous objects
directly, however that would be a major breaking API change and would
end up requiring updating a lot more code in every caller.  I do think
that ultimately the API should be changed, but the changes in this
commit already makes a massive difference and it doesn't require
touching any of the callers, so it is a good place to begin.

The following is a before and after comparison of the allocations
with the benchmarks that did not change removed:

benchmark              old allocs     new allocs     delta
-----------------------------------------------------------
DeserializeTxLarge     16715          11146          -33.32%
DecodeGetHeaders       501            2              -99.60%
DecodeHeaders          2001           2              -99.90%
DecodeGetBlocks        501            2              -99.60%
DecodeAddr             3001           2002           -33.29%
DecodeInv              50003          3              -99.99%
DecodeNotFound         50002          3              -99.99%
DecodeMerkleBlock      107            3              -97.20%
2016-06-03 17:08:31 -05:00
Dave Collins
f68cd7422d wire: Reduce allocs with a binary free list.
This introduces a new binary free list which provides a concurrent safe
list of unused buffers for the purpose of serializing and deserializing
primitive integers to their raw binary bytes.

For convenience, the type also provides functions for each of the
primitive unsigned integers that automatically obtain a buffer from the
free list, perform the necessary binary conversion, read from or write
to the given io.Reader or io.Writer, and return the buffer to the free
list.

A global instance of the type has been introduced with a maximum number
of 1024 items. Since each buffer is 8 bytes, it will consume a maximum
of 8KB.  Theoretically, this value would only allow up to 1024 peers
simultaneously reading and writing without having to resort to burdening
the garbage collector with additional allocations.  However, due to the
fact the code is designed in such a way that the buffers are quickly
used and returned to the free list, in practice it can support much more
than 1024 peers without involving the garbage collector since it is
highly unlikely every peer would need a buffer at the exact same time.

The following is a before and after comparison of the allocations
with the benchmarks that did not change removed:

benchmark              old allocs     new allocs     delta
-------------------------------------------------------------
WriteVarInt1           1              0              -100.00%
WriteVarInt3           1              0              -100.00%
WriteVarInt5           1              0              -100.00%
WriteVarInt9           1              0              -100.00%
ReadVarInt1            1              0              -100.00%
ReadVarInt3            1              0              -100.00%
ReadVarInt5            1              0              -100.00%
ReadVarInt9            1              0              -100.00%
ReadVarStr4            3              2              -33.33%
ReadVarStr10           3              2              -33.33%
WriteVarStr4           2              1              -50.00%
WriteVarStr10          2              1              -50.00%
ReadOutPoint           1              0              -100.00%
WriteOutPoint          1              0              -100.00%
ReadTxOut              3              1              -66.67%
WriteTxOut             2              0              -100.00%
ReadTxIn               5              2              -60.00%
WriteTxIn              3              0              -100.00%
DeserializeTxSmall     15             7              -53.33%
DeserializeTxLarge     33428          16715          -50.00%
SerializeTx            8              0              -100.00%
ReadBlockHeader        7              1              -85.71%
WriteBlockHeader       10             4              -60.00%
DecodeGetHeaders       1004           501            -50.10%
DecodeHeaders          18002          4001           -77.77%
DecodeGetBlocks        1004           501            -50.10%
DecodeAddr             9002           4001           -55.55%
DecodeInv              150005         50003          -66.67%
DecodeNotFound         150004         50002          -66.67%
DecodeMerkleBlock      222            108            -51.35%
TxSha                  10             2              -80.00%
2016-06-03 17:08:31 -05:00
Jouke Hofman
c17ff82061 wire: Export (read|write)(VarInt|VarBytes). 2016-02-22 18:11:58 +01:00
Dave Collins
6e402deb35 Relicense to the btcsuite developers.
This commit relicenses all code in this repository to the btcsuite
developers.
2015-05-01 12:00:56 -05:00
Dave Collins
a4a52ae24f wire: Remove errs from BlockHeader/MsgBlock/MsgTx.
This commit removes the error returns from the BlockHeader.BlockSha,
MsgBlock.BlockSha, and MsgTx.TxSha functions since they can never fail and
end up causing a lot of unneeded error checking throughout the code base.

It also updates all call sites for the change.
2015-04-17 01:27:12 -05:00
Dave Collins
6211eef7ee wire: Add new DoubleSha256SH function.
This commit adds a new function which is similar to the DoubleSha256
function except it returns a ShaHash copy instead of a byte slice.  It
also adds a new benchmark for it.

This can be a slight optimization in certain cases where the caller
ultimately wants a ShaHash since it can avoid a heap allocation and
additional copy to convert the result to a ShaHash (the function simply
performs a type cast against the returned array which is not possible
against a []byte).

existing: DoubleSha256     500000   3081 ns/op   32 B/op   1 allocs/op
     new: DoubleSha256SH   500000   2939 ns/op    0 B/op   0 allocs/op

The hashing functions for blocks and transactions have also been updated
to make use of the new function since they directly return the ShaHash.
The transaction change in particular is quite useful since transactions
are frequently hashed and this change allows all of those hashes to avoid
an additional heap allocation.
2015-04-06 11:37:43 -05:00
Dave Collins
62432a6f90 wire: Add func to get pkscript locs from a tx.
This commit provides a new function named PkScriptLocs on the MsgTx type
which can be used to efficiently retrieve a list of offsets for the public
key scripts for the serialized form of the transaction.

This is useful for certain applications which store fully serialized
transactions and want to be able to quickly index into the serialized
transaction to extract a give public key script directly thereby avoiding
the need to deserialize the entire transaction.
2015-03-09 22:09:09 -05:00
Dave Collins
2eef3720a9 Import btcwire repo into wire directory.
This commit contains the entire btcwire repository along with several
changes needed to move all of the files into the wire directory in
order to prepare it for merging.  This does NOT update btcd or any of the
other packages to use the new location as that will be done separately.

- All import paths in the old btcwire test files have been changed to the
  new location
- All references to btcwire as the package name have been chagned to
  wire
- The coveralls badge has been removed since it unfortunately doesn't
  support coverage of sub-packages

This is ongoing work toward #214.
2015-01-31 14:59:57 -06:00
Renamed from msgtx.go (Browse further)