lbcd/wire/msgheaders.go

// Copyright (c) 2013-2016 The btcsuite developers
// Use of this source code is governed by an ISC
// license that can be found in the LICENSE file.

package wire

import (
	"fmt"
	"io"
)

// MaxBlockHeadersPerMsg is the maximum number of block headers that can be in
// a single bitcoin headers message.
const MaxBlockHeadersPerMsg = 2000

// MsgHeaders implements the Message interface and represents a bitcoin headers
// message. It is used to deliver block header information in response
// to a getheaders message (MsgGetHeaders). The maximum number of block headers
// per message is currently 2000. See MsgGetHeaders for details on requesting
// the headers.
type MsgHeaders struct {
	Headers []*BlockHeader
}

// AddBlockHeader adds a new block header to the message.
func (msg *MsgHeaders) AddBlockHeader(bh *BlockHeader) error {
	if len(msg.Headers)+1 > MaxBlockHeadersPerMsg {
		str := fmt.Sprintf("too many block headers in message [max %v]",
			MaxBlockHeadersPerMsg)
		return messageError("MsgHeaders.AddBlockHeader", str)
	}

	msg.Headers = append(msg.Headers, bh)
	return nil
}

// BtcDecode decodes r using the bitcoin protocol encoding into the receiver.
// This is part of the Message interface implementation.
func (msg *MsgHeaders) BtcDecode(r io.Reader, pver uint32, enc MessageEncoding) error {
	count, err := ReadVarInt(r, pver)
	if err != nil {
		return err
	}

	// Limit to max block headers per message.
	if count > MaxBlockHeadersPerMsg {
		str := fmt.Sprintf("too many block headers for message "+
			"[count %v, max %v]", count, MaxBlockHeadersPerMsg)
		return messageError("MsgHeaders.BtcDecode", str)
	}

	// Create a contiguous slice of headers to deserialize into in order to
	// reduce the number of allocations.
	headers := make([]BlockHeader, count)
	msg.Headers = make([]*BlockHeader, 0, count)
	for i := uint64(0); i < count; i++ {
		bh := &headers[i]
		err := readBlockHeader(r, pver, bh)
		if err != nil {
			return err
		}

		txCount, err := ReadVarInt(r, pver)
		if err != nil {
			return err
		}

		// Ensure the transaction count is zero for headers.
		if txCount > 0 {
			str := fmt.Sprintf("block headers may not contain "+
				"transactions [count %v]", txCount)
			return messageError("MsgHeaders.BtcDecode", str)
		}
		msg.AddBlockHeader(bh)
	}

	return nil
}
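
The contiguous-slice technique used in BtcDecode can be shown in isolation. The following is a minimal standalone sketch, not code from this package: the `header` type and `decodeContiguous` function are hypothetical stand-ins for `BlockHeader` and the decode loop, and the `Nonce` assignment merely stands in for reading a header from the wire. It allocates all values once, then fills a slice of pointers that refer into that single backing array.

```go
package main

import "fmt"

// header is a hypothetical stand-in for wire.BlockHeader.
type header struct {
	Nonce uint32
}

// decodeContiguous builds count headers backed by one allocation and returns
// a slice of pointers into that backing array, so only two allocations are
// made regardless of count.
func decodeContiguous(count int) []*header {
	backing := make([]header, count)  // one allocation for all values
	ptrs := make([]*header, 0, count) // one allocation for all pointers
	for i := 0; i < count; i++ {
		h := &backing[i]
		h.Nonce = uint32(i) // stands in for reading a header from the wire
		ptrs = append(ptrs, h)
	}
	return ptrs
}

func main() {
	hs := decodeContiguous(3)
	fmt.Println(len(hs), hs[2].Nonce)
}
```

Callers still receive a slice of pointers, so the exported shape of the data is unchanged; only the number of allocations differs.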

// BtcEncode encodes the receiver to w using the bitcoin protocol encoding.
// This is part of the Message interface implementation.
func (msg *MsgHeaders) BtcEncode(w io.Writer, pver uint32, enc MessageEncoding) error {
	// Limit to max block headers per message.
	count := len(msg.Headers)
	if count > MaxBlockHeadersPerMsg {
		str := fmt.Sprintf("too many block headers for message "+
			"[count %v, max %v]", count, MaxBlockHeadersPerMsg)
		return messageError("MsgHeaders.BtcEncode", str)
	}

	err := WriteVarInt(w, pver, uint64(count))
	if err != nil {
		return err
	}

	for _, bh := range msg.Headers {
		err := writeBlockHeader(w, pver, bh)
		if err != nil {
			return err
		}

		// The wire protocol encoding always includes a 0 for the number
		// of transactions on header messages. This is really just an
		// artifact of the way the original implementation serializes
		// block headers, but it is required.
		err = WriteVarInt(w, pver, 0)
		if err != nil {
			return err
		}
	}

	return nil
}
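
The payload layout that BtcEncode produces (a variable-length count, then each serialized header followed by a zero transaction count) can be sketched with the standard library alone. This is an illustrative sketch, not this package's implementation: `putVarInt` and `encodeHeaders` are hypothetical names, headers are treated as opaque byte arrays, and `headerLen = 80` assumes Bitcoin's 80-byte block header, so a chain with a different header layout would need a different constant.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// headerLen assumes Bitcoin's 80-byte serialized block header.
const headerLen = 80

// putVarInt appends v using the bitcoin variable-length integer encoding.
func putVarInt(buf *bytes.Buffer, v uint64) {
	switch {
	case v < 0xfd:
		buf.WriteByte(byte(v))
	case v <= 0xffff:
		var b [2]byte
		binary.LittleEndian.PutUint16(b[:], uint16(v))
		buf.WriteByte(0xfd)
		buf.Write(b[:])
	case v <= 0xffffffff:
		var b [4]byte
		binary.LittleEndian.PutUint32(b[:], uint32(v))
		buf.WriteByte(0xfe)
		buf.Write(b[:])
	default:
		var b [8]byte
		binary.LittleEndian.PutUint64(b[:], v)
		buf.WriteByte(0xff)
		buf.Write(b[:])
	}
}

// encodeHeaders writes the header count, then each header followed by the
// mandatory zero transaction count.
func encodeHeaders(headers [][headerLen]byte) []byte {
	var buf bytes.Buffer
	putVarInt(&buf, uint64(len(headers)))
	for _, h := range headers {
		buf.Write(h[:])
		putVarInt(&buf, 0) // transaction count is always 0 for headers
	}
	return buf.Bytes()
}

func main() {
	payload := encodeHeaders(make([][headerLen]byte, 2))
	fmt.Println(len(payload)) // 1 count byte + 2 * (80 + 1)
}
```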

// Command returns the protocol command string for the message. This is part
// of the Message interface implementation.
func (msg *MsgHeaders) Command() string {
	return CmdHeaders
}

// MaxPayloadLength returns the maximum length the payload can be for the
// receiver. This is part of the Message interface implementation.
func (msg *MsgHeaders) MaxPayloadLength(pver uint32) uint32 {
	// Num headers (varInt) + max allowed headers (header length + 1 byte
	// for the number of transactions which is always 0).
	return MaxVarIntPayload + ((MaxBlockHeaderPayload + 1) *
		MaxBlockHeadersPerMsg)
}
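
As a worked instance of the formula in MaxPayloadLength, the sketch below plugs in concrete numbers. The function `maxHeadersPayload` is a hypothetical helper, and the inputs are assumptions rather than values read from this package: 9 bytes is the largest variable-length integer encoding (1 prefix byte plus 8 value bytes), and 80 bytes is Bitcoin's block header size. If this package defines a different MaxBlockHeaderPayload, the result changes accordingly.

```go
package main

import "fmt"

// maxHeadersPayload mirrors the shape of the MaxPayloadLength formula:
// varint count + maxHeaders * (header bytes + 1 byte for the zero tx count).
func maxHeadersPayload(maxVarInt, headerLen, maxHeaders uint32) uint32 {
	return maxVarInt + (headerLen+1)*maxHeaders
}

func main() {
	// Assumed inputs: 9-byte max varint, 80-byte header, 2000 headers.
	fmt.Println(maxHeadersPayload(9, 80, 2000)) // 9 + 81*2000 = 162009
}
```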

// NewMsgHeaders returns a new bitcoin headers message that conforms to the
// Message interface. See MsgHeaders for details.
func NewMsgHeaders() *MsgHeaders {
	return &MsgHeaders{
		Headers: make([]*BlockHeader, 0, MaxBlockHeadersPerMsg),
	}
}