lbcd/cmd/addblock/import.go
Dave Collins 491acd4ca6 blockchain: Rework to use new db interface.
This commit is the first stage of several that are planned to convert
the blockchain package into a concurrent safe package that will
ultimately allow support for multi-peer download and concurrent chain
processing.  The goal is to update btcd proper after each step so it can
take advantage of the enhancements as they are developed.

In addition to the aforementioned benefit, this staged approach has been
chosen since it is absolutely critical to maintain consensus.
Separating the changes into several stages makes it easier for reviewers
to logically follow what is happening and therefore helps prevent
consensus bugs.  Naturally there are significant automated tests to help
prevent consensus issues as well.

The main focus of this stage is to convert the blockchain package to use
the new database interface and implement the chain-related functionality
which it no longer handles.  It also aims to improve efficiency in
various areas by making use of the new database and chain capabilities.

The following is an overview of the chain changes:

- Update to use the new database interface
- Add chain-related functionality that the old database used to handle
  - Main chain structure and state
  - Transaction spend tracking
- Implement a new pruned unspent transaction output (utxo) set
  - Provides efficient direct access to the unspent transaction outputs
  - Uses a domain specific compression algorithm that understands the
    standard transaction scripts in order to significantly compress them
  - Removes reliance on the transaction index and paves the way toward
    eventually enabling block pruning
- Modify the New function to accept a Config struct instead of
  inidividual parameters
- Replace the old TxStore type with a new UtxoViewpoint type that makes
  use of the new pruned utxo set
- Convert code to treat the new UtxoViewpoint as a rolling view that is
  used between connects and disconnects to improve efficiency
- Make best chain state always set when the chain instance is created
  - Remove now unnecessary logic for dealing with unset best state
- Make all exported functions concurrent safe
  - Currently using a single chain state lock as it provides a straight
    forward and easy to review path forward however this can be improved
    with more fine grained locking
- Optimize various cases where full blocks were being loaded when only
  the header is needed to help reduce the I/O load
- Add the ability for callers to get a snapshot of the current best
  chain stats in a concurrent safe fashion
  - Does not block callers while new blocks are being processed
- Make error messages that reference transaction outputs consistently
  use <transaction hash>:<output index>
- Introduce a new AssertError type an convert internal consistency
  checks to use it
- Update tests and examples to reflect the changes
- Add a full suite of tests to ensure correct functionality of the new
  code

The following is an overview of the btcd changes:

- Update to use the new database and chain interfaces
- Temporarily remove all code related to the transaction index
- Temporarily remove all code related to the address index
- Convert all code that uses transaction stores to use the new utxo
  view
- Rework several calls that required the block manager for safe
  concurrency to use the chain package directly now that it is
  concurrent safe
- Change all calls to obtain the best hash to use the new best state
  snapshot capability from the chain package
- Remove workaround for limits on fetching height ranges since the new
  database interface no longer imposes them
- Correct the gettxout RPC handler to return the best chain hash as
  opposed the hash the txout was found in
- Optimize various RPC handlers:
  - Change several of the RPC handlers to use the new chain snapshot
    capability to avoid needlessly loading data
  - Update several handlers to use new functionality to avoid accessing
    the block manager so they are able to return the data without
    blocking when the server is busy processing blocks
  - Update non-verbose getblock to avoid deserialization and
    serialization overhead
  - Update getblockheader to request the block height directly from
    chain and only load the header
  - Update getdifficulty to use the new cached data from chain
  - Update getmininginfo to use the new cached data from chain
  - Update non-verbose getrawtransaction to avoid deserialization and
    serialization overhead
  - Update gettxout to use the new utxo store versus loading
    full transactions using the transaction index

The following is an overview of the utility changes:
- Update addblock to use the new database and chain interfaces
- Update findcheckpoint to use the new database and chain interfaces
- Remove the dropafter utility which is no longer supported

NOTE: The transaction index and address index will be reimplemented in
another commit.
2016-04-11 16:47:27 -05:00

317 lines
8.7 KiB
Go

// Copyright (c) 2013-2016 The btcsuite developers
// Use of this source code is governed by an ISC
// license that can be found in the LICENSE file.
package main
import (
"encoding/binary"
"fmt"
"io"
"sync"
"time"
"github.com/btcsuite/btcd/blockchain"
database "github.com/btcsuite/btcd/database2"
"github.com/btcsuite/btcd/wire"
"github.com/btcsuite/btcutil"
)
var zeroHash = wire.ShaHash{}
// importResults houses the stats and result as an import operation.
type importResults struct {
blocksProcessed int64
blocksImported int64
err error
}
// blockImporter houses information about an ongoing import from a block data
// file to the block database.
type blockImporter struct {
db database.DB
chain *blockchain.BlockChain
medianTime blockchain.MedianTimeSource
r io.ReadSeeker
processQueue chan []byte
doneChan chan bool
errChan chan error
quit chan struct{}
wg sync.WaitGroup
blocksProcessed int64
blocksImported int64
receivedLogBlocks int64
receivedLogTx int64
lastHeight int64
lastBlockTime time.Time
lastLogTime time.Time
}
// readBlock reads the next block from the input file.
func (bi *blockImporter) readBlock() ([]byte, error) {
// The block file format is:
// <network> <block length> <serialized block>
var net uint32
err := binary.Read(bi.r, binary.LittleEndian, &net)
if err != nil {
if err != io.EOF {
return nil, err
}
// No block and no error means there are no more blocks to read.
return nil, nil
}
if net != uint32(activeNetParams.Net) {
return nil, fmt.Errorf("network mismatch -- got %x, want %x",
net, uint32(activeNetParams.Net))
}
// Read the block length and ensure it is sane.
var blockLen uint32
if err := binary.Read(bi.r, binary.LittleEndian, &blockLen); err != nil {
return nil, err
}
if blockLen > wire.MaxBlockPayload {
return nil, fmt.Errorf("block payload of %d bytes is larger "+
"than the max allowed %d bytes", blockLen,
wire.MaxBlockPayload)
}
serializedBlock := make([]byte, blockLen)
if _, err := io.ReadFull(bi.r, serializedBlock); err != nil {
return nil, err
}
return serializedBlock, nil
}
// processBlock potentially imports the block into the database. It first
// deserializes the raw block while checking for errors. Already known blocks
// are skipped and orphan blocks are considered errors. Finally, it runs the
// block through the chain rules to ensure it follows all rules and matches
// up to the known checkpoint. Returns whether the block was imported along
// with any potential errors.
func (bi *blockImporter) processBlock(serializedBlock []byte) (bool, error) {
// Deserialize the block which includes checks for malformed blocks.
block, err := btcutil.NewBlockFromBytes(serializedBlock)
if err != nil {
return false, err
}
// update progress statistics
bi.lastBlockTime = block.MsgBlock().Header.Timestamp
bi.receivedLogTx += int64(len(block.MsgBlock().Transactions))
// Skip blocks that already exist.
blockSha := block.Sha()
exists, err := bi.chain.HaveBlock(blockSha)
if err != nil {
return false, err
}
if exists {
return false, nil
}
// Don't bother trying to process orphans.
prevHash := &block.MsgBlock().Header.PrevBlock
if !prevHash.IsEqual(&zeroHash) {
exists, err := bi.chain.HaveBlock(prevHash)
if err != nil {
return false, err
}
if !exists {
return false, fmt.Errorf("import file contains block "+
"%v which does not link to the available "+
"block chain", prevHash)
}
}
// Ensure the blocks follows all of the chain rules and match up to the
// known checkpoints.
isOrphan, err := bi.chain.ProcessBlock(block, bi.medianTime,
blockchain.BFFastAdd)
if err != nil {
return false, err
}
if isOrphan {
return false, fmt.Errorf("import file contains an orphan "+
"block: %v", blockSha)
}
return true, nil
}
// readHandler is the main handler for reading blocks from the import file.
// This allows block processing to take place in parallel with block reads.
// It must be run as a goroutine.
func (bi *blockImporter) readHandler() {
out:
for {
// Read the next block from the file and if anything goes wrong
// notify the status handler with the error and bail.
serializedBlock, err := bi.readBlock()
if err != nil {
bi.errChan <- fmt.Errorf("Error reading from input "+
"file: %v", err.Error())
break out
}
// A nil block with no error means we're done.
if serializedBlock == nil {
break out
}
// Send the block or quit if we've been signalled to exit by
// the status handler due to an error elsewhere.
select {
case bi.processQueue <- serializedBlock:
case <-bi.quit:
break out
}
}
// Close the processing channel to signal no more blocks are coming.
close(bi.processQueue)
bi.wg.Done()
}
// logProgress logs block progress as an information message. In order to
// prevent spam, it limits logging to one message every cfg.Progress seconds
// with duration and totals included.
func (bi *blockImporter) logProgress() {
bi.receivedLogBlocks++
now := time.Now()
duration := now.Sub(bi.lastLogTime)
if duration < time.Second*time.Duration(cfg.Progress) {
return
}
// Truncate the duration to 10s of milliseconds.
durationMillis := int64(duration / time.Millisecond)
tDuration := 10 * time.Millisecond * time.Duration(durationMillis/10)
// Log information about new block height.
blockStr := "blocks"
if bi.receivedLogBlocks == 1 {
blockStr = "block"
}
txStr := "transactions"
if bi.receivedLogTx == 1 {
txStr = "transaction"
}
log.Infof("Processed %d %s in the last %s (%d %s, height %d, %s)",
bi.receivedLogBlocks, blockStr, tDuration, bi.receivedLogTx,
txStr, bi.lastHeight, bi.lastBlockTime)
bi.receivedLogBlocks = 0
bi.receivedLogTx = 0
bi.lastLogTime = now
}
// processHandler is the main handler for processing blocks. This allows block
// processing to take place in parallel with block reads from the import file.
// It must be run as a goroutine.
func (bi *blockImporter) processHandler() {
out:
for {
select {
case serializedBlock, ok := <-bi.processQueue:
// We're done when the channel is closed.
if !ok {
break out
}
bi.blocksProcessed++
bi.lastHeight++
imported, err := bi.processBlock(serializedBlock)
if err != nil {
bi.errChan <- err
break out
}
if imported {
bi.blocksImported++
}
bi.logProgress()
case <-bi.quit:
break out
}
}
bi.wg.Done()
}
// statusHandler waits for updates from the import operation and notifies
// the passed doneChan with the results of the import. It also causes all
// goroutines to exit if an error is reported from any of them.
func (bi *blockImporter) statusHandler(resultsChan chan *importResults) {
select {
// An error from either of the goroutines means we're done so signal
// caller with the error and signal all goroutines to quit.
case err := <-bi.errChan:
resultsChan <- &importResults{
blocksProcessed: bi.blocksProcessed,
blocksImported: bi.blocksImported,
err: err,
}
close(bi.quit)
// The import finished normally.
case <-bi.doneChan:
resultsChan <- &importResults{
blocksProcessed: bi.blocksProcessed,
blocksImported: bi.blocksImported,
err: nil,
}
}
}
// Import is the core function which handles importing the blocks from the file
// associated with the block importer to the database. It returns a channel
// on which the results will be returned when the operation has completed.
func (bi *blockImporter) Import() chan *importResults {
// Start up the read and process handling goroutines. This setup allows
// blocks to be read from disk in parallel while being processed.
bi.wg.Add(2)
go bi.readHandler()
go bi.processHandler()
// Wait for the import to finish in a separate goroutine and signal
// the status handler when done.
go func() {
bi.wg.Wait()
bi.doneChan <- true
}()
// Start the status handler and return the result channel that it will
// send the results on when the import is done.
resultChan := make(chan *importResults)
go bi.statusHandler(resultChan)
return resultChan
}
// newBlockImporter returns a new importer for the provided file reader seeker
// and database.
func newBlockImporter(db database.DB, r io.ReadSeeker) (*blockImporter, error) {
chain, err := blockchain.New(&blockchain.Config{
DB: db,
ChainParams: activeNetParams,
})
if err != nil {
return nil, err
}
return &blockImporter{
db: db,
r: r,
processQueue: make(chan []byte, 2),
doneChan: make(chan bool),
errChan: make(chan error),
quit: make(chan struct{}),
chain: chain,
medianTime: blockchain.NewMedianTime(),
lastLogTime: time.Now(),
}, nil
}