Commit graph

3 commits

Author SHA1 Message Date
Gavin Andresen
8dc206a1e2 Reject non-canonically-encoded sizes
The length of vectors, maps, sets, etc are serialized using
Write/ReadCompactSize -- which, unfortunately, do not use a
unique encoding.

So deserializing and then re-serializing a transaction (for example)
can give you different bits than you started with. That doesn't
cause any problems that we are aware of, but it is exactly the type
of subtle mismatch that can lead to exploits.

With this pull, reading a non-canonical CompactSize throws an
exception, which means nodes will ignore 'tx' or 'block' or
other messages that are not properly encoded.

Please check my logic... but this change is safe with respect to
causing a network split. Old clients that receive
non-canonically-encoded transactions or blocks deserialize
them into CTransaction/CBlock structures in memory, and then
re-serialize them before relaying them to peers.

And please check my logic with respect to causing a blockchain
split: there are no CompactSize fields in the block header, so
the block hash is always canonical. The merkle root in the block
header is computed on a vector<CTransaction>, so
any non-canonical encoding of the transactions in 'tx' or 'block'
messages is erased as they are read into memory by old clients,
and does not affect the block hash. And, as noted above, old
clients re-serialize (with canonical encoding) 'tx' and 'block'
messages before relaying to peers.
2013-08-09 10:01:35 +10:00
Gavin Andresen
87b9931bed Fix signed/unsigned comparison warnings 2013-04-03 14:04:21 -04:00
Pieter Wuille
4d6144f97f Compact serialization for variable-length integers
Variable-length integers: bytes are a MSB base-128 encoding of the number.
The high bit in each byte signifies whether another digit follows. To make
the encoding is one-to-one, one is subtracted from all but the last digit.
Thus, the byte sequence a[] with length len, where all but the last byte
has bit 128 set, encodes the number:

  (a[len-1] & 0x7F) + sum(i=1..len-1, 128^i*((a[len-i-1] & 0x7F)+1))

Properties:
* Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes)
* Every integer has exactly one encoding
* Encoding does not depend on size of original integer type
2012-10-20 23:08:56 +02:00