In this commit, we add two new package level functions: `EncodeM`, and
`DecodeGeneric`. The new encode method is intended to allow callers to
specify that they want to use the new bech32m checksum. This should be
used when encoding segwit addresses with version 1 and beyond. The new
`DecodeGeneric` function allows a caller to decode a bech32 and bech32m
string with a single function. A new return value is added which is the
version of the returned bech32 string, which allows callers to perform
additional segwit addr validation (v1+ should use bech32m etc).
We opted to add new functions rather than modifying the existing
functions to not cause a breaking API change, as most uses in the wild
can just use the existing functions, and only taproot related logic/code
needs to worry about the new methods.
A series of tests have been added to ensure that `DecodeGeneric`
extracts the proper bech version, and we've also adopted the bech32m
tests from BIP 350.
In this commit, we add an additional field to the ErrInvalidChecksum,
the bech32m version of a checksum. When decoding, we don't now what
version they actually _intended_ to use, so we'll opt to include both
checksums to aide in debugging and error reporting.
Since bech32 itself works with data encoded with 5 bits per byte (aka
base32) padded out to the nearest byte boundary, the existing functions
for Encode and Decode accept and return data encoded that way.
However, the most common way to use bech32 is to encode data that is
already encoded with 8 bits per byte (aka base256) without padding which
means it is up to the caller to use the ConvertBits function properly to
convert between the two encodings.
Consequently, this introduces two convenience functions for working
directly with base256-encoded data named EncodeFromBase256 and
DecodeToBase256 along with a full set of tests to ensure they work
expected.
BIP173 specifically calls out that encoders must always output an all
lowercase bech32 string and that the lowercase form is used when
determining a character's value for calculating the checksum.
Currently, the implementation does not respect either of those
requirements.
This modifies the Encode function to convert the provided HRP to
lowercase to ensure the requirements are satisfied and adds tests
accordingly.
This commit brings a host of improvements to the bech32 package. The
public interface of the package remains unchanged.
Summary of changes:
* Improved error handling using dedicated error types. Programmatically
detect if the errors produced are the expected ones.
* Improve test coverage to test more corner cases. Added test vectors
from Bitcoin Core.
* Add a benchmark for a full encode/decode cycle of a bech32 string.
* Add a new function DecodeNoLimit, for decoding large bech32 encoded
strings. It does NOT validate against the BIP-173 maximum length
allowed for bech32 strings.
* Automatically convert the HRP to lowercase in Encode function.
* Improve performance of encode/decode functions by using
strings.Builder.
* Improve memory allocation in ConvertBits function.
* Updated documentation.
Credits: @matheusd
Closes#152 and #168.
The bech32 format is used to encode base32 data
in a string format specified in BIP 173. This is
among other things used to encode native segwit
addresses, also specified in the same BIP.
This commit adds the package bech32, which
contains the necessary utility methods to create
bech32 encoded strings from arbitrary data.