Use Non-Adjacent Form (NAF) of large numbers to reduce ScalarMult computation times.
Preliminary results indicate around an 8-9% speed improvement according to BenchmarkScalarMult.
The algorithm used is 3.77 from Guide to Elliptic Curve Cryptography by Hankerson, et al.
This closes #3
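For reference, the NAF recoding itself can be sketched as follows. This is a minimal illustration of the standard digit-by-digit NAF computation, not the code from this commit; the names naf, nafUint64, and nonZero are hypothetical. Fewer nonzero digits means fewer point additions in a double-and-add loop.

```go
package main

import (
	"fmt"
	"math/big"
)

// naf returns the non-adjacent form of k (k >= 0), least significant digit
// first. Every digit is -1, 0, or 1, and no two adjacent digits are nonzero.
func naf(k *big.Int) []int8 {
	n := new(big.Int).Set(k) // work on a copy so the caller's value is untouched
	var digits []int8
	for n.Sign() > 0 {
		var d int8
		if n.Bit(0) == 1 {
			// d = 2 - (n mod 4): 1 when n = 1 (mod 4), -1 when n = 3 (mod 4).
			d = 2 - int8(n.Bit(0)|n.Bit(1)<<1)
			n.Sub(n, big.NewInt(int64(d)))
		}
		digits = append(digits, d)
		n.Rsh(n, 1)
	}
	return digits
}

// nafUint64 is a convenience wrapper (hypothetical) for machine-sized inputs.
func nafUint64(k uint64) []int8 {
	return naf(new(big.Int).SetUint64(k))
}

// nonZero counts the nonzero digits, i.e. the additions a scalar
// multiplication loop would perform.
func nonZero(digits []int8) int {
	count := 0
	for _, d := range digits {
		if d != 0 {
			count++
		}
	}
	return count
}

func main() {
	// 255 has eight set bits in binary, but 255 = 256 - 1 in NAF.
	digits := nafUint64(255)
	fmt.Println(digits, nonZero(digits))
}
```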
This implements a speedup to ScalarMult using the endomorphism available to secp256k1.
Note the constants lambda, beta, a1, b1, a2 and b2 are from here:
https://bitcointalk.org/index.php?topic=3238.0
Preliminary tests indicate a speedup of 17-20% (BenchScalarMult).
More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.
Note the NAF optimization was specifically not done as that's the purview of another issue.
Changed both ScalarMult and ScalarBaseMult to take advantage of curve.N to reduce k.
This results in an 80% speedup for large values of k in ScalarBaseMult.
Note the new test BenchmarkScalarBaseMultLarge is how that speedup number can
be checked.
This closes #1
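The reduction works because k*G and (k mod N)*G are the same point when N is the order of G. A minimal sketch of the reduction step, assuming the scalar arrives as a big-endian byte slice; reduceScalar is a hypothetical helper name, not the function from this commit:

```go
package main

import (
	"fmt"
	"math/big"
)

// curveN is the order of the secp256k1 base point G.
var curveN, _ = new(big.Int).SetString(
	"FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141", 16)

// reduceScalar returns k mod N as a big-endian byte slice. Scalars already
// below N are returned without a modular reduction.
func reduceScalar(k []byte) []byte {
	kInt := new(big.Int).SetBytes(k)
	if kInt.Cmp(curveN) >= 0 {
		kInt.Mod(kInt, curveN)
	}
	return kInt.Bytes()
}

func main() {
	// A 64-byte scalar is reduced to at most 32 bytes before any EC math.
	large := make([]byte, 64)
	for i := range large {
		large[i] = 0xff
	}
	fmt.Println(len(large), "->", len(reduceScalar(large)))
}
```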
This commit reworks the way that the pre-computed table which is used to
accelerate scalar base multiplication is generated and loaded to make use
of the go generate infrastructure and greatly reduce the memory needed to
compile as well as speed up the compile.
Previously, the table was being generated using the in-memory
representation directly written into the file. Since the table has a very
large number of entries, the Go compiler was taking up to nearly 1GB to
compile. It also took a comparatively long period of time to compile.
Instead, this commit modifies the generated table to be a serialized,
compressed, and base64-encoded byte slice. At init time, this process is
reversed to create the in-memory representation. This approach provides
fast compile times with much lower memory needed to compile (16MB versus
1GB). In addition, the init time cost is extremely low, especially as
compared to computing the entire table.
Finally, the automatic generation wasn't really automatic. It is now
fully automatic with 'go generate'.
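The encode and decode round trip described above can be sketched like this. It is an illustration only: zlib is assumed as the compression format, and encodeTable and decodeTable are hypothetical names rather than the functions in this commit.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/base64"
	"fmt"
	"io"
)

// encodeTable compresses the serialized table and base64-encodes it so the
// generator can emit it as a single string literal.
func encodeTable(raw []byte) (string, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(raw); err != nil {
		return "", err
	}
	if err := w.Close(); err != nil { // Close flushes the remaining data
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decodeTable reverses encodeTable. This is the work done once at init time.
func decodeTable(encoded string) ([]byte, error) {
	compressed, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	r, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	// Stand-in for the serialized table bytes.
	raw := bytes.Repeat([]byte{0x01, 0x02, 0x03, 0x04}, 1024)
	encoded, err := encodeTable(raw)
	if err != nil {
		panic(err)
	}
	decoded, err := decodeTable(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), len(encoded), bytes.Equal(raw, decoded))
}
```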
The addition of the pre-computed values for the ScalarBaseMult
optimizations added a benign race condition since a pointer to each
pre-computed Jacobian point was being used in the call to addJacobian
instead of a local stack copy.
As a result, the code which normalizes the field values to check for
further optimization conditions such as equal Z values raced against the
IsZero checks when multiple goroutines were performing EC operations since
they were all operating on the same point in memory.
In practice this was benign since the value was being replaced with the
same thing and thus it was the same before and after the race, but it's
never a good idea to have races.
Code uses a windowing/precomputing strategy to minimize ECC math.
Every 8-bit window of the 256 bits that compose a possible scalar multiple has a complete map that's pre-computed.
The precomputed data is in secp256k1.go and the generator for that file is in gensecp256k1.go.
Also fixed a spelling error in a benchmark test.
Results so far seem to indicate the time taken is about 35% of what it was before.
Closes #2
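The windowing idea can be sketched independently of the curve math. The example below is a toy: it uses addition of integers modulo a small prime as a stand-in group, with a 64-bit scalar (8 windows) instead of 256 bits (32 windows). The constants p and g and all function names are assumptions for illustration, not values from secp256k1.

```go
package main

import "fmt"

const (
	p = 1_000_000_007 // toy group modulus (assumption, not a curve parameter)
	g = 123_456_789   // toy "base point"
)

// table[i][j] holds j * 256^i * g (mod p): one complete map per 8-bit window.
var table [8][256]uint64

func init() {
	base := uint64(g) // 256^i * g (mod p) for the current window i
	for i := range table {
		for j := range table[i] {
			table[i][j] = uint64(j) * base % p
		}
		base = base * 256 % p
	}
}

// windowedBaseMult computes k*g (mod p) using one table lookup and one
// group addition per byte of k, with no doublings.
func windowedBaseMult(k uint64) uint64 {
	var acc uint64
	for i := 0; i < 8; i++ {
		window := byte(k >> (8 * uint(i)))
		acc = (acc + table[i][window]) % p
	}
	return acc
}

// naiveBaseMult computes the same value directly, for comparison.
func naiveBaseMult(k uint64) uint64 {
	return (k % p) * g % p
}

func main() {
	k := uint64(0x0123456789abcdef)
	fmt.Println(windowedBaseMult(k) == naiveBaseMult(k))
}
```

With a real curve each table entry would be a precomputed point and each addition a point addition, so a 256-bit scalar costs at most 32 additions.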
This commit adds an example test file so it integrates nicely with Go's
example tooling.
This allows the example output to be tested as a part of running the
normal Go tests to help ensure it doesn't get out of date with the code.
It is also nice to have the examples in one place rather than repeating it
in doc.go and README.md.
Links and information about the examples have been included in README.md in
place of the examples and doc.go has been updated accordingly.
- Keep comments to 80 cols for consistency with the rest of the code base
- Made verify a method off of Signature instead of PublicKey since one
verifies a signature with a public key as opposed to the other way
around
- Return new signature from Sign function directly rather than creating a
local temporary variable
- Modify a couple of comments as recommended by @owainga
- Update sample usage in doc.go for both signing messages and verifying
signatures
ok @owainga
This change removes the internal pad function in favor of a more optimized
paddedAppend function. Unlike pad, which would always allocate a new
slice of the desired size and copy the bytes into it, paddedAppend
only appends the leading padding when necessary, and uses the builtin
append to copy the remaining source bytes. pad was also used in
combination with another call to the builtin copy func to copy into a
zeroed byte slice. As the slice is now created using make with an
initial length of zero, this copy can also be removed.
As confirmed by poking the bytes with the unsafe package, gc does not
zero array elements between the len and cap when allocating slices
with make(). In combination with the paddedAppend func, this results
in only a single copy of each byte, with no unnecessary zeroing, when
creating the serialized pubkeys. This has not been tested with other
Go compilers (namely, gccgo and llgo), but the new behavior is still
functionally correct regardless of compiler optimizations.
The TestPad function has been removed as the pad func it tested has
likewise been removed.
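A minimal sketch of the approach described above. The signature shown here is an assumption based on the description, and the format byte in the usage is only an example value.

```go
package main

import "fmt"

// paddedAppend appends src to dst, first adding leading zero bytes so that
// the appended field occupies at least size bytes. No padding is added when
// src is already size bytes or longer.
func paddedAppend(size int, dst, src []byte) []byte {
	for i := 0; i < size-len(src); i++ {
		dst = append(dst, 0)
	}
	return append(dst, src...)
}

func main() {
	x := []byte{0x01, 0x02}

	// Reserve the full capacity up front with a length of zero, then append
	// a format byte followed by a field padded to 4 bytes.
	out := make([]byte, 0, 5)
	out = append(out, 0x04)
	out = paddedAppend(4, out, x)
	fmt.Printf("%x\n", out) // prints 0400000102
}
```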
ok @davecgh
Since the Z values are normalized (which ordinarily mutates them as
needed) before checking for equality, the race detector gets confused when
using a global value for the field representation of the value 1 and
passing it into the various internal arithmetic routines and reports a
false positive.
Even though the race was a false positive and had no adverse effects, this
commit silences the race detector by creating new variables at the top
level and passing them instead of the global fieldOne variable. The
global is still used for comparison operations since those have no
potential to mutate the value and hence don't trigger the race detector.
This commit exposes a new function named Serialize on the Signature type
which can be used to obtain a DER encoded signature. Previously this
function was named sigDer and was part of btcscript, but as @donovanhide
pointed out in issue btcscript/#3, it really should have been part of this
package.
ok @owainga
Since the function was only exported for use by the test package (and was
commented as such), just move it into the internal_test.go file so it is
only available when the tests run.
This commit essentially rewrites all of the primitives needed to perform
the arithmetic for ECDSA signature verification of secp256k1 signatures to
significantly speed it up. Benchmarking has shown signature verification
is roughly 10 times faster with this commit over the previous.
In particular, it introduces a new field value which is used to perform the
modular field arithmetic using fixed-precision operations specifically
tailored for the secp256k1 prime. The field also takes advantage of
special properties of the prime for significantly faster modular reduction
than is available through generic methods.
In addition, the curve point addition and doubling have been optimized to
minimize the number of field multiplications in favor of field squarings
since they are quite a bit faster. The routines also now look for
certain conditions such as z values of 1 or equivalent z values which
can be used to further reduce the number of multiplications needed when
possible.
Note there are still quite a few more optimizations that could be done
such as using precomputation for ScalarBaseMult, making use of the
secp256k1 endomorphism, and using windowed NAF, however this work already
offers significant performance improvements.
For example, testing 10000 random signature verifications resulted in:
New btcec took 15.9821565s
Old btcec took 2m34.1016716s
Closes conformal/btcd#26.