This implements a speedup to ScalarMult using the endomorphism available to secp256k1.
Note the constants lambda, beta, a1, b1, a2 and b2 are from here:
https://bitcointalk.org/index.php?topic=3238.0
Preliminary tests indicate a speedup of between 17%-20% (BenchScalarMult).
More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.
Note the NAF optimization was specifically not done as that's the purview of another issue.
Changed both ScalarMult and ScalarBaseMult to take advantage of curve.N to reduce k.
This results in a 80% speedup to large values of k for ScalarBaseMult.
Note the new test BenchmarkScalarBaseMultLarge is how that speedup number can
be checked.
This closes#1
This commit reworks the way that the pre-computed table which is used to
accelerate scalar base multiple is generated and loaded to make use of the
go generate infrastructure and greatly reduce the memory needed to compile
as well as speed up the compile.
Previously, the table was being generated using the in-memory
representation directly written into the file. Since the table has a very
large number of entries, the Go compiler was taking up to nearly 1GB to
compile. It also took a comparatively long period of time to compile.
Instead, this commit modifies the generated table to be a serialized,
compressed, and base64-encoded byte slice. At init time, this process is
reversed to create the in-memory representation. This approach provides
fast compile times with much lower memory needed to compile (16MB versus
1GB). In addition, the init time cost is extremely low, especially as
compared to computing the entire table.
Finally, the automatic generation wasn't really automatic. It is now
fully automatic with 'go generate'.
Code uses a windowing/precomputing strategy to minimize ECC math.
Every 8-bit window of the 256 bits that compose a possible scalar multiple has a complete map that's pre-computed.
The precomputed data is in secp256k1.go and the generator for that file is in gensecp256k1.go
Also fixed a spelling error in a benchmark test.
Results so far seem to indicate the time taken is about 35% of what it was before.
Closes#2