Commit graph

15 commits

Author SHA1 Message Date
Jimmy Song
6c36218ef3 Optimize ScalarMult with NAF
Use Non-Adjacent Form (NAF) of large numbers to reduce ScalarMult computation times.

Preliminary results indicate a speed improvement of around 8-9% according to BenchmarkScalarMult.

The algorithm used is 3.77 from Guide to Elliptic Curve Cryptography by Hankerson et al.

This closes #3
2015-02-05 08:28:51 -06:00
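To illustrate the NAF idea in the commit above, here is a minimal Go sketch of computing a non-adjacent form with math/big. It is not the code from the commit: the names naf, nafOfInt and evalNAF are hypothetical, and the digit-by-digit loop is the textbook construction, assumed here for illustration. Every digit is -1, 0 or 1 and no two neighbouring digits are both non-zero, which is what reduces the number of point additions in a double-and-add loop.

```go
package main

import (
	"fmt"
	"math/big"
)

// naf returns the non-adjacent form of a non-negative k as signed digits
// in {-1, 0, 1}, least significant digit first.
func naf(k *big.Int) []int8 {
	n := new(big.Int).Set(k) // work on a copy so the caller's value is untouched
	var digits []int8
	for n.Sign() > 0 {
		var d int8
		if n.Bit(0) == 1 {
			// n mod 4 is 1 or 3 here, so d = 2 - (n mod 4) is +1 or -1.
			// Subtracting d leaves n divisible by 4, which forces the
			// next digit to be zero.
			d = 2 - int8(n.Bit(0)|n.Bit(1)<<1)
			n.Sub(n, big.NewInt(int64(d)))
		}
		digits = append(digits, d)
		n.Rsh(n, 1)
	}
	return digits
}

// nafOfInt is a hypothetical convenience wrapper for small values.
func nafOfInt(k int64) []int8 { return naf(big.NewInt(k)) }

// evalNAF recombines the digits into an integer. It is only meant for
// small inputs that fit in an int64.
func evalNAF(digits []int8) int64 {
	var v int64
	for i := len(digits) - 1; i >= 0; i-- {
		v = 2*v + int64(digits[i])
	}
	return v
}

func main() {
	d := nafOfInt(7) // 7 = 8 - 1, so only two non-zero digits are needed
	fmt.Println(d, evalNAF(d))
}
```

In a scalar multiplication, a digit of +1 would add the point and a digit of -1 would add its negation, which is cheap to compute on an elliptic curve.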
Jimmy Song
95b23c293c Optimize ScalarMult using endomorphism
This implements a speedup to ScalarMult using the endomorphism available to secp256k1.

Note the constants lambda, beta, a1, b1, a2 and b2 are from here:

https://bitcointalk.org/index.php?topic=3238.0

Preliminary tests indicate a speedup of between 17% and 20% (BenchScalarMult).

More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.

Note the NAF optimization was specifically not done as that's the purview of another issue.

Changed both ScalarMult and ScalarBaseMult to take advantage of curve.N to reduce k.
This results in an 80% speedup for large values of k in ScalarBaseMult.
The new benchmark BenchmarkScalarBaseMultLarge can be used to check that
speedup number.

This closes #1
2015-02-03 14:14:21 -06:00
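The reduction of k by curve.N mentioned in the commit above can be sketched in Go as follows. This covers only that one step, not the endomorphism itself. The helper name reduceScalar is hypothetical; the value of N is the published order of the secp256k1 base point. Because k*G depends only on k mod N, an oversized scalar can be shortened before any curve arithmetic is done.

```go
package main

import (
	"fmt"
	"math/big"
)

// curveN is the order of the secp256k1 base point G.
var curveN, _ = new(big.Int).SetString(
	"FFFFFFFF"+"FFFFFFFF"+"FFFFFFFF"+"FFFFFFFE"+
		"BAAEDCE6"+"AF48A03B"+"BFD25E8C"+"D0364141", 16)

// reduceScalar interprets k as a big-endian integer and returns k mod N as
// big-endian bytes. Scalars already below N are returned unchanged in value.
func reduceScalar(k []byte) []byte {
	kInt := new(big.Int).SetBytes(k)
	if kInt.Cmp(curveN) >= 0 {
		kInt.Mod(kInt, curveN)
	}
	return kInt.Bytes()
}

func main() {
	// A 64-byte scalar, far larger than N.
	large := make([]byte, 64)
	for i := range large {
		large[i] = 0xff
	}
	reduced := reduceScalar(large)
	fmt.Printf("%d bytes reduced to %d bytes\n", len(large), len(reduced))
}
```

A multiplication routine that walks the scalar byte by byte then has at most 32 bytes to process, however long the input was.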
John C. Vernaleo
d4d2f622b5 Fix bug and inconsistent error msg seen by lint. 2015-02-03 10:02:44 -06:00
Dave Collins
9535058a7b Rework the pre-computed table generation and load.
This commit reworks the way that the pre-computed table which is used to
accelerate scalar base multiplication is generated and loaded to make use of
the go generate infrastructure and greatly reduce the memory needed to compile
as well as speed up the compile.

Previously, the table was being generated using the in-memory
representation directly written into the file.  Since the table has a very
large number of entries, the Go compiler was taking up to nearly 1GB to
compile.  It also took a comparatively long period of time to compile.

Instead, this commit modifies the generated table to be a serialized,
compressed, and base64-encoded byte slice.  At init time, this process is
reversed to create the in-memory representation.  This approach provides
fast compile times with much lower memory needed to compile (16MB versus
1GB).  In addition, the init time cost is extremely low, especially as
compared to computing the entire table.

Finally, the automatic generation wasn't really automatic.  It is now
fully automatic with 'go generate'.
2015-02-01 03:26:51 -06:00
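The serialize, compress and base64-encode round trip described in the commit above might look roughly like the following Go sketch. The entry type, the function names and the choice of zlib are assumptions made for illustration. The commit message does not name the compression format or the table layout.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/base64"
	"fmt"
	"io"
)

const entrySize = 32

// entry is a hypothetical stand-in for one serialized table element.
type entry [entrySize]byte

// encodeTable serializes the entries back to back, compresses the result
// and returns it as a base64 string suitable for a generated source file.
func encodeTable(table []entry) (string, error) {
	var compressed bytes.Buffer
	w := zlib.NewWriter(&compressed)
	for i := range table {
		if _, err := w.Write(table[i][:]); err != nil {
			return "", err
		}
	}
	if err := w.Close(); err != nil { // flush the remaining compressed data
		return "", err
	}
	return base64.StdEncoding.EncodeToString(compressed.Bytes()), nil
}

// decodeTable reverses encodeTable; this is the work done at init time.
func decodeTable(encoded string) ([]entry, error) {
	compressed, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, err
	}
	r, err := zlib.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	raw, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}
	if len(raw)%entrySize != 0 {
		return nil, fmt.Errorf("table length %d is not a multiple of %d", len(raw), entrySize)
	}
	table := make([]entry, len(raw)/entrySize)
	for i := range table {
		copy(table[i][:], raw[i*entrySize:(i+1)*entrySize])
	}
	return table, nil
}

func main() {
	encoded, err := encodeTable([]entry{{1, 2, 3}, {4, 5, 6}})
	if err != nil {
		panic(err)
	}
	table, err := decodeTable(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(encoded), len(table))
}
```

The compiler then only sees one string constant instead of a large composite literal, which is the source of the memory savings the commit describes.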
David Evans
f9365fd542 Update btcec.go
Updated link to SEC 2: Recommended Elliptic Curve Domain Parameters standard (URL given no longer exists).
2015-01-20 20:44:43 -05:00
Dave Collins
da74b98565 Fix a benign race detected by the race detector.
The addition of the pre-computed values for the ScalarBaseMult
optimizations added a benign race condition since a pointer to each
pre-computed Jacobian point was being used in the call to addJacobian
instead of a local stack copy.

This resulted in the code which normalizes the field values to check for
further optimization conditions such as equal Z values to race against the
IsZero checks when multiple goroutines were performing EC operations since
they were all operating on the same point in memory.

In practice this was benign since the value was being replaced with the
same thing and thus it was the same before and after the race, but it's
never a good idea to have races.
2015-01-15 14:22:01 -06:00
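The pattern behind the fix above can be sketched in Go with heavily simplified, hypothetical types. Here fieldVal, point, table and the function names are stand-ins, not the package's real definitions. The point is only that a routine which normalizes in place should be handed a local copy of a shared table entry rather than a pointer into the table.

```go
package main

import (
	"fmt"
	"sync"
)

// fieldVal is a toy stand-in for a field element.
type fieldVal struct{ n uint32 }

// normalize writes to its receiver even when the value is already reduced.
func (f *fieldVal) normalize() { f.n %= 251 }

// point is a toy stand-in for a Jacobian point.
type point struct{ x, y, z fieldVal }

// table plays the role of the shared pre-computed points.
var table = [2]point{
	{fieldVal{3}, fieldVal{5}, fieldVal{1}},
	{fieldVal{7}, fieldVal{11}, fieldVal{1}},
}

// zIsOne normalizes z before comparing, so it mutates whatever p points to.
func zIsOne(p *point) bool {
	p.z.normalize()
	return p.z.n == 1
}

// lookup hands zIsOne a local copy. Passing &table[i] instead would let
// concurrent callers write to the same memory, which the race detector
// reports even though the written value never changes.
func lookup(i int) bool {
	p := table[i] // local copy of the shared entry
	return zIsOne(&p)
}

// countZOne calls lookup from several goroutines and counts true results.
func countZOne(workers int) int {
	results := make([]bool, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			results[w] = lookup(w % len(table)) // each goroutine owns one slot
		}(w)
	}
	wg.Wait()
	count := 0
	for _, ok := range results {
		if ok {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println(countZOne(8))
}
```

Running a variant that passes &table[i] under `go run -race` is one way to see the kind of report the commit refers to.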
Jimmy Song
d69442834c Optimize ScalarBaseMult
Code uses a windowing/precomputing strategy to minimize ECC math.
Every 8-bit window of the 256 bits that compose a possible scalar multiple has a complete map that's pre-computed.
The precomputed data is in secp256k1.go and the generator for that file is in gensecp256k1.go

Also fixed a spelling error in a benchmark test.

Results so far seem to indicate the time taken is about 35% of what it was before.

Closes #2
2014-09-24 19:07:58 -05:00
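The windowing strategy described above can be sketched in Go on a toy group. To keep the example short, integers modulo a prime under addition stand in for curve points, so "k times the generator" is ordinary modular multiplication. All names and constants below are assumptions for illustration. Only the structure matches the commit: one pre-computed table per 8-bit window, then one lookup and one addition per byte of the scalar.

```go
package main

import "fmt"

const (
	modulus = 2147483647 // toy additive group: integers mod 2^31 - 1
	gen     = 7          // toy generator standing in for the base point G
	windows = 4          // a 32-bit scalar has four 8-bit windows
)

// table[i][b] holds (b * 256^i) * gen, pre-computed for every window i
// and every possible byte value b.
var table [windows][256]uint64

func init() {
	base := uint64(gen) // 256^i * gen mod modulus
	for i := 0; i < windows; i++ {
		for b := 0; b < 256; b++ {
			table[i][b] = uint64(b) * base % modulus
		}
		base = base * 256 % modulus
	}
}

// scalarBaseMult computes k * gen with one table lookup and one group
// addition per byte of k, and no doublings.
func scalarBaseMult(k uint32) uint64 {
	var acc uint64
	for i := 0; i < windows; i++ {
		b := byte(k >> (8 * uint(i))) // i-th byte of k, least significant first
		acc = (acc + table[i][b]) % modulus
	}
	return acc
}

// naiveMult is the direct computation, used here only for comparison.
func naiveMult(k uint32) uint64 {
	return uint64(k) * gen % modulus
}

func main() {
	k := uint32(0xdeadbeef)
	fmt.Println(scalarBaseMult(k) == naiveMult(k))
}
```

With 256-bit scalars the same layout needs 32 windows of 256 entries each, which is why the pre-computed data is large enough to live in its own generated file.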
Owain G. Ainsworth
ff3fac426d Add code to produce and verify compact signatures.
The format used is identical to that used in bitcoind.
2014-02-13 18:47:10 +00:00
Dave Collins
218906a91e Make the race detector happy.
Since the Z values are normalized (which ordinarily mutates them as
needed) before checking for equality, the race detector gets confused when
using a global value for the field representation of the value 1 and
passing it into the various internal arithmetic routines and reports a
false positive.

Even though the race was a false positive and had no adverse effects, this
commit silences the race detector by creating new variables at the top
level and passing them instead of the global fieldOne variable.  The
global is still used for comparison operations since those have no
potential to mutate the value and hence don't trigger the race detector.
2014-02-13 10:59:14 -06:00
Dave Collins
58cab817f0 Add 2014 to copyright dates. 2014-01-08 23:51:37 -06:00
Dave Collins
e3c2b87536 Fix a comment typo. 2013-12-26 18:52:30 -06:00
Dave Collins
9be5c5cbd9 Significantly optimize signature verification.
This commit essentially rewrites all of the primitives needed to perform
the arithmetic for ECDSA signature verification of secp256k1 signatures to
significantly speed it up.  Benchmarking has shown signature verification
is roughly 10 times faster with this commit over the previous.

In particular, it introduces a new field value which is used to perform the
modular field arithmetic using fixed-precision operations specifically
tailored for the secp256k1 prime.  The field also takes advantage of
special properties of the prime for significantly faster modular reduction
than is available through generic methods.

In addition, the curve point addition and doubling have been optimized to
minimize the number of field multiplications in favor of field squarings
since they are quite a bit faster.  The routines also now look for
certain conditions such as z values of 1 or equivalent z values which
can be used to further reduce the number of multiplications needed when
possible.

Note there are still quite a few more optimizations that could be done
such as using precomputation for ScalarBaseMult, making use of the
secp256k1 endomorphism, and using windowed NAF, however this work already
offers significant performance improvements.

For example, testing 10000 random signature verifications resulted in:
New btcec took 15.9821565s
Old btcec took 2m34.1016716s

Closes conformal/btcd#26.
2013-12-20 15:07:15 -06:00
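The "special properties of the prime" mentioned above can be illustrated with a short Go sketch. The secp256k1 prime is p = 2^256 - 2^32 - 977, so 2^256 is congruent to 2^32 + 977 modulo p, and the high part of a wide value can be folded back into the low part with one small multiplication instead of a general division. The sketch uses math/big only to show the identity. It is not the fixed-precision representation the commit introduces, and the names fastReduce and agreesWithMod are hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
)

var (
	one = big.NewInt(1)
	// c = 2^32 + 977, so the secp256k1 prime is p = 2^256 - c.
	c      = big.NewInt(1<<32 + 977)
	two256 = new(big.Int).Lsh(one, 256)
	p      = new(big.Int).Sub(two256, c)
	mask   = new(big.Int).Sub(two256, one) // low 256 bits
)

// fastReduce returns x mod p for non-negative x by repeatedly replacing
// hi*2^256 + lo with hi*c + lo, which is congruent modulo p and smaller.
func fastReduce(x *big.Int) *big.Int {
	r := new(big.Int).Set(x)
	for r.BitLen() > 256 {
		hi := new(big.Int).Rsh(r, 256)
		lo := new(big.Int).And(r, mask)
		r = hi.Mul(hi, c).Add(hi, lo)
	}
	// r is now below 2^256, which is below 2p, so one subtraction suffices.
	if r.Cmp(p) >= 0 {
		r.Sub(r, p)
	}
	return r
}

// agreesWithMod is a hypothetical check helper: it parses a non-negative
// hex value and compares fastReduce against the generic big.Int.Mod.
func agreesWithMod(hexX string) bool {
	x, ok := new(big.Int).SetString(hexX, 16)
	if !ok || x.Sign() < 0 {
		return false
	}
	return fastReduce(x).Cmp(new(big.Int).Mod(x, p)) == 0
}

func main() {
	fmt.Println(agreesWithMod("123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0"))
}
```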
Owain G. Ainsworth
95b3c063e3 Remove lazy computation of QPlus1Div4 and do it at init time.
Should shut up the race detector (though this should be harmless).
2013-11-21 18:59:15 +00:00
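A minimal Go sketch of the init-time approach follows. It assumes that QPlus1Div4 holds (P + 1) / 4 for the secp256k1 field prime P, as its name suggests. The variable layout and the names sqrtModP and rootSquaresBack are hypothetical. Because the value is computed during package initialization, goroutines only ever read it, so there is no first-use write for the race detector to flag.

```go
package main

import (
	"fmt"
	"math/big"
)

var (
	// curveP is the secp256k1 field prime, 2^256 - 2^32 - 977.
	curveP = func() *big.Int {
		p := new(big.Int).Lsh(big.NewInt(1), 256)
		return p.Sub(p, big.NewInt(1<<32+977))
	}()

	// qPlus1Div4 = (P + 1) / 4 is computed once during package
	// initialization, before any goroutine can call sqrtModP.
	qPlus1Div4 = new(big.Int).Rsh(new(big.Int).Add(curveP, big.NewInt(1)), 2)
)

// sqrtModP returns a^((P+1)/4) mod P. Because P is congruent to 3 mod 4,
// this is a square root of a whenever a is a quadratic residue.
func sqrtModP(a *big.Int) *big.Int {
	return new(big.Int).Exp(a, qPlus1Div4, curveP)
}

// rootSquaresBack is a hypothetical check helper: it squares v, takes the
// root of that square, and reports whether the root squares back to it.
func rootSquaresBack(v int64) bool {
	sq := new(big.Int).Mul(big.NewInt(v), big.NewInt(v))
	sq.Mod(sq, curveP)
	root := sqrtModP(sq)
	check := new(big.Int).Mul(root, root)
	return check.Mod(check, curveP).Cmp(sq) == 0
}

func main() {
	fmt.Println(rootSquaresBack(12345))
}
```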
Owain G. Ainsworth
abfd6b44af More documentation commentary. 2013-08-06 18:22:16 +01:00
Dave Collins
6e9cc57131 Initial implementation. 2013-06-13 14:38:54 -05:00