[Midnightbsd-cvs] [MidnightBSD/src] 258785: We are a group of security researchers and cryptog...
Lucas Holt
noreply at github.com
Sat May 21 07:44:04 EDT 2022
Branch: refs/heads/feature/openssl
Home: https://github.com/MidnightBSD/src
Commit: 258785a06a50e05ae97973dacbe0ab8c028a4ece
https://github.com/MidnightBSD/src/commit/258785a06a50e05ae97973dacbe0ab8c028a4ece
Author: Lucas Holt <luke at foolishgames.com>
Date: 2022-05-21 (Sat, 21 May 2022)
Changed paths:
M crypto/openssl/crypto/ec/ec.h
M crypto/openssl/crypto/ec/ec2_mult.c
M crypto/openssl/crypto/ec/ec_err.c
M crypto/openssl/crypto/ec/ec_mult.c
Log Message:
-----------
We are a group of security researchers and cryptographers from academia and industry, and would like to continue the report of a security vulnerability in OpenSSL's implementation of binary/prime field ECDSA signatures. We initiated contact by e-mail in December 2019 and decided to open a pull request publicly to collaborate further with a fix.
Affected versions: 1.0.2u and 1.1.0l (current stable releases except 1.1.1 branch)
Affected curve parameters:
sect163r1, sect283r1/k1, sect409k1, sect571r1 (i.e. binary curves with group order slightly below the power of two) for 1.0.2u and 1.1.0l
secp192r1/k1, secp224r1, secp256r1/k1, secp384r1, secp521r1 (i.e. prime curves with group order slightly below the power of two) for 1.0.2u
Severity: full key exposure via cache timing attack
Executive summary
We discovered non-constant time implementations of Montgomery ladder scalar multiplication in the aforementioned releases, which enable the attacker to learn 1-bit of secret nonce with high precision making use of FLUSH+RELOAD cache timing attack technique [1]. Such a small leakage of nonces yields to key-recovery attacks from sufficiently many ECDSA signatures, due to our optimized version of Bleichenbacher's technique [2,3].
A full description of the attack can be found in [4] below.
[1] Y. Yarom, K. Falkner. "FLUSH+RELOAD: a High Resolution, Low Noise,L3 Cache Side-Channel Attack". USENIX Security 2014.
[2] D. Bleichenbacher. "On the generation of one-time keys in DL signature schemes". Presentation at IEEE P1363 working group meeting. 2000.
[3] A. Takahashi, M. Tibouchi, M. Abe. "New Bleichenbacher records: fault attacks on qDSA signatures". TCHES 2018(3), pp. 331–371, 2018.
[4] D. F. Aranha, F. R. Novaes, A. Takahashi, M. Tibouchi, Y. Yarom. "LadderLeak: Breaking ECDSA With Less Than One Bit Of Nonce Leakage". Cryptology ePrint Archive: Report 2020/615, available at https://eprint.iacr.org/2020/615
Overview of the vulnerability
The attack starts with the detection based on the second topmost bit using a cache-timing attack and follows with the Bleichenbacher methodology. Although the vulnerabilities are similar we split the discussion in the binary and prime curve cases.
For the binary curve case, the Montomery ladder is implemented in function ec_GF2m_montgomery_point_multiply() within file ec2_mult.c using López-Dahab coordinates. The function computes scalar multiplication kP for fixed-length scalar k and input point P = (x,y). The ladder starts by initializing two points (X1,Z1) = (X, 1) and (X2,Z2) = 2P = (x^4 + b, x^2). The first loop iteration follows after a conditional swap function that exchanges these two points based on the value of the second topmost key bit. The first function to be called within the first iteration is gf2m_Madd(), which starts by multiplying by value Z1. However, since the finite field arithmetic is not implemented in constant-time for binary fields, there is a timing difference between multiplying by (1) or (x^2), since modular reduction is only needed in the second case. In particular, a modular reduction will be computed when Z1 is x^2 after the conditional swap. This happens when the second topmost bit is 1 because the conditional swap effectively swapped the two sets of values. Although the timing difference is very small, it can be amplified by running a FLUSH-RELOAD attack that measures the amount of time the first multiplication takes while multiple threads in the background penalize the modular reduction code by evicting it from the cache. We observed that it is possible to amplify the timing difference to more than 100,000 cycles on multiple processors, which allows for a detection probability of success above 95% when FLUSH-RELOAD is used.
For the prime curve case, the analysis is a little more involved. OpenSSL implements the Montgomery Ladder by using optimized formulas for elliptic curve arithmetic in the Weierstrass model. The algorithm is implemented in function ec_mul_consttime(), but which does not run in constant-time from a cache perspective. The ladder starts again by initializing two accumulators r = P (in affine coordinates) and s = 2P (in projective coordinates). The first loop iteration is non-trivial and computes a point addition and a point doubling after a conditional swap. Depending on the key bit, the conditional swap is effective and only one point will remain stored in projective coordinates. Both the point addition and point doubling functions have optimizations in place for mixed addition, and our detection works on the point doubling case implemented in function ec_GFp_simple_dbl(). When the input point for the doubling function is in affine coordinates, a field multiplication is replaced by a faster call to BN_copy(). This happens when the two accumulators are not swapped in the ladder, which means that point r in affine coordinates is doubled and the second topmost bit is 0. The timing difference is again very small, but can be amplified to at least 15,000 cycles using performance degradation threads that evict the BN_copy() code from the cache. Our detection code implements the FLUSH-RELOAD technique and correctly determines the second topmost bit with around 99% probability of success.
Validation of the attack
We have conducted an experiment that recovers the signing key of a sect163r1 ECDSA key pair given about 2^26 signatures generated by OpenSSL, with relatively modest computational resources (around 3000 CPU hours and 720GB on a high-performance workstation). Even fewer signatures would suffice with a slightly bigger computation.
Our attack code also generalizes to other larger parameters in theory, although the required number of signatures and time complexity are orders of magnitude larger. We're currently executing a practical experiment of our attack against secp192r1.
Impact of the vulnerability
The vulnerability impacts private keys for ECDSA signatures instantiated with the affected curves. The most likely attack scenario is targeting a server's private key, in which the attacker has execution capabilities in the same machine.
How to fix
A possible fix amounts to implementing coordinate randomization to balance the two possibilities for the key bit in the first loop iteration of the Montgomery ladder. This way, the Z coordinates of both accumulator points will be non-trivial and the multiplication latency will be similar, with a tiny performance penalty.
This pull request implement such a countermeasure for the binary case in version 1.0.2, but we are happy to contribute additional patches for prime curves and version 1.1.0 if necessary.
Contact information
Diego F. Aranha @dfaranha (Aarhus University)
Akira Takahashi @akiratk0355 (Aarhus University)
Mehdi Tibouchi @mti (NTT Corporation)
Yuval Yarom @javali7 (University of Adelaide)
Obtained from: https://github.com/openssl/openssl/pull/11361
More information about the Midnightbsd-cvs
mailing list