RCTF 2020 - infantECC

Here is the challenge code:

from Crypto.Util.number import getStrongPrime, bytes_to_long, long_to_bytes
from hashlib import sha256

flag = open("flag.txt","rb").read()

p=getStrongPrime(512)
q=getStrongPrime(512)
R=Zmod(p*q)

Mx=R.random_element()
My=R.random_element()
b=My^2-Mx^3
E=EllipticCurve(R, [0,b])
Ep=EllipticCurve(GF(p), [0,b])
Eq=EllipticCurve(GF(q), [0,b])
Ecard=Ep.cardinality()*Eq.cardinality()
r=random_prime((p^^q)>>100)
s=inverse_mod(r, Ecard)

print((s,b))
print(s*E(Mx,My))
print(randint(0,Ecard)*E(Mx,My))
print(r^^(bytes_to_long(sha256(long_to_bytes(Mx)).digest())^^bytes_to_long(flag))<<256)

We are given a curve over $\mathbb{Z}_n$ with $n=pq$ an RSA-like modulus. First, note the special form of the curve: $y^2 \equiv x^3 + b \pmod{n}$. Whenever $p \equiv 2$, the curve $y^2 \equiv x^3 + b \pmod{p}$ has $p+1$ points for any $b$, such curves are called supersingular. Otherwise, the group order varies in $p\pm \mathcal{O}(\sqrt{p})$, which is hard to work with. Thus, we aim to get $p \equiv q \equiv 2 \pmod{3}$, which happens quite often.

Remark: we are not given $n$, but from $b$ and two points on the curve it is easy to recover.

We can immediately dismiss $n\equiv 2 \pmod{3}$ since this implies $p\equiv 1\pmod{3}$ or $q\equiv 1\pmod{3}$. However, we can not easily distinguish the bad case $p\equiv q \equiv 1\pmod{3}$ and the good case $p\equiv q\equiv 2\pmod{3}$, so the attack will only work with probability 1/2.

In the fortunate case, the cardinality of the curve over $\mathbb{Z}_n$ is equal to $(p+1)(q+1)=n+p+q+1$. So, we get an equation on the private/public exponent: $$ rs \equiv 1 \pmod{n+p+q-1}, $$ $$ \Rightarrow r\cdot s - k(n+p+q-1)=1, ~~~ k < r < 2^{412}. $$

This equation is very similar to the one in the Boneh-Durfee attack on small secret RSA exponent. However, that attack requires $k < r < n^{0.292}$, while our $r$ is much larger: $r \approx n^{0.402}$. Do we have any other information?

Yes, it is somewhat hidden, but the least 256 bits of $r$ are given out in the last printed value! Can we use it to improve the Boneh-Durfee attack? Unfortunately, not directly, since the attack considers the equation mod $s$, and thus, the value of $r$ does not matter (it does matter indirectly by giving a bound on $k$, but learning bits of $r$ does not change the bound).

A possibility is to consider the equation modulo $2^{256}s$. Then we get an equation of the form $c+k(a+z) \equiv 0\pmod{2^{256}s}$, where $k<2^{412}, z=p+q<2^{513}$, and the modulus is close to $2^{1280}$. This could be comparable to the Boneh-Durfee bounds: $k<2^{0.282\cdot1280}=2^{360},z<2^{641}$. However, I did not manage to get it work because of $c\ne 1$.

Alternative Approach

Instead, let's try to tackle with the equation by hands (and LLL).

Let's generate a sample instance.

In [1]:
from sage.all import log
from tqdm import tqdm

log2 = lambda val: f"{float(RR(log(abs(val),2))):.3f}"
proof.arithmetic(False)

from random import randint, seed
seed(101)

p = q = 1
while p % 3 != 2:
    p = next_prime(randint(1,2**512))
while q % 3 != 2:
    q = next_prime(randint(1,2**512))

n = p*q
R = Zmod(p*q)
Mx = R.random_element()
My = R.random_element()
b = My**2 - Mx**3

Ep = EllipticCurve(GF(p), [0,b])
Eq = EllipticCurve(GF(q), [0,b])
E = EllipticCurve(R, [0,b])

Ecard = Ep.cardinality()*Eq.cardinality()

#r = random_prime((p^^q)>>100)
r = next_prime(randint(0, (p^^q)>>100))
s = inverse_mod(r, Ecard)

print("   s", log2(s))
print("   r", log2(r))
print("rmax", log2((p^^q)>>100))

spt = s*E(Mx, My)
randpt = randint(0, Ecard)*E(Mx, My)

from hashlib import sha256
from Crypto.Util.number import bytes_to_long, long_to_bytes
flag = b"RCTF{Copper5mith_M3thod_f0r_ECC}"
v = r^^(bytes_to_long(sha256(long_to_bytes(Mx)).digest())^^bytes_to_long(flag))<<256
   s 1020.453
   r 404.274
rmax 410.041
In [2]:
k = (r*s-1)//(n+p+q+1)
print("k", log2(k))
assert r*s - k*(n+p+q+1) == 1
assert (1 + k*(n+1+p+q)) % s == 0

r0 = v % 2**256  # given
r1 = r >> 256
assert (1-r0*s + k*(n+1+p+q)) % 2**256 == 0
k 401.430

Let $r = 2^{256}r_1 + r_0$, where $r_0 < 2^{256}$. We can rewrite the main equation modulo $2^{256}s$: $$ k(n+1)+k(p+q) \equiv sr_0-1 \pmod{2^{256}s}. $$ Let's try to find a multiple of this equation that will make the left part less than the modulus, or overflow it by a small amount.

In [3]:
mod = s*2**256
assert k*(n+1+p+q) % mod == (s*r0-1) % mod
In [4]:
weight = 2**512
m = matrix(ZZ, [
    [n+1, weight*1],
    [mod, 0]
])

for tn, t in m.LLL():
    if t * tn < 0: continue    
    t //= weight
    if t < 0:
        tn = -tn
        t = -t
    break
else:
    raise Exception("negative case, let's skip")
    
print("t", log2(t))
print("tn", log2(tn))

w = k * ((n + 1 + p + q) * t % mod) // mod
print("overflow", log2(w), w)
t 381.838
tn 895.010
overflow 20.756 1771398

The overflow has rather large variance over problem instances, it is about 20-32 bits.

... a part of writeup with intermediate results is missing ...

Define $$ \begin{align} h &= \left\lfloor \frac{r_0t}{2^{256}} \right\rfloor,\\ u &= \left\lfloor \frac{t(n+1)}{2^{256}s} \right\rfloor,\\ w &= \left\lfloor \frac{(~t(n + 1 + p + q) \mod{2^{256}s})k}{2^{256}s} \right\rfloor. \end{align} $$ Note that $u$ and $h$ are known, and $w$ is unknown but rather small (experimentally, around 20-32 bits). Then with overwhelming probability $$ ku + w - r_1t = h. $$

In [5]:
u = t * (n+1) // mod
h = r0 * t // 2**256
w = k * ((n + 1 + p + q) * t % mod) // mod
assert h == k*u + w - r1*t

Once $w$ is guessed, we can recover $r_1$ modulo $u$: $$ r_1 \equiv \frac{w-h}{t} \pmod{u} $$ Let $r_1 = r_{11}u + r_{10}$, where $0 \le r_{10} < u$. Note that $r_{11}$ is of size 20-32 bits

In [6]:
r10 = r1 % u
r11 = r1 // u
assert (-h + w) * inverse_mod(t, u) % u == r1 % u == r10
print("to guess (bit size):")
print("r11", log2(r11), log2(k//t))
print("  w", log2(w))
assert r == r0 + 2**256*u*r11 + 2**256*r10
to guess (bit size):
r11 19.593 19.593
  w 20.756

Guessing both $r_{11}$ and $w$ is too much. However, we can exploit the curve group and mount a BSGS-like attack. We have $$ r_1 = r_{11}u + r_{10} = r_{11}u + \left(\frac{w-h}{t}\mod{u}\right). $$ Since $r_0$ is known, we can express full secret $r$: $$ r = r_0 + 2^{256}\left(\frac{w-h}{t}\mod{u}\right) + 2^{256}ur_{11}. $$ Let $G$ be an arbitrary point on the curve $E$ (e.g. one of those given). We know that $[rs-1]G=O$, therefore: $$ [(r_0+2^{256}ur_{11})s-1]G = -[2^{256}\left(\frac{w-h}{t}\mod{u}\right)s]G. $$ We can precompute the left part for all possible values of $r_{11}$ and store in the table. Then we guess $w$ and compute the right part and check against the table. It's possible to optimize both steps so that 1-2 curve point additions per guess are needed.

In [7]:
G = spt

assert (r0 + 2**256*u*r11) * s * G - G == -(2**256*r10) * s * G

# reasonable bounds:
# high enough probability to solve,
# low enough memory req.
if 1:  # full search
    start_r = 0
    start_w = 0
    total_r = 2**22
    total_w = 2**23
else:  # fast check on known secrets
    start_r = r11 - 5
    start_w = w - 5
    total_r = 10
    total_w = 10

mask = 2**100-1  # hashing points

tab = {}
acc = (r0*s-1)*G
step = (2**256*u*s)*G
acc += start_r * step

perc = 0
for i in tqdm(range(total_r)):
    test = int(acc.xy()[0]) & mask
    tab[test] = start_r + i 
    acc += step
100%|██████████| 4194304/4194304 [03:14<00:00, 21537.50it/s]
In [8]:
H = -2**256*s*G
base = (-h) * inverse_mod(t, u) % u
step_e = inverse_mod(t, u) % u
cur_e = base

cur = cur_e * H
step = step_e * H
red = u * H

cur_e += start_w * step_e
cur_e %= u
cur = cur_e * H

for w in tqdm(range(total_w)):
    test = int(cur.xy()[0]) & mask
    if test in tab:
        r11 = tab[test]
        w += start_w
        print("Solution:", w, r11)
        r10 = (-h + w) * step_e % u
        r = r0 + 2**256 * r10 + 2**256*u*r11
        Mx = (r * spt).xy()[0]
        print("recovered r", r)
        m = (v ^^ r) >> 256
        m = m ^^ bytes_to_long(sha256(long_to_bytes( int(Mx) )).digest())
        print(long_to_bytes(m))
        break

    # optimized reduction mod u
    # track the exponent and reduce if needed
    cur_e += step_e
    cur += step
    if cur_e >= u:
        cur_e -= u
        cur -= red
 21%|██        | 1771398/8388608 [02:29<09:17, 11875.62it/s]
Solution: 1771398 790644
recovered r 49943357289587115406335857308667372798949001275969321697728163810970501325410259988956553512194719612541025798846577063031
b'RCTF{Copper5mith_M3thod_f0r_ECC}'

On my laptop the whole attack with the chosen bounds runs in 6 minutes. Note that the bounds hold only with some probability, so multiple (typically around 10) instances have to be attempted (recall that $p,q$ can lead to non-supersingular curves and thus the probabilty is halved further).

Conclusions

The method I used is rather unusual and I still have many gaps in its understanding. It is not exactly clear how to derive good bounds for $r_{11}$ and $w$ and why they have so large variance in size. A possible explanation is to look at the relations e.g. $k < r < ((p\oplus q)\gg 100)$. While $k$ can be close to the maximum, it has higher chances to be much lower: if $p$ and $q$ agree in several most significant bits, the bound is decreased; then if $r$ is small by chance, the bound is decreased further.

Also, accidentally I noticed that $\lfloor k/t\rfloor = r_{11} = \lfloor r_1/u \rfloor = \lfloor r/{2^{256}u} \rfloor$:

In [10]:
print(k//t, r1//u)
assert k // t == r11 == r1 // u == r // (2**256*u)
790644 790644

This is rather surprising, as $r$ an $k$ are unknowns which are related by a quantity dependent on $p+q$, and we can find such $t,u$ to make this relation to hold without factoring $n$. Also, $t$ was chosen rather arbitrarily!

Finally, it feels like the problem should be very easy once $w$ is guessed, but I didn't find a good method avoiding use of the curve.

PS: The intended solution seems to be using Coppersmith methods, but I haven't seen it yet.