I'm reading through A Key Recovery Attack on Discrete Log-based Schemes Using a Prime Order Subgoup which is a Small subgroup confinement attack.
It deals with stuff I had no knowledge of, like Schnorr's Signature that I talk about in a previous post, or like what I'm going to talk about now:
The Pohlig-Hellman Algorithm is a method to compute a Discrete Logarithm (which is a difficult problem) on a multiplicative group whose order is a smooth number (also called friable). Meaning its order can be factorized into small primes.
y = g^x mod p
ord_p(g) = p - 1
p - 1 = q_1^(i_1) * ... * q_j^(i_j)
Here y is the public key, x
is the secret key we're trying to compute.
The order of g
, our generator, is p - 1
since p is prime.
p - 1 is smooth so it can be factorized into something like 2^2 * 3 * 7 (extremely convenient example!)
Following is an overview of the method, if you read an equation and feel like it comes from nowhere (and it should feel like that), I posted a very short paper containing the simple proofs of those bellow.
Overview
The idea that should come to your mind, if you're used to seeing that kind of problem, is that there might be a way to use the Chinese Remainder Theorem (abbreviated CRT) to our advantage. What if we could write x
modulo the factors of p - 1
and then reconstruct the real x
with the CRT? Well, let's do just that!
To write x
modulo a factor q
of p - 1
we can write y^((p-1) / q)
which we know and can compute, and is also equal to g^(x_q * (p-1) / q)
(If you have a factor that appears multiple times in the prime decomposition of p - 1
(for example p - 1 = 2^5
, then there is also a way to ease the computations by finding multiple unknowns (5 unknowns in our example))
We then have a discrete logarithm to compute, but a small one, that we can compute efficiently thanks to Shanks' method (baby-step giant-step) or Pollard's rho algorithm.
Then we have multiple ways of writing our x
(modulo the factors of p - 1
) and we find out what is x
with CRT (I explained this step in my airbus write-up here).
Proofs + Video
You can read through the Algorithm (or watch the video bellow), but I don't really like to do that since I'm not really good at memorizing something if I don't understand the nut and bolts of it. So here's a really good paper written by D. R. Stinson that demonstrate where the equations of the algorithm come from. And here's an explanation + example of the algorithm:
Interactive Protocols
Interactive Protocols are basically a discussion between a Prover and a Verifier where the Prover has some piece of information he wants to prove, without giving out the information.
It is often illustrated with Peggy and Victor and their super tunnel.
Usualy it takes 3 steps:
- Prover sends a fixed value.
- Verifier sends a challenge.
- Prover answers the challenge.
The Verifier can then verify the answer based on the fixed value. If the answer is correct, the Verifier can assume the Prover knows what he's trying to prove. Sometimes you have to repeat the protocols multiple time to be sure, and not all problems have an Interactive Proof.
Classic examples of such proofs can be found on the Discrete Logarithm problem (we'll talk about that later) and the Hamiltonian Cycle problem.
Interactive Protocols are verified if they are :
- Complete: a Prover can successfully answer the challenge if he is honest.
- Sound : a dishonest Prover cannot convince the Verifier he knows the secret.
In the real definitions we use probabilities (an honest prover still has a small chance of making a mistake, a dishonest prover still has a small chance of convincing the Verifier).
We also often want a 3rd condition on our Interactive Protocols: we want it to be Zero-knowledge, no information about our secret should be leaked in this interaction.
Here are how you prove each one of them:
- Completeness: Can the Prover answer correctly thanks to his secret?
- Soundness: From the point of view of the Verifier. If the Prover can correctly answer two different challenges for the same fixed value (however he crafted the answers and the fixed value), does it mean that he must know the secret then?
- Zero-Knowledgeness: If you see a transcript of a recorded instance of this interaction, will you learn anything about the secret? (See if you can create fake transcripts)
There are also notions of weak Zero-knowledge, strong Zero-knowledge, dishonnest verifiers, etc...
But let's talk about something else.
Non-interactive Protocols
Since we said that a recorded transcript of a past interaction has no value (if it is zero-knowledge), then we could assume that there is no way of proving something by showing an old transcript, or by showing a transcript with yourself.
Don't fool yourself! Yes we can. We do this by using hash functions that we deem random enough.
The idea is that, by replacing the Verifier by a random oracle, we cannot predict the challenges and we thus cannot craft a fake transcript (we like to use random oracles instead of hashes, to prove some scheme is secure).
a random oracle is an oracle (a theoretical black box) that responds to every unique query with a (truly) random response chosen uniformly from its output domain. If a query is repeated it responds the same way every time that query is submitted.
What is interesting is that this protocol was used in a Signature Scheme.
Interactive Proof of a Discrete Logarithm
The most famous academic example of Interactive Protocol is done using the Discrete Logarithm problem.
we have <g> = G
, with g
of order q
. The Prover wants to show he knows x
in g^x = y
.
- the Prover sends
t = g^e
- the Verifier sends a challenge
c
- the Prover sends
d = e + cx
The Verifier can then compute y^c * t = g^(e + cx)
and see if it equals g^d = g^(e + cx)
A transcript would look like this: (t, c, d)
Non-Interactive Proof of a Discrete Logarithm
Doing this with a non-interactive protocol, it would look like this:
(t, h(t), d)
with h
a hash function.
Schnorr's Signature
This is what Schnorr's Signature is doing:
t = g^e
c = H(m || t)
d = e - x*c
he would then send (c, d)
as the signature, along the message m
. Which is basically a hash of the message with a proof that he knows the secret x
.
To verify the signature you would use the public key y = g^x
to compute y^c * g^d = t
and then you would compute the hash. It would give you the proof that the signer knows x
(authentication, non-repudiation) and that the message hasn't been tampered (integrity).
So this is one way of using Non-interactive proofs!
Airbus made a "private" challenge called « Trust the future » and accessible only by some selected schools (epitech, insa, and others). I wasn't invited to participate but there was a "crypto" challenge I thought was interesting. Since the challenge just finished I'm posting the write up.
Crypto challenge #1
We have 4 certificates and a challenge1
file that seems to be a s/mime file of a pkcs#7 enveloped-data object.
2.4.3 EnvelopedData Content Type
This content type is used to apply privacy protection to a message. A
sender needs to have access to a public key for each intended message
recipient to use this service. This content type does not provide
authentication.
and
3.2 The application/pkcs7-mime Type
The application/pkcs7-mime type is used to carry CMS objects of several types including envelopedData and signedData. The details of constructing these entities is described in subsequent sections. This section describes the general characteristics of the application/pkcs7-mime type.
rfc2633
Certificates
We dump the info of each certificates in human readable format, openssl has commands for that (I think certtool does as well, but I'm on windows using cmder and openssl is the one included).
openssl x509 -in alice.crt -text -noout -out alice.crt.txt
We see that alice, bob and charly use the same rsa exponent (3).
Reminder: RSA
If you're familiar with RSA (and it's highly probable you are if you read this blog) you can skip this section.
RSA is an asymmetric encryption scheme (also used as a signature). It works by generating a set of private key/public key, the private key is of course kept private and the public key is publicly disclosed. If someone wants to send us a private message he can encrypt it with our public key and we will be able to decrypt it with the private key. The public key is the pair of number (n, e) where n is called the modulus and e is called the exponent. If we want to encrypt a message m with the public key we "basically" do c = m^e modulo n
and send c. To decrypt it we use our private key d like this: m = c^d modulo n
.
The math behind this is that n is generated from two secret primes p and q (big enough) n = p x q
and d = e^-1 modulo (p-1)(q-1)
, (p-1)(q-1)
being phi(n)
being the order of the multiplicative group Z/nZ
. The security comes from the fact that it's Computationally Hard to find the inverse of e if we don't know p and q. By the way, Heartbleed (a recent attack on openssl) led to finding one of the prime, thus the entire decomposition of n.
Textbook RSA vs real life RSA
This is all theory. And in practice we have to go through several steps to encrypt an ascii message, make sure it is of length lesser than the modulus, make sure the modulus is big enough, etc...
Textbook RSA is also deterministic and thus not semantically secure (see my previous post) + it is malleable: imagine you intercept c, and of course you know (n, e) (the public key). You could compute c' = 2^e * c = 2^e * m^e = (2m)^e modulo n
, this would correctly decrypt as 2m
.
Thus, to counter those in practice, RSA Encrytion uses padding (usually OAEP) to make it probabilist and not malleable.
Let's go back to our challenge
We open our challenge1
file:
MIME-Version: 1.0
Content-Disposition: attachment; filename="smime.p7m"
Content-Type: application/x-pkcs7-mime; smime-type=enveloped-data; name="smime.p7m"
Content-Transfer-Encoding: base64
MIIy1wYJKoZIhvcNAQcDoIIyyDCCMsQCAQAxggQ0MIIBYgIBADBKMDYxCzAJBgNV
BAYTAkZSMQ4wDAYDVQQHEwVQYXJpczEXMBUGA1UEAxQOY2FAZXhhbXBsZS5jb20C
EGOE4rIYS8v1jszxDKemVjwwDQYJKoZIhvcNAQEBBQAEggEAweI1fG/FPxzF4Odu
sSJL6PJOiDklHPlUqYCQxFSfG6+3vLEAbdKpgtVsHS0+a0IhItAfeNoAmXdreJFi
6M6U7j0ee4iqgXXbuG8vSsZTYbyUmzuQRgdByu5vGr3FvWxSlvvI8tr/d/cRDqMt
To read that we need to extract the pkcs7 object and parse it. Openssl allows us to do this:
openssl smime -in challenge1 -pk7out -out challenge1.p7m
openssl asn1parse -text -in challenge1.p7m
We get an annoying dump of info to read. With three of those things:
95:d=6 hl=2 l= 16 prim: INTEGER :6384E2B2184BCBF58ECCF10CA7A6563C
113:d=5 hl=2 l= 13 cons: SEQUENCE
115:d=6 hl=2 l= 9 prim: OBJECT :rsaEncryption
126:d=6 hl=2 l= 0 prim: NULL
128:d=5 hl=4 l= 256 prim: OCTET STRING [HEX DUMP]:C1E2357C6FC53F1CC5E0E76EB1224BE8F24E8839251CF954A98090C4549F1BAFB7BCB1006DD2A982D56C1D2D3E6B422122D01F78DA0099776B789162E8CE94EE3D1E7B88AA8175DBB86F2F4AC65361BC949B3B90460741CAEE6F1ABDC5BD6C5296FBC8F2DAFF77F7110EA32D330D38DD2CA2FE13E785C86FE2210B58074C2DA5F440794BA023FC98B3D1E7DC979DBAC6672B5C19ABF4A91E21D5E474475BC09B78910D1F8E0290B38AE8D756E04D7F5EFBA64BFB5A0E96CD3DE1D82F609544A423F666D08B63262229687E1982BC8E424C7B5266B11A59036625F8E92C06740A3C9D8F3CE87FEB1F4444BC2039C8C6FF0AB9457D8AA63851ECF3C4AF1A2328FD
Which means the same message was sent to three recipients, identified by their serial number which we recognize as being our alice, bob and charly.
We also get this at the end:
1110:d=4 hl=2 l= 9 prim: OBJECT :pkcs7-data
1121:d=4 hl=2 l= 20 cons: SEQUENCE
1123:d=5 hl=2 l= 8 prim: OBJECT :des-ede3-cbc
1133:d=5 hl=2 l= 8 prim: OCTET STRING [HEX DUMP]:01D4CE3AF4D17ABB
Which means that the data sent (after this dump) is encrypted by 3DES version 3 (three different keys) in CBC mode with an IV 01D4CE3AF4D17ABB
.
Reminder: DES-EDE3-CBC
I like to put reminders like this so you don't have to switch to Wikipedia if you don't remember what are those letters.
DES (Data Encryption Standard) is the famous no-longer-used block cipher (because it was broken ages ago).
EDE3 short for the third version of the Triple DES block cipher (that is still considered secure today, it was a response to DES no longer being secure) which uses 3 different keys. Encrypting is done like this:
- we encrypt with key1
- then we decrypt with key2
- then we encrypt again with key3
E(k3, D(k2, E(k1, M)))
Hence the triple DES.
CBC is a mode of operation. A block cipher can only encrypt/decrypt blocks of a certain size (64bits with DES). If you want to do more (or less) you have to use a mode of operation (and a padding system).
Chinese Remainder Theorem
Here the interesting thing is that the same message was sent to three different recipients, encrypted with the same exponent (3). Let's write down the informations we have:
c1 = m^3 modulo n1
c2 = m^3 modulo n2
c3 = m^3 modulo n3
c1 being the encrypted message sent to Alice, n1 being Alice's modulus, and so on...
We have a system with one unknown: the message. The Chinese Remainder Theorem works in a similar fashion to Lagrange Interpolation (anecdote time: it is used in Shamir's Secret Sharing).
So that we have:
m^3 = c1 * n2 * n3 * ((n2 * n3)^-1 [n1]) +
c2 * n1 * n3 * ((n1 * n3)^-1 [n2]) +
c3 * n1 * n2 * ((n1 * n2)^-1 [n3])
modulo n1 * n2 * n3
A brief explanation:
We have `c1 = m^3 modulo n1
, to place it in a formula modulo n1 * n2 * n3
we have to cancel it when it's modulo n2 or modulo n3. How to make something congruent to zero when its modulo n2 or n3 ? Make it a multiple of n2 or n3. So we multiply c1 with n2 and n3. But then when it will be modulo n1 we will have the value c1 * n2 * n3
which is not correct (c1 = m^3 modulo n1
!). So let's cancel the n2 and n3 with their inverse modulo n1. We then have c1 * n2 * n3 * ((n2 * n3)^-1 [n1])
. We do this with all the equations to find the bigger equation. This is the Chinese Remainder Theorem. Simple no?
And this result is even more useful since we know that:
m < n1
m < n2
m < n3
=>
m^3 < n1*n2*n3
Of course if m was greater than one of the modulus then it would decrypt incorrectly.
So what we have is:
m^3 = something modulo n1*n2*n3
=>
m^3 = something
That's right, we can get rid of the modulo. We then do a normal cubic root and we find m.
Here's the quick python code I hacked together for this:
(by the way we can quickly get the modulus of each recipients with openssl: openssl x509 -in alice.crt -modulus
)
## 6384E2B2184BCBF58ECCF10CA7A6563C (Alice)
c1 = "C1E2357C6FC53F1CC5E0E76EB1224BE8F24E8839251CF954A98090C4549F1BAFB7BCB1006DD2A982D56C1D2D3E6B422122D01F78DA0099776B789162E8CE94EE3D1E7B88AA8175DBB86F2F4AC65361BC949B3B90460741CAEE6F1ABDC5BD6C5296FBC8F2DAFF77F7110EA32D330D38DD2CA2FE13E785C86FE2210B58074C2DA5F440794BA023FC98B3D1E7DC979DBAC6672B5C19ABF4A91E21D5E474475BC09B78910D1F8E0290B38AE8D756E04D7F5EFBA64BFB5A0E96CD3DE1D82F609544A423F666D08B63262229687E1982BC8E424C7B5266B11A59036625F8E92C06740A3C9D8F3CE87FEB1F4444BC2039C8C6FF0AB9457D8AA63851ECF3C4AF1A2328FD"
n1 = "EFBA9C442084759DC9770021B03C2E2913053E770779316F92C5DBFCAE4D3682E64006E38FA6A3AC24CC13AD2E747A50E5735064549F590294E36F2A1B23DB29567B49C007F8F8C224D3CD19B81D3F198C540291C135965E549881B775EEE29684F0E6CD4C2A017BE38F2E78E070D503BE9EE3EA2C491E53DE9C705FEB973918A168F275D90D055778289598BD2377D79ACC1BA493F570C5C8301913CEF12CD513321F8F320D8EC8172182D03F33721F02DFCE24463AE7A6CAB7C3A0CBB7D2AB149D347A2C9ABDB81BE4B60CAECBF31CF79C4BA0081FC00BB0939A950CBACA5B7B79FF92AF273B0D01A7E183FF30C90F27705D18F70EBB32281C5A873ED0A90D"
## 9F9D51BC70EF21CA5C14F307980A29D8 (Bob)
c2 = "27EB5A62A311DFAECF09318BEF7D60B98EA151AF09BDBF2A89A884617B8A8A14FF6F8045A8FD5D8956F5768C32A7E47AB17FA08D5F7D2EB590C4FC8296A1F70069C338CFA3C131A58FE05A75E36D7457ECC7B1BB403C1FF31FA66FB478B1F4548325B57961191AE1BFDDD7F5AF6FEC6FD94F66B1BC482337B579AD790466D1F33EDB09AA388085053D3C383F91A8EC40DB150365735B7E2D01F172E23717D31ABE0350FAC6730673C3C70EE593E8008A222DE40CF8A62615D119CD119FE8C30DA49E7A1D3596279B659E72C584D6D8262B1126B8C8BCDC09A31761EE746A14ADA7EC387AF2C52CBABDD8F443DF1F4C5E70C83B82A12F3BBCB21494689D13791F"
n2 = "C0028301D5645483592E2117463A804454BD1AE33B1CBEF33D9CDFF86EFD46CB24BA1984874DAE3A4A3D3609D9607276F91DFAD38E9ED6B6BCE15F29C49E1C7E0B532AE2A1343AF3F8064B4A093F23F981624D4E08F901787B761847B4F0963512EAC13B5D47DA4295FF614501A3D7D7EDA6B4E6197B974A70BF11FDC5D619D50974415E209DE76C68F190DBCA5BB6F1963C1E0E987BE105FF9082EAE003C2051A4C95E3299D3C1BC64D5F8E95D40C7B27EEEA5EBCD9B54CD6B5655B0AB9AEAA15E976AE37EA228A151D2AFD5417D4BCBC3BFB396E6B10AF561CBE0FD0374081B7034585C54096849FF82B79A117E44AE8FDB5B304BC0D9CA297C2F4A57FF91F"
## A6D4EF4DD38B1BB016D250C16A680470 (Charly)
c3 = "04991C5BA4882F329B03B18E2B317F4A54905ED4EB832B084A42AD700A0D3136A14BB57D61D4A1982E2CAB0FF773356759EE4AD77C1982E642CF574332AB32D109952FDE6221D77C35E4D0B69E559392DBE602E5336BD09239E85F21A70F4A824907AF75C9C372D4BE4C15E45431C35FE678E2646017D74186B3B084A41F217655A2ED262AA5C300BA737AB0DF270BD0B38A2FF215A3B5DB3CBB79350DDFEF1A08E40CB253B506D92002BBF4AD112AC1DDDB96CD4539A01035E76B1CC5C43427F46C83DBAA318387FE2C8C7FAA75FC0099050CF98671015A568CFFC56DFF6F8CB80A6A55B4CCB0D825AA9D99098DDA5D2EEC7D40D0BCCDA42D9E618A09AEC50A"
n3 = "CFA7854352FF9DF5E84AB10AB8F034F8106811D973BAB528BFCBD3DCBCFDF9FB5C398E23B58BA883F7C78F47C6694B4F042CDB8E54E856040F8A8A9ADBCA4C6D0813894C43352EB3EE19C1F76DF46DFD1B6BB38349BCE811036B0ED7ACFE2E5045FE4232F11DA3F113189A176964C206155342FD9E2E8AD11CBBCB85DFDF30E62AEA068F2DD7CEC6CF818D1E312BBA5FA6385461CA5ADCA0F95B6299FA366EEF8856416D72A42A93FD979E269D8FEA143870985FD353C85850FB4A11B6E4BA483CDC97F7E1717C34DF7D9E34DF83F67A9DA97ACA69926167D44C2CB3BB858EC041A244A6197D7F3B9AFD02A0562F13EACE6494F289184DAD16D2D995ED1ADC13"
## base16 -> base10
c1 = int(c1, 16)
c2 = int(c2, 16)
c3 = int(c3, 16)
n1 = int(n1, 16)
n2 = int(n2, 16)
n3 = int(n3, 16)
## extended euclide algorithm
def xgcd(a,b):
"""Extended GCD:
Returns (gcd, x, y) where gcd is the greatest common divisor of a and b
with the sign of b if b is nonzero, and with the sign of a if b is 0.
The numbers x,y are such that gcd = ax+by."""
prevx, x = 1, 0; prevy, y = 0, 1
while b:
q, r = divmod(a,b)
x, prevx = prevx - q*x, x
y, prevy = prevy - q*y, y
a, b = b, r
return a, prevx, prevy
## chinese remainder formula
n2n3 = n2 * n3
n1n3 = n1 * n3
n1n2 = n1 * n2
n2n3_ = xgcd(n2n3, n1)[1]
n1n3_ = xgcd(n1n3, n2)[1]
n1n2_ = xgcd(n1n2, n3)[1]
m3 = c1 * n2n3 * n2n3_ + c2 * n1n3 * n1n3_ + c3 * n1n2 * n1n2_
m3 = m3 % (n1n2 * n3)
print(m3)
from decimal import *
getcontext().prec = len(str(m3))
x = Decimal(m3)
power = Decimal(1)/Decimal(3)
answer = x**power
ranswer = answer.quantize(Decimal('1.'), rounding=ROUND_UP)
diff = x - ranswer**Decimal(3)
if diff == Decimal(0):
print("x is the cubic number of", ranswer)
else:
print("x has a cubic root of ", answer)
Note:
- The xgcd function is included in sage but here I use Python so I included it in the code.
- We need to use the decimal package to calculate the cubic root because our number is too big.
We then get this big ass number that we convert to hexadecimal (hex(number)
in python). This yields:
0001ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff004f8957408f0ea202c785b95e206b3ba8da3dba7aea08dca1
We refer once more to the RFCs
8.1 Encryption-block formatting
A block type BT, a padding string PS, and the data D shall be
formatted into an octet string EB, the encryption block.
EB = 00 || BT || PS || 00 || D . (1)
The block type BT shall be a single octet indicating the structure of
the encryption block. For this version of the document it shall have
value 00, 01, or 02. For a private- key operation, the block type
shall be 00 or 01. For a public-key operation, it shall be 02.
The padding string PS shall consist of k-3-||D|| octets. For block
type 00, the octets shall have value 00; for block type 01, they
shall have value FF; and for block type 02, they shall be
pseudorandomly generated and nonzero. This makes the length of the
encryption block EB equal to k.
rfc 2315
We have our 3DES key: 4f8957408f0ea202c785b95e206b3ba8da3dba7aea08dca1
to use.
Let's get the hexdump the end of the file (you can use commandline utilities like base64, hexdump, dd and xdd):
openssl smime -in challenge1 -pk7out > b64file`
base64 -d b64file > hexfile
hexdump -s 1135 hexfile
dd
xdd
And finally decrypt our encrypted
file with openssl since it provides a command for that:
openssl des-ede3-cbc -d -iv 01D4CE3AF4D17ABB -K 4f8957408f0ea202c785b95e206b3ba8da3dba7aea08dca1 -in encrypted
Voila ! That was really fun :)