Hey! I'm David, a security engineer at the Blockchain team of Facebook, previously a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

If you don't know where to start, you might want to check these popular articles:

The funny tale of a dude who wants to safely ssh to his server on his brand new windows laptop. This follows by how to safely download, execute and use PuttY... and it's hilarious.

A whitehouse blogpost by Ed Felten on cooperative strategy, a nice counter-intuitive puzzle that I will not forget!

Alice and Bob are playing a game. They are teammates, so they will win or lose together. Before the game starts, they can talk to each other and agree on a strategy.
When the game starts, Alice and Bob go into separate soundproof rooms – they cannot communicate with each other in any way. They each flip a coin and note whether it came up Heads or Tails. (No funny business allowed – it has to be an honest coin flip and they have to tell the truth later about how it came out.) Now Alice writes down a guess as to the result of Bob’s coin flip; and Bob likewise writes down a guess as to Alice’s flip.
If either or both of the written-down guesses turns out to be correct, then Alice and Bob both win as a team. But if both written-down guesses are wrong, then they both lose.

Okay that one seems kind of useless. But if someone wants to tell me otherwise I'm all ears! But this seems more like a stunt to introduce their new cloud service:

One of the main motivations for adding cryptographic functionality to the Wolfram Language was the arrival of the Wolfram Cloud.

If you haven't heard, some people from (or not) Lulzsec have found some serious vulns on the Hola! Plugin. And also they are not happy. Personally I find this Hola! really useful as a free solution to get a netflix US account when not in the US and being able to watch youtube (because everything is "blocked in your country" when you are not in the US). And the fact that you are basically a TOR node is also nice, it increases global anonymity! But that's just my opinion.

The Hacking Week ended 2 weeks ago and EISTI got the victory.

I'm the proud creator of the crypto challenge number 4, still available here, which was solved 12 times.

I also wrote a Proof of Solvableness, reading it should teach you about a simple and elegant crypto attack on RSA: the Same Modulus Attack.

(Note that I wrote that back in January)

Let's start

We are presentend with 4 files:

alice.pub

irc.log

mykey.pem

secret

the irc.log reads like this:

Session Start: Thu Feb 05 20:49:04 2015
Session Ident: #mastercsi
[20:49] * Now talking in #mastercsi
[20:49] * Topic is 'http://www.math.u-bordeaux1.fr/CSI/ |||| http://www.youtube.com/watch?v=zuDtACzKGRs "das boot, ouh, ja" ||| http://www.koreus.com/video/chat-saut-balcon.html ||| http://blog.cryptographyengineering.com/ ||| http://www.youtube.com/watch?v=K1LZ60eMpiw ||| petit chat http://www.youtube.com/watch?v=eu2kVcWKvRo ||| sun : t'as le droit de boire quand même va'
[20:49] * Set by [email protected]:41d0:52:100::65d on Sat Nov 22 00:06:50
[20:49] <asdf> et donc j'ai chopé une vieille clé rsa qu'alice utilisait
[20:49] <qwer> alice la alice? tu te fous de moi ?
[20:49] <asdf> haha non
[20:49] <asdf> mais le truc est corrompu, ça a l'air de marcher pour chiffrer mais la moitié de la clé a disparu
[20:49] <qwer> attend j'ai sa clé publique qui traine quelque part, et même un fichier chiffré avec. me suis toujours demandé ce que c'était...
[20:50] <asdf> je t'ai envoyé le truc, mais ça m'étonnerait que ça soit la même clé non ?
[21:22] * Disconnected
Session Close: Thu Feb 05 21:22:11 2015

so alice.pub seems to be alice public rsa key. secret seems to be the file encrypted under this key and mykey.pem should be the partial key which was found.

It looks like prime1, prime2 and some other stuff are pretty short. I guess this is what he meant by "half the key" is messed up.

By the way this is what a RSA PrivateKey should look like:

> RSAPrivateKey ::= SEQUENCE {
version Version,
modulus INTEGER, -- n
publicExponent INTEGER, -- e
privateExponent INTEGER, -- d
prime1 INTEGER, -- p
prime2 INTEGER, -- q
exponent1 INTEGER, -- d mod (p-1)
exponent2 INTEGER, -- d mod (q-1)
coefficient INTEGER, -- (inverse of q) mod p
otherPrimeInfos OtherPrimeInfos OPTIONAL
}

So this is what exponent1, exponent2 and coefficient are. Just additional information so that computations are faster thanks to CRT.

the partial key and alice public key seems to share the same modulus. this is vulnerable. If our public/private exponents are not messed up, this means we can factor the modulus and thus inverse Alice's public key.

Let's retrieve all the info we have and put them in a file:

openssl rsa -pubin -in alice.pub -modulus -noout | sed 's/Modulus=//' | xclip -selection c

Here's the modulus. We know that our public key is 3, let's get the private key in the clipboard as well

here I parse mykey.pem with openssl. I select the lines I want with grep. It returns two results, the modulus and the private key. I select only the second line with tail. I select only the 7th column with awk. I remove the : with sed. And now I have a beautiful integer in my clipboard.

Okay so let's do a bit of Sage now:

# let's write the info we have
modulus = int(0xC6C83529A2388F146365C5F5FD4B0D888961B95DE10FFA8853A3C2CBED750E9959BD0FF872C2232F6BAD32624F356A82D062755E1E4FEDAE54E8CA2471FC8D13AC700EE25720D4D9089FD6FBD42F12E6A41E1C1DE81F578C32132AD08594E851841D0239CD410DEF11D1C15EE75B92F86A04F7C6C7F36B9046B8FB2FE29565B1)
public = 3
private = int(0x848578C66C25B4B84243D94EA8DCB3B05B967B93EB5FFC5AE26D2C87F3A35F10E67E0AA5A1D6C21F9D1E2196DF78F1AC8AEC4E3EBEDFF3C98DF086C2F6A85E0BEFC0CA19C5E2495549FEE52E513E7BE9F22207D24B847FBB0CB5BAB795C690053E652D11539A2D960FEADECB9B175487000F7812CEACF5DB83301606CC357DA3)
# now let's factor the modulus
k = (private * public - 1)//2
carre = 1
g = 2
while carre == 1 or carre == modulus - 1:
g += 1
carre = power_mod(g, k, modulus)
p = gcd(carre - 1, modulus)
print(p)

This does not work. This should work.

Let's re-do the maths:

We know that our private and public keys cancel out. This is RSA:

private * public = 1 mod phi(N)

so we have private * public - 1 = 0 mod phi(N)

So for any g in our ring, we should have g^(private * public - 1) = 1 mod N

This is how RSA works.

Let's write it like that: private * public - 1 = k with k a multiple of phi(n). And we know that phi(n) = (p-1)(q-1) is even. So it could be written like this: k = 2^t * r with r an odd number.

Now if we take a random g mod N and we do g^(k/2) it should be the square root of a 1.

The Chinese Remainder Theorem tells us that there are 4 square roots mod N:

1 mod p

-1 mod p

1 mod q

-1 mod q

and two of them should be 1 mod N and -1 mod N. The 2 others should be different from 1 and -1 mod N. That's what I was trying to find in my code.

Once we have found this x mod N which is a square root of 1 mod N, we know that it is either x = 1 mod p or x = -1 mod p.
If we are in the first case, we shoudl have x - 1 = 0 mod p which translates into x - 1 is a multiple of p.
Doing gcd(x - 1, N) should give us p the first prime. If you don't understand it maybe check Dan Boneh's explanation (proof end of page 3) which should be clearer than mine.

With p it's easy to get q the other prime.

But it doesn't work...

Ah! I forgot that g^(k/2) could equal 1 all the time if k/2 were to be a multiple of phi(n). So let's code a loop that divides k by 2 and tries any g^k until it is giving us something else than 1. Then we know how many times we have to divide k by 2 so it's not a multiple of phi(n) anymore.

It turns out we just have to do it 3 times. And then it magically works. A bit more of Sage gives us the primes:

# p and q our primes
p = gcd(carre - 1, modulus)
q = modulus // p
# now that we have factored N let's find alice decryption key
public = 65537
phi = (p - 1) * (q - 1)
private = inverse_mod(public, phi)

Now that we have Alice's private key there are two ways to decrypt our secret:

recreate a valid rsa key with those values and use openssl rsautl

figure out how openssl rsautl works to do it ourselves

Let's do the first one. We'll modify our mykey.pem for this:

first is coded the type 02 (integer), then the length (81 repeated twice because the value block is bigger than 127bits, so we set the first byte to 81 (10000001, the first bit means it is a long way of defining the length, the 7 following bits are the number of byte it will take to define the length, in our case only one and it will be the next one) and the second byte to the actual size), then there is our modulo in hexadecimal. Note that the most significant bit of our value has to be zero if it is a positive integer, that's why we use 41 instead of 40 and lead the payload with 00.

So let's take the time and break this apart:

3082 // some header
0122 // the length of everything that follows (in byte)
0201 // integer of size 1
00
028181 // integer of size 0x81 (our modulus)
00c6c83529a2388f146365c5f5fd4b0d888961b95de10ffa8853a3c2cbed750e9959bd0ff872c2232f6bad32624f356a82d062755e1e4fedae54e8ca2471fc8d13ac700ee25720d4d9089fd6fbd42f12e6a41e1c1de81f578c32132ad08594e851841d0239cd410def11d1c15ee75b92f86a04f7c6c7f36b9046b8fb2fe29565b1
0201 // integer of size 1 (our public key)
03
028181 // integer of size 0x81 (our private key)
00848578c66c25b4b84243d94ea8dcb3b05b967b93eb5ffc5ae26d2c87f3a35f10e67e
0aa5a1d6c21f9d1e2196df78f1ac8aec4e3ebedff3c98df086c2f6a85e0b
efc0ca19c5e2495549fee52e513e7be9f22207d24b847fbb0cb5bab795c6
90053e652d11539a2d960feadecb9b175487000f7812ceacf5db83301606
cc357da3
0202 // integer of size 2 (prime 1)
00f5
0202 // integer of size 2 (prime 2)
00cf
0202 // integer of size 2 (exponent 1)
00a3
0202 // integer of size 2 (exponent 2)
008a
0202 // integer of size 2 (coefficient)
00bd

Now let's remove everything which is after the modulus and let's refill the file with our own values. Let's go back in Sage to calculate them:

Since it is now common custom to market a new vulnerability, here is the page: weakdh.org you will notice their lazyness in the non-use of a vulnerability logo.

The paper containing most information is here:

Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice, from a impressive amounts of experts (David Adrian, Karthikeyan Bhargavan, Zakir Durumeric,
Pierrick Gaudry, Matthew Green, J. Alex Halderman, Nadia Heninger, Drew Springall, Emmanuel Thomé, Luke Valenta, Benjamin VanderSloot, Eric Wustrow, Santiago Zanella-Béguelin, Paul Zimmermann)

Not an implementation bug, flaw lives in the TLS protocol

This is not an implementation bug. This is a direct flaw of the TLS protocol.

This is also a Man in The Middle attack. By being in the middle, the attacker can modify the ClientHello packet to force the server to use an Export Ciphersuite, i.e. Export Ephemeral Diffie-Hellman, that uses weak parameters. I already explained what is an "Export" ciphersuite when the FREAK attack happened.

The server then generates weak parameters for a public key and sends 4 messages:

ServerHello that specifies the Ciphersuite chosen from the list the Client gave him (if the attacker did things correctly, the server must have chosen an Export ciphersuite)

Certificate which is the server's certificate

ServerKeyExchange which contains the weak parameters and his public key.

ServerHelloDone which signals the end of his transmission.

The ServerKeyExchange message is here because an "ephemeral" ciphersuite is used. So the Server and the Client need extra messages to compute an "ephemeral" key together. Using an Export DHE (Ephemeral Diffie-Hellman) or a normal DHE do not change the structure of the ServerKeyExchange message. And that's one of the problem since the server only signs this part with his long term public key.

Here you can see the four messages in Wireshark, the signature is computed on the Client.Random, the Server.Random and the ECDH parameters contained in the ServerKeyExchange.

Thus, the attacker only has to modify the unsigned part of the ServerHello message to tell the Client his normal ciphersuite has been chosen (and not an Export ciphersuite).

Now all the attacker has to do is to crack the private key of either the Client or the Server. Which is easy nowadays because of the low 512bits security of the Export DHE ciphersuite.

It can then pass as the server and read any messages the client wants to send to the server

(taken from the paper)

Not an implementation bug, but implementations do help

the use of common DHE parameters is making things easier for attackers since they can do a pre-computation phase and use it to quickly crack a private key of a weak DHE parameters during the handshake.

This happens, for example when Apache hardcoded a prime for its Export DHE Ciphersuite that is now used in a bunch of servers

(taken from the paper)

Defense from the Server

Don't use common DH or DHE parameters! Generate your owns. But even more important, remove the Export Ciphersuites as soon as possible.

Defense from the Client

From a client perspective, the only defense is to reject small primes in DHE handshakes.

This is the only way of detecting this Man in The Middle attack.

You could also remove DHE in your ciphersuite list and try to use the elliptic curve equivalent ECDHE (Elliptic Curve Diffie-Hellman Ephemeral)

Another way: if you control both the server and the client, you could modify both ends so that the server signs the ciphersuite he chose, and the client verifies that as well.

1024 bits primes?

In the 1024-bit case, we estimate that such computations are plausible given nation-state resources, and a close reading of published NSA leaks shows that the agency’s attacks on VPNs are consistent with having achieved such a break. We conclude that moving to stronger key exchange methods should be a priority for the Internet community.

Seems like the NSA doesn't even need to downgrade you. So as a server, or as a client, you should refuse primes <= 1024bits

Where is TLS used?

TLS is not only used in https!

For example, what about EAP, i.e. wifi authentication? From a quick glance it looks like there are no export ciphersuite.

But weak DH and DHE parameters should be checked as well everywhere you make use of Discrete Logarithm crypto

Even better, you can use hash_equals (coupled with crypt)

Compares two strings using the same time whether they're equal or not.
This function should be used to mitigate timing attacks; for instance, when testing crypt() password hashes.

because php understands them as both being zero to the power something big. So zero.

Security

Now, if you're comparing unencrypted or unhashed strings and one of them is supposed to be secret, you might have potentialy created the setup for a timing-attack.

Always try to compare hashes instead of the plaintext!

To make it short, I did some research on the Boneh and Durfee bound, made some code and it worked. (The bound that allows you to find private keys if they are lesser than \(N^{0.292}\))

I noticed that many times, the lattice was imperfect as many vectors were unhelpful. I figured I could try to remove those and preserve a triangular basis, and I went even further, I removed some helpful vectors when they were annoying. The code is pretty straightforward (compare to the boneh and durfee algorithm here)

So what happens is that I make the lattice smaller, so when I feed it to the lattice reduction algorithm LLL it takes less time, and since the complexity of the whole attack is dominated by LLL, the whole attack takes less time.

It was all just theoric until I had to try the code on the plaid ctf challenge. There I used the normal code and solved it in ~3 minutes. Then I wondered, why not try running the same program but with the research branch?

That’s right, only 10 seconds. Because I removed some unhelpful vectors, I could use the value m=4 and it worked. The original algorithm needed m=5 and needed a lattice of dimension 27 when I successfully found a lattice of dimension 10 that worked out. I guess the same thing happened to the 59 triplets before that and that’s why the program ran way faster. 3 minutes to 10 seconds, I think we can call that a success!

/!\ this page uses LaTeX, if you do not see this: \( \LaTeX \)

then refresh the page

Plaid CTF

The third crypto challenge of the Plaid CTF was a bunch of RSA triplet \( N : e : c \) with \( N \) the modulus, \( e \) the public exponent and \( c \) the ciphertext.

The public exponents \( e \) are all pretty big, which doesn't mean anything in particular. If you look at RSA's implementation you often see \( 3 \), \( 17 \) or other Fermat primes (\( 2^m + 1 \)) because it speeds up calculations. But such small exponents are not forced on you and it's really up to you to decide how big you want your public exponent to be.

But the hint here is that the public exponents are chosen at random. This is not good. When you choose a public exponent you should be careful, it has to be coprime with \( \varphi(N) \) so that it is invertible (that's why it is always odd) and its related private exponent \( d \) shouldn't be too small.

Maybe one of these public keys are associated to a small private key?

I quickly try my code on a small VM but it takes too much time and I give up.

Wiener

A few days after the CTF is over, I check some write-ups and I see that it was indeed a small private key problem. The funny thing is that they all used Wiener to solve the challenge.

Since Wiener's algorithm is pretty old, it only solves for private exponents \( d < N^{0.25} \). I thought I could give my code a second try but this time using a more powerful machine. I use this implementation of Boneh and Durfee, which is pretty much Wiener's method but with Lattices and it works on higher values of \( d \). That means that if the private key was bigger, these folks would not have found the solution. Boneh and Durfee's method allows to find values of private key up to \( d < N^{0.292} \)!

After running the code (on my new work machine) for 188 seconds (~ 3 minutes) I found the solution :)

Here we can see that a solution was found at the triplet #60, and that it took several time to figure out the correct size of lattice (the values of \( m \) and \( t \)) so that if there was a private exponent \( d < N^{0.26} \) a solution could be found.

The lattice basis is shown as a matrix (the ~ represents an unhelpful vector, to try getting rid of them you can use the research branch), and the solution is displayed.

Boneh and Durfee

Here is the code if you want to try it. What I did is that I started with an hypothesis \( delta = 0.26 \) which tested for every RSA triplets if there was a private key \( d < N^{0.26 } \). It worked, but if it didn't I would have had to re-run the code for \(delta = 0.27\), \(0.28\), etc...

I setup the problem:

# data is our set of RSA triplets
for index, triplet in enumerate(data):
print "Testing triplet #", index
N = triplet[0]
e = triplet[1]
# Problem put in equation
P.<x,y> = PolynomialRing(ZZ)
A = int((N+1)/2)
pol = 1 + x * (A + y)

I leave the default values and set my hypothesis:

delta = 0.26
X = 2*floor(N^delta)
Y = floor(N^(1/2))

I use strict = true so that if the algorithm will stop if a solution is not sure to be found. Then I increase the values of \( m \) and \( t \) (which increases the size of our lattice) and try again:

solx = -1
m = 2
while solx == -1:
m += 1
t = int((1-2*delta) * m) # optimization from Herrmann and May
print "* m: ", m, "and t:", t
solx, soly = boneh_durfee(pol, e, m, t, X, Y)

If no private key lesser than \(N^{delta}\) exists, I try the next triplet. However, if a solution is found, I stop everything and display it.

Remember our initial equation:

\[ e \cdot d = f(x, y) \]

And what we found are \(x\) and \(y\)

if solx != 0:
d = int(pol(solx, soly) / e)
print "found the private exponent d!"
print d
m = power_mod(triplet[2], d, N)
hex_string = "%x" % m
import binascii
print "the plaintext:", binascii.unhexlify(hex_string)
break

Plaid, The biggest CTF Team, was organizing a Capture The Flag contest last week. There were two crypto challenges that I found interesting, here is the write-up of the second one:

You are given a file with a bunch of triplets:

{N : e : c}

and the hint was that they were all encrypting the same message using RSA. You could also easily see that N was the same modulus everytime.

The trick here is to find two public exponent \( e \) which are coprime: \( gcd(e_1, e_2) = 1 \)

This way, with Bézout's identity you can find \( u \) and \( v \) such that: \(u \cdot e_1 + v \cdot e_2 = 1 \)

So, here's a little sage script to find the right public exponents in the triplets:

for index, triplet in enumerate(truc[:-1]):
for index2, triplet2 in enumerate(truc[index+1:]):
if gcd(triplet[1], triplet2[1]) == 1:
a = index
b = index2
c = xgcd(triplet[1], triplet2[1])
break

Now that have found our \( e_1 \) and \( e_2 \) we can do this:

\[ c_1^{u} * c_2^{v} \pmod{N} \]

And hidden underneath this calculus something interesting should happen:

\[ (m^{e_1})^u * (m^{e_2})^u \pmod{N} \]

\[ = m^{u \cdot e_1 + v \cdot e_2} \pmod{N} \]

\[ = m \pmod{N} \]

And since \( m < N \) we have our solution :)

Here's the code in Sage:

m = Mod(power_mod(e_1, u, N) * power_mod(e_2, v, N), N)

And after the crypto part, we still have to deal with the presentation part:

hex_string = "%x" % m
import binascii
binascii.unhexlify(hex_string)

Tadaaa!! And thanks @spdevlin for pointing me in the right direction :)

So, RFC means Request For Comments and they are a bunch of text files that describe different protocols. If you want to understand how SSL, TLS (the new SSL) and x509 certificates (the certificates used for SSL and TLS) all work, for example you want to code your own OpenSSL, then you will have to read the corresponding RFC for TLS: rfc5280 for x509 certificates and rfc5246 for the last version of TLS (1.2).

x509

x509 is the name for certificates which are defined for:

informal internet electronic mail, IPsec, and WWW applications

There used to be a version 1, and then a version 2. But now we use the version 3. Reading the corresponding RFC you will be able to read such structures:

those are ASN.1 structures. This is actually what a certificate should look like, it's a SEQUENCE of objects.

The first object contains everything of interest that will be signed, that's why we call it a To Be Signed Certificate

The second object contains the type of signature the CA used to sign this certificate (ex: sha256)

The last object is not an object, its just some bits that correspond to the signature of the TBSCertificate after it has been encoded with DER

ASN.1

It looks small, but each object has some depth to it.

The TBSCertificate is the biggest one, containing a bunch of information about the client, the CA, the publickey of the client, etc...

TBSCertificate ::= SEQUENCE {
version [0] EXPLICIT Version DEFAULT v1,
serialNumber CertificateSerialNumber,
signature AlgorithmIdentifier,
issuer Name,
validity Validity,
subject Name,
subjectPublicKeyInfo SubjectPublicKeyInfo,
issuerUniqueID [1] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
extensions [3] EXPLICIT Extensions OPTIONAL
-- If present, version MUST be v3
}

DER

A certificate is of course not sent like this. We use DER to encode this in a binary format.

Every fieldname is ignored, meaning that if we don't know how the certificate was formed, it will be impossible for us to understand what each value means.

Every value is encoded as a TLV triplet: [TAG, LENGTH, VALUE]

On the right is the hexdump of the DER encoded certificate, on the left is its translation in ASN.1 format.

As you can see, without the RFC near by we don't really know what each value corresponds to. For completeness here's the same certificate parsed by openssl x509 command tool:

How to read the DER encoded certificate

So go back and check the hexdump of the GITHUB certificate, here is the beginning:

30 82 05 E0 30 82 04 C8 A0 03 02 01 02

As we saw in the RFC for x509 certificates, we start with a SEQUENCE.

Certificate ::= SEQUENCE {

Microsoft made a documentation that explains pretty well how each ASN.1 TAG is encoded in DER, here's the page on SEQUENCE

30 82 05 E0

So 30 means SEQUENCE. Since we have a huge sequence (more than 127 bytes) we can't code the length on the one byte that follows:

If it is more than 127 bytes, bit 7 of the Length field is set to 1 and bits 6 through 0 specify the number of additional bytes used to identify the content length.

(in their documentation the least significant bit on the far right is bit zero)

So the following byte 82, converted in binary: 1000 0010, tells us that the length of the SEQUENCE will be written in the following 2 bytes 05 E0 (1504 bytes)

We can keep reading:

30 82 04 C8 A0 03 02 01 02

Another Sequence embedded in the first one, the TBSCertificate SEQUENCE

TBSCertificate ::= SEQUENCE {
version [0] EXPLICIT Version DEFAULT v1,

The first value should be the version of the certificate:

A0 03

Now this is a different kind of TAG, there are 4 classes of TAGs in ASN.1: UNIVERSAL, APPICATION, PRIVATE, and context-specific. Most of what we use are UNIVERSAL tags, they can be understood by any application that knows ASN.1. The A0 is the [0] (and the following 03 is the length). [0] is a context specific TAG and is used as an index when you have a series of object. The github certificate is a good example of this, because you can see that the next index used is [3] the extensions object:

TBSCertificate ::= SEQUENCE {
version [0] EXPLICIT Version DEFAULT v1,
serialNumber CertificateSerialNumber,
signature AlgorithmIdentifier,
issuer Name,
validity Validity,
subject Name,
subjectPublicKeyInfo SubjectPublicKeyInfo,
issuerUniqueID [1] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL,
-- If present, version MUST be v2 or v3
extensions [3] EXPLICIT Extensions OPTIONAL
-- If present, version MUST be v3
}

Since those obects are all optionals, skipping some without properly indexing them would have caused trouble parsing the certificate.

Following next is:

02 01 02

Here's how it reads:

_______ tag: integer
| ____ length: 1 byte
| | _ value: 2
| | |
| | |
v v v
02 01 02

The rest is pretty straight forward except for IOD: Object Identifier.

Object Identifiers

They are basically strings of integers that reads from left to right like a tree.

So in our Github's cert example, we can see the first IOD is 1.2.840.113549.1.1.11 and it is supposed to represent the signature algorithm.

Like the audit of OpenSSL wasn't awesome enough, today we learned that we were going to audit Let's Encrypt this summer as well. Pretty exciting agenda for an internship!

ISRG has engaged the NCC Group Crypto Services team to perform a security review of Let’s Encrypt’s certificate authority software, boulder, and the ACME protocol. NCC Group’s team was selected due to their strong reputation for cryptography expertise, which brought together Matasano Security, iSEC Partners, and Intrepidus Group.

I was really confused about all those acronyms when I started digging into OpenSSL and RFCs. So here's a no bullshit quick intro to them.

PKCS#7

Or Public-Key Crypto Standard number 7. It's just a guideline, set of rules, on how to send messages, sign messages, etc... There are a bunch of PKCS that tells you exactly how to do stuff using crypto. PKCS#7 is the one who tells you how to sign and encrypt messages using certificates.
If you ever see "pkcs#7 padding", it just refers to the padding explained in pkcs#7.

X509

In a lot of things in the world (I'm being very vague), we use certificates. For example each person can have a certificate, and each person's certificate can be signed by the government certificate. So if you want to verify that this person is really the person he pretends to be, you can check his certificate and check if the government signature on his certificate is valid.

TLS use x509 certificates to authenticate servers. If you go on https://www.facebook.com, you will first check their certificate, see who signed it, checked the signer's certificate, and on and on until you end up with a certificate you can trust. And then! And only then, you will encrypt your session.

So x509 certificates are just objects with the name of the server, the name of who signed his certificate, the signature, etc...

Certificate
Version
Serial Number
Algorithm ID
Issuer
Validity
Not Before
Not After
Subject
Subject Public Key Info
Public Key Algorithm
Subject Public Key
Issuer Unique Identifier (optional)
Subject Unique Identifier (optional)
Extensions (optional)
...
Certificate Signature Algorithm
Certificate Signature

ASN.1

So, how should we write our certificate in a computer format? There are a billion ways of formating a document and if we don't agree on one then we will never be able to ask a computer to parse a x509 certificate.

That's what ASN.1 is for, it tells you exactly how you should write your object/certificate

DER

ASN.1 defines the abstract syntax of information but does not restrict the way the information is encoded. Various ASN.1 encoding rules provide the transfer syntax (a concrete representation) of the data values whose abstract syntax is described in ASN.1.

Now to encode our ASN.1 object we can use a bunch of different encodings specified in ASN.1, the most common one being used in TLS is DER

DER is a TLV kind of encoding, meaning you first write the Tag (for example, "serial number"), and then the Length of the following value, and then the Value (in our example, the serial number).

DER is also more than that:

DER is intended for situations when a unique encoding is needed, such as in cryptography, and ensures that a data structure that needs to be digitally signed produces a unique serialized representation.

So there is only one way to write a DER document, you can't re-order the elements.

And if you see any equal sign =, it's for padding.

So if the first 6 bits of your file is '01' in base 10, then you will write that as B in plaintext. See an example if you still have no idea about what I'm talking about.

PEM

A pem file is just two comments (that are very important) and the data in base64 in the middle. For example the pem file of an encrypted private key:

This is my second video, after the first explaining how DPA works. I'm still trying to figure out how to do that but I think it is already better than the first one. Here I explain how Coppersmith used LLL, an algorithm to reduce lattices basis, to attack RSA. I also explain how his attack was simplified by Howgrave-Graham, and the following Boneh and Durfee attack simplified by Herrmann and May as well.