Hey! I'm David, a security consultant at Cryptography Services, the crypto team of NCC Group . This is my blog about cryptography and security and other related topics that I find interesting.

# WhibOx part 3: Attacks on Whitebox crypto

## posted August 2016

You can read part 1 here

Joppe Bos from NXP gave the same talk I had seen a couple month ago at SSTIC.

To recap the white box model:

• you can inject fault
• you can access the source code
• you can alter the memory, observe it
• ...

If a secure whitebox solution exists, then by definition it is secured to all current and future side-channel and fault attacks.

In practice we only know how to do it for symmetric crypto (although that's not what Mike Wiener said after (but his design is proprietary))

Chow et al. construction:

• converted every operation (mixcolumns, xor, etc...) in LUT (look up table)
• applied linear encoding to inputs and outputs of look up tables (input/output encoding)
• adding non-linear encodings to inputs and outputs as well

And this is the reason why it only works for symmetric crypto. For richer math problems like RSA we can't do all these things.

In practice you do not ship whitebox crypto like that though:

• you add more obfuscation on top, just to make the work of the hacker harder)
• you glue the environment to the whitbox, so that you can't just take the whitebox and hope it will work)
• you support for traitor tracing, if the program is copied to other people you can trace where it came from
• you create a mechanism for frequent updating, so a copy of the whitebox is useless after some time, or if you end up breaking it it might be too late
• ...

In practice to attack:

• you reverse-engineer the code (time-consuming)
• identify which WB scheme is used
• target the correct look up tables
• apply an algebraic attack

Bos' approach:

• is it AES or DES?
• launch the attack (that's the only thing you need, see how they pwned the CHES challenges)

The idea is to create software traces like power traces using dynamic binary instrumentation tools (Pin, Valgrind). They created plugins for both of them. Then they record every instructions in memory made by the whitebox and apply a DPA-style attack. If you don't know what that is check my video on the subject here.

The idea here is that you can't remove some correlation with input/ouput encoding

• they can visually tell what's the cipher by looking at the memory trace

picture of AES, the 10th round is different so you don't see it

And if memory access is made linear, you can still look at the stack access and it will reveal the algorithm as well

They can do better than a CPA (Correlation Power Analysis) attack, which they call a DCA (Differential Computation Analysis) . The hamming weight makes a lot of sense in the hardware setting, but here without the noise and the errors it does not so much. Also they need way less traces. All in all, these kind of DPA attacks are much more powerful in a software setting compared to a hardware setting.

By the definition of the WB attack model, implementations shouldn't leak any side channel data. tl;dr: they all do. Well, except when there is this extern encoding thing (transformation added to the beginning and the end of AES).

intuition why this works: Encodings do not sufficiently hide correlations when the correct key is used.

Countermeasures?

• most of the countermeasures against DPA/CPA rely on random data (but in the WB model we can't rely on random data...)
• if you take large encodings, this attack shouldn't work anymore (why? dunno)

The code is on github!

## Marc Witteman - Practical white-box topics: design and attacks II

The hype of WB crypto comes from mobile payment and streaming video. Today if you look in your pocket you most probably have a whitebox implementation.

In practice for an attack you need to:

• root the phone
• decompile/disassemble code
• use a debugger to step through the code
• instrumentate the binary to change the behavior of the code

They applied the Differential Fault Attack (DFA) on a bunch of whitebox implementations, broke almost all of them except the ones using extern encoding (apparently in banking there is no extern encoding because of "constraints")

But overall, if you're designing whitebox crypto, you don't need to make the implementation perfect, you just need to make it good enough

comment on this story

# Whibox part 2: Whitebox Crypto

## posted August 2016

You can read part 1 here

Matthieu Rivain from Crypto Experts did the transition to Whitebox Crypto.

After a bunch of funny slides about Obfuscation in general. He reminded us what were the real definitions of these new concepts.

iO is equivalent to BPO, it's the best possible way to obfuscate a program. It was Chow et al. who introduced the first whitebox crypto construction in 2002 at DRM. The main goal of whitebox crypto was to make key extraction difficult, not to stop the attacker from using the program! It forces the attacker to use the software. There is value for DRM system providers.

What is WBC AES? A key extraction must be "difficult".

What are the practical uses of such a scheme?

Making AES one-way and whiteboxing it turns it into a public-key cryptosystem (this is what happens in general, using whitebox on a symmetric system turns it into an asymmetric system).

There are some security notions for whitebox symmetric ciphers, the usual CPA, CCA, ... but also RCA!

RCA stands for recompilation attack: the attacker can get a new compilation of the program.

Here we have different goals for attacks:

• extract the key
• compress a whitebox implementation to make it smaller
• inverse the whitebox implementation (if it only encrypt, make it decrypt)
• be untraceable (often if you just copy the program and send it to someone else, it is true that they will be able to use it, but it might be tied to you)

whitebox crypto is about designing a compiler for an existing encryption scheme

A "one-way" compiler to be exact.

## Andrey Bogdanov - Towards secure whitebox cryptography

Unlike black box models like apps in the clouds, that deal with million of requests, probably the white box model like an app on your phone is not dealing with so much. So we don't mind it being a bit slower.

Bogdanov introduced the new Host Card Emulation trend (HCE). There are a bunch of insecure hardware out there (think old Android phones) and banks want them to be able to do secure payment and all that fuss with phones and Near Field Communication (NFC). Apple did it with a secure Enclave but we can't wait for all phones to have a secure element right? So now banks are doing it in software instead with an emulated smart card in the phone. And that's what HCE is.

• In 2014, Visa and Mastercard started supporting HCE.
• In 2017, 86% of North-america point of sales, and 78% of Europe point of sales, will support NFC.
• 2/3 of shipped phones will support NFC in 2018

Bogdanov remarks that so far, since the first construction of Chow et al. in 2002, all Whitebox crypto schemes have been broken:

Even the recent ASASA [BBK13] in 2014 seems unsecure

Bogdanov also adds on the security notions of whitebox crypto: *space hardness. This is the same notion M. Rivain called "compression of the whitebox". You should not be able to decompose internal component. Without all the code/tables, you shouldn't be able to compute encryption or decryption with good probability.

You can read part 3 here

comment on this story

# Whibox part 1: Indistinguishability Obfuscation

## posted August 2016

I landed in Santa Barbara in a tiny airplane, at a tiny airport, so cute that I could have mistakenly taken it for an small university building collecting toy aircrafts. Outside the weather was warm and I could already smell the ocean.

A shuttle dropped me at the registration building and I had to walk through a large patio filled with people sitting in silence, looking at their notes or their laptops, in what seemed like comfortable couches arranged in a Moroccan way. I thought to myself "they all seem so comfortable, it looks like they've been here many times". As it turns out, Crypto is taking place in Santa Barbara every year and people have got used to it.

I registered at the desk, then went to my assigned room in these summer-vacant student residences, passing by the numerous faces I will probably see again this coming week. I went to sleep early and woke up the next day for the first workshop: WhibOx.

First, let me say that in theory, workshops are cool. Cooler than conferences, because they are here to teach you something. Crypto conferences are filled with PHD students trying to get their publication ratio (the rule of thumb is 3 good papers accepted at 3 big conferences before defending) and researchers trying to market their papers. (Well, some of them are actually good speakers and care about teaching you something in their 45 minutes time frame.)

Back to our workshop, it is the first ever workshop on whitebox crypto and indistinguishability obfuscation (iO). Pascal Pailier -- the author of the homomorphic cryptosystem -- starts the course with a few words: whitebox crypto is something relatively old/new, but it's starting to rise. There is a clear problem that real companies are trying to solve, and whitebox seems to be the answer. Pailier notes that it's not a concept that is well understood, and the speakers will proove his point again and again later, trying to come up with a different definition. In this blogpost I will summarize the day, if you're interested to see the talks yourself the slides should shortly be uploaded, as well as the videos on the ECRYPT channel.

## Amit Sahai - Indistinguishability Obfuscation

Indistinguishability Obfuscation was the first topic to be discussed about. Amit reminded us that iO was quite recent, and aimed towards our modern world attack models: programs are routinely leaked to the adversary, often they are even given directly to the adversary.

For a while, they thought about multi-party computation (MPC), but it was not the answer. This is because it requires some portion of the computation to be completely hidden from the adversary.

In the idea of general purpose obfuscation, every part is known by the adversary. But the adversary can still use the program as he wishes! Amit gives an example where a program would give you a different output if you feed it an integer greater or lesser than some fixed value. By just querying the program an attacker could learn it... So for this to work, you would first need some function/program that is un-learnable by queries.

BGIRSVY2001 showed that the concept of a virtual black box (VBB) is impossible for all circuits. That is, a program that you entirely control, and that is indistinguishable from a black box (you can only see input and output).

But then... there are some contrived "self-eating programs", for which this idea of black box obfuscation is not ruled-out. The idea of these self-eating programs is that you have a program P that takes an input x, and it checks if x is a program equivalent to P itself. If so, it outputs a secret s, otherwise 0. But even these have "explicit attacks" against them. So they define an even weaker definition that is iO: a polynomial slowdown + indistinguishability.

The idea of iO is to bring the best possible obfuscation. The first construction was done in 2013, and surprisingly, it was useful! it was used to resolve long-standing open problems like:

• functional encryption
• deniable encryption
• multi-input FE
• patchable iO
• ...

It's kind of a central hub of crypto, anything you want to do in cryptography (except for efficiency), you can use iO to do it.

But first, if you want to obfuscate something, you need a language! They use a Matrix Branching Program (MBP), which is not a straight forward language at all...

It's a bunch of invertible matrix over $Z_n$, and the input tells you which matrix in each pair to choose. Then you repeat the same pattern over and over until you reach the last pair.

In the image below you can see the pairs on the right, here let's imagine that you take an input of 3 bits. If the input is 011 you would choose $M_{1,0}$ then $M_{2,1}$ and then $M_{3,1}$. You would then repeat this pattern until you reach the last matrix. The output of your program is the multiplication of all the matrices your input chose.

You can implement a bunch of stuff like that it seems, and here's a XOR:

But this is not enough to obfuscate (it's a start). To obfuscate you want to use Kilian randomization: multiply each matrice $M$ with some $R$ and its inverse like so: $RMR^{-1}$.

(slide from a crypto conf)

Now to hide everything you encode the matrices in some way that will still allow matrix multiplications. And for this you use the now famous Multilinear Maps (MMaps) aka Granded Encodings. There are still attacks on that (Annihilation attacks) but new ways to counter against these as well.

But it is still not enough, there are some input-mixing attacks (what if you do not respect the repeating pattern and multiply the wrong matrix), then they can't efficiently bound the information learned by these tricks.

There are of course ways to avoid that... Finally they add some more randomness to the mix... A final construction is made (Miles-Sahai-Zhandry, TTC2016):

They end up using $3\times3$ matrices instead of the initial matrices to finish the obfuscation. It seems like this technique is making the thing look random, so that the adversary now has to distinguish it from randomness as well.

It's of course confusing and it seems like reading the paper might help more. But the idea is here. A circuit, that is obfuscable according to some definition, is transformed into this MBP language, then obfuscated using an encoding (MMaps) and some other techniques.

It's an incredibly powerful construction, which is still at its infancy stage (they are still exploring many other constructions). And its biggest problem right now is efficiency. Amit talks about a $2^{100}$ overhead for programs like AES (It's almost as good as the complexity of an attack on AES).

So yeah. Completely theoretical at the moment. No way anyone is going to use that. But they're quickly improving the constructions and they're really happy that they have planted the seeds for a new generation of crypto primitives.

## Mariana Raykova - 5Gen: a framework for prototyping applications using multilinear maps and matrix branching programs

The workshop had a lot of speakers, and so a lot of repetions were made. Which is not necessarily a bad thing since it's always good to see new concepts being defined by different persons.

The point of obfuscation for Raykova is to protect proprietary algorithms, software patches and avoid malware detection.

Now, the talk is about Functional Encryption: some construction that allows you to generate decryption keys that reveal only functions of the encrypted message.

They implemented a bunch of stuff. For that they use a compiler (called Cryfms) that transforms Cryptol code ("Cryptol is used mostly in the intelligence community") to a minimal finite state machine and then to Matrix Branching Programs (MBPs). It's called 5gen and it's on github!

Like many speakers at this workshop, she made fun of the efficiency of iO

And then explained again how iO worked. Here you can see the same kind of explanation as Amit did.

Barrington has a construction to convert circuits to MBP, but the bad thing is that the size of the MBP is exponential in the circuit depth. Urg... That's why they transform the Cryptol code in a finite state automata first before converting it to MBP. It's better to transform a finite state automata in a MBP directly:

Raykova explained that they used a bunch of optimization, sometimes the second matrix wouldn't change anything (or wouldn't change a part of the state) she said, so they could pre-multiply them...

Now to obfuscate you use Kilian88 to do Randomized branching programs (RBP) as Amit already explained

And as Amit said, this is not enough for obfuscation, you need to apply encoding techniques (MMaps). MMaps gives you a way to encode an element (which is hiding the element), and give you some nice property on the encoding (to do operations).

But you still have these mix and match attacks. In the iO model the adversary can use the whole program and feed it all sort of inputs, but in FE we don't want that.

You can read part 2 here

comment on this story

# Breaking https' AES-GCM (or a part of it)

## posted August 2016

The coolest talk of this year's Blackhat must have been the one of Sean Devlin and Hanno Böck. The talk summarized this early year's paper, in a very cool way: Sean walked on stage and announced that he didn't have his slides. He then said that it didn't matter because he had a good idea on how to retrieve them.

He proceeded to open his browser and navigate to a malicious webpage. Some javascript there seemed to send various requests to a website in his place, until some MITM attacker found what they came for. The page refreshed and the address bar now whoed https://careers.mi5.gov.uk as well as a shiny green lock. But instead of the content we would have expected, the white title of their talk was blazing on a dark background.

What happened is that a MITM attacker tampered with the mi5's website page and injected the slides in a HTML format in there. They then went ahead and gave the whole presentation via the same mi5's webpage.

## How it worked?

The idea is that repeating a nonce in AES-GCM is... BAD. Here's a diagram from Wikipedia. You can't see it, but the counter has a unique nonce prepended to it. It's supposed to change for every different message you're encrypting.

AES-GCM is THE AEAD mode. We've been using it mostly because it's a nice all-in-one function that does encryption and authentication for you. So instead of shooting yourself in the foot trying to MAC then-and-or Encrypt, an AEAD mode does all of that for you securely. We're not too happy with it though, and we're looking for alternatives in the CAESAR's competition (there is also AES-SIV).

The presentation had an interesting slide on some opinions:

"Do not use GCM. Consider using one of the other authenticated encryption modes, such as CWC, OCB, or CCM." (Niels Ferguson)

"We conclude that common implementations of GCM are potentially vulnerable to authentication key recovery via cache timing attacks." (Emilia Käsper, Peter Schwabe, 2009)

"AES-GCM so easily leads to timing side-channels that I'd like to put it into Room 101." (Adam Langley, 2013)

"The fragility of AES-GCM authentication algorithm" (Shay Gueron, Vlad Krasnov, 2013)

"GCM is extremely fragile" (Kenny Paterson, 2015)

One of the bad thing is that if you ever repeat a nonce, and someone malicious sees it, that person will be able to recover the authentication key. It's the H in the AES-GCM diagram, and it is obtained by hashing the encryption key K. If you want to know how, check Antoine Joux' comment on AES-GCM.

Now, with this attack we lose integrity/authentication as soon as a nonce repeats. This means we can modify the ciphertext in the middle and no-one will realize it. But modifying the ciphertext doesn't mean we can modify the plaintext right? Wait for it...

Since AES-GCM is basically counter mode (CTR mode) with a GMac, the attacker can do the same kind of bitflip attacks he can do on CTR mode. This is pretty bad. In the context of TLS, you often (almost always) know what the website will send to a victim, and that's how you can modify the page with anything you want.

Look, this is CTR mode if you don't remember it. If you know the nonce and the plaintext, fundamentally the thing behaves like a stream cipher and you can XOR the keystream out of the ciphertext. After that, you can encrypt something else by XOR'ing the something else with the keystream :)

They found a pretty big number of vulnerable targets by just sending dozen of messages to the whole ipv4 space.

## My thoughts

Now, here's how the TLS 1.2 specification describe the structure of the nonce + counter in AES-GCM: [salt (4) + nonce (8) + counter (4)].

The whole thing is a block size in AES (16 bytes) and the salt is a fixed part of 4 bytes derived from the key exchange happening during TLS' handshake. The only two purposes of the salt I could think of are:

Pick the reason you prefer.

Now if you picked the second reason, let's recap: the nonce is the part that should be different for every different message you encrypt. Some increment it like a counter, some others generate them at random. This is interesting to us because the birthday paradox tells us that we'll have more than 50% chance of seeing a nonce repeat after $2^{32}$ messages. Isn't that pretty low?

# How to Backdoor Diffie-Hellman: quick explanation

## posted August 2016

I've noticed that since I published the How to Backdoor Diffie-Hellman paper I did not post any explanations on this blog. I just gave a presentation at Defcon 24 and the recording should be online in a few months. In the mean time, let me try with a dumbed-down explanation of the outlines of the paper:

I found many ways to implement a backdoor, some are Nobody-But-Us (NOBUS) backdoors, while some are not (I also give some numbers of "security" for the NOBUS ones in the paper).

The idea is to look at a natural way of injecting a backdoor into DH with Pohlig-Hellman:

Here the modulus $p$ is prime, so we can naturally compute the number of public keys (elements) in our group: $p-1$. By factoring this number you can also get the possible subgroups. If you have enough small subgroups $p_i$ then you can use the Chinese Remainder Theorem to stitch together the many partial private keys you found into the real private key.

The problem here is that, if you can do Pohlig-Hellman, it means that the subgroups $p_i$ are small enough for anyone to find them by factoring $p-1$.

The next idea is to hide these small subgroups so that only us can use this Pohlig-Hellman attack.

Here the prime $n$ is not so much a prime anymore. We instead use a RSA modulus $n = p \times q$. Since $n$ is not a prime anymore, to compute the number of possible public keys in our new DH group, we need to compute $(p-1)(q-1)$ (the number of elements co-prime to $n$). This is a bit tricky and only us, with the knowledge of $p$ and $q$ should be able to compute that. This way, under the assumptions of RSA, we know that no-one will be able to factor the number of elements ($(p-1)(q-1)$) to find out what subgroups there are. And now our small subgroups are well hidden for us, and only us, to perform Pohlig-Hellman.

There is of course more to it. Read the paper :)

comment on this story

# Generating randomness and you're too close to boot?

## posted July 2016

If you want to generate good randomness, but are iffy about /dev/urandom because your machine has just booted, and you also don't know how long you should wait before /dev/urandom has enough entropy, then maybe you should consider using getrandom (thanks rwg!). From the manpage:

By default, getrandom() draws entropy from the /dev/urandom pool.

If the pool has not yet been initialized, then the call blocks

Also it seems like the instruction RDRAND on certain Intel chips returns "true" random numbers. It's also interesting to see that it was audited twice by Cryptography Research, which resulted in two papers, the recent one being in 2012 and done by Kocher et al: Analysis of Intel's Ivy Bridge Digital Random Number Generator.

comment on this story

# Generating one's own elliptic curve.

## posted July 2016

Dear Mr.DAVID
I am learning about generating an elliptic curves cryptography , in your notes I find:-
JPF: Many people don’t trust NIST curves. How many people verified the curve generation? Open source tools would be nice.
Flori: people don't trust NIST curves anymore, surely for good reasons, so if we do new curves we should make them trustable. Did anyone here tried generating nist, dan, brainpool etc...? (3 people raised their hands).
Would you give me some reasons for generating our own curves? what are the good reasons that some people do not trust on NIST CURVES?
hager

Hello Mr. HAGER

Well, the history of how standard curves were generated has been pretty chaotic, and djb has shown that this might be a bad thing for us in his Bada55 paper.

NIST says that they generated their curves out of the hash of a sentence that is unknown and lost. The german Brainpool curves seem to ignore their own standards on how to generate secure curves.

one of the standard Brainpool curves below 512 bits were generated by the standard Brainpool curve-generation procedure

So how bad is it? We know that choosing random parameters for your crypto algorithms can lead to unsecure or even backdoored constructions. See how to choose Sboxes in DES, how to choose nothing-up-my-sleeve numbers for hash functions, how to choose secure parameters for Diffie-Hellman, how Dual EC is backdoored if you know how the main points P and Q were generated, ...

Well, so far we don't really know. The fact that we haven't been able to crack any of the curves we use is "reassuring". I like to look at bitcoin as a proof that at least one of them hasn't been broken :) But as many researchers suspect, the NSA might be years ahead of us in cryptanalysis. So it is possible that they might have found one (or more) issues with elliptic curve cryptography, and that they generated "weak" curves before publishing them through NIST's standards.

So far, there hasn't been enough advances in ECC (a relatively old field in cryptography) to make you worried enough to generate your own curves. Some people do that though, but they know what they're doing. They even released a paper on how to generate safe curves using nothing-up-my-sleeves parameters. Like that, you can review the generation process and see that nothing was done to harm you.

Some people also worry about sparse primes, and that's one more paranoid reason to use random curves as above.

But seriously, if you don't want to go through all the trouble, you should probably start using one of the curve specified in this RFC: Curve 25519 or Curve 448. They use state-of-the-art research, reproducible generation and have been vetted by many people.

comment on this story

# Blackhat + Defcon

## posted July 2016

Blackhat and Defcon are almost here! I'll be landing there on Friday 29th and will give a crypto training at Blackhat, then on the 5th will give a talk at the Defcon Village track about Diffie-Hellman and backdoors.

If you have any interest in cryptography and want to meet up say the word! We should probably organize a drunk cryptographer evening (anyone interested?)

Also, I should be looking in what talks are interesting, if you think a particular one is please share in the comment section =)

comment on this story

# I talked at the NCC Group open forum

## posted June 2016

I was at the offices of Braintree this evening, talking about the history of TLS, backdoors and Diffie-Hellman. If you missed it, my paper was released a few days ago and this is the talk that is packaged with it =)

my colleagues and I preparing the event in the beautiful atrium of Braintree

starting the talk!

Someone asked me for the slides, you can find them on the github repo. You can find the .keynote file as well containing videos (but you need osx).

I'll be looking into submitting this talk to conferences, if you have any idea where I'll be happy to hear suggestions =)

# How to Backdoor Diffie-Hellman is on ePrint

## posted June 2016

My latest whitepaper just got published on ePrint. It's available here.

Looking to seek answers to the recent Snowden revelations and the history of state agencies backdoors, this paper looks at what the secret spies might have been researching in order to find new ways to observe and tamper with the people's traffic. What if we just sabotaged one of the most trusted cryptographic algorithm of the last 40 years? What if we backdoored Diffie-Hellman?

comment on this story

# Interested in Crypto and living in Chicago?

## posted June 2016

Next week on wednesday NCC Group will host its second open forum in Chicago. And I'll be one of the speaker!

If you are interested in crypto, I'll be talking about backdoors and Diffie-Hellman. This will be the occasion of explaining what my latest whitepaper that was released today is about.

John Downey will also be talking about "Cryptography Pitfalls".

By the way, if you're interested in such events in Chicago. OWASP was today (and we even learned how to brew beer there). There are also other security related events that you can get update on from this twitter.

comment on this story

# DTLS and Finished messages

## posted June 2016

Sorry i am using DTLS 1.2 instead TLS 1.2. Kindly explain the structure of finished message, like how many bytes for "nonce", how many bytes are encrypted data and how many bytes for authentication tag.

I'm writing this because I want to clear things up: you can ask me pretty much anything and I'll do my best to answer here. So please ask away if you have a question in crypto! There is also a contact form for that.

## Differences between DTLS and TLS

TLS is an application layer protocol. Some people say that it's a transport layer protocol. This is false. It runs in whatever program you are using on top of TCP/UDP. Although it is used to transport the real content of the application: it can be seen as an intermediate transport layer running in the application layer.

The TLS we know, the one that runs in our browser, typically runs on top of TCP. But in some situations, the developer might want to use UDP to exchange packets. Since UDP forgives packet loss (think multiplayer video games or audio/video conferences), it is important that TLS is setup accordingly to forgive those packet loss as well. For this, we use a similar but different specification than TLS: DTLS.

DTLS is what you use if you need TLS on top of UDP.

The main DTLS RFC documents the differences with TLS. Most of them are modification to the protocol so that the connection doesn't terminate/break if a message is lost, duplicated, out of order, etc...:

• records can't be split into several datagrams
• the sequence number is written in each record
• errors (duplicates, loss, out of order) are tolerated
• no stream cipher can be used (no state can be used if errors are tolerated)
• protections against DoS attacks (apparently DTLS has some problems with DoS attacks)
• ...

## Finished messages

The simplest TLS handshake goes like this:

• the client sends its ClientHello packet
• the server replies with his ServerHello packet
• the client sends (his part of) the shared secret in a ClientKeyExchange packet

Now I omitted a bunch of packets that are usually part of the handshake as well. For example:

• the server usually sends his certificate after the ServerHello message.
• the server might also take part in the creation of the shared secret in some modes (including ephemeral modes)

But this is not what is interesting to us here.

After enough messages have been sent to compute the shared secret, a ChangeCipherSpec message is sent by both the client and the server to announce the beginning of traffic encryption. Followed directly by an encrypted Finished message authenticating all the previous handshake messages.

In my knowledge, the Finished message is the only encrypted message of a handshake. It is also the moment where the handshake is "authenticated" and where Man-In-The-Middle attacks usually stop.

Now what is in that Finished message?

Exactly the same things as in TLS. The TLS 1.2 RFC shines a bit more light on the subject:

struct {
opaque verify_data[verify_data_length];
} Finished;

verify_data = PRF(master_secret, finished_label, Hash(handshake_messages)) [0..verify_data_length-1];

finished_label =
For Finished messages sent by the client, the string "client finished". For Finished messages sent by the server, the string "server finished".

Don't forget that this Finished structure is then encrypted before being sent in a record. The Hash and the PRF we already defined in previous posts, the handshake_messages value is what interest us: it is the concatenation of all the binary data received and sent during the handshake, in order, and not including this Finished one.

Now DTLS has the particularity that some messages are re-sent, out of order, etc... so duplicates must be ignored, real order must be preserved.

## How do I know that?

Besides reading the RFC, you might often want to know what's happening for real. To be a bit more informative, let me tell you how I quickly get that kind of information when I'm curious:

• I setup a server key + certificate: openssl req -x509 -new -nodes -keyout key.pem -out server.pem.
• I start the server: openssl s_server -dtls1 -key key.pem -port 4433 -msg.
• I connect to it with a client: openssl s_client -dtls1 -connect localhost:4433 -msg.

The -msg argument will print out the exact content of the messages sent. In the case of the Finished message, it will show the unencrypted hexadecimal data sent. If you want to see the real encrypted data that is sent, you can use the -debug option.

You might also want to have a bit more information about every records. A good way to do this is to record the traffic with tcpdump: sudo tcpdump udp -i lo0 -s 65535 -w handshake.pcap and to open the .pcap file in Wireshark and enter udp && dtls in the filter area.

# Crypto training at Black Hat US 2016

## posted May 2016

I'll be in Vegas for the next Black Hat USA giving a crypto training with my coworkers Javed Samuel and Alex Balducci.

You have a nice summary on the official training page.

It's 2 days crash course on our knowledge as crypto consultants. There will be a lot of general culture, protocols, side-channels, random numbers, ... as well as deep segments and exercises on common crypto attacks.

If you're not interested, you can still hit me up for drinks around Black Hat or Defcon, I'll be in Vegas. Otherwise I'll be in Santa Barbara a few week after for CRYPTO and CHES. If you're in need of crypto conference, there is also SAC happening in Canada between Defcon and CRYPTO (I won't be there).

comment on this story

# Referencing eprint papers

## posted May 2016

Just realized that each ePrint paper has a BibTeX snippet to reference the paper in your own work. If you don't know what I'm talking about, I'm talking about that:

Go on any paper's page and click on the BibTeX Citation link. The snippet needs to be included in a .bib file and referenced from the your main .tex

@misc{cryptoeprint:2015:767,
author = {Daniel J. Bernstein and Tanja Lange and Ruben Niederhagen},
title = {Dual EC: A Standardized Back Door},
howpublished = {Cryptology ePrint Archive, Report 2015/767},
year = {2015},
note = {\url{http://eprint.iacr.org/}},
}

# How Gavin Andresen was duped into believing Wright is Satoshi

## posted May 2016

I won't be talking about how weird Wright is, how he was debunked a multitude of times, and how you shouldn't believe the press.

What I will be talking about, is how Gavin Andresen, a main bitcoin developer, could have been duped into thinking Wright was Satoshi:

I believe Craig Steven Wright is the person who invented Bitcoin.

First. How weird is it that instead of just signing something with Satoshi's public key and releasing it on the internet, Mr. Wright decides to demo a signature verification on closed door to TV channels, magazines and some bitcoin dev on a trip to London? From reddit, Gavin wrote:

Craig signed a message that I chose ("Gavin's favorite number is eleven. CSW" if I recall correctly) using the private key from block number 1. That signature was copied on to a clean usb stick I brought with me to London, and then validated on a brand-new laptop with a freshly downloaded copy of electrum. I was not allowed to keep the message or laptop (fear it would leak before Official Announcement).

That last sentence... is the rule number one in a magic trick or scam.

Now people are pointing out from the blogpost this intentional mistake in Wright's script to verify a signature:

Looks like the signature is also taken from an old transaction, so Wright is re-using a signature to prove he verified "something". Except he swapped the something.

Another way he might have done it, on his website he also has a oneliner:

>> base64 –decode signature > sig.asn1 & openssl dgst -verify sn-pub.pem -signature sig.asn1 sn7-message.txt

But the & sign is a single one instead of &&, making the two commands run at the same time instead of running the second command after the first one.

Someone else points at other tricks, like using doppelganger letters from the UTF-8 set of characters to trick someone in a demo.

myvar = "foo"
myvаr = "bar"  # This is a *different* variable.

print("first one:", myvar)
print("second one:", myvаr)

The amount of sleigh of hands that could have been done here is actually really interesting. We have cryptographic signatures so that we don't need to believe in human lies, but if the human is in between you and the verification of the signature, then good luck!

Also that python script makes me think of using this as a proof of concept backdoor.

comment on this story