david wong

Hey! I'm David, cofounder of zkSecurity and the author of the Real-World Cryptography book. I was previously a crypto architect at O(1) Labs (working on the Mina cryptocurrency), before that I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

Quick access to articles on this page:

more on the next page...

FAQ on Whitebox Cryptography posted December 2014

A great FAQ, written by Marc Joye (Thomson R&D), on Whitebox Cryptography.

http://joye.site88.net/papers/Joy08whitebox.pdf

Thansk Tancrède for the link!

Q1: What is white-box cryptography?
A major issue when dealing with security programs is the protection of "sensitive" (secret, confidential or private) data embedded in the code. The usual solution consists in encrypting the data but the legitimate user needs to get access to the decryption key, which also needs to be protected. This is even more challenging in a software-only solution, running on a non-trusted host.
White-box cryptography is aimed at protecting secret keys from being disclosed in a software implementation. In such a context, it is assumed that the attacker (usually a "legitimate" user or malicious software) may also control the execution environment. This is in contrast with the more traditional security model where the attacker is only given a black-box access (i.e., inputs/outputs) to the cryptographic algorithm under consideration.

Q2: What is the difference with code obfuscation?
Related and complementary techniques for protecting software implementations but with different security goals include code obfuscation and software tamper-resistance. Code obfuscation is aimed at protecting against the reverse engineering of a (cryptographic) algorithm while software tamper-resistance is aimed at protecting against modifications of the code.
All these techniques have however in common that the resulting implementation must remain directly executable.

Or as Francis Gabriel writes here

Code obfuscation means code protection. A piece of code which is obfuscated is modified in order to be harder to understand. As example, it is often used in DRM (Digital Rights Management) software to protect multimedia content by hiding secrets informations like algorithms and encryption keys.

comment on this story

Last exam of my life posted December 2014

And I just passed the last exam of this semester, which should be the last exam of my life =) Now is time to take a few days to relax and eat nice food because it will soon be christmas ^^ (or holidays, as I heard some american say to avoid saying christmas).

A few interesting things I had to do during my exams these last few days:

  • Simple Power Analysis (SPA). Guess what algorithm is used from smartcards' traces and calculate the exponent if it's a binary exponentiation

spa

In the picture you can see two patterns, "1" is represented by two operations in the algorithm, and one of them is squaring which happens also when you have a "0" in your exponent's binary representation. So following the computations revealed by the power trace you can guess the binary representation of the exponent.

I had to read this article explaining two malloc implementations and their vulnerabilities. GNU Lib C (used in Linux) and System V AT&T (used in Solaris, IRIX). I knew the double chained list system but System V uses a different approach: binary tree and also a realfree function that completes the free function.

comment on this story

Airbus crypto challenge write-up posted December 2014

Airbus made a "private" challenge called « Trust the future » and accessible only by some selected schools (epitech, insa, and others). I wasn't invited to participate but there was a "crypto" challenge I thought was interesting. Since the challenge just finished I'm posting the write up.

Crypto challenge #1

crypto1

We have 4 certificates and a challenge1 file that seems to be a s/mime file of a pkcs#7 enveloped-data object.

2.4.3 EnvelopedData Content Type
This content type is used to apply privacy protection to a message. A sender needs to have access to a public key for each intended message recipient to use this service. This content type does not provide authentication.

and

3.2 The application/pkcs7-mime Type
The application/pkcs7-mime type is used to carry CMS objects of several types including envelopedData and signedData. The details of constructing these entities is described in subsequent sections. This section describes the general characteristics of the application/pkcs7-mime type.

rfc2633

Certificates

We dump the info of each certificates in human readable format, openssl has commands for that (I think certtool does as well, but I'm on windows using cmder and openssl is the one included).

openssl x509 -in alice.crt -text -noout -out alice.crt.txt

crypto2

We see that alice, bob and charly use the same rsa exponent (3).

Reminder: RSA

If you're familiar with RSA (and it's highly probable you are if you read this blog) you can skip this section.

RSA is an asymmetric encryption scheme (also used as a signature). It works by generating a set of private key/public key, the private key is of course kept private and the public key is publicly disclosed. If someone wants to send us a private message he can encrypt it with our public key and we will be able to decrypt it with the private key. The public key is the pair of number (n, e) where n is called the modulus and e is called the exponent. If we want to encrypt a message m with the public key we "basically" do c = m^e modulo n and send c. To decrypt it we use our private key d like this: m = c^d modulo n.

The math behind this is that n is generated from two secret primes p and q (big enough) n = p x q and d = e^-1 modulo (p-1)(q-1), (p-1)(q-1) being phi(n) being the order of the multiplicative group Z/nZ. The security comes from the fact that it's Computationally Hard to find the inverse of e if we don't know p and q. By the way, Heartbleed (a recent attack on openssl) led to finding one of the prime, thus the entire decomposition of n.

Textbook RSA vs real life RSA

This is all theory. And in practice we have to go through several steps to encrypt an ascii message, make sure it is of length lesser than the modulus, make sure the modulus is big enough, etc...

Textbook RSA is also deterministic and thus not semantically secure (see my previous post) + it is malleable: imagine you intercept c, and of course you know (n, e) (the public key). You could compute c' = 2^e * c = 2^e * m^e = (2m)^e modulo n, this would correctly decrypt as 2m.

Thus, to counter those in practice, RSA Encrytion uses padding (usually OAEP) to make it probabilist and not malleable.

Let's go back to our challenge

We open our challenge1 file:

MIME-Version: 1.0
Content-Disposition: attachment; filename="smime.p7m"
Content-Type: application/x-pkcs7-mime; smime-type=enveloped-data; name="smime.p7m"
Content-Transfer-Encoding: base64

MIIy1wYJKoZIhvcNAQcDoIIyyDCCMsQCAQAxggQ0MIIBYgIBADBKMDYxCzAJBgNV
BAYTAkZSMQ4wDAYDVQQHEwVQYXJpczEXMBUGA1UEAxQOY2FAZXhhbXBsZS5jb20C
EGOE4rIYS8v1jszxDKemVjwwDQYJKoZIhvcNAQEBBQAEggEAweI1fG/FPxzF4Odu
sSJL6PJOiDklHPlUqYCQxFSfG6+3vLEAbdKpgtVsHS0+a0IhItAfeNoAmXdreJFi
6M6U7j0ee4iqgXXbuG8vSsZTYbyUmzuQRgdByu5vGr3FvWxSlvvI8tr/d/cRDqMt

To read that we need to extract the pkcs7 object and parse it. Openssl allows us to do this:

openssl smime -in challenge1 -pk7out -out challenge1.p7m
openssl asn1parse -text -in challenge1.p7m

We get an annoying dump of info to read. With three of those things:

 95:d=6  hl=2 l=  16 prim: INTEGER           :6384E2B2184BCBF58ECCF10CA7A6563C
  113:d=5  hl=2 l=  13 cons: SEQUENCE          
  115:d=6  hl=2 l=   9 prim: OBJECT            :rsaEncryption
  126:d=6  hl=2 l=   0 prim: NULL              
  128:d=5  hl=4 l= 256 prim: OCTET STRING      [HEX DUMP]:C1E2357C6FC53F1CC5E0E76EB1224BE8F24E8839251CF954A98090C4549F1BAFB7BCB1006DD2A982D56C1D2D3E6B422122D01F78DA0099776B789162E8CE94EE3D1E7B88AA8175DBB86F2F4AC65361BC949B3B90460741CAEE6F1ABDC5BD6C5296FBC8F2DAFF77F7110EA32D330D38DD2CA2FE13E785C86FE2210B58074C2DA5F440794BA023FC98B3D1E7DC979DBAC6672B5C19ABF4A91E21D5E474475BC09B78910D1F8E0290B38AE8D756E04D7F5EFBA64BFB5A0E96CD3DE1D82F609544A423F666D08B63262229687E1982BC8E424C7B5266B11A59036625F8E92C06740A3C9D8F3CE87FEB1F4444BC2039C8C6FF0AB9457D8AA63851ECF3C4AF1A2328FD

Which means the same message was sent to three recipients, identified by their serial number which we recognize as being our alice, bob and charly.

We also get this at the end:

 1110:d=4  hl=2 l=   9 prim: OBJECT            :pkcs7-data
 1121:d=4  hl=2 l=  20 cons: SEQUENCE          
 1123:d=5  hl=2 l=   8 prim: OBJECT            :des-ede3-cbc
 1133:d=5  hl=2 l=   8 prim: OCTET STRING      [HEX DUMP]:01D4CE3AF4D17ABB

Which means that the data sent (after this dump) is encrypted by 3DES version 3 (three different keys) in CBC mode with an IV 01D4CE3AF4D17ABB.

Reminder: DES-EDE3-CBC

I like to put reminders like this so you don't have to switch to Wikipedia if you don't remember what are those letters.

DES (Data Encryption Standard) is the famous no-longer-used block cipher (because it was broken ages ago). EDE3 short for the third version of the Triple DES block cipher (that is still considered secure today, it was a response to DES no longer being secure) which uses 3 different keys. Encrypting is done like this:

  • we encrypt with key1
  • then we decrypt with key2
  • then we encrypt again with key3
E(k3, D(k2, E(k1, M)))

Hence the triple DES.

CBC is a mode of operation. A block cipher can only encrypt/decrypt blocks of a certain size (64bits with DES). If you want to do more (or less) you have to use a mode of operation (and a padding system).

cbc

Chinese Remainder Theorem

Here the interesting thing is that the same message was sent to three different recipients, encrypted with the same exponent (3). Let's write down the informations we have:

c1 = m^3 modulo n1
c2 = m^3 modulo n2
c3 = m^3 modulo n3

c1 being the encrypted message sent to Alice, n1 being Alice's modulus, and so on...

We have a system with one unknown: the message. The Chinese Remainder Theorem works in a similar fashion to Lagrange Interpolation (anecdote time: it is used in Shamir's Secret Sharing).

So that we have:

m^3 = c1 * n2 * n3 * ((n2 * n3)^-1 [n1]) + 
        c2 * n1 * n3 * ((n1 * n3)^-1 [n2]) +
            c3 * n1 * n2 * ((n1 * n2)^-1 [n3])
                modulo n1 * n2 * n3

A brief explanation: We have `c1 = m^3 modulo n1, to place it in a formula modulo n1 * n2 * n3 we have to cancel it when it's modulo n2 or modulo n3. How to make something congruent to zero when its modulo n2 or n3 ? Make it a multiple of n2 or n3. So we multiply c1 with n2 and n3. But then when it will be modulo n1 we will have the value c1 * n2 * n3 which is not correct (c1 = m^3 modulo n1 !). So let's cancel the n2 and n3 with their inverse modulo n1. We then have c1 * n2 * n3 * ((n2 * n3)^-1 [n1]). We do this with all the equations to find the bigger equation. This is the Chinese Remainder Theorem. Simple no?

And this result is even more useful since we know that:

m < n1
m < n2
m < n3
=>
m^3 < n1*n2*n3

Of course if m was greater than one of the modulus then it would decrypt incorrectly. So what we have is:

m^3 = something modulo n1*n2*n3
=>
m^3 = something

That's right, we can get rid of the modulo. We then do a normal cubic root and we find m.

Here's the quick python code I hacked together for this:

(by the way we can quickly get the modulus of each recipients with openssl: openssl x509 -in alice.crt -modulus)

## 6384E2B2184BCBF58ECCF10CA7A6563C (Alice)
c1 = "C1E2357C6FC53F1CC5E0E76EB1224BE8F24E8839251CF954A98090C4549F1BAFB7BCB1006DD2A982D56C1D2D3E6B422122D01F78DA0099776B789162E8CE94EE3D1E7B88AA8175DBB86F2F4AC65361BC949B3B90460741CAEE6F1ABDC5BD6C5296FBC8F2DAFF77F7110EA32D330D38DD2CA2FE13E785C86FE2210B58074C2DA5F440794BA023FC98B3D1E7DC979DBAC6672B5C19ABF4A91E21D5E474475BC09B78910D1F8E0290B38AE8D756E04D7F5EFBA64BFB5A0E96CD3DE1D82F609544A423F666D08B63262229687E1982BC8E424C7B5266B11A59036625F8E92C06740A3C9D8F3CE87FEB1F4444BC2039C8C6FF0AB9457D8AA63851ECF3C4AF1A2328FD"

n1 = "EFBA9C442084759DC9770021B03C2E2913053E770779316F92C5DBFCAE4D3682E64006E38FA6A3AC24CC13AD2E747A50E5735064549F590294E36F2A1B23DB29567B49C007F8F8C224D3CD19B81D3F198C540291C135965E549881B775EEE29684F0E6CD4C2A017BE38F2E78E070D503BE9EE3EA2C491E53DE9C705FEB973918A168F275D90D055778289598BD2377D79ACC1BA493F570C5C8301913CEF12CD513321F8F320D8EC8172182D03F33721F02DFCE24463AE7A6CAB7C3A0CBB7D2AB149D347A2C9ABDB81BE4B60CAECBF31CF79C4BA0081FC00BB0939A950CBACA5B7B79FF92AF273B0D01A7E183FF30C90F27705D18F70EBB32281C5A873ED0A90D"

## 9F9D51BC70EF21CA5C14F307980A29D8 (Bob)
c2 = "27EB5A62A311DFAECF09318BEF7D60B98EA151AF09BDBF2A89A884617B8A8A14FF6F8045A8FD5D8956F5768C32A7E47AB17FA08D5F7D2EB590C4FC8296A1F70069C338CFA3C131A58FE05A75E36D7457ECC7B1BB403C1FF31FA66FB478B1F4548325B57961191AE1BFDDD7F5AF6FEC6FD94F66B1BC482337B579AD790466D1F33EDB09AA388085053D3C383F91A8EC40DB150365735B7E2D01F172E23717D31ABE0350FAC6730673C3C70EE593E8008A222DE40CF8A62615D119CD119FE8C30DA49E7A1D3596279B659E72C584D6D8262B1126B8C8BCDC09A31761EE746A14ADA7EC387AF2C52CBABDD8F443DF1F4C5E70C83B82A12F3BBCB21494689D13791F"
n2 = "C0028301D5645483592E2117463A804454BD1AE33B1CBEF33D9CDFF86EFD46CB24BA1984874DAE3A4A3D3609D9607276F91DFAD38E9ED6B6BCE15F29C49E1C7E0B532AE2A1343AF3F8064B4A093F23F981624D4E08F901787B761847B4F0963512EAC13B5D47DA4295FF614501A3D7D7EDA6B4E6197B974A70BF11FDC5D619D50974415E209DE76C68F190DBCA5BB6F1963C1E0E987BE105FF9082EAE003C2051A4C95E3299D3C1BC64D5F8E95D40C7B27EEEA5EBCD9B54CD6B5655B0AB9AEAA15E976AE37EA228A151D2AFD5417D4BCBC3BFB396E6B10AF561CBE0FD0374081B7034585C54096849FF82B79A117E44AE8FDB5B304BC0D9CA297C2F4A57FF91F"

## A6D4EF4DD38B1BB016D250C16A680470 (Charly)
c3 = "04991C5BA4882F329B03B18E2B317F4A54905ED4EB832B084A42AD700A0D3136A14BB57D61D4A1982E2CAB0FF773356759EE4AD77C1982E642CF574332AB32D109952FDE6221D77C35E4D0B69E559392DBE602E5336BD09239E85F21A70F4A824907AF75C9C372D4BE4C15E45431C35FE678E2646017D74186B3B084A41F217655A2ED262AA5C300BA737AB0DF270BD0B38A2FF215A3B5DB3CBB79350DDFEF1A08E40CB253B506D92002BBF4AD112AC1DDDB96CD4539A01035E76B1CC5C43427F46C83DBAA318387FE2C8C7FAA75FC0099050CF98671015A568CFFC56DFF6F8CB80A6A55B4CCB0D825AA9D99098DDA5D2EEC7D40D0BCCDA42D9E618A09AEC50A"
n3 = "CFA7854352FF9DF5E84AB10AB8F034F8106811D973BAB528BFCBD3DCBCFDF9FB5C398E23B58BA883F7C78F47C6694B4F042CDB8E54E856040F8A8A9ADBCA4C6D0813894C43352EB3EE19C1F76DF46DFD1B6BB38349BCE811036B0ED7ACFE2E5045FE4232F11DA3F113189A176964C206155342FD9E2E8AD11CBBCB85DFDF30E62AEA068F2DD7CEC6CF818D1E312BBA5FA6385461CA5ADCA0F95B6299FA366EEF8856416D72A42A93FD979E269D8FEA143870985FD353C85850FB4A11B6E4BA483CDC97F7E1717C34DF7D9E34DF83F67A9DA97ACA69926167D44C2CB3BB858EC041A244A6197D7F3B9AFD02A0562F13EACE6494F289184DAD16D2D995ED1ADC13"

## base16 -> base10
c1 = int(c1, 16)
c2 = int(c2, 16)
c3 = int(c3, 16)
n1 = int(n1, 16)
n2 = int(n2, 16)
n3 = int(n3, 16)

## extended euclide algorithm
def xgcd(a,b):
    """Extended GCD:
    Returns (gcd, x, y) where gcd is the greatest common divisor of a and b
    with the sign of b if b is nonzero, and with the sign of a if b is 0.
    The numbers x,y are such that gcd = ax+by."""
    prevx, x = 1, 0;  prevy, y = 0, 1
    while b:
        q, r = divmod(a,b)
        x, prevx = prevx - q*x, x  
        y, prevy = prevy - q*y, y
        a, b = b, r
    return a, prevx, prevy

## chinese remainder formula
n2n3 = n2 * n3
n1n3 = n1 * n3
n1n2 = n1 * n2

n2n3_ = xgcd(n2n3, n1)[1]
n1n3_ = xgcd(n1n3, n2)[1]
n1n2_ = xgcd(n1n2, n3)[1]

m3 = c1 * n2n3 * n2n3_ + c2 * n1n3 * n1n3_ + c3 * n1n2 * n1n2_

m3 = m3 % (n1n2 * n3)

print(m3)

from decimal import *

getcontext().prec = len(str(m3))
x = Decimal(m3)
power = Decimal(1)/Decimal(3)

answer = x**power
ranswer = answer.quantize(Decimal('1.'), rounding=ROUND_UP)

diff = x - ranswer**Decimal(3)
if diff == Decimal(0):
    print("x is the cubic number of", ranswer)
else:
    print("x has a cubic root of ", answer)

Note:

  • The xgcd function is included in sage but here I use Python so I included it in the code.
  • We need to use the decimal package to calculate the cubic root because our number is too big.

We then get this big ass number that we convert to hexadecimal (hex(number) in python). This yields:

0001ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff004f8957408f0ea202c785b95e206b3ba8da3dba7aea08dca1

We refer once more to the RFCs

8.1 Encryption-block formatting

A block type BT, a padding string PS, and the data D shall be formatted into an octet string EB, the encryption block.

          EB = 00 || BT || PS || 00 || D .           (1)

The block type BT shall be a single octet indicating the structure of the encryption block. For this version of the document it shall have value 00, 01, or 02. For a private- key operation, the block type shall be 00 or 01. For a public-key operation, it shall be 02.

The padding string PS shall consist of k-3-||D|| octets. For block type 00, the octets shall have value 00; for block type 01, they shall have value FF; and for block type 02, they shall be pseudorandomly generated and nonzero. This makes the length of the encryption block EB equal to k.

rfc 2315

We have our 3DES key: 4f8957408f0ea202c785b95e206b3ba8da3dba7aea08dca1 to use.

Let's get the hexdump the end of the file (you can use commandline utilities like base64, hexdump, dd and xdd):

openssl smime -in challenge1 -pk7out > b64file`
base64 -d b64file > hexfile
hexdump -s 1135 hexfile
dd
xdd

And finally decrypt our encrypted file with openssl since it provides a command for that:

openssl des-ede3-cbc -d -iv 01D4CE3AF4D17ABB -K 4f8957408f0ea202c785b95e206b3ba8da3dba7aea08dca1 -in encrypted

Voila ! That was really fun :)

6 comments

Is there any particular reason to use Diffie-Hellman over RSA for key exchange? posted December 2014

I was wondering why RSA was used in the SSL handshake, and why Diffie-Hellman was used instead in a Perfect Forward Secrecy scheme.

http://security.stackexchange.com/questions/35471/is-there-any-particular-reason-to-use-diffie-hellman-over-rsa-for-key-exchange

There is, however, an advantage of DH over RSA for generating ephemeral keys: producing a new DH key pair is extremely fast (provided that some "DH parameters", i.e. the group into which DH is computed, are reused, which does not entail extra risks, as far as we know). This is not a really strong issue for big servers, because a very busy SSL server could generate a new "ephemeral" RSA key pair every ten seconds for a very small fraction of his computing power, and keep it in RAM only, and for only ten seconds, which would be PFSish enough.

comment on this story

Real Life side-channels attacks posted December 2014

Some funny slides from Vitaly Shmatikov on side channels attacks: http://www.cs.utexas.edu/~shmat/courses/cs361s/sidechannels.pdf

So you can tell what someone is typing just by analyzing the sound of the fingers on the keyboard, from a certain distance.

If you observe someone typing at his computer from an outside window, you can analyze the reflections in many objects (glass teapots, plastic bottles, spoons!!! and even eyes).

Like we weren't worried enough.

comment on this story

Network/Software Security posted December 2014

I was trying to find some info about the Heap and malloc (for the level 14 of microcorruption) when I ran into some very good videos from the Infosec Institute. I cannot find the name of the speaker but damn he's so good I just lost 2 hours of my life just watching his videos about nmap, pentesting, metaspoilt, and so on...

Here is his video on Heap Overflow:

And his talks are on several different youtube channels. I don't know how legit this is, and if someone can find the name of that guy I would love to know more about him. More about him: Advanced Recon, Advanced Exploitation, and so on...

comment on this story

Pandoc : Markdown -> LaTeX posted December 2014

I just turned in my cryptanalysis project: A Linear Cryptanalysis of A5/2 with Sage. Had to write a rapport in LaTeX.

Now I have to finish the challenges over at Microcorruption and produce a write up in LaTeX as well.

Well LaTeX is awful as a writing syntax. I'd rather focus on writing with John Gruber's excellent markdown syntax and then later convert it to LaTeX. And Pandoc does just that! It's magic. Now that I have all my .md files I concat them with a quick python script

output = ""
for ii in range(1,15):
    markdown = open(str(ii) + ".md", "r")
    output += markdown.read()
    output += "\n\n"
    markdown.close()

output_file = open("rapport.md", "w")
output_file.write(output)
output_file.close()

A bit of Pandoc magic and voilà ! I have a beautiful .tex

Now let's finish Microcorruption (or at least try :D it's getting pretty hard).

PS: I use Markdown for everything. This blog is written in Markdown and then converted to HTML. I'm also writing a book in Markdown. Well Markdown is awesome.

comment on this story