Hey! I'm David, the author of the Real-World Cryptography book. I'm a crypto engineer at O(1) Labs on the Mina cryptocurrency, previously I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

# Bruteforce Apr1 hashes

posted May 2014

One of my professors organized a Hacking Week this semester but I didn't have time to take part. Since I'm on holiday I thought I would take a look at the challenges and write a bit about how I solved them.

Here's the Crypto Challenge number 2 (out of 5) from this CTF (Capture The Flag):

```
user0:$apr1$oTsx8NNn$bAjDZHpM7tCvHermlXKfZ0
user1:$apr1$UxOdpNtW$funTxZxL/8y3m8STvonWj0
user2:$apr1$w7YNTrjQ$0/71H7ze5o9/jCnKLt0mj0
user3:$apr1$AIw2h09/$Ti0TRlU9mDpCGm5zg.ZDP.
user4:$apr1$048HynE6$io7TkN7FwrBk6PmMzMuyC.
user5:$apr1$T2QG6cUw$eIPlGIXG6KZsn4ht/Kpff0
user6:$apr1$2aLkQ0oD$YRb6aFYMkzPoUCj70lsdX0
```

You have 7 different users with their respective hashed passwords and you have to find them. It's just the 2nd out of 5 crypto problems, so it's pretty basic, but I had never brute forced passwords for real before (I remember using John The Ripper when I was in middle school, but that's for script kiddies).

What's Apr1? It's a password hash function built on MD5 (it's the format Apache uses in htpasswd files). And MD5 is pretty weak, there are lots of rainbow tables for it on Google.

This is how Apr1 looks in PHP according to Wikipedia. Also, the passwords are supposed to be made of lowercase letters only (a to z).

```php
function apr1($mdp, $salt) {
    $max = strlen($mdp);
    $context = $mdp.'$apr1$'.$salt;
    $binary = pack('H32', md5($mdp.$salt.$mdp));
    // mix in as many bytes of the first digest as the password is long
    for ($i = $max; $i > 0; $i -= 16)
        $context .= substr($binary, 0, min(16, $i));
    // one byte per bit of the password length
    for ($i = $max; $i > 0; $i >>= 1)
        $context .= ($i & 1) ? chr(0) : $mdp[0];
    $binary = pack('H32', md5($context));
    // 1000 rounds of MD5 to slow things down
    for ($i = 0; $i < 1000; $i++) {
        $new = ($i & 1) ? $mdp : $binary;
        if ($i % 3) $new .= $salt;
        if ($i % 7) $new .= $mdp;
        $new .= ($i & 1) ? $binary : $mdp;
        $binary = pack('H32', md5($new));
    }
    // shuffle the digest bytes, then encode them with a custom base64 alphabet
    $hash = '';
    for ($i = 0; $i < 5; $i++) {
        $k = $i + 6;
        $j = $i + 12;
        if ($j == 16) $j = 5;
        $hash = $binary[$i].$binary[$k].$binary[$j].$hash;
    }
    $hash = chr(0).chr(0).$binary[11].$hash;
    $hash = strtr(
        strrev(substr(base64_encode($hash), 2)),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/',
        './0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
    );
    return '$apr1$'.$salt.'$'.$hash;
}
```

It seems pretty difficult to reverse. Let's not forget that hashes are one-way functions and that they also lose information. I don't know if they do lose information on a 7-letter password though, but it seemed quite stupid to go down this road when I could just brute force it.

What language offers a good library to hash with Apr1? Well, I didn't know, and I felt like maybe Unix could do it for me. Turns out that OpenSSL has a command line for it:

```
openssl passwd -apr1 -salt SALT PASSWD
```

A quick bash script later:

```bash
#!/bin/bash

test[1]='$apr1$oTsx8NNn$bAjDZHpM7tCvHermlXKfZ0'
salt[1]='oTsx8NNn'

test[2]='$apr1$UxOdpNtW$funTxZxL/8y3m8STvonWj0'
salt[2]='UxOdpNtW'

test[3]='$apr1$w7YNTrjQ$0/71H7ze5o9/jCnKLt0mj0'
salt[3]='w7YNTrjQ'

test[4]='$apr1$AIw2h09/$Ti0TRlU9mDpCGm5zg.ZDP.'
salt[4]='AIw2h09/'

test[5]='$apr1$048HynE6$io7TkN7FwrBk6PmMzMuyC.'
salt[5]='048HynE6'

test[6]='$apr1$T2QG6cUw$eIPlGIXG6KZsn4ht/Kpff0'
salt[6]='T2QG6cUw'

test[7]='$apr1$2aLkQ0oD$YRb6aFYMkzPoUCj70lsdX0'
salt[7]='2aLkQ0oD'

# hash every 7-letter dictionary word with each user's salt
while read -r line
do
    if [ "${#line}" == 7 ]
    then
        for num in {1..7}
        do
            noob=$(openssl passwd -apr1 -salt "${salt[$num]}" "$line")
            if [ "$noob" == "${test[$num]}" ];
            then
                echo "$line";
            fi
        done
    fi
done < /usr/share/dict/words
```

I read /usr/share/dict/words, which contains a simple dictionary of words on Unix, and I try only the 7-letter words.

The test ran in a few minutes and gave me nothing. Well, I guess with a 7-letter password they must have used gibberish words. Let's try a real bruteforce:

```bash
# same test[] and salt[] arrays as in the previous script
for a in {a..z}
do
for b in {a..z}
do
for c in {a..z}
do
for d in {a..z}
do
for e in {a..z}
do
for f in {a..z}
do
for g in {a..z}
do
    truc=$a$b$c$d$e$f$g;

    for num in {1..7}
    do
        noob=$(openssl passwd -apr1 -salt "${salt[$num]}" "$truc")
        if [ "$noob" == "${test[$num]}" ];
        then
            echo "$truc";
        fi
    done
done
done
done
done
done
done
done
```

It ran and ran and... nothing.
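One reason the shell version crawls is that it starts a new `openssl` process for every single candidate. As a rough sketch (not what I actually ran), the PHP function from Wikipedia can be ported line by line to Python with only `hashlib` and `base64`, and the candidates generated with `itertools.product`. The names `apr1`, `candidates` and `crack` are made up for this example, and the port should be checked against the output of `openssl passwd -apr1` before being trusted.

```python
import base64
import hashlib
from itertools import product
from string import ascii_lowercase

# Same two alphabets as the strtr() call in the PHP version.
B64_STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
B64_APR = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
TRANSLATE = str.maketrans(B64_STD, B64_APR)


def apr1(password: str, salt: str) -> str:
    """Line-by-line port of the PHP apr1() function (hypothetical helper)."""
    pw, s = password.encode(), salt.encode()
    context = pw + b"$apr1$" + s
    binary = hashlib.md5(pw + s + pw).digest()
    i = len(pw)
    while i > 0:  # mix in bytes of the first digest
        context += binary[:min(16, i)]
        i -= 16
    i = len(pw)
    while i > 0:  # one byte per bit of the password length
        context += b"\x00" if i & 1 else pw[:1]
        i >>= 1
    binary = hashlib.md5(context).digest()
    for i in range(1000):  # the 1000 MD5 rounds
        new = pw if i & 1 else binary
        if i % 3:
            new += s
        if i % 7:
            new += pw
        new += binary if i & 1 else pw
        binary = hashlib.md5(new).digest()
    raw = b""
    for i in range(5):  # same byte shuffle as the PHP loop
        k, j = i + 6, i + 12
        if j == 16:
            j = 5
        raw = bytes([binary[i], binary[k], binary[j]]) + raw
    raw = b"\x00\x00" + binary[11:12] + raw
    encoded = base64.b64encode(raw).decode()[2:][::-1]
    return "$apr1$" + salt + "$" + encoded.translate(TRANSLATE)


def candidates(length: int = 7):
    """Yield every lowercase word of the given length, in alphabetical order."""
    for letters in product(ascii_lowercase, repeat=length):
        yield "".join(letters)


def crack(targets: dict, words):
    """targets maps a salt to its full hash; yields (salt, password) matches."""
    for word in words:
        for salt, full_hash in targets.items():
            if apr1(word, salt) == full_hash:
                yield salt, word


# Toy run on a hash we build ourselves, with 3 letters so it ends quickly.
demo = {"oTsx8NNn": apr1("abc", "oTsx8NNn")}
print(next(crack(demo, candidates(3))))
```

Even without the process spawning, 7 lowercase letters means 26^7 (about 8 billion) candidates, each costing a thousand MD5 calls per salt, so pure Python is not going to finish quickly either.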

Well. Let's not spend too much time on this. John The Ripper does this well, and oclHashcat even does it on the GPU.

Let's create a john.conf with the following to limit the password to 7 letters:

```
[Incremental:Alpha7]
File = $JOHN/alpha.chr
MinLen = 7
MaxLen = 7
CharCount = 26
```

Let's launch John:

```
john -i=Alpha7 hackingweek.txt
```

(don't forget to put the hashed passwords in hackingweek.txt). Wait and wait and wait... and get the passwords =)

# Find all the pairs in a list that are summing to a known number

posted May 2014

I got asked this question in an interview. I knew about this question beforehand, and that it had to do with hash tables, but I never got around to digging into it since I thought nobody would ask me that for a simple internship.

I didn't know how to answer. In my mind I just had a simple PHP script that would have looked like this:

```php
$arr = array(-5, 5, 3, 1, 7, 8);
$target = 8;

// try every pair (i, j) with i < j
for ($i = 0; $i < sizeof($arr) - 1; $i++)
{
    for ($j = $i + 1; $j < sizeof($arr); $j++)
    {
        if ($arr[$i] + $arr[$j] == $target)
            echo "pair found: {$arr[$i]}, {$arr[$j]}";
    }
}
```

But it's pretty slow. It's mathematically correct, but this is more of a CS-oriented question: how do you implement that so it runs quickly on a machine? The answer is hash tables, which are implemented as arrays in PHP (well, arrays are like super hash tables) and as dictionaries in Python.

I came up with this simple example in Python:

```python
arr = (-5, 5, 3, 1, 7, 8)
target = 8

dic = {}  # value -> index of that value in arr
for i, item in enumerate(arr):
    # is the number that completes the sum already in the table?
    if target - item in dic:
        print(item, target - item)
    dic[item] = i
```

1. iterate over the list
2. for the number we're on, compute the difference between the target sum and that number and look it up in the hash table: if we find it, that's good!
3. only then store the number in the hash table, with its index in the array as the value
4. since the lookup only sees numbers that came earlier in the list, we never find a pair twice, and a number is never paired with itself.

Voilà! We avoid the n(n-1)/2 additions and comparisons of the first idea: with a hash table each lookup takes constant time on average, so the whole thing is a single pass over the list (and since most things use hash tables in IT, I guess that it is pretty fast).

# I got my first dogecoins today

posted May 2014

Time to celebrate! And thanks r/dogecoin for tipping me!

If you want some of my dogecoins just comment :D

# Notes on ECC (Elliptic Curve Cryptography) & Internship progress

posted May 2014

One last exam, ECC, and then I'm free to do whatever I want (no, I still haven't found an internship, but I talked with TrueVault, Cloudflare, MatterMark, Spotify and maybe Matasano, so this has been a good experience nonetheless).

I stumbled upon the notes of Ben Lynn, a former Stanford student who took an ECC class there. They're pretty awesome and I kinda want to do something like that on this blog.
Maybe next year, it's a bit late for that :)

The notes are here

# Do you know what Elliptic Curve Cryptography is?

posted May 2014

# Toom-Cook multiplication for dummies

posted April 2014

We're learning a lot of algorithms in my algèbre et calcul formel (algebra and computer algebra) class. One of them is the Toom-Cook algorithm, used for multiplying large integers. I found a super simple explanation of it on a forum, it helps:

> Say, we want to multiply 23 times 35.
> We write, p(x) = 2x + 3, q(x) = 3x + 5.
> We are using our realization that any integer can be written as a polynomial. Here, p(x) represents 23, and q(x) represents 35, when x equals 10.
>
> We write, p(x)q(x) = r(x). That is, p(x) times q(x) equals r(x).
> So, (2x + 3)(3x + 5) = ax^2 + bx + c = r(x).
>
> Now, p(0)q(0) = r(0).
> So, (2·0 + 3)(3·0 + 5) = a·0^2 + b·0 + c.
> Therefore, c = 15.
>
> Now, p(1)q(1) = r(1).
> Therefore, when we do the substitutions (for x and c), a + b = 25.
>
> Now, p(-1)q(-1) = r(-1).
> Therefore, when we do the substitutions (for x and c), a - b = -13.
>
> Now, we already know c, and we just need to find a and b.
> We have two linear equations and two unknowns, a + b = 25, a - b = -13.
> We just add the two equations and we get, 2a = 12. Therefore, a = 6.
> Now, we can substitute 6 for a in a + b = 25, and we get b = 19.
> So, r(x) = 6x^2 + 19x + 15.
>
> Now, we substitute 10 for x in r(x), and we are done, r(10) = 600 + 190 + 15 = 805.
> Believe it or not!

# Why can't I copy PS3 games and play them on another console?

posted April 2014

I've always wondered why we can't easily copy the entire content of a CD/DVD/Blu-ray onto another one and play it on a PS1/PS2/PS3 (and I guess the PS4 and its competition).
Here's part of an answer on psx-scene's forum:

> Whenever you insert a disc (bluray one that is) the ps3 drive will look at a special area of the disc called the Pic Zone (the BD ROM Mark is actually used in movie discs but not in game unlike what I first thought). This area cannot easily be dumped (you'd pretty much need a bluray drive with a hacked firmware) and of course that specific area cannot be burned on any kind of discs or with any kind of burners commercially available.

Reading this made me apply to Sony for an internship :)

# Weblang

posted April 2014

I've been writing html, xhtml, and now html5 for ages. I think I started in 2001 (13 years ago).

I had to go through `<br>` becoming `<br />` becoming `<br>` again.
I had to go through different doctypes.
I had to go through new tags like `<header>` and `<footer>`.

But I never had to go through a syntax change. Why is that? I don't understand why HTML is a language based on tags. It is unnecessary and it just adds time and confusion to typing in html.

I haven't run into any project aimed at changing that syntax. And I thought, why not do it myself? (and if there is already such a project please tell me!)

So I thought about a new language to write static web pages called web or weblang.

No tags. Indentation. Simple doctype.

A simple index.web would look like this:

```
\web:1 // this is a doctype

\head
    $title: 'Weblang example';
    $css: 'css/app.css';

\body
    $header .monheader{
        $h1 "Weblang";
    }
    $section #introduction{
        $h2 "What is Weblang?";
        $p "Weblang is an elegant way of writing static webpages"
           "HTML is annoying to write." // there will be a breakline here
        $p{
            what about just writing text like this, it's kinda easier
        }
    }

    // what about just writing text
    This is a text block, it will just render as text
    in this text I want a list here :
    $ul{
        $li "with text in it";
    }
    $ul .links{
        $li{
            $a "more info" href: 'https://github.com/mimoo/weblang';
        }
        $li $a{
            tags can be chained
        }
    }

    $javascript 'js/jquery.js';
    $script 'js/script.js' type: 'javascript';
```

This is just a first draft. The biggest problem is that plain text and code are mixed. The trick I used here is to use `$` to tell the render engine that what follows is not plain text. Might not be super clever. I need to brainstorm a bit more about this.

Also, I need to look at Sass's code to see how a compiler works. It seems to be a bunch of regexes.
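To get a feel for the regex approach, here is a tiny Python sketch that renders only the simplest kind of statement from the draft, a single-line `$tag "text";`. Everything in it is an assumption made for illustration (the `render_line` name, the exact pattern, the choice to escape text), and it ignores blocks, classes, ids, attributes and chained tags, which would probably need a real parser rather than a regex.

```python
import html
import re

# Matches one statement such as:  $h1 "Weblang";
# (a tag name, a double-quoted string, then an optional semicolon)
STATEMENT = re.compile(r'^\s*\$(\w+)\s+"([^"]*)"\s*;?\s*$')


def render_line(line: str) -> str:
    """Render a single `$tag "text";` statement, or pass plain text through."""
    match = STATEMENT.match(line)
    if match is None:
        # no leading $tag: treat the line as plain text
        return html.escape(line.strip())
    tag, text = match.groups()
    return "<{0}>{1}</{0}>".format(tag, html.escape(text))


print(render_line('$h1 "Weblang";'))
print(render_line('$h2 "What is Weblang?";'))
print(render_line("This is a text block"))
```

The `$` prefix is what lets a single pattern tell a statement apart from a line of plain text, which is the trick the draft relies on.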