Cryptography and assembly code posted March 2021

Thanks to Filippo streaming his adventures rewriting Golang assembly code into "cleaner" Golang assembly code, I discovered the Avo assembly generator for Golang.

This post is not necessarily about Golang, but Golang is a good example as its standard library is probably the best cryptographic standard library of any programming language.

At dotGo 2019, Michael McLoughlin presented his Avo tool. In the talk he mentions that there are 24,962 lines of x86 assembly in Golang's standard library, and that most of them are in the crypto package. A very "awkward" place where "we need very high performance, and absolute correctness". He then shows several examples that he describes as "write-once code".

The talk is really interesting and I recommend checking it out.

I personally spent days trying to understand Golang's SHA-3 assembly implementation. I even created a Go Assembly by Example page to help me in this journey. And I ended up giving up. I just couldn't understand how it worked, the thing didn't make sense. Someone had written it with their own mental model of how they wanted to pass data around. It was horrible.

It's not just a Golang problem. Look at OpenSSL, for example, which most cryptographic applications and libraries rely on. It contains a huge amount of assembly code to implement cryptography, and that assembly code is sometimes generated by unintelligible Perl code.

There are many more good examples out there. The BearSSL TLS implementation by Thomas Pornin, the libsodium cryptographic library by Frank Denis, and the eXtended Keccak Code Package by the Keccak team all use assembly code to produce fast cryptography.

We're making such a fuss about readable, auditable, simple and clear cryptographic implementations, but most of that has been thrown out of the window in the quest for performance.

The real problem, from a reviewer's perspective, is that assembly takes us much further away from the specification. As the role of a reviewer is to match the implementation to the specification, it makes the job hard, perhaps impossible.
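To make the "match the implementation to the specification" point concrete, here is a small sketch of what a reviewable high-level implementation can look like. The post does not discuss ChaCha20; I am picking its quarter round (RFC 8439, section 2.1) only because it is short enough to fit here. The function name `quarterRound` is my own, and this is an illustration rather than code from any of the libraries mentioned above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// quarterRound mirrors the ChaCha quarter round as written in RFC 8439,
// section 2.1. Each group of three statements corresponds to one line
// of the specification, which is what makes it easy to review.
func quarterRound(a, b, c, d uint32) (uint32, uint32, uint32, uint32) {
	// a += b; d ^= a; d <<<= 16;
	a += b
	d ^= a
	d = bits.RotateLeft32(d, 16)

	// c += d; b ^= c; b <<<= 12;
	c += d
	b ^= c
	b = bits.RotateLeft32(b, 12)

	// a += b; d ^= a; d <<<= 8;
	a += b
	d ^= a
	d = bits.RotateLeft32(d, 8)

	// c += d; b ^= c; b <<<= 7;
	c += d
	b ^= c
	b = bits.RotateLeft32(b, 7)

	return a, b, c, d
}

func main() {
	// Input values taken from the quarter-round test vector in
	// RFC 8439, section 2.1.1.
	a, b, c, d := quarterRound(0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567)
	fmt.Printf("%08x %08x %08x %08x\n", a, b, c, d)
	// prints: ea2a92f4 cb1cf8ce 4581472e 5881c4bb
}
```

A reviewer can check this against the RFC line by line. An assembly version of the same round, typically vectorised and interleaved across several blocks, no longer has that one-to-one correspondence.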

Food for thought...

Comments

Name

There's also the problem that there are additional properties a reviewer might want to verify about the code (is it constant time, is it guaranteed to clear caches, etc.) that are much harder to prove in a high-level description due to clever compiler optimisations or (admittedly much rarer) compiler bugs. Writing directly in assembly comes much closer to guaranteeing that these properties will not change when the program is compiled (though it is not foolproof, depending on how clever your toolchain decides to be). While compiler optimisations can be mitigated or temporarily disabled in some compilers with some flags, the guarantee then depends on the language not changing its semantics. While languages probably won't change in ways that break cryptographic properties, assembly code almost certainly won't. But perhaps this small risk is worth it for more understandable cryptographic code (if the compiler plays fair).
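As a rough illustration of the constant-time property mentioned above, the sketch below contrasts an early-exit byte comparison with `subtle.ConstantTimeCompare` from Go's standard library. The helper names `leakyEqual` and `constantTimeEqual` are hypothetical. The source code only expresses the intent: whether the compiled machine code really runs in time independent of the data is exactly what would still need checking.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// leakyEqual returns as soon as it finds a mismatching byte, so its
// running time depends on how many leading bytes of a and b agree.
func leakyEqual(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// constantTimeEqual delegates to crypto/subtle, which is documented to
// take time that depends on the slice lengths but not on their contents.
func constantTimeEqual(a, b []byte) bool {
	return subtle.ConstantTimeCompare(a, b) == 1
}

func main() {
	tag := []byte("expected-mac-tag")
	guess := []byte("expected-mac-tXX")

	// Both functions return the same answers; they differ only in
	// how the time they take relates to the secret data.
	fmt.Println(leakyEqual(tag, guess), constantTimeEqual(tag, guess))
	fmt.Println(leakyEqual(tag, tag), constantTimeEqual(tag, tag))
}
```

Both versions are functionally identical, which is why a reviewer reading only for correctness could miss the difference.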

I guess this makes for a good argument for having hand-optimised assembly paired with computerised (potentially computer-generated) proofs of equivalence to some sort of specification, such as the same algorithm written in C, Go, etc.
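A much weaker but cheap cousin of a proof of equivalence is differential testing: run the optimised code and a readable reference on the same random inputs and compare the outputs. The sketch below is only an illustration under assumptions of my own. `xorRef`, `xorFast` and `agree` are hypothetical names, and `xorFast` is a pure-Go stand-in for what would really be an assembly routine. Passing such a test is evidence, not a proof.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// xorRef plays the role of the specification: dst[i] = a[i] ^ b[i],
// one byte at a time. a and b must be at least as long as dst.
func xorRef(dst, a, b []byte) {
	for i := range dst {
		dst[i] = a[i] ^ b[i]
	}
}

// xorFast stands in for the optimised implementation: it handles
// 8 bytes per step, then finishes the tail byte by byte.
func xorFast(dst, a, b []byte) {
	i := 0
	for ; i+8 <= len(dst); i += 8 {
		x := binary.LittleEndian.Uint64(a[i:])
		y := binary.LittleEndian.Uint64(b[i:])
		binary.LittleEndian.PutUint64(dst[i:], x^y)
	}
	for ; i < len(dst); i++ {
		dst[i] = a[i] ^ b[i]
	}
}

// agree runs both implementations on random inputs of length 0..maxLen
// and reports whether they produced identical output every time.
func agree(trials, maxLen int, seed int64) bool {
	rng := rand.New(rand.NewSource(seed))
	for t := 0; t < trials; t++ {
		n := rng.Intn(maxLen + 1)
		a := make([]byte, n)
		b := make([]byte, n)
		rng.Read(a)
		rng.Read(b)

		want := make([]byte, n)
		got := make([]byte, n)
		xorRef(want, a, b)
		xorFast(got, a, b)
		if !bytes.Equal(want, got) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println("implementations agree:", agree(1000, 64, 1))
}
```

Random testing like this only covers the inputs it happens to try, so it complements rather than replaces the machine-checked equivalence proofs suggested above.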
