david wong

Hey! I'm David, cofounder of zkSecurity and the author of the Real-World Cryptography book. I was previously a crypto architect at O(1) Labs (working on the Mina cryptocurrency), before that I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

Simple introduction to monads in OCaml posted November 2022

I'm not a language theorist, and even less of a functional programming person. I was very confused trying to understand monads, as they are often explained in complicated terms and also mostly using haskell syntax (which I'm not interested in learning). So if you're somewhat like me, this might be the good short post to introduce you to monads in OCaml.

I was surprised to learn that monads are really just two functions: return and bind:

module type Monads = sig
    type 'a t
    val return : 'a -> 'a t
    val bind : 'a t -> ('a -> 'b t) -> 'b t

Before I explain exactly what these does, let's look at something that's most likely to be familiar: the option type.

An option is a variant (or enum) that either represents nothing, or something. It's convenient to avoid using real values to represent emptiness, which is often at the source of many bugs. For example, in languages like C, where 0 is often used to terminate strings, or Golang, where a nil pointer is often used to represent nothing.

An option has the following signature in OCaml:

type 'a option = None | Some of 'a

Sometimes, we want to chain operations on an option. That is, we want to operate on the value it might contain, or do nothing if it doesn't contain a value. For example:

let x = Some 5 in
let y = None
match x with
  | None -> None
  | Some v1 -> (match y with
    | None -> None
    | Some v2 -> v1 + v2)

the following returns nothing if one of x or y is None, and something if both values are set.

Writing these nested match statements can be really tedious, especially the more there are, so there's a bind function in OCaml to simplify this:

let x = Some 5 in
let y = None in
Option.bind x (fun v1 -> 
  Option.bind y (fun v2 ->
    Some (v1 + v2)

This is debatably less tedious.

This is where two things happened in OCaml if I understand correctly:

Let's explain the syntax introduced by OCaml first.

To do this, I'll define our monad by extending the OCaml Option type:

module Monad = struct
  type 'a t = 'a option

  let return x = Some x
  let bind = Option.bind
  let ( let* ) = bind

The syntax introduced by Ocaml is the let* which we can define ourselves. (There's also let+ if we want to use that.)

We can now rewrite the previous example with it:

open Monad

let print_res res =
  match res with
  | None -> print_endline "none"
  | Some x -> Format.printf "some: %d\n" x

let () =
  (* example chaining Option.bind, similar to the previous example *)
  let res : int t =
    bind (Some 5) (fun a -> bind (Some 6) (fun b -> Some (a + b)))
  print_res res;

  (* same example but using the let* syntax now *)
  let res =
    let* a = Some 5 in
    let* b = Some 6 in
    Some (a + b)
  print_res res

Or I guess you can use the return keyword to write something a bit more idiomatic:

  let res =
    let* a = Some 5 in
    let* b = Some 6 in
    return (a + b)
  print_res res

Even though this is much cleaner, the new syntax should melt your brain if you don't understand exactly what it is doing underneath the surface.

But essentially, this is what monads are. A container (e.g. option) and a bind function to chain operations on that container.

Bonus: this is how the jane street's ppx_let syntax works (only defined on result, a similar type to option, in their library):

open Base
open Result.Let_syntax

let print_res res =
  match res with
  | Error e -> Stdio.printf "error: %s\n" e
  | Ok x -> Stdio.printf "ok: %d\n" x

let () =
  (* example chaining Result.bind *)
  let res =
    Result.bind (Ok 5) ~f:(fun a ->
        Result.bind (Error "lol") ~f:(fun b -> Ok (a + b)))
  print_res res;

  (* same example using the ppx_let syntax *)
  let res =
    let%bind a = Ok 5 in
    let%bind b = Error "lol" in
    Ok (a + b)
  print_res res

You will need to following dune file if you want to run it (assuming your file is called monads.ml):

 (name monads)
 (modules monads)
 (libraries base stdio)
  (pps ppx_let)))

And run it with dune exec ./monads.exe

comment on this story

The intuition behind the sum-check protocol in 5 minutes posted November 2022

The sum check protocol allows a prover to convince a verifier that the sum of a multivariate polynomial is equal to some known value. It’s an interactive protocol used in a lot of zero-knowledge protocols. I assume that you know the Schwartz Zippel lemma in this video, and only give you intuitions about how the protocol works.

comment on this story

What's the deal with zkapps? posted October 2022

Vitalik recently mentioned zkapps at ETHMexico. But what are these zkapps? By the end of this post you will know what they are, and how they are going to change the technology landscape as we know it.

Zkapps, or zero-knowledge applications, are the modern and secure solution we found to allow someone else to compute arbitrary programs, while allowing us to trust the result. And all of that thanks to a recently rediscovered cryptographic construction called general-purpose zero-knowledge proofs. With it, no need to trust the hardware to behave correctly, especially if you're not the one running it (cough cough intel SGX).

Today, we're seeing zero-knowledge proofs impacting cryptocurrencies (which as a whole have been a petri dish for cryptographic innovation), but tomorrow I argue that most applications (not just cryptocurrencies) will be directly or indirectly impacted by zero-knowledge technology.

Because I've spent so much time with cryptocurrencies in recent years, auditing blockchains like Zcash and Ethereum at NCC Group, and working on projects like Libra/Diem at Facebook, I'm mostly going to focus on what's happening in the blockchain world in this post. If you want a bigger introduction to all of these concepts, check my book Real-World Cryptography.

The origin of the story starts with the ancient search for solutions to the problem of verifiable computation; being able to verify that the result of a computation is correct. In other words, that whoever run the program is not lying to us about the result.

Most of the solutions, until today, were based on hardware. Hardware chips were first invented to be "tamper resistant" and "hard to analyze". Chips capable of performing simple cryptographic operations like signing or encryption. You would typically find them in sim cards, TV boxes, and in credit cards. While all of these are being phased out, they are being replaced by equivalent chips called "secure enclaves" that can be found in your phone. On the enterprise side, more recently technologies were introduced to provide programmability. Chips capable of running arbitrary programs, while providing (signed) attestation that the programs were run correctly. These chips would typically be certified by some vendor (for example, Intel SGX) with some claim that it’s hard to tamper with them. Unfortunately for Intel and others, the security community has found a lot of interest in publishing attacks on their "secure" hardware, and we see new hacks coming up pretty much every year. It's a game of cat and mouse.

Cryptocurrencies is just another field that's been dying to find a solution to this verifiable computation problem. The previous hardware solutions I’ve talked about can be found in oracles like town crier, in bridges like the Ethereum-Avalanche bridge, or even at the core of cryptocurrencies like MobileCoin.

Needless to say, I'm not a fan, but I'll be the first to conceive that in some scenarios you just don't have a choice. And being expensive enough for attackers to break is a legitimate solution. I like to be able to pay with my smartphone.

But in recent years, an old cryptographic primitive that can solve our verifiable computation problem for real has made a huge comeback. Yes you know which one I'm talking about: general-purpose zero-knowledge proofs (ZKPs).

With it, there is no need to trust the hardware: whoever runs the program can simply create a cryptographic proof to convince you that the result is correct.

ZKPs have been used to solve ALL kind of problems in cryptocurrency:

  • "I wish we could process many more transactions" -> simply let someone else run a program that verifies all the transactions and outputs a small list of changes to be made to the blockchain (and a proof that the output is correct). This is what zk rollups do.
  • "I wish we could mask the sender, recipient, and the amount being transacted" -> just encrypt your transaction! And use a zero-knowledge proof to prove that what's encrypted is correct. This is what ZCash has done (and Monero, to some extent).
  • "It takes ages for me to download the whole Bitcoin ledger..." -> simply have someone else do it for you, and give you the resulting latest state (with a proof that it's correct). That's what Mina does.
  • "Everybody using cryptocurrency is just blindly trusting some public server (e.g. Infura) instead of running their own nodes" -> use ZKP to make light clients verifiable! This is what Celo does with Plumo.
  • "Why can't I easily transfer a token from one blockchain to another one?" -> use these verifiable light clients. This is what zkBridge proposes.

There's many more, but I want to focus on zkapps (remember?) in this post. Zkapps are a new way to implement smart contracts. Smart contracts were first pioneered by Ethereum, to allow user programs to run on the blockchain itself.

To explain smart contracts, I like the analogy of a single supercomputer floating in the sky above us. We're all using the same computer, the one floating in the sky. We all can install our programs on the floating computer, and everyone can execute functions of these programs (which might mutate the state of the program).

The solution found by Ethereum at the time was to implement the concept naively and without using cryptography:

  • Users can install a program by placing the program's code in a transaction.
  • Users can execute functions of a program by writing in a transaction the function they want to execute and with what arguments.
  • Everyone running a node has to run the functions found in users transactions. All of them. In order to get the result (e.g. move X tokens to wallet Y, update the state of the smart contract, etc.)

The last point is the biggest limitation of Ethereum. We can't have the user provide the result of executing a function, or anyone else really, because we can't trust them. And so, not only does this mean that everyone is always re executing the same stuff (which is redundant, and slows down the network), but this also means that everything in a smart contract must be public (as everyone must be able to run the function). There can be no secrets used. There can be no asynchronous calls or interaction outside of the network while this happens.

This is where zkapps enters the room. Zkapps allow users to run the programs themselves and give everyone else the result (along with a proof).

This not only solves the problem of having everyone re-execute the same smart contract calls constantly, but it also opens up new applications as computations can be non-deterministic: they can use randomness, they can use secrets, they can use asynchronous calls, etc.

More than that, the state of a zkapp can now mostly live off-chain, like real applications before Ethereum used to do. Reducing the size of the entire blockchain (Today, Ethereum is almost 1 terabyte!). These applications are not limited by the speed of the blockchain, or by the capabilities of the language exposed by the blockchain anymore.

Perhaps, it would be more correct to describe them as mini-blockchains of their own, that can be run as centralized or decentralized applications, similar to L2s or Cosmos zones.

OK. So far so good, but do these zkapps really exist or is it just talk? Well, not yet. But a few days ago, the Mina cryptocurrency released their implementations of zkapps on a testnet. And if the testnet goes well, there is no reason to believe this won't unlock a gigantic number of applications we haven't seen before on blockchains.

You can read the hello world tutorial and deploy your first zkapp in like 5 minutes (I kid you not). So I highly recommend you to try it. This is the future :)

1 comment

OCaml wishlist posted September 2022

I've been writing (although mostly reading) OCaml on-and-off this last year. It's been quite a painful experience, even though I had some experience with functional languages already (erlang, which I really liked).

I find the language and the experience very close to C in many ways, while at the same time boasting a state of the art type system. It's weird. I think there's a real emphasis on the expressiveness, but little on the engineering. Perhaps this is due to the language not having enough traction in the industry.

So about this, I have two things I'd like to say. The first, is that if you're looking for a somewhat low-level (there's a garbage collector) language you can make a real dent in, OCaml might be the one. It's pretty bare bone, not that many libraries exist, and if they do they are barely usable due to a lack of documentation. My first contribution was a library to encode and decode hexadecimal strings, because I couldn't find one that I could use. That should tell you something.

My second contribution was a tool to build "by example" websites. I used it to make a website to learn OCaml by examples, and another one to learn Nix by example. How cool would it be if people started using it to build a number of "by examples" websites in the OCaml ecosystem :D?

Anyway, I digress, the second thing I wanted to say is: if you're working on OCaml (or want to contribute to a new language), here's my wishlist:

  • a tool like cargo to manage dependencies (& versions), start projects, run tests, etc. Two important things: it should use a real configuration language (e.g. toml, json, yml) and it should work in a convention over configuration model.
  • better integration with vscode. Every time I write or read OCaml I find myself missing rust-analyzer and its integration with vscode. I just want to be able to go to definitions, even in the presence of functors, and easily find the types of things (and see their implementations).
  • being able to run a single test. It is crazy to me that today, you still can't write an inline test and run it. It's the best way to debug or test something.
  • better compiler error messages. I think the lack of a tool like cargo, and this, are the biggest impediment to the language. See this issue for an example.
  • better default for ocamlformat. OCaml is hard to read, some of the reasons are hard to change, but the formatting can be fixed and it really needs some work.
  • a linter like clippy. It's 2022, every project should be able to run an OCaml linter in CI.
  • good documentation for stdlib and 3rd party libraries. Documentation is really subpar in OCaml.
  • a use keyword to import specific values in scope (as opposed to "opening" a whole module in scope)

PS: would someone actually be interested to work on any of these for a grant? There's a number of new-ish companies in the OCaml space that would probably pay for someone to solve these. I guess reach out to me on the contact page if you're interested.


noname developer update #4: showcasing method calls posted September 2022

This is part 4 of a series on noname. See part 1, part 2, and part 3.

I just implemented method calls in noname, and I made a video showcasing it and checking if the implementation is correct with some simple examples.

Don't forget, if you want to play with it check it out here: https://github.com/mimoo/noname and if you have any questions or are running into something weird please leave a comment on the Github repo!

comment on this story

noname developer update #3: user-defined functions posted September 2022

I guess this is part 3. With part 1 introducing noname, a programming language inspired by rust that makes use zero-knowledge proofs to provide verifiable computation, part 2 going over a simple arithmetic example and part 3 going over custom types.

In this update, I showcase a new feature: functions, and go through the debug compilation of an example program to see if the implementation is sound (that it constrains what it is supposed to constrain). In this video I do something more: I optimize the implementation of assert_eq so that you can see a bit of the compiler internals. I also end the video abruptly, thinking I found a bug in the implementation. If you were an attentive student, you would have figured out that there was no bugs: doing things on constants does not create any gates.

You can play with the noname language here! And you can read the noname book as well.

comment on this story

noname developer update #2: structs are working! posted September 2022

noname is the DSL I'm working on to write rust-like programs and prove their executions using zero-knowledge proofs.

The previous post used noname as an education tool to explain how programs get compiled to gates and constraints.

In this post, I showcase a new feature: custom gates, and go through the debug compilation of an example program to see if the implementation is sound (that it constrains what it is supposed to constrain).

Don't forget, you can play with the noname language here! And you can read the noname book as well.

comment on this story