david wong

Hey! I'm David, cofounder of zkSecurity and the author of the Real-World Cryptography book. I was previously a crypto architect at O(1) Labs (working on the Mina cryptocurrency), before that I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

What are zkVMs? And what's the difference with a zkEVM? posted July 2022

I've been talking about zero-knowledge proofs (the "zk" part) for a while now, so I'm going to skip that and assume that you already know what "zk" is about.

The first thing we need to define is: what's a virtual machine (VM)? In brief, it's a program that can run other programs, usually implemented as a loop that executes a number of given instructions (the other program). Simplified, a VM could look like this:

let mut stack = Stack::new();
loop {
    let inst = get_next_instruction();
    apply_instruction(inst, &mut stack);
}

The Ethereum VM is the VM that runs Ethereum smart contracts. The original list of instructions the EVM supports, and the behavior of these instructions, was specified in 2014 in the seminal yellow paper (that thing is literally yellow) by Gavin Wood. The paper seems to be continuously updated so it should be representative of the current instructions supported.

A zkVM is simply a VM implemented as a circuit for a zero-knowledge proof (zkp) system. So instead of proving the execution of a specific program, as one would normally do in zkp systems, you prove the execution of a VM (which itself runs a program). As such, some people say that non-VM zkps follow the FPGA approach, while zkVMs follow the CPU approach.

Since programs (or circuits) in zkp systems are fixed forever (like a binary compiled from some program really), our VM circuit actually implements a fixed number of iterations of the loop (you can think of that as unrolling the loop). In our previous example, it would look like this if we wanted to support programs of 3 instructions tops:

let mut stack = Stack::new();
let inst = get_next_instruction();
apply_instruction(inst, &mut stack);
let inst = get_next_instruction();
apply_instruction(inst, &mut stack);
let inst = get_next_instruction();
apply_instruction(inst, &mut stack);

EDIT: Bobbin Threadbare pointed out to me that with STARKs, since there is no preprocessing (like in plonk), there's no strict limit on the number of iterations.

This is essentially what a zkVM is: a zk circuit that runs a VM. The actual program's instructions can be passed as public input to that circuit so that everyone can see what program is really being proven. (There are a number of other ways to pass the program's instructions to the VM if you're interested.)
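
To make this more concrete, here's a toy sketch of what such a stack-based VM could look like (this is made up for illustration; it's not the instruction set or the code of any of the projects listed below). In a zkVM, the equivalent of apply_instruction is what gets encoded as circuit constraints, and program is what gets passed as public input:

// a toy stack-based VM, made up for illustration
#[derive(Clone, Copy)]
enum Instruction {
    Push(u64),
    Add,
    Mul,
}

fn apply_instruction(inst: Instruction, stack: &mut Vec<u64>) {
    match inst {
        Instruction::Push(x) => stack.push(x),
        Instruction::Add => {
            let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
            stack.push(a.wrapping_add(b));
        }
        Instruction::Mul => {
            let (b, a) = (stack.pop().unwrap(), stack.pop().unwrap());
            stack.push(a.wrapping_mul(b));
        }
    }
}

fn main() {
    // the program is data, not code: this is the part that can be a public input
    let program = [Instruction::Push(2), Instruction::Push(3), Instruction::Add];
    let mut stack = Vec::new();
    for &inst in program.iter() {
        apply_instruction(inst, &mut stack);
    }
    assert_eq!(stack, vec![5]);
}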

There are a number of zkVMs out there; I know of at least three important projects:

  • Cairo. This is used by Starknet and I highly recommend reading the paper which is a work of art! We also have an experimental implementation in kimchi we call turshi.
  • Miden. This is a work-in-progress project from Polygon.
  • Risczero. This is a work-in-progress project that aims at supporting the set of RISC-V instructions (a popular standard outside of the blockchain world).

All of them support a different instruction set, so they are not compatible with one another.

A zkEVM, on the other hand, aims at supporting the Ethereum VM. It seems like there is a lot of debate going on about the actual meaning of "supporting the EVM", as three zkEVMs were announced this week.

So one can perhaps divide the field of zk VMs into two types:

  • zk-optimized VMs. Think Cairo or Miden. These types of VMs tend to be much faster as they are designed specifically to make it easier for zkp systems to support them.
  • real-world VMs. Think RiscZero supporting the RISC-V instruction set, or the different zkEVMs supporting the Ethereum VM.

And finally, why would anyone try to support the EVM? If I had to guess, I would come up with two reasons: if you're Ethereum it could allow you to create proofs of the entire state transition from genesis to the latest state of Ethereum (which is what Mina does), and if you're not Ethereum it allows your project to quickly copy/paste projects from the Ethereum ecosystem (e.g. uniswap) and steal developers from Ethereum.


Panel: how ZKPs and other advanced cryptographic schemes can be deployed to solve hard problems in decentralized systems posted July 2022

I spoke at a panel at Privacy Evolution with Aleo and Anoma, which coincidentally was my first time ever speaking at a panel :D It was actually a lot of fun and I'm thinking that I should do this again. The topic was "how ZKPs and other advanced cryptographic schemes can be deployed to solve hard problems in decentralized systems".

You can watch the whole thing, or skip directly to the panel I was in at 4:10.

Privacy Evolution with Aleo and Anoma from Alex Miles on Vimeo.


Smart contracts with private keys posted July 2022

In most blockchains, smart contracts cannot hold a private key. The reason is that everyone on the network needs to be able to run the logic of any smart contract (to update the state when they see a transaction). This means that, for example, you cannot implement a smart contract on Ethereum that will magically sign a transaction for Bitcoin if you execute one of its functions.

In zero-knowledge smart contracts, like zkapps, the situation is a bit different in that someone can hold a smart contract's private key and run the logic associated with the private key locally (for example, signing a Bitcoin transaction). Thanks to the zero-knowledge proof (ZKP), anyone can attest that the private key was used in a correct way according to the contract, and everyone can update the state without knowing about the private key.

Only that one person can use the key, but you can encode your contract logic so that they can respond to requests from other users. For example:

  • some user calls smart_contract.request(): "Hey, can you sign this Bitcoin transaction?"
  • the key holder calls smart_contract.response(): "here's the signature"

The elephant in the room is that you need to trust the key holder not to leak the key or use it themselves (for example, to steal all the Bitcoin funds it protects).

The first step to a solution is to use cryptographic protocols called multi-party computations (MPCs) (see chapter 15 of Real-World Cryptography). MPCs allow you to split a private key between many participants, thereby decentralizing the usage of the private key. Thanks to protocols like decentralized key generation (DKG), the private key is never fully present anywhere, and as long as enough of the participants act honestly (and don't collude) the protocol is fully secure. This is what Axelar, for example, implements to allow different blockchains to interoperate.
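
To get an intuition for why the key never needs to be reassembled, here is a toy illustration (made up for this post; it is not a real DKG, MPC protocol, or signature scheme): an additive 3-out-of-3 sharing of a key over Z_p. Because the function computed here (key * message mod p) is linear, each participant can compute a partial result from its own share, and the partial results add up to the full result, without the key ever existing in one place:

// a prime small enough that the arithmetic below never overflows a u64
const P: u64 = 2_147_483_647;

fn main() {
    // in a real protocol these shares would come out of a DKG; here we just pick them
    let shares: [u64; 3] = [1_234_567, 89_101_112, 1_314_151_617];
    // the "full key" is only computed here to check the result; no party ever holds it
    let key = shares.iter().fold(0, |acc, s| (acc + s) % P);

    let message: u64 = 42;

    // each participant computes a partial result from its own share
    let partials: Vec<u64> = shares.iter().map(|s| s * message % P).collect();

    // anyone can combine the partial results...
    let combined = partials.iter().fold(0, |acc, t| (acc + t) % P);

    // ...and it matches what the full key would have produced
    assert_eq!(combined, key * message % P);
    println!("combined result: {combined}");
}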

This solution is limited, in that it requires a different protocol for every different usage you might have. Signing is one thing, but a private key can also be used to decrypt messages, encrypt to channels, generate random numbers, and much more! A potential solution here is to mix MPC with zero-knowledge proofs. This way, you can essentially run any program in a correct way (the ZKP part) where different parts of the program might come from different people (the MPC part).

A recent paper (2021) presented a solution to do just that: Experimenting with Collaborative zk-SNARKs: Zero-Knowledge Proofs for Distributed Secrets. As far as I know, nobody has implemented such a concept onchain, but I predict that this will be one of the next big things to unlock for programmable blockchains in general.


The Web PKI 2.0 posted June 2022

The Web Public Key Infrastructure (PKI) is what's behind the green lock in your browser's URL bar. Actually, as I'm writing this, I realize that it's not even green anymore.

Now, instead of having a green lock that stands out, you get a "Not Secure" warning that stands out if you visit a non-HTTPS (or plain HTTP) website.

In this post I will briefly explain what this lock means, what the foundations for the security of the web are, and how blockchain technology can turn this into a better world.

The web PKI, in brief

When you talk to a website like Google, under the hood, your browser uses a protocol called TLS to secure the connection (and display the lock icon).

Part of that protocol is simply about making use of good old cryptography. Your browser performs a key exchange with Google's public key, and then encrypts the connection with some authenticated encryption algorithm. (If these words don't mean much to you, I introduce these concepts as well as TLS in my book Real-World Cryptography.)

The other part of the protocol is about identities. As in: "how can I trust that this secure connection I just established is really with Google?" The only way we've found to create trust between people's browsers and random websites on the web is to have a number of organizations manage these identities. These organizations are called "Certificate Authorities", and they are in charge of verifying the owner of a domain (for example, google.com) before signing their public keys. The artifact they produce is usually referred to as a certificate.

Your browser trusts a number of these Certificate Authorities by default. They are hardcoded in the software you downloaded. When you connect to google.com, not only do you create a secure connection with some public key, but you also verify that this public key is signed by one of these Certificate Authorities that you trust.

Without this, you'd have to have all the public keys of all the websites on the internet baked in your browser. Not very practical.
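
To make the trust decision concrete, here is a deliberately tiny model of what a browser does (everything here is a toy: the "RSA" key is the classic textbook example with n = 3233, the "hash" is a sum of bytes, and real certificates are X.509 documents carrying much more, like chains, expiry dates, and revocation information):

use std::collections::HashMap;

struct Certificate {
    domain: String,
    public_key: u64, // the website's key (opaque in this toy)
    issuer: String,  // which CA vouches for this certificate
    signature: u64,  // the CA's signature over (domain, public_key)
}

// toy hash of the certificate contents
fn toy_hash(domain: &str, public_key: u64) -> u64 {
    (domain.bytes().map(u64::from).sum::<u64>() + public_key) % 3233
}

// modular exponentiation, used for the toy textbook-RSA signature
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 { acc = acc * base % modulus; }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

fn main() {
    // the CA signs with its private exponent d; the browser only ever holds (n, e)
    let (n, e, d) = (3233u64, 17u64, 2753u64);

    // hardcoded in the browser: CA name -> CA public key
    let mut trust_store = HashMap::new();
    trust_store.insert("SomeCA".to_string(), (n, e));

    let cert = Certificate {
        domain: "google.com".to_string(),
        public_key: 42,
        issuer: "SomeCA".to_string(),
        signature: pow_mod(toy_hash("google.com", 42), d, n), // done by the CA, once
    };

    // the browser's trust decision: known issuer + valid signature over the cert contents
    let trusted = match trust_store.get(&cert.issuer) {
        Some(&(ca_n, ca_e)) => {
            pow_mod(cert.signature, ca_e, ca_n) == toy_hash(&cert.domain, cert.public_key)
        }
        None => false,
    };
    println!("certificate trusted: {trusted}");
}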

This system, called the web PKI, is chaotic. Your browser ends up trusting hundreds of these authorities, and they sometimes misbehave (and sign certificates they should not).

When a Certificate Authority misbehaves, you sometimes have to revoke a number of the certificates it has signed. In other words, you need a way to tell browsers (or other types of clients) that the certificate they're seeing is no longer valid. This is another can of worms, as there is no list of all the current valid certificates that exists anywhere. You'd have to check with the Certificate Authority itself if it has revoked the certificate (and if the Certificate Authority itself has been banned... you'll need to update your browser).

Detecting attacks, Certificate Transparency to the rescue

To manage this insanity, Certificate Transparency (CT) was launched: an append-only log of certificates that relies on users (e.g. browsers) reporting what they see and gossiping with one another to make sure they all see the same thing. Websites (like google.com) can use these logs to detect fraudulent certificates that were signed for their domains.

While Certificate Transparency has had some success, there are fundamental problems with it:

  • it relies on clients (for example, browsers) to do the right thing and report what they see
  • it is useful only to those who use it to monitor their domain (paranoids and large organizations who can afford security teams)
  • it can only detect attacks, not prevent them

With the advance of blockchain-related technologies, we can take a different look at Certificate Transparency and notice that while it is very close to what a blockchain fundamentally is, it does not rely on a consensus protocol to ensure that everyone sees the same state.

Preventing attacks, blockchain to the rescue

If you think about it, a blockchain (touted as "a solution in search of a problem" by some technologists) solves exactly our scenario: it allows a set of organizations to police one another in order to maintain some updatable state. Here, the state is simply the association between websites and their public keys.

In this scenario, everyone sees the same state (there's consensus), and clients can simply own their state without going through a middle man (by being the first to register some domain name, for example).

There are many technical details to what I claim could be the web PKI 2.0. A few years back, someone could have simply retorted: "it'll be too slow, and energy inefficient, and browsers shouldn't have to synchronize to a blockchain".

But today, the latest consensus systems like Bullshark and consensus-less systems like FastPay are not only green, but boast 125,000 and near-infinite transactions per second, respectively.

Not only that, but zero-knowledge proofs, as used in cryptocurrencies like Mina, allow someone to receive a small proof (on the order of a few hundred bytes to a few hundred MB, depending on the zero-knowledge proof system) of the latest state of the blockchain. This would allow a browser to simply make a query to the system and obtain a short cryptographic proof that the public key they're seeing is indeed the one of google.com in the latest state.

Again, there are many details to such an implementation (how do you incentivize the set of participants to maintain such a system, who would be the participants, how do you prevent squatting, how do you prevent spam, etc.), but it'd be interesting to see if such a proof of concept can be realized in the next 5 years. Even more interesting: would such a system benefit from running on a cryptocurrency or would the alternative (in cryptocurrency lingo: a permissioned network based on a proof of authority) be fine?


ZK FAQ: What's a trusted setup? What's a Structured Reference String? What's toxic waste? posted June 2022

In proof systems, provers and verifiers rely on a common set of parameters, sometimes referred to as the common reference string (CRS).

In some proof systems (for example, the ones that rely on pairings) a dangerous setup phase produces these common parameters. Dangerous because it generates random values, encrypts them, and then must get rid of the random values so that no one can ever find out about them. The reason is that knowing these values would allow anyone to forge invalid proofs. Invalid proofs that verifiers would accept. Such values are sometimes referred to as toxic waste, and because the individuals performing the setup have to behave honestly, we call the setup a trusted setup.

By the way, since this common set of parameters has some hidden structure to it, it is usually referred to as a structured reference string (SRS).
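
To make that "hidden structure" concrete, here is the typical shape of a KZG-style SRS (details vary from one proof system to another, so take this as an example rather than the definition):

$$ \text{SRS} = ([1]_1, [\tau]_1, [\tau^2]_1, \ldots, [\tau^{n-1}]_1, [\tau]_2) $$

where $\tau$ is the secret random value produced during the setup, and $[x]_i$ denotes $x$ times the generator of the group $\mathbb{G}_i$. Anyone can use these points to commit to polynomials of degree up to $n-1$, but learning $\tau$ itself is exactly the toxic waste scenario described above.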

In the past, ceremonies (called powers of tau ceremonies) have been conducted where multiple participants collaborate to produce the SRS. Using cryptographic constructions called multi-party computations (MPC), the protocol is secure as long as at least one of the participants behaves honestly (and destroys the random values they generated as part of the ceremony).

It seems to be accepted in the community that such ceremonies are a pain to run. When mistakes happen, new ceremonies have to take place, which is what infamously happened to Zcash.


What's two-adicity? posted May 2022

Some elliptic curves (related to zero-knowledge proofs I believe) have been claiming high 2-adicity. But for some reason, it seems a bit hard to find a definition of what this term means. And oddly, it's not a complicated thing to explain. So here's a short note about it.

You can see this being mentioned for example by the pasta curves:

They have the same 2-adicity, 32, unlike the Tweedle curves that had 2-adicity of 33 and 34. This simplifies implementations and may assist in square root performance (used for point decompression and internally to Halo 2) due to a new algorithm recently discovered; 32 is more convenient for this algorithm.

Looking at the definition of one of its fields in Rust, you can see that it is defined specifically for a trait related to FFTs:

impl FftParameters for FqParameters {
    type BigInt = BigInteger;

    const TWO_ADICITY: u32 = 32;

    #[rustfmt::skip]
    const TWO_ADIC_ROOT_OF_UNITY: BigInteger = BigInteger([
        0x218077428c9942de, 0xcc49578921b60494, 0xac2e5d27b2efbee2, 0xb79fa897f2db056
    ]);

    // (rest of the impl omitted)
}

so what's that? Well, simply put, a two-adicity of 32 means that there's a multiplicative subgroup of size $2^{32}$ that exists in the field. And the code above also defines a generator $g$ for it, such that $g^{2^{32}} = 1$ and $g^i \neq 1$ for $i \in [[1, 2^{32}-1]]$ (so it's a primitive $2^{32}$-th root of unity).

Lagrange's theorem tells us that the order of a subgroup always divides the order of the group, and since the multiplicative group of a finite field is cyclic, the converse holds as well: there is a subgroup for every order dividing the group order. So in our case, we have subgroups of every power-of-2 order, up to $2^{32}$.

Finding a generator for any of these subgroups is pretty straightforward as well (a toy numerical example follows the list). Notice that:

  • let $h = g^2$, then $h^{2^{31}} = g^{2^{32}} = 1$ and so $h$ generates a subgroup of order $2^{31}$
  • let $h = g^{2^2}$, then $h^{2^{30}} = g^{2^{32}} = 1$ and so $h$ generates a subgroup of order $2^{30}$
  • and so on...
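
Here's the toy numerical example promised above (the field $\mathbb{F}_{97}$ is picked purely for illustration and has nothing to do with the pasta curves): $97 - 1 = 96 = 2^5 \cdot 3$, so the two-adicity is 5, and $28 = 5^3 \mod 97$ is a primitive $2^5$-th root of unity (5 generates the full multiplicative group). Squaring it repeatedly walks down the subgroups:

fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 { acc = acc * base % modulus; }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

fn main() {
    const P: u64 = 97; // 97 - 1 = 96 = 2^5 * 3, so the two-adicity of F_97 is 5
    const TWO_ADICITY: u32 = 5;
    const TWO_ADIC_ROOT_OF_UNITY: u64 = 28; // generator of the subgroup of size 2^5

    // squaring the generator halves the order of the subgroup it generates
    let mut omega = TWO_ADIC_ROOT_OF_UNITY;
    for k in (1..=TWO_ADICITY).rev() {
        let order = 1u64 << k;
        assert_eq!(pow_mod(omega, order, P), 1);     // omega^(2^k) = 1 ...
        assert_ne!(pow_mod(omega, order / 2, P), 1); // ... but omega^(2^(k-1)) != 1
        println!("generator of the subgroup of size 2^{k}: {omega}");
        omega = omega * omega % P;
    }
}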

In arkworks you can see how this is implemented:

let size = n.next_power_of_two() as u64;
let log_size_of_group = ark_std::log2(usize::try_from(size).expect("too large"));
// start from the generator of the largest power-of-2 subgroup and square it
// until it generates a subgroup of the requested size
let mut omega = Self::TWO_ADIC_ROOT_OF_UNITY;
for _ in log_size_of_group..Self::TWO_ADICITY {
    omega.square_in_place();
}

This allows you to easily find a subgroup of any power-of-2 size, which is useful in zero-knowledge proof systems as FFT optimizations work well on domains whose size is a power of 2. You can read more about that in the mina book.


Are system thinkers right? And why I left security posted April 2022

Niall Murphy recently wrote about The Curse of Systems Thinkers (Part 1). In the post he made the point that some people (people who he calls "System thinkers" and who sound a lot like me to be honest) can become extremely frustrated by chaotic environments, and will seek to better engineer them as they engineer code.

He ends the post with a pessimistic take (which I disagree with):

If you can't get the ball rolling on even a small scale because no-one can see the need or will free-up the required resources, then you're free: they're fucked. Give yourself permission to let the organization fail

A while ago, Magoo suggested I read The Phoenix Project which is a book about engineering companies. Specifically, it's a novel that seeks to teach you lessons through an engaging story instead of a catalogue of bullet points. In the book, the analogy is made that any technology company is like an assembly line, and thus can be made efficient by using the lessons already learned decades ago by the manufacturing industry.

The book also contains a side story about the failure of the security lead, which at the time really spoke to me. The tl;dr is that the security person was too extreme (like all security engineers who have never worked on the other side) and could not recognize that the business needs were more urgent and more important than the security needs at the time. The security person was convinced he was right, and that the others didn't care enough (reminiscent of Niall Murphy's blogpost), and consequently he lived a miserable life.

The point I'll be trying to make here is that it's all the same. Security, devops, engineering, ... it's all about trade-offs and about finding what works well at a given time.

Ignoring yak shaving (which everyone does, and thus needs to be controlled), how much time and effort should be spent specifying protocols, documenting code, and communicating ideas? How much time and effort do we really need to spend writing clean code and refactoring?

I don't think there's a good or bad answer. The arguments for both sides are strong:

Moving slow. Maintaining your own code, or having people maintain and extend your code, becomes harder and harder for the team with time. You will switch projects, and then go back to some code you haven't seen in a while. As the team grows, and as people come and go, the situation is amplified as well. Obviously some people are better than others at reverse engineering code, but it's generally a hard problem.

Another argument is that some people on the team are not necessarily good programmers, or perhaps don't even know how to code, so it becomes hard or impossible for them to review or contribute in different ways. For example, by writing proofs with formal analysis tools or with pen and paper, or by discussing the design with you, etc.

Complexity and rushed code obviously lead to security issues as well. That's undeniable.

Moving fast. On the other hand, you can't spend 90% of your time refactoring and doing things the_right_way™. You need to ship at some point. Business needs are often more important, and companies can go bankrupt by taking too much time to launch products. This is especially true during some stages of a company, in which it is in dire need of cash.

Furthermore, there are a ton of examples of companies growing massively while building on top of horrible stacks. Sometimes these companies can stagnate for years due to the amount of spaghetti code and complexity they're built on, and due to the fact that nobody is able to make changes effectively. But when this happens, codebases get rewritten from scratch anyway, which is not necessarily a bad thing. This is what happens with architecture, for example, where we tend to leave houses and buildings the way they are for very long periods of time, and destroy & rebuild when we really want to make substantial changes.

Eventually, the decision to move faster or slower is based on many factors. Many people work well in chaos and the system engineers might have to adapt to that.

That being said, extremes are always bad, and finding the right balance is always the right thing to do. But to find the right balance, you need extremists who will push and pull the company in different directions. Being a fanatic is a consuming job, and this is why you get a high turnover rate for such individuals (or blogposts like Niall Murphy telling you to let the organization fail).

This is the reason I personally left security.

The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man. -- George Bernard Shaw, Maxims for Revolutionists


Linearization in Plonk and Kimchi. Why? posted April 2022

This is a short note on the linearization step in Plonk (the zero-knowledge proof system), which is in my opinion the hardest step to understand.

Essentially, the linearization in Plonk exists due to two reasons:

  1. The polynomials we're dealing with are too big for the SRS (in the case of a polynomial commitment scheme, or PCS, like KZG) or URS (in the case of a PCS like bulletproof).
  2. During a proof verification, the verifier reconstructs the commitment to the polynomial that is aggregating all the constraints. To construct that commitment, the verifier adds commitments together, but must avoid multiplications as the commitment scheme used is only additively homomorphic.

Let's look at each of these in order.

The Reference String has a maximum size limit

The polynomials we're dealing with are too big for the SRS (in the case of a polynomial commitment scheme, or PCS, like KZG) or URS (in the case of a PCS like bulletproof).

Imagine that you have a polynomial with 4 coefficients:

$$ f = f_0 + f_1 x + f_2 x^2 + f_3 x^3 $$

Now imagine that your reference string is of size 2. This means that you can only commit to polynomials that have at most 2 coefficients. The only way to commit to $f$ is to split it into two polynomials of 2 coefficients each and commit to them separately:

  • $com_1 = com(f_0 + f_1 x)$
  • $com_2 = com(f_2 + f_3 x)$

Then, as part of the protocol, the prover evaluates this polynomial at some challenge point $\zeta$ and produces an evaluation proof for that. The verifier will have to first reconstruct the commitment to that polynomial, which can be constructed as:

$$ com = com_1 + \zeta^2 com_2 $$

Take a few minutes to understand why this works if needed.
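
If it helps, here's the reasoning spelled out. Since

$$ f(\zeta) = f_0 + f_1 \zeta + f_2 \zeta^2 + f_3 \zeta^3 = (f_0 + f_1 \zeta) + \zeta^2 (f_2 + f_3 \zeta) $$

the combined commitment $com = com_1 + \zeta^2 com_2$ is really a commitment to the polynomial $(f_0 + f_1 x) + \zeta^2 (f_2 + f_3 x)$, which is not $f$, but which agrees with $f$ at the point $\zeta$. So an evaluation proof at $\zeta$ against $com$ is just as convincing (as long as $\zeta$ is only chosen after $com_1$ and $com_2$ have been sent).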

Commitments are not homomorphic for the multiplication

During a proof verification, the verifier reconstructs the commitment to the polynomial that is aggregating all the constraints. To construct that commitment, the verifier adds commitments together, but must avoid multiplications as the commitment scheme used is only additively homomorphic.

Both the KZG and bulletproof PCSs use Pedersen commitments, which are homomorphic for addition, but not for multiplication. In other words, this means we can add commitments together (or add a commitment to itself many times), but not multiply commitments together.

So if $com_1$ is the commitment to the polynomial $g_1$ and $com_2$ is the commitment to the polynomial $g_2$, then the following is fine:

$$ com_1 + com_2 = com(g_1 + g_2) $$

While the following is not possible:

$$ com_1 \cdot com_2 $$

In this last case, the solution is to simply have the prover evaluate one of the polynomials. For example, if we want to evaluate the polynomial behind $com_1$ at a challenge point $\zeta$, the prover could provide $g_1(\zeta)$ and the verifier would then compute the commitment of $g_1 \cdot g_2$ as

$$ g_1(\zeta) \cdot com_2 $$

This works, as long as the protocol uses this commitment to later verify an evaluation proof of $g_1(\zeta) \cdot g_2(\zeta)$.
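
Spelled out, this uses only the additive homomorphism (multiplying a commitment by a known scalar is just repeated addition):

$$ g_1(\zeta) \cdot com_2 = g_1(\zeta) \cdot com(g_2) = com(g_1(\zeta) \cdot g_2) $$

and the polynomial $g_1(\zeta) \cdot g_2(x)$ is not $g_1(x) \cdot g_2(x)$, but it agrees with it at the point $x = \zeta$, which is exactly where the evaluation proof is checked.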

Note that if $g_1$ is a polynomial that needs to remain private (part of the witness), then it should be blinded if you care about zero-knowledge.

As such, the linearization step of Plonk is a way to have the verifier construct a commitment as a linear combination of commitments. These commitments are generally preprocessed commitments contained in the verifier index (also called verifier key) or commitments that the verifier can generate themselves (the commitment to the public input polynomial for example).
