david wong

Hey! I'm David, cofounder of zkSecurity and the author of the Real-World Cryptography book. I was previously a crypto architect at O(1) Labs (working on the Mina cryptocurrency), before that I was the security lead for Diem (formerly Libra) at Novi (Facebook), and a security consultant for the Cryptography Services of NCC Group. This is my blog about cryptography and security and other related topics that I find interesting.

How STARKs work if you don't care about FRI posted November 2023

Here's some notes on how STARK works, following my read of the ethSTARK Documentation (thanks Bobbin for the pointer!).

Warning: the following explanation should look surprisingly close to PlonK or SNARKs in general, to anyone familiar with these other schemes. If you know PlonK, maybe you can think of STARKs as turboplonk without preprocessing and without copy constraints/permutation. Just turboplonk with a single custom gate that updates the next row, also the commitment scheme makes everything complicated.

The execution trace table

Imagine a table with $W$ columns representing registers, which can be used as temporary values in our program/circuit. The table has $N$ rows, which represent the temporary values of each of these registers in each "step" of the program/circuit.

For example, a table of 4 registers and 3 steps:

r0 r1 r2
1 0 1 534
2 4 1 235
3 3 4 5

The constraints

There are two types of constraints which we want to enforce on this execution trace table to simulate our program:

  • boundary constraints: if I understand correctly this is for initializing the inputs of your program in the first rows of the table (e.g. the second register must be set to 1 initially) as well as the outputs (e.g. the registers in the last two rows must contain $3$, $4$, and $5$).
  • state transitions: these are constraints that apply to ALL contiguous pairs of rows (e.g. the first two registers added together in a row equal the value of the third register in the next row). The particularity of STARKs (and what makes them "scallable" and fast in practice) is that the same constraint is applied repeatidly. This is also why people like to use STARKs to implement zkVMs, as VMs do the same thing over and over.

This way of encoding a circuit as constraints is called AIR (for Algebraic Intermediate Representation).

Straw man 1: Doing things in the clear coz YOLO

Let's see an example of how a STARK could work as a naive interactive protocol between a prover and verifier:

  1. the prover constructs the execution trace table and sends it to the verifier
  2. the verifier checks the constraints on the execution trace table by themselves

This protocol works if we don't care about zero-knowledge, but it is obviously not very efficient: the prover sends a huge table to the verifier, and the verifier has to check that the table makes sense (vis a vis of the constraints) by checking every rows involved in the boundary constraints, and checking every contiguous pair of rows involved in the state transition constraints.

Straw man 2: Encoding things as polynomials for future profit

Let's try to improve on the previous protocol by using polynomials. This step will not immediately improve anything, but will set the stage for the step afterwards. Before we talk about the change to the protocol let's see two different observations:

First, let's note that one can encode a list of values as a polynomial by applying a low-degree extension (LDE). That is, if your list of values look like this: $(y_0, y_1, y_2, \cdots)$, then interpolate these values into a polynomial $f$ such that $f(0) = y_0, f(1) = y_1, f(2) = y_2, \cdots$

Usually, as we're acting in a field, a subgroup of large-enough size is chosen in place of $0, 1, 2$ as domain. You can read why's that here. (This domain is called the "trace evaluation domain" by ethSTARK.)

Second, let's see how to represent a constraint like "the first two registers added together in a row equal the value of the third register in the next row" as a polynomial. If the three registers in our examples are encoded as the polynomials $f_1, f_2, f_3$ then we need a way to encode "the next row". If our domain is simply $(0, 1, 2, \cdots)$ then the next row for a polynomial $f_1(x)$ is simply $f_1(x + 1)$. Similarly, if we're using a subgroup generated by $g$ as domain, we can write the next row as $f_1(x \cdot g)$. So the previous example constraint can be written as the constraint polynomial $c_0(x) = f_1(x) + f_2(x) - f_3(x \cdot g)$.

If a constraint polynomial $c_0(x)$ is correctly satisfied by a given execution trace, then it should be zero on the entire domain (for state transition constraints) or on some values of the domain (for boundary constraints). This means we can write it as $c_0(x) = t(x) \cdot \sum_i (x-g^i)$ for some "quotient" polynomial $t$ and the evaluation points $g^i$ (that encode the rows) where the constraint should apply. (In other words, you can factor $c_0$ using its roots $g^i$.)

Note: for STARKs to be efficient, you shouldn't have too many roots. Hence why boundary constraints should be limited to a few rows. But how does it work for state transition constraints that need to be applied to all the rows? The answer is that since we are in a subgroup there's a very efficient way to compute $\sum_i (x - g^i)$. You can read more about that in Efficient computation of the vanishing polynomial of the Mina book.

At this point, you should understand that a prover that wants to convince you that a constraint $c_1$ applies to an execution trace table can do so by showing you that $t$ exists. The prover can do so by sending the verifier the $t$ polynomial and the verifier computes $c_1$ from the register polynomials and verifies that it is indeed equal to $t$ multiplied by the $\sum_i (x-g^i)$. This is what is done both in Plonk and in STARK.

Note: if a constraint doesn't satisfy the execution trace, then you won't be able to factor it with $\sum_i (x - g^i)$ as not all of the $g^i$ will be roots. For this reason you'll get something like $c_1(x) = t(x) \cdot \sum_i (x - g^i) + r(x)$ for $r$ some "rest" polynomial. TODO: at this point can we still get a $t$ but it will have a high degree? If not then why do we have to do a low-degree test later?

Now let's see our modification to the previous protocol:

  1. Instead of sending the execution trace table, the prover encodes each column of the execution trace table (of height $N$) as polynomials, and sends the polynomials to the verifier.
  2. The prover then creates the constraint polynomials $c_0, c_1, \cdots$ (as described above) for each constraint involved in the AIR.
  3. The prover then computes the associated quotient polynomials $t_0, t_1, \cdots$ (as described above) and sends them to the verifier. Note that the ethSTARK paper call these quotient polynomials the constraint polynomials (sorry for the confusion).
  4. The verifier now has to check two things:
    • low-degree check: that these quotient polynomials are indeed low-degree. This is easy as we're doing everything in the clear for now (TODO: why do we need to check that though?)
    • correctness check: that these quotient polynomials were correctly constructed. For example, the verifier can check that for $t_0$ by computing $c_0$ themselves using the execution trace polynomials and then checking that it equals $t_0 \cdot (x - 1)$. That is, assuming that the first constraint $c_0$ only apply to the first row $g^0=1$.

Straw man 3: Let's make use of the polynomials with the Schwartz-Zippel optimization!

The verifier doesn't actually have to compute and compare large polynomials in the correctness check. Using the Schwartz-Zippel lemma one can check that two polynomials are equal by evaluating both of them at a random value and checking that the evaluations match. This is because Schwartz-Zippel tells us that two polynomials that are equal will be equal on all their evaluations, but if they differ they will differ on most of their evaluations.

So the previous protocol can be modified to:

  1. The prover sends the columns of the execution trace as polynomials $f_0, f_1, \cdots$ to the verifier.
  2. The prover produces the quotient polynomials $t_0, t_1, \cdots$ and sends them to the verifier.
  3. The verifier produces a random evaluation point $z$.
  4. The verifier checks that each quotient polynomial has been computed correctly. For example, for the first constraint, they evaluate $c_0$ at $z$, then evaluate $t_0(z) \cdot (z - 1)$, then check that both evaluations match.

Straw man 4: Using commitments to hide stuff and reduce proof size!

As the title indicates, we eventually want to use commitments in our scheme so that we can add zero-knowledge (by hiding the polynomials we're sending) and reduce the proof size (our commitments will be much smaller than what they commit).

The commitments used in STARKs are merkle trees, where the leaves contain evaluations of a polynomial. Unlike the commitments used in SNARKs (like IPA and KZG), merkle trees don't have an algebraic structure and thus are quite limited in what they allow us to do. Most of the complexity in STARKs come from the commitments. In this section we will not open that pandora box, and assume that the commitments we're using are normal polynomial commitment schemes which allow us to not only commit to polynomials, but also evaluate them and provide proofs that the evaluations are correct.

Now our protocol looks like this:

  1. The prover commits to the execution trace columns polynomials, then sends the commitments to the verifier.
  2. The prover commits to the quotient polynomials, the sends them to the verifier.
  3. The verifier sends a random value $z$.
  4. The prover evaluates the execution trace column polynomials at $z$ and $z \cdot g$ (remember the verifier might want to evaluate a constraint that looks like this $c_0(x) = f1(x) + f2(x) - f3(x \cdot g)$ as it also uses the next row) and sends the evaluations to the verifier.
  5. The prover evaluates the quotient polynomials at $z$ and sends the evaluations to the verifier (these evaluations are called "masks" in the paper).
  6. For each evaluation, the prover also sends evaluation proofs.
  7. The verifier verifies all evaluation proofs.
  8. The verifier then checks that each constraint is satisfied, by checking the $t = c \cdot \sum_i (x - g^i)$ equation in the clear (using the evaluations provided by the prover).

Straw man 5: a random linear combination to reduce all the checks to a single check

If you've been reading STARK papers you're probably wondering where the heck is the composition polynomial. That final polynomial is simply a way to aggregate a number of checks in order to optimize the protocol.

The idea is that instead of checking a property on a list of polynomial, you can check that property on a random linear combination. For example, instead of checking that $f_1(z) = 3$ and $f_2(z) = 4$, and $f_3(z) = 8$, you can check that for random values $r_1, r_2, r_3$ you have:

$$r_1 \cdot f_1(z) + r_2 \cdot f_2(z) + r_3 \cdot f_3(z) = 3 r_1 + 4 r_2 + 8 r_3$$

Often we avoid generating multiple random values and instead use powers of a single random value, which is a tiny bit less secure but much more practical for a number of reasons I won't touch here. So things often look like this instead, with a random value $r$:

$$f_1(z) + r \cdot f_2(z) + r^2 \cdot f_3(z) = 3 + 4 r + 8 r^2$$

Now our protocol should look like this:

  1. The prover commits to the execution trace columns polynomials, then sends the commitments to the verifier.
  2. The verifier sends a random value $r$.
  3. The prover produces a random linear combination of the constraint polynomials.
  4. The prover produces the quotient polynomial for that random linear combination, which ethSTARK calls the composition polynomial.
  5. The prover commits to the composition polynomial, then sends them to the verifier.
  6. The protocol continues pretty much like the previous one

Note: in the ethSTARK paper they note that the composition polynomial is likely of higher degree than the polynomials encoding the execution trace columns. (The degree of the composition polynomial is equal to the degree of the highest-degree constraint.) For this reason, there's some further optimization that split the composition polynomial into several polynomials, but we will avoid talking about it here.

We now have a protocol that looks somewhat clean, which seems contradictory with the level of complexity introduced by the various papers. Let's fix that in the next blogpost on FRI...

Well done! You've reached the end of my post. Now you can leave a comment or read something else.


leave a comment...