As I explained here a while back, checking polynomial identities (that some left-hand side is equal to some right-hand side) when the polynomials are hidden behind polynomial commitment schemes gets harder and harder as multiplications pile up. This is why we use pairings, and this is why we sometimes "linearize" our identities. If you didn't get what I just said, great! Because this is exactly what I'll explain in this post.
Using Schwartz-Zippel with no multiplication
First, let me say that there are typically two types of "nice" polynomial commitment schemes that people use with elliptic curves: Pedersen commitments and KZG commitments.
Pedersen commitments are basically hidden random linear combinations of the coefficients of a polynomial. That is, if your polynomial is $f(x) = \sum_i c_i \cdot x^i$, your commitment will look like $[\sum_i r_i \cdot c_i] G$ for some base point $G$ and unknown random values $r_i$. This is both good and bad: since we have access to the coefficients, we can try to use them to evaluate a polynomial from its commitment, but since it's a random linear combination of them, things can get ugly.
On the other hand, KZG commitments can be seen as hidden evaluations of your polynomials. For the same polynomial $f$ as above, a KZG commitment of $f$ would look like $[f(s)]G$ for some unknown random point $s$. Keeping $s$ unknown to everyone is much harder than handling the unknown random values $r_i$ of Pedersen commitments, and this is why KZG usually requires a trusted setup whereas Pedersen doesn't.
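If it helps, here's a tiny Python sketch contrasting the two shapes. Huge caveat: everything is in the clear here (I drop the base point $G$ and just compute scalars, with arbitrary values I made up), so nothing is actually hidden; it only shows what quantity each scheme commits to.

```python
# Toy contrast (Python, no dependencies): a Pedersen commitment is a random
# linear combination of the coefficients, a KZG commitment is an evaluation
# at a secret point s. Dropping the base point G means these "commitments"
# hide nothing; this is only about the shape of the committed data.
import random

P = 2**61 - 1                       # a prime, standing in for the scalar field
f = [7, 0, 3]                       # f(x) = 7 + 3x^2, as coefficients [c0, c1, c2]

rs = [random.randrange(P) for _ in f]                 # Pedersen's unknown r_i
pedersen = sum(r * c for r, c in zip(rs, f)) % P      # sum of r_i * c_i

s = random.randrange(P)                               # KZG's unknown point s
kzg = sum(c * pow(s, i, P) for i, c in enumerate(f)) % P   # f(s)
```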
In the rest of this post we'll use KZG commitments to prove identities.
Let's use $[a]$ to mean "commitment of the polynomial $a(x)$". Then you can easily check that $a(x) = b(x)$, knowing only the commitments to $a(x)$ and $b(x)$, by checking that $[a] = [b]$, or equivalently that $[a] - [b] = [0]$. This works because of the Schwartz-Zippel (S-Z) lemma, which tells us that checking this identity at a random point is convincing with high-enough probability.
When only multiplication by scalars is required, things are fine: since you can compute $i \cdot [a]$ to obtain $[i \cdot a]$, checking that $i \cdot a = j \cdot b$ is as simple as checking that $i \cdot [a] - j \cdot [b] = [0]$.
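Here's a minimal sketch of that linear check, in the same everything-in-the-clear toy model (the polynomials and the "secret" $s$ are arbitrary values I made up). On a real curve, $[a]$ would be the point $a(s) \cdot G$ and the exact same check would be done with point additions and scalar multiplications.

```python
P = 2**61 - 1

def evaluate(poly, x):
    # evaluate a polynomial given as a list of coefficients [c0, c1, ...]
    return sum(c * pow(x, k, P) for k, c in enumerate(poly)) % P

s = 987654321          # stand-in for the unknown point of the trusted setup
a = [5, 1, 2]          # a(x) = 5 + x + 2x^2
b = [15, 3, 6]         # b(x) = 3 * a(x)
i, j = 3, 1            # claim: i * a(x) = j * b(x)

com_a, com_b = evaluate(a, s), evaluate(b, s)    # toy "commitments" [a], [b]
assert (i * com_a - j * com_b) % P == 0          # i*[a] - j*[b] == [0]
```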
This post is about explaining how pairings help us when we want to check an identity that involves multiplying $a$ and $b$ together.
Using elliptic curve pairings for a single multiplication
It turns out that elliptic curve pairings allow us to perform a single multiplication. Meaning that once things get multiplied, they move to a different planet where things can only get added together and compared. No more multiplications.
Pairings give you this function $e$ which allows you to move things into the exponent, like this: $e([a], [b]) = e([1], [1])^{ab}$, where (remember) $ab$ is the product of the two polynomials evaluated at the unknown random point: $a(s) \cdot b(s)$.
As such, if you wanted to check something like $a \cdot b = c + 3$, for example, with commitments only, you could check the following pairing equation:
$$
e([a], [b]) = e([c] + 3 [1], [1])
$$
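In the toy model, a pairing is easy to fake: since $e([a], [b]) = e([1], [1])^{a(s) \cdot b(s)}$, comparing pairing outputs amounts to comparing products of the hidden values. The sketch below (with made-up evaluations) only illustrates which equation the verifier gets to test; it has none of the security of a real pairing.

```python
P = 2**61 - 1

def toy_pairing(x, y):
    # stands in for e([x], [y]) = e([1], [1])^(x*y): we compare the exponents
    return (x * y) % P

a_s, b_s = 12, 34              # pretend evaluations a(s) and b(s)
c_s = (a_s * b_s - 3) % P      # pick c so that a*b = c + 3 actually holds

lhs = toy_pairing(a_s, b_s)            # e([a], [b])
rhs = toy_pairing((c_s + 3) % P, 1)    # e([c] + 3*[1], [1])
assert lhs == rhs
```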
By the way, the left argument and the right argument of a pairing are often in different groups for "reasons". So we usually write things like this:
$$
e([a]_1, [b]_2) = e([c]_1 + 3 [1]_1, [1]_2)
$$
And so it is important to have commitments in the right groups if you want to be able to construct your polynomial identity check.
Evaluations can help with more than one multiplication
But what if you want to check something like $a \cdot b \cdot c = d + 4$? Are we doomed?
We're not! One insight that plonk brought to me (which potentially came from older papers, I don't know, I'm not an academic, leave me alone), is that you can reduce the number of multiplications with "this one simple trick". Let me explain...
A typical scenario is that you want to check an identity like this one:
$$a(x) \cdot b(x) \cdot c(x) = d(x)$$
and you have KZG commitments to all of these polynomials: $[a], [b], [c], [d]$. (In other words, hidden evaluations of these polynomials at the same unknown random point $s$.)
You can't compute the commitment of the left-hand side because you can't perform the multiplication of the three commitments.
The trick is to evaluate (using KZG) the previous identity at a different point, let's say $\zeta$, and to pre-evaluate (using KZG as well) as many polynomials as you can at $\zeta$, in order to reduce the number of multiplications down to 0.
Note: that is, if we want to check that $a(x) - b(x) = 0$, and we want to use S-Z to do that at some point $\zeta$, then we can pre-evaluate $a$ (or $b$) as $\bar{a} = a(\zeta)$ and check the identity $\bar{a} - b(x) = 0$ at the point $\zeta$ instead.
More precisely, we'll choose to pre-evaluate $b(\zeta) = \bar{b}$ and $c(\zeta) = \bar{c}$, for example. This means that we'll have to produce a quotient polynomial $q_b$ and $q_c$ such that:
- $b(s) - \bar{b} = (s - \zeta) \cdot q_b(s)$
- $c(s) - \bar{c} = (s - \zeta) \cdot q_c(s)$
which means that the verifier will have to perform the following two pairings (after having been sent the evaluations $\bar{b}$ and $\bar{c}$ in the clear):
- $e([b]_1 - \bar{b} \cdot [1]_1, [1]_2) = e([x]_1 - \zeta \cdot [1]_1, [q_b]_2)$
- $e([c]_1 - \bar{c} \cdot [1]_1, [1]_2) = e([x]_1 - \zeta \cdot [1]_1, [q_c]_2)$
Then, they'll be able to check the first identity at $\zeta$ and use $\bar{b}$ and $\bar{c}$ in place of the commitments $[b]$ and $[c]$. The verifier check will look like the following pairing (after receiving a commitment $[q]$ from the prover):
$$e( \bar{b} \cdot \bar{c} \cdot [a]_1 - [d]_1 - 0, [1]_2) = e([x]_1 - \zeta \cdot [1]_1, [q]_2)$$
which proves using KZG that $a(\zeta)b(\zeta)c(\zeta) - d(\zeta) = 0$ (which proves that the identity checks out with high probability thanks to S-Z).
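Here's the whole trick end to end, still in the toy model (the polynomials, $s$ and $\zeta$ are arbitrary values of mine, and I check the field equations directly instead of using pairings). The quotient polynomials are computed with synthetic division:

```python
P = 2**61 - 1

def evaluate(poly, x):
    return sum(c * pow(x, k, P) for k, c in enumerate(poly)) % P

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for k, fk in enumerate(f):
        for l, gl in enumerate(g):
            out[k + l] = (out[k + l] + fk * gl) % P
    return out

def poly_sub(f, g):
    n = max(len(f), len(g))
    f, g = f + [0] * (n - len(f)), g + [0] * (n - len(g))
    return [(x - y) % P for x, y in zip(f, g)]

def divide_out(f, zeta):
    """Return q such that f(x) - f(zeta) = (x - zeta) * q(x) (synthetic division)."""
    f = poly_sub(f, [evaluate(f, zeta)])
    q, acc = [0] * (len(f) - 1), 0
    for k in reversed(range(1, len(f))):
        acc = (f[k] + acc * zeta) % P
        q[k - 1] = acc
    return q

a, b, c = [1, 2], [3, 4], [5, 6]
d = poly_mul(poly_mul(a, b), c)          # by construction a*b*c - d = 0

s, zeta = 987654321, 55555               # setup point and S-Z challenge
b_bar, c_bar = evaluate(b, zeta), evaluate(c, zeta)   # sent in the clear

# The two evaluation proofs: b(s) - b_bar == (s - zeta) * q_b(s), same for c.
q_b, q_c = divide_out(b, zeta), divide_out(c, zeta)
assert (evaluate(b, s) - b_bar) % P == (s - zeta) * evaluate(q_b, s) % P
assert (evaluate(c, s) - c_bar) % P == (s - zeta) * evaluate(q_c, s) % P

# The linearized main check: b_bar * c_bar * a(x) - d(x) must vanish at zeta.
lin = poly_sub([b_bar * c_bar * ai % P for ai in a], d)
q = divide_out(lin, zeta)                # works because lin(zeta) == 0
assert evaluate(lin, s) == (s - zeta) * evaluate(q, s) % P
```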
Aggregating all the KZG evaluation proofs
In the previous explanation, we actually perform 3 KZG evaluation proofs instead of one:
- $2$ pairings that are KZG evaluation proofs that pre-evaluate different polynomials from the main check at some random point $\zeta$.
- $1$ pairing that evaluates the main identity at $\zeta$, after it was linearized to get rid of any multiplication of commitments.
Pairings can be aggregated by simply creating a random linear combination of the checks. That is, with some random values $r_1, r_2$ we can aggregate the three checks into one whose left-hand side is:
$$
b(s) - \bar{b} + r_1 (c(s) - \bar{c}) + r_2 (\bar{b} \cdot \bar{c} \cdot a(s) - d(s) - 0)
$$
and the right-hand side is:
$$(s - \zeta) \cdot q_b(s) + r_1 ((s - \zeta) \cdot q_c(s)) + r_2 ((s - \zeta) \cdot q(s))$$
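The toy numbers below show why this is allowed: each check is linear in its two sides, so a random linear combination of valid checks is still valid. (Soundness in the other direction relies on $r_1$ and $r_2$ being sampled after the prover is committed to the proofs.)

```python
import random

P = 2**61 - 1
s, zeta = 987654321, 55555

# three valid evaluation proofs: lhs_i == (s - zeta) * q_i(s) for each i
qs = [random.randrange(P) for _ in range(3)]      # q_b(s), q_c(s), q(s)
lhss = [(s - zeta) * q % P for q in qs]           # the matching left-hand sides

r1, r2 = random.randrange(P), random.randrange(P)
agg_lhs = (lhss[0] + r1 * lhss[1] + r2 * lhss[2]) % P
agg_rhs = (s - zeta) * (qs[0] + r1 * qs[1] + r2 * qs[2]) % P
assert agg_lhs == agg_rhs                         # one check instead of three
```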
I've recorded a video on how the plonk permutation works here, but I thought I would write a more incremental explanation of it for those who want MOAR! If things don't make sense in this explanation, I'm happy to dig into specifics in more detail, just ask in the comments! Don't forget your companion eprint paper.
Multiset equality check
Suppose that you have two ordered sets of values $D = \{d_1, d_2, d_3, d_4\}$ and $E = \{e_1, e_2, e_3, e_4\}$, and that you want to check that they contain the same values. That is, you want to check that there exists a permutation of the elements of $D$ (or $E$) such that the multisets (sets where some values can repeat) are the same, but you don't care about which permutation exactly gets you there. You're willing to accept ANY permutation.
$$\{d_1, d_2, d_3, d_4\} = \text{some\_permutation}(\{e_1, e_2, e_3, e_4\})$$
For example, it could be that re-ordering $E$ as $\{e_2, e_3, e_1, e_4\}$ gives us exactly $D$.
Trick 1: multiply things to reduce to a single value
One way to perform our multiset equality check is to compare the product of the elements on both sides:
$$d_1 \cdot d_2 \cdot d_3 \cdot d_4 = e_1 \cdot e_2 \cdot e_3 \cdot e_4$$
If the two sets contain the same values then our identity checks out. But the reverse is not true, and thus this scheme is not secure.
Can you see why?
For example, $D = (1, 1, 1, 15)$ and $E = (3, 5, 1, 1)$ are obviously different multisets, yet the product of their elements will match!
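You can convince yourself with a couple of lines of Python:

```python
from math import prod

D = (1, 1, 1, 15)
E = (3, 5, 1, 1)
assert sorted(D) != sorted(E)     # different multisets...
assert prod(D) == prod(E)         # ...same product: the check is fooled
```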
Trick 2: use polynomials, because maybe it will help...
What we can do to fix this issue is to encode the values of each list as the roots of two polynomials:
- $d(x) = (x - d_1)(x - d_2)(x - d_3)(x - d_4)$
- $e(x) = (x - e_1)(x - e_2)(x - e_3)(x - e_4)$
These two polynomials are equal if and only if they have the same roots with the same multiplicities (meaning that if a root repeats, it must repeat the same number of times), which is exactly our multiset equality.
Trick 3: optimize polynomial identities with Schwartz-Zippel
Now it's time to use the Schwartz-Zippel lemma to optimize the comparison of polynomials! Our lemma tells us that if two polynomials are equal, then they are equal on all points, but if two polynomials are not equal, then they differ on MOST points.
So one easy way to check that they match with high probability is to sample a random evaluation point, let's say some random $\gamma$. Then evaluate both polynomials at that random point $\gamma$ to see if their evaluations match:
$$(\gamma - d_1)(\gamma - d_2)(\gamma - d_3)(\gamma - d_4) = (\gamma - e_1)(\gamma - e_2)(\gamma - e_3)(\gamma - e_4)$$
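Here's a short sketch of tricks 2 and 3 combined (the field size and the lists are arbitrary choices of mine):

```python
import random

P = 2**61 - 1

def root_poly_eval(values, gamma):
    # evaluate (x - v_1)(x - v_2)... at x = gamma
    out = 1
    for v in values:
        out = out * ((gamma - v) % P) % P
    return out

gamma = random.randrange(P)

D = [4, 8, 15, 16]
E = [15, 4, 16, 8]                # same multiset, different order
assert root_poly_eval(D, gamma) == root_poly_eval(E, gamma)

# the counterexample from trick 1 now fails (except with tiny probability)
assert root_poly_eval([1, 1, 1, 15], gamma) != root_poly_eval([3, 5, 1, 1], gamma)
```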
Permutation check
The previous check is not useful for wiring different cells within some execution trace: there is no specific "permutation" being enforced. So we can't use it as is in plonk to implement our copy constraints.
Trick 4: random linear combinations to encode tuples
To enforce a permutation, we can compare tuples of elements instead! For example, let's say we want to enforce that $E$ must be re-ordered using the permutation $(1 3 2) (4)$ in cycle notation. Then we would try to do the following identity check:
$$
((1, d_1), (2, d_2), (3, d_3), (4, d_4)) = ((2, e_1), (3, e_2), (1, e_3), (4, e_4))
$$
Here, we are enforcing that $d_1$ is equal to $e_3$, and that $d_2$ is equal to $e_1$, etc. This allows us to re-order the elements of $E$:
$$
((1, d_1), (2, d_2), (3, d_3), (4, d_4)) = ((1, e_3), (2, e_1), (3, e_2), (4, e_4))
$$
But how can we encode our tuples into the polynomials we've seen previously?
The trick is to use a random linear combination! (And that is often the answer in a bunch of ZK protocols.)
So if we want to encode $(2, d_2)$ in an equation, for example, we write $2 + \beta \cdot d_2$ for some random value $\beta$.
Note: The rationale behind this idea is still due to Schwartz-Zippel: if you have two tuples $(a, b)$ and $(a', b')$, you know that the polynomial $a + x \cdot b$ is the same as the polynomial $a' + x \cdot b'$ if $a = a'$ and $b = b'$, or if $x = \frac{a' - a}{b - b'}$. If $x$ is chosen at random, the probability that it is exactly that value is $\frac{1}{N}$, with $N$ the size of your sampling domain (i.e. the size of your field), so this is highly unlikely.
So now we can encode the previous lists of tuples as these polynomials:
- $d(x, y) = (1 + y \cdot d_1 - x)(2 + y \cdot d_2 - x)(3 + y \cdot d_3 - x)(4 + y \cdot d_4 - x)$
- $e(x, y) = (2 + y \cdot e_1 - x)(3 + y \cdot e_2 - x)(1 + y \cdot e_3 - x)(4 + y \cdot e_4 - x)$
We can then reduce both polynomials to a single value each by sampling random values for $x$ and $y$ (let's call them $\gamma$ and $\beta$ respectively), which gives us:
- $(1 + \beta \cdot d_1 - \gamma)(2 + \beta \cdot d_2 - \gamma)(3 + \beta \cdot d_3 - \gamma)(4 + \beta \cdot d_4 - \gamma)$
- $(2 + \beta \cdot e_1 - \gamma)(3 + \beta \cdot e_2 - \gamma)(1 + \beta \cdot e_3 - \gamma)(4 + \beta \cdot e_4 - \gamma)$
If these two values match, then with overwhelming probability the two polynomials match, and thus our permutation of $E$ matches $D$.
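Here's the full permutation check on our running example, as a sketch with arbitrary toy values (note how re-ordering $e$ with the permutation above gives back $d$):

```python
import random

P = 2**61 - 1

def tuple_product(indexed_values, beta, gamma):
    # fold each (index, value) tuple into index + beta*value - gamma, multiply all
    out = 1
    for idx, v in indexed_values:
        out = out * ((idx + beta * v - gamma) % P) % P
    return out

beta, gamma = random.randrange(P), random.randrange(P)

d = [7, 9, 11, 13]
e = [9, 11, 7, 13]         # satisfies d1 = e3, d2 = e1, d3 = e2, d4 = e4

lhs = tuple_product(enumerate(d, start=1), beta, gamma)   # tuples (i, d_i)
rhs = tuple_product(zip([2, 3, 1, 4], e), beta, gamma)    # tuples (sigma(i), e_i)
assert lhs == rhs
```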
Wiring within a single execution trace column
Let's now see how we can use the (optimized) checks we've learned previously in plonk. We will first learn how to wire cells of a single execution trace column, and in the next section we will expand this to three columns (as vanilla Plonk uses three columns).
Take a moment to think about how we can use the previous tricks.
The answer is to see the execution trace as your list $E$, and then see if it is equal to a fixed permutation of it ($D$). Note that this permutation is decided when you write your circuit, and precomputed into the verifier key in Plonk.
Remember that the formula we're trying to check is the following for some random $\beta$ and $\gamma$, and for some permutation function $\sigma$ that we defined:
$$
\prod_{i=1}^{n} (i + \beta \cdot d[i] - \gamma) = \prod_{i=1}^{n} (\sigma(i) + \beta \cdot e[i] - \gamma)
$$
Trick 5: write a circuit for the permutation check
To enforce the previous check, we will write a mini-circuit (yes, an actual circuit!) which will progressively accumulate the result of dividing the left-hand side by the right-hand side. This circuit only requires one variable/register we'll call $z$ (and so it will add a new column $z$ to our execution trace), which will start with the initial value 1 and will end with the following value:
$$
\prod_{i=1}^{n} \frac{i+\beta \cdot d[i] - \gamma}{\sigma(i) + \beta \cdot e[i] - \gamma} = 1
$$
Let's rewrite it using only the first wire/column $a$ of Plonk, and using our generator $\omega$ as the index in our tuples (because this is how we handily index things in Plonk):
$$
\prod_{i=1}^{n} \frac{\omega^i+\beta \cdot a[i] - \gamma}{\sigma(\omega^i) + \beta \cdot a[i] - \gamma} = 1
$$
We can then constrain the last value to be equal to 1, which will enforce that the two polynomials encoding our list of values and its permutation are equal (with overwhelming probability).
In plonk, a gate can only access variables/registers from the same row. So we will use the following extra gate (reordering the previous equation, as we can't divide in a circuit) throughout the circuit:
$$
z[i+1] \cdot (\sigma(i) + \beta \cdot a[i] - \gamma) = z[i] \cdot (i + \beta \cdot a[i] - \gamma)
$$
Now, how do we encode this gate in the circuit? The astute eye will have noticed that we are using a cell of the next row ($z[i+1]$) which we haven't done in Plonk so far.
Trick 6: you're in a multiplicative subgroup, remember?
Enforcing things across rows is actually possible in plonk because we encode our polynomials in a multiplicative subgroup of our field! Due to this, we can reach for the next value(s) by multiplying an evaluation point with the subgroup's generator.
That is, values are encoded in our polynomials at evaluation points $\omega, \omega^2, \omega^3, \cdots$, and so multiplying an evaluation point by $\omega$ (the generator) brings you to the next cell in an execution trace.
As such, the verifier will later try to enforce that the following identity checks out in the multiplicative subgroup:
$$
z(x \cdot \omega) \cdot (\sigma(x) + \beta \cdot a(x) - \gamma) = z(x) \cdot (x + \beta \cdot a(x) - \gamma)
$$
Note: This concept was generalized in turboplonk, and is used extensively in the AIR arithmetization (used by STARKs). This is also the reason why in Plonk we have to evaluate the $z$ polynomial at $\zeta \omega$.
There will also be two additional gates: one that checks that the initial value is 1, and one that checks that the last value is 1, both applied only to their respective rows. One trick that Plonk uses here: the "last value" of the accumulator is really obtained one row after the last row, and since the row after the last one wraps back around to the first row in our multiplicative subgroup ($\omega^{n+1} = \omega^1$), the constraint $z[\text{last row} + 1] = z[\text{first row}]$ comes for free. As such, checking that $z[\text{first row}] = 1$ is enough.
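Here's a runnable sketch of the accumulator column, in the tiny field $\mathbb{F}_{17}$ where $\omega = 4$ generates a multiplicative subgroup of order 4 (all values here are arbitrary demo choices). A real prover would interpolate the column into a polynomial $z(x)$, and the row shift would become the evaluation $z(x \cdot \omega)$:

```python
p = 17
omega, n = 4, 4                                   # omega has order 4 in F_17
H = [pow(omega, k, p) for k in range(1, n + 1)]   # omega^1, ..., omega^4 = 1

# the permutation (1 3 2)(4) wires cells 1, 2, 3 together, so the wire
# values must agree on those cells for the check to pass
sigma = [H[2], H[0], H[1], H[3]]                  # sigma(omega^k) as field values
a = [3, 3, 3, 6]                                  # satisfies the copy constraints

beta, gamma = 5, 7                                # stand-ins for random challenges
z = [1]                                           # boundary gate: first value is 1
for k in range(n):
    num = (H[k] + beta * a[k] - gamma) % p
    den = (sigma[k] + beta * a[k] - gamma) % p
    # permutation gate: z[k+1] * den == z[k] * num (we can divide here, the
    # in-circuit version multiplies both sides instead)
    z.append(z[-1] * num * pow(den, p - 2, p) % p)

assert z[n] == 1   # the product telescopes back to 1: the permutation holds
```

If you tamper with one wired cell (say, `a = [3, 5, 3, 6]`), the final value will no longer be 1, except with tiny probability over $\beta$ and $\gamma$.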
You can see these two gates added to the vanilla plonk gate in the computation of the quotient polynomial $t$ in plonk. Take a look at this screenshot of round 3 of the protocol, and squint really hard to ignore the division by $Z_H(X)$, the powers of $\alpha$ being used to aggregate the different gate checks, and the fact that $b$ and $c$ (the other wires/columns) are used:
The first line in the computation of $t$ is the vanilla plonk gate (that allows you to do multiplication and addition); the last line constrains that the first value of $z$ is $1$; and the other lines encode the permutation gate as I described (again, if you ignore the terms involving $b$ and $c$).
Trick 7: create your execution trace in steps
There's something worthy of note: the extra execution trace column $z$ contains values that use other execution trace columns. For this reason, the other execution trace columns must be fixed BEFORE anything is done with the permutation column $z$.
In Plonk, this is done by waiting for the prover to send commitments of $a$, $b$, and $c$ to the verifier, before producing the random challenges $\beta$ and $\gamma$ that will be used by the prover to produce the values of $z$.
Wiring multiple execution trace columns
The previous check only works within the cells of a single execution trace column. How does Plonk generalize this to several execution trace columns?
Remember: we indexed our first execution trace column with the values of our circuit domain (that multiplicative subgroup). We simply have to find a way to index the other columns with distinct values.
Trick 8: use cosets
A coset of our subgroup is simply a set of the same size that is completely disjoint from it. Handily, a coset is also very easy to compute if you know the subgroup: just multiply every one of its elements by some element $k$ that lies outside the subgroup.
Since we want a similar-but-different set from the elements of our multiplicative subgroup, we can use cosets!
Plonk produces two values $k_1$ and $k_2$ (which can be the values $2$ and $3$, for example), which, when multiplied with the values of our multiplicative subgroup ($\{\omega, \omega^2, \omega^3, \cdots\}$), each produce a different set of the same size. These are not subgroups anymore, but who cares!
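A quick sketch with the same tiny subgroup as before shows that $k_1 = 2$ and $k_2 = 3$ indeed give three mutually disjoint index sets (in a real field you'd pick $k_1$ and $k_2$ such that $k_1$, $k_2$ and $k_1/k_2$ all fall outside the subgroup):

```python
p = 17
omega, n = 4, 4
H = {pow(omega, k, p) for k in range(1, n + 1)}       # {4, 16, 13, 1}

k1, k2 = 2, 3
coset1 = {k1 * h % p for h in H}                      # {8, 15, 9, 2}
coset2 = {k2 * h % p for h in H}                      # {12, 14, 5, 3}

assert len(coset1) == len(coset2) == len(H)           # same sizes
assert H.isdisjoint(coset1) and H.isdisjoint(coset2)  # disjoint from H...
assert coset1.isdisjoint(coset2)                      # ...and from each other
```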
We now have to create three different permutations, one for each set, and each permutation can point to indices in any of the three sets.