Short Note On Montgomery Reduction: Why Operating Modulo b? posted 6 days ago
Here's a short note on the Montgomery reduction algorithm, which we explained in this audit report of p256. If you don't know, this is an algorithm that is used to perform modular reductions in an efficient way. I invite you to read the explanations in the report, up until the section on word-by-word Montgomery reduction.
In this note, I wanted to offer a different explanation as to why most of the computation happens modulo $b$ instead of modulo $R$.
As a recap, we want to use $A$ (called $\bar{x}$ in the report's explanations) in a base-$b$ decomposition so that in our implementation we can use limbs of size $b$:
$$ A = a_0 + a_1 \cdot b + a_2 \cdot b^2 + \cdots $$
Knowing that we can write the nominator part of $N/R$ as explained here as:
$$ N = (a_0 + a_1 \cdot b + a_2 \cdot b^2 + \cdots) + k \cdot m $$
Grouping by $b^i$ terms we get the following terms:
$$ N = (a_0 + k_0 \cdot m) + b \cdot (a_1 + k_1 \cdot m) + b^2 \cdot (a_2 + k_2 \cdot m) + \cdots $$
but then why do algorithms compute $k_i = a_i m^{-1} \mod b$ at this point?
The trick is to notice that if a value is divisible by a power of 2, let's say $2^l$, then it means that the first $l$ least-significant bits are 0.
As such, the value $A + k \cdot m$ we are computing in the integers will have the first $n$ chunks sum to zero if it wants to be divisible by $R = b^n$ (where $b$ is a power of 2):
$$ \sum_{i=0}^{n-1} b^i \cdot (a_i + k_i \cdot m) = 0 $$
This means that:
- either $a_i + k_i \cdot m = 0$ (each term will separately be 0)
- or $a_i + k_i \cdot m = b \cdot \text{carry}$ (cancellation will occur with carries except for the last chunk that needs to be 0)
In both case, we can write:
$$ a_i + k_i \cdot m = 0 \mod b $$
Comments
leave a comment...