vEnhance's avatar

Jun 15, 2016

🖉 Miller-Rabin (for MIT 18.434)

This is a transcript of a talk I gave as part of MIT’s 18.434 class, the “Seminar in Theoretical Computer Science” as part of MIT’s communication requirement. (Insert snarky comment about MIT’s CI-* requirements here.) It probably would have made a nice math circle talk for high schoolers but I felt somewhat awkward having to present it to a bunch of students who were clearly older than me.

1. Preliminaries

1.1. Modular arithmetic

In middle school you might have encountered questions such as

Exercise 1. What is 32016(mod10)3^{2016} \pmod{10}?

You could answer such questions by listing out 3n3^n for small nn and then finding a pattern, in this case of period 44. However, for large moduli this “brute-force” approach can be time-consuming.

Fortunately, it turns out that one can predict the period in advance.

Theorem 2 (Euler’s little theorem)

  1. Let gcd(a,n)=1\gcd(a,n) = 1. Then aϕ(n)1(modn)a^{\phi(n)} \equiv 1 \pmod n.
  2. (Fermat) If pp is a prime, then apa(modp)a^p \equiv a \pmod p for every aa.

Proof: Part (a) is a special case of Lagrange’s Theorem: if GG is a finite group and gGg \in G, then gGg^{|G|} is the identity element. Now select G=(Z/nZ)×G = (\mathbb Z/n\mathbb Z)^\times. Part (b) is the case n=pn=p. \Box

Thus, in the middle school problem we know in advance that 341(mod10)3^4 \equiv 1 \pmod{10} because ϕ(10)=4\phi(10) = 4. This bound is sharp for primes:

Theorem 3 (Primitive roots)

For every pp prime there’s a g(modp)g \pmod p such that gp11(modp)g^{p-1} \equiv 1 \pmod p but gk≢1(modp)g^{k} \not\equiv 1 \pmod p for any k<p1k < p-1. (Hence (Z/pZ)×Z/(p1)(\mathbb Z/p\mathbb Z)^\times \cong \mathbb Z/(p-1).)

For a proof, see the last exercise of my orders handout.

We will define the following anyways:

Definition 4. We say an integer nn (thought of as an exponent) annihilates the prime pp if

  • an1(modp)a^n \equiv 1 \pmod p for every a≢0(modp)a \not\equiv 0 \pmod p,
  • or equivalently, p1np-1 \mid n.

Theorem 5 (All/nothing)

Suppose an exponent nn does not annihilate the prime pp. Then more than 12p\frac{1}{2} p of x(modp)x \pmod p satisfy xn≢1(modp)x^n \not\equiv 1 \pmod p.

Proof: Much stronger result is true: in xn1(modp)x^n \equiv 1 \pmod p then xgcd(n,p1)1(modp)x^{\gcd(n,p-1)} \equiv 1 \pmod p. \Box

1.2. Repeated Exponentiation

Even without the previous facts, one can still do:

Theorem 6 (Repeated exponentiation)

Given xx and nn, one can compute xn(modN)x^n \pmod N with O(logn)O(\log n) multiplications mod NN.

The idea is that to compute x600(modN)x^{600} \pmod N, one just multiplies x512+64+16+8x^{512+64+16+8}. All the x2kx^{2^k} can be computed in kk steps, and klog2nk \le \log_2 n.

1.3. Chinese remainder theorem

In the middle school problem, we might have noticed that to compute 32016(mod10)3^{2016} \pmod{10}, it suffices to compute it modulo 55, because we already know it is odd. More generally, to understand (modn)\pmod n it suffices to understand nn modulo each of its prime powers.

The formal statement, which we include for completeness, is:

Theorem 7 (Chinese remainder theorem)

Let p1p_1, p2p_2, …, pmp_m be distinct primes, and ei1e_i \ge 1 integers. Then there is a ring isomorphism given by the natural projection Z/ni=1mZ/piei.\mathbb Z/n \rightarrow \prod_{i=1}^m \mathbb Z/p_i^{e_i}. In particular, a random choice of x(modn)x \pmod n amounts to a random choice of xx mod each prime power.

For an example, in the following table we see the natural bijection between x(mod15)x \pmod{15} and (x(mod3),x(mod5))(x \pmod 3, x \pmod 5).

x(mod15)x(mod3)x(mod5)000111222303414520601712x(mod15)x(mod3)x(mod5)82390410101121120213131424 \begin{array}{c|cc} x \pmod{15} & x \pmod{3} & x \pmod{5} \\ \hline 0 & 0 & 0 \\ 1 & 1 & 1 \\ 2 & 2 & 2 \\ 3 & 0 & 3 \\ 4 & 1 & 4 \\ 5 & 2 & 0 \\ 6 & 0 & 1 \\ 7 & 1 & 2 \end{array} \quad \begin{array}{c|cc} x \pmod{15} & x \pmod{3} & x \pmod{5} \\ \hline 8 & 2 & 3 \\ 9 & 0 & 4 \\ 10 & 1 & 0 \\ 11 & 2 & 1 \\ 12 & 0 & 2 \\ 13 & 1 & 3 \\ 14 & 2 & 4 \\ && \end{array}

2. The RSA algorithm

This simple number theory is enough to develop the so-called RSA algorithm. Suppose Alice wants to send Bob a message MM over an insecure channel. They can do so as follows.

  • Bob selects integers dd, ee and NN (with NN huge) such that NN is a semiprime and

    de1(modϕ(N)).de \equiv 1 \pmod{\phi(N)}.

  • Bob publishes both the number NN and ee (the public key) but keeps the number dd secret (the private key).
  • Alice sends the number X=Me(modN)X = M^e \pmod N across the channel.
  • Bob computes

    XdMdeM1M(modN)X^d \equiv M^{de} \equiv M^1 \equiv M \pmod N and hence obtains the message MM.

In practice, the NN in RSA is at least 20002000 bits long.

The trick is that an adversary cannot compute dd from ee and NN without knowing the prime factorization of NN. So the security relies heavily on the difficulty of factoring.

Remark 8. It turns out that we basically don’t know how to factor large numbers NN: the best known classical algorithms can factor an nn-bit number in O(exp(649nlog(n)2)1/3)O\left( \exp\left( \frac{64}{9} n \log(n)^2 \right)^{1/3} \right) time (“general number field sieve”). On the other hand, with a quantum computer one can do this in O(n2lognloglogn)O\left( n^2 \log n \log \log n \right) time.

3. Primality Testing

Main question: if we can’t factor a number nn quickly, can we at least check it’s prime?

In what follows, we assume for simplicity that nn is squarefree, i.e. n=p1p2pkn = p_1 p_2 \dots p_k for distinct primes pkp_k, This doesn’t substantially change anything, but it makes my life much easier.

3.1. Co-RP

Here is the goal: we need to show there is a random algorithm AA which does the following.

  • If nn is composite then
    • More than half the time AA says “definitely composite”.
    • Occasionally, AA says “possibly prime”.
  • If nn is prime, AA always says “possibly prime”.

If there is a polynomial time algorithm AA that does this, we say that PRIMES is in Co-RP. Clearly, this is a very good thing to be true!

3.2. Fermat

One idea is to try to use the converse of Fermat’s little theorem: given an integer nn, pick a random number x(modn)x \pmod n and see if xn11(modn)x^{n-1} \equiv 1 \pmod n. (We compute using repeated exponentiation.) If not, then we know for sure nn is not prime, and we call xx a Fermat witness modulo nn.

How good is this test? For most composite nn, pretty good:

Proposition 9. Let nn be composite. Assume that there is a prime pnp \mid n such that n1n-1 does not annihilate pp. Then over half the numbers mod nn are Fermat witnesses.

Proof: Apply the Chinese theorem then the “all-or-nothing” theorem. \Box Unfortunately, if nn doesn’t satisfy the hypothesis, then all the gcd(x,n)=1\gcd(x,n) = 1 satisfy xn11(modn)x^{n-1} \equiv 1 \pmod n!

Are there such nn which aren’t prime? Such numbers are called Carmichael numbers, but unfortunately they exist, the first one is 561=31117561 = 3 \cdot 11 \cdot 17.

Remark 10. For X1X \gg 1, there are more than X1/3X^{1/3} Carmichael numbers at most XX.

Thus these numbers are very rare, but they foil the Fermat test.

Exercise 11. Show that a Carmichael number is not a semiprime.

3.3. Rabin-Miller

Fortunately, we can adapt the Fermat test to cover Carmichael numbers too. It comes from the observation that if nn is prime, then a21(modn)    a±1(modn)a^2 \equiv 1 \pmod n \implies a \equiv \pm 1 \pmod n.

So let n1=2stn-1 = 2^s t, where tt is odd. For example, if n=561n = 561 then 560=2435560 = 2^4 \cdot 35. Then we compute xtx^t, x2tx^{2t}, …, xn1x^{n-1}. For example in the case n=561n=561 and x=245x=245:

mod561mod3mod11mod17x245137x35122113x70298119x140166114x28067111x5601111 \begin{array}{c|r|rrr} & \mod 561 & \mod 3 & \mod 11 & \mod 17 \\ \hline x & 245 & -1 & 3 & 7 \\ \hline x^{35} & 122 & -1 & \mathbf 1 & 3 \\ x^{70} & 298 & \mathbf 1 & 1 & 9 \\ x^{140} & 166 & 1 & 1 & -4 \\ x^{280} & 67 & 1 & 1 & -1 \\ x^{560} & 1 & 1 & 1 & \mathbf 1 \end{array}

And there we have our example! We have 6721(mod561)67^2 \equiv 1 \pmod{561}, so 561561 isn’t prime.

So the Rabin-Miller test works as follows:

  • Given nn, select a random xx and compute powers of xx as in the table.
  • If xn1≢1x^{n-1} \not\equiv 1, stop, nn is composite (Fermat test).
  • If xn11x^{n-1} \equiv 1, see if the entry just before the first 11 is 1-1. If it isn’t then we say xx is a RM-witness and nn is composite.
  • Otherwise, nn is “possibly prime”.

How likely is probably?

Theorem 12. If nn is Carmichael, then over half the x(modn)x \pmod n are RM witnesses.

Proof: We sample x(modn)x \pmod n randomly again by looking modulo each prime (Chinese theorem). By the theorem on primitive roots, show that the probability the first 1-1 appears in any given row is 12\le \frac{1}{2}. This implies the conclusion. \Box

Exercise 13. Improve the 12\frac{1}{2} in the problem to 34\frac34 by using the fact that Carmichael numbers aren’t semiprime.

3.4. AKS

In August 6, 2002, it was in fact shown that PRIMES is in P, using the deterministic AKS algorithm. However, in practice everyone still uses Miller-Rabin since the implied constants for AKS runtime are large.