These are the notes of my last lecture in the 18.099 discrete analysis seminar.
It is a very high-level overview of the Green-Tao theorem.
It is a subset of this paper.
1. Synopsis
This post as in overview of the proof of:
Theorem 1(Green-Tao)
The prime numbers contain arbitrarily long arithmetic progressions.
Here, Szemerédi’s theorem isn’t strong enough, because the primes have density approaching zero.
Instead, one can instead try to prove the following “relativity” result.
Theorem(Relative Szemerédi)
Let S be a sparse “pseudorandom” set of integers.
Then subsets of A with positive density in S have arbitrarily long arithmetic progressions.
In order to do this, we have to accomplish the following.
Make precise the notion of “pseudorandom”.
Prove the Relative Szemerédi theorem, and then
Exhibit a “pseudorandom” set S which subsumes the prime numbers.
This post will use the graph-theoretic approach to Szemerédi as in the exposition of David Conlon, Jacob Fox,
and Yufei Zhao. In order to motivate the notion of pseudorandom,
we return to the graph-theoretic approach of Roth’s theorem, i.e. the case k=3 of Szemerédi’s theorem.
2. Defining the linear forms condition
2.1. Review of Roth theorem
Roth’s theorem can be phrased in two ways. The first is the “set-theoretic” formulation:
Theorem 2(Roth, set version)
If A⊆Z/N is 3-AP-free, then ∣A∣=o(N).
The second is a “weighted” version
Theorem 3(Roth, weighted version)
Fix δ>0. Let f:Z/N→[0,1] with Ef≥δ. Then
Λ3(f,f,f)≥Ωδ(1).
We sketch the idea of a graph-theoretic proof of the first theorem.
We construct a tripartite graph GA on vertices X⊔Y⊔Z, where X=Y=Z=Z/N.
Then one creates the edges
(x,y) if 2x+y∈A,
(x,z) if x−z∈A, and
(y,z) if −y−2z∈A.
This construction is selected so that arithmetic progressions in A correspond
to triangles in the graph GA.
As a result, if A has no 3-AP’s (except trivial ones, where x+y+z=0),
the graph GA has exactly one triangle for every edge.
Then, we can use the theorem of Ruzsa-Szemerédi, which states that this graph GA has o(n2) edges.
2.2. The measure ν
Now for the generalized version, we start with the second version of Roth’s theorem.
Instead of a set S, we consider a function
ν:Z/N→R≥0
which we call a majorizing measure.
Since we are now dealing with A of low density, we normalize ν so that
E[ν]=1+o(1).
Our goal is to now show a result of the form:
If 0≤f≤ν, Ef≥δ, and ν satisfies a “pseudorandom” condition,
then Λ3(f,f,f)≥Ωδ(1).
The prototypical example of course is that if A⊂S⊂ZN,
then we let ν(x)=∣S∣N1S(x).
2.3. Pseudorandomness for k=3
So, how can we put the pseudorandom condition?
Initially, consider GS the tripartite graph defined earlier, and let p=∣S∣/N;
since S is sparse we expect p small.
The main idea that turns out to be correct is:
The number of embeddings of K2,2,2 in S is “as expected”, namely (1+o(1))p12N6.
Here K2,2,2 is actually the 2-blow-up of a triangle.
This condition thus gives us control over the distribution of triangles in the sparse graph GS:
knowing that we have approximately the correct count for K2,2,2 is enough
to control distribution of triangles.
For technical reasons, in fact we want this to be true not only for
K2,2,2 but all of its subgraphs H.
Now, let’s move on to the weighted version.
Let’s consider a tripartite graph G, which we can think of as a collection of three functions
μ−zμ−yμ−x:X×Y→R:X×Z→R:Y×Z→R.
We think of μ as normalized so that
E[μ−x]=E[μ−y]=E[μ−z]=1. Then we can define
Definition 4. A weighted tripartite graph
μ=(μ−x,μ−y,μ−z) satisfies the 3-linear forms condition if
and similarly if any of the twelve factors are deleted.
Then the pseudorandomness condition is according to the graph we defined above:
Definition 5. A function ν:Z/N→Z is
satisfies the 3-linear forms condition if E[ν]=1+o(1),
and the tripartite graph μ=(μ−x,μ−y,μ−z) defined by
μ−zμ−yμ−x=ν(2x+y)=ν(x−z)=ν(−y−2z)
satisfies the 3-linear forms condition.
Finally, the relative version of Roth’s theorem which we seek is:
Theorem 6(Relative Roth)
Suppose ν:Z/N→R≥0 satisfies the 3-linear forms condition.
Then for any f:Z/N→R≥0 bounded above by
ν and satisfying E[f]≥δ>0, we have
Λ3(f,f,f)≥Ωδ(1).
2.4. Relative Szemerédi
We of course have:
Theorem 7(Szemerédi)
Suppose k≥3, and f:Z/n→[0,1] with E[f]≥δ. Then
Λk(f,…,f)≥Ωδ(1).
For k>3, rather than considering weighted tripartite graphs,
we consider a (k−1)-uniform k-partite hypergraph.
For example, given ν with E[ν]=1+o(1) and k=4, we use the construction
Thus 4-AP’s correspond to the simplex K4(3) (i.e. a tetrahedron).
We then consider the two-blow-up of the simplex, and require the same uniformity on subgraphs of H.
Here is the compiled version:
Definition 8. A (k−1)-uniform k-partite weighted hypergraph
μ=(μ−i)i=1k satisfies the k-linear forms condition if
This is actually an extension of the cut norm defined on a r-uniform
r-partite hypergraph (not(r−1)-uniform like before!):
if g:X1×⋯×Xr→R is such a graph, we let
Taking g(x1,…,xr)=f(x1+⋯+xr), X1=⋯=Xr=Z/N gives the analogy.
For the second theorem, we define the norm
∥g∥k−1,k□=i=1,…,kmax(∥g−i∥k−1,k−1□).
Theorem 12(Relative simplex counting lemma)
Let μ, g, g be weighted (k−1)-uniform k-partite weighted
hypergraphs on X1∪⋯∪Xk.
Assume that μ satisfies the k-linear forms condition, and 0≤g−i≤μ−i for all i,
0≤g≤1. If ∥g−g∥k−1,k□=o(1) then
One then combines these two results to prove Szemerédi, as follows. Start with f and ν in the theorem.
The k-linear forms condition turns out to imply ∥ν−1∥k−1□=o(1).
So we can find a nearby f by the dense model theorem.
Then, we induce ν, g, g from μ, f, f respectively.
The counting lemma then reduce the bounding of Λk(f,…,f) to the
bounding of Λk(f,…,f),
which is Ωδ(1) by the usual Szemerédi theorem.
4. Arithmetic progressions in primes
We now sketch how to obtain Green-Tao from Relative Szemerédi.
As expected, we need to us the von Mangoldt function Λ.
Unfortunately, Λ is biased (e.g. “all decent primes are odd”).
To get around this, we let w=w(N) tend to infinity slowly with N, and define
W=p≤w∏p.
In the W-trick we consider only primes 1(modW).
The modified von Mangoldt function then is defined by
Λ(n)={Wφ(W)log(Wn+1)0Wn+1 primeelse.
In accordance with Dirichlet, we have ∑n≤NΛ(n)=N+o(N).
So, we need to show now that
Proposition 13. Fix k≥3.
We can find δ=δ(k)>0 such that for N≫1 prime,
we can find ν:Z/N→R≥0 which
satisfies the k-linear forms condition as well as
ν(n)≥δΛ(n)
for N/2≤n<N.
In that case, we can let
f(n)={δΛ(n)0N/2≤n<Nelse.
Then 0≤f≤ν. The presence of N/2≤n<N allows us to avoid
“wrap-around issues” that arise from using Z/N instead of Z.
Relative Szemerédi then yields the result.
For completeness, we state the construction.
Let χ:R→[0,1] be supported on [−1,1] with χ(0)=1,
and define a normalizing constant cχ=∫0∞∣χ′(x)∣2dx.
Inspired by Λ(n)=∑d∣nμ(d)log(n/d), we define a truncated Λ by
Λχ,R(n)=logRd∣n∑μ(d)χ(logRlogd).
Let k≥3, R=Nk−12−k−3. Now, we define ν by