vEnhance

Jun 12, 2015

Proof of Dirichlet's Theorem on Arithmetic Progressions

In this post I will sketch a proof of Dirichlet's Theorem in the following form:

Theorem 1 (Dirichlet's Theorem on Arithmetic Progressions)

Let
$$\psi(x;q,a) = \sum_{\substack{n \le x \\ n \equiv a \bmod q}} \Lambda(n).$$
Let $N$ be a positive constant and let $\gcd(a,q) = 1$. Then for some constant $C(N) > 0$ depending on $N$, we have, uniformly for all $q \le (\log x)^N$,
$$\psi(x;q,a) = \frac{1}{\phi(q)} x + O\left( x \exp\left( -C(N) \sqrt{\log x} \right) \right).$$

Prerequisites: complex analysis, the previous two posts, possibly also Dirichlet characters. It is probably also advisable to read the last chapter of Hildebrand first, since it contains a much more thorough treatment of an easier version of the argument, in which the zeros of $L$-functions are less involved.

Warning: I really don’t understand what I am saying. It is at least 50% likely that this post contains a major error, and 90% likely that there will be multiple minor errors. Please kindly point out any screw-ups of mine; thanks!

Throughout this post: $s = \sigma + it$ and $\rho = \beta + i\gamma$, as always. All $O$-estimates have absolute constants unless noted otherwise; $A \ll B$ means $A = O(B)$, and $A \asymp B$ means $A \ll B \ll A$. By abuse of notation, $\mathcal L$ will be short for either $\log q(\left\lvert t \right\rvert + 2)$ or $\log q(\left\lvert T \right\rvert + 2)$, depending on context.

1. Outline

Here are the main steps:

  1. We introduce the Dirichlet characters $\chi : \mathbb N \rightarrow \mathbb C$, which serve as a roots-of-unity filter, extracting the terms $\equiv a \pmod q$. We will see that this reduces the problem to estimating the function $\psi(x,\chi) = \sum_{n \le x} \chi(n) \Lambda(n)$.
  2. We introduce the $L$-function $L(s,\chi)$, the generalization of $\zeta$ for arithmetic progressions. We establish a functional equation in terms of $\xi(s,\chi)$, much like with $\zeta$, and use it to extend $L(s,\chi)$ to a meromorphic function on the entire complex plane.
  3. We use a variation on the Perron transform in order to convert this sum into an integral involving the $L$-function $L(s,\chi)$. We truncate this integral to $[c-iT, c+iT]$; this introduces an error $E_{\text{truncate}}$ that can be computed immediately, though in this presentation we delay its computation until later.
  4. We shift the contour as in the proof of the Prime Number Theorem in order to estimate the above integral in terms of the zeros of $L(s,\chi)$. The main term emerges as a residue, so we want to show that the error $E_{\text{contour}}$ from the rest of the contour goes to zero. Moreover, we pick up a sum $\sum_\rho \frac{x^\rho}{\rho}$ of residues coming from the zeros of the $L$-function.
  5. By applying Hadamard's Theorem to the entire function $\xi(s,\chi)$, we can write $\frac{L'}{L}(s,\chi)$ in terms of its zeros. This has three consequences:

     1. We can use this to get bounds on $\frac{L'}{L}(s,\chi)$.
     2. Using a 3-4-1 trick, this gives us information on the horizontal distribution of the zeros $\rho$; the dreaded Siegel zeros appear here.
     3. We can get an expression which lets us estimate the vertical distribution of the zeros in the critical strip (specifically, the number of zeros with $\gamma \in [T-1, T+1]$).

     The first and third points let us compute $E_{\text{contour}}$.

  6. The horizontal zero-free region gives us an estimate of $\sum_\rho \frac{x^\rho}{\rho}$, which along with $E_{\text{contour}}$ and $E_{\text{truncate}}$ gives us the value of $\psi(x,\chi)$.
  7. We use Siegel's Theorem to handle the potential Siegel zero that might arise.

Possibly helpful diagram:

Contour integral for zeta function.

The pink dots denote zeros; we believe the nontrivial ones all lie on the critical line $\sigma = \frac12$ by the Generalized Riemann Hypothesis, but they could actually be anywhere in the green strip.

2. Dirichlet Characters

2.1. Definitions

Recall that a Dirichlet character $\chi$ modulo $q$ is a completely multiplicative function $\chi : \mathbb N \rightarrow \mathbb C$ which is also periodic modulo $q$, and vanishes for all $n$ with $\gcd(n,q) > 1$. The trivial character (denoted $\chi_0$) is defined by $\chi_0(n) = 1$ when $\gcd(n,q) = 1$ and $\chi_0(n) = 0$ otherwise.

In particular, $\chi(1) = 1$, and thus each nonzero value of $\chi$ is a $\phi(q)$-th root of unity; there are also exactly $\phi(q)$ Dirichlet characters modulo $q$. Observe that $\chi(-1)^2 = \chi(1) = 1$, so $\chi(-1) = \pm 1$. We shall call $\chi$ even if $\chi(-1) = +1$ and odd otherwise.

If $\tilde q \mid q$, then a character $\tilde\chi$ modulo $\tilde q$ induces a character $\chi$ modulo $q$ in a natural way: let $\chi = \tilde\chi$, except at the points where $\gcd(n,q) > 1$ but $\gcd(n,\tilde q) = 1$, where we let $\chi$ be zero instead. (In effect, we are throwing away information about $\tilde\chi$.) A character $\chi$ not induced by any smaller character is called primitive.
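To make this concrete, here is a quick numerical sketch (the modulus $q = 5$ and primitive root $2$ are illustrative choices of mine, not from the original argument): since $(\mathbb Z/5\mathbb Z)^\times$ is cyclic of order $4$, each character is determined by where it sends a generator.

```python
import cmath

q = 5   # illustrative prime modulus
g = 2   # a primitive root mod 5: powers 2^0..2^3 are 1, 2, 4, 3

# "discrete log" table: n = g^ind[n] (mod q) for gcd(n, q) = 1
ind = {pow(g, k, q): k for k in range(q - 1)}

def make_character(j):
    """The character sending the generator g to exp(2*pi*i*j/phi(q))."""
    def chi(n):
        if n % q == 0:  # gcd(n, q) > 1, since q is prime
            return 0
        return cmath.exp(2j * cmath.pi * j * ind[n % q] / (q - 1))
    return chi

characters = [make_character(j) for j in range(q - 1)]  # phi(5) = 4 characters

# complete multiplicativity: chi(mn) = chi(m) * chi(n) for all m, n
for chi in characters:
    for m in range(1, 25):
        for n in range(1, 25):
            assert abs(chi(m * n) - chi(m) * chi(n)) < 1e-9
```

Running this confirms that all $\phi(5) = 4$ characters are completely multiplicative, with `characters[0]` being the trivial character $\chi_0$.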

2.2. Orthogonality

The key fact about Dirichlet characters which will enable us to prove the theorem is the following trick:

Theorem 2 (Orthogonality of Dirichlet Characters)

We have

$$\sum_{\chi \bmod q} \chi(a) \overline{\chi}(b) = \begin{cases} \phi(q) & \text{if } a \equiv b \pmod q \text{ and } \gcd(a,q) = 1 \\ 0 & \text{otherwise}. \end{cases}$$

(Here $\overline{\chi}$ is the conjugate of $\chi$, which is essentially a multiplicative inverse.)

This is in some sense a slightly fancier form of the old roots-of-unity filter. Specifically, it is not too hard to show that $\sum_{\chi} \chi(n)$ vanishes for $n \not\equiv 1 \pmod q$, while it equals $\phi(q)$ for $n \equiv 1 \pmod q$.
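A numerical sanity check of the orthogonality relation, with the same illustrative modulus $q = 5$ and characters built from the primitive root $2$:

```python
import cmath
from math import gcd

q, g = 5, 2  # illustrative prime modulus and primitive root
ind = {pow(g, k, q): k for k in range(q - 1)}

def chi(j, n):
    """The j-th Dirichlet character mod q, built from the primitive root."""
    if gcd(n, q) > 1:
        return 0
    return cmath.exp(2j * cmath.pi * j * ind[n % q] / (q - 1))

def orth(a, b):
    """Sum over all characters mod q of chi(a) * conj(chi(b))."""
    return sum(chi(j, a) * chi(j, b).conjugate() for j in range(q - 1))

assert abs(orth(3, 8) - 4) < 1e-9   # a = b (mod 5), gcd(a, 5) = 1: get phi(5)
assert abs(orth(2, 3)) < 1e-9       # a != b (mod 5): the sum vanishes
assert abs(orth(5, 5)) < 1e-9       # gcd(a, 5) > 1: also vanishes
```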

2.3. Dirichlet $L$-Functions

Now we can define the associated $L$-function by
$$L(s,\chi) = \sum_{n \ge 1} \chi(n) n^{-s} = \prod_p \left( 1 - \chi(p) p^{-s} \right)^{-1}.$$
The basic properties of these $L$-functions are as follows.

Theorem 3. Let $\chi$ be a Dirichlet character modulo $q$. Then

  1. If $\chi \ne \chi_0$, then $L(s,\chi)$ can be extended to a holomorphic function on $\sigma > 0$.
  2. If $\chi = \chi_0$, then $L(s,\chi)$ can be extended to a meromorphic function on $\sigma > 0$, with a single simple pole at $s = 1$ of residue $\phi(q)/q$.

The proof is pretty much the same as for zeta.

Observe that if $q = 1$, then $L(s,\chi) = \zeta(s)$.
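As a concrete example (my choice of character, not part of the original post): for the nontrivial character mod $4$, the Dirichlet series at $s = 1$ is the Leibniz series $1 - \frac13 + \frac15 - \cdots = \frac{\pi}{4}$, which we can confirm numerically.

```python
from math import pi

def chi4(n):
    """The nontrivial character mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

N = 10**6
L1 = sum(chi4(n) / n for n in range(1, N + 1))

# alternating series, so the error is below the first omitted term ~ 1/N
assert abs(L1 - pi / 4) < 1e-5
```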

2.4. The Functional Equation for Dirichlet $L$-Functions

While I won't prove it here, one can show the following analog of the functional equation for Dirichlet $L$-functions.

Theorem 4 (The Functional Equation of Dirichlet $L$-Functions)

Assume that $\chi$ is a character modulo $q$, possibly trivial or imprimitive. Let $a = 0$ if $\chi$ is even and $a = 1$ if $\chi$ is odd. Let
$$\xi(s,\chi) = q^{\frac12(s+a)} \gamma(s,\chi) L(s,\chi) \left[ s(1-s) \right]^{\delta(\chi)}$$
where $\gamma(s,\chi) = \pi^{-\frac12(s+a)} \Gamma\left( \frac{s+a}{2} \right)$, and $\delta(\chi) = 1$ if $\chi = \chi_0$ and zero otherwise. Then

  1. $\xi$ is entire.
  2. If $\chi$ is primitive, then $\xi(s,\chi) = W(\chi) \xi(1-s, \overline{\chi})$ for some complex number $W(\chi)$ with $\left\lvert W(\chi) \right\rvert = 1$.

Unlike the $\zeta$ case, the constant $W(\chi)$ is nastier to describe; computing it involves some Gauss sums that would be too involved for this post. However, I should point out that it is the Gauss sum here that requires $\chi$ to be primitive. As before, $\xi$ gives us a meromorphic continuation of $L(s,\chi)$ to the entire complex plane. We obtain trivial zeros of $L(s,\chi)$ as follows:

  • For $\chi = \chi_0$, we get zeros at $-2$, $-4$, $-6$ and so on.
  • For even $\chi \neq \chi_0$, we get zeros at $0$, $-2$, $-4$, $-6$ and so on (since the pole of $\Gamma(\frac12 s)$ at $s = 0$ is no longer canceled).
  • For $\chi$ odd, we get zeros at $-1$, $-3$, $-5$ and so on.

3. Obtaining the Contour Integral

3.1. Orthogonality

Using the trick of orthogonality, we may write

$$\begin{aligned} \psi(x;q,a) &= \sum_{n \le x} \frac{1}{\phi(q)} \sum_{\chi \bmod q} \chi(n) \overline{\chi}(a) \Lambda(n) \\ &= \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(a) \left( \sum_{n \le x} \chi(n) \Lambda(n) \right). \end{aligned}$$

Thus, to estimate $\psi(x;q,a)$, it suffices to estimate each sum $\psi(x,\chi) = \sum_{n \le x} \chi(n) \Lambda(n)$.
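Here is a numerical check of this reduction for the illustrative parameters $q = 5$, $a = 2$, $x = 500$ (again constructing the characters from the primitive root $2$):

```python
import cmath
from math import log, gcd

def Lambda(n):
    """Von Mangoldt: log p if n is a prime power p^k, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0

q, g, a, x = 5, 2, 2, 500   # illustrative parameters
ind = {pow(g, k, q): k for k in range(q - 1)}

def chi(j, n):
    if gcd(n, q) > 1:
        return 0
    return cmath.exp(2j * cmath.pi * j * ind[n % q] / (q - 1))

# left side: psi(x; q, a) summed directly
lhs = sum(Lambda(n) for n in range(1, x + 1) if n % q == a)

# right side: (1/phi(q)) * sum over chi of conj(chi(a)) * psi(x, chi)
rhs = sum(
    chi(j, a).conjugate() * sum(chi(j, n) * Lambda(n) for n in range(1, x + 1))
    for j in range(q - 1)
) / (q - 1)   # phi(q) = q - 1 for prime q

assert abs(lhs - rhs.real) < 1e-6 and abs(rhs.imag) < 1e-6
```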

3.2. Introducing the Logarithmic Derivative of the $L$-Function

First, we realize $\chi(n) \Lambda(n)$ as the coefficients of a Dirichlet series. Recall that last time we saw that $-\frac{\zeta'}{\zeta}$ gave $\Lambda$ as coefficients. We can do the same thing with $L$-functions: put
$$\log L(s,\chi) = -\sum_p \log\left( 1 - \chi(p) p^{-s} \right).$$
Taking the derivative, we obtain

Theorem 5. For any $\chi$ (possibly trivial or imprimitive) we have
$$-\frac{L'}{L}(s,\chi) = \sum_{n \ge 1} \Lambda(n) \chi(n) n^{-s}.$$

Proof:

$$\begin{aligned} -\frac{L'}{L}(s,\chi) &= \sum_p \log p \cdot \frac{\chi(p) p^{-s}}{1 - \chi(p) p^{-s}} \\ &= \sum_p \log p \sum_{m \ge 1} \chi(p^m) (p^m)^{-s} \\ &= \sum_{n \ge 1} \Lambda(n) \chi(n) n^{-s} \end{aligned}$$

as desired. \Box
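We can spot-check Theorem 5 numerically at $s = 2$, say for the nontrivial character mod $4$ (an illustrative choice), computing $L$ and $L'$ directly from their Dirichlet series:

```python
from math import log

def chi4(n):
    """The nontrivial character mod 4 (illustrative choice)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def Lambda(n):
    """Von Mangoldt: log p if n is a prime power p^k, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0

N, s = 3000, 2.0
L       = sum(chi4(n) * n**-s for n in range(1, N + 1))
L_prime = sum(-log(n) * chi4(n) * n**-s for n in range(1, N + 1))
lhs = -L_prime / L
rhs = sum(Lambda(n) * chi4(n) * n**-s for n in range(1, N + 1))

assert abs(lhs - rhs) < 1e-2  # both partial sums approximate -L'/L(2, chi)
```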

3.3. The Truncation Trick

Now, we unveil the trick at the heart of the proof of Perron’s Formula in the last post. I will give a more precise statement this time, by stating where this integral comes from:

Lemma 6 (Truncated Version of Perron Lemma)

For any $c, y, T > 0$ define
$$I(y,T) = \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \frac{y^s}{s} \, ds.$$
Then $I(y,T) = \delta(y) + E(y,T)$, where $\delta(y)$ is the indicator function defined by
$$\delta(y) = \begin{cases} 0 & 0 < y < 1 \\ \frac12 & y = 1 \\ 1 & y > 1 \end{cases}$$
and the error term $E(y,T)$ is given by

$$\left\lvert E(y,T) \right\rvert < \begin{cases} y^c \min\left\{ 1, \frac{1}{T \left\lvert \log y \right\rvert} \right\} & y \neq 1 \\ cT^{-1} & y = 1. \end{cases}$$

In particular, $I(y,\infty) = \delta(y)$.

In effect, the integral from $c - iT$ to $c + iT$ is intended to mimic an indicator function. We can use it to extract the terms of the Dirichlet series of $-\frac{L'}{L}(s,\chi)$ which happen to have $n \le x$, simply by appealing to $\delta(x/n)$. Unfortunately, we cannot take $T = \infty$, because later on this would introduce a sum which is not absolutely convergent, meaning we will have to live with the error term introduced by picking a particular finite value of $T$.
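Here is a direct numerical sketch of Lemma 6 showing that the truncated integral really does mimic the indicator function, with error consistent with the bound $y^c \min\{1, \frac{1}{T|\log y|}\}$; the parameter values $c = 2$, $T = 200$ are my own choices.

```python
import numpy as np

def I(y, c=2.0, T=200.0, steps=400001):
    """Approximate (1/2 pi i) * integral of y^s / s over s = c + it, |t| <= T."""
    t = np.linspace(-T, T, steps)
    s = c + 1j * t
    # ds = i dt, so the i cancels: the integral becomes (1/2 pi) * int y^s/s dt
    vals = y**s / s
    return float((vals.sum() * (t[1] - t[0])).real / (2 * np.pi))

assert abs(I(2.0) - 1.0) < 0.05   # y > 1: close to delta(y) = 1
assert abs(I(0.5)) < 0.05         # 0 < y < 1: close to delta(y) = 0
```

For $y = 2$ the error bound gives $|E| \le 2^2/(200 \log 2) \approx 0.03$, and the computed values land well inside that window.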

3.4. Applying the Truncation

Let's do so: define $\psi(x;\chi) = \sum_{n \ge 1} \delta(x/n) \Lambda(n) \chi(n)$, which is almost the same as $\sum_{n \le x} \Lambda(n) \chi(n)$, except that if $x$ is actually an integer then the term $\Lambda(x)\chi(x)$ should be halved (since $\delta(1) = \frac12$). Now we can substitute in our integral representation, and obtain

$$\begin{aligned} \psi(x;\chi) &= \sum_{n \ge 1} \Lambda(n) \chi(n) \left( \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \frac{(x/n)^s}{s} \, ds - E(x/n, T) \right) \\ &= -\sum_{n \ge 1} \Lambda(n) \chi(n) E(x/n, T) + \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \sum_{n \ge 1} \left( \Lambda(n) \chi(n) n^{-s} \right) \frac{x^s}{s} \, ds \\ &= E_{\text{truncate}} + \frac{1}{2\pi i} \int_{c-iT}^{c+iT} -\frac{L'}{L}(s,\chi) \frac{x^s}{s} \, ds \end{aligned}$$

where $E_{\text{truncate}} = -\sum_{n \ge 1} \Lambda(n) \chi(n) E(x/n, T)$. Estimating this is quite ugly, so we defer it to later.

4. Applying the Residue Theorem

4.1. Primitive Characters

Exactly like before, we are going to use a contour to estimate the value of
$$\frac{1}{2\pi i} \int_{c-iT}^{c+iT} -\frac{L'}{L}(s,\chi) \frac{x^s}{s} \, ds.$$
Let $U$ be a large half-integer (so that there are no zeros of $L(s,\chi)$ with $\operatorname{Re} s = -U$). We then re-route the integration path along the contour $c-iT \rightarrow -U-iT \rightarrow -U+iT \rightarrow c+iT$. During this process we pick up residues, which are the interesting terms.

First, assume that $\chi$ is primitive, so the functional equation applies and we get the information we want about zeros.

  • If $\chi = \chi_0$, then we pick up a residue of $+x$, corresponding to

    $$(-1) \cdot \left( -\frac{x^1}{1} \right) = +x.$$
    This is the "main term". Per laziness, $\delta(\chi) x$ it is.

  • Depending on whether $\chi$ is odd or even, we detect the trivial zeros, whose contribution we can express succinctly by

    $$\sum_{m \ge 1} \frac{x^{a-2m}}{2m-a}.$$
    Actually, I really ought to truncate this at $U$, but since I'm going to let $U \rightarrow \infty$ in a moment I really don't want to take the time to do so; the difference is negligible.

  • We obtain a residue at $s = 0$, coming from the pole of $\frac{x^s}{s}$ there, which involves a constant we denote $b(\chi)$. If $\chi$ is odd, $b(\chi)$ equals the value of $\frac{L'}{L}(0,\chi)$ straight-out; if $\chi$ is even, $b(\chi)$ is instead the constant term of $\frac{L'}{L}(s,\chi)$ near $s = 0$ (since there the whole function has a pole at $s = 0$).
  • If $\chi \ne \chi_0$ is even, then $L(s,\chi)$ itself has a zero at $s = 0$, so we are in worse shape. We recall that

    $$\frac{L'}{L}(s,\chi) = \frac 1s + b(\chi) + \dots$$
    and notice that
    $$\frac{x^s}{s} = \frac 1s + \log x + \dots,$$
    so we pick up an extra residue of $-\log x$. So, call this a bonus of $-(1-a)\log x$.

  • Finally, there are the hard-to-understand zeros in the strip $0 < \sigma < 1$. If $\rho = \beta + i\gamma$ is such a zero, then it contributes a residue of $-\frac{x^\rho}{\rho}$. We only pick up the zeros with $\left\lvert \gamma \right\rvert < T$ in our rectangle, so we get a term

    $$-\sum_{\rho,\, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho}.$$
    Letting $U \rightarrow \infty$, we derive that
    $$\begin{aligned} &\phantom{=} \frac{1}{2\pi i} \int_{c-iT}^{c+iT} -\frac{L'}{L}(s,\chi) \frac{x^s}{s} \, ds \\ &= \delta(\chi) x + E_{\text{contour}} + \sum_{m \ge 1} \frac{x^{a-2m}}{2m-a} - b(\chi) - (1-a)\log x - \sum_{\rho,\, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho} \end{aligned}$$
    at least for primitive characters. Note that the sum over the zeros is not absolutely convergent without the restriction $\left\lvert \gamma \right\rvert < T$ (with it, the sum becomes a finite one).

4.2. Transition to nonprimitive characters

The next step is to notice that if $\chi$ modulo $q$ happens to be imprimitive, induced by $\tilde\chi$ of modulus $\tilde q$, then actually $\psi(x,\chi)$ and $\psi(x,\tilde\chi)$ are not so different. Specifically, they differ by at most

$$\begin{aligned} \left\lvert \psi(x,\chi) - \psi(x,\tilde\chi) \right\rvert &\le \sum_{\substack{n \le x \\ \gcd(n,\tilde q) = 1 \\ \gcd(n,q) > 1}} \Lambda(n) \le \sum_{\substack{n \le x \\ \gcd(n,q) > 1}} \Lambda(n) \\ &\le \sum_{p \mid q} \sum_{p^k \le x} \log p \le \sum_{p \mid q} \log x \le (\log q)(\log x) \end{aligned}$$

and so our above formula in fact holds for any character $\chi$, if we are willing to add an error term of $(\log q)(\log x)$. This works even if $\chi$ is trivial; moreover $\tilde q \le q$, so we will just simplify notation by omitting the tildes.
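As a toy example of this comparison (my own choice of characters): take $\tilde\chi$ the nontrivial character mod $3$ and $\chi$ the character mod $12$ it induces. The two $\psi$ sums differ only at powers of $2$, and the difference is comfortably within $(\log q)(\log x)$:

```python
from math import log, gcd

def Lambda(n):
    """Von Mangoldt: log p if n is a prime power p^k, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0

def chi3(n):
    """Nontrivial character mod 3."""
    return [0, 1, -1][n % 3]

def chi12(n):
    """The character mod 12 induced by chi3: zero out gcd(n, 12) > 1."""
    return chi3(n) if gcd(n, 12) == 1 else 0

q, x = 12, 2000
psi_tilde = sum(chi3(n) * Lambda(n) for n in range(1, x + 1))
psi       = sum(chi12(n) * Lambda(n) for n in range(1, x + 1))

assert abs(psi - psi_tilde) <= log(q) * log(x)
```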

Anyways, $(\log q)(\log x)$ is piddling compared to all the other error terms in the problem, and we can swallow a lot of the boring residues into a new term, say $E_{\text{tiny}} \le (\log q + 1)(\log x) + 2$. Thus we have

$$\psi(x,\chi) = \delta(\chi) x + E_{\text{contour}} + E_{\text{truncate}} + E_{\text{tiny}} - b(\chi) - \sum_{\rho,\, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho}.$$

Unfortunately, the constant $b(\chi)$ depends on $\chi$ and cannot be absorbed. We will also estimate $E_{\text{contour}}$ in the error term party.

5. Distribution of Zeros

In order to estimate $\sum_{\rho,\, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho}$ we will need information on both the vertical and horizontal distribution of the zeros. It also turns out this will help us compute $E_{\text{contour}}$.

5.1. Applying Hadamard’s Theorem

Let $\chi$ be primitive modulo $q$. As we saw,

$$\xi(s,\chi) = (q/\pi)^{\frac12 s + \frac12 a} \Gamma\left( \frac{s+a}{2} \right) L(s,\chi) \left( s(1-s) \right)^{\delta(\chi)}$$

is entire. It is also easily seen to have order $1$, since no term grows much more than exponentially in $s$ (using Stirling to handle the $\Gamma$ factor). Thus by Hadamard's factorization theorem, we may put
$$\xi(s,\chi) = e^{A(\chi) + B(\chi)s} \prod_\rho \left( 1 - \frac{s}{\rho} \right) e^{\frac{s}{\rho}}.$$
Taking a logarithmic derivative and cleaning up, we derive the following lemma.

Lemma 7 (Hadamard Expansion of Logarithmic Derivative)

For any primitive character $\chi$ (possibly trivial) we have

$$\begin{aligned} -\frac{L'}{L}(s,\chi) &= \frac12 \log\frac{q}{\pi} + \frac12 \frac{\Gamma'\left(\frac12 s + \frac12 a\right)}{\Gamma\left(\frac12 s + \frac12 a\right)} \\ &\qquad - B(\chi) - \sum_\rho \left( \frac{1}{s-\rho} + \frac{1}{\rho} \right) + \delta(\chi) \left( \frac{1}{s-1} + \frac 1s \right). \end{aligned}$$

Proof: On one hand, we have

$$\log \xi(s,\chi) = A(\chi) + B(\chi) s + \sum_\rho \left( \log\left( 1 - \frac{s}{\rho} \right) + \frac{s}{\rho} \right).$$

On the other hand

$$\log \xi(s,\chi) = \frac{s+a}{2} \log\frac{q}{\pi} + \log \Gamma\left( \frac{s+a}{2} \right) + \log L(s,\chi) + \delta(\chi) \left( \log s + \log(1-s) \right).$$

Taking the derivative of both sides and setting them equal, we have on the left-hand side

$$B(\chi) + \sum_\rho \left( \frac{1}{1-\frac{s}{\rho}} \cdot \frac{1}{-\rho} + \frac{1}{\rho} \right) = B(\chi) + \sum_\rho \left( \frac{1}{s-\rho} + \frac{1}{\rho} \right)$$

and on the right-hand side

$$\frac12 \log\frac{q}{\pi} + \frac12 \frac{\Gamma'}{\Gamma}\left( \frac{s+a}{2} \right) + \frac{L'}{L}(s,\chi) + \delta(\chi) \left( \frac 1s + \frac{1}{s-1} \right).$$

Equating these gives the desired result. \Box

This will be useful in controlling things later. The $B(\chi)$ is a constant that turns out to be surprisingly annoying; it is tied to $b(\chi)$ from the contour, so we will need to deal with it.

5.2. A Bound on the Logarithmic Derivative

We will frequently take the real part of this. Using Stirling, the short version is:

Lemma 8 (Logarithmic Derivative Bound)

Let $\sigma \ge 1$ and let $\chi$ be primitive (possibly trivial). Then

$$\operatorname{Re}\left[ -\frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} \right] = \begin{cases} O(\mathcal L) - \operatorname{Re} \sum_\rho \frac{1}{s-\rho} + \operatorname{Re} \frac{\delta(\chi)}{s-1} & 1 \le \sigma \le 2 \\ O(\mathcal L) - \operatorname{Re} \sum_\rho \frac{1}{s-\rho} & 1 \le \sigma \le 2, \ \left\lvert t \right\rvert \ge 2 \\ O(1) & \sigma \ge 2. \end{cases}$$

Proof: The claim is obvious for $\sigma \ge 2$, since we can then bound the quantity by $-\frac{\zeta'(\sigma)}{\zeta(\sigma)} \le -\frac{\zeta'(2)}{\zeta(2)}$, the series representation being valid in that range. The second case follows from the first by noting that $\operatorname{Re} \frac{1}{s-1} < 1$ when $\left\lvert t \right\rvert \ge 2$. So it just suffices to establish the bound $O(\mathcal L) - \operatorname{Re} \sum_\rho \frac{1}{s-\rho} + \operatorname{Re} \frac{\delta(\chi)}{s-1}$ where $1 \le \sigma \le 2$ and $\chi$ is primitive.

First, we claim that $\operatorname{Re} B(\chi) = -\operatorname{Re} \sum_\rho \frac{1}{\rho}$. We use the following trick:

$$B(\chi) = \frac{\xi'(0,\chi)}{\xi(0,\chi)} = -\frac{\xi'(1,\overline{\chi})}{\xi(1,\overline{\chi})} = -\overline{B(\chi)} - \sum_{\overline{\rho}} \left( \frac{1}{1-\overline{\rho}} + \frac{1}{\overline{\rho}} \right)$$

where the middle equality comes from the functional equation, and the two ends come from taking the logarithmic derivative of the Hadamard product directly. By switching $1-\overline{\rho}$ with $\rho$, the claim follows.

Then, the lemma follows rather directly; the $\operatorname{Re} \sum_\rho \frac{1}{\rho}$ has miraculously canceled with $\operatorname{Re} B(\chi)$. To be explicit, we now have

$$-\operatorname{Re} \frac{L'(s,\chi)}{L(s,\chi)} = \frac12 \log\frac{q}{\pi} + \frac12 \operatorname{Re} \frac{\Gamma'\left(\frac12 s + \frac12 a\right)}{\Gamma\left(\frac12 s + \frac12 a\right)} - \sum_\rho \operatorname{Re} \frac{1}{s-\rho} + \operatorname{Re}\left( \frac{\delta(\chi)}{s} + \frac{\delta(\chi)}{s-1} \right)$$

and the first two terms contribute $\log q$ and $\log(\left\lvert t \right\rvert + 2)$, respectively; meanwhile the term $\frac{\delta(\chi)}{s}$ has real part at most $1$, so it is absorbed. $\Box$

Short version: our functional equation lets us relate $L(s,\chi)$ to $L(1-s,\overline{\chi})$ for $\sigma \le 0$ (in fact it's all we have!), so this gives the following corresponding estimate:

Lemma 9 (Far-Left Estimate of Log Derivative)

If $\sigma \le -1$ and $\left\lvert t \right\rvert \ge 2$ we have
$$\frac{L'}{L}(s,\chi) = O\left( \log\left( q \left\lvert s \right\rvert \right) \right).$$

Proof: We have

$$L(1-s,\chi) = W(\chi)\, 2^{1-s} \pi^{-s} q^{s-\frac12} \cos\left( \tfrac12 \pi (s-a) \right) \Gamma(s)\, L(s,\overline{\chi})$$

(the unsymmetric functional equation, which can be obtained from Legendre’s duplication formula). Taking a logarithmic derivative yields

$$\frac{L'}{L}(s,\chi) = \log\frac{q}{2\pi} - \frac12 \pi \tan\left( \tfrac12 \pi (1-s-a) \right) + \frac{\Gamma'}{\Gamma}(1-s) + \frac{L'}{L}(1-s,\overline{\chi}).$$

Because we assumed $\left\lvert t \right\rvert \ge 2$, the tangent function is bounded, as $s$ is sufficiently far from any of its poles along the real axis. Also, $\operatorname{Re}(1-s) \ge 2$ implies that the $\frac{L'}{L}$ term is bounded. Finally, the logarithmic derivative of $\Gamma$ contributes $\log \left\lvert s \right\rvert$ according to Stirling. So the total error is $O(\log q) + O(1) + O(\log \left\lvert s \right\rvert) + O(1)$, and this gives the conclusion. $\Box$

5.3. Horizontal Distribution

I claim that:

Theorem 10 (Horizontal Distribution Bound)

Let $\chi$ be a character modulo $q$, possibly trivial or imprimitive. There exists an absolute constant $c_1$ with the following properties:

  1. If $\chi$ is complex, then there are no zeros in the region $\sigma \ge 1 - \frac{c_1}{\mathcal L}$.
  2. If $\chi$ is real, there are no zeros in the region $\sigma \ge 1 - \frac{c_1}{\mathcal L}$, with at most one exception; this exceptional zero must be real and simple.

Such bad zeros are called Siegel zeros, and I will denote them $\beta_S$. The important part about this estimate is that it does not depend on $\chi$ but rather on $q$. We need the relaxation to non-primitive characters, since we will use them in the proof of Landau's Theorem.

Proof: First, assume $\chi$ is both primitive and nontrivial.

By the 3-4-1 lemma applied to $\log L(s,\chi)$ we derive that

$$3 \operatorname{Re}\left[ -\frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} \right] + 4 \operatorname{Re}\left[ -\frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} \right] + \operatorname{Re}\left[ -\frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \right] \ge 0.$$

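The 3-4-1 coefficients work because of the elementary identity $3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^2 \ge 0$; a quick numerical confirmation:

```python
import math

# 3 + 4cos(theta) + cos(2theta) = 2(1 + cos(theta))^2 >= 0 for all theta
for k in range(10000):
    theta = 2 * math.pi * k / 10000
    value = 3 + 4 * math.cos(theta) + math.cos(2 * theta)
    assert abs(value - 2 * (1 + math.cos(theta)) ** 2) < 1e-9
    assert value > -1e-9
```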
This is cool because we already know that

$$\operatorname{Re}\left[ -\frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} \right] < O(\mathcal L) - \operatorname{Re} \sum_\rho \frac{1}{s-\rho}.$$

We now assume $\sigma > 1$.

In particular, since $\operatorname{Re} \rho < 1$ for any zero $\rho$, we now have $\operatorname{Re} \frac{1}{s-\rho} > 0$. So we are free to throw out as many terms of the sum as we want.

If $\chi^2$ is primitive, then everything is clear. Let $\rho = \beta + i\gamma$ be a zero. Then

$$\begin{aligned} \operatorname{Re}\left[ -\frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} \right] &\le \frac{1}{\sigma-1} + O(1) \\ \operatorname{Re}\left[ -\frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} \right] &\le O(\mathcal L) - \operatorname{Re} \frac{1}{s-\rho} \\ \operatorname{Re}\left[ -\frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \right] &\le O(\mathcal L) \end{aligned}$$

where we have dropped all but one term for the second line, and all terms for the third line. If $\chi^2$ is not primitive but at least is not $\chi_0$, then we can replace $\chi^2$ with the primitive character $\tilde\chi_2$ inducing it, for a penalty of at most

$$\begin{aligned} \left\lvert \operatorname{Re} \frac{L'}{L}(s,\tilde\chi_2) - \operatorname{Re} \frac{L'}{L}(s,\chi^2) \right\rvert &\le \sum_{\substack{p^k \\ p \mid q}} \log p \cdot (p^k)^{-\sigma} \\ &= \sum_{p \mid q} \log p \cdot \left( p^{-\sigma} + p^{-2\sigma} + \dots \right) \\ &< \sum_{p \mid q} \log p \cdot 1 \\ &\le \log q \end{aligned}$$

just like earlier: Λ\Lambda is usually zero, so we just look at the differing terms! The Dirichlet series really are practically the same. (Here we have also used the fact that σ>1\sigma > 1, and p2p \ge 2.)

Consequently, we derive using 3-4-1 that
$$\frac{3}{\sigma-1} - \frac{4}{\sigma-\beta} + O(\mathcal L) \ge 0,$$
upon selecting $s = \sigma + i\gamma$ so that $s - \rho = \sigma - \beta$. We thus obtain
$$\frac{4}{\sigma-\beta} \le \frac{3}{\sigma-1} + O(\mathcal L).$$
If we select $\sigma = 1 + \frac{\varepsilon}{\mathcal L}$, we get $\frac{4}{1 + \frac{\varepsilon}{\mathcal L} - \beta} \le O(\mathcal L)$, so $\beta < 1 - \frac{c_2}{\mathcal L}$ for some constant $c_2$, initially only for primitive $\chi$.

But the Euler products of the $L$-function of an imprimitive character and of its primitive inducing character differ only in finitely many factors, which contribute zeros only on the line $\sigma = 0$; it follows that this holds for all nontrivial complex characters.

Unfortunately, if we are unlucky enough that $\tilde\chi_2$ is trivial, then replacing $\chi^2$ causes all hell to break loose. (In particular, $\chi$ is real in this case!) The problem is that our new penalty has an extra $\frac{1}{s-1}$, so

$$\left\lvert \operatorname{Re} \frac{L'}{L}(s,\chi^2) - \operatorname{Re} \frac{\zeta'}{\zeta}(s) \right\rvert < \operatorname{Re} \frac{1}{s-1} + \log q.$$

Applied with $s = \sigma + 2it$, we get the weaker
$$\frac{3}{\sigma-1} - \frac{4}{s-\rho} + O(\mathcal L) + \operatorname{Re} \frac{1}{\sigma-1+2it} \ge 0.$$
If $\left\lvert t \right\rvert > \frac{\delta}{\log q}$ for some $\delta$, then the $\frac{1}{\sigma-1+2it}$ term will be at most $\frac{\log q}{\delta} = O(\mathcal L)$ in absolute value, and we live to see another day. In other words, we have unconditionally established a zero-free region of the form

$$\sigma > 1 - \frac{c(\delta)}{\mathcal L} \quad\text{and}\quad \left\lvert t \right\rvert > \frac{\delta}{\log q}$$

for any $\delta > 0$.

Now let's examine $\left\lvert t \right\rvert < \frac{\delta}{\log q}$. We don't have the facilities to prove that there are no bad zeros, but let's at least prove that any bad zero must be simple and real. By Hadamard at $t = 0$, we have
$$-\frac{L'(\sigma,\chi)}{L(\sigma,\chi)} < O(\mathcal L) - \sum_\rho \frac{1}{\sigma-\rho}$$
where we no longer need the real parts, since $\chi$ is real and in particular the roots of $L(s,\chi)$ come in conjugate pairs. The left-hand side can be stupidly bounded below by

$$-\frac{L'(\sigma,\chi)}{L(\sigma,\chi)} \ge -\sum_{n \ge 1} \Lambda(n) n^{-\sigma} = \frac{\zeta'(\sigma)}{\zeta(\sigma)} > -\frac{1}{\sigma-1} - O(1).$$

So $-\frac{1}{\sigma-1} < O(\mathcal L) - \sum_\rho \frac{1}{\sigma-\rho}$. In other words,

$$\sum_\rho \operatorname{Re} \frac{1}{\sigma-\rho} = \sum_\rho \frac{\sigma-\beta}{\left\lvert \sigma-\rho \right\rvert^2} < \frac{1}{\sigma-1} + O(\mathcal L).$$

Then, let $\sigma = 1 + \frac{2\delta}{\log q}$, so

$$\sum_\rho \frac{\sigma-\beta}{\left\lvert \sigma-\rho \right\rvert^2} < \frac{\log q}{2\delta} + O(\mathcal L).$$

The rest is arithmetic; basically one finds that there can be at most one Siegel zero. In particular, since complex zeros come in conjugate pairs, that zero must be real.

It remains to handle the case where $\chi = \chi_0$ is the constant function giving $1$. For this, we observe that the $L$-function in question is just $\zeta$. Thus, we can decrease the constant $c_2$ to some $c_1$ in such a way that the result holds true for $\zeta$ as well, which completes the proof. $\Box$

5.4. Vertical Distribution

We have the following lemma:

Lemma 11 (Sum of Zeros Lemma)

For all real $t$ and primitive characters $\chi$ (possibly trivial), we have
$$\sum_\rho \frac{1}{4 + (t-\gamma)^2} = O(\mathcal L).$$

Proof: We already have
$$\operatorname{Re}\left[ -\frac{L'}{L}(s,\chi) \right] = O(\mathcal L) - \sum_\rho \operatorname{Re} \frac{1}{s-\rho}$$
and we take $s = 2 + it$, noting that the left-hand side is bounded by the constant $-\frac{\zeta'}{\zeta}(2) = 0.569961\ldots$. On the other hand,
$$\operatorname{Re} \frac{1}{2+it-\rho} = \frac{\operatorname{Re}(2+it-\rho)}{\left\lvert (2-\beta) + (t-\gamma)i \right\rvert^2} = \frac{2-\beta}{(2-\beta)^2 + (t-\gamma)^2}$$
and
$$\frac{1}{4+(t-\gamma)^2} \le \frac{2-\beta}{(2-\beta)^2 + (t-\gamma)^2} \le \frac{2}{1+(t-\gamma)^2}$$
as needed. $\Box$
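The constant quoted in the proof is easy to confirm numerically, since $\frac{\zeta'}{\zeta}(2) = -\sum_{n \ge 1} \Lambda(n) n^{-2}$:

```python
from math import log

def Lambda(n):
    """Von Mangoldt: log p if n is a prime power p^k, else 0."""
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0

N = 5000
val = -sum(Lambda(n) / n**2 for n in range(1, N + 1))  # approximates zeta'/zeta(2)

assert abs(val - (-0.569961)) < 1e-3
```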

From this we may deduce that

Lemma 12 (Number of Zeros Near $T$)

For all real $t$ and primitive characters $\chi$ (possibly trivial), the number of zeros $\rho$ with $\gamma \in [t-1, t+1]$ is $O(\mathcal L)$.

In particular, we may perturb any given $T$ by at most $2$ so that the distance between it and the nearest zero is at least $c_0 \mathcal L^{-1}$, for some absolute constant $c_0$.

From this, using the argument principle we can actually also obtain the following: for a real number $T > 0$, the number $N(T,\chi)$ of zeros of $L(s,\chi)$ with imaginary part $\gamma \in [-T, T]$ satisfies
$$N(T,\chi) = \frac{T}{\pi} \log\frac{qT}{2\pi e} + O(\mathcal L).$$
However, we will not need this fact.

6. Error Term Party

Up to now, $c$ has been arbitrary. Assume now $x \ge 6$; thus we can follow the tradition $c = 1 + \frac{1}{\log x} < 2$, so that $c$ is just to the right of the line $\sigma = 1$. This causes $x^c = ex$. We assume also for convenience that $T \ge 2$.

6.1. Estimating the Truncation Error

Recall that

$$\left\lvert E(y,T) \right\rvert < \begin{cases} y^c \min\left\{ 1, \frac{1}{T \left\lvert \log y \right\rvert} \right\} & y \neq 1 \\ cT^{-1} & y = 1. \end{cases}$$

We need to bound the right-hand side of

Etruncaten1Λ(n)χ(n)E(x/n,T)=n1Λ(n)E(x/n,T). \left\lvert E_{\text{truncate}} \right\rvert \le \sum_{n \ge 1} \left\lvert \Lambda(n) \chi(n) \cdot E(x/n, T) \right\rvert = \sum_{n \ge 1} \Lambda(n) \left\lvert E(x/n, T) \right\rvert.

If \frac34 x \le n \le \frac54 x, the log part is small, and this is bad. We have to split into three cases: \frac34 x \le n < x, n = x, and x < n \le \frac54 x. This is necessary because if \Lambda(x) \neq 0 (that is, x is a prime power), the term n = x contributes E(x/n, T) = E(1, T), which must be handled differently.

We let x_{\text{left}} and x_{\text{right}} be the nearest prime powers to x other than x itself, so that \frac34 x \le x_{\text{left}} < x < x_{\text{right}} \le \frac54 x. Thus we have possibly a center term (if x is a prime power, the term n = x), plus the far-left interval and the far-right interval. Let d = \min\left\{ x-x_{\text{left}}, x_{\text{right}}-x \right\} for convenience.
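For concreteness, here is a small Python sketch (the helper names are my own) computing x_{\text{left}}, x_{\text{right}}, and d for a given integer x:

```python
import math

def is_prime_power(n):
    """True if n = p^k for some prime p and k >= 1."""
    if n < 2:
        return False
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return n == 1  # prime power iff p was the only prime factor
    return True  # n itself is prime

def neighbors(x):
    """Nearest prime powers strictly below and above x."""
    left = x - 1
    while not is_prime_power(left):
        left -= 1
    right = x + 1
    while not is_prime_power(right):
        right += 1
    return left, right

x = 100
x_left, x_right = neighbors(x)
d = min(x - x_left, x_right - x)
print(x_left, x_right, d)  # 97 and 101 are prime, so this prints: 97 101 1
```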

  • In the easy case, if n=xn = x we have a contribution of E(1,T)logx<cTlogxE(1,T) \log x < \frac{c}{T}\log x, which is piddling (less than logx\log x).
  • Suppose 34xnxleft1\frac 34x \le n \le x_{\text{left}} - 1. If n=xleftan = x_{\text{left}} - a for some integer 1a14x1 \le a \le \frac 14x, then

    logxnlogxleftxlefta=log(1axleft)axleft \log \frac xn \ge \log \frac{x_{\text{left}}}{x_{\text{left}}-a} = -\log\left( 1 - \frac{a}{x_{\text{left}}} \right) \ge \frac{a}{x_{\text{left}}}

    by using the silly inequality log(1t)t-\log(1-t) \ge t for t<1t < 1. So the contribution in total is at most

    1a14xΛ(n)(x/n)c1TaxleftxleftT1a14xΛ(n)(43)21a169xleftTlogx1a14x1a169(x1)(logx)(log14x+2)T1.9x(logx)2T \begin{aligned} \sum_{1 \le a \le \frac 14 x} \Lambda(n) \cdot (x/n)^c \cdot \frac{1}{T \cdot \frac{a}{x_{\text{left}}}} &\le \frac{x_{\text{left}}}{T} \sum_{1 \le a \le \frac 14 x} \Lambda(n) \cdot \left( \frac 43 \right)^2 \frac 1a \\ &\le \frac{16}{9} \frac{x_{\text{left}}}{T} \log x \sum_{1 \le a \le \frac 14 x} \frac 1a \\ &\le \frac{16}{9} \frac{(x-1) (\log x)(\log \frac 14 x + 2)}{T} \\ &\le \frac{1.9x (\log x)^2}{T} \end{aligned}

    provided x7391x \ge 7391.

  • If n=xleftn = x_{\text{left}}, we have

    logxn=log(1xxleftx)>dx\log \frac xn = -\log\left( 1 - \frac{x-x_{\text{left}}}{x} \right) > \frac{d}{x} Hence in this case, we get an error at most

    Λ(xleft)(xxleft)cmin{1,xTd}<Λ(xleft)(43)2min{1,xTd}169logxmin{1,xTd}. \begin{aligned} \Lambda(x_{\text{left}}) \left( \frac{x}{x_{\text{left}}} \right)^c \min \left\{ 1, \frac{x}{Td} \right\} &< \Lambda(x_{\text{left}}) \left( \frac 43 \right)^2 \min \left\{ 1, \frac{x}{Td} \right\} \\ &\le \frac{16}{9} \log x \min \left\{ 1, \frac{x}{Td} \right\}. \end{aligned}

  • The cases n=xrightn = x_{\text{right}} and xright+1n<54xx_{\text{right}} + 1 \le n < \frac 54x give the same bounds as above, in the same way.
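The two elementary estimates driving the cases above can be spot-checked numerically (a quick sketch; the sample values of x are arbitrary):

```python
import math

# the "silly inequality" -log(1 - t) >= t for 0 <= t < 1
for t in (0.0, 0.1, 0.5, 0.9, 0.99):
    assert -math.log1p(-t) >= t

# for n >= (3/4) x and c = 1 + 1/log x < 2, the factor (x/n)^c is at most (4/3)^2
x = 10_000.0
c = 1 + 1 / math.log(x)
assert c < 2
for n in range(7_500, 12_500):
    assert (x / n) ** c <= (4 / 3) ** 2
```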

Finally, for n outside the interval mentioned above, we in fact have \left\lvert \log x/n \right\rvert > \frac{1}{5}, say, and so all such terms contribute at most

\begin{aligned} \sum_n \Lambda(n) \cdot (x/n)^c \cdot \frac{1}{T \left\lvert \log x/n \right\rvert} &\le \frac{5x^c}{T} \sum_n \Lambda(n) \cdot n^{-c} \\ &= \frac{5ex}{T} \cdot \left\lvert -\frac{\zeta'}{\zeta} (c) \right\rvert \\ &< \frac{5ex}{T} \cdot \left( \frac{1}{c-1} + 0.5773 \right) \\ &\le \frac{14x \log x}{T}. \end{aligned}

(Recall \zeta'/\zeta has a simple pole at s=1, so near s=1 it behaves like -\frac{1}{s-1}; this is where the \frac{1}{c-1} = \log x comes from.)
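The bound -\frac{\zeta'}{\zeta}(c) < \frac{1}{c-1} + 0.5773 can also be checked numerically (a one-sided sketch: the series has positive terms, so a truncated partial sum can only undershoot the full value; the truncation point is my choice):

```python
import math

def von_mangoldt(n):
    """Lambda(n): log p if n = p^k for a prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)

x = 1.0e5
c = 1 + 1 / math.log(x)
# -zeta'/zeta(c) = sum Lambda(n) n^{-c}; truncate at 20000
partial = sum(von_mangoldt(n) * n ** (-c) for n in range(2, 20_000))
assert partial < 1 / (c - 1) + 0.5773
```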

The sum of everything is 3.8x(logx)2+14xlogxT+329logxmin{1,xTd}\le \frac{3.8x(\log x)^2+14x\log x}{T} + \frac{32}{9} \log x \min \left\{ 1, \frac{x}{Td} \right\}. Hence, the grand total across all these terms is the horrible Etruncate5x(logx)2T+3.6logxmin{1,xTd}\boxed{ E_{\text{truncate}} \le \frac{5x(\log x)^2}{T} + 3.6\log x \min \left\{ 1, \frac{x}{Td} \right\}} provided x1.2105x \ge 1.2 \cdot 10^5.

6.2. Estimating the Contour Error

We now need to measure the error along the contour, with U \rightarrow \infty eventually. Throughout assume U \ge 3. Naturally, to estimate the integral, we seek good estimates on \left\lvert \frac{L'}{L}(s, \chi) \right\rvert. For this we appeal to the Hadamard expansion. We break into a couple of cases.

  • First, let’s look at the integral when 1σ2-1 \le \sigma \le 2, so s=σ±iTs = \sigma \pm iT with TT large. We bound the horizontal integral along these regions; by symmetry let’s consider just the top

    1+iTc+iTLL(s,χ)xssds.\int_{-1+iT}^{c+iT} -\frac{L'}{L}(s, \chi) \frac{x^s}{s} ds. Thus we want an estimate of LL-\frac{L'}{L}.

    Lemma 13. Let ss be such that 1σ2-1 \le \sigma \le 2, t2\left\lvert t \right\rvert \ge 2. Assume χ\chi is primitive (possibly trivial), and that tt is not within c0L1c_0\mathcal L^{-1} of any zeros of L(s,χ)L(s, \chi). Then L(s,χ)L(s,χ)=O(L2).\frac{L'(s, \chi)}{L(s, \chi)} = O(\mathcal L^2).

    Proof: Since \left\lvert t \right\rvert \ge 2, we need not worry about \frac{\delta(\chi)}{s-1}, and so we obtain

    L(s,χ)L(s,χ)=12logqπ12Γ(12s+12a)Γ(12s+12a)+B(χ)+ρ(1sρ+1ρ). \frac{L'(s, \chi)}{L(s, \chi)} = -\frac{1}{2} \log\frac{q}{\pi} - \frac{1}{2}\frac{\Gamma'(\frac{1}{2} s + \frac{1}{2} a)}{\Gamma(\frac{1}{2} s + \frac{1}{2} a)} + B(\chi) + \sum_{\rho} \left( \frac{1}{s-\rho} + \frac{1}{\rho} \right).

    and we eliminate B(χ)B(\chi) by computing

    L(σ+it,χ)L(σ+it,χ)L(2+it,χ)L(2+it,χ)=Egamma+ρ(1σ+itρ12+itρ). \frac{L'(\sigma+it, \chi)}{L(\sigma+it, \chi)} - \frac{L'(2+it, \chi)}{L(2+it, \chi)} = E_{\text{gamma}} + \sum_{\rho} \left( \frac{1}{\sigma+it-\rho} - \frac{1}{2+it-\rho} \right).

    where

    Egamma=12Γ(12(2+it)+12a)Γ(12(2+it)+12a)12Γ(12(σ+it)+12a)Γ(12(σ+it)+12a)logT E_{\text{gamma}} = \frac{1}{2}\frac{\Gamma'(\frac{1}{2} (2+it) + \frac{1}{2} a)}{\Gamma(\frac{1}{2} (2+it) + \frac{1}{2} a)} - \frac{1}{2}\frac{\Gamma'(\frac{1}{2} (\sigma+it) + \frac{1}{2} a)}{\Gamma(\frac{1}{2} (\sigma+it) + \frac{1}{2} a)} \ll \log T

    by Stirling (here we use the fact that 1σ2-1 \le \sigma \le 2). For the terms where γ[t1,t+1]\gamma \notin [t-1, t+1] we see that

    1σ+itρ12+itρ=2σσ+itρ2+itρ2σγt23γt26γt2+1. \begin{aligned} \left\lvert \frac{1}{\sigma+it-\rho} - \frac{1}{2+it-\rho} \right\rvert &= \frac{2-\sigma}{\left\lvert \sigma+it-\rho \right\rvert \left\lvert 2+it-\rho \right\rvert} \\ &\le \frac{2-\sigma}{\left\lvert \gamma-t \right\rvert^2} \le \frac{3}{\left\lvert \gamma-t \right\rvert^2} \\ &\le \frac{6}{\left\lvert \gamma-t \right\rvert^2+1}. \end{aligned}

    So the contribution of the sum for γt1\left\lvert \gamma-t \right\rvert \ge 1 can be bounded by O(L)O(\mathcal L), via the vertical sum lemma.

    As for the zeros with smaller imaginary part, we at least have \left\lvert 2+it-\rho \right\rvert \ge 2-\beta > 1 and thus we can reduce the sum to just

    L(σ+it,χ)L(σ+it,χ)L(2+it,χ)L(2+it,χ)=γ[t1,t+1]1σ+itρ+O(L). \frac{L'(\sigma+it, \chi)}{L(\sigma+it, \chi)} - \frac{L'(2+it, \chi)}{L(2+it, \chi)} = \sum_{\gamma\in[t-1,t+1]} \frac{1}{\sigma+it-\rho} + O(\mathcal L).

    Now by the assumption \left\lvert \gamma-t \right\rvert \ge c_0\mathcal L^{-1}, each term of the sum is O(\mathcal L), and there are O(\mathcal L) zeros with imaginary part in that range. Finally, we recall that \frac{L'(2+it, \chi)}{L(2+it, \chi)} is bounded; writing it with its (convergent) Dirichlet series, we see it is at most \left\lvert \frac{\zeta'}{\zeta}(2) \right\rvert in absolute value. \Box At this point, we perturb T as described in the vertical distribution section so that the lemma applies, and can then compute

    \begin{aligned} \left\lvert \int_{-1+iT}^{c+iT} -\frac{L'}{L}(s, \chi) \frac{x^s}{s} ds \right\rvert &< O(\mathcal L^2) \cdot \int_{-1+iT}^{c+iT} \left\lvert \frac{x^s}{s} \right\rvert \left\lvert ds \right\rvert \\ &< O(\mathcal L^2) \int_{-1}^c \frac{x^\sigma}{T} d\sigma \\ &< O(\mathcal L^2) \cdot \frac{x^c}{T \log x} \\ &< O\left(\frac{\mathcal L^2 x}{T \log x}\right) \end{aligned}

    using \left\lvert s \right\rvert \ge T and x^c = ex.

  • Next, for the integral Uσ1-U \le \sigma \le 1, we use the “far-left” estimate to obtain

    \begin{aligned} \left\lvert \int_{-U+iT}^{-1+iT} -\frac{L'}{L}(s, \chi) \frac{x^s}{s} ds \right\rvert &\ll \int_{-\infty+iT}^{-1+iT} \left\lvert \frac{x^s}{s} \right\rvert \cdot \log q \left\lvert s \right\rvert \left\lvert ds \right\rvert \\ &\ll \log q \int_{-\infty+iT}^{-1+iT} \left\lvert \frac{x^s}{s} \right\rvert \left\lvert ds \right\rvert + \int_{-\infty+iT}^{-1+iT} \left\lvert \frac{x^s \log \left\lvert s \right\rvert}{s} \right\rvert \left\lvert ds \right\rvert \\ &\le \frac{\log q}{T} \int_{-\infty}^{-1} x^\sigma d\sigma + \frac{\log T}{T} \int_{-\infty}^{-1} x^\sigma d\sigma \\ &\ll \frac{\mathcal L}{T} \cdot \frac{x^{-1}}{\log x} = \frac{\mathcal L}{Tx \log x}. \end{aligned}

    So the contribution in this case is O(LTxlogx)O\left( \frac{\mathcal L}{T x \log x} \right).

  • Along the left vertical segment \sigma = -U, we can use the same bound

    \begin{aligned} \left\lvert \int_{-U-iT}^{-U+iT} -\frac{L'}{L}(s, \chi) \frac{x^s}{s} ds \right\rvert &\ll \int_{-U-iT}^{-U+iT} \left\lvert \frac{x^s}{s} \right\rvert \cdot \log q \left\lvert s \right\rvert \left\lvert ds \right\rvert \\ &= x^{-U} \cdot \int_{-U-iT}^{-U+iT} \frac{\log q \left\lvert s \right\rvert}{\left\lvert s \right\rvert} \left\lvert ds \right\rvert \\ &\le x^{-U} \cdot \int_{-U-iT}^{-U+iT} \frac{\log q + \log (U+T)}{U} \left\lvert ds \right\rvert \\ &= \frac{2T\left(\log q + \log (U+T)\right)}{Ux^U} \end{aligned}

    which vanishes as UU \rightarrow \infty.
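The elementary integral \int_{-\infty}^{-1} x^\sigma \, d\sigma = \frac{1}{x \log x} used in the far-left bound can be sanity-checked numerically (illustrative only; the cutoff at \sigma = -30 and the sample x are my choices):

```python
import math

x = 50.0
lo, hi, n = -30.0, -1.0, 100_000
h = (hi - lo) / n
# midpoint-rule approximation of the integral of x^sigma over [-30, -1];
# the tail below -30 contributes only about x^{-30}, which is negligible here
approx = sum(x ** (lo + (k + 0.5) * h) * h for k in range(n))
exact = 1 / (x * math.log(x))
assert abs(approx - exact) / exact < 1e-3
```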

So we only have two error terms, O(L2xTlogx)O\left( \frac{\mathcal L^2 x}{T \log x} \right) and O(LTxlogx)O\left( \frac{\mathcal L}{Tx\log x} \right). The first is clearly larger, so we end with EcontourL2xTlogx.\boxed{E_{\text{contour}} \ll \frac{\mathcal L^2x}{T \log x}}.

6.3. The term b(χ)b(\chi)

We can estimate b(χ)b(\chi) as follows:

Lemma 14. For primitive \chi, we have b(\chi) = O(\log q) - \sum_{\left\lvert \gamma \right\rvert < 1} \frac{1}{\rho}. Proof: The idea is to look at \frac{L'}{L}(s,\chi)-\frac{L'}{L}(2,\chi). By subtraction, we obtain

LL(s,χ)LL(2,χ)=ΓΓ(s+a2)+ΓΓ(2+a2)rsrs1+r2+r1+ρ(1sρ12ρ) \begin{aligned} \frac{L'}{L}(s, \chi) -\frac{L'}{L}(2, \chi) &= - \frac{\Gamma'}{\Gamma} \left( \frac{s+a}{2} \right) + \frac{\Gamma'}{\Gamma} \left( \frac{2+a}{2} \right) \\ &- \frac rs - \frac r{s-1} + \frac r2 + \frac r1 \\ &+ \sum_\rho \left( \frac{1}{s-\rho} - \frac{1}{2-\rho} \right) \end{aligned}

Then at s=0s=0 (eliminating the poles), we have LL(s,χ)=O(1)ρ(1ρ+12ρ)\frac{L'}{L}(s, \chi) = O(1) - \sum_{\rho} \left( \frac{1}{\rho}+\frac{1}{2-\rho} \right) where the O(1)O(1) is LL(2,χ)+r2+γ+ΓΓ(1)\frac{L'}{L}(2,\chi) + \frac r2 + \gamma + \frac{\Gamma'}{\Gamma}(1) if a=0a=0 and LL(2,χ)r2ΓΓ(12)+ΓΓ(32)\frac{L'}{L}(2,\chi) - \frac r2 - \frac{\Gamma'}{\Gamma}(\frac{1}{2}) + \frac{\Gamma'}{\Gamma}(\frac32) for a=1a=1. Furthermore,

\left\lvert \sum_{\rho, \left\lvert \gamma \right\rvert > 1} \left( \frac{1}{\rho}+\frac{1}{2-\rho} \right) \right\rvert \le \sum_{\rho, \left\lvert \gamma \right\rvert > 1} \frac{2}{\left\lvert \rho(2-\rho) \right\rvert} \ll \sum_{\rho, \left\lvert \gamma \right\rvert > 1} \frac{1}{\left\lvert 2-\rho \right\rvert^2}

which is O(logq)O(\log q) by our vertical distribution results, and similarly ρ,γ<112ρ=O(logq).\sum_{\rho, \left\lvert \gamma \right\rvert < 1} \frac{1}{2-\rho} = O(\log q). This completes the proof. \Box

Let \beta_S be the Siegel zero, if any; for all the other zeros, we have \left\lvert \frac{1}{\rho} \right\rvert = \frac{1}{\sqrt{\beta^2+\gamma^2}}. We now have two cases.

  • \overline{\chi} \neq \chi. Then \chi is complex and thus has no exceptional zeros; hence each of its zeros has \beta < 1 - \frac{c}{\log q}. Since \overline{\rho} is a zero of \overline{\chi} if and only if 1-\rho is a zero of \chi, it follows that all zeros of \chi have \left\lvert \frac{1}{\rho} \right\rvert = O(\log q). Moreover, in the range \gamma \in [-1,1] there are O(\log q) zeros (putting T=0 in our earlier lemma on vertical distribution).

    Thus, total contribution of the sum is O((logq)2)O\left( (\log q)^2 \right).

  • If χ=χ\overline{\chi} = \chi, then χ\chi is real. The above argument goes through, except that we may have an extra Siegel zero at βS\beta_S; hence there will also be a special zero at 1βS1 - \beta_S. We pull these terms out separately.

Consequently, b(χ)=O((logq)2)1βS11βS.\boxed{b(\chi) = O\left( (\log q)^2 \right) - \frac{1}{\beta_S} - \frac{1}{1-\beta_S}}. By adjusting the constant, we may assume βS>20142015\beta_S > \frac{2014}{2015} if it exists.

7. Computing ψ(x,χ)\psi(x,\chi) and ψ(x;q,a)\psi(x;q,a)

7.1. Summing the Error Terms

We now have, for any T \ge 2, x \ge 6, and primitive \chi modulo q (possibly trivial), the equality

ψ(x,χ)=δ(χ)x+Econtour+Etruncate+Etinyb(χ)ρ,γ<Txρρ. \psi(x, \chi) = \delta(\chi) x + E_{\text{contour}} + E_{\text{truncate}} + E_{\text{tiny}} - b(\chi) - \sum_{\rho, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho}.

where

\begin{aligned} E_{\text{truncate}} &\ll \frac{x(\log x)^2}{T} + \log x \min \left\{ 1, \frac{x}{Td} \right\} \\ E_{\text{contour}} &\ll \frac{\mathcal L^2 x}{T \log x} \\ E_{\text{tiny}} &\ll \log x \log q \\ b(\chi) &= O\left( (\log q)^2 \right) - \frac{1}{\beta_S} - \frac{1}{1-\beta_S}. \end{aligned}

Assume now that TxT \le x, and xx is an integer (hence d1d \ge 1). Then aggregating all the errors gives

ψ(x,χ)=δ(χ)xρ,γ<TxρρxβS1βSx1βS11βS+O(x(logqx)2T). \psi(x, \chi) = \delta(\chi) x - \sum_{\rho, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho} - \frac{x^{\beta_S}-1}{\beta_S} - \frac{x^{1-\beta_S}-1}{1-\beta_S} + O\left( \frac{x (\log qx)^2}{T} \right).

where the sum over ρ\rho now excludes the Siegel zero. We can omit the terms βS1<20152014=O(1)\beta_S^{-1} < \frac{2015}{2014} = O(1), and also

x1βS11βS<x12015logx.\frac{x^{1-\beta_S}-1}{1-\beta_S} < x^{\frac{1}{2015}} \log x. Absorbing things into the error term,

ψ(x,χ)=δ(χ)xxβSβSρ,γ<Txρρ+O(x(logqx)2T+x12015logx). \psi(x, \chi) = \delta(\chi) x - \frac{x^{\beta_S}}{\beta_S} - \sum_{\rho, \left\lvert \gamma \right\rvert < T} \frac{x^\rho}{\rho} + O\left( \frac{x (\log qx)^2}{T} + x^{\frac{1}{2015}} \log x \right).
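The inequality absorbed just now, \frac{x^u - 1}{u} \le x^u \log x (mean value theorem, applied with u = 1-\beta_S < \frac{1}{2015}), can be spot-checked (sample values mine):

```python
import math

# (x^u - 1)/u = x^{u*} log x for some 0 < u* < u, hence at most x^u log x;
# and for u <= 1/2015 we also have x^u <= x^{1/2015}
for x in (10.0, 1e4, 1e8):
    for u in (1e-6, 1e-4, 1 / 2015):
        lhs = (x ** u - 1) / u
        assert lhs <= x ** u * math.log(x)
        assert lhs <= x ** (1 / 2015) * math.log(x)
```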

7.2. Estimating the Sum Over Zeros

Now we want to estimate

\sum_{\rho, \left\lvert \gamma \right\rvert < T} \left\lvert \frac{x^\rho}{\rho} \right\rvert. We do this in the dumbest way possible: putting a bound on \left\lvert x^\rho \right\rvert and pulling it out.

For any non-Siegel zero, we have a zero-free region β<1c1L\beta < 1 - \frac{c_1}{\mathcal L}, whence

xρ<xβ=xxβ1=xexp(c1logxL). \left\lvert x^\rho \right\rvert < x^{\beta} = x \cdot x^{\beta-1} = x \exp\left( \frac{-c_1 \log x}{\mathcal L} \right).

Pulling this out, we can then estimate the sum of the reciprocals using the vertical distribution of the zeros (the zeros with \left\lvert \gamma \right\rvert < 1 contribute O\left( (\log q)^2 \right), as in the estimate of b(\chi)):

\begin{aligned} \sum_{\rho, \left\lvert \gamma \right\rvert < T} \left\lvert \frac{1}{\rho} \right\rvert \ll (\log q)^2 + \sum_{t=1}^T \frac{\log q(t+2)}{t} \ll (\log qT)^2 \le (\log qx)^2. \end{aligned}
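Numerically, the sum over unit intervals indeed stays below (\log qT)^2 (an illustration with arbitrary sample values, not a proof):

```python
import math

# sum_{t=1}^{T} log(q(t+2))/t is roughly log q * log T + (log T)^2 / 2,
# which is at most (log qT)^2
for q, T in ((3, 100), (100, 10_000), (10_000, 10_000)):
    s = sum(math.log(q * (t + 2)) / t for t in range(1, T + 1))
    assert s <= math.log(q * T) ** 2
```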

Hence,

ψ(x,χ)=δ(χ)xxβSβS+O(x(logqx)2T+x12015logx+(logqx)2xexp(c1logxL)). \psi(x, \chi) = \delta(\chi) x - \frac{x^{\beta_S}}{\beta_S} + O\left( \frac{x (\log qx)^2}{T} + x^{\frac{1}{2015}} \log x + (\log qx)^2 \cdot x \exp\left( \frac{-c_1 \log x}{\mathcal L} \right) \right).

We select

T = \exp\left(c_3 \sqrt{\log x}\right) for some constant c_3. If we moreover assume q \le T, then \mathcal L \ll \sqrt{\log x}, and we obtain

ψ(x,χ)=δ(χ)xxβSβS+O(xexp(c4logx)). \psi(x, \chi) = \delta(\chi) x - \frac{x^{\beta_S}}{\beta_S} + O\left( x \exp\left( -c_4 \sqrt{\log x} \right) \right).
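The point of this choice of T is that it balances the two error sources: with q \le T we get \mathcal L \approx 2c_3\sqrt{\log x}, so the zero-free-region saving c_1 \log x / \mathcal L is itself \gg \sqrt{\log x}. A quick numeric sketch (the constants c_1, c_3 below are arbitrary placeholders):

```python
import math

c1, c3 = 0.5, 1.0  # placeholder constants, not the actual ones in the proof
for x in (1e6, 1e20, 1e80):
    Lx = math.log(x)
    T = math.exp(c3 * math.sqrt(Lx))
    q = T  # the extreme case q = T
    Lcal = math.log(q * (T + 2))  # the quantity "script L" at t = T
    # the exponent saving c1 * log x / L is at least ~ (c1/(2 c3)) sqrt(log x)
    assert c1 * Lx / Lcal >= (c1 / (2 * c3)) * math.sqrt(Lx) * 0.9
```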

7.3. Summing Up

We would like to sum over all characters χ\chi. However, we’re worried that there might be lots of Siegel zeros across characters. A result of Landau tells us this is not the case:

Theorem 15 (Landau)

If \chi_1 and \chi_2 are distinct real nontrivial primitive characters modulo q_1 and q_2, then for any real zeros \beta_1 of L(s, \chi_1) and \beta_2 of L(s, \chi_2) we have

min{β1,β2}<1c5logq1q2\min \left\{ \beta_1, \beta_2 \right\} < 1 - \frac{c_5}{\log q_1q_2} for some fixed absolute c5c_5. In particular, for any fixed qq, there is at most one χmodq\chi \mod q with a Siegel zero.

Proof: The character χ1χ2\chi_1\chi_2 is not trivial, so we can put

\begin{aligned} -\frac{\zeta'}{\zeta}(\sigma) &= \frac{1}{\sigma-1} + O(1) \\ -\frac{L'}{L}(\sigma, \chi_1\chi_2) &= O(\log q_1q_2) \\ -\frac{L'}{L}(\sigma, \chi_1) &= O(\log q_1) - \frac{1}{\sigma-\beta_1} \\ -\frac{L'}{L}(\sigma, \chi_2) &= O(\log q_2) - \frac{1}{\sigma-\beta_2}. \end{aligned}

Now we use a silly trick:

0ζζ(σ)LL(σ,χ1)LL(σ,χ2)LL(σ,χ1χ2) 0 \le -\frac{\zeta'}{\zeta}(\sigma) -\frac{L'}{L}(\sigma, \chi_1) -\frac{L'}{L}(\sigma, \chi_2) -\frac{L'}{L}(\sigma, \chi_1\chi_2)

by “Simon’s Favorite Factoring Trick”: we use the deep fact that (1+\chi_1(n))(1+\chi_2(n)) \ge 0, the analog of 3+4\cos\theta+\cos 2\theta \ge 0. The upper bounds now give

1σβ1+1σβ2<1σ1+O(logq1logq2).\frac{1}{\sigma-\beta_1} + \frac{1}{\sigma-\beta_2} < \frac{1}{\sigma-1} + O(\log q_1 \log q_2). and one may deduce the conclusion from here. \Box
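The positivity underlying the trick can be checked concretely; here with the Legendre symbols mod 3 and mod 5 playing the roles of the two real characters (my choice of example, computed via Euler's criterion):

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # otherwise r is 0 or 1

# (1 + chi_1(n))(1 + chi_2(n)) >= 0, expanded:
# 1 + chi_1(n) + chi_2(n) + (chi_1 chi_2)(n) >= 0 for every n
for n in range(1, 200):
    c1, c2 = legendre(n, 3), legendre(n, 5)
    assert 1 + c1 + c2 + c1 * c2 >= 0
```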

We now sum over all characters χ\chi as before to obtain

ψ(x;q,a)=1ϕ(q)xχS(a)ϕ(q)xβSβS+O(xexp(c6logx)) \psi(x; q, a) = \frac{1}{\phi(q)} x - \frac{\chi_S(a)}{\phi(q)} \frac{x^{\beta_S}}{\beta_S} + O \left( x \exp\left(-c_6 \sqrt{\log x}\right) \right)

where χS=χS\chi_S = \overline{\chi}_S is the character with a Siegel zero, if it exists.

8. Siegel’s Theorem, and Finishing Off

The term with xβS/βSx^{\beta_S} / \beta_S is bad, and we need some way to get rid of it. We now appeal to Siegel’s Theorem:

Theorem 16 (Siegel’s Theorem)

For any ε>0\varepsilon > 0 there is a C1(ε)>0C_1(\varepsilon) > 0 such that any Siegel zero βS\beta_S satisfies

βS<1C1(ε)qε.\beta_S < 1-C_1(\varepsilon) q^{-\varepsilon}.

Thus for a positive constant N, assuming q \le (\log x)^N, taking \varepsilon = (2N)^{-1} gives q^{-\varepsilon} \ge \frac{1}{\sqrt{\log x}}, so we obtain

xβS<xexp(C1(ε)logxqε)<xexp(C1(ε)logx). x^{\beta_S} < x \exp\left( -C_1(\varepsilon) \log x q^{-\varepsilon} \right) < x \exp\left( -C_1(\varepsilon) \sqrt{\log x} \right).
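The exponent arithmetic here is easy to verify numerically (a sketch; the sample values of N and x are mine):

```python
import math

# with q <= (log x)^N and eps = 1/(2N), we get q^{-eps} >= (log x)^{-1/2},
# hence log x * q^{-eps} >= sqrt(log x)
for N in (1, 3, 10):
    for x in (1e6, 1e12):
        eps = 1 / (2 * N)
        q = math.log(x) ** N  # the largest modulus allowed
        # tiny fudge factor guards against floating-point roundoff at equality
        assert math.log(x) * q ** (-eps) >= math.sqrt(math.log(x)) * (1 - 1e-12)
```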

Then

ψ(x;q,a)=1ϕ(q)xχS(a)ϕ(q)xβSβS+O(xexp(c6logx))1ϕ(q)x+O(xexp(C1(ε)logx))+O(xexp(c6logx))1ϕ(q)x+O(xexp(C(N)logx)) \begin{aligned} \psi(x; q, a) &= \frac{1}{\phi(q)} x - \frac{\chi_S(a)}{\phi(q)} \frac{x^{\beta_S}}{\beta_S} + O \left( x \exp\left(-c_6 \sqrt{\log x}\right) \right) \\ &\le \frac{1}{\phi(q)} x + O\left( x \exp\left( -C_1(\varepsilon) \sqrt{\log x} \right) \right) + O \left( x \exp\left(-c_6 \sqrt{\log x}\right) \right) \\ &\le \frac{1}{\phi(q)} x + O\left( x \exp\left( -C(N) \sqrt{\log x} \right) \right) \end{aligned}

where C(N)=min{C1(ε),c6}C(N) = \min \left\{ C_1(\varepsilon), c_6 \right\}. This completes the proof of Dirichlet’s Theorem.