I’m reading through Primes of the Form $x^2+ny^2$,
by David Cox (link; it’s good!).
Here are the high-level notes I took on the first chapter, which is about the theory of quadratic forms.
(Meta point re blog: I’m probably going to start posting more and more of these
high-level notes/sketches on this blog, on topics that I’ve just been learning.
Up til now I’ve been mostly only posting things that I understand well and for
which I have a very polished exposition.
But the perfect is the enemy of the good here; given that I’m taking these notes for my own sake,
I may as well share them to help others.)
1. Overview
Definition 1. For us a quadratic form is a polynomial $Q = Q(x,y) = ax^2 + bxy + cy^2$, where $a$,
$b$, $c$ are some integers. We say that it is primitive if $\gcd(a,b,c) = 1$.
For example, we have the famous quadratic form
$Q_{\text{Fermat}}(x,y) = x^2 + y^2.$
As readers are probably aware,
we can say a lot about exactly which integers can be represented by $Q_{\text{Fermat}}$:
by Fermat’s Christmas theorem,
the primes $p \equiv 1 \pmod 4$ (and $p = 2$) can all be written as the sum of two squares,
while the primes $p \equiv 3 \pmod 4$ cannot. For convenience, let us say that:
Definition 2. Let $Q$ be a quadratic form.
We say it represents the integer $m$ if there exist $x, y \in \mathbb{Z}$ with $m = Q(x,y)$.
Moreover, $Q$ properly represents $m$ if one can find such $x$ and $y$ which are also relatively prime.
The basic question is: what can we say about which primes/integers are
properly represented by a quadratic form? In fact,
we will later restrict our attention to “positive definite” forms (described later).
For example, Fermat’s Christmas theorem now rewrites as:
Theorem 3 (Fermat’s Christmas theorem for primes)
An odd prime $p$ is (properly) represented by $Q_{\text{Fermat}}$ if and only if $p \equiv 1 \pmod 4$.
The proof of this is classical,
see for example my olympiad handout.
We also have the formulation for odd integers:
Theorem 4 (Fermat’s Christmas theorem for odd integers)
An odd integer $m$ is properly represented by $Q_{\text{Fermat}}$ if and only
if all prime factors of $m$ are $1 \pmod 4$.
Proof: For the “if” direction,
we use the fact that $Q_{\text{Fermat}}$ is multiplicative in the sense that
$(x^2+y^2)(u^2+v^2) = (xu \pm yv)^2 + (xv \mp yu)^2.$
For the “only if” part we use the fact that if a multiple of a prime $p$ is
properly represented by $Q_{\text{Fermat}}$, then so is $p$.
This follows by noticing that if $x^2+y^2 \equiv 0 \pmod p$ (and
$xy \not\equiv 0 \pmod p$) then $(x/y)^2 \equiv -1 \pmod p$. □
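As a sanity check on Theorem 4, here is a small brute-force sketch in Python. The helper names `properly_represented` and `prime_factors` are my own, and the bound 2000 is arbitrary; this only tests the statement on small odd integers and proves nothing.

```python
from math import gcd, isqrt

def properly_represented(m):
    """True if m = x^2 + y^2 for some coprime integers x, y >= 0."""
    for x in range(isqrt(m) + 1):
        y = isqrt(m - x * x)
        if x * x + y * y == m and gcd(x, y) == 1:
            return True
    return False

def prime_factors(m):
    """Set of primes dividing m, found by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors

# Odd m < 2000 where the two sides of Theorem 4 disagree (expected: none).
mismatches = [
    m for m in range(1, 2000, 2)
    if properly_represented(m) != all(p % 4 == 1 for p in prime_factors(m))
]
print(mismatches)
```

For instance $65 = 1^2 + 8^2$ is properly represented, while $45 = 3^2 + 6^2$ only has representations with a common factor of $3$.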
Tangential remark: the two ideas in the proof will grow up in the following way.
The fact that $Q_{\text{Fermat}}$ “multiplies nicely” will grow up to become
the so-called composition of quadratic forms.
The second fact will not generalize for an arbitrary form $Q$.
Instead, we will see that if a multiple of $p$ is represented by a form $Q$
then some form of the same “discriminant” will represent the prime $p$,
but this form need not be the same as $Q$ itself.
2. Equivalence of forms, and the discriminant
The first thing we should do is figure out when two forms are essentially the same: for example,
$x^2+5y^2$ and $5x^2+y^2$ should clearly be considered the same.
More generally, if we think of $Q$ as acting on $\mathbb{Z}^{\oplus 2}$ and $T$
is any automorphism of $\mathbb{Z}^{\oplus 2}$, then $Q \circ T$ should be considered the same as $Q$.
Specifically,
Definition 5. Two forms $Q_1$ and $Q_2$ are said to be equivalent if there exists
$T = \begin{pmatrix} p & r \\ q & s \end{pmatrix} \in \operatorname{GL}(2,\mathbb{Z})$
such that $Q_2(x,y) = Q_1(px+ry, qx+sy)$.
We have $\det T = ps - qr = \pm 1$ and so we say the equivalence is
a proper equivalence if $\det T = +1$, and
an improper equivalence if $\det T = -1$.
So we generally will only care about forms up to proper equivalence.
(It will be useful to distinguish between proper/improper equivalence later.)
Naturally we seek some invariants under this operation. By far the most important is:
Definition 6. The discriminant of a quadratic form $Q = ax^2+bxy+cy^2$ is defined as
$D = b^2 - 4ac.$
The discriminant is invariant under equivalence (check this).
Note also that $D \equiv 0, 1 \pmod 4$.
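The "check this" can be done by hand, but here is a quick numerical sketch as well. It expands $Q(px+ry, qx+sy)$ into coefficients and compares discriminants over all integer matrices with entries in $[-3,3]$ and determinant $\pm 1$. The function names and the sample forms are my own choices for illustration.

```python
def disc(form):
    a, b, c = form
    return b * b - 4 * a * c

def transform(form, T):
    """Coefficients of Q(p*x + r*y, q*x + s*y), where T = (p, q, r, s)."""
    a, b, c = form
    p, q, r, s = T
    return (
        a * p * p + b * p * q + c * q * q,
        2 * a * p * r + b * (p * s + q * r) + 2 * c * q * s,
        a * r * r + b * r * s + c * s * s,
    )

forms = [(1, 0, 1), (2, 2, 3), (3, -2, 5), (1, 1, 41)]
R = range(-3, 4)
unimodular = [
    (p, q, r, s) for p in R for q in R for r in R for s in R
    if p * s - q * r in (1, -1)
]
# True if every sampled T leaves the discriminant of every sample form unchanged.
invariant = all(disc(transform(f, T)) == disc(f) for f in forms for T in unimodular)
print(len(unimodular), invariant)
```

In general the discriminant of $Q \circ T$ is $(\det T)^2$ times that of $Q$, which is why only $\det T = \pm 1$ matters here.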
Observe that we have
$4a \cdot (ax^2+bxy+cy^2) = (2ax+by)^2 - Dy^2.$
So if $D < 0$ and $a > 0$ (thus $c > 0$ too) then $ax^2+bxy+cy^2 > 0$ for all $(x,y) \neq (0,0)$.
Such quadratic forms are called positive definite, and we will restrict our attention to these forms.
Now that we have this invariant,
we may as well classify equivalence classes of quadratic forms for a fixed discriminant.
It turns out this can be done explicitly.
Definition 7. A quadratic form $Q = ax^2+bxy+cy^2$ is reduced if
it is primitive and positive definite,
$|b| \le a \le c$, and
$b \ge 0$ if either $|b| = a$ or $a = c$.
Exercise 8. Check that there are only finitely many reduced forms of a fixed discriminant.
Then the big huge theorem is:
Theorem 9 (Reduced forms give a set of representatives)
Every primitive positive definite form $Q$ of discriminant $D$ is properly equivalent to a unique reduced form.
We call this the reduction of $Q$.
Proof: Omitted due to length, but completely elementary. It is a reduction argument with some number of cases. □
Thus, for any discriminant $D$ we can consider the set
$\operatorname{Cl}(D) = \{\text{reduced forms of discriminant } D\}$
which gives one representative for each proper equivalence class of primitive positive definite forms of discriminant $D$.
By abuse of notation we will also consider it as the set of equivalence classes
of primitive positive definite forms of discriminant $D$.
We also define $h(D) = |\operatorname{Cl}(D)|$; by the exercise, $h(D) < \infty$.
This is called the class number.
Moreover, we have $h(D) \ge 1$,
because we can take $x^2 - \frac{D}{4} y^2$ for $D \equiv 0 \pmod 4$ and
$x^2 + xy + \frac{1-D}{4} y^2$ for $D \equiv 1 \pmod 4$. We call this form the principal form.
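Since reducedness forces $3a^2 \le 4ac - b^2 = -D$, the reduced forms of a given discriminant can be listed by a finite search. Here is a minimal sketch of that search; the function name `reduced_forms` is mine, and forms are encoded as triples `(a, b, c)`.

```python
from math import gcd

def reduced_forms(D):
    """List all reduced forms (a, b, c) of discriminant D < 0, D = 0 or 1 mod 4."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # |b| <= a <= c forces 3a^2 <= -D
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue                   # c would not be an integer
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if b < 0 and (-b == a or a == c):
                continue                   # boundary cases require b >= 0
            if gcd(gcd(a, b), c) == 1:     # keep primitive forms only
                forms.append((a, b, c))
        a += 1
    return forms

for D in (-4, -20, -23, -56, -163):
    print(D, reduced_forms(D))
```

The class number is then just `len(reduced_forms(D))`; for example the call with `D = -20` returns the principal form `(1, 0, 5)` together with `(2, 2, 3)`.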
3. Tables of quadratic forms
Example 10 (Examples of quadratic forms with $h(D) = 1$, $D \equiv 0 \pmod 4$)
The following discriminants have class number $h(D) = 1$, hence having only the principal form:
$D = -4$, with form $x^2+y^2$.
$D = -8$, with form $x^2+2y^2$.
$D = -12$, with form $x^2+3y^2$.
$D = -16$, with form $x^2+4y^2$.
$D = -28$, with form $x^2+7y^2$.
This is in fact the complete list when $D \equiv 0 \pmod 4$.
Example 11 (Examples of quadratic forms with $h(D) = 1$, $D \equiv 1 \pmod 4$)
The following discriminants have class number $h(D) = 1$, hence having only the principal form:
$D = -3$, with form $x^2+xy+y^2$.
$D = -7$, with form $x^2+xy+2y^2$.
$D = -11$, with form $x^2+xy+3y^2$.
$D = -19$, with form $x^2+xy+5y^2$.
$D = -27$, with form $x^2+xy+7y^2$.
$D = -43$, with form $x^2+xy+11y^2$.
$D = -67$, with form $x^2+xy+17y^2$.
$D = -163$, with form $x^2+xy+41y^2$.
This is in fact the complete list when $D \equiv 1 \pmod 4$.
Example 12 (More examples of quadratic forms)
Here are tables for small discriminants with $h(D) > 1$. When $D \equiv 0 \pmod 4$ we have
$D = -20$, with $h(D) = 2$ forms $2x^2+2xy+3y^2$ and $x^2+5y^2$.
$D = -24$, with $h(D) = 2$ forms $2x^2+3y^2$ and $x^2+6y^2$.
$D = -32$, with $h(D) = 2$ forms $3x^2+2xy+3y^2$ and $x^2+8y^2$.
$D = -36$, with $h(D) = 2$ forms $2x^2+2xy+5y^2$ and $x^2+9y^2$.
$D = -40$, with $h(D) = 2$ forms $2x^2+5y^2$ and $x^2+10y^2$.
$D = -44$, with $h(D) = 3$ forms $3x^2 \pm 2xy+4y^2$ and $x^2+11y^2$.
As for $D \equiv 1 \pmod 4$ we have
$D = -15$, with $h(D) = 2$ forms $2x^2+xy+2y^2$ and $x^2+xy+4y^2$.
$D = -23$, with $h(D) = 3$ forms $2x^2 \pm xy+3y^2$ and $x^2+xy+6y^2$.
$D = -31$, with $h(D) = 3$ forms $2x^2 \pm xy+4y^2$ and $x^2+xy+8y^2$.
$D = -39$, with $h(D) = 4$ forms $3x^2+3xy+4y^2$, $2x^2 \pm xy+5y^2$ and $x^2+xy+10y^2$.
Example 13 (Even more examples of quadratic forms)
Here are some more selected examples:
$D = -56$ has $h(D) = 4$ forms $x^2+14y^2$, $2x^2+7y^2$ and $3x^2 \pm 2xy+5y^2$.
$D = -108$ has $h(D) = 3$ forms $x^2+27y^2$ and $4x^2 \pm 2xy+7y^2$.
$D = -256$ has $h(D) = 4$ forms $x^2+64y^2$, $4x^2+4xy+17y^2$ and $5x^2 \pm 2xy+13y^2$.
4. The Character $\chi_D$
We can now connect this to primes $p$ as follows.
Earlier we played with $Q_{\text{Fermat}} = x^2+y^2$, and observed that for odd primes $p$,
$p \equiv 1 \pmod 4$ if and only if some multiple of $p$ is properly represented by $Q_{\text{Fermat}}$.
Our generalization is as follows:
Theorem 14 (Primes represented by some quadratic form)
Let $D < 0$ be a discriminant, and let $p \nmid D$ be an odd prime. Then the following are equivalent:
$\left(\frac{D}{p}\right) = 1$, i.e. $D$ is a quadratic residue modulo $p$.
The prime $p$ is (properly) represented by some reduced quadratic form in $\operatorname{Cl}(D)$.
This generalizes our result for $Q_{\text{Fermat}}$, but note that it uses $h(-4) = 1$ in an essential way!
That is: if $(-1/p) = 1$, we know $p$ is represented by some quadratic form of
discriminant $D = -4$… but only since $h(-4) = 1$ do we know that this form
reduces to $Q_{\text{Fermat}} = x^2+y^2$.
Proof: First assume WLOG that $p \nmid 4a$ and $Q(x,y) \equiv 0 \pmod p$ with $\gcd(x,y) = 1$.
Thus $p \nmid y$, since otherwise this would imply $x \equiv y \equiv 0 \pmod p$.
Then
$0 \equiv 4a \cdot Q(x,y) \equiv (2ax+by)^2 - Dy^2 \pmod p$
hence $D \equiv (2axy^{-1}+b)^2 \pmod p$.
The converse direction is amusing: since $D$ is a square modulo both $p$ and $4$, we can write $m^2 = D + 4pk$ for integers $m$, $k$. Consider the quadratic form
$Q(x,y) = px^2 + mxy + ky^2.$
It is primitive of discriminant $D$ and $Q(1,0) = p$.
Now $Q$ may not be reduced, but that’s fine: just take the reduction of $Q$,
which must also properly represent $p$. □
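To see the converse direction in action, here is a sketch that builds a form $px^2 + mxy + ky^2$ of discriminant $D$ (so $m^2 - 4pk = D$) and then reduces it using only the proper equivalences $x \mapsto x + ky$ and $(x,y) \mapsto (-y,x)$. The names `reduce_form` and `form_through_prime` are hypothetical, and the reduction loop is one standard way to organize the omitted proof of Theorem 9, not necessarily the one in the book.

```python
def reduce_form(a, b, c):
    """Reduce a positive definite form using only proper equivalences."""
    while True:
        k = (a - b) // (2 * a)              # x -> x + k*y puts b in (-a, a]
        b, c = b + 2 * a * k, a * k * k + b * k + c
        if a > c:
            a, b, c = c, -b, a              # (x, y) -> (-y, x)
        else:
            if a == c and b < 0:
                b = -b                      # same swap, in the case a == c
            return (a, b, c)

def form_through_prime(p, D):
    """Return (p, m, k) with m^2 - 4*p*k = D, or None if no such m exists."""
    for m in range(2 * p):                  # solutions repeat with period 2p
        if (m * m - D) % (4 * p) == 0:
            return (p, m, (m * m - D) // (4 * p))
    return None

for p in (3, 7, 29, 41):
    Q = form_through_prime(p, -20)
    print(p, Q, reduce_form(*Q))
print(form_through_prime(11, -20))          # -20 is not a square mod 11
```

With $D = -20$, the prime $7$ leads to $7x^2 + 6xy + 2y^2$, which reduces to $2x^2+2xy+3y^2$, while $29$ reduces to the principal form $x^2+5y^2$.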
Thus to every discriminant D<0 we can attach the Legendre character (is that the name?),
which is a homomorphism
$\chi_D = \left(\frac{D}{\bullet}\right) : (\mathbb{Z}/D\mathbb{Z})^\times \to \{\pm 1\}$
with the property that if $p$ is a rational prime not dividing $D$,
then $\chi_D(p) = \left(\frac{D}{p}\right)$.
This is abuse of notation since I should technically write $\chi_D(p \bmod D)$, but there is no harm done:
one can check by quadratic reciprocity that if $p \equiv q \pmod D$ then $\chi_D(p) = \chi_D(q)$.
Thus our previous result becomes:
Theorem 15 ($\ker(\chi_D)$ consists of representable primes)
Let $p \nmid D$ be prime. Then $p \in \ker(\chi_D)$ if and only if some
quadratic form in $\operatorname{Cl}(D)$ represents $p$.
As a corollary of this, using the fact that $h(-8) = h(-12) = h(-28) = 1$ one can prove that
Corollary 16 (Fermat-type results for $h(-4n) = 1$)
Let $p > 7$ be a prime. Then $p$ is
of the form $x^2+2y^2$ if and only if $p \equiv 1, 3 \pmod 8$.
of the form $x^2+3y^2$ if and only if $p \equiv 1 \pmod 3$.
of the form $x^2+7y^2$ if and only if $p \equiv 1, 2, 4 \pmod 7$.
Proof: The congruence conditions are equivalent to $\left(\frac{-4n}{p}\right) = 1$,
and as before the only point is that the only reduced quadratic form
for these $D = -4n$ is the principal one. □
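Corollary 16 is easy to test numerically. The sketch below brute-forces representations $p = x^2 + ny^2$ for $n \in \{2, 3, 7\}$ and compares against the stated congruence conditions for primes $7 < p < 3000$; the helper names and the bound are my own.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def is_of_form(p, n):
    """True if p = x^2 + n*y^2 for some integers x, y."""
    y = 0
    while n * y * y <= p:
        rest = p - n * y * y
        if isqrt(rest) ** 2 == rest:
            return True
        y += 1
    return False

# n -> (modulus, allowed residues), copied from Corollary 16
conditions = {2: (8, {1, 3}), 3: (3, {1}), 7: (7, {1, 2, 4})}
mismatches = [
    (n, p)
    for n, (modulus, residues) in conditions.items()
    for p in range(8, 3000)
    if is_prime(p) and is_of_form(p, n) != (p % modulus in residues)
]
print(mismatches)
```

An empty list means the congruence conditions and the brute-force search agree on every prime in range.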
5. Genus theory
What if $h(D) > 1$? Sometimes, we can still figure out which primes go where just by taking mods.
Let $Q \in \operatorname{Cl}(D)$. Then it represents some residue classes of $(\mathbb{Z}/D\mathbb{Z})^\times$.
In that case we call the set of residue classes represented the genus of the quadratic form $Q$.
Example 17 (Genus theory of $D = -20$)
Consider $D = -20$, with
$\ker(\chi_D) = \{1, 3, 7, 9\} \subseteq (\mathbb{Z}/D\mathbb{Z})^\times.$
We consider the two elements of $\operatorname{Cl}(D)$:
$x^2+5y^2$ represents $1, 9 \in (\mathbb{Z}/20\mathbb{Z})^\times$.
$2x^2+2xy+3y^2$ represents $3, 7 \in (\mathbb{Z}/20\mathbb{Z})^\times$.
Now suppose for example that $p \equiv 9 \pmod{20}$.
It must be represented by one of these two quadratic forms,
but the latter form is never $9 \pmod{20}$ and so it must be the first one.
Thus we conclude that
$p = x^2+5y^2$ if and only if $p \equiv 1, 9 \pmod{20}$.
$p = 2x^2+2xy+3y^2$ if and only if $p \equiv 3, 7 \pmod{20}$.
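The genus of a form is a finite computation, since the value of $Q(x,y)$ modulo $|D|$ depends only on $x, y$ modulo $|D|$. Here is a small sketch (the function name `genus` is mine) which recovers the residues in Example 17.

```python
from math import gcd

def genus(form, D):
    """Sorted residues in (Z/DZ)^* that are values of the form (a, b, c)."""
    a, b, c = form
    N = abs(D)
    values = {
        (a * x * x + b * x * y + c * y * y) % N
        for x in range(N) for y in range(N)
    }
    return sorted(v for v in values if gcd(v, N) == 1)

print(genus((1, 0, 5), -20))   # principal form x^2 + 5y^2
print(genus((2, 2, 3), -20))   # 2x^2 + 2xy + 3y^2
```

Running the same function on the three reduced forms of discriminant $-108$ gives identical lists, which is exactly the situation where taking mods cannot separate the forms.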
The thing that makes this work is that each genus appears exactly once.
We are not always so lucky, as the case $D = -108$ shows:
Example 18 (Genus theory of $D = -108$)
The three elements of $\operatorname{Cl}(-108)$ are:
$x^2+27y^2$, which represents exactly the $1 \pmod 3$ elements of $(\mathbb{Z}/D\mathbb{Z})^\times$.
$4x^2 \pm 2xy+7y^2$, each of which also represents exactly the $1 \pmod 3$
elements of $(\mathbb{Z}/D\mathbb{Z})^\times$.
So the best we can conclude is that $p = x^2+27y^2$ OR $p = 4x^2 \pm 2xy+7y^2$ if
and only if $p \equiv 1 \pmod 3$. This is because the three distinct quadratic
forms of discriminant $-108$ happen to have the same genus.
We now prove that:
Theorem 19 (Genera are cosets in $\ker(\chi_D)$)
Let $D$ be a discriminant and consider the Legendre character $\chi_D$.
The genus of the principal form of discriminant $D$ constitutes a subgroup $H$ of $\ker(\chi_D)$,
which we call the principal genus.
Any genus of a quadratic form in $\operatorname{Cl}(D)$ is a coset of the principal genus $H$ in $\ker(\chi_D)$.
Proof: For the first part, we aim to show $H$ is multiplicatively closed.
For $D \equiv 0 \pmod 4$, $D = -4n$ we use the fact that
$(x^2+ny^2)(u^2+nv^2) = (xu \pm nyv)^2 + n(xv \mp yu)^2.$
For $D \equiv 1 \pmod 4$, we instead appeal to another “magic” identity
$4\left(x^2+xy+\frac{1-D}{4}y^2\right) \equiv (2x+y)^2 \pmod D$
and it follows from here that $H$ is actually the set of squares in $(\mathbb{Z}/D\mathbb{Z})^\times$,
which is obviously a subgroup.
Now we show that other quadratic forms have genus equal to a coset of the principal genus.
For $D \equiv 0 \pmod 4$, with $D = -4n$ we can write
$a(ax^2+bxy+cy^2) = \left(ax+\frac{b}{2}y\right)^2 + ny^2$
and thus the desired coset is shown to be $a^{-1}H$. As for $D \equiv 1 \pmod 4$, we have
$4a \cdot (ax^2+bxy+cy^2) = (2ax+by)^2 - Dy^2 \equiv (2ax+by)^2 \pmod D$
so the desired coset is also $a^{-1}H$, since $H$ was the set of squares. □
Thus every genus is a coset of $H$ in $\ker(\chi_D)$. Thus:
Definition 20. We define the quotient group
$\operatorname{Gen}(D) = \ker(\chi_D)/H$
which is the set of all genera in discriminant $D$.
One can view this as an abelian group by coset multiplication.
Thus there is a natural map
$\Phi_D : \operatorname{Cl}(D) \twoheadrightarrow \operatorname{Gen}(D).$
(The map is surjective by Theorem 14.)
We also remark that $\operatorname{Gen}(D)$ is quite well-behaved:
Proposition 21 (Structure of $\operatorname{Gen}(D)$)
The group $\operatorname{Gen}(D)$ is isomorphic to $(\mathbb{Z}/2\mathbb{Z})^{\oplus m}$ for some integer $m$.
Proof: Observe that $H$ contains all the squares of $\ker(\chi_D)$:
if $f$ is the principal form then $f(t,0) = t^2$.
Thus each element of $\operatorname{Gen}(D)$ has order at most $2$,
which implies the result since $\operatorname{Gen}(D)$ is a finite abelian group.
□
In fact, one can compute the order of $\operatorname{Gen}(D)$ exactly,
but for this post I will just state the result.
Theorem 22 (Order of $\operatorname{Gen}(D)$)
Let $D < 0$ be a discriminant, and let $r$ be the number of distinct odd primes which divide $D$.
Define $\mu$ by:
$\mu = r$ if $D \equiv 1 \pmod 4$.
$\mu = r$ if $D = -4n$ and $n \equiv 3 \pmod 4$.
$\mu = r+1$ if $D = -4n$ and $n \equiv 1, 2 \pmod 4$.
$\mu = r+1$ if $D = -4n$ and $n \equiv 4 \pmod 8$.
$\mu = r+2$ if $D = -4n$ and $n \equiv 0 \pmod 8$.
Then $|\operatorname{Gen}(D)| = 2^{\mu-1}$.
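The case analysis in Theorem 22 is mechanical, so here is a direct transcription as a sketch. The function names `odd_prime_count`, `mu` and `genus_count` are my own; nothing here proves the theorem, it just evaluates the formula.

```python
def odd_prime_count(n):
    """Number of distinct odd primes dividing n."""
    n, count, d = abs(n), 0, 3
    while n % 2 == 0:
        n //= 2
    while d * d <= n:
        if n % d == 0:
            count += 1
            while n % d == 0:
                n //= d
        d += 2
    return count + (1 if n > 1 else 0)

def mu(D):
    """The quantity mu from Theorem 22, for a discriminant D < 0."""
    r = odd_prime_count(D)
    if D % 4 == 1:
        return r
    n = -D // 4
    if n % 4 == 3:
        return r
    if n % 4 in (1, 2) or n % 8 == 4:
        return r + 1
    return r + 2                            # remaining case: n = 0 mod 8

def genus_count(D):
    return 2 ** (mu(D) - 1)

for D in (-20, -56, -108, -256):
    print(D, mu(D), genus_count(D))
```

This matches the examples so far: $D = -20$ has two genera (one form each), while $D = -108$ has a single genus containing all three forms.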
6. Composition
We have already used once the nice identity
$(x^2+ny^2)(u^2+nv^2) = (xu \pm nyv)^2 + n(xv \mp yu)^2.$
We are going to try and generalize this for any two quadratic forms in $\operatorname{Cl}(D)$. Specifically,
Proposition 23 (Composition defines a group operation)
Let $f, g \in \operatorname{Cl}(D)$.
Then there is a unique $h \in \operatorname{Cl}(D)$ and bilinear forms
$B_i(x,y,z,w) = a_i xz + b_i xw + c_i yz + d_i yw$ for $i = 1, 2$ such that
$f(x,y)\,g(z,w) = h(B_1(x,y,z,w), B_2(x,y,z,w))$.
$a_1 b_2 - a_2 b_1 = +f(1,0)$.
$a_1 c_2 - a_2 c_1 = +g(1,0)$.
In fact, without the latter two constraints we would instead have
$a_1 b_2 - a_2 b_1 = \pm f(1,0)$ and $a_1 c_2 - a_2 c_1 = \pm g(1,0)$,
and each choice of signs would yield one of four (possibly different) forms.
So requiring both signs to be positive makes this operation well-defined.
(This is why we like proper equivalence; it gives us a well-defined group structure,
whereas with improper equivalence it would be impossible to put a group structure on the forms above.)
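Here is one concrete instance of Proposition 23, for $D = -20$ with $f = g = 2x^2+2xy+3y^2$ and $h = x^2+5y^2$. The bilinear forms $B_1 = 2xz + xw + yz - 2yw$ and $B_2 = xw + yz + yw$ are ones I worked out by hand from $2f(x,y) = (2x+y)^2 + 5y^2$, so treat them as an illustration rather than something taken from the book. The sketch checks the product identity on a grid of integers and both sign conditions.

```python
def f(x, y):
    return 2 * x * x + 2 * x * y + 3 * y * y

def principal(X, Y):
    return X * X + 5 * Y * Y

# Coefficients (a_i, b_i, c_i, d_i) of B_i = a_i*xz + b_i*xw + c_i*yz + d_i*yw.
B1 = (2, 1, 1, -2)
B2 = (0, 1, 1, 1)

def bilinear(coeffs, x, y, z, w):
    a, b, c, d = coeffs
    return a * x * z + b * x * w + c * y * z + d * y * w

R = range(-4, 5)
identity_holds = all(
    f(x, y) * f(z, w)
    == principal(bilinear(B1, x, y, z, w), bilinear(B2, x, y, z, w))
    for x in R for y in R for z in R for w in R
)
sign_b = B1[0] * B2[1] - B2[0] * B1[1]      # a1*b2 - a2*b1
sign_c = B1[0] * B2[2] - B2[0] * B1[2]      # a1*c2 - a2*c1
print(identity_holds, sign_b == f(1, 0), sign_c == f(1, 0))
```

So the non-principal class of discriminant $-20$ composed with itself gives the principal class, as it must in a group of order $2$.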
Taking this for granted, we then have that
Theorem 24 (Form class group)
Let $D \equiv 0, 1 \pmod 4$, $D < 0$ be a discriminant.
Then $\operatorname{Cl}(D)$ becomes an abelian group under composition, where
The identity of $\operatorname{Cl}(D)$ is the principal form, and
The inverse of the form $ax^2+bxy+cy^2$ is $ax^2-bxy+cy^2$.
This group is called the form class group.
We then have a group homomorphism
$\Phi_D : \operatorname{Cl}(D) \twoheadrightarrow \operatorname{Gen}(D).$
Observe that $ax^2+bxy+cy^2$ and $ax^2-bxy+cy^2$ are inverses and that
their $\Phi_D$ images coincide (being improperly equivalent);
this is expressed in the fact that every element of $\operatorname{Gen}(D)$ has order $\le 2$.
As another corollary, every genus contains the same number of classes, namely $h(D)/|\operatorname{Gen}(D)|$, while the number of genera is always a power of two.
We now define:
Definition 25. An integer $n \ge 1$ is convenient if the following equivalent conditions hold (with $D = -4n$):
The principal form $x^2+ny^2$ is the only reduced form with the principal genus.
$\Phi_D$ is injective (hence an isomorphism).
$h(D) = 2^{\mu-1}$.
Thus we arrive at the following corollary:
Corollary 26 (Convenient numbers have nice representations)
Let $n \ge 1$ be convenient. Then $p$ is of the form $x^2+ny^2$ if and only if
$p$ lies in the principal genus.
Hence the representability depends only on $p \pmod{4n}$.
OEIS A000926 lists 65 convenient numbers.
This sequence is known to be complete except for at most one more number;
moreover the list is complete assuming the Grand Riemann Hypothesis.
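The third condition in the definition of convenient, $h(-4n) = 2^{\mu-1}$, is easy to test by machine. The following sketch counts reduced forms by brute force and applies the case analysis of Theorem 22 with $D = -4n$; the function names are mine, and I only run it for small $n$.

```python
from math import gcd

def class_number(D):
    """Count reduced primitive forms (a, b, c) with b^2 - 4ac = D < 0."""
    count, a = 0, 1
    while 3 * a * a <= -D:
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (b < 0 and (-b == a or a == c)):
                continue
            if gcd(gcd(a, b), c) == 1:
                count += 1
        a += 1
    return count

def odd_prime_count(n):
    """Number of distinct odd primes dividing n >= 1."""
    count, d = 0, 3
    while n % 2 == 0:
        n //= 2
    while d * d <= n:
        if n % d == 0:
            count += 1
            while n % d == 0:
                n //= d
        d += 2
    return count + (1 if n > 1 else 0)

def mu(n):
    """The quantity mu of Theorem 22 for D = -4n."""
    r = odd_prime_count(n)
    if n % 4 == 3:
        return r
    if n % 4 in (1, 2) or n % 8 == 4:
        return r + 1
    return r + 2                            # n = 0 mod 8

def is_convenient(n):
    return class_number(-4 * n) == 2 ** (mu(n) - 1)

print([n for n in range(1, 21) if is_convenient(n)])
```

In particular $n = 27$ and $n = 64$ fail the test, since $h(-108) = 3$ and $h(-256) = 4$ are larger than the number of genera.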
7. Cubic and quartic reciprocity
To treat the cases where n is not convenient, the correct thing to do is develop class field theory.
However, we can still make a little bit more progress if we bring higher reciprocity theorems to bear:
we’ll handle the cases $n = 27$ and $n = 64$, two examples of numbers which are not convenient.
7.1. Cubic reciprocity
First, we prove that
Theorem 27 (On $p = x^2+27y^2$)
A prime $p > 3$ is of the form $x^2+27y^2$ if and
only if $p \equiv 1 \pmod 3$ and $2$ is a cubic residue modulo $p$.
To do this we use cubic reciprocity,
which requires working in the Eisenstein integers $\mathbb{Z}[\omega]$ where $\omega$ is a primitive cube root of unity.
There are six units in $\mathbb{Z}[\omega]$ (the sixth roots of unity),
hence each nonzero number has six associates (differing by a unit), and the ring is in fact a PID.
Now if we let $\pi$ be a prime not dividing $3$, and $\alpha$ is coprime to $\pi$,
then we can define the cubic Legendre symbol by setting
$\left(\frac{\alpha}{\pi}\right)_3 \equiv \alpha^{\frac{1}{3}(N\pi-1)} \pmod \pi \in \{1, \omega, \omega^2\}.$
Moreover, we can define a primary prime $\pi \nmid 3$ to be one such that $\pi \equiv -1 \pmod 3$;
given any prime exactly one of the six associates is primary. We then have the following reciprocity theorem:
Theorem 28 (Cubic reciprocity)
If $\pi$ and $\theta$ are distinct primary primes in $\mathbb{Z}[\omega]$ then
$\left(\frac{\pi}{\theta}\right)_3 = \left(\frac{\theta}{\pi}\right)_3.$
We also have the following supplementary laws: if $\pi = (3m-1) + 3n\omega$, then
$\left(\frac{\omega}{\pi}\right)_3 = \omega^{m+n} \quad\text{and}\quad \left(\frac{1-\omega}{\pi}\right)_3 = \omega^{2m}.$
The first supplementary law is for the unit (analogous to $(-1/p)$) while the
second supplementary law handles the prime divisors of $3 = -\omega^2(1-\omega)^2$ (analogous to $(2/p)$).
We can tie this back into Z as follows.
If $p \equiv 1 \pmod 3$ is a rational prime then it is represented by $x^2+xy+y^2$,
and thus we can put $p = \pi\overline{\pi}$ for some prime $\pi$, $N(\pi) = p$.
Consequently, we have a natural isomorphism
$\mathbb{Z}[\omega]/\pi\mathbb{Z}[\omega] \cong \mathbb{Z}/p\mathbb{Z}.$
Therefore, we see that a given $a \in (\mathbb{Z}/p\mathbb{Z})^\times$ is a cubic
residue if and only if $\left(\frac{a}{\pi}\right)_3 = 1$.
In particular, we have the following corollary, which is all we will need:
Corollary 29 (When $2$ is a cubic residue)
Let $p \equiv 1 \pmod 3$ be a rational prime, $p > 3$. Write $p = \pi\overline{\pi}$ with $\pi$ primary.
Then $2$ is a cubic residue modulo $p$ if and only if $\pi \equiv 1 \pmod 2$.
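Proof: Since $2$ is itself a primary prime of $\mathbb{Z}[\omega]$, cubic reciprocity gives $\left(\frac{2}{\pi}\right)_3 = \left(\frac{\pi}{2}\right)_3 \equiv \pi^{\frac{1}{3}(N(2)-1)} = \pi \pmod 2$. □
Proof of Theorem 27: First assume
$p = x^2+27y^2 = (x+3\sqrt{-3}\,y)(x-3\sqrt{-3}\,y).$
Let $\pi = x+3\sqrt{-3}\,y = (x+3y)+6y\omega$ be primary, noting that $\pi \equiv 1 \pmod 2$.
Now clearly $p \equiv 1 \pmod 3$, so done by Corollary 29.
For the converse, assume $p \equiv 1 \pmod 3$,
$p = \pi\overline{\pi}$ with $\pi$ primary and $\pi \equiv 1 \pmod 2$.
If we set $\pi = a+b\omega$ for integers $a$ and $b$,
then the fact that $\pi \equiv 1 \pmod 2$ and $\pi \equiv -1 \pmod 3$ is enough
to imply that $6 \mid b$ (check it!). Moreover,
$p = a^2-ab+b^2 = \left(a-\frac{1}{2}b\right)^2 + 27\left(\frac{1}{6}b\right)^2$
as desired. □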
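Theorem 27 can also be tested numerically without any Eisenstein integers. For $p \equiv 1 \pmod 3$, Euler's criterion says $2$ is a cubic residue exactly when $2^{(p-1)/3} \equiv 1 \pmod p$. The sketch below compares that with a brute-force search for $p = x^2+27y^2$; the names and the bound 3000 are my own choices.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def is_x2_plus_27y2(p):
    """True if p = x^2 + 27*y^2 for some integers x, y."""
    y = 0
    while 27 * y * y <= p:
        rest = p - 27 * y * y
        if isqrt(rest) ** 2 == rest:
            return True
        y += 1
    return False

def two_is_cubic_residue(p):
    """For a prime p = 1 mod 3: is 2 a cube modulo p (Euler's criterion)?"""
    return pow(2, (p - 1) // 3, p) == 1

primes = [p for p in range(5, 3000) if is_prime(p)]
mismatches = [
    p for p in primes
    if is_x2_plus_27y2(p) != (p % 3 == 1 and two_is_cubic_residue(p))
]
represented = [p for p in primes if is_x2_plus_27y2(p)]
print(mismatches, represented[:4])
```

The smallest primes of this form are $31 = 2^2+27$, $43 = 4^2+27$, $109 = 1^2+27\cdot 2^2$ and $127 = 10^2+27$.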
7.2. Quartic reciprocity
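This time we work in $\mathbb{Z}[i]$, for which there are four units $\pm 1$, $\pm i$.
A prime $\pi$ is primary if $\pi \equiv 1 \pmod{2+2i}$;
every prime not dividing $2 = -i(1+i)^2$ has a unique associate which is primary. Then we can as before define
$\alpha^{\frac{1}{4}(N\pi-1)} \equiv \left(\frac{\alpha}{\pi}\right)_4 \pmod \pi \in \{\pm 1, \pm i\}$
where $\pi$ is primary, and $\alpha$ is nonzero mod $\pi$.
As before, if $p \equiv 1 \pmod 4$ and
$p = \pi\overline{\pi}$, then $a$ is a quartic residue modulo $p$ if
and only if $\left(\frac{a}{\pi}\right)_4 = 1$, thanks to the isomorphism
$\mathbb{Z}[i]/\pi\mathbb{Z}[i] \cong \mathbb{Z}/p\mathbb{Z}.$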
Now we have
Theorem 30 (Quartic reciprocity)
If $\pi$ and $\theta$ are distinct primary primes in $\mathbb{Z}[i]$ then
$\left(\frac{\theta}{\pi}\right)_4 = \left(\frac{\pi}{\theta}\right)_4 (-1)^{\frac{1}{16}(N\theta-1)(N\pi-1)}.$
We also have supplementary laws that state that if $\pi = a+bi$ is primary, then
$\left(\frac{i}{\pi}\right)_4 = i^{-\frac{1}{2}(a-1)} \quad\text{and}\quad \left(\frac{1+i}{\pi}\right)_4 = i^{\frac{1}{4}(a-b-b^2-1)}.$
Again, the first law handles units, and the second law handles the prime divisors of $2$.
The corollary we care about this time in fact uses only the supplementary laws:
Corollary 31 (When $2$ is a quartic residue)
Let $p \equiv 1 \pmod 4$ be a prime, and put $p = \pi\overline{\pi}$ with $\pi = a+bi$ primary. Then
$\left(\frac{2}{\pi}\right)_4 = i^{-b/2}$
and in particular $2$ is a quartic residue modulo $p$ if and only if $b \equiv 0 \pmod 8$.
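The last claim of Corollary 31 is easy to test by computer. For a prime $p \equiv 1 \pmod 4$, write $p = a^2+b^2$ with $a$ odd and $b$ even; choosing the primary associate only changes signs, so it does not affect whether $8 \mid b$. By Euler's criterion, $2$ is a quartic residue exactly when $2^{(p-1)/4} \equiv 1 \pmod p$. The helper names below are my own.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def odd_even_squares(p):
    """Return (a, b) with p = a^2 + b^2, a odd, b even, for a prime p = 1 mod 4."""
    for a in range(1, isqrt(p) + 1, 2):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return a, b
    return None

def two_is_quartic_residue(p):
    """For a prime p = 1 mod 4: is 2 a fourth power modulo p?"""
    return pow(2, (p - 1) // 4, p) == 1

primes = [p for p in range(5, 3000) if p % 4 == 1 and is_prime(p)]
mismatches = [
    p for p in primes
    if two_is_quartic_residue(p) != (odd_even_squares(p)[1] % 8 == 0)
]
quartic = [p for p in primes if two_is_quartic_residue(p)]
print(mismatches, quartic[:3])
```

Note that $8 \mid b$ is the same as saying $p = a^2 + 64(b/8)^2$, which is how this corollary connects back to the form $x^2+64y^2$.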
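Proof: Note that $2 = i^3(1+i)^2$, and apply the above. Therefore
$\left(\frac{2}{\pi}\right)_4 = \left(\frac{i}{\pi}\right)_4^3 \left(\frac{1+i}{\pi}\right)_4^2 = i^{-\frac{3}{2}(a-1)} \cdot i^{\frac{1}{2}(a-b-b^2-1)} = i^{-(a-1)-\frac{1}{2}b-\frac{1}{2}b^2}.$
Now we assumed $a+bi$ is primary. We claim that
$a-1+\frac{1}{2}b^2 \equiv 0 \pmod 4.$
Note that $(a+bi)-1$ is divisible by $2+2i$, hence $N(2+2i) = 8$ divides $(a-1)^2+b^2$. Thus
$2(a-1)+b^2 \equiv 2(a-1)-(a-1)^2 \equiv -(a-1)(a-3) \equiv 0 \pmod 8$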
since a is odd and b is even. Finally,