vEnhance

Jul 12, 2016

The Structure Theorem over PID's

In this post I’ll describe the structure theorem over PID’s, which generalizes the following results:

  • Finite-dimensional vector spaces over a field $k$ are all of the form $k^{\oplus n}$,
  • The classification theorem for finitely generated abelian groups,
  • The Frobenius normal form of a matrix,
  • The Jordan decomposition of a matrix.

1. Some ring theory prerequisites

(Prototypical example for this section: $R = \mathbb Z$.)

Before I can state the main theorem, I need to define a few terms for UFD’s, which behave much like $\mathbb Z$: our intuition from the case $R = \mathbb Z$ basically carries over verbatim. We don’t even need to deal with prime ideals and can factor elements instead.

Definition 1. If $R$ is a UFD, then $p \in R$ is a prime element if $(p)$ is a prime ideal and $p \neq 0$. For UFD’s this is equivalent to the following property: $p$ is not a unit, and whenever $p = xy$ either $x$ or $y$ is a unit.

So for example in $\mathbb Z$ the set of prime elements is $\{\pm2, \pm3, \pm5, \dots\}$. Now, since $R$ is a UFD, every element $r$ factors as a product of prime elements $$r = u p_1^{e_1} p_2^{e_2} \dots p_m^{e_m}$$ where $u$ is a unit.

Definition 2. We say $r$ divides $s$ if $s = r'r$ for some $r' \in R$. This is written $r \mid s$.

Example 3 (Divisibility in $\mathbb Z$)

The number $0$ is divisible by every element of $\mathbb Z$. All other divisibility is as expected.

Question 4

Show that $r \mid s$ if and only if the exponent of each prime in $r$ is less than or equal to the corresponding exponent in $s$.
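Over $\mathbb Z$, Question 4 amounts to comparing prime exponents. Here is a minimal sketch (helper names are mine) that checks the exponent criterion against ordinary divisibility:

```python
from collections import Counter

def factor(n):
    """Prime factorization of a positive integer as {prime: exponent}."""
    exps, p = Counter(), 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:
        exps[n] += 1
    return exps

def divides_by_exponents(r, s):
    """r | s iff each prime exponent in r is <= the matching exponent in s."""
    fs = factor(s)
    return all(e <= fs[p] for p, e in factor(r).items())

# agrees with plain divisibility testing on a range of cases
assert all(divides_by_exponents(r, s) == (s % r == 0)
           for r in range(1, 40) for s in range(1, 120))
```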

Now, the case of interest is the even stronger case when $R$ is a PID:

Proposition 5 (PID’s are Noetherian UFD’s)

If $R$ is a PID, then it is Noetherian and also a UFD.

Proof: The fact that $R$ is Noetherian is obvious: every ideal is generated by a single element, hence finitely generated. For $R$ to be a UFD we essentially repeat the proof for $\mathbb Z$, using the fact that $(a,b)$ is principal in order to extract $\gcd(a,b)$. $\Box$

In this case, we have a Chinese remainder theorem for elements.

Theorem 6 (Chinese remainder theorem for rings)

Let $m$ and $n$ be relatively prime elements, meaning $(m) + (n) = (1)$. Then $$R / (mn) \cong R/m \times R/n.$$

Proof: This is the same as the proof of the usual Chinese remainder theorem. First, since $(m,n) = (1)$ we have $am + bn = 1$ for some $a$ and $b$. Then we have a map $$R/m \times R/n \rightarrow R/(mn) \quad\text{by}\quad (r,s) \mapsto r \cdot bn + s \cdot am.$$ One can check that this map is well-defined and an isomorphism of rings. (Diligent readers invited to do so.) $\Box$
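For $R = \mathbb Z$ one can check the map in this proof directly. Below is a sketch with $m = 4$ and $n = 9$, using the Bézout relation $(-2)\cdot 4 + 1 \cdot 9 = 1$ found by hand:

```python
m, n = 4, 9
a, b = -2, 1                 # Bezout coefficients: a*m + b*n == 1
assert a * m + b * n == 1

def phi(r, s):
    """The map (r, s) -> r*bn + s*am from Z/m x Z/n to Z/(mn)."""
    return (r * b * n + s * a * m) % (m * n)

# phi hits every residue class mod mn exactly once...
assert {phi(r, s) for r in range(m) for s in range(n)} == set(range(m * n))
# ...and phi(r, s) reduces to r mod m and to s mod n, as it should
assert all(phi(r, s) % m == r and phi(r, s) % n == s
           for r in range(m) for s in range(n))
```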

Finally, we need to introduce the concept of a Noetherian $R$-module.

Definition 7. An $R$-module $M$ is Noetherian if it satisfies one of the two equivalent conditions:

  • Its submodules obey the ascending chain condition: there is no infinite sequence of submodules $M_1 \subsetneq M_2 \subsetneq \dots$.
  • All submodules of $M$ (including $M$ itself) are finitely generated.

This generalizes the notion of a Noetherian ring: a Noetherian ring $R$ is one for which $R$ is Noetherian as an $R$-module.

Question 8

Check these two conditions are equivalent. (Copy the proof for rings.)

2. The structure theorem

Our structure theorem takes two forms:

Theorem 9 (Structure theorem, invariant form)

Let $R$ be a PID and let $M$ be any finitely generated $R$-module. Then $$M \cong \bigoplus_{i=1}^m R/s_i$$ for some $s_i$ satisfying $s_1 \mid s_2 \mid \dots \mid s_m$.

Corollary 10 (Structure theorem, primary form)

Let $R$ be a PID and let $M$ be any finitely generated $R$-module. Then $$M \cong R^{\oplus r} \oplus R/(q_1) \oplus R/(q_2) \oplus \dots \oplus R/(q_m)$$ where $q_i = p_i^{e_i}$ for some prime element $p_i$ and integer $e_i \ge 1$.

Proof: Factor each $s_i$ into prime factors (since $R$ is a UFD), then use the Chinese remainder theorem. Any $s_i$ equal to zero contributes a copy of $R$ to the free part $R^{\oplus r}$. $\Box$
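Over $\mathbb Z$, the passage from the invariant form to the primary form is just factoring each $s_i$ and applying the Chinese remainder theorem. A sketch (function name is mine):

```python
def primary_decomposition(invariant_factors):
    """Split each Z/(s_i) into prime-power summands Z/(p^e) via CRT."""
    qs = []
    for s in invariant_factors:
        p = 2
        while p * p <= s:
            e = 0
            while s % p == 0:
                e += 1
                s //= p
            if e:
                qs.append(p ** e)
            p += 1
        if s > 1:
            qs.append(s)      # leftover prime factor
    return sorted(qs)

# Z/6 + Z/12 has primary decomposition Z/2 + Z/3 + Z/3 + Z/4
assert primary_decomposition([6, 12]) == [2, 3, 3, 4]
```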

Remark 11. In both theorems the decomposition is unique up to permutations of the summands; good to know, but I won’t prove this.

3. Reduction to maps of free $R$-modules

The proof of the structure theorem proceeds in two main steps. First, we reduce the problem to a linear algebra problem involving free $R$-modules $R^{\oplus d}$. Once that’s done, we just have to play with matrices; this is done in the next section.

Suppose $M$ is finitely generated by $d$ elements. Then there is a surjective map of $R$-modules $R^{\oplus d} \twoheadrightarrow M$ which sends the basis elements of $R^{\oplus d}$ to the generators of $M$. Let $K$ denote the kernel.

We claim that $K$ is finitely generated as well. To this end we prove:

Lemma 12 (Direct sum of Noetherian modules is Noetherian)

Let $M$ and $N$ be two Noetherian $R$-modules. Then the direct sum $M \oplus N$ is also a Noetherian $R$-module.

Proof: It suffices to show that if $L \subseteq M \oplus N$, then $L$ is finitely generated. It’s unfortunately not true that $L = P \oplus Q$ (take $M = N = \mathbb Z$ and $L = \{(n,n) \mid n \in \mathbb Z\}$) so we will have to be more careful.

Consider the submodules

$$\begin{aligned} A &= \left\{ x \in M \mid (x,0) \in L \right\} \subseteq M \\ B &= \left\{ y \in N \mid \exists x \in M : (x,y) \in L \right\} \subseteq N. \end{aligned}$$

(Note the asymmetry for $A$ and $B$: the proof doesn’t work otherwise.) Since $M$ and $N$ are Noetherian, $A$ is finitely generated by $a_1$, …, $a_k$, and $B$ is finitely generated by $b_1$, …, $b_\ell$. Let $x_i = (a_i, 0)$ and let $y_i = (\ast, b_i)$ be elements of $L$ (where the $\ast$’s are arbitrary things we don’t care about). Then the $x_i$ and $y_i$ together generate $L$: given $(x,y) \in L$, write $y = \sum c_i b_i$; then $(x,y) - \sum c_i y_i$ has second coordinate zero, hence is a combination of the $x_i$. $\Box$

Question 13

Deduce that for $R$ a PID, $R^{\oplus d}$ is Noetherian.

Hence $K \subseteq R^{\oplus d}$ is finitely generated as claimed. So we can find another surjective map $R^{\oplus f} \twoheadrightarrow K$. Consequently, we have a composition $$R^{\oplus f} \twoheadrightarrow K \hookrightarrow R^{\oplus d} \twoheadrightarrow M.$$ Observe that $M$ is the cokernel of the composition $T : R^{\oplus f} \rightarrow R^{\oplus d}$, i.e. we have that $$M \cong R^{\oplus d} / \operatorname{im}(T).$$ So it suffices to understand the map $T$ well.

4. Smith normal form

The idea is now that we have reduced our problem to studying linear maps $T : R^{\oplus m} \rightarrow R^{\oplus n}$, which can be thought of as a generic matrix

$$T = \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix}$$

for the standard basis $e_1$, …, $e_m$ of $R^{\oplus m}$ and $f_1$, …, $f_n$ of $R^{\oplus n}$.

Of course, as you might expect it ought to be possible to change the given bases such that $T$ has a nicer matrix form. We already saw this with Jordan form, where we had a map $T : V \rightarrow V$ and changed the basis so that $T$ was “almost diagonal”. This time we have two bases we can change independently, so we would hope to get a diagonal matrix, or even better.

Before proceeding let’s think about how we might edit the matrix: what operations are permitted? Here are some examples:

  • Swapping rows and columns, which just corresponds to re-ordering the basis.
  • Adding a multiple of a column to another column. For example, if we add $3$ times the first column to the second column, this is equivalent to replacing the basis

    $(e_1, e_2, e_3, \dots, e_m) \mapsto (e_1, e_2+3e_1, e_3, \dots, e_m).$

  • Adding a multiple of a row to another row. One can see that adding $3$ times the first row to the second row is equivalent to replacing the basis

    $(f_1, f_2, f_3, \dots, f_n) \mapsto (f_1-3f_2, f_2, f_3, \dots, f_n).$

More generally, if $A$ is an invertible $n \times n$ matrix, we can replace $T$ with $AT$. This corresponds to replacing $(f_1, \dots, f_n) \mapsto (A(f_1), \dots, A(f_n))$ (the “invertible” condition just guarantees the latter is a basis). Similarly, we can replace $T$ with $TB$ where $B$ is an invertible $m \times m$ matrix; this corresponds to $(e_1, \dots, e_m) \mapsto (B^{-1}(e_1), \dots, B^{-1}(e_m))$.

Armed with this knowledge, we can now approach the following result.
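As a quick sanity check over $\mathbb Z$ (the matrices and helper below are mine): adding $3$ times the first column to the second is the same as right-multiplying $T$ by the elementary matrix $B$ that is the identity with a $3$ in position $(1,2)$.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

T = [[1, 2],
     [3, 4]]
B = [[1, 3],      # identity plus a 3 in position (1, 2)
     [0, 1]]

# column 2 of T*B is (column 2 of T) + 3*(column 1 of T)
assert matmul(T, B) == [[1, 2 + 3 * 1],
                        [3, 4 + 3 * 3]]
```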

Theorem 14 (Smith normal form)

Let $R$ be a PID. Let $M = R^{\oplus m}$ and $N = R^{\oplus n}$ be free $R$-modules and let $T : M \rightarrow N$ be a linear map. Set $k = \min(m,n)$.

Then we can select a pair of new bases for $M$ and $N$ such that the matrix of $T$ is diagonal, with diagonal entries $s_1$, $s_2$, …, $s_k$ satisfying $s_1 \mid s_2 \mid \dots \mid s_k$.

So if $m > n$, the matrix should take the form

$$\begin{pmatrix} s_1 & 0 & 0 & 0 & \dots & 0 \\ 0 & s_2 & 0 & 0 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & & \vdots \\ 0 & 0 & 0 & s_n & \dots & 0 \end{pmatrix}$$

and similarly when $m \le n$.

Question 15

Show that Smith normal form implies the structure theorem.

Remark 16. Note that this is not a generalization of Jordan form.

  • In Jordan form we consider maps $T : V \rightarrow V$; note that the source and target space are the same, and we are considering one basis for the space $V$.
  • In Smith form the map $T : M \rightarrow N$ is between different modules, and we pick two bases (one for $M$ and one for $N$).

Example 17 (Example of Smith normal form)

To give a flavor of the idea of the proof, let’s work through a concrete example with the following matrix with entries from $\mathbb Z$: $$\begin{pmatrix} 18 & 38 & 48 \\ 14 & 30 & 38 \end{pmatrix}.$$ The GCD of all the entries is $2$, and so motivated by this, we perform the Euclidean algorithm on the left column: subtract the second row from the first row, then three times the first row from the second:

$$\begin{pmatrix} 18 & 38 & 48 \\ 14 & 30 & 38 \end{pmatrix} \mapsto \begin{pmatrix} 4 & 8 & 10 \\ 14 & 30 & 38 \end{pmatrix} \mapsto \begin{pmatrix} 4 & 8 & 10 \\ 2 & 6 & 8 \end{pmatrix}.$$

Now that the GCD of $2$ is present, we move it to the upper-left by switching the two rows, and then kill off all the entries in the same row and column; since $2$ was the GCD all along, it divides everything in sight and we isolate $2$ completely:

$$\begin{pmatrix} 4 & 8 & 10 \\ 2 & 6 & 8 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 6 & 8 \\ 4 & 8 & 10 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 6 & 8 \\ 0 & -4 & -6 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -6 \end{pmatrix}.$$

This reduces the problem to a $1 \times 2$ matrix. So we just apply the Euclidean algorithm again there, finishing with a column swap and a multiplication of a column by the unit $-1$:

$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -6 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & -4 & -2 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & 0 & -2 \end{pmatrix} \mapsto \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \end{pmatrix}.$$
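This hands-on procedure is easy to automate over $R = \mathbb Z$. Below is a minimal, unoptimized sketch (all names mine) that diagonalizes an integer matrix using only row/column swaps and adding integer multiples of one row or column to another. It omits the extra merging pass that guarantees the divisibility chain $s_1 \mid s_2 \mid \dots$ in every case (for instance it leaves $\operatorname{diag}(2,3)$ alone), but it captures the Euclidean mechanics:

```python
def smith_diagonal(A):
    """Diagonalize an integer matrix with row/column swaps and adding
    integer multiples of one row/column to another; return the diagonal.
    """
    A = [row[:] for row in A]        # work on a copy
    n, m = len(A), len(A[0])
    diag, t = [], 0
    while t < min(n, m):
        # pick a nonzero entry of least absolute value as the pivot
        entries = [(abs(A[i][j]), i, j)
                   for i in range(t, n) for j in range(t, m) if A[i][j]]
        if not entries:
            break                    # remaining block is zero
        _, i, j = min(entries)
        A[t], A[i] = A[i], A[t]      # move pivot to position (t, t)
        for row in A:
            row[t], row[j] = row[j], row[t]
        reduced = True
        for i in range(t + 1, n):    # Euclidean step down column t
            q = A[i][t] // A[t][t]
            for j in range(m):
                A[i][j] -= q * A[t][j]
            reduced = reduced and A[i][t] == 0
        for j in range(t + 1, m):    # Euclidean step along row t
            q = A[t][j] // A[t][t]
            for i in range(n):
                A[i][j] -= q * A[i][t]
            reduced = reduced and A[t][j] == 0
        if reduced:                  # row and column t are cleared
            diag.append(abs(A[t][t]))
            t += 1                   # recurse into the lower-right block
    return diag

# run it on the worked example from the text
assert smith_diagonal([[18, 38, 48], [14, 30, 38]]) == [2, 2]
```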

Now all we have to do is generalize this proof to work with any PID. It’s intuitively clear how to do this: the PID condition more or less lets you perform a Euclidean algorithm.

Proof: Begin with a generic matrix

$$T = \begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix}.$$

We want to show, by a series of operations (gradually changing the given bases), that we can rearrange the matrix into Smith normal form.

Define $\gcd(x,y)$ to be any generator of the principal ideal $(x,y)$.

Claim 18 (“Euclidean algorithm”)

If $a$ and $b$ are entries in the same row or column, we can change bases to replace $a$ with $\gcd(a,b)$ and $b$ with something else.

Proof: We do just the case of columns. By hypothesis, $\gcd(a,b) = xa + yb$ for some $x, y \in R$. We must have $(x,y) = (1)$, since dividing the relation through by $\gcd(a,b)$ exhibits a combination of $x$ and $y$ equal to $1$. So there are $u$ and $v$ such that $xu + yv = 1$. Then

$$\begin{pmatrix} x & y \\ -v & u \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \gcd(a,b) \\ \text{something} \end{pmatrix}$$

and the first matrix is invertible (check this!), as desired. \Box
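Over $\mathbb Z$, the matrix in this claim comes straight out of the extended Euclidean algorithm; note that over $\mathbb Z$ one can simply take $u = a/\gcd(a,b)$ and $v = b/\gcd(a,b)$. A sketch (function names mine):

```python
def ext_gcd(a, b):
    """Extended Euclidean algorithm: return (g, x, y) with x*a + y*b == g."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def gcd_step(a, b):
    """The invertible 2x2 matrix sending the column (a, b) to (gcd(a,b), *)."""
    g, x, y = ext_gcd(a, b)
    u, v = a // g, b // g            # then x*u + y*v == g/g == 1
    M = [[x, y], [-v, u]]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]   # equals x*u + y*v
    return M, det

M, det = gcd_step(14, 18)
assert det == 1                            # so M is invertible over Z
assert M[0][0] * 14 + M[0][1] * 18 == 2    # top entry is gcd(14, 18)
```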

Let $s_1$ be a GCD of all the entries $a_{ij}$. Now by repeatedly applying this algorithm, we can cause $s_1$ to appear in the upper left-hand corner. Then, we use it to kill off all the entries in the first row and the first column (possible since $s_1$ divides every entry), thus arriving at a matrix

$$\begin{pmatrix} s_1 & 0 & 0 & \dots & 0 \\ 0 & a_{22}' & a_{23}' & \dots & a_{2m}' \\ 0 & a_{32}' & a_{33}' & \dots & a_{3m}' \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & a_{n2}' & a_{n3}' & \dots & a_{nm}' \end{pmatrix}.$$

Now we repeat the same procedure with this lower-right $(n-1) \times (m-1)$ matrix, and so on. Each entry of the smaller matrix is a combination of the original entries, so it is still divisible by $s_1$; this is what gives the chain $s_1 \mid s_2 \mid \dots$. This gives the Smith normal form. $\Box$

With the Smith normal form, we have in the original situation that $M \cong R^{\oplus d} / \operatorname{im} T$, and applying the theorem to $T$ completes the proof of the structure theorem.

5. Applications

Now, we can apply our structure theorem! I’ll just sketch proofs of these and let the reader fill in details.

Corollary 19 (Finite-dimensional vector spaces are all isomorphic)

Let $V$ be a vector space over a field $k$ with a finite spanning set of vectors. Then $V \cong k^{\oplus n}$ for some $n$.

Proof: In the structure theorem, each $s_i$ is either zero or a unit, so $k / (s_i) \in \{k, 0\}$. $\Box$

Corollary 20 (Frobenius normal form)

Let $T : V \rightarrow V$ where $V$ is a finite-dimensional vector space over an arbitrary field $k$ (not necessarily algebraically closed). Then one can write $T$ as a block-diagonal matrix whose blocks are all of the form

$$\begin{pmatrix} 0 & 0 & 0 & \dots & 0 & \ast \\ 1 & 0 & 0 & \dots & 0 & \ast \\ 0 & 1 & 0 & \dots & 0 & \ast \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & \ast \end{pmatrix}.$$

Proof: View $V$ as a $k[x]$-module with action $x \cdot v = T(v)$. By the structure theorem, $V \cong \bigoplus_i k[x] / (s_i)$ for some polynomials $s_i$, where $s_1 \mid s_2 \mid s_3 \mid \dots$. On each summand, writing the action of $x$ in the basis $1, x, \dots, x^{\deg s_i - 1}$ gives a block of the described form. $\Box$
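Concretely, the block coming from a summand $k[x]/(s_i)$ is the matrix of multiplication by $x$ in the basis $1, x, \dots, x^{d-1}$, where $d = \deg s_i$ (often called the companion matrix of $s_i$). A sketch of building it from the coefficients (conventions and names mine):

```python
def companion(coeffs):
    """Companion matrix of the monic polynomial
    p(x) = x^d + c_{d-1} x^{d-1} + ... + c_1 x + c_0,
    given coeffs = [c_0, c_1, ..., c_{d-1}].  Column j holds x * x^j
    expressed in the basis 1, x, ..., x^{d-1} of k[x]/(p).
    """
    d = len(coeffs)
    M = [[0] * d for _ in range(d)]
    for j in range(d - 1):
        M[j + 1][j] = 1              # x * x^j = x^{j+1} when j < d-1
    for i in range(d):
        M[i][d - 1] = -coeffs[i]     # x * x^{d-1} = -(c_0 + c_1 x + ...)
    return M

# p(x) = x^3 - 2x - 5: a single block of the shape in Corollary 20
assert companion([-5, -2, 0]) == [[0, 0, 5],
                                  [1, 0, 2],
                                  [0, 1, 0]]
```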

Corollary 21 (Jordan normal form)

Let $T : V \rightarrow V$ where $V$ is a finite-dimensional vector space over an algebraically closed field $k$. Then $T$ can be written in Jordan form.

Proof: We now use the structure theorem in its primary form. Since $k$ is algebraically closed, each $p_i$ is a linear factor, so every summand looks like $k[x] / (x-a)^m$ for some $m$. $\Box$
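To see the Jordan block appear, use the basis $1, (x-a), \dots, (x-a)^{m-1}$ of $k[x]/(x-a)^m$: since $x \cdot (x-a)^i = a(x-a)^i + (x-a)^{i+1}$, with $(x-a)^m = 0$, multiplication by $x$ has $a$’s on the diagonal and $1$’s just below it (the usual upper-triangular Jordan block is its transpose). A sketch (names mine):

```python
def jordan_block(a, m):
    """Matrix of multiplication by x on k[x]/(x-a)^m in the basis
    1, (x-a), ..., (x-a)^{m-1}: column i records
    x*(x-a)^i = a*(x-a)^i + (x-a)^{i+1}, with (x-a)^m = 0.
    """
    J = [[0] * m for _ in range(m)]
    for i in range(m):
        J[i][i] = a
        if i + 1 < m:
            J[i + 1][i] = 1
    return J

assert jordan_block(3, 3) == [[3, 0, 0],
                              [1, 3, 0],
                              [0, 1, 3]]
```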

This is a draft of Chapter 15 of the Napkin.