vEnhance's avatar

Apr 10, 2015

🖉 Cauchy's Functional Equation and Zorn's Lemma

This is a draft of an appendix chapter for my Napkin project.

In the world of olympiad math, there’s a famous functional equation that goes as follows: f:RRf(x+y)=f(x)+f(y).f : {\mathbb R} \rightarrow {\mathbb R} \qquad f(x+y) = f(x) + f(y). Everyone knows what its solutions are! There’s an obvious family of solutions f(x)=cxf(x) = cx. Then there’s also this family of… uh… noncontinuous solutions (mumble grumble) pathological (mumble mumble) Axiom of Choice (grumble).

Don't worry, I know what I'm doing!
Don’t worry, I know what I’m doing!

There’s also this thing called Zorn’s Lemma. It sounds terrifying, because it’s equivalent to the Axiom of Choice, which is also terrifying because why not.

In this post I will try to de-terrify these things, because they’re really not terrifying and I’m not sure why no one bothered to explain this properly yet. I have yet to see an olympiad handout that explains how you would construct a pathological solution, even though it’s really quite natural. So let me fix this problem now…

1. Let’s Construct a Monster

Let’s try to construct a “bad” function ff and see what happens.

By scaling, let’s assume WLOG that f(1)=1f(1) = 1. Thus f(n)=nf(n) = n for every integer nn, and you can easily show from here that f(mn)=mn.f\left( \frac mn \right) = \frac mn. So ff is determined for all rationals. And then you get stuck.

None of this is useful for determining, say, f(2)f(\sqrt 2). You could add and subtract rational numbers all day and, say, 2\sqrt 2 isn’t going to show up at all.

Well, we’re trying to set things on fire anyways, so let’s set f(2)=2015f(\sqrt 2) = 2015 because why not? By the same induction, we get f(n2)=2015nf(n\sqrt2) = 2015n, and then that f(a+b2)=a+2015b.f\left( a + b \sqrt 2 \right) = a + 2015b. Here aa and bb are rationals. Well, so far so good – as written, this is a perfectly good solution, other than the fact that we’ve only defined ff on a tiny portion of the real numbers.

Well, we can do this all day: f(a+b2+c3+dπ)=a+2015b+1337c999d.f\left( a + b \sqrt 2 + c \sqrt 3 + d \pi \right) = a + 2015b + 1337c - 999d. Perfectly consistent.

You can kind of see how we should keep going now. Just keep throwing in new real numbers which are “independent” to the previous few, assigning them to whatever junk we want. It feels like it should be workable. . .

In a moment I’ll explain what “independent” means (though you might be able to guess already), but at the moment there’s a bigger issue: no matter how many numbers we throw, it seems like we’ll never finish. Let’s address the second issue first.

2. Review of Finite Induction

When you do induction, you get to count off 11, 22, 33, … and so on. So for example, suppose we had a “problem” such as the following: Prove that the intersection of nn open intervals is either \varnothing or an open interval. You can do this by induction easily: it’s true for n=2n = 2, and for the larger cases it’s similarly easy.

But you can’t conclude from this that infinitely many open intervals intersect at some open interval. Indeed, this is false: consider the intervals

(1,1),(12,12),(13,13),(14,14), \left( -1, 1 \right), \quad \left( -\frac12, \frac12 \right), \quad \left( -\frac13, \frac13 \right), \quad \left( -\frac14, \frac14 \right), \quad \dots

This infinite set of intervals intersects at a single point {0}\{0\}!

The moral of the story is that induction doesn’t let us reach infinity. Too bad, because we’d have loved to use induction to help us construct a monster. That’s what we’re doing, after all – adding things in one by one.

3. Transfinite Induction

Well, it turns out we can, but we need a new notion of number.

For this we need a notion of an ordinal number. I defined these in their full glory a previous blog post, but I don’t need the full details of that. Here’s what I want to say: after all the natural numbers 0,1,,0, 1, \dots, I’ll put a new number called ω\omega, representing how large the natural numbers are. After that there’s a number called ω+1,ω+2,\omega+1, \omega+2, \dots and eventually a number called 2ω2\omega.

The list goes on:

0,1,2,3,,ωω+1,ω+2,,ω+ω2ω+1,2ω+2,,3ωω2+1,ω2+2,ω3,,ω4,,ωω,ωωω \begin{aligned} 0, & 1, 2, 3, \dots, \omega \\ & \omega+1, \omega+2, \dots, \omega+\omega \\ & 2\omega+1, 2\omega+2, \dots, 3\omega \\ & \vdots \\ & \omega^2 + 1, \omega^2+2, \dots \\ & \vdots \\ & \omega^3, \dots, \omega^4, \dots, \omega^\omega \\ & \vdots, \\ & \omega^{\omega^{\omega^{\dots}}} \\ \end{aligned}

Pictorially, it kind of looks like this:

Ordinals up to ωω\omega^\omegaωω.
Ordinals up to ωω\omega^\omegaωω.

Anyways, in the same way that natural numbers dominate all finite sets, the ordinals dominate all the sets.

Theorem 1. For every set SS there’s some ordinal α\alpha which is bigger than it.

But it turns out (and you can intuitively see) that as large as the ordinals grow, there is no infinite descending chain. Meaning: if I start at an ordinal (like 2ω+42 \omega + 4) and jump down, I can only take finitely many jumps before I hit 00. (To see this, try writing down a chain starting at 2ω+42 \omega + 4 yourself.) Hence, induction and recursion still work verbatim:

Theorem 2. Given a statement P()P(-), suppose that

  • P(0)P(0) is true, and
  • If P(α)P(\alpha) is true for all α<β\alpha < \beta, then P(β)P(\beta) is true.

Then P(β)P(\beta) is true.

Similarly, you’re allowed to do recursion to define xβx_\beta if you know the value of xαx_\alpha for all α<β\alpha < \beta.

The difference from normal induction or recursion is that we’ll often only do things like “define xn+1=x_{n+1} = \dots”. But this is not enough to define xαx_\alpha for all α\alpha. To see this, try using our normal induction and see how far we can climb up the ladder.

Answer: you can’t get ω\omega! It’s not of the form n+1n+1 for any of our natural numbers nn – our finite induction only lets us get up to the ordinals less than ω\omega. Similarly, the simple +1+1 doesn’t let us hit the ordinal 2ω2\omega, even if we already have ω+n\omega+n for all nn. Such ordinals are called limit ordinals. The ordinal that are of the form α+1\alpha+1 are called successor ordinals.

So a transfinite induction or recursion is very often broken up into three cases. In the induction phrasing, it looks like

  • (Zero Case) First, resolve P(0)P(0).
  • (Successor Case) Show that from P(α)P(\alpha) we can get P(α+1)P(\alpha+1).
  • (Limit Case) Show that P(λ)P(\lambda) holds given P(α)P(\alpha) for all α<λ\alpha < \lambda, where λ\lambda is a limit ordinal.

Similarly, transfinite recursion often is split into cases too.

  • (Zero Case) First, define x0x_0.
  • (Successor Case) Define xα+1x_{\alpha+1} from xαx_\alpha.
  • (Limit Case) Define xλx_\lambda from xαx_\alpha for all α<λ\alpha < \lambda, where λ\lambda is a limit ordinal.

In both situations, finite induction only does the first two cases, but if we’re able to do the third case we can climb far above the barrier ω\omega.

4. Wrapping Up Functional Equations

Let’s return to solving our problem.

Let SnS_n denote the set of “base” numbers we have at the nn the step. In our example, we might have

S1={1},S2={1,2},S3={1,2,3},S4={1,2,3,π}, S_1 = \left\{ 1 \right\}, \quad S_2 = \left\{ 1, \sqrt 2 \right\}, \quad S_3 = \left\{ 1, \sqrt 2, \sqrt 3 \right\}, \quad S_4 = \left\{ 1, \sqrt 2, \sqrt 3, \pi \right\}, \quad \dots

and we’d like to keep building up SiS_i until we can express all real numbers. For completeness, let me declare S0=S_0 = \varnothing.

First, I need to be more precise about “independent”. Intuitively, this construction is working because a+b2+c3+dπa + b \sqrt 2 + c \sqrt 3 + d \pi is never going to equal zero for rational numbers aa, bb, cc, dd (other than all zeros). In general, a set XX of numbers is “independent” if the combination c1x1+c2x2++cmxm=0c_1x_1 + c_2x_2 + \dots + c_mx_m = 0 never occurs for rational numbers Q{\mathbb Q} unless c1=c2==cm=0c_1 = c_2 = \dots = c_m = 0. Here xiXx_i \in X are distinct. Note that even if XX is infinite, I can only take finite sums! (This notion has a name: we want XX to be linearly independent over Q{\mathbb Q}.)

When do we stop? We’d like to stop when we have a set SsomethingS_{\text{something}} that’s so big, every real number can be written in terms of the independent numbers. (This notion also has a name: it’s called a Q{\mathbb Q}-basis.) Let’s call such a set spanning; we stop once we hit a spanning set.

The idea that we can induct still seems okay: suppose SαS_\alpha isn’t spanning. Then there’s some number that is independent of SαS_\alpha, say 2015π\sqrt{2015}\pi or something. Then we just add it to get Sα+1S_{\alpha+1}. And we keep going.

Unfortunately, as I said before it’s not enough to be able to go from SαS_\alpha to Sα+1S_{\alpha+1} (successor case); we need to handle the limit case as well. But it turns out there’s a trick we can do. Suppose we’ve constructed all the sets S0S_0, S1S_1, S2S_2, …, one for each positive integer nn, and none of them are spanning. The next thing I want to construct is SωS_\omega; somehow I have to “jump”. To do this, I now take the infinite union SωS0S1S2.S_\omega \coloneqq S_0 \cup S_1 \cup S_2 \cup \dots. The elements of this set are also independent (why?).

Ta-da! With the simple trick of “union all the existing sets”, we’ve just jumped the hurdle to the first limit ordinal ω\omega. Then we can construct Sω+1S_{\omega+1}, Sω+2S_{\omega+2}, …, and for the next limit we just do the same trick of “union-ing” all the previous sets.

So we can formalize the process as follows:

  1. Let S0=S_0 = \varnothing.
  2. For a successor stage Sα+1S_{\alpha+1}, add any element to SαS_\alpha to obtain Sα+1S_{\alpha+1}.
  3. For a limit stage SλS_{\lambda}, take the union γ<λSγ\bigcup_{\gamma < \lambda} S_\gamma.

How do we know that we’ll stop eventually? Well, the thing is that this process consumes a lot of real numbers. In particular, the ordinals get larger than the size of R{\mathbb R}. Hence if we don’t stop we will quite literally reach a point where we have used up every single real number. Clearly that’s impossible, because by then the elements can’t possibly be independent! (EDIT Dec 20 2015: To be clear, the claim that “ordinals get larger than the size of the reals” requires the Axiom of Choice; one can’t do this construction using transfinite induction alone. Thanks reddit for calling me out on this.)

So by transfinite recursion (and Choice), we eventually hit some SγS_\gamma which is spanning: the elements are all independent, but every real number can be expressed using it. Done! This set has a name: a Hamel basis.

Thus this problem is dead, dead, dead. Any questions?

Squidzilla.
Squidzilla.

5. Zorn’s Lemma

Now I can tell you what Zorn’s Lemma is: it lets us do the same thing in any poset.

We can think of the above example as follows: consider all sets of independent elements. These form a partially ordered set by inclusion, and what we did was quite literally climb up a chain S0S1S2.S_0 \subset S_1 \subset S_2 \subset \dots. It’s not quite climbing since we weren’t just going one step at a time: we had to do “jumps” to get up to SωS_\omega and resume climbing. But the main idea is to climb up a poset until we’re at the very top; in the previous case, when we reached the spanning set.

The same thing works verbatim with any partially ordered set P\mathcal P. Let’s define some terminology. A local maximum (or maximal element) of the entire poset P\mathcal P is an element which has no other elements strictly greater than it.

Now a chain of length γ\gamma is a set of elements pαp_\alpha for every α<γ\alpha < \gamma such that p0<p1<p2<p_0 < p_1 < p_2 < \dots. (Observe that a chain has a last element if and only if γ\gamma is a successor ordinal, like ω+3\omega+3.) An upper bound to a chain is an element p~\tilde p which is greater than or equal to all elements of the chain. In particular, if γ\gamma is a successor ordinal, then just taking the last element of the chain works.

In this language, Zorn’s Lemma states that

Theorem 3 (Zorn’s Lemma). Let P\mathcal P be a nonempty partially ordered set. If every chain has an upper bound, then P\mathcal P has a local maximum.

Chains with length equal to a successor ordinal always have upper bounds, but this is not true in the limit case. So the hypothesis of Zorn’s Lemma is exactly what lets us “jump” up to define pωp_\omega and other limit ordinals. And the proof of Zorn’s Lemma is straightforward: keep climbing up the poset at successor stages, using Zorn’s condition to jump up at limit stages, and thus building a really long chain. But we have to eventually stop, or we literally run out of elements of P\mathcal P. And the only possible stopping point is a local maximum.

If we want to phrase our previous solution in terms of Zorn’s Lemma, we’d say: Proof: Look at the poset whose elements are sets of independent real numbers. Every chain S0S1S_0 \subset S_1 \subset \dots has an upper bound Sα\bigcup S_\alpha (which you have to check is actually an element of the poset). Thus by Zorn, there is a local maximum SS. Then SS must be spanning, because otherwise we could add an element to it. \Box

So really, Zorn’s Lemma is encoding all of the work of climbing that I argued earlier. It’s a neat little package that captures all the boilerplate, and tells you exactly what you need to check.

Zornaholic.
Zornaholic.

One last thing you might ask: where is the Axiom of Choice used? Well, the idea is that for any chain there could be lots of p~\tilde p’s, and you need to pick one of them. Since you are making arbitrary choices infinitely many times, you need the Axiom of Choice. But really, it’s nothing special. [EDIT: AM points out that in order to talk about cardinalities in Theorem 1, one also needs the Axiom of Choice.]

6. Conclusion

In the words of Timothy Gowers,

If you are building a mathematical object in stages and find that (i) you have not finished even after infinitely many stages, and (ii) there seems to be nothing to stop you continuing to build, then Zorn’s lemma may well be able to help you.

Really, there’s nothing tricky at all here. People seem scared of Zorn’s Lemma, and claim it’s not intuitive or something. But really, all we’re doing is climbing up a poset. Nothing tricky at all.