
Jun 03, 2016

Things Fourier

For some reason several classes at MIT this year involve Fourier analysis. I was always confused about this as a high schooler, because no one ever gave me the “orthonormal basis” explanation, so here goes. As a bonus, I also prove a form of Arrow’s Impossibility Theorem using binary Fourier analysis, and then talk about the fancier generalizations using Pontryagin duality and the Peter-Weyl theorem.

In what follows, we let $\mathbb T = \mathbb R/\mathbb Z$ denote the “circle group”, thought of as the additive group of “real numbers modulo $1$”. There is a canonical map $e : \mathbb T \rightarrow \mathbb C$ sending $\mathbb T$ to the complex unit circle, given by $e(\theta) = \exp(2\pi i \theta)$.

Disclaimer: I will deliberately be sloppy with convergence issues, in part because I don’t fully understand them myself, and in part because I don’t care.

1. Synopsis

Suppose we have a domain $Z$ and are interested in functions $f : Z \rightarrow \mathbb C$. Naturally, the set of such functions forms a complex vector space. We like to equip the set of such functions with a positive definite inner product. The idea of Fourier analysis is to then select an orthonormal basis for this set of functions, say $(e_\xi)_{\xi}$, which we call the characters; the indices $\xi$ are called frequencies. In that case, since we have a basis, every function $f : Z \rightarrow \mathbb C$ becomes a sum $$f(x) = \sum_{\xi} \widehat f(\xi) e_\xi(x)$$ where $\widehat f(\xi)$ are complex coefficients of the basis; appropriately we call $\widehat f$ the Fourier coefficients. The variable $x \in Z$ is referred to as the physical variable. This is generally good because the characters are deliberately chosen to be nice “symmetric” functions, like sine or cosine waves or other periodic functions. Thus we decompose an arbitrarily complicated function into a sum of nice ones.

For convenience, we record a few facts about orthonormal bases.

Proposition 1 (Facts about orthonormal bases)

Let $V$ be a complex Hilbert space with inner form $\left< -,-\right>$ and suppose $x = \sum_\xi a_\xi e_\xi$ and $y = \sum_\xi b_\xi e_\xi$ where $e_\xi$ are an orthonormal basis. Then

$$\begin{aligned} \left< x,x \right> &= \sum_\xi |a_\xi|^2 \\ a_\xi &= \left< x, e_\xi \right> \\ \left< x,y \right> &= \sum_\xi a_\xi \overline{b_\xi}. \end{aligned}$$

2. Common Examples

2.1. Binary Fourier analysis on $\{\pm1\}^n$

Let $Z = \{\pm 1\}^n$ for some positive integer $n$, so we are considering functions $f(x_1, \dots, x_n)$ accepting binary values. Then the functions $Z \rightarrow \mathbb C$ form a $2^n$-dimensional vector space $\mathbb C^Z$, and we endow it with the inner form $$\left< f,g \right> = \frac{1}{2^n} \sum_{x \in Z} f(x) \overline{g(x)}.$$ In particular, $$\left< f,f \right> = \frac{1}{2^n} \sum_{x \in Z} \left\lvert f(x) \right\rvert^2$$ is the average of the squares; this establishes also that $\left< -,-\right>$ is positive definite.

In that case, the multilinear monomials form an orthonormal basis of $\mathbb C^Z$, that is, the polynomials $$\chi_S(x_1, \dots, x_n) = \prod_{s \in S} x_s.$$ Thus our frequency set is actually the set of subsets $S \subseteq \{1, \dots, n\}$, and we have a decomposition $$f = \sum_{S \subseteq \{1, \dots, n\}} \widehat f(S) \chi_S.$$

Example 2 (An example of binary Fourier analysis)

Let $n = 2$. Then binary functions $\{\pm 1\}^2 \rightarrow \mathbb C$ have a basis given by the four polynomials $$1, \quad x_1, \quad x_2, \quad x_1x_2.$$ For example, consider the function $f$ which is $1$ at $(1,1)$ and $0$ elsewhere. Then we can put $$f(x_1, x_2) = \frac{x_1+1}{2} \cdot \frac{x_2+1}{2} = \frac14 \left( 1 + x_1 + x_2 + x_1x_2 \right).$$ So the Fourier coefficients are $\widehat f(S) = \frac 14$ for each of the four $S$’s.
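As a sanity check, here is a minimal sketch that recovers these coefficients by brute force, using $\widehat f(S) = \left< f, \chi_S \right>$ (one of the orthonormal basis facts from Proposition 1). The helper names `chi` and `fourier_coefficients` are made up for illustration, and subsets are stored as tuples of 0-based indices.

```python
from itertools import product
from fractions import Fraction

def chi(S, x):
    """Character chi_S(x): the product of x[s] over s in S (0-based indices)."""
    result = 1
    for s in S:
        result *= x[s]
    return result

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} with f_hat(S) = 2^(-n) * sum_x f(x) * chi_S(x).

    chi_S is real-valued, so no complex conjugate is needed here.
    """
    points = list(product([1, -1], repeat=n))
    coeffs = {}
    for mask in product([0, 1], repeat=n):
        S = tuple(i for i in range(n) if mask[i])
        total = sum(f(x) * chi(S, x) for x in points)
        coeffs[S] = Fraction(total, 2 ** n)
    return coeffs

# The function from Example 2: 1 at (1, 1) and 0 elsewhere.
indicator = lambda x: 1 if x == (1, 1) else 0
coeffs = fourier_coefficients(indicator, 2)
for S, value in sorted(coeffs.items()):
    print(S, value)
```

All four printed coefficients should come out as $\frac14$, matching the expansion above.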

This notion is useful in particular for binary functions $f : \{\pm1\}^n \rightarrow \{\pm1\}$; for these functions (and products thereof), we always have $\left< f,f \right> = 1$.

It is worth noting that the frequency $\varnothing$ plays a special role:

Exercise 3. Show that $$\widehat f(\varnothing) = \frac{1}{|Z|} \sum_{x \in Z} f(x).$$

2.2. Fourier analysis on finite groups $Z$

This is the Fourier analysis used in this post and this post. Here, we have a finite abelian group $Z$, and consider functions $Z \rightarrow \mathbb C$; this is a $|Z|$-dimensional vector space. The inner product is the same as before: $$\left< f,g \right> = \frac{1}{|Z|} \sum_{x \in Z} f(x) \overline{g(x)}.$$ Now here is how we generate the characters. We equip $Z$ with a non-degenerate symmetric bilinear form $$Z \times Z \xrightarrow{\cdot} \mathbb T \qquad (\xi, x) \mapsto \xi \cdot x.$$ Experts may already recognize this as a choice of isomorphism between $Z$ and its Pontryagin dual. This time the characters are given by $$\left( e_\xi \right)_{\xi \in Z} \qquad \text{where} \qquad e_\xi(x) = e(\xi \cdot x).$$ In this way, the set of frequencies is also $Z$, but the $\xi \in Z$ play very different roles from the “physical” $x \in Z$. (It is not too hard to check these indeed form an orthonormal basis in the function space $\mathbb C^{\left\lvert Z \right\rvert}$, since we assumed that $\cdot$ is non-degenerate.)
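The orthonormality claim is easy to test numerically. The sketch below assumes the concrete choice $Z = \mathbb Z/6\mathbb Z$ with pairing $\xi \cdot x = \xi x / 6$ (my choice for illustration; any cyclic group works the same way) and builds the Gram matrix of the characters, which should be the identity up to rounding.

```python
import cmath

def e(theta):
    """The canonical map e : T -> C, e(theta) = exp(2*pi*i*theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def character(xi, n):
    """Values of e_xi on Z/n, assuming the pairing xi . x = xi * x / n."""
    return [e(xi * x / n) for x in range(n)]

def inner(f, g):
    """<f, g> = (1/|Z|) * sum_x f(x) * conj(g(x))."""
    return sum(a * b.conjugate() for a, b in zip(f, g)) / len(f)

def gram_matrix(n):
    chars = [character(xi, n) for xi in range(n)]
    return [[inner(u, v) for v in chars] for u in chars]

G = gram_matrix(6)
# Largest deviation of the Gram matrix from the identity matrix.
print(max(abs(G[i][j] - (1 if i == j else 0)) for i in range(6) for j in range(6)))
```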

Example 4 (Cube roots of unity filter)

Suppose $Z = \mathbb Z/3\mathbb Z$, with the bilinear form given by $\xi \cdot x = (\xi x)/3$. Let $\omega = \exp(\frac 23 \pi i)$ be a primitive cube root of unity. Note that $$e_\xi(x) = \begin{cases} 1 & \xi = 0 \\ \omega^x & \xi = 1 \\ \omega^{2x} & \xi = 2. \end{cases}$$ Then given $f : Z \rightarrow \mathbb C$ with $f(0) = a$, $f(1) = b$, $f(2) = c$, we obtain

$$f(x) = \frac{a+b+c}{3} \cdot 1 + \frac{a + \omega^2 b + \omega c}{3} \cdot \omega^x + \frac{a + \omega b + \omega^2 c}{3} \cdot \omega^{2x}.$$

In this way we derive that the transforms are

$$\begin{aligned} \widehat f(0) &= \frac{a+b+c}{3} \\ \widehat f(1) &= \frac{a+\omega^2 b+ \omega c}{3} \\ \widehat f(2) &= \frac{a+\omega b+\omega^2c}{3}. \end{aligned}$$

Exercise 5. Show that $$\widehat f(0) = \frac{1}{|Z|} \sum_{x \in Z} f(x).$$
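To see Example 4 in action, here is a small sketch that computes the three coefficients for sample values of $a$, $b$, $c$ (the numbers are arbitrary, picked only for illustration) and checks that summing $\widehat f(\xi) \omega^{\xi x}$ recovers $f$.

```python
import cmath

omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity

def transform(values):
    """f_hat(xi) = (1/3) * sum_x f(x) * conj(e_xi(x)), where e_xi(x) = omega^(xi*x)."""
    n = len(values)
    return [sum(values[x] * omega ** (-xi * x) for x in range(n)) / n
            for xi in range(n)]

def inverse(coeffs):
    """f(x) = sum_xi f_hat(xi) * omega^(xi*x)."""
    n = len(coeffs)
    return [sum(coeffs[xi] * omega ** (xi * x) for xi in range(n))
            for x in range(n)]

a, b, c = 2.0, -1.0, 5.0  # arbitrary sample values of f(0), f(1), f(2)
f_hat = transform([a, b, c])
recovered = inverse(f_hat)

# Compare with the closed forms from Example 4.
expected = [(a + b + c) / 3,
            (a + omega ** 2 * b + omega * c) / 3,
            (a + omega * b + omega ** 2 * c) / 3]
print(f_hat)
print(recovered)
```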

Olympiad contestants may recognize the previous example as a “roots of unity filter”, which is exactly the point. For concreteness, suppose one wants to compute $$\binom{1000}{0} + \binom{1000}{3} + \dots + \binom{1000}{999}.$$ In that case, we can consider the function $$w : \mathbb Z/3 \rightarrow \mathbb C$$ such that $w(0) = 1$ but $w(1) = w(2) = 0$. By abuse of notation we will also think of $w$ as a function $w : \mathbb Z \twoheadrightarrow \mathbb Z/3 \rightarrow \mathbb C$. Then the sum in question is

$$\begin{aligned} \sum_n \binom{1000}{n} w(n) &= \sum_n \binom{1000}{n} \sum_{k=0,1,2} \widehat w(k) \omega^{kn} \\ &= \sum_{k=0,1,2} \widehat w(k) \sum_n \binom{1000}{n} \omega^{kn} \\ &= \sum_{k=0,1,2} \widehat w(k) (1+\omega^k)^{1000}. \end{aligned}$$

In our situation, we have $\widehat w(0) = \widehat w(1) = \widehat w(2) = \frac13$, and we have evaluated the desired sum. More generally, we can take any periodic weight $w$ and use Fourier analysis in order to interchange the order of summation.
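The final expression can be made fully explicit. Since $1 + \omega = e(\frac16)$ and $1 + \omega^2 = e(-\frac16)$, the two non-trivial terms add up to $2\cos(n\pi/3)$ when the exponent is $n$, which is an integer depending only on $n \bmod 6$. The following sketch (function names are mine) compares the direct sum against this closed form using exact integer arithmetic.

```python
from math import comb

def filtered_sum_direct(n):
    """Sum of binom(n, k) over all k divisible by 3."""
    return sum(comb(n, k) for k in range(0, n + 1, 3))

# Exact integer values of 2*cos(n*pi/3), indexed by n mod 6.
TWO_COS = [2, 1, -1, -2, -1, 1]

def filtered_sum_closed_form(n):
    """(1/3) * ((1+1)^n + (1+w)^n + (1+w^2)^n), using (1+w)^n + (1+w^2)^n = 2*cos(n*pi/3)."""
    return (2 ** n + TWO_COS[n % 6]) // 3

print(filtered_sum_direct(1000) == filtered_sum_closed_form(1000))
```

For $n = 1000$ we have $1000 \equiv 4 \pmod 6$, so the sum equals $\frac{2^{1000} - 1}{3}$.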

Example 6 (Binary Fourier analysis)

Suppose $Z = \{\pm 1\}^n$, viewed as an abelian group under pointwise multiplication hence isomorphic to $(\mathbb Z/2\mathbb Z)^{\oplus n}$. Assume we pick the dot product defined by $$\xi \cdot x = \frac{1}{2} \sum_i \frac{1-\xi_i}{2} \cdot \frac{1-x_i}{2}$$ where $\xi = (\xi_1, \dots, \xi_n)$ and $x = (x_1, \dots, x_n)$, so that the $i$-th term contributes $\frac12$ exactly when $\xi_i = x_i = -1$.

We claim this coincides with the first example we gave. Indeed, let $S \subseteq \{1, \dots, n\}$ and let $\xi \in \{\pm1\}^n$ be the element which is $-1$ at positions in $S$, and $+1$ at positions not in $S$. Since $e(\frac12) = -1$, we get $e_\xi(x) = \prod_{s \in S} (-1)^{(1-x_s)/2} = \prod_{s \in S} x_s$, so the character $\chi_S$ from the previous example coincides with the character $e_\xi$ in the new notation. In particular, $\widehat f(S) = \widehat f(\xi)$.

Thus Fourier analysis on a finite group $Z$ subsumes binary Fourier analysis.

2.3. Fourier series for functions in $L^2([-\pi, \pi])$

Now we consider the space $L^2([-\pi, \pi])$ of square-integrable functions $[-\pi, \pi] \rightarrow \mathbb C$, with inner form $$\left< f,g \right> = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \overline{g(x)} \; dx.$$ Sadly, this is not a finite-dimensional vector space, but fortunately it is a Hilbert space so we are still fine. In this case, an orthonormal basis must allow infinite linear combinations, as long as the sum of squares is finite.

Now, it turns out in this case that $$(e_n)_{n \in \mathbb Z} \qquad\text{where}\qquad e_n(x) = \exp(inx)$$ is an orthonormal basis for $L^2([-\pi, \pi])$. Thus this time the frequency set $\mathbb Z$ is infinite. So every function $f \in L^2([-\pi, \pi])$ decomposes as $$f(x) = \sum_n \widehat f(n) \exp(inx)$$ for some coefficients $\widehat f(n)$.

This is a little worse than our finite examples: instead of a finite sum on the right-hand side, we actually have an infinite sum. This is because our set of frequencies is now $\mathbb Z$, which isn’t finite. In this case the $\widehat f$ need not be finitely supported, but do satisfy $\sum_n |\widehat f(n)|^2 < \infty$.

Since the frequency set is indexed by $\mathbb Z$, we call this a Fourier series to reflect the fact that the index is $n \in \mathbb Z$.

Exercise 7. Show once again $$\widehat f(0) = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \; dx.$$

Often we require that the function $f$ satisfies $f(-\pi) = f(\pi)$, so that $f$ becomes a periodic function, and we can think of it as $f : \mathbb T \rightarrow \mathbb C$.

2.4. Summary

We summarize our various flavors of Fourier analysis in the following table.

$$\begin{array}{llll} \hline \text{Type} & \text{Physical var} & \text{Frequency var} & \text{Basis functions} \\ \hline \textbf{Binary} & \{\pm1\}^n & \text{Subsets } S \subseteq \left\{ 1, \dots, n \right\} & \prod_{s \in S} x_s \\ \textbf{Finite group} & Z & \xi \in Z, \text{ choice of } \cdot & e(\xi \cdot x) \\ \textbf{Fourier series} & \mathbb T \text{ or } [-\pi, \pi] & n \in \mathbb Z & \exp(inx) \\ \hline \end{array}$$

In fact, we will soon see that all these examples are subsumed by Pontryagin duality for compact groups $G$.

3. Parseval and friends

The notion of an orthonormal basis makes several “big-name” results in Fourier analysis quite lucid. Basically, we can take every result from Proposition 1, translate it into the context of our Fourier analysis, and get a big-name result.

Corollary 8 (Parseval theorem)

Let $f : Z \rightarrow \mathbb C$, where $Z$ is a finite abelian group. Then $$\sum_\xi |\widehat f(\xi)|^2 = \frac{1}{|Z|} \sum_{x \in Z} |f(x)|^2.$$ Similarly, if $f : [-\pi, \pi] \rightarrow \mathbb C$ is square-integrable then its Fourier series satisfies $$\sum_n |\widehat f(n)|^2 = \frac{1}{2\pi} \int_{[-\pi, \pi]} |f(x)|^2 \; dx.$$

Proof: Recall that $\left< f,f\right>$ is equal to the sum of the squared absolute values of the coefficients. $\Box$

Corollary 9 (Formulas for $\widehat f$)

Let $f : Z \rightarrow \mathbb C$, where $Z$ is a finite abelian group. Then $$\widehat f(\xi) = \frac{1}{|Z|} \sum_{x \in Z} f(x) \overline{e_\xi(x)}.$$ Similarly, if $f : [-\pi, \pi] \rightarrow \mathbb C$ is square-integrable then its Fourier series is given by $$\widehat f(n) = \frac{1}{2\pi} \int_{[-\pi, \pi]} f(x) \exp(-inx) \; dx.$$

Proof: Recall that in an orthonormal basis $(e_\xi)_\xi$, the coefficient of $e_\xi$ in $f$ is $\left< f, e_\xi\right>$. $\Box$

Note in particular what happens if we select $\xi = 0$ in the above!
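As a worked example of how the last two corollaries combine (this example is my addition, not something the rest of the post depends on), take $f(x) = x$ on $[-\pi, \pi]$. Integrating by parts in the formula for $\widehat f$, and then applying Parseval, gives

$$\begin{aligned} \widehat f(n) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} x \exp(-inx) \; dx = \frac{i(-1)^n}{n} \quad (n \neq 0), \qquad \widehat f(0) = 0, \\ \sum_{n \neq 0} \frac{1}{n^2} &= \sum_n |\widehat f(n)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^2 \; dx = \frac{\pi^2}{3}. \end{aligned}$$

Halving the left-hand side recovers the classical identity $\sum_{n \ge 1} \frac{1}{n^2} = \frac{\pi^2}{6}$.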

Corollary 10 (Plancherel theorem)

Let $f, g : Z \rightarrow \mathbb C$, where $Z$ is a finite abelian group. Then $$\left< f,g \right> = \sum_{\xi \in Z} \widehat f(\xi) \overline{\widehat g(\xi)}.$$ Similarly, if $f, g : [-\pi, \pi] \rightarrow \mathbb C$ are square-integrable then $$\left< f,g \right> = \sum_n \widehat f(n) \overline{\widehat g(n)}.$$

Proof: Guess! $\Box$
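Here is a quick numerical check of Parseval and Plancherel on $\mathbb Z/5\mathbb Z$. This is only a sketch: it assumes the pairing $\xi \cdot x = \xi x/5$, and the test vectors `f` and `g` are arbitrary. Note the asymmetry in normalization, with a factor $\frac{1}{|Z|}$ on the physical side and none on the frequency side.

```python
import cmath

def transform(f):
    """f_hat(xi) = (1/n) * sum_x f(x) * exp(-2*pi*i*xi*x/n) on Z/n."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * xi * x / n) for x in range(n)) / n
            for xi in range(n)]

def inner(f, g):
    """Physical-side inner product, normalized by 1/|Z|."""
    return sum(a * b.conjugate() for a, b in zip(f, g)) / len(f)

def frequency_pairing(f_hat, g_hat):
    """Frequency-side sum of f_hat * conj(g_hat), with no normalization."""
    return sum(a * b.conjugate() for a, b in zip(f_hat, g_hat))

f = [complex(v) for v in (1, 2 - 1j, 0, 3j, -4)]
g = [complex(v) for v in (2, 0, 1 + 1j, -1, 5)]
f_hat, g_hat = transform(f), transform(g)

print(inner(f, g), frequency_pairing(f_hat, g_hat))   # Plancherel
print(inner(f, f), frequency_pairing(f_hat, f_hat))   # Parseval
```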

4. (Optional) Arrow’s Impossibility Theorem

As an application, we now prove a form of Arrow’s theorem. Consider $n$ voters voting among $3$ candidates $A$, $B$, $C$. Each voter $i$ specifies a tuple $v_i = (x_i, y_i, z_i) \in \{\pm1\}^3$ as follows:

  • $x_i = 1$ if voter $i$ ranks $A$ ahead of $B$, and $x_i = -1$ otherwise.
  • $y_i = 1$ if voter $i$ ranks $B$ ahead of $C$, and $y_i = -1$ otherwise.
  • $z_i = 1$ if voter $i$ ranks $C$ ahead of $A$, and $z_i = -1$ otherwise.

Tacitly, we only consider $3! = 6$ possibilities for $v_i$: we forbid “paradoxical” votes of the form $x_i = y_i = z_i$ by assuming that people’s votes are consistent (meaning the preferences are transitive).
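The six consistent votes are easy to enumerate by machine. The sketch below (names are mine) lists them and also computes the expected value of $\prod_{s \in S} x_s y_s$ for independent uniformly random consistent voters, a quantity that the proof of Lemma 12 below relies on; it should equal $(-\frac13)^{|S|}$.

```python
from itertools import product
from fractions import Fraction

# Consistent votes: triples (x, y, z) in {+1,-1}^3 whose entries are not all equal.
VOTES = [v for v in product([1, -1], repeat=3) if len(set(v)) > 1]

def expected_product(size):
    """E[prod over s in S of x_s * y_s] for |S| = size, voters i.i.d. uniform on VOTES."""
    total = 0
    for profile in product(VOTES, repeat=size):
        term = 1
        for (x, y, _z) in profile:
            term *= x * y
        total += term
    return Fraction(total, len(VOTES) ** size)

print(VOTES)
print([expected_product(k) for k in range(4)])
```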

Then, we can consider a voting mechanism

$$\begin{aligned} f : \{\pm1\}^n &\rightarrow \{\pm1\} \\ g : \{\pm1\}^n &\rightarrow \{\pm1\} \\ h : \{\pm1\}^n &\rightarrow \{\pm1\} \end{aligned}$$

such that $f(x_\bullet)$ is the global preference of $A$ vs. $B$, $g(y_\bullet)$ is the global preference of $B$ vs. $C$, and $h(z_\bullet)$ is the global preference of $C$ vs. $A$. We’d like to avoid situations where the global preference $(f(x_\bullet), g(y_\bullet), h(z_\bullet))$ is itself paradoxical.

In fact, we will prove the following theorem:

Theorem 11 (Arrow Impossibility Theorem)

Assume that $(f,g,h)$ always avoids paradoxical outcomes, and assume $\mathbf E f = \mathbf E g = \mathbf E h = 0$. Then $(f,g,h)$ is either a dictatorship or anti-dictatorship: there exists a “dictator” $k$ such that $$f(x_\bullet) = \pm x_k, \qquad g(y_\bullet) = \pm y_k, \qquad h(z_\bullet) = \pm z_k$$ where all three signs coincide.

The “independence of irrelevant alternatives” condition is reflected in the fact that $f$ depends only on $x_\bullet$, $g$ only on $y_\bullet$, and $h$ only on $z_\bullet$. The assumption $\mathbf E f = \mathbf E g = \mathbf E h = 0$ provides symmetry (and e.g. excludes the possibility that $f$, $g$, $h$ are constant functions which ignore voter input). Unlike the usual Arrow theorem, we do not assume that $f(+1, \dots, +1) = +1$ (hence the possibility of anti-dictatorship).

To this end, we actually prove the following result:

Lemma 12. Assume the $n$ voters vote independently at random among the $3! = 6$ possibilities. The probability of a paradoxical outcome is exactly

$$\frac14 + \frac14 \sum_{S \subseteq \{1, \dots, n\}} \left( -\frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right).$$

Proof: Define the Boolean function $D : \{\pm 1\}^3 \rightarrow \mathbb R$ by

$$D(a,b,c) = ab + bc + ca = \begin{cases} 3 & a,b,c \text{ all equal} \\ -1 & a,b,c \text{ not all equal}. \end{cases}$$

Thus paradoxical outcomes arise when $D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) = 3$. Now, we compute for randomly selected $x_\bullet$, $y_\bullet$, $z_\bullet$ that

$$\begin{aligned} \mathbf E D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) &= \mathbf E \sum_S \sum_T \left( \widehat f(S) \widehat g(T) + \widehat g(S) \widehat h(T) + \widehat h(S) \widehat f(T) \right) \left( \chi_S(x_\bullet)\chi_T(y_\bullet) \right) \\ &= \sum_S \sum_T \left( \widehat f(S) \widehat g(T) + \widehat g(S) \widehat h(T) + \widehat h(S) \widehat f(T) \right) \mathbf E\left( \chi_S(x_\bullet)\chi_T(y_\bullet) \right). \end{aligned}$$

Here we abuse notation slightly: the second and third terms should really be paired with $\chi_S(y_\bullet)\chi_T(z_\bullet)$ and $\chi_S(z_\bullet)\chi_T(x_\bullet)$ respectively, but by symmetry all three expectations are equal.

Now we observe that:

  • If $S \neq T$, then $\mathbf E \chi_S(x_\bullet) \chi_T(y_\bullet) = 0$, since if say $s \in S$, $s \notin T$ then $x_s$ affects the parity of the product with 50% either way, and is independent of any other variables in the product.
  • On the other hand, suppose $S = T$. Then

    $$\chi_S(x_\bullet) \chi_T(y_\bullet) = \prod_{s \in S} x_sy_s.$$ Note that $x_sy_s$ is equal to $1$ with probability $\frac13$ and $-1$ with probability $\frac23$ (since $(x_s, y_s, z_s)$ is uniform from $3!=6$ choices, which we can enumerate). From this an inductive calculation on $|S|$ gives that

    $$\prod_{s \in S} x_sy_s = \begin{cases} +1 & \text{ with probability } \frac{1}{2}(1+(-1/3)^{|S|}) \\ -1 & \text{ with probability } \frac{1}{2}(1-(-1/3)^{|S|}). \end{cases}$$

    Thus

    $$\mathbf E \left( \prod_{s \in S} x_sy_s \right) = \left( -\frac13 \right)^{|S|}.$$

Piecing this all together, we now have that

$$\mathbf E D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) = \sum_S \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \left( -\frac13 \right)^{|S|}.$$

Then, we obtain that

$$\begin{aligned} & \mathbf E \frac14 \left( 1 + D(f(x_\bullet), g(y_\bullet), h(z_\bullet)) \right) \\ =& \frac14 + \frac14\sum_S \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \left( -\frac13 \right)^{|S|}. \end{aligned}$$

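By the definition of $D$, the quantity $\frac14(1+D)$ equals $1$ on paradoxical outcomes and $0$ otherwise, so this expectation is exactly the probability of a paradoxical outcome. This gives the desired result. $\Box$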
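Lemma 12 can be checked by brute force for small $n$. The sketch below (all function names are my own) computes the paradox probability both by enumerating all $6^n$ vote profiles and via the Fourier formula, taking $f = g = h$ to be majority vote on $n = 3$ voters as a test case.

```python
from itertools import product
from fractions import Fraction

VOTES = [v for v in product([1, -1], repeat=3) if len(set(v)) > 1]

def chi(S, x):
    result = 1
    for s in S:
        result *= x[s]
    return result

def fourier(f, n):
    """Return {S: f_hat(S)} for a real-valued f on {+1,-1}^n."""
    points = list(product([1, -1], repeat=n))
    subsets = [tuple(i for i in range(n) if m[i]) for m in product([0, 1], repeat=n)]
    return {S: Fraction(sum(f(x) * chi(S, x) for x in points), 2 ** n) for S in subsets}

def paradox_probability_direct(f, g, h, n):
    """Fraction of vote profiles on which the three global preferences all agree."""
    bad = 0
    for profile in product(VOTES, repeat=n):
        xs = tuple(v[0] for v in profile)
        ys = tuple(v[1] for v in profile)
        zs = tuple(v[2] for v in profile)
        if f(xs) == g(ys) == h(zs):
            bad += 1
    return Fraction(bad, len(VOTES) ** n)

def paradox_probability_formula(f, g, h, n):
    """The expression from Lemma 12."""
    fh, gh, hh = fourier(f, n), fourier(g, n), fourier(h, n)
    total = sum(Fraction(-1, 3) ** len(S)
                * (fh[S] * gh[S] + gh[S] * hh[S] + hh[S] * fh[S]) for S in fh)
    return Fraction(1, 4) + Fraction(1, 4) * total

majority = lambda x: 1 if sum(x) > 0 else -1
direct = paradox_probability_direct(majority, majority, majority, 3)
formula = paradox_probability_formula(majority, majority, majority, 3)
print(direct, formula)
```

Both numbers come out to $\frac{1}{18}$, the familiar probability of a Condorcet cycle with three voters and three candidates.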

Now for the proof of the main theorem. Since paradoxical outcomes never occur, the probability in Lemma 12 is zero, so we see that

$$1 = \sum_{S \subseteq \{1, \dots, n\}} -\left( -\frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right).$$

But now we can just use weak inequalities. We have $\widehat f(\varnothing) = \mathbf E f = 0$ and similarly for $\widehat g$ and $\widehat h$, so we restrict attention to $|S| \ge 1$. We then use the famous inequality $|ab+bc+ca| \le a^2+b^2+c^2$ (which is true across all real numbers) to deduce that

$$\begin{aligned} 1 &= \sum_{S \subseteq \{1, \dots, n\}} -\left( -\frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S) \widehat g(S) + \widehat g(S) \widehat h(S) + \widehat h(S) \widehat f(S) \right) \\ &\le \sum_{S \subseteq \{1, \dots, n\}} \left( \frac13 \right)^{\left\lvert S \right\rvert} \left( \widehat f(S)^2 + \widehat g(S)^2 + \widehat h(S)^2 \right) \\ &\le \sum_{S \subseteq \{1, \dots, n\}} \left( \frac13 \right)^1 \left( \widehat f(S)^2 + \widehat g(S)^2 + \widehat h(S)^2 \right) \\ &= \frac13 (1+1+1) = 1 \end{aligned}$$

with the last step by Parseval. So all inequalities must be equalities, and in particular $\widehat f$, $\widehat g$, $\widehat h$ are supported on one-element sets, i.e. they are linear in inputs. As $f$, $g$, $h$ are $\pm 1$ valued, each of $f$, $g$, $h$ is itself either a dictator or anti-dictator function. Since $(f,g,h)$ is always consistent, this implies the final result.
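For $n = 2$ the theorem is small enough to confirm exhaustively. The following sketch (identifiers are mine) enumerates every triple of balanced functions $\{\pm1\}^2 \rightarrow \{\pm1\}$, keeps the triples that never produce a paradoxical outcome, and compares them with the dictatorships and anti-dictatorships.

```python
from itertools import product

VOTES = [v for v in product([1, -1], repeat=3) if len(set(v)) > 1]
N = 2
POINTS = list(product([1, -1], repeat=N))

# Balanced functions (mean zero), stored as dicts from input tuples to +1/-1.
BALANCED = [dict(zip(POINTS, vals))
            for vals in product([1, -1], repeat=len(POINTS)) if sum(vals) == 0]

def never_paradoxical(f, g, h):
    """True if (f, g, h) are never all equal on any consistent vote profile."""
    for profile in product(VOTES, repeat=N):
        xs = tuple(v[0] for v in profile)
        ys = tuple(v[1] for v in profile)
        zs = tuple(v[2] for v in profile)
        if f[xs] == g[ys] == h[zs]:
            return False
    return True

def dictator(k, sign):
    """The function x -> sign * x_k (an anti-dictator when sign = -1)."""
    return {x: sign * x[k] for x in POINTS}

good = [(f, g, h) for f in BALANCED for g in BALANCED for h in BALANCED
        if never_paradoxical(f, g, h)]
expected = [(dictator(k, s),) * 3 for k in range(N) for s in (1, -1)]
print(len(good))
```

Exactly $2n = 4$ triples survive, one dictatorship and one anti-dictatorship per voter.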

5. Pontryagin duality

In fact all the examples we have covered can be subsumed as special cases of Pontryagin duality, where we replace the domain with a general group $G$. In what follows, we assume $G$ is a locally compact abelian (LCA) group, which just means that:

  • $G$ is an abelian topological group,
  • the topology on $G$ is Hausdorff, and
  • the topology on $G$ is locally compact: every point of $G$ has a compact neighborhood.

Notice that our previous examples fall into this category:

Example 13 (Examples of locally compact abelian groups)

  • Any finite abelian group $Z$ with the discrete topology is LCA.
  • The circle group $\mathbb T$ is LCA and also in fact compact.
  • The real numbers $\mathbb R$ are an example of an LCA group which is not compact.

5.1. The Pontryagin dual

The key definition is:

Definition 14. Let $G$ be an LCA group. Then its Pontryagin dual is the abelian group

$$\widehat G \coloneqq \left\{ \text{continuous group homomorphisms } \xi : G \rightarrow \mathbb T \right\}.$$

The maps $\xi$ are called characters. By equipping it with the compact-open topology, we make $\widehat G$ into an LCA group as well.

Example 15 (Examples of Pontryagin duals)

  • $\widehat{\mathbb Z} \cong \mathbb T$.
  • $\widehat{\mathbb T} \cong \mathbb Z$. The characters are given by $\theta \mapsto n\theta$ for $n \in \mathbb Z$.
  • $\widehat{\mathbb R} \cong \mathbb R$. This is because a nonzero continuous homomorphism $\mathbb R \rightarrow S^1$ is determined by the fiber above $1 \in S^1$. (Covering projections, anyone?)
  • $\widehat{\mathbb Z/n\mathbb Z} \cong \mathbb Z/n\mathbb Z$, characters $\xi$ being determined by the image $\xi(1) \in \mathbb T$.
  • $\widehat{G \times H} \cong \widehat G \times \widehat H$.
  • If $Z$ is a finite abelian group, then the previous two examples (and the structure theorem for abelian groups) imply that $\widehat{Z} \cong Z$, though not canonically. You may now recognize that the bilinear form $\cdot : Z \times Z \rightarrow \mathbb T$ is exactly a choice of isomorphism $Z \rightarrow \widehat Z$.
  • For any LCA group $G$, the dual of $\widehat G$ is canonically isomorphic to $G$, id est there is a natural isomorphism

    $$G \cong \widehat{\widehat G} \qquad \text{by} \qquad x \mapsto \left( \xi \mapsto \xi(x) \right).$$ This is the Pontryagin duality theorem. (It is analogous to the isomorphism $(V^\vee)^\vee \cong V$ for finite-dimensional vector spaces $V$.)

5.2. The orthonormal basis in the compact case

Now assume $G$ is LCA but also compact, and thus has a unique Haar measure $\mu$ such that $\mu(G) = 1$; this lets us integrate over $G$. Let $L^2(G)$ be the space of square-integrable functions to $\mathbb C$, i.e.

$$L^2(G) = \left\{ f : G \rightarrow \mathbb C \quad\text{such that}\quad \int_G |f|^2 \; d\mu < \infty \right\}.$$

Thus we can equip it with the inner form $$\left< f,g \right> = \int_G f\overline{g} \; d\mu.$$ In that case, we get all the results we wanted before:

Theorem 16 (Characters of $G$ form an orthonormal basis)

Assume $G$ is LCA and compact. Then $\widehat G$ is discrete, and the characters $$(e_\xi)_{\xi \in \widehat G} \qquad\text{by}\qquad e_\xi(x) = e(\xi(x)) = \exp(2\pi i \xi(x))$$ form an orthonormal basis of $L^2(G)$. Thus for each $f \in L^2(G)$ we have $$f = \sum_{\xi \in \widehat G} \widehat f(\xi) e_\xi$$ where $$\widehat f(\xi) = \left< f, e_\xi \right> = \int_G f(x) \exp(-2\pi i \xi(x)) \; d\mu.$$

The sum $\sum_{\xi \in \widehat G}$ makes sense since $\widehat G$ is discrete. In particular,

  • Letting $G = Z$ be a finite abelian group gives “Fourier transform on finite groups”.
  • The special case $G = \mathbb Z/n\mathbb Z$ has its own Wikipedia page.
  • Letting $G = \mathbb T$ gives the “Fourier series” earlier.

5.3. The Fourier transform of the non-compact case

If $G$ is LCA but not compact, then Theorem 16 becomes false. On the other hand, it is still possible to define a transform, but one needs to be a little more careful. The generic example to keep in mind in what follows is $G = \mathbb R$.

In what follows, we fix a Haar measure $\mu$ for $G$. (This $\mu$ is now only unique up to scaling: since $\mu(G) = \infty$, we can no longer normalize by requiring $\mu(G) = 1$.)

One considers this time the space $L^1(G)$ of absolutely integrable functions. Then one directly defines the Fourier transform of $f \in L^1(G)$ to be $$\widehat f(\xi) = \int_G f \overline{e_\xi} \; d\mu$$ imitating the previous definitions in the absence of an inner product. This $\widehat f$ may not be $L^1$, but it is at least bounded. Then we manage to at least salvage:

Theorem 17 (Fourier inversion on $L^1(G)$)

Take an LCA group $G$ and fix a Haar measure $\mu$ on it. One can select a unique dual measure $\widehat \mu$ on $\widehat G$ such that if $f \in L^1(G)$, $\widehat f \in L^1(\widehat G)$, the “Fourier inversion formula” $$f(x) = \int_{\widehat G} \widehat f(\xi) e_\xi(x) \; d\widehat\mu$$ holds almost everywhere. It holds everywhere if $f$ is continuous.

Notice the extra nuance of having to select measures, because it is no longer the case that $G$ has a single distinguished measure.
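A standard illustration (my addition, stated for $G = \mathbb R$ with $e_\xi(x) = e(\xi x) = \exp(2\pi i \xi x)$): with this normalization Lebesgue measure is its own dual measure, and the Gaussian $f(x) = \exp(-\pi x^2)$ is its own transform, so the inversion formula can be read off directly:

$$\begin{aligned} \widehat f(\xi) &= \int_{\mathbb R} \exp(-\pi x^2) \exp(-2\pi i \xi x) \; dx = \exp(-\pi \xi^2), \\ f(x) &= \int_{\mathbb R} \exp(-\pi \xi^2) \exp(2\pi i \xi x) \; d\xi = \exp(-\pi x^2). \end{aligned}$$

Rescaling $\mu$ by a constant $c$ rescales $\widehat f$ by $c$, so the dual measure must be rescaled by $\frac1c$ to compensate.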

Despite the fact that the $e_\xi$ no longer form an orthonormal basis, the transformed function $\widehat f : \widehat G \rightarrow \mathbb C$ is still often useful. In particular, the transform has special names for a few special $G$:

  • If $G = \mathbb R$, then $\widehat G = \mathbb R$, and this construction gives the poorly named “(continuous) Fourier transform”.
  • If $G = \mathbb Z$, then $\widehat G = \mathbb T$, and this construction gives the poorly named “DTFT”.

5.4. Summary

In summary,

  • Given any LCA group $G$, we can transform sufficiently nice functions on $G$ into functions on $\widehat G$.
  • If $G$ is compact, then we have the nicest situation possible: $L^2(G)$ is an inner product space with $\left< f,g \right> = \int_G f \overline{g} \; d\mu$, and $e_\xi$ form an orthonormal basis across $\xi \in \widehat G$.
  • If $G$ is not compact, then we no longer get an orthonormal basis or even an inner product space, but it is still possible to define the transform

    $$\widehat f : \widehat G \rightarrow \mathbb C$$ for $f \in L^1(G)$. If $\widehat f$ is also in $L^1(\widehat G)$ we still get a “Fourier inversion formula” expressing $f$ in terms of $\widehat f$.

We summarize our various flavors of Fourier analysis for various $G$ in the following. In the first half $G$ is compact, in the second half $G$ is not.

$$\begin{array}{llll} \hline \text{Name} & \text{Domain }G & \text{Dual }\widehat G & \text{Characters} \\ \hline \textbf{Binary Fourier analysis} & \{\pm1\}^n & S \subseteq \left\{ 1, \dots, n \right\} & \prod_{s \in S} x_s \\ \textbf{Fourier transform on finite groups} & Z & \xi \in \widehat Z \cong Z & e(\xi \cdot x) \\ \textbf{Discrete Fourier transform} & \mathbb Z/n\mathbb Z & \xi \in \mathbb Z/n\mathbb Z & e(\xi x / n) \\ \textbf{Fourier series} & \mathbb T \cong [-\pi, \pi] & n \in \mathbb Z & \exp(inx) \\ \hline \textbf{Continuous Fourier transform} & \mathbb R & \xi \in \mathbb R & e(\xi x) \\ \textbf{Discrete time Fourier transform} & \mathbb Z & \xi \in \mathbb T \cong [-\pi, \pi] & \exp(i \xi n) \\ \hline \end{array}$$

You might notice that the various names are awful. This is part of the reason I got confused as a high school student: every type of Fourier transform above has its own Wikipedia article. If it were up to me, we would just use the term “$G$-Fourier transform”, and that would make everyone’s lives a lot easier.

6. Peter-Weyl

In fact, if $G$ is a compact Lie group, even if $G$ is not abelian we can still give an orthonormal basis of $L^2(G)$ (the square-integrable functions on $G$). It turns out in this case the characters are attached to complex irreducible representations of $G$ (and in what follows all representations are complex).

The result is given by the Peter-Weyl theorem. First, we need the following result:

Lemma 18 (Compact Lie groups have unitary reps)

Any finite-dimensional (complex) representation $V$ of a compact Lie group $G$ is unitary, meaning it can be equipped with a $G$-invariant inner form. Consequently, $V$ is completely reducible: it splits into the direct sum of irreducible representations of $G$.

Proof: Suppose $B : V \times V \rightarrow \mathbb C$ is any inner product. Equip $G$ with a right-invariant Haar measure $dg$. Then we can equip $V$ with an “averaged” inner form $$\widetilde B(v,w) = \int_G B(gv, gw) \; dg.$$ Then $\widetilde B$ is the desired $G$-invariant inner form. Now, the fact that $V$ is completely reducible follows from the fact that given a subrepresentation of $V$, its orthogonal complement is also a subrepresentation. $\Box$
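The averaging trick is easy to watch in a toy case. The sketch below is my own illustration under simplifying assumptions: it uses the finite group $\mathbb Z/3$ (a zero-dimensional compact Lie group, with the integral replaced by an average over the three elements) acting on $\mathbb R^2$ by a matrix `M` of order $3$ that does not preserve the standard dot product. Everything is real here so exact rational arithmetic works; a complex representation would need conjugate transposes.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# A matrix of order 3 that is not orthogonal for the standard inner product.
M = [[Fraction(0), Fraction(-1)], [Fraction(1), Fraction(-1)]]
I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
GROUP = [I2, M, matmul(M, M)]

def averaged_gram(group):
    """Gram matrix of B~(v, w) = (1/|G|) * sum_g <g v, g w>, i.e. the average of g^T g."""
    n = len(group[0])
    total = [[Fraction(0)] * n for _ in range(n)]
    for g in group:
        gtg = matmul(transpose(g), g)
        total = [[total[i][j] + gtg[i][j] for j in range(n)] for i in range(n)]
    return [[entry / len(group) for entry in row] for row in total]

B_tilde = averaged_gram(GROUP)
# Invariance means g^T * B_tilde * g == B_tilde for every group element g.
print(B_tilde)
print(all(matmul(matmul(transpose(g), B_tilde), g) == B_tilde for g in GROUP))
```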

The Peter-Weyl theorem then asserts that the finite-dimensional irreducible unitary representations essentially give an orthonormal basis for $L^2(G)$, in the following sense. Let $V = (V, \rho)$ be such a representation of $G$, and fix an orthonormal basis $e_1$, \dots, $e_d$ for $V$ (where $d = \dim V$). The $(i,j)$-th matrix coefficient for $V$ is then given by $$G \xrightarrow{\rho} \mathop{\mathrm{GL}}(V) \xrightarrow{\pi_{ij}} \mathbb C$$ where $\pi_{ij}$ is the projection onto the $(i,j)$-th entry of the matrix. We abbreviate $\pi_{ij} \circ \rho$ to $\rho_{ij}$. Then the theorem is:

Theorem 19 (Peter-Weyl)

Let $G$ be a compact Lie group. Let $\Sigma$ denote the (pairwise non-isomorphic) irreducible finite-dimensional unitary representations of $G$. Then

$$\left\{ \sqrt{\dim V} \rho_{ij} \; \Big\vert \; (V, \rho) \in \Sigma, \text{ and } 1 \le i,j \le \dim V \right\}$$

is an orthonormal basis of $L^2(G)$.

Strictly, I should say $\Sigma$ is a set of representatives of the isomorphism classes of irreducible unitary representations, one for each isomorphism class.
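For a concrete non-abelian check, one can take the symmetric group $S_3$, viewed as a finite (hence compact, zero-dimensional) Lie group with normalized counting measure. The sketch below is my own illustration: it realizes $S_3$ as the symmetries of a triangle, uses the trivial, sign, and two-dimensional standard representations, and verifies numerically that the $1 + 1 + 4 = 6$ rescaled matrix coefficients are orthonormal. All matrices are real, so no conjugation is needed.

```python
import math

c, s = -0.5, math.sqrt(3) / 2
R = [[c, -s], [s, c]]            # rotation by 120 degrees
F = [[1.0, 0.0], [0.0, -1.0]]    # a reflection
I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

ROTATIONS = [I2, R, matmul(R, R)]
# Each element is stored as (matrix in the standard rep, value of the sign character).
ELEMENTS = [(g, 1.0) for g in ROTATIONS] + [(matmul(g, F), -1.0) for g in ROTATIONS]

def basis_functions():
    """The functions sqrt(dim V) * rho_ij on the group, one list of values per function."""
    funcs = [[1.0 for _ in ELEMENTS]]                  # trivial representation
    funcs.append([sign for _, sign in ELEMENTS])       # sign representation
    for i in range(2):
        for j in range(2):                             # standard representation, dim 2
            funcs.append([math.sqrt(2) * g[i][j] for g, _ in ELEMENTS])
    return funcs

def inner(u, v):
    """<u, v> = (1/|G|) * sum_g u(g) * v(g) for real-valued u, v."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

FUNCS = basis_functions()
GRAM = [[inner(u, v) for v in FUNCS] for u in FUNCS]
print(max(abs(GRAM[i][j] - (1.0 if i == j else 0.0)) for i in range(6) for j in range(6)))
```

Since $L^2(S_3)$ is six-dimensional, six orthonormal functions do form a basis, as the theorem predicts.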

In the special case where $G$ is abelian, all irreducible representations are one-dimensional. A one-dimensional representation of $G$ is a map $G \rightarrow \mathop{\mathrm{GL}}(\mathbb C) \cong \mathbb C^\times$, but the unitary condition implies it is actually a map $G \rightarrow S^1 \cong \mathbb T$, i.e. it is an element of $\widehat G$.