9. Homework 3, Problems 5 and 6
This is sort of a sequel to the previous post. A lot of people asked me about problems 5 and 6 from homework 3, both over email and during office hours. Moreover, many people did not score well on these two problems, with the average solutions scoring about 5/10 on both problems. With the first midterm looming ominously on the horizon, it would be good to explain what the right approach is and what the key steps are.
Problem 1.
A subset \(C\subseteq\ell^2\) has the midpoint convexity property if for all \(c_1,c_2\in C\), \(\frac{c_1+c_2}{2}\in C\). Prove that if \(C\) is a closed subset of \(\ell^2\) with the midpoint convexity property, and if \(v_0\in\ell^2\), then there exists some \(v_c\in C\) such that \(\left\lVert v_c-v_0 \right\rVert \leq \left\lVert c-v_0 \right\rVert\) for all \(c\in C\).
First, we assume without loss of generality that \(v_0=0\): translate \(C\) and \(v_0\) by \(-v_0\), so that we are minimising the distance between the (translated) \(C\) and the origin. A lot of people left it at this, but there’s more left to say: you need to verify that the translated \(C\) is still closed, still has the midpoint convexity property, and that the closest point in the translated \(C\) to the origin is the translation of the closest point in \(C\) to \(v_0\). You can skip this reduction entirely and work with \(v_0\) directly, but that makes the notation quite cumbersome. Rather than write \(d\left( v, 0 \right)\), we’ll just write \(\left\lVert v \right\rVert\), since it’s shorter.
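For the record, here is a sketch of that verification, writing \(C'=\left\lbrace c-v_0 : c\in C \right\rbrace\) for the translated set (the name \(C'\) is just notation introduced for this sketch). Midpoint convexity carries over because, for \(c_1,c_2\in C\), \[\frac{(c_1-v_0)+(c_2-v_0)}{2}=\frac{c_1+c_2}{2}-v_0\in C'.\] Closedness carries over because if \(c_n-v_0\to w\) with \(c_n\in C\), then \(c_n\to w+v_0\), so \(w+v_0\in C\) and hence \(w\in C'\). Finally, distances match up, since \[\left\lVert (c-v_0)-0 \right\rVert=\left\lVert c-v_0 \right\rVert \quad\text{for all } c\in C,\] so a point \(c'\in C'\) is closest to the origin exactly when \(c'+v_0\in C\) is closest to \(v_0\).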
Here’s a rough outline of what the rest of the proof entails:
- Let \(L=\inf \left\lbrace \left\lVert c \right\rVert : c\in C \right\rbrace\), and justify why \(L\) should exist.
- Find a sequence \(v_n\) in \(C\) such that \(\left\lVert v_n \right\rVert\) converges to \(L\).
- Demonstrate that this sequence is Cauchy, then use completeness of \(\ell^2\) to say that it converges to some \(v_c\). Note \(v_c\in C\) since \(C\) is closed!
- Prove that the limit satisfies \(\left\lVert v_c \right\rVert=L\).
I think most people got steps 1 and 2, though one needs to be careful in the construction of the sequence in step 2 and ensure that the distances really converge to \(L\). Step 3 was a bit more troublesome, and one needed to use the midpoint convexity property, the definition of the infimum, and the parallelogram law in some combination. Step 4 was glossed over by almost everybody. It’s notationally obvious, but you have to be careful with swapping limits!
Steps 1 and 2
Let \(L=\inf \left\lbrace \left\lVert c \right\rVert : c\in C \right\rbrace\). Since \(0\leq \left\lVert c \right\rVert\) for all \(c\in C\), this set is a subset of \(\mathbb{R}\) that’s bounded below (and it is nonempty as long as \(C\) is nonempty, which we assume throughout). Since \(\mathbb{R}\) has the least upper bound property, and hence also the greatest lower bound property, \(L\) exists!
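Now for each \(n\in \mathbb{N}\), choose some \(v_n\in C\) such that \(\left\lVert v_n \right\rVert< L+\frac{1}{n} \). Such a \(v_n\) must exist: otherwise, \(L+\frac{1}{n}>L\) would be a lower bound for \(\left\lbrace \left\lVert c \right\rVert : c\in C \right\rbrace\), contradicting \(L\) being the greatest lower bound. Since also \(L\leq \left\lVert v_n \right\rVert\) by definition of \(L\), we have \(L\leq \left\lVert v_n \right\rVert< L+\frac{1}{n}\), so \(\left\lVert v_n \right\rVert\to L\) by the squeeze theorem.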
Step 3
We claim that the \(v_n\)’s form a Cauchy sequence. First, we use the parallelogram law, which states that \[\left\lVert v_n-v_m \right\rVert^2+\left\lVert v_n+v_m \right\rVert^2 = 2\left\lVert v_n \right\rVert^2+2\left\lVert v_m \right\rVert^2.\] Divide by \(4\), then rearrange this to get \[\frac{1}{4} \left\lVert v_n-v_m \right\rVert^2 = \frac{1}{2} \left( \left\lVert v_n \right\rVert^2+\left\lVert v_m \right\rVert^2 \right)-\left\lVert \frac{v_n+v_m}{2} \right\rVert^2.\]
Fix an \(\epsilon>0\). Since \(\left\lVert v_n \right\rVert\to L\), we also have \(\left\lVert v_n \right\rVert^2\to L^2\), so we can take \(N_\epsilon\) so large that for any \(n>N_\epsilon\), \(\left\lVert v_n \right\rVert^2< L^2+\frac{\epsilon}{4}\). Then, for any \(n, m > N_\epsilon\), the average \(\frac{1}{2} \left( \left\lVert v_n \right\rVert^2+\left\lVert v_m \right\rVert^2 \right)\) is also less than \(L^2+\frac{\epsilon}{4}\), so we have (from above) that \[\frac{1}{4} \left\lVert v_n-v_m \right\rVert^2 < L^2+\frac{\epsilon}{4}-\left\lVert \frac{v_n+v_m}{2} \right\rVert^2.\] However, \(\frac{v_n+v_m}{2}\in C\) by midpoint convexity, so by definition of \(L\) we have \(\left\lVert \frac{v_n+v_m}{2}\right\rVert^2 \geq L^2\).
All together, we get that when \(n,m> N_\epsilon\), \[\frac{1}{4} \left\lVert v_n-v_m \right\rVert^2 < L^2+\frac{\epsilon}{4}-L^2=\frac{\epsilon}{4}.\] Multiplying by \(4\) gives \(\left\lVert v_n-v_m \right\rVert^2<\epsilon\), that is, \(\left\lVert v_n-v_m \right\rVert<\sqrt{\epsilon}\). Since \(\epsilon>0\) was arbitrary, \(\left\lbrace v_n \right\rbrace\) is Cauchy. Finally, since \(\ell^2\) is complete, we get that \(\left\lbrace v_n \right\rbrace\) must converge to some limit \(v_c\). Since \(C\) is closed, \(v_c\in C\).
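In case the parallelogram law looks unfamiliar, here is a quick sketch of where it comes from, assuming real sequences and the usual inner product \(\langle v,w\rangle=\sum_{k} v_k w_k\) on \(\ell^2\), so that \(\left\lVert v \right\rVert^2=\langle v,v\rangle\). Expanding gives \[\left\lVert v\pm w \right\rVert^2=\left\lVert v \right\rVert^2\pm 2\langle v,w\rangle+\left\lVert w \right\rVert^2,\] and adding the two versions cancels the cross terms: \[\left\lVert v-w \right\rVert^2+\left\lVert v+w \right\rVert^2=2\left\lVert v \right\rVert^2+2\left\lVert w \right\rVert^2.\] As a sanity check, take \(v=(1,0,0,\ldots)\) and \(w=(0,1,0,\ldots)\): the left side is \(2+2=4\), and the right side is \(2\cdot 1+2\cdot 1=4\).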
Step 4
It’s tempting to say that since \(\left\lVert v_n \right\rVert\to L\) and \(v_n\to v_c\), we get \(\left\lVert v_c \right\rVert=L\), which would conclude the proof. However, this is very hairy, since it asserts that \[\left\lVert v_c \right\rVert=\left\lVert \lim _{n\to\infty}v_n \right\rVert=\lim _{n\to\infty} \left\lVert v_n \right\rVert.\] This relies on the continuity of the norm, which needs to be justified! The norm on \(\ell^2\) is defined using a limit (an infinite sum), and the above equation swaps two limits.
Let’s actually prove it. We claim that for any \(\epsilon>0\), \(\left\lVert v_c \right\rVert< L+\epsilon\). Since \(\left\lVert v_n \right\rVert\to L\) (by construction, \(L\leq\left\lVert v_n \right\rVert< L+\frac{1}{n}\)), there exists some \(N_1\) such that for all \(n>N_1\), \(\left\lVert v_n \right\rVert< L+\frac{\epsilon}{2}\). Since \(v_n\to v_c\), there exists some \(N_2\) such that if \(n> N_2\), then \(\left\lVert v_n-v_c \right\rVert< \frac{\epsilon}{2}\). Now fix any \(n>\max\left( N_1,N_2 \right)\) and use the triangle inequality: \(\left\lVert v_c \right\rVert \leq \left\lVert v_n \right\rVert + \left\lVert v_c-v_n \right\rVert < L + \epsilon.\)
Thus, \(\left\lVert v_c \right\rVert < L+\epsilon\) for all \(\epsilon>0\), and since \(\left\lVert v_c \right\rVert\geq L\) (as \(v_c\in C\)), it follows that \(\left\lVert v_c \right\rVert=L\). (Note: this is the classic “epsilon of room” trick!)
By definition, we get that \(\left\lVert v_c \right\rVert\leq \left\lVert c \right\rVert\) for all \(c\in C\), as desired. \(\square\)
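The same triangle-inequality argument can be packaged once and for all as the reverse triangle inequality, which is one way to see that the norm is continuous. For any \(v,w\in\ell^2\), \[\left\lVert v \right\rVert\leq\left\lVert v-w \right\rVert+\left\lVert w \right\rVert\quad\text{and}\quad \left\lVert w \right\rVert\leq\left\lVert w-v \right\rVert+\left\lVert v \right\rVert,\] so \[\Bigl|\left\lVert v \right\rVert-\left\lVert w \right\rVert\Bigr|\leq\left\lVert v-w \right\rVert.\] Applying this with \(v=v_n\) and \(w=v_c\) gives \(\bigl|\left\lVert v_n \right\rVert-\left\lVert v_c \right\rVert\bigr|\leq\left\lVert v_n-v_c \right\rVert\to 0\), so \(\left\lVert v_n \right\rVert\to\left\lVert v_c \right\rVert\), and uniqueness of limits in \(\mathbb{R}\) forces \(\left\lVert v_c \right\rVert=L\).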
Problem 6 on the homework is actually very similar to the above:
Problem 2.
Let \(\left( X,d \right)\) be a metric space and \(C\subseteq X\) nonempty and sequentially compact. Show that for any \(x\in X\), there exists some \(c_x\in C\) such that \(d\left( x, c_x \right)\leq d\left( x, c \right)\) for all \(c\in C\).
The proof is very similar to the previous problem, so I’ll leave the details as an exercise. Here’s an outline:
- Define \(L=\inf \left\lbrace d\left( x, c \right) : c\in C \right\rbrace\), and justify why it exists.
- Find a sequence \(c_n\) in \(C\) such that \(d\left( x,c_n \right)\to L\).
- Use sequential compactness of \(C\) to find a subsequence \(\left\lbrace c _{n_j} \right\rbrace\) converging to some \(c_x\in C\).
- Argue why \(d\left( x, c_x \right)=L\).
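The only real difference is how one finds a limit from the sequence constructed: here sequential compactness hands us a convergent subsequence, whereas before we needed the Cauchy argument and completeness. Every other step is nearly identical, though perhaps with some notational changes.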
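As a hint for the last step, here is a sketch of the key estimate (one possible route, not the only one). Since \(c_x\in C\), the definition of \(L\) gives the lower bound, and the triangle inequality gives the upper bound: for every \(j\), \[L\leq d\left( x,c_x \right)\leq d\left( x,c_{n_j} \right)+d\left( c_{n_j},c_x \right).\] As \(j\to\infty\), the first term on the right tends to \(L\) and the second tends to \(0\), so \(d\left( x,c_x \right)=L\).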