
15. Two Inverse Function Theorem Problems


There is a midterm on Monday, and two practice problem sets were released over this past week. Two of the problems on those sets are particularly difficult: they are variants of the inverse function theorem without the assumption of a continuous derivative.

Since many students have struggled with these problems both in office hours and over email (and, to be fair, I struggled with these too), I decided to post my solutions for you to study off of.

Overall, the driving argument and intuition behind both solutions is, “a function looks like its derivative”.

Problem 1.

Let \(U\subseteq \mathbb{R}^2\) be open and \(F:U\to \mathbb{R}^2\). Suppose \(F\) is differentiable at some \(x\in U\) and that \(\left. DF\right\rvert_{x}\) is nonsingular. Prove that there exists some \(\epsilon>0\) such that \(0< d\left( x, x' \right)< \epsilon\) implies \(F(x)\neq F(x')\).

Hint
Reduce to the case where \(\left. DF\right\rvert_{x}\) is the identity. Write down the definition of the derivative. What happens if \(F(x)=F(x')\)?

Something to note here is that the problem does not ask you to show that \(F\) is locally injective. In fact, this is not true without stronger assumptions on the differentiability of \(F\) on \(U\)! For instance, consider the function \[f(x)=x+x^2\sin\left( \frac{1}{x^2} \right),\] extended by \(f(0)=0\). This function is differentiable on \((-1, 1)\) and \(f'(0)=1\) is nonsingular. However, \(f\) is not injective on any neighbourhood of \(0\). Although it's true that \(f(x)\neq f(0)=0\) when \(x\) is close to zero and nonzero, \(f\) will repeat nonzero values infinitely often!
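If you'd like to see this failure concretely, here is a small numerical sketch (not a proof; it assumes Python with numpy, and the window \([0.01, 0.0101]\) is just one sample of my choosing — the same behaviour occurs at every smaller scale). A continuous injective function on an interval must be monotone, but the consecutive differences of \(f\) on a fine grid change sign, so \(f\) is not monotone there.

```python
# Numerical sketch (not a proof): f(x) = x + x^2 sin(1/x^2) fails to be
# monotone on a short interval near 0; the same happens at every smaller
# scale, which is why f is not injective on any neighbourhood of 0.
import numpy as np

def f(x):
    return x + x**2 * np.sin(1.0 / x**2)

# Sample a very short interval near 0; the derivative term -(2/x)cos(1/x^2)
# oscillates wildly here, so f goes up and down many times.
xs = np.linspace(0.01, 0.0101, 2_000_001)
diffs = np.diff(f(xs))

print("smallest consecutive difference:", diffs.min())   # negative
print("largest consecutive difference: ", diffs.max())   # positive
# Both signs appear, so f is not monotone on [0.01, 0.0101]; a continuous
# injective function on an interval would have to be monotone.
```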

Solution

By composing with \(\left. DF\right\rvert_{x} ^{-1}\), we may assume without loss of generality that \(\left. DF\right\rvert_{x}\) is the identity! Indeed, by the chain rule (and the fact that linear maps are differentiable), \(\left. DF\right\rvert_{x} ^{-1}\circ F\) is differentiable at \(x\) with derivative equal to the identity; and since \(\left. DF\right\rvert_{x} ^{-1}\) is injective, \(\left( \left. DF\right\rvert_{x} ^{-1}\circ F \right)(x)\neq \left( \left. DF\right\rvert_{x} ^{-1}\circ F \right)(x')\) if and only if \(F(x)\neq F(x')\).
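Here is a quick numerical sketch of this reduction (the map \(F\) below and the base point are made up for illustration, and numpy is assumed): estimating Jacobians by finite differences, the composite \(\left. DF\right\rvert_{x} ^{-1}\circ F\) has Jacobian approximately equal to the identity at \(x\).

```python
# Sketch of the reduction: for an example F with nonsingular derivative at x,
# the composite G = DF|_x^{-1} o F has derivative (approximately) the identity at x.
import numpy as np

def F(p):
    # An arbitrary example map R^2 -> R^2 (not from the problem set).
    u, v = p
    return np.array([np.exp(u) + v, u + v + v**3])

def jacobian(func, p, h=1e-6):
    # Central finite-difference estimate of the Jacobian of func at p.
    n = len(p)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (func(p + e) - func(p - e)) / (2 * h)
    return J

x = np.array([0.3, -0.2])
DFx = jacobian(F, x)                       # numerically nonsingular at this x
G = lambda p: np.linalg.solve(DFx, F(p))   # compose with DF|_x^{-1}

print(jacobian(G, x))   # approximately the 2x2 identity matrix
```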

By the definition of the derivative, there exists some \(\epsilon>0\) such that \(0< d\left( x, x' \right)< \epsilon\) implies \[\frac{\left\lVert F(x')-F(x) - \left( x'-x \right) \right\rVert}{\left\lVert x'-x \right\rVert} < \frac{1}{2}.\] Now suppose \(0< d\left( x, x' \right)< \epsilon\) and, for the sake of contradiction, that \(F(x')=F(x)\). Then \[\frac{\left\lVert F(x')-F(x)-\left( x'-x \right) \right\rVert}{\left\lVert x'-x \right\rVert} = \frac{\left\lVert x'-x \right\rVert}{\left\lVert x'-x \right\rVert} = 1,\] contradicting the bound of \(\frac{1}{2}\) above. Thus, \(F(x')\neq F(x)\) whenever \(0< d\left( x',x \right) < \epsilon\). \(\square\)

Remark
In the above solution, the underlying principle is the heuristic that \[F(x')-F(x)\approx x'-x\] when \(x'\) and \(x\) are close together. Zero is a quantifiably poor approximation of \(x'-x\)!

Problem 2.

Suppose \(U\subseteq \mathbb{R}^2\) is open and \(F:U\to \mathbb{R}^2\) is differentiable on \(U\) with \(\left. DF\right\rvert_{x}\) nonsingular at every \(x\in U\). Show that \(F(U)\) is open.

This is a fairly tricky problem to work out in full detail, but at its core, the argument is intuitive and straightforward. The idea is again that \(F\) looks like a linear function near any point \(x_0\), so \(F\) will take a small circle around \(x_0\) to a small (distorted) circle around \(F\left( x_0 \right)\). This distorted circle must stay some distance away from \(F\left( x_0 \right)\), so one can fit a little disc around \(F\left( x_0 \right)\) within this distorted circle.

This is a heuristic argument at best, though one may formalise it with some topological results that are unavailable to us. Here is a proof that uses only things we have seen before.
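Before the proof, here is a quick numerical look at that heuristic (a sketch only: the map below is an arbitrary example of my choosing, and numpy is assumed). The image of a small circle centred at \(x_0\) stays a definite distance away from \(F\left( x_0 \right)\), roughly proportional to the circle's radius.

```python
# Sketch of the heuristic: the image of a small circle around x0 stays a
# definite distance away from F(x0), leaving room for a small disc around
# F(x0) inside the distorted circle.
import numpy as np

def F(p):
    # An arbitrary example map R^2 -> R^2 with nonsingular derivative near x0.
    u, v = p
    return np.array([u + 0.5 * v**2, v + u * v])

x0 = np.array([0.1, 0.2])
r = 1e-3   # radius of the small circle around x0

thetas = np.linspace(0.0, 2 * np.pi, 10_000, endpoint=False)
circle = x0 + r * np.stack([np.cos(thetas), np.sin(thetas)], axis=1)

image = np.array([F(p) for p in circle])
dists = np.linalg.norm(image - F(x0), axis=1)
print("closest approach of F(circle) to F(x0):", dists.min())
# The minimum is comparable to r rather than 0, matching the heuristic.
```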

Solution

Let \(x_0\in U\); we wish to show that there is some \(\epsilon>0\) such that \(B\left( F\left( x_0 \right),\epsilon \right)\subseteq F\left( U \right)\) (i.e., the image of \(F\) contains a small ball centred at \(F\left( x_0 \right)\)).

Since \(\left. DF\right\rvert_{x_0}\) is nonsingular, we may perform the same trick as before and assume without loss of generality that \(\left. DF\right\rvert_{x_0}\) is the identity. Moreover, subtracting the constant \(F\left( x_0 \right)\) does not change the derivative, so we may also assume that \(F\left( x_0 \right)=0\). Explicitly, we are replacing \(F(x)\) with \(\left. DF\right\rvert_{x_0} ^{-1}\left( F(x)-F\left( x_0 \right) \right)\).

Critically, these assumptions will not affect the conclusion: if the image of \(x\mapsto\left. DF\right\rvert_{x_0} ^{-1}\left( F\left( x \right)-F\left( x_0 \right) \right)\) contains an open ball \(V\) centred at \(0\), then the image of \(F\) contains \(\left. DF\right\rvert_{x_0}(V)+F\left( x_0 \right)\), which is an open set containing \(F\left( x_0 \right)\) (verify this!).

Anyways, we are assuming that \(\left. DF\right\rvert_{x_0}\) is the identity and \(F\left( x_0 \right)=0\) for simplicity. By the definition of the derivative, there exists some \(\epsilon>0\) such that \(0 < d\left( x, x_0 \right)< \epsilon\) implies \[\frac{\left\lVert F(x)-\left( x-x_0 \right) \right\rVert}{\left\lVert x-x_0 \right\rVert} < \frac{1}{2}.\] Since \(U\) is open, we may also shrink \(\epsilon\) so that \(B\left( x_0, \epsilon \right)\subseteq U\).

Let \(\delta>0\), to be chosen later. (In fact, we will end up with \(\delta=\frac{\epsilon}{8}\), but it’s not clear where this figure comes from until later.) We claim that if \(\delta\) is sufficiently small and \(y\in B\left( F\left( x_0 \right), \delta \right)\), then \(y\) is in the image of \(F\).

Let \(C=\overline{B}\left( x_0, \frac{\epsilon}{2} \right).\) Consider the function \(G_y(x)=\left\lVert F(x)-y \right\rVert ^2\), which is a continuous function \(G_y:C\to \mathbb{R}\). Since \(C\) is compact (it is closed and bounded!), \(G_y\) attains its global minimum at some point \(\hat x\).

If \(G_y\left( \hat x \right)=0\), then \(F\left( \hat x \right)=y\), so \(y\) is in the image of \(F\).

Otherwise, there are two possible scenarios: either \(\hat x\) lies in the interior of \(C\), in which case the gradient \(\nabla G_y\left( \hat x \right)\) must vanish, or \(\hat x\) lies on the boundary of \(C\). We can compute this gradient explicitly: \[G_y\left( x \right)=\left\langle F(x)-y, F(x)-y \right\rangle,\] so \[\nabla G_y\left( \hat x \right)=2 \left. DF\right\rvert_{\hat x} ^{T}\left( F\left( \hat x \right)-y \right).\] Verify this computation as an exercise! Since \(F\left( \hat x \right)-y\neq 0\) and since \(\left. DF\right\rvert_{\hat x} ^{T}\) is nonsingular (a matrix is nonsingular if and only if its transpose is, and \(\left. DF\right\rvert_{\hat x}\) is nonsingular by assumption), the gradient cannot vanish, so the first scenario is impossible. Thus \(\hat x\) must lie on the boundary of \(C\), i.e. \(d\left( \hat x,x_0 \right)=\frac{\epsilon}{2}\).
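If you'd like a sanity check on that formula (a numerical sketch with an arbitrary example map, assuming numpy), you can compare \(2 \left. DF\right\rvert_{x} ^{T}\left( F(x)-y \right)\) against a finite-difference gradient of \(G_y\):

```python
# Check the formula grad G_y(x) = 2 * DF|_x^T (F(x) - y) against a
# finite-difference gradient, for an example F, x, and y.
import numpy as np

def F(p):
    # An arbitrary example map R^2 -> R^2 (not from the problem set).
    u, v = p
    return np.array([u + 2 * v, v + u**2])

def DF(p):
    # Its Jacobian, computed by hand; note it is not symmetric.
    u, v = p
    return np.array([[1.0, 2.0],
                     [2 * u, 1.0]])

def G(p, y):
    return np.sum((F(p) - y) ** 2)   # G_y(p) = ||F(p) - y||^2

x = np.array([0.3, -0.1])
y = np.array([0.12, 0.0])
h = 1e-6

formula = 2 * DF(x).T @ (F(x) - y)
fd_grad = np.array([
    (G(x + np.array([h, 0.0]), y) - G(x - np.array([h, 0.0]), y)) / (2 * h),
    (G(x + np.array([0.0, h]), y) - G(x - np.array([0.0, h]), y)) / (2 * h),
])

print(formula)   # the two vectors should agree to several decimal places
print(fd_grad)
# Using DF(x) in place of DF(x).T gives a noticeably different (wrong) vector here.
```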

We now have \[\frac{\left\lVert F\left( \hat x \right)-\left( \hat x-x_0 \right) \right\rVert}{\left\lVert \hat x-x_0 \right\rVert} < \frac{1}{2}\implies \left\lVert F\left( \hat x \right) \right\rVert > \frac{1}{2}\left\lVert \hat x-x_0 \right\rVert=\frac{\epsilon}{4}\] by the reverse triangle inequality. In other words, \(F\left( \hat x \right)\) is more than a distance of \(\frac{\epsilon}{4}\) from the origin, \(F\left( x_0 \right)\). Now we pick \(\delta=\frac{\epsilon}{8}\) so that \(y\) is within a distance of \(\frac{\epsilon}{8}\) of the origin. By the triangle inequality, \[\left\lVert F\left( \hat x \right)-y \right\rVert \geq \left\lVert F\left( \hat x \right) \right\rVert - \left\lVert y \right\rVert > \frac{\epsilon}{4}-\frac{\epsilon}{8}=\frac{\epsilon}{8},\] so \(y\) is closer to \(0=F\left( x_0 \right)\) than it is to \(F\left( \hat x \right)\). Hence \[G_y\left( x_0 \right)=\left\lVert F\left( x_0 \right)-y \right\rVert^2 < \frac{\epsilon^2}{64} < \left\lVert F\left( \hat x \right)-y \right\rVert^2 = G_y\left( \hat x \right),\] which contradicts \(\hat x\) being a global minimiser of \(G_y\). We conclude that \(G_y\) attains a global minimum of \(0\) whenever \(y\in B\left( F\left( x_0 \right), \frac{\epsilon}{8} \right)\), so \(B\left( F\left( x_0 \right), \frac{\epsilon}{8} \right)\subseteq F\left( U \right)\). Since \(x_0\in U\) was arbitrary, \(F\left( U \right)\) is open, as desired. \(\square\)
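Finally, the mechanism of the proof can be watched numerically (again a sketch only: the map, the point \(y\), the ball radius, and the crude projected gradient descent below are all made-up choices, with numpy assumed). For \(y\) close to \(F\left( x_0 \right)\), minimising \(G_y\) over a small closed ball around \(x_0\) drives the minimum down to (numerically) zero, producing a preimage of \(y\).

```python
# Sketch of the proof's mechanism: for y near F(x0), minimising
# G_y(x) = ||F(x) - y||^2 over a closed ball around x0 reaches a point
# where G_y is (numerically) zero, i.e. a preimage of y.
import numpy as np

def F(p):
    # The same arbitrary example map as above, with F(x0) = 0 at x0 = (0, 0).
    u, v = p
    return np.array([u + 2 * v, v + u**2])

def DF(p):
    u, v = p
    return np.array([[1.0, 2.0],
                     [2 * u, 1.0]])   # nonsingular on the small ball used below

x0 = np.array([0.0, 0.0])
y = np.array([0.02, -0.01])   # a point close to F(x0) = 0
radius = 0.2                  # the closed ball playing the role of C

x = x0.copy()
for _ in range(10_000):
    grad = 2 * DF(x).T @ (F(x) - y)          # the gradient of G_y, as in the proof
    x = x - 0.1 * grad                       # gradient descent step
    if np.linalg.norm(x - x0) > radius:      # stay inside the closed ball
        x = x0 + radius * (x - x0) / np.linalg.norm(x - x0)

print("G_y at the minimiser:", np.sum((F(x) - y) ** 2))   # essentially 0
print("F(x) =", F(x), "   y =", y)                        # F(x) recovers y
```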