14. Contraction Mappings

A surprisingly useful application of metric space topology is the contraction mapping principle, also known as the Banach fixed point theorem. It states:

Theorem 1.

Let \(\left( X,d \right)\) be a complete metric space. Let \(f:X\to X\) be a function such that \[d\left( f(x), f(y) \right) < r\cdot d\left( x, y \right)\] for all \(x\neq y\) and for some fixed constant \(0< r< 1\). (Such a function is called a contraction mapping.) Then there is a unique point \(x_0\in X\) such that \(f\left( x_0 \right)=x_0\) (i.e., \(f\) has a unique fixed point).

The idea of the proof is to pick any starting point \(y_0\) and construct the sequence \(\left\lbrace y_n \right\rbrace\) inductively by setting \[y _{n+1}= f\left( y_n \right).\] Then this sequence is Cauchy, and since \(\left( X,d \right)\) is complete, it will converge! This limit will be a fixed point.
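To see this construction in action numerically, here is a minimal sketch in Python. Everything here is an arbitrary illustrative choice: the map \(\cos\) (which, after one step, traps the iterates in \(\left[ -1,1 \right]\), where it contracts with \(r=\sin 1\)), the starting point, and the tolerance.

```python
# A minimal numerical sketch of the iteration y_{n+1} = f(y_n).
# Example: f = cos, which contracts on [-1, 1] (with r = sin 1), so the
# iterates converge to its unique fixed point there.
import math

def iterate_to_fixed_point(f, y0, tol=1e-12, max_steps=10_000):
    y = y0
    for _ in range(max_steps):
        y_next = f(y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    raise RuntimeError("iteration did not converge")

print(iterate_to_fixed_point(math.cos, 1.0))  # ~0.739085..., the fixed point of cos
```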

You guys will see a complete proof of this theorem in lecture tomorrow! For today, we should think about the different assumptions of the theorem and why they’re important before jumping into some applications.

First, we need that \(\left( X,d \right)\) is complete, as we’re arguing that a Cauchy sequence has a limit.

Exercise 2.

Find a metric space \(\left( X, d \right)\) that’s not complete and a contraction mapping \(f:X\to X\) that has no fixed points.

Of course, if \(f\) is a contraction mapping and \(f\) has a fixed point, then necessarily the fixed point must be unique (can you see why?).

The second point of note is that we required \(f\) to shrink distances by a constant factor of \(0< r< 1\); this theorem does not necessarily hold if \(r=1\)!

Exercise 3.

Define \(f:\mathbb{R}\to\mathbb{R}\) via \(x\mapsto x+\frac{1}{1+e^x}\).

  1. Show that \(f\) shrinks distances in the sense that \[\left\lvert f(x)-f(y) \right\rvert<\left\lvert x-y \right\rvert\] for all \(x\neq y\), but that there exists no \(0< r< 1\) such that \[\left\lvert f(x)-f(y) \right\rvert < r \left\lvert x-y \right\rvert\] for all \(x\neq y\).
  2. Show that \(f\) has no fixed points.
Hint
Use the mean value theorem!
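If you'd like some numerical evidence before writing the proof, the following quick check (purely illustrative, not a substitute for the mean value theorem argument) suggests both claims: the difference quotients stay below \(1\) but creep toward it as \(x\to-\infty\), and \(f(x)-x>0\) everywhere.

```python
# Numerical evidence for Exercise 3: the ratio |f(x)-f(y)|/|x-y| stays
# below 1 but approaches 1 as x -> -infinity, while f(x) - x > 0
# everywhere, so f can have no fixed point.
import math

def f(x):
    return x + 1.0 / (1.0 + math.exp(x))

for x in [0.0, -5.0, -10.0, -20.0]:
    y = x - 1.0
    ratio = abs(f(x) - f(y)) / abs(x - y)
    print(f"x = {x:6.1f}: ratio = {ratio:.10f}, f(x) - x = {f(x) - x:.6f}")
```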

However, there is a variant of the contraction mapping principle that describes situations where distance-shrinking functions (as in the exercise above) do have unique fixed points!

Theorem 4.

If \(\left( X,d \right)\) is a compact metric space and \(f:X\to X\) satisfies \[d\left( f(x),f(y) \right) < d\left( x, y \right)\] for all \(x\neq y\), then \(f\) has a unique fixed point.

The proof is left as an exercise (and in fact was a problem on a Basic Exam in recent history), but you can find a proof in the lecture notes!

When tackling problems that look like applications of the contraction mapping principle, one often needs to restrict the domain to some smaller set in order to ensure that the map really is a contraction mapping!

Exercise 5.

Define the sequence \[\begin{align*} x_0 = 1, && x_1 = 1+\frac{1}{1}, && x_2 = 1+\frac{1}{1+\frac{1}{1}}, && x_3 = 1+\frac{1}{1+\frac{1}{1+\frac{1}{1}}}, && \cdots \end{align*}\] Show that \(x_n\) converges to a limit, and find the limit.
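To guess the limit before proving anything, one can run the recursion numerically; this is purely an experiment, and the exercise still asks for a proof (for instance, by showing \(x\mapsto 1+\frac{1}{x}\) is a contraction on a suitable interval such as \(\left[ \frac{3}{2}, 2 \right]\)).

```python
# Numerical experiment for Exercise 5: the continued fractions are the
# iterates of x -> 1 + 1/x, so a limit, if it exists, must satisfy
# x = 1 + 1/x.
x = 1.0
for n in range(1, 21):
    x = 1.0 + 1.0 / x
    if n % 5 == 0:
        print(n, x)
```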

Exercise 6. (UCLA Basic Exam, Fall 2023 #2)

Let \(\alpha>0\). For any \(x_0>0\), define the sequence \[x _{n+1}=\frac{1}{2}\left( x_n+\frac{\alpha}{x_n} \right).\] Show that this sequence converges to a unique limit independent of \(x_0\) (but perhaps depending on \(\alpha\)). Find the limit.
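Again, a quick numerical experiment (not a substitute for the proof) suggests what the limit should be and that it does not depend on the starting point; the value \(\alpha=2\) below is an arbitrary choice.

```python
# Numerical experiment for Exercise 6 with alpha = 2: a limit x must
# satisfy x = (x + alpha/x)/2, and it appears independent of x0 > 0.
alpha = 2.0
for x0 in [0.1, 1.0, 100.0]:
    x = x0
    for _ in range(60):
        x = 0.5 * (x + alpha / x)
    print(f"x0 = {x0:6.1f} -> {x:.15f}")
```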

Newton Iteration

If you’re an applied mathematician, you may see iterative algorithms as tools for solving equations or systems of equations numerically. One such algorithm, fixed point iteration, is perhaps the most primitive: it directly employs the naïve trick of applying a function to its own output over and over until a fixed point is reached.

Perhaps a more familiar tool is Newton iteration, or Newton’s root-finding method. If you’ve suffered through AP Calculus, perhaps you still have memories of this.

Exercise 7.

Let \(f:\mathbb{R}\to \mathbb{R}\) be twice continuously differentiable, and suppose there exists some \(x_*\in \mathbb{R}\) such that \(f\left( x_* \right)=0\) and \(f'\left( x_* \right)\neq 0\). Show that there exists a \(\delta>0\) such that for all \(x_0\in B\left( x_*, \delta \right)\), the sequence defined inductively by \[x_{n+1}=x_n-\frac{f\left( x_n \right)}{f'\left( x_n \right)}\] converges to \(x_*\).

Hint 1
Think about the function \(g\left( x \right)=x-\frac{f(x)}{f'(x)}\). Can you identify when this will be a contraction mapping?
Hint 2
Try using Taylor’s theorem.
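Here is a minimal Python sketch of the scheme for the arbitrary example \(f(x)=x^2-2\), whose positive root is \(\sqrt 2\) (the function name, tolerance, and starting point are all illustrative choices):

```python
# A minimal sketch of Newton iteration, matching the recurrence
# x_{n+1} = x_n - f(x_n)/f'(x_n) from Exercise 7.
def newton(f, fprime, x0, tol=1e-14, max_steps=100):
    x = x0
    for _ in range(max_steps):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("iteration did not converge")

# Example: f(x) = x^2 - 2, converging to sqrt(2) ~ 1.41421356...
print(newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0))
```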

Applications to Ordinary Differential Equations

A significantly more celebrated application of Theorem 1 is the existence and uniqueness of solutions to ODEs with initial values, which can be proven using something called Picard iteration. Most often, people introduce this argument for first-order ODEs, but with a little bit of work, one can extend this principle to higher-order ODEs.

Theorem 8. (Picard-Lindelöf)

Let \(f:\mathbb{R}\times \mathbb{R}\to \mathbb{R}\) be a continuous function. Suppose additionally that there exists some \(L>0\) such that for any \(t\in\mathbb R\) and all \(y_1,y_2\in \mathbb R\), \[\left\lvert f\left( t, y_1 \right)-f\left( t,y_2 \right) \right\rvert< L \left\lvert y_1-y_2 \right\rvert.\] Let \(\left( t_0, y_0 \right)\in \mathbb R^2\). Then there exists a unique differentiable function \(y\) defined on a neighbourhood of \(t_0\) such that \(y\left( t_0 \right)=y_0\) and \[y'(t) = f\left( t, y(t) \right).\]

Of course, this also works when \(f\) is only defined on a subset of \(\mathbb{R}\times \mathbb{R}\) instead.

The idea is to start with a naïve guess for what the solution should be, then integrate both sides of the differential equation. This integration defines a map from the metric space of continuous functions on some domain (under the uniform metric) to itself. One often needs to shrink the domain in order to make this iterative scheme into a contraction mapping, from which point our contraction mapping principle does the heavy lifting.
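As an illustration (not part of the proof), one can run a discretised version of this scheme numerically. The sketch below, in which the grid, interval, and example ODE are all arbitrary choices, repeatedly applies the integral operator \(\left( Ty \right)(t)=y_0+\int_{t_0}^{t} f\left( s, y(s) \right)\,ds\) to the problem \(y'=y\), \(y(0)=1\), whose exact solution is \(e^t\):

```python
# A discretised sketch of Picard iteration for y' = y, y(0) = 1 on [0, 1/2].
# Each step applies (T y)(t) = 1 + \int_0^t y(s) ds, approximated by a
# cumulative trapezoid rule on a fixed grid.
import numpy as np

t = np.linspace(0.0, 0.5, 501)

def picard_step(y):
    integrand = y  # f(t, y) = y in this example
    increments = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)
    return 1.0 + np.concatenate(([0.0], np.cumsum(increments)))

y = np.ones_like(t)  # naive initial guess: the constant function y0 = 1
for _ in range(20):
    y = picard_step(y)

print(np.max(np.abs(y - np.exp(t))))  # tiny: the iterates approach e^t
```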

Exercise 9.

Generalise the Picard-Lindelöf theorem to systems of first-order differential equations as follows:

Suppose \(f:\mathbb{R}\times \mathbb{R}^n\to \mathbb{R}^n\) is a continuous function. Suppose also that there exists some \(L>0\) such that for all \(t\in \mathbb{R}\) and \(y_1,y_2\in \mathbb{R}^n\), \[\left\lVert f\left( t,y_1 \right)-f\left( t,y_2 \right) \right\rVert\leq L \left\lVert y_1-y_2 \right\rVert.\] Let \(t_0\in \mathbb{R}\) and \(y_0\in \mathbb{R}^n\). Show that there exists a unique differentiable function \(y\), defined on a neighbourhood of \(t_0\), such that \(y\left( t_0 \right)=y_0\) and \(y'(t)=f\left( t, y(t) \right)\).

Well, this is really a part of the Picard-Lindelöf theorem, and one may treat integration of a vector-valued function as coordinate-wise integration. The proof works out in exactly the same way thereafter. What this says is that initial value problems for systems of ODEs have unique solutions.

Exercise 10.

Generalise the Picard-Lindelöf theorem to higher-order ODEs as follows:

Suppose \(f:\mathbb{R}\times \mathbb{R}^n\to \mathbb{R}\) is a continuous function. Suppose also that there exists some \(L>0\) such that for all \(t\in \mathbb{R}\) and for all \(v_1,v_2\in \mathbb{R}^n\), \[\left\lvert f\left( t, v_1 \right)-f\left( t,v_2 \right) \right\rvert\leq L \left\lVert v_1-v_2 \right\rVert.\] Let \(t_0\in \mathbb{R}\), and let \(y_0,\ldots, y _{n-1}\in \mathbb{R}\). Show that there exists an \(n\)-times differentiable function \(y\) defined on a neighbourhood of \(t_0\) such that \(y ^{(k)}\left( t_0 \right)= y_k\) for all \(k=0,\ldots, n-1\) (i.e., the first \(n-1\) derivatives of \(y\) satisfy an initial value condition) and \[y ^{(n)}(t) = f\left( t, y(t), y'(t), \ldots, y ^{(n-1)}(t) \right).\]

In other words, not only are systems of ODEs with initial values guaranteed to have unique (local) solutions, but so are higher-order ODEs.

There are two ways to approach this. Perhaps the more common approach is to express a higher-order ODE as a system of first-order ODEs (see the reduction sketched below), then use our first “generalisation” of Picard-Lindelöf. The other approach, which I have not worked out and whose validity I am unsure of, depends on the idea in the exercise that follows.
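For the record, here is the standard reduction behind the first approach (a sketch; the details are part of the exercise). Given \(y^{(n)}=f\left( t, y, y', \ldots, y^{(n-1)} \right)\), collect the unknown and its derivatives into a vector: \[v = \begin{pmatrix} v_0 \\ \vdots \\ v _{n-1} \end{pmatrix} = \begin{pmatrix} y \\ \vdots \\ y ^{(n-1)} \end{pmatrix}, \qquad v' = F\left( t, v \right) = \begin{pmatrix} v_1 \\ \vdots \\ v _{n-1} \\ f\left( t, v_0, \ldots, v _{n-1} \right) \end{pmatrix},\] with initial condition \(v\left( t_0 \right)=\left( y_0,\ldots, y _{n-1} \right)\). One checks that \(\left\lVert F(t,v)-F(t,w) \right\rVert\leq \sqrt{1+L^2}\,\left\lVert v-w \right\rVert\), so Exercise 9 produces a unique local solution \(v\), whose first coordinate solves the original \(n\)-th order problem.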

Exercise 11.

Let \(C^k\left( \left[ 0, 1 \right] \right)\) be the set of \(k\)-times continuously differentiable functions on \(\left[ 0,1 \right]\). Define the norm \[\left\lVert f \right\rVert = \sum _{n=0}^{k} \sup _{x\in \left[ 0, 1 \right]} \left\lvert f ^{(n)}(x) \right\rvert.\] (Note that this is the sum of the suprema, not the supremum of the sum.)

  1. Show that \(d\left( f, g \right)=\left\lVert f-g \right\rVert\) is a metric.
  2. Show that \(C^k\left( \left[ 0, 1 \right] \right)\) is complete with respect to this metric.
  3. Let \(x_0,\ldots, x _{k-1}\) be any real numbers. Show that the set \[\left\lbrace f\in C^k\left( \left[ 0,1 \right] \right) : f ^{(n)}(0)=x_n\ \forall n=0,\ldots, k-1 \right\rbrace\] is a closed subset of \(C^k\left( \left[ 0,1 \right] \right)\).
  4. Use the above facts together with a modified version of Picard iteration to give an alternate proof of the existence and uniqueness of solutions to higher-order ODEs.