10. Derivatives
I think we’ve all seen the definition of derivatives before, but it’s worth recalling it one more time for good measure.
Definition 1.
Let \(f: (a, b)\to \mathbb{R}\) be a function, and let \(x \in (a, b)\). The derivative of \(f\) at \(x\) is defined as \[f’(x) = \lim _{h\to 0} \frac{f(x+h)-f(x)}{h} = \lim _{y\to x} \frac{f(y)-f(x)}{y-x}, \] if it exists. If \(f’(x)\) exists for all \(x\in (a, b)\), we say \(f\) is differentiable.
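If you like, here is a quick numerical sanity check of the definition, outside the formal development. The function \(f(x) = x^2\) and the point \(x = 1.5\) are arbitrary choices for illustration.

```python
# Numerical sanity check: the difference quotient (f(x+h) - f(x)) / h
# should approach f'(x) as h -> 0. For f(x) = x**2 at x = 1.5,
# the true derivative is 2 * 1.5 = 3.0.

def difference_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda x: x**2
x = 1.5

for h in [1e-1, 1e-3, 1e-6]:
    print(h, difference_quotient(f, x, h))
```

For this \(f\), the quotient works out to exactly \(3 + h\), so the error shrinks linearly with \(h\).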
I’m hoping you have seen the definition of a derivative at some point in your life before.
Proposition 2.
If \(f\) is differentiable at a point \(x\), then \(f\) is continuous at \(x\).
Both the differentiability and continuity of \(f\) at \(x\) are “local statements”: they depend only on the values of \(f\) at points arbitrarily close to \(x\). This proposition says that differentiability carries more information about this local behaviour than continuity does.
This is…sort of obvious. Continuity only says, qualitatively, that small changes in the input produce small changes in the output of \(f\); differentiability quantifies exactly how the size of the output change compares to the size of the input change.
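The intuition above can be made precise in one line; a sketch of the standard argument:

```latex
\begin{proof}[Sketch of Proposition 2]
For $y \neq x$, write the increment as a product:
\[
  f(y) - f(x) = \frac{f(y) - f(x)}{y - x} \cdot (y - x).
\]
As $y \to x$, the first factor tends to $f'(x)$, which exists by hypothesis,
and the second factor tends to $0$. Hence $f(y) - f(x) \to f'(x) \cdot 0 = 0$,
so $\lim_{y \to x} f(y) = f(x)$; that is, $f$ is continuous at $x$.
\end{proof}
```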
You hopefully have shown that linear combinations of differentiable functions are differentiable, proving along the way that derivatives are linear. Similar to continuous functions, products of differentiable functions are differentiable as well, and this is often cited as the product rule.
Proposition 3. The Product Rule
Let \(f, g\) be two functions that are differentiable at a point \(x\). Then \(P(x) = f(x) \cdot g(x)\) is also differentiable at \(x\), and its derivative is \[P’(x) = f’(x) g(x) + f(x) g’(x). \]
The proof is remarkably similar to proving that the product of continuous functions is continuous: the trick is to add zero in a clever way.
Proof
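For reference, a sketch of the standard add-zero argument:

```latex
\begin{proof}[Sketch]
For $h \neq 0$, add and subtract $f(x+h)g(x)$ in the numerator:
\begin{align*}
  \frac{P(x+h) - P(x)}{h}
  &= \frac{f(x+h)g(x+h) - f(x+h)g(x) + f(x+h)g(x) - f(x)g(x)}{h} \\
  &= f(x+h)\,\frac{g(x+h) - g(x)}{h} + \frac{f(x+h) - f(x)}{h}\,g(x).
\end{align*}
As $h \to 0$, we have $f(x+h) \to f(x)$, since differentiability implies
continuity, and the two difference quotients tend to $g'(x)$ and $f'(x)$
respectively. Hence $P'(x) = f'(x)g(x) + f(x)g'(x)$.
\end{proof}
```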
Note that, much like with continuity, the converse is not true: if \(f(x)\cdot g(x)\) is differentiable, it’s possible for neither \(f\) nor \(g\) to be differentiable, even if both are continuous. Thus, even if the product rule (and likewise, the chain rule) predicts that the derivative of a function should not exist, that does not necessarily mean it doesn’t. For instance,
Example 4.
Show that the function \[f(x) = \begin{cases} x^2\sin\left( \frac{1}{x} \right) & x\neq 0, \\ 0 & x= 0\end{cases}\] is differentiable everywhere. (You may use the chain rule and the product rule.) What is the derivative of \(f\) at \(0\)?
Solution
When \(x\neq 0\), we can use the product rule and the chain rule to get \[f’(x) = 2x\sin \left( \frac{1}{x} \right) - \cos \left( \frac{1}{x} \right).\] However, this formula is not defined at \(x = 0\), and its limit as \(x\to 0\) does not exist either, since the cosine term oscillates.
Instead, we claim that \(f’(0) = 0\). Using \(f(0) = 0\), we have \[f’(0) = \lim _{h\to 0} \frac{f(h)}{h} = \lim _{h\to 0} h \sin \left( \frac{1}{h} \right).\] Since \(-1 \leq \sin \left( \frac{1}{h} \right) \leq 1\) for all \(h\neq 0\), we have \(-\left\lvert h \right\rvert \leq h \sin \left( \frac{1}{h} \right) \leq \left\lvert h \right\rvert\), so by the squeeze theorem, this limit is \(0\). \(\square\)
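One can also watch the squeeze happen numerically; this is an informal check, not part of the proof.

```python
import math

# f(x) = x^2 sin(1/x) for x != 0, and f(0) = 0.
def f(x):
    return x**2 * math.sin(1 / x) if x != 0 else 0.0

# The difference quotient at 0 is f(h)/h = h * sin(1/h),
# which is squeezed between -|h| and |h|, hence tends to 0.
for h in [1e-1, 1e-4, 1e-8]:
    quotient = f(h) / h
    assert abs(quotient) <= abs(h)  # the squeeze bound
    print(h, quotient)
```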
Again, this example shows that one can take the product of two continuous functions, one of which is not differentiable, and get something that’s differentiable. This is not a contradiction of the product rule: if the product rule gives you something undefined at a point, the derivative at that point may still exist.
The same thing applies to the chain rule:
Exercise 5.
Let \[f(x) = \begin{cases} x\sin \left( \frac{1}{x} \right) & x\neq 0, \\ 0 & x=0\end{cases}\] and \(g(x) = x^2\). Show that \(f\) is continuous but not differentiable at \(0\). Show that \(f\circ g\) is differentiable everywhere. What is its derivative at \(0\)?
As you can perhaps tell, both of these examples hinge on a single, very bad point of discontinuity for the derivative of the product or composite of two functions. The moment you have continuity of the derivative “almost” everywhere, all of these worries fade away. For instance,
Proposition 6.
Let \(f : (-1, 1) \to \mathbb{R} \) be continuous, and suppose \(f’\) exists and is uniformly continuous on \(\left[ -\frac{1}{2},\frac{1}{2} \right]\setminus \left\lbrace 0 \right\rbrace\). Then \(f\) is differentiable at \(0\), and \[f’(0) = \lim _{x\to 0}f’(x).\]
What this proposition is saying is: if you have a function whose derivative is one point away from being continuous, then it’s actually just continuous: the derivative exists at the missing point as well.
We won’t prove this statement today. Try as you might, it’s really, really difficult to juggle the nested limits directly. After all, we saw several weeks back that nested limits are the bane of our existence, and we don’t really have any good tools for dealing with them.
One ends up proving this instead by using the fundamental theorem of calculus: rather than directly messing around with the derivatives, we integrate out the derivatives and put them back in later.
This is a paradigmatic idea to follow with derivatives in general, by the way. Differentiation takes nice functions and makes them ugly and hard to work with; integration in general takes ugly functions and makes them nice to work with. Thus, whenever you want to say something about derivatives, it’s often easiest to integrate first and undo this later.
Let’s finish off our discussion by looking at one of the greatest features of the theory of differentiation.
Theorem 7. Mean Value Theorem
Let \(f: \left[ a, b \right]\to \mathbb{R}\) be continuous, and suppose \(f\) is differentiable on \((a, b)\). Then, there exists some \(c\in (a, b)\) such that \[f’(c) = \frac{f(b) - f(a)}{ b - a }.\]
What a great theorem! Let’s reframe this theorem in a slightly different view. Let \(x_0\) be a fixed point, and let \(f\) be differentiable on some open interval containing \(x_0\). Let \(x\) be in this interval. The mean value theorem applied to \(f\) says there exists a point \(c\) between \(x\) and \(x_0\) such that \[f’(c) = \frac{f(x) - f\left( x_0 \right)}{x - x_0} \implies f(x) = f\left( x_0 \right) + f’(c) \left( x-x_0 \right).\] This should remind you a lot of Taylor’s theorem. In fact, one can think of the mean value theorem as describing how accurate a \(0\)-order Taylor polynomial approximation of a function \(f\) is. Taylor’s theorem is a vast generalisation of this idea.
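For a concrete instance (purely illustrative), take \(f(x) = x^3\) on \([0, 1]\): the mean slope is \(1\), and \(f’(c) = 3c^2 = 1\) exactly at \(c = 1/\sqrt{3}\), which indeed lies in \((0, 1)\).

```python
# Mean value theorem, concretely: f(x) = x^3 on [a, b] = [0, 1].
# Mean slope: (f(1) - f(0)) / (1 - 0) = 1.
# f'(c) = 3 c^2 equals 1 at c = 1 / sqrt(3), which lies in (0, 1).

a, b = 0.0, 1.0
f = lambda x: x**3
fprime = lambda x: 3 * x**2

mean_slope = (f(b) - f(a)) / (b - a)
c = (1 / 3) ** 0.5  # solves f'(c) = mean_slope

print(mean_slope, fprime(c))
```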
Thinking about how the mean value theorem was proven, one can say:
Theorem 8. First-order Taylor's Theorem
Let \(f\) be a twice-differentiable function on an open interval containing \(x_0\). Let \(x\) be in this open interval, and define \[T_1(x) = f\left( x_0 \right) + f’\left( x_0 \right) \left( x-x_0 \right).\] For any \(x\neq x_0\), there exists some \(c\) between \(x\) and \(x_0\) such that \[f(x) = T_1(x) + \frac{1}{2} f’’(c) \left( x-x_0 \right)^2.\]
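As an illustrative check (the choice of \(f = \cos\) and \(x_0 = 0\) is mine, not the notes’): here \(T_1(x) = 1\), and the theorem gives \(f(x) - T_1(x) = -\frac{1}{2}\cos(c)\,x^2\), so the error is at most \(x^2/2\) in absolute value.

```python
import math

# First-order Taylor at x0 = 0 for f = cos:
# T1(x) = cos(0) + (-sin(0)) * (x - 0) = 1.
# The theorem says cos(x) - T1(x) = (1/2) * (-cos(c)) * (x - 0)**2
# for some c between 0 and x, so the error is at most x**2 / 2.

x0 = 0.0
T1 = lambda x: math.cos(x0) - math.sin(x0) * (x - x0)  # = 1 when x0 = 0

for x in [0.5, 0.1, 0.01]:
    error = abs(math.cos(x) - T1(x))
    assert error <= x**2 / 2  # the remainder bound from the theorem
    print(x, error)
```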
The proof below is quite unconventional, but I think it’s a good demonstration of what the mean value theorem has done for humanity.
Proof
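For comparison, the conventional argument (not necessarily the one given above) goes through Rolle’s theorem with an auxiliary function; a sketch:

```latex
\begin{proof}[Sketch]
Fix $x \neq x_0$ and choose $M$ so that $f(x) = T_1(x) + M (x - x_0)^2$.
Define
\[
  g(t) = f(t) - f(x_0) - f'(x_0)(t - x_0) - M (t - x_0)^2 .
\]
Then $g(x_0) = g(x) = 0$, so by Rolle's theorem there is some $c_1$ between
$x_0$ and $x$ with $g'(c_1) = 0$. Since $g'(x_0) = 0$ as well, applying
Rolle's theorem to $g'$ gives some $c$ between $x_0$ and $c_1$ with
\[
  g''(c) = f''(c) - 2M = 0,
  \qquad \text{i.e.} \qquad
  M = \tfrac{1}{2} f''(c).
\]
\end{proof}
```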
One can actually prove Taylor’s theorem with the exact same argument and some slick induction. I’ll leave that to you as an exercise.