
5. Numerical Integration and Differentiation


Last week, we talked about the different ways of computing Lagrange’s interpolating polynomials, and during lecture, you should have discussed the extension of these techniques to Hermite’s interpolating polynomials. Both of these answer the same question: given an unknown function $f$ on an interval $[a, b]$, and given the value of $f$ at several points in this interval, can you “guess” the values of $f$ elsewhere on the interval?

One way of approaching this is by invoking the fundamental theorem of algebra: given points $x_0, \ldots, x_n$ and oracular knowledge of $f(x_0), \ldots, f(x_n)$, there is exactly one polynomial of degree at most $n$ that passes through each point $(x_k, f(x_k))$. However, all we know is that such a polynomial exists, and the process of determining this polynomial (though not particularly difficult) was somewhat ad hoc. This is even more evident in the mysterious and difficult-to-memorise formula for Hermite interpolation.

Let’s reinterpret the process of polynomial interpolation through the lens of linear algebra. The key idea is that although the space of differentiable (or even smooth) functions is an infinite-dimensional vector space, we are only exposed to a “finite-dimensional set of information” about our function. We can then exploit the constraints of dimensionality to infer more information about our function based on what we’re given.

Let $P_n$ be the set of polynomials of degree at most $n$. This is a finite-dimensional real vector space, so $P_n^*$ (the set of linear functions $P_n \to \mathbb{R}$) is isomorphic to $P_n$ and hence has the same dimension, namely $n+1$.

There are lots of linear functionals in $P_n^*$, but there is a set of $n+1$ functionals we can exploit. We define $e_{x_i}$ to be the evaluation at $x_i$: it sends the polynomial $p \in P_n$ to the real number $p(x_i)$. Written another way, $e_{x_i}(p) = p(x_i)$. The functionals $e_{x_0}, \ldots, e_{x_n}$ are linearly independent, and since there are $n+1$ of them, they span $P_n^*$.

Let $x$ be a real number, and let $e_x$ be the evaluation at $x$ (i.e., $e_x(p) = p(x)$). Then, $e_x \in P_n^*$, so there exist real numbers $c_0(x), \ldots, c_n(x)$ such that $$e_x = c_0(x) e_{x_0} + \cdots + c_n(x) e_{x_n}.$$ Written another way, there exist real numbers $c_0(x), \ldots, c_n(x)$ that depend only on $x$ (and $x_0, \ldots, x_n$, of course) such that for all polynomials $p$ of degree at most $n$, we have $$p(x) = c_0(x) p(x_0) + \cdots + c_n(x) p(x_n).$$
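For instance, when $n = 1$, the coefficients can be guessed directly: linear interpolation through $x_0$ and $x_1$ gives $$c_0(x) = \frac{x_1 - x}{x_1 - x_0}, \qquad c_1(x) = \frac{x - x_0}{x_1 - x_0},$$ and you can check that $p(x) = c_0(x) p(x_0) + c_1(x) p(x_1)$ holds for $p(x) = 1$ and $p(x) = x$, hence (by linearity) for every polynomial of degree at most $1$.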

There is a way to determine these coefficients $c_0(x), \ldots, c_n(x)$ using linear algebra as well. Let $\left\lbrace p_0, \ldots, p_n \right\rbrace$ be a basis for $P_n$. Then, one has the matrix equation $$\begin{pmatrix} p_0(x) \\ p_1(x) \\ \vdots \\ p_n(x) \end{pmatrix} = \begin{pmatrix} p_0(x_0) & p_0(x_1) & \cdots & p_0(x_n) \\ p_1(x_0) & p_1(x_1) & \cdots & p_1(x_n) \\ \vdots & \vdots & & \vdots \\ p_n(x_0) & p_n(x_1) & \cdots & p_n(x_n) \end{pmatrix} \begin{pmatrix} c_0(x) \\ c_1(x) \\ \vdots \\ c_n(x) \end{pmatrix}.$$ Since $p_0, \ldots, p_n$ are linearly independent, the matrix is invertible (exercise!), so one can recover $c_0, \ldots, c_n$ by simply inverting this matrix. This shows that the $c_i(x)$ are all linear combinations of the $p_i(x)$, hence are themselves polynomials.
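Here is a minimal MATLAB sketch of this computation, using the monomial basis $p_i(t) = t^i$; the function name interp_coeffs and the variable names are my own choices, not anything fixed by the discussion above.

    function c = interp_coeffs(x_nodes, x)
        % Solve for c_0(x), ..., c_n(x) using the basis p_i(t) = t^i.
        % x_nodes is a column vector of the sample points x_0, ..., x_n.
        n   = length(x_nodes);
        A   = zeros(n);
        rhs = zeros(n, 1);
        for row = 1:n
            A(row, :) = (x_nodes(:)') .^ (row - 1);   % p_{row-1} at each node
            rhs(row)  = x ^ (row - 1);                % p_{row-1} at x
        end
        c = A \ rhs;    % solve the matrix equation above
    end

For example, interp_coeffs([0; 1; 2], 0.5) returns the column (0.375, 0.75, -0.125), i.e. the values $c_0(0.5)$, $c_1(0.5)$, and $c_2(0.5)$ for the nodes $0, 1, 2$.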

Remark 1.

This process will always produce the same result for $c_i(x)$, no matter what choice of basis $\left\lbrace p_0, \ldots, p_n \right\rbrace$ you pick. It turns out that the $c_i(x)$’s described above are exactly the Lagrange polynomials: $$c_i(x) = \prod_{\substack{k = 0 \\ k \neq i}}^{n} \frac{x - x_k}{x_i - x_k}.$$

A more typical interpretation is that the $c_i$’s themselves form a basis for $P_n$, and the coordinates of a polynomial with respect to this basis are exactly its values at $x_0, \ldots, x_n$.

Remark 2.

To perform the interpolation between the points $(x_0, y_0), \ldots, (x_n, y_n)$, one needs to first determine the $c_i(x)$’s as above, then compute the dot product $$\begin{pmatrix} c_0(x) & \cdots & c_n(x) \end{pmatrix} \begin{pmatrix} y_0 \\ \vdots \\ y_n \end{pmatrix}.$$ Neville’s method can be interpreted as expressing this dot product in matrix form by using $$\begin{pmatrix} c_0(x) \\ \vdots \\ c_n(x) \end{pmatrix} = \begin{pmatrix} p_0(x_0) & \cdots & p_0(x_n) \\ \vdots & & \vdots \\ p_n(x_0) & \cdots & p_n(x_n) \end{pmatrix}^{-1} \begin{pmatrix} p_0(x) \\ \vdots \\ p_n(x) \end{pmatrix}.$$ For a good choice of $p_0, \ldots, p_n$, one can rewrite this as a product of $n$ “sparse-ish” $k \times (k-1)$ matrices, where $k$ ranges over $2, \ldots, n+1$. The intermediate products then give information about the lower-order interpolating polynomials!
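If you would like to see this in code, here is a minimal sketch of Neville’s method in its usual recursive, tabular form (rather than the sparse matrix factorisation described above); the function name neville and the in-place array Q are my own choices.

    function val = neville(x, y, t)
        % Evaluate the interpolating polynomial through (x(i), y(i)) at the point t.
        % At stage j, Q(i) holds the value at t of the polynomial interpolating
        % the j consecutive points x(i-j+1), ..., x(i).
        n = length(x);
        Q = y(:);      % zeroth order: constant interpolants through each point
        for j = 2:n
            for i = n:-1:j
                Q(i) = ((t - x(i-j+1)) * Q(i) - (t - x(i)) * Q(i-1)) / (x(i) - x(i-j+1));
            end
        end
        val = Q(n);
    end

The intermediate entries of Q are exactly the lower-order interpolating polynomials evaluated at t, matching the remark above.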

Remark 3.

The method of divided differences can be interpreted as a change of basis: rather than using the Lagrange polynomials, one picks $c_i(x)$’s that are very easy to compute. From our dual perspective, this corresponds to using the $i$-th divided differences as a dual basis instead of the evaluation functionals. The (dual) change-of-basis matrix is triangular and represents the computations of the divided differences!

Derivatives and Integrals are Linear

This is a rather suspicious perspective to take on interpolating polynomials, especially if you’re not very familiar with linear algebra. However, the level of abstraction in the above discussion helps us generalise these techniques and ideas in various directions.

In applications like signal processing and image analysis, the discrete Fourier transform is a centrally important tool. Linear algebra enters the fray since the DFT is a linear transformation; though computing it directly is extremely burdensome, people have developed a class of fast Fourier transform algorithms that rely on sparse matrix factorisations (somewhat similar to Neville’s method, but even better).

For this class, the first of two applications of interest is numerical differentiation: given a function $f$ and points $(x_0, f(x_0)), \ldots, (x_n, f(x_n))$ close to $x_*$, can we estimate $f'(x_*)$?

A first step would be to apply Taylor’s theorem: there is a polynomial $p_f$ of degree at most $n$ such that $$f(x) = p_f(x - x_*) + O\left( \left\lvert x - x_* \right\rvert^{n+1} \right).$$ By definition of the Taylor polynomial, one has $p_f'(0) = f'(x_*)$.

We can now play the same game as before: let $P_n$ be the space of real polynomials of degree at most $n$, let $e_i$ be the evaluation at $x_i - x_*$, and let $D$ be differentiation at $0$. Each of these is an element of $P_n^*$, and one still has that $\left\lbrace e_0, \ldots, e_n \right\rbrace$ forms a basis for $P_n^*$. Thus, there exist constants $d_0, \ldots, d_n$ such that $D = d_0 e_0 + \cdots + d_n e_n$. Written more explicitly, one has for any polynomial $p \in P_n$ that $$p'(0) = d_0 p(x_0 - x_*) + \cdots + d_n p(x_n - x_*).$$ Applying this to $p_f$ from before and using the fact that $p_f(x_i - x_*) = f(x_i) + O\left( \left\lvert x_i - x_* \right\rvert^{n+1} \right)$ yields $$f'(x_*) = d_0 f(x_0) + \cdots + d_n f(x_n) + O\left( \sum_{i=0}^{n} \left\lvert x_i - x_* \right\rvert^{n+1} \right).$$ In fact, this technique isn’t limited only to the first derivative; one can apply this idea to any of the first $n$ derivatives!
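As a sanity check, take $n = 2$ with $x_0 = x_* - h$, $x_1 = x_*$, and $x_2 = x_* + h$. Solving for $d_0, d_1, d_2$ (using the monomial basis, say) gives $d_0 = -\frac{1}{2h}$, $d_1 = 0$, and $d_2 = \frac{1}{2h}$, so the recipe above becomes $$f'(x_*) \approx \frac{f(x_* + h) - f(x_* - h)}{2h},$$ the familiar central difference formula.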

In much the same vein, we can estimate $\int_a^b f(x)\, dx$ with the exact same technique. With the $e_{x_0}, \ldots, e_{x_n}$ as before, we have that $I : p \mapsto \int_a^b p(x)\, dx$ is a linear functional on $P_n$, hence there exist constants $i_0, \ldots, i_n$ such that $I = i_0 e_{x_0} + \cdots + i_n e_{x_n}$. That is to say, for any $p \in P_n$, we have $$\int_a^b p(x)\, dx = i_0 p(x_0) + \cdots + i_n p(x_n).$$ Taking $p$ to be the interpolating polynomial for the points $(x_0, f(x_0)), \ldots, (x_n, f(x_n))$, we get $$\int_a^b f(x)\, dx = i_0 f(x_0) + \cdots + i_n f(x_n) + O\left( \left\lvert b - a \right\rvert^{n+1} \right).$$ The error term comes from integrating the error formula for the interpolating polynomial.
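For example, with $n = 1$, $x_0 = a$, and $x_1 = b$, the two conditions $\int_a^b 1\, dx = i_0 + i_1$ and $\int_a^b x\, dx = i_0 a + i_1 b$ force $i_0 = i_1 = \frac{b-a}{2}$, and the recipe above becomes the trapezoid rule $$\int_a^b f(x)\, dx \approx \frac{b-a}{2} \left( f(a) + f(b) \right).$$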

Remark 4.

We’ve used two different types of polynomials to approximate our unknown function $f$ in the above discussion: when approximating a derivative of $f$, we used a Taylor polynomial, and when approximating the integral of $f$, we used an interpolating polynomial. This reflects how the derivative only uses local data about $f$, and the Taylor polynomial captures as much of the local data as possible; in contrast, integrals consider global data about $f$, and an interpolating polynomial more appropriately captures such data.

On Computational Efficiency and Sharpness of the Error Terms

While the idea of using linear algebra to approach these numerical estimations is expositionally extremely efficient, in practice it’s quite a bad idea to apply it directly without some extra work. Inverting a matrix to determine the coefficients $d_0, \ldots, d_n$ or $i_0, \ldots, i_n$ using something like Cramer’s Rule is unfathomably slow.

Most commonly, the data points $x_0, \ldots, x_n$ can be chosen freely, and they are most frequently chosen to be evenly spaced out. The coefficients can then be worked out explicitly ahead of time; in numerical integration, these are known as the Newton-Cotes formulae. In these cases, due to the symmetry of the data, one sometimes gets a better error term than anticipated (e.g., $O\left( \left\lvert b-a \right\rvert^5 \right)$ instead of $O\left( \left\lvert b-a \right\rvert^4 \right)$).
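Simpson’s rule is the standard example: taking $x_0 = a$, $x_1 = \frac{a+b}{2}$, and $x_2 = b$ and solving the $3 \times 3$ system gives $i_0 = i_2 = \frac{b-a}{6}$ and $i_1 = \frac{2(b-a)}{3}$, i.e. $$\int_a^b f(x)\, dx \approx \frac{b-a}{6} \left( f(a) + 4 f\left( \tfrac{a+b}{2} \right) + f(b) \right).$$ Because of the symmetric placement of the nodes, this rule turns out to be exact on cubics, not just quadratics, which is where the extra order of accuracy comes from.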

Even when these data points are not uniform, one can work out explicit formulae for what our coefficients should be. As one may expect, they are closely related to the Lagrange polynomials. However, they are generally quite difficult to compute or even describe; the work needed to do this (computationally and mentally) is likely comparable to the work needed to use the matrix equation.

Later in life (i.e., if and when you take 151B), you will learn about ways to efficiently invert matrices, either exactly or approximately, and this can save a lot of time when performing numerical differentiation or integration in bulk.

Exploiting MATLAB’s Amazing Linear Algebra

Fortunately, MATLAB has a rich set of features when it comes to linear algebra, and we can use this to our advantage. At the cost of some computation time, we can avoid all the pain of working out the coefficients by hand (such as on your homework).

Let’s start with numerical differentiation: we are given a list of data points $(x_0, f(x_0)), \ldots, (x_n, f(x_n))$, and we are asked to approximate the derivative $f'(x_*)$ at some point $x_*$. Let’s write the pseudocode for this function first. At the core of our algorithm is the matrix equation $$\begin{pmatrix} p_0'(0) \\ \vdots \\ p_n'(0) \end{pmatrix} = \begin{pmatrix} p_0(x_0 - x_*) & \cdots & p_0(x_n - x_*) \\ \vdots & & \vdots \\ p_n(x_0 - x_*) & \cdots & p_n(x_n - x_*) \end{pmatrix} \begin{pmatrix} d_0 \\ \vdots \\ d_n \end{pmatrix}.$$ Recall that we have the freedom to pick the polynomials $p_0, \ldots, p_n$; thus, we’ll select $p_i(x) = x^i$. The derivatives are very, very easy to compute.

  0. INPUTS: column vectors x = [x0, ..., xn] and y = [y0, ..., yn]; a real number x_star.
  1. Define an $(n+1) \times (n+1)$ matrix A according to the formula A(i, j) = (x(j) - x_star) ^ i.
  2. Compute [d0, ..., dn] = (A ^ -1) * [0, 1, 0, ..., 0].
  3. Compute and output the dot product d0 * y0 + ... + dn * yn.

That’s all there is to it! Here it is in MATLAB:
 1  function fp = num_diff(x, y, x_star)
 2      % setting up the matrix
 3      n = length(x);
 4      A = zeros(n);
 5      A(1, :) = 1;        % set the entire first row to ones
 6      for row = 2:n
 7          A(row, :) = (x' - x_star) .* A(row - 1, :);
 8      end % of for loop
 9
10      % compute the vector of coefficients, d
11      v = zeros(n, 1);
12      v(2, 1) = 1;
13      d = A \ v;
14
15      % then output the dot product d' * y
16      fp = d' * y;
17  end % of num_diff

Think about what line 7 is doing and why it works! I think writing code like this really helps one appreciate the wonders of MATLAB.
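As a quick sanity check (assuming the function above is saved as num_diff.m), you can try it on a function whose derivative you know:

    % five samples of sin(x) near 0; the true derivative there is cos(0) = 1
    x = [-0.2; -0.1; 0; 0.1; 0.2];
    y = sin(x);
    fp = num_diff(x, y, 0)    % prints a value very close to 1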

Try to write numerical integration by yourself: write a function that takes column vectors x = [x0, ..., xn] and y = [y0, ..., yn] alongside two real numbers a and b as inputs, then approximates $\int_a^b f(x)\, dx$, where $f$ is an unknown function passing through the provided data points.

Pseudocode

The pseudocode is almost identical to the pseudocode for numerical differentiation. The only difference is that instead of having a vector of derivatives, we’ll have a vector of integrals to work with. We’ll continue using $p_i(x) = x^i$; these should be easy to integrate as well. (Note that there is no $x_*$ this time: we evaluate the basis polynomials at the $x_j$’s themselves.)

  1. Define an $(n+1) \times (n+1)$ matrix A according to the formula A(i, j) = x(j) ^ i.
  2. Define the vector v by v(i) = (b^(i+1) - a^(i+1)) / (i+1). Compute [i0, ..., in] = (A ^ -1) * v.
  3. Compute and output the dot product i0 * y0 + ... + in * yn.
Solution
function integral = num_int(x, y, a, b)
    % setting up the matrix
    n = length(x);
    A = zeros(n);
    A(1, :) = 1;        % set the entire first row to ones
    for row = 2:n
        A(row, :) = x' .* A(row - 1, :);
    end % of for loop

    % compute the vector of coefficients, i
    v = (1:n)';
    v = ((b .^ v) - (a .^ v)) ./ v; % black magic
    i = A \ v;

    % then output the dot product i' * y
    integral = i' * y;
end % of num_int
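As before, a quick test (assuming this is saved as num_int.m): integrating $\sin(x)$ over $[0, \pi]$ from five samples should return something very close to the exact value $2$.

    x = linspace(0, pi, 5)';    % five evenly spaced sample points
    y = sin(x);
    num_int(x, y, 0, pi)        % prints roughly 1.9986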

Exercise 5.

Rewrite the function num_diff so that it takes in a callable function f, two real numbers a and b, and two integers N and n. Then, approximate the graph of the derivative of f at $N$ evenly spaced points across the interval $[a, b]$ by numerically differentiating $f$ with $n$ evenly spaced data points at each point.

Exercise 6.

Rewrite the function num_int so that it takes in a callable function f, two real numbers a and b, and two integers N and n. Then, numerically integrate $\int_a^b f(x)\, dx$ by splitting $[a, b]$ into $N$ intervals of equal length and using an $n$-point approximation to the integral of $f$ over each sub-interval.