
5. Numerical Integration and Differentiation


Last week, we talked about the different ways of computing Lagrange’s interpolating polynomials, and during lecture, you should have discussed the extension of these techniques to Hermite’s interpolating polynomials. Both of these answer the same question: given an unknown function $f$ on an interval $[a, b]$, and given the value of $f$ at several points in this interval, can you “guess” the values of $f$ elsewhere on the interval?

One way of approaching this is by invoking the fundamental theorem of algebra: given points $x_0, \ldots, x_n$ and oracular knowledge of $f(x_0), \ldots, f(x_n)$, there is exactly one polynomial of degree at most $n$ that passes through each point $(x_k, f(x_k))$. However, all we know is that such a polynomial exists, and the process of determining this polynomial (though not particularly difficult) was somewhat ad hoc. This is even more evident in the mysterious and difficult-to-memorise formula for Hermite interpolation.

Let’s reinterpret the process of polynomial interpolation through the lens of linear algebra. The key idea is that although the space of differentiable (or even smooth) functions is an infinite-dimensional vector space, we are only exposed to a “finite-dimensional set of information” about our function. We can then exploit the constraints of dimensionality to infer more information about our function based on what we’re given.

Let $P_n$ be the set of polynomials of degree at most $n$. This is a finite-dimensional real vector space, so $P_n^*$ (the set of linear functions $P_n \to \mathbb{R}$) is isomorphic to $P_n$ and hence has the same dimension, namely $n+1$.

There are lots of linear functionals in $P_n^*$, but there is a set of $n+1$ functionals we can exploit. We define $e_{x_i}$ to be the evaluation at $x_i$: it sends the polynomial $p \in P_n$ to the real number $p(x_i)$. Written another way, $e_{x_i}(p) = p(x_i)$. The functionals $e_{x_0}, \ldots, e_{x_n}$ are linearly independent, and since there are $n+1$ of them, they span $P_n^*$.

Let $x$ be a real number, and let $e_x$ be the evaluation at $x$ (i.e., $e_x(p) = p(x)$). Then, $e_x \in P_n^*$, so there exist real numbers $c_0(x), \ldots, c_n(x)$ such that $$e_x = c_0(x) e_{x_0} + \cdots + c_n(x) e_{x_n}.$$ Written another way, there exist real numbers $c_0(x), \ldots, c_n(x)$ that depend only on $x$ (and $x_0, \ldots, x_n$, of course) such that for all polynomials $p$ of degree at most $n$, we have $$p(x) = c_0(x) p(x_0) + \cdots + c_n(x) p(x_n).$$
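For instance, when $n = 1$, the coefficients can be guessed directly: linear interpolation through $x_0$ and $x_1$ gives $$c_0(x) = \frac{x_1 - x}{x_1 - x_0}, \qquad c_1(x) = \frac{x - x_0}{x_1 - x_0},$$ and you can check that $p(x) = c_0(x) p(x_0) + c_1(x) p(x_1)$ holds for $p(x) = 1$ and $p(x) = x$, hence (by linearity) for every polynomial of degree at most $1$.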

There is a way to determine these coefficients $c_0(x), \ldots, c_n(x)$ using linear algebra as well. Let $\left\lbrace p_0, \ldots, p_n \right\rbrace$ be a basis for $P_n$. Then, one has the matrix equation $$\begin{pmatrix} p_0(x) \\ p_1(x) \\ \vdots \\ p_n(x) \end{pmatrix} = \begin{pmatrix} p_0(x_0) & p_0(x_1) & \cdots & p_0(x_n) \\ p_1(x_0) & p_1(x_1) & \cdots & p_1(x_n) \\ \vdots & \vdots & & \vdots \\ p_n(x_0) & p_n(x_1) & \cdots & p_n(x_n) \end{pmatrix} \begin{pmatrix} c_0(x) \\ c_1(x) \\ \vdots \\ c_n(x) \end{pmatrix}.$$ Since $p_0, \ldots, p_n$ are linearly independent, the matrix is invertible (exercise!), so one can recover $c_0, \ldots, c_n$ by simply inverting this matrix. This shows that the $c_i(x)$ are all linear combinations of the $p_i(x)$, hence are themselves polynomials.
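Here is a minimal MATLAB sketch of this computation, using the monomial basis $p_i(t) = t^i$; the function name interp_coeffs and the variable names are my own choices, not anything fixed by the discussion above.

    function c = interp_coeffs(x_nodes, x)
        % Solve for c_0(x), ..., c_n(x) using the basis p_i(t) = t^i.
        % x_nodes is a column vector of the sample points x_0, ..., x_n.
        n   = length(x_nodes);
        A   = zeros(n);
        rhs = zeros(n, 1);
        for row = 1:n
            A(row, :) = (x_nodes(:)') .^ (row - 1);   % p_{row-1} at each node
            rhs(row)  = x ^ (row - 1);                % p_{row-1} at x
        end
        c = A \ rhs;    % solve the matrix equation above
    end

For example, interp_coeffs([0; 1; 2], 0.5) returns the column (0.375, 0.75, -0.125), i.e. the values $c_0(0.5)$, $c_1(0.5)$, and $c_2(0.5)$ for the nodes $0, 1, 2$.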

Remark 1.

This process will always produce the same result for $c_i(x)$, no matter what choice of basis $\left\lbrace p_0, \ldots, p_n \right\rbrace$ you pick. It turns out that the $c_i(x)$’s described above are exactly the Lagrange polynomials: $$c_i(x) = \prod_{\substack{k = 0 \\ k \neq i}}^{n} \frac{x - x_k}{x_i - x_k}.$$

A more typical interpretation is that the $c_i$’s themselves form a basis for $P_n$, and the coordinates of a polynomial with respect to this basis are exactly its values at $x_0, \ldots, x_n$.

Remark 2.

To perform the interpolation between the points $(x_0, y_0), \ldots, (x_n, y_n)$, one needs to first determine the $c_i(x)$’s as above, then compute the dot product $$\begin{pmatrix} c_0(x) & \cdots & c_n(x) \end{pmatrix} \begin{pmatrix} y_0 \\ \vdots \\ y_n \end{pmatrix}.$$ Neville’s method can be interpreted as expressing this dot product in matrix form by using $$\begin{pmatrix} c_0(x) \\ \vdots \\ c_n(x) \end{pmatrix} = \begin{pmatrix} p_0(x_0) & \cdots & p_0(x_n) \\ \vdots & & \vdots \\ p_n(x_0) & \cdots & p_n(x_n) \end{pmatrix}^{-1} \begin{pmatrix} p_0(x) \\ \vdots \\ p_n(x) \end{pmatrix}.$$ For a good choice of $p_0, \ldots, p_n$, one can rewrite this as a product of $n$ “sparse-ish” $k \times (k-1)$ matrices, where $k$ ranges over $2, \ldots, n+1$. The intermediate products then give information about the lower-order interpolating polynomials!
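If you would like to see this in code, here is a minimal sketch of Neville’s method in its usual recursive, tabular form (rather than the sparse matrix factorisation described above); the function name neville and the in-place array Q are my own choices.

    function val = neville(x, y, t)
        % Evaluate the interpolating polynomial through (x(i), y(i)) at the point t.
        % At stage j, Q(i) holds the value at t of the polynomial interpolating
        % the j consecutive points x(i-j+1), ..., x(i).
        n = length(x);
        Q = y(:);      % zeroth order: constant interpolants through each point
        for j = 2:n
            for i = n:-1:j
                Q(i) = ((t - x(i-j+1)) * Q(i) - (t - x(i)) * Q(i-1)) / (x(i) - x(i-j+1));
            end
        end
        val = Q(n);
    end

The intermediate entries of Q are exactly the lower-order interpolating polynomials evaluated at t, matching the remark above.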

Remark 3.

The method of divided differences can be interpreted as a change of basis: rather than using the Lagrange polynomials, one picks $c_i(x)$’s that are very easy to compute. From our dual perspective, this corresponds to using the $i$-th divided differences as a dual basis instead of the evaluation functionals. The (dual) change-of-basis matrix is triangular and represents the computations of the divided differences!

Derivatives and Integrals are Linear

This is a rather suspicious perspective to take on interpolating polynomials, especially if you’re not very familiar with linear algebra. However, the level of abstraction in the above discussion helps us generalise these techniques and ideas in various directions.

In applications like signal processing and image analysis, the discrete Fourier transform is a centrally important tool. Linear algebra enters the fray since the DFT is a linear transformation; though computing it directly is extremely burdensome, people have developed a class of fast Fourier transform algorithms that rely on sparse matrix factorisations (somewhat similar to Neville’s method, but even better).

For this class, the first of two applications of interest is numerical differentiation: given a function $f$ and points $(x_0, f(x_0)), \ldots, (x_n, f(x_n))$ close to $x_*$, can we estimate $f'(x_*)$?

A first step would be to apply Taylor’s theorem: there is a polynomial $p_f$ of degree at most $n$ such that $$f(x) = p_f(x - x_*) + O\left( \left\lvert x - x_* \right\rvert^{n+1} \right).$$ By definition of the Taylor polynomial, one has $p_f'(0) = f'(x_*)$.

We can now play the same game as before: let $P_n$ be the space of real polynomials of degree at most $n$, let $e_i$ be the evaluation at $x_i - x_*$, and let $D$ be differentiation at $0$. Each of these is an element of $P_n^*$, and one still has that $\left\lbrace e_0, \ldots, e_n \right\rbrace$ forms a basis for $P_n^*$. Thus, there exist constants $d_0, \ldots, d_n$ such that $D = d_0 e_0 + \cdots + d_n e_n$. Written more explicitly, one has for any polynomial $p \in P_n$ that $$p'(0) = d_0 p(x_0 - x_*) + \cdots + d_n p(x_n - x_*).$$ Applying this to $p_f$ from before and using the fact that $p_f(x_i - x_*) = f(x_i) + O\left( \left\lvert x_i - x_* \right\rvert^{n+1} \right)$ yields $$f'(x_*) = d_0 f(x_0) + \cdots + d_n f(x_n) + O\left( \sum_{i=0}^{n} \left\lvert x_i - x_* \right\rvert^{n+1} \right).$$ In fact, this technique isn’t limited only to the first derivative; one can apply this idea to any of the first $n$ derivatives!
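As a sanity check, take $n = 2$ with $x_0 = x_* - h$, $x_1 = x_*$, and $x_2 = x_* + h$. Solving for $d_0, d_1, d_2$ (using the monomial basis, say) gives $d_0 = -\frac{1}{2h}$, $d_1 = 0$, and $d_2 = \frac{1}{2h}$, so the recipe above becomes $$f'(x_*) \approx \frac{f(x_* + h) - f(x_* - h)}{2h},$$ the familiar central difference formula.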

In much the same vein, we can estimate $\int_a^b f(x)\, dx$ with the exact same technique. With the $e_{x_0}, \ldots, e_{x_n}$ as before, we have that $I : p \mapsto \int_a^b p(x)\, dx$ is a linear functional on $P_n$, hence there exist constants $i_0, \ldots, i_n$ such that $I = i_0 e_{x_0} + \cdots + i_n e_{x_n}$. That is to say, for any $p \in P_n$, we have $$\int_a^b p(x)\, dx = i_0 p(x_0) + \cdots + i_n p(x_n).$$ Taking $p$ to be the interpolating polynomial for the points $(x_0, f(x_0)), \ldots, (x_n, f(x_n))$, we get $$\int_a^b f(x)\, dx = i_0 f(x_0) + \cdots + i_n f(x_n) + O\left( \left\lvert b - a \right\rvert^{n+1} \right).$$ The error term comes from integrating the error formula for the interpolating polynomial.
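For example, with $n = 1$, $x_0 = a$, and $x_1 = b$, the two conditions $\int_a^b 1\, dx = i_0 + i_1$ and $\int_a^b x\, dx = i_0 a + i_1 b$ force $i_0 = i_1 = \frac{b-a}{2}$, and the recipe above becomes the trapezoid rule $$\int_a^b f(x)\, dx \approx \frac{b-a}{2} \left( f(a) + f(b) \right).$$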

Remark 4.

We’ve used two different types of polynomials to approximate our unknown function $f$ in the above discussion: when approximating a derivative of $f$, we used a Taylor polynomial, and when approximating the integral of $f$, we used an interpolating polynomial. This reflects how the derivative only uses local data about $f$, and the Taylor polynomial captures as much of the local data as possible; in contrast, integrals consider global data about $f$, and an interpolating polynomial more appropriately captures such data.

On Computational Efficiency and Sharpness of the Error Terms

While the idea of using linear algebra to approach these numerical estimations is expositionally extremely efficient, in practice it’s quite a bad idea to apply it directly without some extra work. Inverting a matrix to determine the coefficients $d_0, \ldots, d_n$ or $i_0, \ldots, i_n$ using something like Cramer’s Rule is unfathomably slow.

Most commonly, the data points $x_0, \ldots, x_n$ can be chosen freely, and they are most frequently chosen to be evenly spaced out. The coefficients can then be worked out explicitly ahead of time; in numerical integration, these are known as the Newton-Cotes formulae. In these cases, due to the symmetry of the data, one sometimes gets a better error term than anticipated (e.g., $O\left( \left\lvert b-a \right\rvert^5 \right)$ instead of $O\left( \left\lvert b-a \right\rvert^4 \right)$).
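Simpson’s rule is the standard example: taking $x_0 = a$, $x_1 = \frac{a+b}{2}$, and $x_2 = b$ and solving the $3 \times 3$ system gives $i_0 = i_2 = \frac{b-a}{6}$ and $i_1 = \frac{2(b-a)}{3}$, i.e. $$\int_a^b f(x)\, dx \approx \frac{b-a}{6} \left( f(a) + 4 f\left( \tfrac{a+b}{2} \right) + f(b) \right).$$ Because of the symmetric placement of the nodes, this rule turns out to be exact on cubics, not just quadratics, which is where the extra order of accuracy comes from.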

Even when these data points are not uniform, one can work out explicit formulae for what our coefficients should be. As one may expect, they are closely related to the Lagrange polynomials. However, they are generally quite difficult to compute or even describe; the work needed to do this (computationally and mentally) is likely comparable to the work needed to use the matrix equation.

Later in life (i.e., if and when you take 151B), you will learn about ways to efficiently invert matrices, either exactly or approximately, and this can save a lot of time when performing numerical differentiation or integration in bulk.

Exploiting MATLAB’s Amazing Linear Algebra

Fortunately, MATLAB has a rich set of features when it comes to linear algebra, and we can use this to our advantage. At the cost of some computation time, we can avoid all the pain of working out the coefficients by hand (such as on your homework).

Let’s start with numerical differentiation: we are given a list of data points $(x_0, f(x_0)), \ldots, (x_n, f(x_n))$, and we are asked to approximate the derivative $f'(x_*)$ at some point $x_*$. Let’s write the pseudocode for this function first. At the core of our algorithm is the matrix equation $$\begin{pmatrix} p_0'(0) \\ \vdots \\ p_n'(0) \end{pmatrix} = \begin{pmatrix} p_0(x_0 - x_*) & \cdots & p_0(x_n - x_*) \\ \vdots & & \vdots \\ p_n(x_0 - x_*) & \cdots & p_n(x_n - x_*) \end{pmatrix} \begin{pmatrix} d_0 \\ \vdots \\ d_n \end{pmatrix}.$$ Recall that we have the freedom to pick the polynomials $p_0, \ldots, p_n$; thus, we’ll select $p_i(x) = x^i$. The derivatives are very, very easy to compute.

  0. INPUTS: column vectors x = [x0, ..., xn] and y = [y0, ..., yn]; a real number x_star.
  1. Define an $(n+1) \times (n+1)$ matrix A according to the formula A(i, j) = (x(j) - x_star) ^ i.
  2. Compute [d0, ..., dn] = (A ^ -1) * [0, 1, 0, ..., 0].
  3. Compute and output the dot product d0 * y0 + ... + dn * yn.

That’s all there is to it! Here it is in MATLAB:
 1  function fp = num_diff(x, y, x_star)
 2      % setting up the matrix
 3      n = length(x);
 4      A = zeros(n);
 5      A(1, :) = 1;        % set the entire first row to ones
 6      for row = 2:n
 7          A(row, :) = (x' - x_star) .* A(row - 1, :);
 8      end % of for loop
 9
10      % compute the vector of coefficients, d
11      v = zeros(n, 1);
12      v(2, 1) = 1;
13      d = A \ v;
14
15      % then output the dot product d' * y
16      fp = d' * y;
17  end % of num_diff

Think about what line 7 is doing and why it works! I think writing code like this really helps one appreciate the wonders of MATLAB.
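As a quick sanity check (assuming the function above is saved as num_diff.m), you can try it on a function whose derivative you know:

    % five samples of sin(x) near 0; the true derivative there is cos(0) = 1
    x = [-0.2; -0.1; 0; 0.1; 0.2];
    y = sin(x);
    fp = num_diff(x, y, 0)    % prints a value very close to 1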

Try to write numerical integration by yourself: write a function that takes column vectors x = [x0, ..., xn] and y = [y0, ..., yn] alongside two real numbers a and b as inputs, then approximates $\int_a^b f(x)\, dx$, where $f$ is an unknown function passing through the provided data points.

Pseudocode

The pseudocode is almost identical to the pseudocode for numerical differentiation. The only difference is that instead of having a vector of derivatives, we’ll have a vector of integrals to work with. We’ll continue using $p_i(x) = x^i$; these should be easy to integrate as well. (Note that there is no $x_*$ this time: we evaluate the basis polynomials at the $x_j$’s themselves.)

  1. Define an $(n+1) \times (n+1)$ matrix A according to the formula A(i, j) = x(j) ^ i.
  2. Define the vector v by v(i) = (b^(i+1) - a^(i+1)) / (i+1). Compute [i0, ..., in] = (A ^ -1) * v.
  3. Compute and output the dot product i0 * y0 + ... + in * yn.
Solution
function integral = num_int(x, y, a, b)
    % setting up the matrix
    n = length(x);
    A = zeros(n);
    A(1, :) = 1;        % set the entire first row to ones
    for row = 2:n
        A(row, :) = x' .* A(row - 1, :);
    end % of for loop

    % compute the vector of coefficients, i
    v = (1:n)';
    v = ((b .^ v) - (a .^ v)) ./ v; % black magic
    i = A \ v;

    % then output the dot product i' * y
    integral = i' * y;
end % of num_int
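As before, a quick test (assuming this is saved as num_int.m): integrating $\sin(x)$ over $[0, \pi]$ from five samples should return something very close to the exact value $2$.

    x = linspace(0, pi, 5)';    % five evenly spaced sample points
    y = sin(x);
    num_int(x, y, 0, pi)        % prints roughly 1.9986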

Exercise 5.

Rewrite the function num_diff so that it takes in a callable function f, two real numbers a and b, and two integers N and n. Then, approximate the graph of the derivative of f at $N$ evenly spaced points across the interval $[a, b]$ by numerically differentiating $f$ with $n$ evenly spaced data points at each point.

Exercise 6.

Rewrite the function num_int so that it takes in a callable function f, two real numbers a and b, and two integers N and n. Then, numerically integrate $\int_a^b f(x)\, dx$ by splitting $[a, b]$ into $N$ intervals of equal length and using an $n$-point approximation to the integral of $f$ over each sub-interval.