5. Numerical Integration and Differentiation
Last week, we talked about the different ways of computing Lagrange’s interpolating polynomials, and during lecture, you should have discussed the extension of these techniques to Hermite’s interpolating polynomials. Both of these answer the same question: given an unknown function $f$ on an interval $[a, b]$, and given the value of $f$ at several points in this interval, can you “guess” the values of $f$ elsewhere on the interval?
One way of approaching this is by invoking the fundamental theorem of algebra: given $n + 1$ points $x_0, \dots, x_n$ and oracular knowledge of $f$, there is exactly one polynomial of degree at most $n$ that passes through the point $(x_i, f(x_i))$ for each $i$. However, all we know is that such a polynomial exists, and the process of determining this polynomial (though not particularly difficult) was somewhat ad hoc. This is even more evident in the mysterious and difficult-to-memorise formula for Hermite interpolation.
Let’s reinterpret the process of polynomial interpolation through the lens of linear algebra. The key idea is that although the space of differentiable (or even smooth) functions is an infinite-dimensional vector space, we are only exposed to a “finite-dimensional set of information” about our function. We can then exploit the constraints of dimensionality to infer more information about our function based on what we’re given.
Let $P_n$ be the set of polynomials of degree at most $n$. This is a finite-dimensional real vector space, so the dual space $P_n^*$ (the set of linear functions $P_n \to \mathbb{R}$) is isomorphic to $P_n$ and hence has the same dimension, namely $n + 1$.
There are lots of linear functionals in $P_n^*$, but there is a particular set of functionals we can exploit. We define $e_i \in P_n^*$ to be the evaluation at $x_i$: it sends the polynomial $p$ to the real number $p(x_i)$. Written another way, $e_i(p) = p(x_i)$. The functionals $e_0, \dots, e_n$ are linearly independent and thus span $P_n^*$.
Let $x^*$ be a real number, and let $e_{x^*}$ be the evaluation at $x^*$ (i.e., $e_{x^*}(p) = p(x^*)$). Then $e_{x^*} \in P_n^*$, so there exist real numbers $c_0, \dots, c_n$ such that
$$e_{x^*} = c_0 e_0 + c_1 e_1 + \cdots + c_n e_n.$$
Written another way, there exist real numbers $c_0, \dots, c_n$ that depend only on $x^*$ (and $x_0, \dots, x_n$, of course) such that for all polynomials $p$ of degree at most $n$, we have
$$p(x^*) = c_0 p(x_0) + c_1 p(x_1) + \cdots + c_n p(x_n).$$
There is a way to determine these coefficients using linear algebra as well. Let $p_0, \dots, p_n$ be a basis for $P_n$. Then, one has the matrix equation
$$\begin{pmatrix} p_0(x_0) & p_0(x_1) & \cdots & p_0(x_n) \\ p_1(x_0) & p_1(x_1) & \cdots & p_1(x_n) \\ \vdots & \vdots & \ddots & \vdots \\ p_n(x_0) & p_n(x_1) & \cdots & p_n(x_n) \end{pmatrix} \begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{pmatrix} = \begin{pmatrix} p_0(x^*) \\ p_1(x^*) \\ \vdots \\ p_n(x^*) \end{pmatrix}.$$
Since the functionals $e_0, \dots, e_n$ are linearly independent, the matrix is invertible (exercise!), so one can recover $c_0, \dots, c_n$ by simply inverting this matrix. This shows that the $c_i$’s, viewed as functions of $x^*$, are linear combinations of $p_0(x^*), \dots, p_n(x^*)$, hence are themselves polynomials in $x^*$.
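To make this concrete, here is a small MATLAB sketch (the nodes, the evaluation point, and the test polynomial are made-up example data, and the variable names are my own) that builds this matrix for the monomial basis $p_i(x) = x^i$ and solves for the $c_i$’s:

xs = [0; 1; 3];      % nodes x_0, ..., x_n (column vector)
x_star = 2;          % point at which we want to evaluate
n = length(xs);
A = zeros(n);        % A(i, :) holds the values of the basis polynomial x^(i-1) at the nodes
rhs = zeros(n, 1);   % rhs(i) = value of x^(i-1) at x_star
for i = 1:n
    A(i, :) = (xs') .^ (i - 1);
    rhs(i) = x_star ^ (i - 1);
end
c = A \ rhs;         % weights: p(x_star) = c(1)*p(xs(1)) + ... + c(n)*p(xs(n))

% sanity check against a test polynomial of degree at most n
p = @(t) 4*t.^2 - t + 5;
disp([p(x_star), c' * p(xs)])   % the two numbers should agree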
Remark 1.
This process will always produce the same result for the $c_i$’s, no matter what choice of basis you pick. It turns out that the $c_i$’s described above are exactly the Lagrange polynomials:
$$c_i(x^*) = L_i(x^*) = \prod_{j \neq i} \frac{x^* - x_j}{x_i - x_j}.$$
A more typical interpretation is that the $L_i$’s themselves form a basis for $P_n$, and coordinate representations with respect to this basis contain information about the underlying polynomial: the coordinates of $p$ in the Lagrange basis are exactly its values $p(x_0), \dots, p(x_n)$.
Remark 2.
To perform the interpolation between the points $(x_0, y_0), \dots, (x_n, y_n)$, one needs to first determine the $c_i$’s as above, then compute the dot product
$$c_0 y_0 + c_1 y_1 + \cdots + c_n y_n.$$
Neville’s method can be interpreted as expressing this dot product in matrix form. For a good choice of basis, one can rewrite the matrix as a product of “sparse-ish” matrices indexed by $k$, where $k$ ranges over $1, \dots, n$. The intermediate products then give information about the lower-order interpolating polynomials!
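To make this a little more tangible, here is a hedged sketch of Neville’s method in MATLAB (the function and variable names are my own); the entry Q(i, j) is the value at x_star of the interpolating polynomial through the nodes xs(i-j+1), ..., xs(i), so the lower-order interpolants really do show up as intermediate results:

function p_star = neville(xs, ys, x_star)
    % Neville's tableau: Q(i, j) = value at x_star of the degree-(j-1)
    % interpolant through the nodes xs(i-j+1), ..., xs(i).
    n = length(xs);
    Q = zeros(n);
    Q(:, 1) = ys;   % degree-0 interpolants are just the data values
    for j = 2:n
        for i = j:n
            Q(i, j) = ((x_star - xs(i-j+1)) * Q(i, j-1) ...
                      - (x_star - xs(i)) * Q(i-1, j-1)) / (xs(i) - xs(i-j+1));
        end
    end
    p_star = Q(n, n);   % value of the full interpolating polynomial at x_star
end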
Remark 3.
The method of divided differences can be interpreted as a change of basis: rather than using the Lagrange polynomials, one picks basis polynomials $p_i$ that are very easy to compute. From our dual perspective, this corresponds to using the $i$-th divided differences as a dual basis instead of the evaluation functionals. The (dual) change-of-basis matrix is triangular and represents the computations of the divided differences!
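Here is a short MATLAB sketch of that triangular computation (again, the function and variable names are my own); column j of the table holds the (j-1)-th divided differences, and the diagonal gives the coefficients of the Newton form of the interpolating polynomial:

function coef = divided_differences(xs, ys)
    % F(i, j) = (j-1)-th divided difference f[xs(i-j+1), ..., xs(i)]
    n = length(xs);
    F = zeros(n);
    F(:, 1) = ys;   % zeroth divided differences are the data values
    for j = 2:n
        for i = j:n
            F(i, j) = (F(i, j-1) - F(i-1, j-1)) / (xs(i) - xs(i-j+1));
        end
    end
    coef = diag(F);   % Newton-form coefficients, read off the diagonal
end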
Derivatives and Integrals are Linear
This is a rather suspicious perspective to take on interpolating polynomials, especially if you’re not very familiar with linear algebra. However, the level of abstraction in the above discussion helps us generalise these techniques and ideas in various directions.
In applications like signal processing and image analysis, the discrete Fourier transform is a centrally important tool. Linear algebra enters the fray since the DFT is a linear transformation; though computing it is extremely burdensome, people have developed a class of fast Fourier transform algorithms that rely on sparse matrix factorisations (somewhat similar to Neville’s method, but even better).
For this class, one of two applications of interest is the idea of numerical differentiation: given a function $f$ and points $x_0, \dots, x_n$ close to $x^*$, can we estimate $f'(x^*)$?
A first step would be to apply Taylor’s theorem: there is a polynomial $T$ of degree at most $n$ such that
$$f(x) = T(x) + \frac{f^{(n+1)}(\xi(x))}{(n+1)!}\,(x - x^*)^{n+1}.$$
By definition of the Taylor polynomial, one has $T'(x^*) = f'(x^*)$.
We can now play the same game as before: let $P_n$ be the space of real polynomials of degree at most $n$, let $e_i$ be the evaluation at $x_i$, and let $D$ be differentiation at $x^*$ (i.e., $D(p) = p'(x^*)$). Each of these are elements of $P_n^*$, and one still has that $e_0, \dots, e_n$ form a basis for $P_n^*$. Thus, there exist constants $d_0, \dots, d_n$ such that $D = d_0 e_0 + \cdots + d_n e_n$. Written more explicitly, one has for any polynomial $p$ of degree at most $n$ that
$$p'(x^*) = d_0 p(x_0) + d_1 p(x_1) + \cdots + d_n p(x_n).$$
Applying this to the Taylor polynomial $T$ from before and using the fact that $T'(x^*) = f'(x^*)$ yields
$$f'(x^*) = T'(x^*) = \sum_{i=0}^{n} d_i T(x_i) \approx \sum_{i=0}^{n} d_i f(x_i).$$
In fact, this technique isn’t limited only to the first derivative; one can apply this idea to any of the first $n$ derivatives!
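As a quick worked example, take $n = 2$ with nodes $x_0 = x^* - h$, $x_1 = x^*$, and $x_2 = x^* + h$. Solving the resulting $3 \times 3$ system gives $d_0 = -\tfrac{1}{2h}$, $d_1 = 0$, and $d_2 = \tfrac{1}{2h}$, so the recipe above reproduces the familiar centred difference formula
$$f'(x^*) \approx \frac{f(x^* + h) - f(x^* - h)}{2h}.$$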
In much the same vein, we can estimate $\int_a^b f(x)\,dx$ with the exact same technique. With the $e_i$ as before, we have that $p \mapsto \int_a^b p(x)\,dx$ is a linear functional on $P_n$, hence there exist constants $c_0, \dots, c_n$ such that it equals $c_0 e_0 + \cdots + c_n e_n$. That is to say, for any $p \in P_n$, we have
$$\int_a^b p(x)\,dx = c_0 p(x_0) + c_1 p(x_1) + \cdots + c_n p(x_n).$$
Taking $p$ to be the interpolating polynomial for the points $(x_0, f(x_0)), \dots, (x_n, f(x_n))$, we get
$$\int_a^b f(x)\,dx \approx c_0 f(x_0) + c_1 f(x_1) + \cdots + c_n f(x_n).$$
The error term comes from integrating the error form for the interpolating polynomial.
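As a worked example, take $n = 1$ with $x_0 = a$ and $x_1 = b$. Requiring the formula to be exact for $p(x) = 1$ and $p(x) = x$ forces $c_0 = c_1 = \tfrac{b - a}{2}$, which is exactly the trapezoidal rule:
$$\int_a^b f(x)\,dx \approx \frac{b - a}{2}\bigl(f(a) + f(b)\bigr).$$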
Remark 4.
We’ve used two different types of polynomials to approximate our unknown function $f$ in the above discussion: when approximating a derivative of $f$, we used a Taylor polynomial, and when approximating the integral of $f$, we used an interpolating polynomial. This reflects how the derivative only uses local data about $f$, and the Taylor polynomial captures as much of the local data as possible; in contrast, integrals consider global data about $f$, and an interpolating polynomial more appropriately captures such data.
On Computational Efficiency and Sharpness of the Error Terms
While the idea of using linear algebra to approach these numerical estimations is expositionally extremely efficient, in practice it’s quite a bad idea to apply it directly without some extra work. Inverting a matrix to determine the coefficients or using something like Cramer’s rule is unfathomably slow.
In many applications, the data points can be chosen freely; most frequently, they are chosen to be evenly spaced. The coefficients can then be worked out explicitly ahead of time. In numerical integration, these are known as the Newton-Cotes formulae. In these cases, due to the symmetry of the data, one sometimes gets a better error term than anticipated (e.g., an extra power of $h$ beyond what the interpolation error alone would suggest).
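Simpson’s rule is the classic example: with the three evenly spaced nodes $x_0 = a$, $x_1 = \tfrac{a + b}{2}$, $x_2 = b$ and spacing $h = \tfrac{b - a}{2}$, the Newton-Cotes weights work out to
$$\int_a^b f(x)\,dx \approx \frac{h}{3}\bigl(f(x_0) + 4 f(x_1) + f(x_2)\bigr),$$
and by symmetry this formula is exact for all cubics, one degree better than three interpolation points would suggest.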
Even when these data points are not uniform, one can work out explicit formulae for what our coefficients should be. As one may expect, they are closely related to the Lagrange polynomials. However, they are generally quite difficult to compute or even describe; the work needed to do this (computationally and mentally) is likely comparable to the work needed to use the matrix equation.
Later in life (i.e., if and when you take 151B), you will learn about ways to efficiently invert matrices, either exactly or approximately, and this can save a lot of time when performing numerical differentiation or integration in bulk.
Exploiting MATLAB’s Amazing Linear Algebra
Fortunately, MATLAB has a rich set of features when it comes to linear algebra, and we can use this to our advantage. At the cost of some computation time, we can avoid all the pain of working out the coefficients by hand (such as on your homework).
Let’s start with numerical differentiation: we are given a list of data points $(x_0, y_0), \dots, (x_n, y_n)$, and we are asked to approximate the derivative at some point $x^*$. Let’s write the pseudocode for this function first. At the core of our algorithm is the matrix equation
$$\begin{pmatrix} p_0(x_0) & p_0(x_1) & \cdots & p_0(x_n) \\ p_1(x_0) & p_1(x_1) & \cdots & p_1(x_n) \\ \vdots & \vdots & \ddots & \vdots \\ p_n(x_0) & p_n(x_1) & \cdots & p_n(x_n) \end{pmatrix} \begin{pmatrix} d_0 \\ d_1 \\ \vdots \\ d_n \end{pmatrix} = \begin{pmatrix} p_0'(x^*) \\ p_1'(x^*) \\ \vdots \\ p_n'(x^*) \end{pmatrix}.$$
Recall that we have the freedom to pick the basis polynomials $p_i$; thus, we’ll select $p_i(x) = (x - x^*)^i$. The derivatives $p_i'(x^*)$ are very, very easy to compute: $p_1'(x^*) = 1$, and $p_i'(x^*) = 0$ for every other $i$.
0. INPUTS: column vectors x = [x0, ..., xn] and y = [y0, ..., yn]; a real number x_star.
- Define an (n+1) × (n+1) matrix A according to the formula A(i, j) = (x(j) - x_star) ^ i.
- Compute [d0, ..., dn] = (A ^ -1) * [0, 1, 0, ..., 0].
- Compute and output the dot product d0 * y0 + ... + dn * yn.

That’s all there is to it! Here it is in MATLAB:
1   function fp = num_diff(x, y, x_star)
2       % setting up the matrix
3       n = length(x);
4       A = zeros(n);
5       A(1, :) = 1; % set the entire first row to ones
6       for row = 2:n
7           A(row, :) = (x' - x_star) .* A(row - 1, :);
8       end % of for loop
9
10      % compute the vector of coefficients, d
11      v = zeros(n, 1);
12      v(2, 1) = 1;
13      d = A \ v;
14
15      % then output the dot product d * y
16      fp = d' * y;
17  end % of num_diff
Think about what line 7 is doing and why it works! I think writing code like this really helps one appreciate the wonders of MATLAB.
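If you want to sanity-check it, something like the following should work (the choice of function and nodes here is just an illustration):

x = [0.4; 0.5; 0.6];
y = sin(x);
fp = num_diff(x, y, 0.5)   % should be close to cos(0.5) = 0.8776...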
Try to write numerical integration by yourself: write a function that takes column vectors x = [x0, ..., xn] and y = [y0, ..., yn] alongside two real numbers a and b as inputs, then computes $\int_a^b f(x)\,dx$, where $f$ is an unknown function passing through the provided data points.
Pseudocode
The pseudocode is almost identical to the pseudocode for numerical differentiation. The only difference is that instead of having a vector of derivatives, we’ll have a vector of integrals to work with. We’ll use the monomials $p_i(x) = x^i$ this time; these should be easy to integrate as well.
- Define an (n+1) × (n+1) matrix A according to the formula A(i, j) = x(j) ^ i.
- Define the vector v by v(i) = (b^(i+1) - a^(i+1)) / (i+1). Compute [c0, ..., cn] = (A ^ -1) * v.
- Compute and output the dot product c0 * y0 + ... + cn * yn.
Solution
function integral = num_int(x, y, a, b)
    % setting up the matrix: row k holds the values of x .^ (k - 1)
    n = length(x);
    A = zeros(n);
    A(1, :) = 1; % set the entire first row to ones
    for row = 2:n
        A(row, :) = x' .* A(row - 1, :);
    end % of for loop

    % compute the vector of coefficients, c
    v = (1:n)';
    v = ((b .^ v) - (a .^ v)) ./ v; % v(k) = integral of x^(k-1) from a to b
    c = A \ v;

    % then output the dot product c * y
    integral = c' * y;
end % of num_int
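As with num_diff, here is a quick sanity check (again, the function and nodes are just example data):

x = linspace(0, pi, 5)';     % five evenly spaced nodes in [0, pi]
y = sin(x);
val = num_int(x, y, 0, pi)   % the exact value is 2; this should be close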
Exercise 5.
Rewrite the function num_diff so that it takes in a callable function f, two real numbers a and b, and two integers N and n. Then, approximate the graph of the derivative of f at N evenly spaced points across the interval $[a, b]$ by numerically differentiating with $n$ evenly spaced data points at each point.
Exercise 6.
Rewrite the function num_int so that it takes in a callable function f, two real numbers a and b, and two integers N and n. Then, numerically integrate $f$ by splitting $[a, b]$ into $N$ intervals of equal length and using an $n$-point approximation to the integral of $f$ over each sub-interval.