LAB #6: Polynomial Interpolation


TABLE OF CONTENTS

Introduction
MATLAB Tips
The Polynomial through Given Data
Vandermonde's Equation
Lagrange Polynomials
Computing Divided Differences
Evaluating Divided Differences
Exercises 1 through 12




Introduction

This lab will consume three sessions. It covers material from Sections 3.1 and 3.2 of Atkinson.

This lab is concerned with interpolating data with polynomials. I would like to remark that there is a difference between interpolating and approximating. Interpolating means that the function passes through all the given points and does its best between them. Approximating means only that the function is close to all the given points. Although it might seem that interpolating is always better than approximating, this is not true.

Polynomials are commonly used to interpolate data. There are several methods for defining, constructing, and evaluating the polynomial of a given degree that passes through data values. We can:

  1. solve a linear system (the Vandermonde equations) for the coefficients of the polynomial;

  2. write the polynomial as a combination of Lagrange polynomials; or

  3. construct and evaluate a divided difference table.

Each of these approaches has its advantages, although we will find the divided difference form to be the best for computations.

It can be very helpful to use a consistent notation. We will try to use the names xdata and ydata to refer to known data values, while xval will be an arbitrary point at which we want to evaluate the interpolating polynomial, and pval will be the value of the polynomial at that point.

Matlab Tips

At some point in this lab, you will need to determine if a quantity x is not equal to a quantity y. The way to do this is

if ( x ~= y )

If you have a (square) linear system of equations, of the form

A * x = b

where A is an N by N matrix, and b and x are column vectors, then Matlab can solve the linear system either by using the expression:

     x = inv(A)*b

or, better, via the "backslash" command:

x = A \ b

The backslash command is strange looking. You can remember it by noting that the matrix A appears "underneath" the backslash. The difference between the two commands is merely that the backslash command is much more efficient. You will not notice a difference in this lab, but you might see one later if you use Matlab for more complicated projects.

Exercise 1: Test the backslash command by setting up and solving the linear system

        [ 1  2 ]         [ 17 ]
        [ 3  4 ]  * x =  [ 39 ]
      

Be careful to make your right hand side vector a column vector, or else use the apostrophe symbol to transpose the row vector! Remember that a row vector is defined using commas to separate values while a column vector uses semicolons. x=[1,2,3] is a row vector and y=[1;2;3] is a column vector. You can construct a matrix as a column vector of row vectors (or a row vector of column vectors). You can also construct the matrix one element at a time.
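For instance, one possible way to set this up (a sketch only; you can construct the matrix and right hand side however you prefer) is:

      A = [ 1, 2; 3, 4 ];     % the matrix, as a column vector of row vectors
      b = [ 17; 39 ];         % the right hand side, as a column vector
      x = A \ b               % solve the linear system with the backslash command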

We will want to be able to store certain tables of data in a file. We can have Matlab read and write these tables using the save and load commands. For now, we are going to want to save two vectors of the same length; Matlab will store both of them, by name, in a single file. Supposing we have two vectors called xdata and ydata, here's how we could store them in a file:

save parabola xdata ydata

Now suppose we've cleared the data, or exited Matlab, and want to retrieve the values of the data. All we need is the command

load parabola

and Matlab retrieves the names and values of the data it had stored in the file parabola.mat.

Exercise 2: create two vectors, called xdata and ydata, with the values:

        XDATA :  0  1  2
        YDATA :  3  6 11.2
      

Save them to the parabola file. To be sure that you really can save data this way, EXIT MATLAB AND START AGAIN. Now try to read the data back in, and make sure that it's correct.
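One possible session (a sketch of the round trip, using clear in place of actually exiting Matlab) is:

      xdata = [ 0, 1, 2 ];
      ydata = [ 3, 6, 11.2 ];
      save parabola xdata ydata
      clear                     % wipe out the workspace (or exit Matlab and restart)
      load parabola             % xdata and ydata come back from parabola.mat
      xdata
      ydata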

We will want to make plots in which the data values show up with a marker of some sort. One way to do this is to specify 'o', which indicates that the points are to be marked with circles.

plot ( xdata, ydata, 'o' )

Exercise 3: load the data from the parabola file. On one graph, plot the pairs of points from the data file, and the curve:

y = x^2 + 2x + 3

Does the curve pass through, or at least come near, the data points?
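One way to make such a plot (a sketch; you may prefer different plotting options) is:

      load parabola
      xval = -2 : 0.05 : 3;              % a fine grid of points for the curve
      yval = xval.^2 + 2*xval + 3;       % evaluate y = x^2 + 2x + 3 elementwise
      plot ( xdata, ydata, 'o', xval, yval, '-' )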

Matlab has a command

c = polyfit ( xdata, ydata, n )

which accepts a set of data, and returns the coefficients of the polynomial of degree n which passes near that data. This polynomial approximates, but does not necessarily interpolate, the data. In the Matlab command, any value of n may be used. We will be writing our own version of this routine today, but for us the value of n will be determined by the number of data points and the polynomial will interpolate the data.

Exercise 4: Read the data from the parabola file. Use the polyfit command to determine a set of polynomial coefficients that match the data. Now plot the data points as circles and the polynomial determined by polyfit. Does the formula match the data?
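For example (a sketch; with three data points the natural choice here is n = 2):

      load parabola
      c = polyfit ( xdata, ydata, 2 )      % polynomial coefficients, highest power first
      xval = -2 : 0.05 : 3;
      pval = polyval ( c, xval );          % evaluate the polynomial on a fine grid
      plot ( xdata, ydata, 'o', xval, pval, '-' )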

The Polynomial through Given Data

Maybe you didn't notice it, but in the secant method and the regula falsi method, we were solving the problem

Determine the polynomial p(x) of degree 1 that passes through the points (a,fa) and (b,fb).

followed by the problem:

Determine the point x0 at which a polynomial p(x) of degree 1 is equal to zero.

For Newton's method, we determined a (linear) polynomial that, at the point x=a had the value fa and the derivative fp. For Muller's method, we looked for a quadratic polynomial that passed through three data points.

The idea that is common here is that we are trying to construct a "model function", in this case a polynomial, which would "explain" all the data that we have. We may then want to examine the graph of the polynomial, evaluate it at other points, determine its integral or derivative, or do other things with it.

Vandermonde's Equation

Here's one way to see how to organize the computation of the polynomial that passes through a set of data. It's not the best way, but it will give us a good place to start.

Suppose we wanted to determine the linear polynomial

p(x) = c(1) * x + c(2)

that passes through the values (xdata1,ydata1) and (xdata2,ydata2). We simply have to solve the set of linear equations for c(1) and c(2):

        c(1) * xdata1 + c(2) = ydata1
        c(1) * xdata2 + c(2) = ydata2

which (usually) has the solution

        c(1) = ( ydata2 - ydata1 ) / ( xdata2 - xdata1 )
        c(2) = ( xdata2 * ydata1 - xdata1 * ydata2 ) / ( xdata2 - xdata1 )

Compare that situation with the case where we want to determine the quadratic polynomial

p(x) = c(1) * x^2 + c(2) * x + c(3)

that passes through three sets of data values. Then we "simply" have to solve the following set of linear equations for the polynomial coefficients c:

        c(1) * xdata1^2 + c(2) * xdata1 + c(3) = ydata1
        c(1) * xdata2^2 + c(2) * xdata2 + c(3) = ydata2
        c(1) * xdata3^2 + c(2) * xdata3 + c(3) = ydata3

This is an example of a third order Vandermonde Equation. It is characterized by the fact that for each row (sometimes column) of the coefficient matrix, the successive entries are generated by a decreasing (sometimes increasing) set of powers of a set of variables.

You should be able to see that, for any collection of abscissas and data points, it is possible to define a linear system that should be satisfied by the (unknown) polynomial coefficients. If we can solve the system, and solve it accurately, then that is one way to determine the interpolating polynomial.

Exercise 5: Vandermonde's coefficient matrix is nonsingular if and only if all the abscissas xdata are distinct. Why would we never want to have two abscissas the same? What are we saying if two abscissas are very close, and the corresponding ydata values are far apart?

If we look at this problem as an abstract linear system, we need to define the individual entries of the matrix A:

      for i = 1:3
        for j = 1:3
          A(i,j) = xdata(i)^(3-j)
        end
      end 

Then we have to set the right hand side to the ordinates ydata. If we can get all of that set up, then actually solving the linear system is easy. We just write:

c = A \ ydata'

which you can think of as saying "set c to the value of the inverse of A times the transpose of ydata." Note that we use a "backward" slash and an apostrophe in this equation!
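Putting the pieces together for our three-point parabola data, a small complete example might look like this (a sketch only):

      load parabola                        % xdata = [0 1 2], ydata = [3 6 11.2]
      A = zeros ( 3, 3 );                  % start with a 3 by 3 matrix of zeros
      for i = 1 : 3
        for j = 1 : 3
          A(i,j) = xdata(i)^(3-j);
        end
      end
      c = A \ ydata'                       % coefficients, highest power first

For this data the coefficients come out to be c = (1.1, 1.9, 3), that is, p(x) = 1.1 x^2 + 1.9 x + 3, which does pass through all three data points.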

Remark: In actual practice, you would never use a Vandermonde matrix to find interpolating polynomials. For more than a few points, the Vandermonde matrix can be so poorly conditioned that its solution cannot be found!

Exercise 6: write a Matlab M function file, fitpoly.m, of the form

          function c = fitpoly ( xdata, ydata )
          n = ?
          A = ?
          c = ?

which accepts vectors xdata and ydata of any length, and returns the coefficients of the polynomial that passes through that data. Test your function on the following data:

          XDATA:  0   1   2
          YDATA:  3   6  11.2

In addition, the value p(3) should be 18.6. You might like to plot your data and polynomial over the interval [-2, 3].

Lagrange Polynomials

Suppose we fix the set of n distinct abscissas xdata, and think about the problem of constructing a polynomial which has (not-yet-specified) values ydata at these points. Now suppose I had a polynomial l7(x) which was zero at every abscissa value except xdata7, where it was 1. Then ydata7 * l7(x) would have the value ydata7 at xdata7, and if we could just find a similar polynomial for each of the other abscissas, I'd be done.

In fact, the Lagrange polynomials are easily constructed for any set of abscissas. Each Lagrange polynomial will be of order n (which is degree n-1). There will be n Lagrange polynomials, one per abscissa, and the i-th polynomial will be named li(x), and will have the "special relationship" with the abscissa xdatai, namely, it will be 1 there, and 0 at the other abscissas.

Remark: The trick of finding a function that equals 1 at a distinguished point and zero at all other points in a set is very powerful. One of the places you will meet it again is when you study the construction of finite elements for solving partial differential equations.

In terms of Lagrange polynomials, then, the interpolating polynomial has the form:

          p(x) = ydata1 * l1(x)
               + ydata2 * l2(x)
               ...
               + ydatan * ln(x)

Assuming we can determine these magical polynomials, this is a second way to define the interpolating polynomial to a set of data.

Exercise 7: Let's consider the abscissas of our data set:

            I:  1   2   3
        XDATA:  0   1   2
        YDATA:  3   6  11.2

You can construct the Lagrange polynomial l2(x) in a two-step process:

  1. Construct the non-trivial quadratic polynomial that is zero at the xdata values of 0 and 2; and,

  2. Divide that polynomial by its value at 1, which cannot be 0 (Why?).

Compute the values of l2 at the three values of xdata to check them.

Here's the formula for li(x):

li(x) = f1(x) * ... * fi-1(x) * fi+1(x) * ... * fn(x)

(skipping the i-th factor), where each factor has the form

fj(x) = ( x - xdataj ) / ( xdatai - xdataj)
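As a concrete check of this formula, here is l2(x) for the abscissas 0, 1, 2 of our data set, evaluated (elementwise) at those same abscissas; the result should be 0, 1, 0:

      x  = [ 0, 1, 2 ];                                         % the abscissas
      l2 = ( ( x - 0 ) / ( 1 - 0 ) ) .* ( ( x - 2 ) / ( 1 - 2 ) )   % factors j = 1 and j = 3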

Exercise 8: write a Matlab M function file lag_poly.m of the form

function pval = lag_poly ( xdata, i, xval )

which takes the data values xdata and evaluates the i-th Lagrange polynomial at xval. Hint: Try something like:

          pval = 1;
          for j = 1 : n
            if j ~= i
              pval = pval .* SOMETHING
            end
          end 

Note that an elementwise multiplication is being performed. If you did this correctly, then you should find the following values of l3(x) for our data set:

            I:  1   2   3
        XDATA:  0   1   2
        YDATA:  3   6  11.2
        L3(X):  0   0   1

(We're not matching the ydata values yet, just getting ready to do so).

If you did the exercise correctly, and you used the proper elementwise operations in your function, then you should be able to issue the following nifty commands:

lag_poly ( xdata, 1, xdata )
lag_poly ( xdata, 2, xdata )
lag_poly ( xdata, 3, xdata )

What would you expect these commands to print out?

Exercise 9: now write a MATLAB M function file lagval.m of the form

function pval = lagval ( xdata, ydata, xval )

which takes the data values xdata and ydata, and evaluates the interpolating polynomial at xval. This step should be easy! Test your code on the same old data:

            I:  1   2   3
        XDATA:  0   1   2
        YDATA:  3   6  11.2
      

Again, if you are using elementwise operations, you should be able to issue the following command:

lagval ( xdata, ydata, xdata )

and check everything in one stroke!

Computing Divided Differences

If you've ever tried to compute the next number in a sequence by looking at the pattern of differences between successive entries, you've started down the path of divided differences. Newton used this method back when there weren't good tables of function values, so that he had to do a lot of interpolation himself. The divided difference table is a third way to define the interpolating polynomial for a set of data.

A sample divided difference table for our parabola data can be written in the following form:

    x             f(x)           Df(x)      D2f(x)
    0              3
                                 3
    1              6                        1.1
                                 5.2
    2             11.2
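
For example, here is the arithmetic that produces the entries above (a hand check, not code you need to keep):

      d01  = (  6.0 - 3.0 ) / ( 1 - 0 )     % first divided difference  f[x0,x1]    = 3
      d12  = ( 11.2 - 6.0 ) / ( 2 - 1 )     % first divided difference  f[x1,x2]    = 5.2
      d012 = (  d12 - d01 ) / ( 2 - 0 )     % second divided difference f[x0,x1,x2] = 1.1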
 


Here is the algorithm for computing a divided difference table, as taken from Atkinson (p. 141). Since you need to begin to learn to think these things through on your own, I will present the algorithm as the book does, changing only the notation and making little attempt to suggest how to write it in Matlab, except for the function statement:

function d = divdif ( xdata, ydata )

Note that this algorithm does not compute the columns of the above table and place them in d. It computes selected entries from the table and places them in d. The ones it computes are just the ones needed in the algorithm to evaluate the interpolating polynomial that is described in the next section.

The algorithm is described as:

  1. Remark: xdata and ydata are vectors with entries xdatai and f(xdatai) respectively, for i from 0 to n. On output, d will contain the divided difference table f[xdata0,...,xdatan].

  2. Copy the entries of ydata into d.

  3. Do through step 5 for i from 1 to n.

  4. Do through step 5 for j from n down to i.

  5.             dj := ( dj - dj-1 ) / ( xdataj - xdataj-i )
              

    (Note the final subscript is j - i, not j - 1!)

  6. Exit from the algorithm.

One of the biggest problems with this description is that it assumes the data is indexed from 0 to n, whereas in MATLAB our indexing starts at 1. Think carefully about what the book means by n and how you must implement this algorithm! If necessary, go through the motions with our set of three data points, and see if you can understand what i and j are counting, and how you must adjust the loop limits.
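To illustrate the index shift, here is one possible translation of the algorithm into MATLAB's 1-based indexing, sketched under the assumption that there are n+1 data points (so the MATLAB indices run from 1 to n+1):

      function d = divdif ( xdata, ydata )
      % DIVDIF: a sketch of the divided difference computation described above.
      n = length ( xdata ) - 1;              % the book's n: data indexed 0 through n
      d = ydata;                             % step 2: copy ydata into d
      for i = 1 : n                          % step 3
        for j = n + 1 : -1 : i + 1           % step 4: the book's j from n down to i, shifted by 1
          d(j) = ( d(j) - d(j-1) ) / ( xdata(j) - xdata(j-i) );   % step 5
        end
      end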

Exercise 10: write a MATLAB M function called divdif.m as described above. Test your function on our simple data; the resulting d vector should be as follows:

          xdata:  0   1   2
          ydata:  3   6  11.2
          d:      3   3  1.1

It might be helpful to use the following example from Atkinson as a second test. (The function used to generate the data is the square root function.) The xdata1 and ydata1 values are in Table 3.2 on page 141.


          xdata1:  2.0         2.1         2.2         2.3         2.4
          ydata1:  1.4142136   1.4491377   1.4832397   1.5165750   1.5491933
          d1:      1.4142136   0.3492411  -0.0411045   0.0092430  -0.0024868


Evaluating Divided Differences

Once we've gone to all the work of setting up our divided differences, we're only halfway home! Now we have to figure out how to evaluate the polynomial represented by the divided difference table at an arbitrary point xval. The book's algorithm is as follows:

function pval = divval ( xval, xdata, d )

The algorithm is described as:

  1. Remark: xval is a point at which the polynomial is to be evaluated, xdata is the set of abscissas, and d is the divided difference information computed by divdif. On output, pval is the value of the divided difference polynomial at xval.

  2. pval := dn

  3. Do through step 4 for i from n-1 down to 0.

  4. pval := di + ( xval - xdatai) * pval

  5. Exit from the algorithm.

Remark: The divdif and divval functions are used in pairs. First, use divdif to get d, and then use d in divval to evaluate the approximating polynomial at the values xval.
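For example, assuming you have written both functions as described, a typical pair of calls for our parabola data would be:

      d = divdif ( xdata, ydata );            % build the divided difference information
      pval = divval ( 3.0, xdata, d )         % evaluate the interpolating polynomial at xval = 3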

Exercise 11: write a MATLAB M function called divval.m as described above. Use the divided difference table d from the previous exercise, and evaluate the divided difference polynomial at xval = 3, at which point you should get pval = 18.6. Then use the second example, with xdata1, ydata1, and d1, to compute the approximate value of the square root at x = 2.05 using degree 4 polynomial interpolation; you should get 1.431782079. This value appears in Table 3.3, page 142, at the top of the column labeled p4(x).

Pathology

We never really thought about it, but we assume that an interpolating polynomial p(x) gives us a good approximation to the function f(x) everywhere, no matter what function we choose. If the approximation is not good, we assume it gets better if we increase the number of data points. And if there were an example where the approximation did not get better, we would assume that happens only because the function is "bad". Well, here is a very simple example of how bad things happen to good functions!

Exercise 12: Consider interpolating the function:

f(x) = 1 / ( 1 + x * x )

This should take you no more than about 10 lines, assuming you are using vector operations. Repeat the process for n = 10 and 20. Fill in a table containing the following three lines.

        N =  5  Max Diff = ________
        N = 11  Max Diff = ________
        N = 21  Max Diff = ________

Isn't there a theorem that says that on a given closed interval, any continuous function can be uniformly well approximated by a polynomial? What is the relationship between your evidence, and this fact?
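Here is one possible skeleton for this experiment (a sketch only: it assumes the data abscissas are equally spaced on [-5,5], the interval commonly used for this example, that the comparison is made on a much finer grid, and that your lagval function from Exercise 9 uses elementwise operations so that xval can be a vector; adjust it to whatever instructions you were given):

      n = 4;                                      % polynomial degree; N = n+1 data points
      xdata = linspace ( -5, 5, n + 1 );          % assumed interval
      ydata = 1 ./ ( 1 + xdata .* xdata );
      xval = linspace ( -5, 5, 1001 );            % a fine grid for the comparison
      fval = 1 ./ ( 1 + xval .* xval );
      pval = lagval ( xdata, ydata, xval );       % the interpolating polynomial
      max_diff = max ( abs ( pval - fval ) )      % the "Max Diff" entry for this N
      plot ( xval, fval, '-', xval, pval, '--', xdata, ydata, 'o' )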



Last revised on October 8, 2000.