MATH2070: LAB 7: Polynomial Interpolation and Beyond



Introduction Exercise 1
Matlab Hints Exercise 2
An Experimental Framework Exercise 3
Bracketing Exercise 4
Piecewise Linear Interpolation Exercise 5
  Exercise 6
  Exercise 7
  Exercise 8
  Exercise 9
  Exercise 10


Introduction

We saw last time, with the Runge function, that the interpolating polynomial could get worse as its degree increased. This means that our strategy of using equally spaced data for high degree polynomial interpolation is a bad idea. In this lab, we will test other possible strategies for spacing the data and then we will look at an alternative way to do interpolation with a guarantee that the error will get smaller when we take more points.

This lab will take three sessions. If you print this lab, you may prefer to use the pdf version.


Matlab Hints


An Experimental Framework

In the previous lab, we compared several different methods of polynomial interpolation, and we used a common approach that standardized the comparisons. Part of this lab will be to generate polynomial interpolants for a few different functions on different sets of points. We will be comparing the accuracy of the interpolating polynomials, just as we did last lab. Instead of repeating basically the same code all the time, it is more convenient to automate the process in an m-file. The utility function in the next exercise is designed to test interpolation for different functions on different sets of points.

Remark: I often use Matlab to do comparisons of several different methods. I find it very helpful to build a utility such as the one below to standardize the comparisons as well as to make them quick and easy. It is both error-prone and bothersome to repeat the same sequence of commands several times.

Exercise 1:

In this exercise, you will construct a utility function m-file that will take as input the name of a function (m-file) to interpolate and a set of points xdata on which to perform the interpolation. In the previous lab, the points xdata were uniformly distributed but in this lab we will investigate non-uniform distributions. The interpolation error will be computed by picking a much larger number of test points than are used for interpolation and computing the maximum error on this larger set of test points. The function uses eval_lag.m and lagrange.m from last lab. You can use your versions or download mine. An outline of this function follows:

function max_error=test_poly_interpolate(f,xdata )
% max_error=test_poly_interpolate(f,xdata ) 
% utility function used for testing different interpolation methods.
% f is the function to be interpolated 
% xdata are abscissae at which interpolation takes place
% max_error is the maximum difference between the function
%   and its interpolant

% Choose the number of the test points and generate them
% Use 4001 because it is odd, capturing the interval midpoint
NTEST=4001;
% construct NTEST points evenly spaced so that 
% they cover the interpolation interval in a standard way, i.e.,
% xval(1)=xdata(1) and xval(NTEST)=xdata(length(xdata))
xval= ???

% we need the results of f at the points xdata to do the 
% interpolation WARNING: this is a vector statement
% In a real problem, ydata would be "given" somehow, and
% a function would not be available
ydata=feval(f,xdata);

% use Lagrange interpolation from lab6 to do the interpolation
% WARNING: these use componentwise (vector) statements.
% Generate yval as interpolated values corresponding to xval
yval=eval_lag( ??? 

% we will be comparing yval with the exact results of f at xval
% In a real problem, the exact results would not be available.
yexact= ???

% plot the exact and interpolated results on the same plot
% this gives assurance that everything is reasonable
plot( ??? 

% compute the error.  The easiest way is to use the 
% Matlab norm function.  
% The inf indicates a maximum (infinity) norm.
max_error=norm(yexact-yval,inf);
You can copy this code from the web page, type it in, or download my version.
  1. Look a the code in test_poly_interpolate.m and complete the statements that have question marks in them.
  2. We will be using the Runge function, $ f(x)=1/(1+x^2)$. Recover your runge.m file from last lab or rewrite it. Do not forget to use componentwise (vector) syntax.
  3. Construct xdata containing ndata=5 evenly spaced points between -5 and +5 and use test_poly_interpolate to plot and evaluate the error between the interpolated and exact polynomials.
  4. As a test, fill in the following table.
            Runge function, evenly spaced points
      ndata =  5  Max Error = ________
      ndata = 11  Max Error = ________
      ndata = 21  Max Error = ________
    
    You should get the values 0.43836, 1.9156 and 59.822. (Note that these values are different from the ones you got in last lab. This is because we are interpolating on the interval $ [-5,5]$ instead of $ [-\pi,\pi]$.)

After looking at the plotted comparisons between the Runge function and its polynomial fit, one possibility for the poor fit is that the points are evenly spaced. Maybe they should be concentrated near the endpoints (so they don't oscillate wildly there) or near the center (maybe the oscillations are caused by so many points near the endpoints). Let's examine these two hypotheses.

The strategy used in the next exercise is to use a function to change the distribution of points. That is, pick a nonlinear function $ f$ that maps the interval $ [-1,1]$ into itself (we will use $ x^2$ and $ x^{1/2}$) and also pick an affine function $ g$ that maps the given interval, $ [-5,5]$ into $ [-1,1]$. Then use the points $ x_k=g^{-1}(f(g(\tilde{x_k}))=g^{-1}\circ f\circ g(\tilde{x}_k)$, where $ \tilde{x_k}$ are uniformly distributed.

Exercise 2: We have been looking at evenly-spaced points between -5 and 5. One way to concentrate these points near zero would be to recall that, for numbers less than one, $ \vert x\vert^2<x$. Hence, if xdata represents a sequence of numbers uniformly distributed between -5 and 5, then 5*( sign(xdata).*abs(xdata./5).^2 ) is a similar sequence of numbers concentrated near 0. (The sign and abs functions are used to get the signs correct for negative values.)
  1. Construct 11 points uniformly distributed between $ [-5,5]$ in a vector called xdata. Use the transformation
    xdata=5*( sign(xdata).*abs(xdata./5).^2 )
    
    Look at the concentrated distribution of points using the command
    plot(xdata,zeros(size(xdata)),'*')
    
    to plot the redistributed points along the x-axis. Your points should be concentrated near the center of the interval $ [-5,5]$. You do not need to send me this plot.
  2. Use this transformation and the test_poly_interpolate utility from Exercise 1 to fill in the following table for the Runge function with ndata points concentrated near 0.
            Runge function, points concentrated near 0
      ndata =  5  Max Error = ________
      ndata = 11  Max Error = ________
      ndata = 21  Max Error = ________
    

Exercise 3: Points can be concentrated near the endpoints in a similar manner, by forcing them away from zero using the $ \vert x\vert^{1/2}=\sqrt{x}$ function.
  1. Use a transformation similar to the one in the previous exercise but with $ \vert x\vert^{1/2}$ with 11 points. Take care to get the signs of negative xdata's correct. Plot the resulting redistribution of points. They should be more concentrated near the endpoints.
  2. Use this transformation and the test_poly_interpolate utility from Exercise 1 to fill in the following table for the Runge function with ndata points concentrated near the endpoints.
            Runge function, points concentrated near endpoints
      ndata =  5  Max Error = ________
      ndata = 11  Max Error = ________
      ndata = 21  Max Error = ________
    

You could ask why I chose $ x^2$ and $ x^{1/2}$ out of the many possibilities. Basically, it was an educated guess. It turns out, though, that there is a systematic way to pick optimally-distributed points. It is also true that the Chebyshev points are closely related to trigonometric interpolation (discussed last time).


Chebyshev Points

If we have no idea what function is generating the data, but we are allowed to pick the locations of the data points beforehand, the Chebyshev points are the smartest choice. (Note: there is more on Chebyshev points and interpolation on p. 228 of Atkinson.)

Theorem 3.2 in Atkinson (p. 134) shows that the interpolation error at any point x is given by an expression of the form:

$\displaystyle f(x)-p(x)=\left[\frac{(x-x_1)(x-x_2)\ldots(x-x_n)}{n!}\right]f^{n}(\xi)$

where $ \xi$ is an unknown nearby point. This is something like an error term for a Taylor's series. (Atkinson's expression starts counting with $ x_0$, but this expression counts from $ x_1$.)

We can't do a thing about the expression $ f^{n}(\xi)$, because $ f$ is an arbitrary (differentiable) function, but we can affect the magnitude of the bracketed expression. For instance, if we are only using a single data point ($ n=1$) in the interval [10,20], then the very best choice for the data point would be $ x_1=15$, because then the maximum absolute value of the expression

$\displaystyle \left[\frac{(x-x_1)}{1!}\right]$

would be 5. The worst value would be to choose $ x$ at one of the endpoints, in which case we'd double the maximum value. This doesn't guarantee better results, but it improves the chance.

Constructing the Chebyshev points

For a given number $ n$ of data points, the Chebyshev points can be constructed in the following way.

  1. Pick equally-spaced angles $ \theta_k=(2k-1)\pi/(2n)$, for $ k=1,2,\ldots,n$.
  2. The Chebyshev points on the interval $ [a,b]$ are given as

    $\displaystyle x_k=0.5(a+b+(a-b)\cos\theta_k),$

    for $ k=1,2,\ldots,n$.

Exercise 4:
  1. Write a Matlab function m-file called cheby_points.m with the signature:
    function xdata = cheby_points ( a, b, ndata )
    % xdata = cheby_points ( a, b, ndata )
    % more comments
    
    % your name and the date
    
    that returns in the vector xdata the values of the ndata Chebyshev points in the interval [a,b]. You can do this in 3 lines using vector notation:
    k = (1:ndata);
    theta = (vector expression involving k)
    x = (vector expression involving theta)
    
    or you can use a for loop. (The vector notation is more compact and runs faster, but is a little harder to understand.)
  2. To check, here's what you should get for ndata=4, and [a,b]=[-5,5]:
          1         2         3         4
      -4.61940  -1.91342   1.91342   4.61940
    
  3. You should be able to see from the formula that the Chebyshev points are not uniformly distributed on the interval, but they are symmetrical about its center.

Exercise 5: Repeat exercise 1 but with Chebyshev points. Fill in the table. (Note the extra row!) You should see smaller errors than before, especially for larger ndata.
        Runge function, Chebyshev points
  ndata =  5  Max Error = ________
  ndata = 11  Max Error = ________
  ndata = 21  Max Error = ________
  ndata = 41  Max Error = ________

We now leave the subject of interpolation using a single polynomial and discuss interpolation using different polynomials on different subintervals. This ``piecewise'' interpolation is a much better strategy than using single polynomials for most applications.


Bracketing

In the next section, we are going to consider interpolation using piecewise linear functions, instead of polynomial functions. That means that, in order to evaluate the interpolating function we first must know in which of the intervals (pieces) $ x$ lies, and then compute the value of the function depending on the endpoints of the interval. This section addresses the issue of how to find the left and right end points of the subinterval on which $ x$ lies.

We will write a utility routine to perform this task. We want to get the utility routine right, because we need to be able to rely on it. As usual, it's one of those things that's very easy to describe, and not quite so easy to program.

Suppose we are given a set of $ N$ abscissæ, in increasing order

$\displaystyle x_1<x_2<\dots<x_N$

with functional values $ y_n$ for $ n=1,2,\dots,N$ that together define a piecewise linear function $ \ell(x)$. In order to evaluate $ \ell(x)$ for a given $ x$, you need to know the interval value of $ n$ so that $ x\in[x_n,x_{n+1}]$. This situation is illustrated below, with $ N=4$, and $ n=2$.


\begin{picture}(10,7)(-1,-1)
\put(-1,0){\line(1,0){10}} % x-axis
\multiput(1,0)(...
...){\circle*{.1}}
\par
\put(5,-.3){$x$}
\put(5,0){\vector(0,1){2.3}}
\end{picture}

In Matlab notation, this amounts to the following discussion. Denote $ N$ by ndata and the $ x_n$ by the Matlab vector xdata. We will assume that the components of xdata are strictly increasing. Now suppose that we are given a value xval and we are asked to find an integer named left_index so that the subinterval [ xdata(left_index), xdata(left_index+1) ] contains xval. The index left_index is that of the left end point of the subinterval, and the index (left_index+1) is that of its right end point. (The term index in a computer program generally means the same thing as subscript in a mathematical expression.)

The following clarifications are needed. We seek an integer value left_index so that one of the following cases holds.

Case 1
If xval is less than xdata(1), then regard xval as in the first interval and left_index=1; or
Case 2
If xval is greater than xdata(ndata), then regard xval as in the final interval and left_index=ndata-1 (read that value carefully!); or,
Case 3
xval satisfies xdata(k)<=xval and xval<xdata(k+1), then left_index=k.
We would like to be able to write a Matlab function to perform this task even when xval is a vector of values, but it is not easy to do it efficiently. Instead, in the following exercise, you will write a function scalar_bracket to do it for the case xval is a scalar, not a vector. In the following exercise you will see how it can be written for vectors.

Exercise 6:

In this exercise, we will be writing a Matlab function m-file called scalar_bracket.m and test it thoroughly. Later, you will see how to write it for vectors.

  1. Write a function m-file called scalar_bracket.m that completes the following skeleton. The ``cases'' mentioned in the skeleton refer to the list of three cases above.
    function left_index=scalar_bracket(xdata,xval)
    % left_index=scalar_bracket(xdata,xval)
    % more comments
    
    % your name and the date
    
    ndata = ??? number of x data points ???
    
    % first check Case 1
    if  ??? condition on xval ???
      left_index = ???
      return
    % then Case 2
    elseif ??? condition on xval ???
      left_index = ???
      return
    % finally Case 3
    else
      for k = 1:ndata-1
        if ??? condition on xval ???
          left_index = ???
          return
        end
      end
      error('Scalar_bracket: this cannot happen!')
    end
    

  2. Now, make up a set of at least five values for xdata (don't forget to make them ascending!), and a set of at least seven test values xval, and test scalar_bracket with the values you have selected, one at a time. Make the tests as comprehensive as you can. Include your tests in your summary.

Note: The call to the Matlab error function may seem superfluous to you because, indeed, it can never be executed. It is a powerful debugging tool, however. If you make an error in the cases, it might turn out that the error message does get executed. In that case, you are alerted to the error immediately instead of having an incorrect value of left_index cause an obscure error hundreds of statements later.

Exercise 7: In this exercise, you will see how the same task can be accomplished using vector statements. The code presented below is written to be as efficient as possible, particularly for the case of very large vectors xval.
  1. Copy the following code to a file named bracket.m.

    function left_indices=bracket(xdata,xval)
    % left_indices=bracket(xdata,xval) 
    % more comments
    
    % your name and the date
    
    ndata=length(xdata);
    
    left_indices=zeros(size(xval));
    % Case 1
    left_indices( find( xval<xdata(2) )  )=1;
    % Case 2
    left_indices( find( xdata(ndata-1)<=xval )  )=ndata-1;
    % Case 3
    for k=2:ndata-2
      left_indices(find( (xdata(k)<=xval) & (xval<xdata(k+1)) ))=k;
    end
    if any(left_indices==0)
      error('bracket: not all indices set!')
    end
    
  2. Assuming xval=[0.5,4.5,3.5,4.1], what would the value of indices be as a result of the following statement?
    indices=find( (4<=xval) & (xval<5) )
    
  3. What would the value of left_indices be after the following statements, assuming again xval=[0.5,4.5,3.5,4.1]?
    left_indices=zeros(size(xval));
    left_indices(find( (4<=xval) & (xval<5) ))=4
    
  4. Explain the meaning of the statement beginning ``if any''.
  5. The loop in bracket has k=2:ndata-2, but the loop in scalar_bracket has k=1:ndata-1. Why do the two functions give the same results?
  6. Add a descriptive set of comments to the code.
  7. Use the same sets of values for xdata and xval as you used for scalar_bracket.m in the previous exercise and check that you get the same set of values for left_indices.


Piecewise Linear Interpolation

Now we are ready to consider piecewise linear interpolation. The idea is that our interpolating function is not going to be a smooth polynomial defined by a formula. Instead, it will be defined by piecing together linear interpolants that go through each consecutive pair of data points. To handle values below the first data point, we'll just extend the first linear interpolant all the way to the left. Similarly, we'll extend the linear function in the last interval all the way to the right.

The graph of a piecewise linear function may not be smooth, but one thing is certain: it will never oscillate between the data values. And the interpolation error is probably going to involve the size of the intervals, and the norm of the second derivative of the function, both of which are easy to understand. As the number of points gets bigger, the interval size will get smaller, and the norm of the second derivative won't change, so we have good reason to hope that we can assert a convergence result:

Given an interval $ [a,b]$ and a function $ f(x)$ with a continuous second derivative over that interval, and any sequence $ \lambda_{n}(x)$ of linear interpolants to $ f$ with the property that $ h_n$, the size of the maximum interval, goes to zero as $ n$ increases, then the linear interpolants converge to $ f$ both pointwise, and uniformly.

Atkinson does not discuss a theorem such as this one in this form, but its proof is similar to the discussion in Section 3.7. In fact, if $ C$ is a bound for the maximum absolute value of the second derivative of the function over the interval, then the pointwise interpolation error and the $ L^\infty$ norm are bounded by $ C h_{n}^2$, while the $ L^1$ norm is bounded by $ (b-a)C h_{n}^2$, and $ L^2$ norm is bounded by $ (\sqrt{b-a})Ch_n^2$.

Uniform convergence is convergence in the $ L^\infty$ norm, and is a much stronger result than pointwise convergence.

Thus, if the only thing you know is that your function has a bounded second derivative on $ [a,b]$, then you are guaranteed that the maximum interpolation error you make decreases to zero as you make your intervals smaller.

Stop and think: The convergence result for piecewise linear interpolation is so easy to get. Is it really better than polynomial interpolation? Why did we have such problems with polynomial interpolation before? One reason is that the error result for polynomial interpolation couldn't be turned into a convergence result. Using 10 data points, we had an error estimate in terms of the 10-th derivative. Then using 20 points, we had another error estimate in terms of the 20-th derivative. These two quantities can't be compared easily, and for nonpolynomial functions, successively higher derivatives can easily get successively and endlessly bigger in norm. The linear interpolation error bound involved a particular fixed derivative and held that fixed, so then it was easy to drive the error to zero. because the other factors in the error term depended only on interval size.

To evaluate a piecewise linear interpolant function at a point xval, we need to:

  1. Determine the interval (xdata(left_index),xdata(left_index+1)) containing xval;
  2. Determine the equation of the line which passes through the points (xdata(left_index),ydata(left_index)) and (xdata(left_index+1),ydata(left_index+1));
  3. Evaluate that line at xval.
The bracket.m function from the previous exercise does the first of these tasks, but what of the second and third?

Exercise 8: This exercise involves writing a Matlab function m-file called eval_plin.m (evaluate piecewise linear interpolation) to do steps 2 and 3 above.
  1. Write down a general formula for the linear function through an arbitrary pair of points.
  2. Write a file called eval_plin.m starting with the signature
    function yval = eval_plin ( xdata, ydata, xval )
    % yval = eval_plin ( xdata, ydata, xval )
    % more comments
    
    % your name and the date
    
    Add appropriate comments.
  3. Add lines that use bracket.m to find the vector of indices in xdata identifying the intervals that each of the values in xval lies. (It is possible to do this in one line.)
  4. Add lines evaluating the formula you found in step 1 above on the interval you found in step 3. (It is possible to do this in one line.)
  5. Test eval_plin.m on a piecewise linear function using the following sequence of commands.
      xdata=linspace(-1,1,5);
      ydata=3*abs(xdata)+1;  % the function y=3*|x|+1
      plot(xdata,ydata,'*');
      hold on
      xval=linspace(-1.5,1.5,4001);
      plot(xval,eval_plin(xdata,ydata,xval));
      hold off
    
    (Note: this test procedure is very similar to the test_poly_interpolate function we wrote earlier.) Your plot should pass through the five data points and consist of two straight lines forming a ``V''. Please send me a copy of your plot.
A piecewise linear function was chosen for testing in the last part of this exercise for both theoretical and practical reasons. In the first place, you are fitting a piecewise linear function to a function that is already piecewise linear and both with breaks at $ x=0$, so your fitted function will agree identically with the original. This makes it easy to see that the interpolation results are correct. In the second place, it is very unlikely you will get it right for this piecewise lineear data but have it wrong for other data.

Exercise 9: In this exercise, we will examine convergence of piecewise linear interpolants to the Runge function we have considered before.
  1. Copy the file test_poly_interpolate.m to a file called test_plin_interpolate.m. Change the signature to be
    function max_error=test_plin_interpolate(f,xdata)
    % max_error=test_plin_interpolate(f,xdata)
    % more comments
    
    % your name and the date
    
    and change from using polynomial interpolation to piecewise linear interpolation using eval_plin. Do not forget to change to comments to reflect your coding changes.
  2. Consider the same Runge function we used above. Over the interval [-5,5], use ndata=5 equally spaced data points, use
    test_plin_interpolate to plot the interpolant and evaluate the interpolation error.
  3. Repeat the evaluation for progressively finer meshes and fill in the following table. You do not need to send me the plots that are generated. Warning: The final few of these tests may take a minute or two to complete on a slower computer, although they take only a few seconds on the computers in GSCC.
            Runge function, Piecewise linear
      ndata =  5  Max Error(  5) = ________
      ndata = 11  Max Error( 11) = ________
      ndata = 21  Max Error( 21) = ________     
      ndata = 41  Max Error( 41) = ________
      ndata = 81  Max Error( 81) = ________
      ndata =161  Max Error(161) = ________
      ndata =321  Max Error(321) = ________
      ndata =641  Max Error(641) = ________
    

The errors in the table decrease as ndata increases, as they should. But decreasing error is only half the story: we are interested in the rate that the errors decrease. You know from class that, generally, errors tend to be bounded by $ Ch^p$ where $ C$ is a constant, often related to a derivative of the function being interpolated, $ h$ is related to the subinterval size (in the uniform case, $ h=1/$ndata), and $ p$ is an integer, often a small integer.

One way to estimate the rate errors decrease is to plot $ \log($error$ )$ versus $ \log h$. From the general bound, $ \log($error$ )\le\log C+p\log h$, so $ \log($error$ )$ versus $ \log h$ should yield a straight line. Furthermore, plotting $ \log($error$ )$ versus $ \log h$ is the same thing as plotting error versus $ h$ as a log-log plot.

It is possible to visually estimate the error from a log-log plot. To do so, simply plot lines with known slope $ q$ until you see one that is roughly parallel to the error plot. You will try this in the following exercise.

Another way to estimate the rate of error decrease is to successively halve the subinterval lengths. You may note that the ndata values chosen above result in subintervals halving in length. If the error is roughly proportional to $ Ch^p$, halving $ h$ should result in reducing the error by $ (1/2)^p$. Thus, |Max Error(321)|/|Max Error(641)| should be roughly $ 2^p$ for some integer $ p$.

Exercise 10: This exercise investigates the behavior of the errors in the table in the previous exercise.
  1. Plot the values of Max Error vs. $ h=$(1/ndata) in the table in Exercise 9 using a log-log plot (the Matlab function loglog is used just like plot, but results in a log-log plot). Your points should be roughly a straight line on the plot, especially for larger values of ndata. Send me a copy of this plot.
  2. Visually estimate the slope of the line in the following manner.
    1. Choose a value of C so that the line y=C* h$ ^3$ passes through the point h=1/641, Max Error(641).
    2. With this value of C, plot the line y=C* h$ ^3$ on the same log-log plot as the error plot above (use hold on). If you have computed C correctly, the two plots should pass through the same point at h=1/641.
    3. Do the same thing for y=C* h$ ^2$.
    4. Do the same thing for y=C*h.
    5. Which of the values p best approximates the slope of the error curve, p=3, p=2, or p=1? Be sure to send me this plot with your summary file.
  3. Compute the ratios Max Error(5)/Max Error(11),
    Max Error(11)/Max Error( 21), Max Error(21)/Max Error( 41), etc. Do they appear to approach $ 2^p$ for some integer $ p$? If so, what is your estimate of the value of $ p$?



Back to MATH2070 page.





Mike Sussman 2006-09-22