The bisection method is very reliable, but it can be relatively slow, and it does not generalize easily to more than one dimension. In this lab we will look at Newton's method for finding roots of functions. Newton's method can be much faster than bisection, but it requires formulæ for both the function and its derivative, and it can easily fail. Nonetheless, it is a workhorse method in numerical analysis.
We will be taking three sessions to complete this lab. If you print this lab, you may find the pdf version more appropriate.
Root finding routines check after each step to see whether the current result is good enough. The tests that are made are called ``termination conditions'' or ``stopping tests''. Common tests include:
it_count > ITMAX
You should note that the true error,
is
not included among the stopping tests. This is because you would need to
know the exact solution to use it.
You know that the bisection method is very reliable and rarely fails but always takes a (sometimes large) fixed number of steps. Newton's method works more rapidly a good deal of the time, but does fail. And Newton's method works in more than one dimension. One objective of this lab will be to see how Newton can fail and get a flavor of what might be done about it.
Although we will seem to spend a lot of time looking at failures, you should still expect that Newton's method will converge most of the time. It is just that when it converges there is nothing special to say, but when it fails to converge there is always the interesting questions of, ``Why?'' and ``How can I remedy it?''
You will find the definition of Newton's method in Atkinson (Section
2.2) and in the lectures, along with convergence theorems and the
like. The idea of Newton's method is that, starting from a guessed
point
, find the straight line that passes through
the point
and has slope
. The next
iterate,
is simply the location that this straight line
intersects the
-axis.
A ``quick and dirty'' derivation of the formula can be taken from
the definition of the derivative of a function. Assuming you are
at the point
, (I am using superscripts in parentheses to
denote iteration to reduce confusion between iteration number and
vector components.)
As you know, Newton's method also will work for vectors,
so long as the derivative is properly handled. Assume that
and
are
-vectors. Then the Jacobian
matrix is the matrix
whose elements are
Newton's method requires both the function value and its derivative, unlike the bisection method that requireds only the function value. Matlab provides syntax for returning two or more values from a function. The f1.m file from the previous lab can be modified in the following way to return both the value (y) and its derivative (yprime).
function [y,yprime]=f1(x) % [y,yprime]=f1(x) computes y=x^2-9 and its derivative, % yprime=2*x % your name and the date y=x.^2-9; yprime=2*x;Conveniently, this modified version could be used for the bisection method also because Matlab returns only the first value (y) unless the second is explicitly used. If this new function is used in code as if it had a single result y, then the value yprime is computed and discarded, so the modified function will continue to work in the old way. If both y and yprime are needed, as in Newton's method, then syntax like
[z z1]=f1(1)will return z=-8 and z1=2. When it is necessary to call the function f1 from inside a function such as newton, you must use Matlab's feval function. The notation for this is similar:
[z z1]=feval('f1',1)
will again return z=-8 and z1=2.
In the following exercises you will write a version of Newton's method and apply it to several test functions, including the three from the previous lab. The first step is to modify each of those three functions to return both values of both the function and its derivative
f1: y=x^2-9 yprime=??? f2: y=x^6-x-1 yprime=??? f3: y=2*x-exp(-x) yprime=???
Remark: There are several other programming strategies for computing derivatives. One might be to compute the derivative value in a new m-file with a new name such as df. Another way would be to compute the derivative using some sort of divided difference formula. This latter is useful for very complicated functions. For simple functions, Matlab provides ways of putting formulas in-line without making separate m-files. You could use Matlab's symbolic capabilities to symbolically differentiate the functions, too. Finally, there are automated ways to discover the formula for a derivative from the m-file or symbolic formula defining the function, and these can be used to generate the m-file for df automatically.
In the next exercise, you will get down to the task of writing Newton's method as a function m-file. This m-file is shorter than bisect.m and the instructions take you through it step by step.
function [x,numIts,increment]=newton(f,x) % [x,numIts,increment]=newton(f,x) % f is the name of a function with signature [y,yprime]=f(x) % on input, x is the initial guess % on output, x is the final solution % numIts is ??? % increment is ??? % Newton's method is used to find x so that f(x)=0 % your name and the date(You may have to complete this exercise before coming back to explain the parameters.)
EPSILON = 1.0e-6; MAXITS = 500;
for numIts=1:MAXITS
x = x - increment;
disp(strcat(num2str(numIts), ' x=', num2str(x), ' increment=', ...
num2str(increment)));
if abs(increment)<EPSILON
break;
end
end
The break statement causes the loop to be completed before all MAXITS iterations are complete. Note that if newton ever returns with numIts having the value 500 (or whatever you have MAXITS set to), then Newton's method has failed.
The disp statement prints out some useful numbers during the iteration. For the purpose of this lab, leave this printing in the file, but when the function is being used as part of a calculation, it should not print extraneous information.
Name Formula guess approxRoot No. Steps f1 x^2-9 0.1 _________ _________ f2 x^6-x-1 10 _________ _________ f3 2*x-exp(-x) 10 _________ _________
Remark: The theoretical convergence rate of Newton's method is quadratic. You can see this rate nicely if you look at the final several values of increment for the case of f2. This rate depends on an accurate computation of the derivative. Even a minor error in the derivative can yield a method that does not converge very quickly or that diverges. In the above table, none of the cases should have taken as many as 25 iterations. If they did, check your derivative formula. (As a general rule, if you apply Newton's method and it takes hundreds or thousands of iterations but it does converge, check your derivative calculation very carefully. In the overwhelming majority of cases you will find a bug there.)
The proofs presented in class show that quadratic convergence
depends on the ratio
being finite at the solution.
In most cases, this means that
. When this condition
fails at the desired solution, the convergence rate deteriorates
to linear. You will see this illustrated in the following exercises.
f4=x^2,
returning both the function and its derivative.
f1=x^2-9. How many iterations did it take?
f5=x^20
using Newton, again starting at x=0.1.
How many iterations does it take?
Is the true error larger than EPSILON or smaller?
In practice, if you don't know the root, how can you know whether or not the derivative is zero at the root? You can't, really, but you can tell if the convergence rate is deteriorating. Suppose you keep track of the two ratios
increment=1; % this is a dummy valueJust before setting the value of increment inside the loop, save increment in oldIncrement:
oldIncrement=increment;
Function numIts r1 r2 f1=x^2-9 ______ ______ ______ f4=x^2 ______ ______ ______ f5=x^20 ______ ______ ______
This exercise shows that quadratic convergence sometimes fails, usually resulting in a linear convergence rate, and you can estimate the rate. This is not always a bad situation-it is converging after all-but now the stopping criterion is not so good, as you saw in Exercise 4 above. In practice, it is usually as important to know that you are reliably close to the answer as it is to get the answer in the first place.
In the cases of
and
, you saw that
turned out to
be a constant, not merely bounded above and below. In that case,
![]() |
||
![]() |
if abs(increment)<EPSILON*(1-r1)
break;
end
Note: I have multiplied through by Function numIts abs. error f1=x^2-9 ______ ______ f4=x^2 ______ ______ f5=x^20 ______ ______You should see that the modified convergence does not harm quadratic convergence and greatly improves the estimated error in the linear case.
Warning: This modification of the stopping
criterion is very nice when
settles down to a constant
value quickly. In real problems, a great deal of care must
be taken because
can cycle among several values, some
larger than 1, or it can take a long time to settle down.
Sometimes people become so confident of their computational tools that they attempt the impossible. What would happen if you attempted to find a root of a function that had no roots? Basically, the same kind of behavior occurs as when there are zeros of the derivative between the initial guess and the root.
f6=x^2+9.
f6=x^2+9,
starting from x=0.1. Describe what happens.
But the function
does have roots! The roots are
complex, but Matlab knows how to do complex arithmetic.
(Actually the roots are imaginary, but it is all the same to Matlab.)
All Matlab needs is to be reminded to use complex arithmetic.
Warning: If you have used the letter i as a variable in your Matlab session, its special value as the square root of -1 has been obscured. To eliminate the values you might have given it and return to its special value, use the command clear i. It is always safe to use this command, so I always use it just before I start to use complex numbers. For those people who prefer to use j for the imaginary unit, Matlab understands that one, too.
f6=x^2+9, starting from
the initial guess x=1+i. You should get the result
0+3*i and it should take fewer than 10 iterations.
Initial guess numIts error 1+i _______ _________ 1-i _______ _________ 10+5*i _______ _________ 10+eps*i _______ _________(Recall that eps is the smallest number you can add to 1.0 and still change it.)
What happened with that last case in Exercise 8? The derivative
is
and is not zero near
. In
fact, the derivative is zero only at the origin. The origin is
not near the initial guess nor either root. It turns out that
the complex plane is just that: a plane. While Newton's method
works in two or more dimensions, it is harder to see when it is
going to have problems and when not. We will elaborate a bit
in the next section and also return to this
issue in the next lab.
The earlier, one-dimensional cases presented in this lab might lead you to think that there is some theory relating the initial guess and the final root found using Newton's method. For example, it is natural to expect that Newton's method will converge to the root nearest the initial guess. This is not true in general! In the exercise below, you will see that it is not possible, in general, to predict which of several roots will arise starting from a particular initial guess.
Consider the function
, where
is a complex number
with real part
and imaginary part
. This function has the
following four roots.
NPTS=100;
clear i
x=linspace(-2,2,NPTS);
y=linspace(-2,2,NPTS);
omega(1)= ( 1+i)/sqrt(2);
omega(2)= (-1+i)/sqrt(2);
omega(3)= (-1-i)/sqrt(2);
omega(4)= ( 1-i)/sqrt(2);
hold on
for k=1:NPTS
for j=1:NPTS
z=x(k)+i*y(j);
plot(z,'.')
end
end
hold off
Be sure to add comments that identify the purpose of the file and your name.
root=newton('f7',z);
[difference,whichRoot]=min(abs(root-omega));
if difference>1.e-2
whichRoot=0;
end
which(k,j)=whichRoot;
which=[];
image(x,y,10*which) colorbar
One disadvantage of Newton's method is that we have to supply not only the function, but also a derivative. In the following exercise, we will try to make life a little easier by approximating the derivative of the function. One way would be to consider a nearby point, evaluate the function there, and then approximate the derivative by
function [x,numIts,increment]=newton2(f,x,dx)and change the comment lines to reflect the new function.
f1(x)=x^2-9,
starting from the point x = 5.0, using an
increment dx=0.00001. You should observe essentially the same
converged value in the same number of iterations as using newton.
It turns out that the function f1(x)=x^2-9 is an ``easy''
function to solve using Newton iterations. More ``difficult''
functions require a more delicate choice for dx.
f2(x)=x^6-x-1 and f5(x)=x^20,
and the starting point x = 5.0,
compute the number of steps taken to converge
using the true Newton method, and newton2 with
different stepsizes dx. Remember that when the
maximum number of iterations is reached, it indicates
divergence, so write ``diverges'' in that case.
Number of steps f2 Number of steps f5
true Newton __________________ __________________
dx = 0.00001 __________________ __________________
dx = 0.0001 __________________ __________________
dx = 0.001 __________________ __________________
dx = 0.01 __________________ __________________
dx = 0.1 __________________ __________________
dx = 0.5 __________________ __________________
As you can see, the choice of dx is critical. It is also quite difficult in more complicated problems.
Remark: As mentioned earlier, Newton's method generally converges quite quickly. If you write a Newton solver and observe poor or no convergence, the first thing you should look for is an error in the derivative. One way of finding such an error is to apply a method like newton1 and print both the estimated derivative and the analytical derivative computed inside your function. They should be close. This trick is especially useful when you are using Newton's method in many dimensions, where the Jacobian can have some correctly-computed components and some components with errors in them.
Back to MATH2070 page.