The root finding problem is the task of finding one or more
values for a variable
so that an equation
You will see in this lab that error estimates can be misleading and there are many ways to adjust and improve them. We will not examine more sophisticated error estimation in this lab.
This lab will take two sessions. If you print this lab, you may find the pdf version more appropriate.
As you have undoubtedly noticed, Quarteroni, Sacco, and Saleri include
a substantial amount of sample Matlab code in the text. Further, this
code is available for download at
http://www1.mate.polimi.it/%7ecalnum/programs.html. (Be careful, though;
I have observed errors in some of the code on the web site.)
Warning: This link may not be working!
You will find that the code presented in these labs has a very different appearance than the code printed in the text. Perhaps the biggest difference is that the code in the book often includes several commands on a single line, separated by semicolons, while the code in these labs never has more than one command on a single line. It is important for these labs that you follow the formatting presented in the labs for code you hand in because the code in the labs is formatted to make reading and debugging as easy as possible, while code in the text is formatted to take up minimal space (so the book is less expensive).
A second difference is that the code presented in the labs tends to use much longer variable names than the code in the text. Longer variable names also improves clarity.
Clarity is important for debugging because if code is harder to read then it is harder to debug. Debugging is the most difficult and least rewarding activity you will engage in, so anything that simplifies debugging pays off. Clarity is also important when others read your code. Since I need to read your code in order to give you a grade, please format your code similarly to that in the labs.
Suppose we want to know if there is a solution to
Since we can't solve this equation exactly, it's worth knowing whether
there is anything we can say about it. In fact, if there is a solution
to the equation, we can probably find a computer
representable number x which approximately satisfies the
equation (has low residual error) and which is close to
(has a
low approximation error). Each of the following two exercises uses a
plot to illustrate the existance of a solution.
Note: Matlab has the capability of plotting some
functions by name. We won't be using this capability in this
course because it is too specialized. Instead, we will construct
plots the way you probably learned when you first learned about
plotting: by constructing a list of
-values (abscissæ) and
corresponding
-values (ordinates) and then plotting the points
with lines connecting them.
plot(xplot,y1plot)then hold on, plot the second, and hold off. You saw this done before in Lab 2. Note: If you read the help files for the ``plot'' function, you will find that an alternative single command is plot(xplot,y1plot,xplot,y2plot).
print -djpeg exer1.jpgor use the ``File'' menu, ``export'' and choose JPEG image.
function y = cosmx ( x ) % y = cosmx(x) computes the difference y=cos(x)-x % your name and the date y = ???(This is very simple-the object of this exercise is to get going.) Include a copy of this file with your summary.
plot(??? hold on plot(??? hold offas above to plot cosmx and the horizontal axis on the interval
Hint: The horizontal axis is the line
. To plot this as
a function, you need a vector of at least two
-values between
and
and a corresponding vector of
-values all of which
are zero. Think of a way to generate an appropriate set of
-values
and
-values when all the
-values are zero. You could use the
Matlab function zeros.
The idea of the bisection method is very simple. We assume that we
are given two values
and
(
), and that the function
is positive at one of these values (say at
) and
negative at the other. (When this occurs, we say that
is
a ``change-of-sign interval'' for the function.)
Assuming
is continuous, we know that there must be at least
one root in the interval.
Intuitively speaking, if you divide a change-of-sign interval in half, one or the other half must end up being a change-of-sign interval, and it is only half as long. Keep dividing in half until the change-of-sign interval is so small that any point in it is within the specified tolerance.
More precisely, consider the point
.
If
we are done. (This is pretty unlikely).
Otherwise, depending on the sign of
, we know that the
root lies in
or
. In any case, our
change-of-sign interval is now half as large as before. Repeat this
process with the new change of sign interval until the interval is
sufficiently small and declare victory.
We are guaranteed to converge. We can even compute the
maximum number of steps this will take, because if the original
change-of-sign interval has length
then after one bisection the
current change-of-sign interval is of length
, etc.
We know in advance how well
we will approximate the root
. These are very powerful
facts, which make bisection a robust algorithm--that is, it
is very hard to defeat it.
Before we look at a sample bisection program, let's discuss some programming issues. If you haven't done much programming before, this is a good time to try to understand the logic behind how we choose variables to set up, what names to give them, and how to control the logic.
First of all, we should write the bisection algorithm as a Matlab function. There are two reasons for writing it as a function instead of a script m-file (without the function header) or by typing everything at the command line. The first reason is a practical one: we will be executing the algorithm several times, with differing end points, and functions allow us to use temporary variables without worrying about whether or not they have already been used before. The second reason is pedantic: you should learn how to do this now because it is an important tool in your toolbox.
A precise description of the bisection algorithm is presented by Quarteroni,
Sacco, and Saleri in Section 6.2.1, on pages 250-251. Their algorithm
can be expressed in the following way. (Note: the product
is negative if and only if
and
are of
opposite sign and neither is zero.)
Given a function
and
so that
, then
sequences
and
for
with
for each
can be constructed by
setting
,
then, for each
,
The bisection algorithm can be described in a manner more appropriate for computer implementation in the following way. Any algorithm intended for a computer program must address the issue of when to stop the iteration. In the following algorithm,
Algorithm Bisect(
,
,
,
,
)
Remarks:
The language of this description, called ``pseudocode,''
is based on a computer language
called Algol but is intended merely to be a clear means of specifying
algorithms in books and papers. One objective of this lab is to show
how to rewrite algorithms like it in the Matlab language. As a
starting point, let's fix
to be the function cosmx that
you just wrote. Later you will learn how to add it to
the calling sequence. Let's also fix
.
Here is a Matlab function that carries out the bisection algorithm for our cosmx function. In Matlab, returned values are separated from arguments, so the variables x and itCount are written on the left of an equal sign.
function [x,itCount] = bisect_cosmx( a, b)
% [x,itCount] = bisect_cosmx( a, b) uses bisection to find a
% root of cosmx between a and b to tolerance 0.000001=1.0e-6
% a=left end point of interval
% b=right end point of interval
% cosmx(a) and cosmx(b) should be of opposite signs
% x is the approximate root found
% itCount is the number of iterations required.
% your name and the date
EPSILON = 1.0e-6;
fa = cosmx(a);
fb = cosmx(b);
for itCount = 1:(???) % fill in this value from Exercise 3
x = (b+a)/2;
fx = cosmx(x);
% The following statement prints the progress of the algorithm
disp(strcat( 'a=' , num2str(a), ', fa=' , num2str(fa), ...
', x=' , num2str(x), ', fx=' , num2str(fx), ...
', b=' , num2str(b), ', fb=' , num2str(fb)))
if ( fx == 0 )
return; % found the solution exactly!
elseif ( abs ( b - a ) < EPSILON )
return; % satisfied the convergence criterion
end
if ( sign(fa) * sign(fx) <= 0 )
b = x;
fb = fx;
else
a = x;
fa = fx;
end
end
error('bisect_cosmx failed with too many iterations!')
Note: I have violated my own rule about having functions print things, but the disp statement merely prints progress of the function and can be discarded once you are confident it is correct.
Remarks about the code: Compare the code and the
algorithm. The code is similar to the algorithm, but there are some
significant differences. For example, the algorithm generates a sequence
of intervals
and sequences of endpoints
and
midpoints
. The code does not explicitly generate the intervals
and keeps track of only the most recent endpoint and midpoint values.
This is a typical strategy in writing code--use variables to represent
only the most recent values of a sequence when the full sequence does
not need to be retained.
A second difference is in the looping. There is an error exit in the case that the convergence criterion is never satisfied. If the function ever exits with this error, it is because either your estimate of the maximum number of iterations is too small or because there is a bug in your code.
Note the symbol == to represent equality in a test. Use of a single = sign causes a syntax error in Matlab. In languages such as C no syntax error is produced but the statement is wrong nonetheless and can cause serious errors that are very hard to find.
[z,iterations] = bisect_cosmx ( 0, 3 )Is the value z close to a root of the equation cos(z)=z?
Now we have written an m-file called bisect_cosmx.m that can find roots of the function cosmx on any interval, but we really want to use it on an arbitrary function. To do this, we could create an m-file for the other function, but we would have to call that m-file cosmx.m even if there were no cosines in it, because that is what the bisect_cosmx function expects. Suppose that we have three files, called f1.m through f3.m, each of which evaluates a function that we want to use bisection on. We could rename f1.m to cosmx.m, and change its name inside the file as well, and repeat this for the other files. But who wants to do that?
It is more convenient to introduce a dummy variable for the function name, just as we used the dummy variables a and b for numerical values. In Matlab, function names can be passed as quoted strings or, in newer versions, as ``function handles,'' and their use is only a little more difficult than for numbers.
In the bisection function given below, the name of the function is specified
in the function call using either a quoted string or the @
character. Either
[x, itCount] = bisect ( 'cosmx' , a, b )or
[x, itCount] = bisect ( @cosmx , a, b )will work. Inside the function bisect, the value of the function is found using the Matlab feval function, as shown below.
In newer versions of Matlab, either the quoted string or the @
character will still work, and feval can still be used to
evaluate the function. However, in newer versions of Matlab, if
you use the @ method (this is called a ``function handle''),
then you can dispense with the feval syntax and simply
call the function as f(x). If you are using your own
recent version of Matlab, use the method you feel more comfortable
with. If you are using the computers in GSCC, though, you must
use feval.
Warning: If you forget the quotes around the
function name or the @ in front of it, Matlab will produce
errors that seem to make no sense at all (even less than usual). Keep
in mind that function names in calling sequences must have
single quotes around them or the @ character in front of them.
Check for these first when you get errors in a function.
Remark: Quarteroni, Sacco, and Saleri have an m-file named bisect.m on p. 252. Their function includes a variable named fun that contains a string representing the function whose root is to be found. They find function values using eval. In contrast, bisect.m presented above uses a variable named f to contain a string or a function handle giving the name of an m-file and function values are found using feval. Using m-files for function calls is more flexible, allowing much more complicated functions, and is similar to the way that other programming languages (Fortran, C, C++, Java, etc.) work.
function [x, itCount] = bisect ( f, a, b )
fa = cosmx(a);must be rewritten as:
fa = feval ( f, a );Make this change for fa, and a similar change for fb and fx.
[z, iterations] = bisect ( 'cosmx', 0, 3 )Execute this command and make sure that you get the same result as before. (This is called ``regression testing.'')
Warning: If you forget the quotes around cosmx and use the command
[z, iterations] = bisect ( cosmx, 0, 3 ) % error!you will get the following mysterious error message:
>> bisect(cosmx,0,3) ??? Input argument 'x' is undefined. Error in ==> cosmx.m On line 2 ==> y=cos(x)-x;
Name Formula Interval approxRoot No. Steps f1 x^2-9 [0,5] _________ _________ f2 x^5-x-1 [1,2] _________ _________ f3 x*exp(-x) [-1,2] _________ _________ f4 2*cos(3*x)-exp(x) [0,6] _________ _________ f5 (x-1)^5 [0,3] _________ _________
You found above that the bisection algorithm admits an excellent stopping criterion, and the number of iterations can be predicted before beginning the algorithm. This feature is remarkable and highly unusual. The much more common case is that it takes as much or more ingenuity to design a good stopping criterion as it does to design the algorithm itself. Furthermore, stopping criteria are likely to be problem-dependent.
One stopping criterion that is usually available is to stop when
the residual of the function becomes small (
).
This criterion is better than nothing, and often serves as a good
starting point, but it can have drawbacks.
elseif ( abs ( b - a ) < EPSILON )to read
elseif ( abs ( fx ) < EPSILON )
residual true number of
Name Formula Interval error error steps
f0 x-1 [0,3] ___________ ___________ ____________
f5 (x-1)^5 [0,3] ___________ ___________ ____________
The secant method is described by Quarteroni, Sacco, and Saleri in
Section 6.2.2. Instead of dividing the interval in half, as is done
in the bisection method, it regards the function as approximately linear,
passing through the two points
and
and then finds the
root of this linear function. The interval
need no longer be
a change-of-sign interval, and the next value of
need no longer
lie inside the interval
. In consequence, residual convergence
must be used in the algorithm instead of true convergence.
It is easy to see that if a straight line passes through the two points
and
, then it crosses the
-axis (
) when
.
The secant method can be described in the following manner.
Algorithm Secant(
,
,
,
,
)
The secant method is sometimes much faster than bisection, but since it does not maintain an interval inside which the solution must lie, the secant method can fail to converge at all. Unlike bisection, the secant method can be generalized to two or more dimensions, and the generalization is usually called Broyden's method.
function [x,itCount] = secant(f, a, b) % [x,itCount] = secant(f, a, b) % more comments explaining what the function does and % the use of each variable % your name and the date
EPSILON = 1.0e-12; ITMAX = 1000;
Secant Bisection
Name Formula Interval approxRoot No. Steps No. Steps
f1 x^2-9 [0,5] _________ _________ _________
f2 x^5-x-1 [1,2] _________ _________ _________
f3 x*exp(-x) [-1,2] _________ _________ _________
f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________
f5 (x-1)^5 [0,3] _________ _________ _________
f3 x*exp(-x) [-1.5,1] _________ _________ _________
You should observe faster and slower convergence, compared with
bisection. You may also observe lack of convergence or convergence to
a value that is not a root.
Regula falsi is described by Quarteroni, Sacco, and Saleri in Section 6.2.2.
The regula falsi algorithm can be described in the following manner, reminiscent
of the bisection and secant algorithms given above. Since the interval
is supposed to begin as a change-of-sign interval, convergence is
guaranteed.
Algorithm Regula(
,
,
,
,
)
Regula Secant Bisection
Name Formula Interval approxRoot No. Steps No. Steps No. Steps
f1 x^2-9 [0,5] _________ _________ _________ _________
f2 x^5-x-1 [1,2] _________ _________ _________ _________
f3 x*exp(-x) [-1,2] _________ _________ _________ _________
f4 2*cos(3*x)-exp(x) [0,6] _________ _________ _________ _________
f5 (x-1)^5 [0,3] _________ _________ _________ _________
f3 x*exp(-x) [-1.5,1] _________ _________ _________ _________
You should observe that both faster and slower convergence, compared
with bisection and secant. You should not observe lack of convergence
or convergence to an incorrect solution.
Back to MATH2070 page.