MATH2070: LAB 5: Multidimensional Newton's Method
Introduction
Last time we discussed Newton's method for nonlinear equations in one
real or complex variable. In this lab, we will extend the discussion
to two or more dimensions. One of the examples will include a
common application of Newton's method, viz., nonlinear
least squares fitting.
This lab will take two sessions. If you print it, you might find
the pdf version
more convenient.
Modifications to newton.m for vector functions
You should recall, if
is a vector and
a
differentiable vector function with Jacobian matrix
, then
Newton's method can be written
 |
(1) |
(The superscript indicates iteration number. A superscript is used to
distinguish iteration number from vector index.)
The Jacobian matrix,
, is the multidimensional equivalent of
the derivative. It is given by
As an implementation note, the inverse of
should not be computed.
Instead, the system
should be solved. As you will see later in this course, this will save
about a factor of three in computing time over computing the
inverse. Matlab provides a special, division-like symbol for this
solution operation: the backslash (\) operator. If you wish to
solve the system
then the solution can be written as
Note that the divisor is written so that it appears ``underneath'' the
backslash. Mathematically speaking, this expression is equivalent to
but the first form, using \ is about three times faster.
-
- Exercise 1:
- Start from your version of newton.m from Lab 4, or
download my version. Change its name to
vnewton.m and change its signature to
function [x,numIts]=vnewton(f,x)
% comments
and modify it so that it: (a) Uses
the Matlab \ operator instead of scalar division for
increment;
(b) uses the norm rather than the absolute
value of the increment to determine when to stop; (c) does not use the
disp statement; and, (d) has appropriately
modified comments. (Leaving the disp statements in will cause
syntax errors when vector arguments are present, so be sure to eliminate them.)
- Test your modifications by comparing the value of the root
and the number of iterations required for the complex scalar (not
vector) case from last time (
). This shows that
vnewton will still work for scalars.
Your results should agree with those from newton. Fill
in the following table (same one as last lab, except the final case).
Initial guess numIts error numIts(newton)
1+i _______ _________ ______________
1-i _______ _________ ______________
10+5*i _______ _________ ______________
10+1.e-25*i _______ _________ ______________
A complex function revisited
It is possible to rewrite a complex function of a complex variable
as a vector function of a vector variable. This
is done by equating the real part of a complex number with the
first component of a vector and the imaginary part of a complex
number with the second component of a vector.
Consider the function
. Write
and
. Plugging these variables in yields
This can be written in an equivalent matrix form as
![$\displaystyle \left[\begin{array}{c}f_1(x_1,x_2) f_2(x_1,x_2)\end{array}\right]= \left[\begin{array}{c}x_1^2-x_2^2+9 2x_1x_2\end{array}\right]$](img14.png) |
(2) |
-
- Exercise 2:
- Write a function m-file
f2.m to compute the vector function described
above in Equation (2) and its Jacobian. It should
be in the form needed by vnewton and should have the signature
function [f,J]=f2(x)
% comments
where f is the two-dimensional column vector from Equation
(2) and J is its Jacobian matrix.
Hint: Compute df1dx1 (
),
df1dx2, df2dx1, and df2dx2. Then set
J=[df1dx1 df1dx2
df2dx1 df2dx2];
- Fill in the table below. Note that it should be the same as the
table in the previous exercise, except for its format, and the correct
solution is
Initial guess numIts norm(error)
[1;1] _______ _________
[1;-1] _______ _________
[10;5] _______ _________
[10;1.e-25] _______ _________
At this point, you should be confident that vnewton is correctly
programmed and yields the same results as newton from the previous
lab. Now, you will turn your attention to seeing why the initial
guess [10;1.e-25] requires so many Newton iterations. To do this,
you will be using Matlab as an investigative tool.
-
- Exercise 3: In this exercise, we will look more carefully at the iterates in
the poorly-converging case from the previous exercise. Because
we are interested in the sequence of iterates, we will generate
a special version of vnewton.m that returns the
full sequence of iterates. It will return the iterates as a
matrix whose columns are the iterates, with as many columns as
the number of iterates. (One would not normally return so much data
from a function call, but we have a special need for this exercise.)
- Make a copy of vnewton.m and call it vnewton1.m.
Change its signature line to the following
function [x, numIts, iterates]=vnewton1(f,x)
% comments
- Just before the loop, add the following statement
iterates=x;
to initialize the variable iterates.
- Add the following statement just after the new value
of x has been computed.
iterates=[iterates,x];
Since iterates is a matrix and x is a column vector,
this Matlab statement causes one column
to be appended to the matrix iterates for each iteration.
- Test your modified function vnewton1 using the same
f2.m from above, starting from the vector [1;1]. You
should get the same results as before for solution and number of
iterations. Check that the size of the matrix is
(number of
iterations+1), i.e., it has 2 rows and (numIts+1) columns.
The study of the poorly-converging case continues with the following exercise.
The objective is to try to understand what is happening. The point of
the exercise is two-fold: on the one hand to learn something about
Newton iterations, and on other hand to learn how one might use
Matlab as an investigative tool.
-
- Exercise 4:
- Apply vnewton1.m to the function f2.m
starting from the column vector [10;1.e-25]. This is
the poorly-converging case.
- Plot the iterates on the plane using the command
plot(iterates(1,:),iterates(2,:),'*-')
It is very hard to interpret this plot, but it is a place to start. You
do not have to send me a copy of your plot. Note that most of the iterates
lie along the
-axis, with some quite large.
- Use the zoom feature of plots in Matlab (the magnifying glass icon
with a + sign) to look at region with horizontal extent around [-20,20].
It is clear that most of the iterates appear to be on the
-axis.
Please send me a copy of this plot.
Warning: The magnifying glass icon in Matlab on the computers in GSCC
sometimes causes Matlab to crash! You can get the same effect with the
following commands
v=axis;
axis([-20,20,v(3),v(4)]);
- Look at the formula you used for the Jacobian in f2.m.
Explain why, if the initial guess starts with
exactly, then
all subsequent iterates also have
.
- You should be able to see what is happening. I chose an initial
guess with
, so subsequent iterates stayed near the
-axis. Since the root is at
, it takes many iterates before
the
component of the iterate can ``rise to the occasion.''
- Use a semilog plot to see how the iterates grow in the vertical
direction:
semilogy(iterates(2,:))
The plot shows the
component seems to grow exponentially (linearly
on the semilog plot) with seemingly random jumps superimposed. Please
send me a copy of this plot.
You now have a rough idea of how this problem is behaving. It converges
slowly because the initial guess is not close enough to the root for
quadratic convergence to be exhibited. The
component grows
exponentially, however, and eventually the iterates become close
enough to the root to exhibit quadratic convergence. It is possible to
prove these observations, although the proof is beyond the scope of
this lab.
This type of behavior might be described as ``wandering around looking
for a good initial guess.'' It is not uncommon and is one of the
reasons that a good initial guess is important when using Newton's
Method.
Nonlinear least squares
A second common application of Newton's method for vector functions
is to nonlinear curve fitting by least squares. This application
falls into the general topic of ``optimization'' wherein the
extremum of some function
is sought.
Of course, if
is differentiable, then its extrema are given by
the solution of the system of equations
for
, and the solution can be found using Newton's
method.
Suppose you are
observing some phenomenon that is evolving in time. Suppose that
you have a theoretical basis for assuming that the phenomenon
obeys an equation
After the observations have progressed for some time, you have
a large number of pairs of values
for
. The
question to be answered is, ``What values of
for
would best reproduce the observations?'' In other words, find the
values of
that minimize the
norm of the differences between the formula and observations.
Define
and we seek the minimum of
.
Note: The particular problem as stated can be reformulated
as a linear problem, resulting in reduced numerical difficulty. However,
it is quite common to solve the problem in this form and in more difficult
cases it is not possible to reduce the problem to a linear one.
In order to solve this problem, it is best to note that when the minimum
is achieved, the gradient
must be zero. The components
of the gradient
can be written as
(One rarely does this kind of calculation by hand any more. The Maple
and Mathematica symbolic manipulation programs greatly reduce the
manipulative chore. The symbolic toolbox in Matlab is a subset of
Maple.)
To apply Newton's method to
as defined in (3),
the sixteen components of the Jacobian matrix are also needed. These
are obtained from
above by differentiating with respect to
for
.
In the following exercise, you will see that Newton's method applied
to this system can require the convergence neighborhood
to be quite small.
-
- Exercise 5:
- Download my code for the least squares objective function
,
its gradient
, and the Jacobian matrix
. The file is
called objective.m
because
a function to be minimized is often called an ``objective function.''
(Our objective is to minimize the objective function.)
- Use the command help objective to see how to use it.
- Compute f and J at the solution
x=[0.15;2;1;3] to be sure that the function is zero there.
- Compute the determinant of J (above) to see that it is
nonsingular.
-
- Exercise 6: The point of this exercise is to see how sensitive Newton's method
can be when the initial guess is changed. Fill in the following
table with either the number of iterations required or the word
``failed,'' using vnewton.m and the indicated initial
guess. Note that the first line is the correct solution.
For this exercise, restrict the number of iterations to
be no more than 100.
Initial guess Number of iterations
[0.15; 2; 1.0; 3] ______________
[0.15; 2; 0.9; 3] ______________
[0.15; 2; 0.0; 3] ______________
[0.15; 2;-0.5; 3] ______________
[0.15; 2;-0.6; 3] ______________
[0.15; 2; 1.0; 4] ______________
[0.15; 2; 1.0; 5] ______________
[0.15; 2; 1.0; 6] ______________
[0.15; 1.99; 1.0; 3] ______________
[0.15; 1.97; 1.0; 3] ______________
[0.15; 1.95; 1.0; 3] ______________
[0.15; 1.93; 1.0; 3] ______________
[0.15; 1.91; 1.0; 3] ______________
[0.17; 2; 1.0; 3] ______________
[0.19; 2; 1.0; 3] ______________
[0.21; 2; 1.0; 3] ______________
You can see from the previous exercise that Newton can require a precise
guess before it will converge. Let's look at what happens in one of the
divergent cases.
-
- Exercise 7: Recall that your vnewton1.m function has signature
[x, numIts, iterates]=vnewton1(f,x), so that it returns
the matrix of all iterates. Apply that
function to the two final cases in the
previous exercise, and save the iterates using the following commands.
[x,numIts,convg]=vnewton1('objective',[0.19; 2; 1.0; 3]);
[x,numIts,divg]=vnewton1('objective',[0.21; 2; 1.0; 3]);
In particular, look at
x(1) (this is the one that is not already correct in the
initial guess) for each of the first six iterates.
Plot the first component of the iterates on the same plot using the commands
plot(convg(1,1:6),'b')
hold on
plot(divg(1,1:6),'r')
legend('convergent','divergent')
hold off
You should see that the divergent case starts out immediately with
larger variations than the convergent case. Please send me this plot.
(In the plot window, go to File
Export, then choose
``JPEG image'' from the drop-down box, and give the file a name.)
Softening (damping)
The exercise in the previous section indicates that Newton gets in
trouble when its increment is too large. One way to mitigate this
problem is to ``soften'' or ``dampen'' the iteration by putting a
factor (less than one) on the iterate.
 |
(4) |
where
is a number smaller than one.
Try this idea out in the following exercise.
-
- Exercise 8:
- Starting from your vnewton.m file, copy it to a new file
named snewton.m (for ``softened Newton''), change its signature
to
function [x,numIts]=snewton(f,x,alpha)
and change it to conform with Equation (4). Don't
forget to change the comments and the convergence criterion.
Matlab warning: The backslash
operator does not observe the ``operator precedence'' you might expect, so you
need parentheses. For example, 3*2\4=0.6667, but
3*(2\4)=6!
- Using
, to the nearest 0.01, how large can the first
component of the initial guess get before the iteration diverges? (Leave
the other three values at their correct values.)
Softening by a constant factor can improve the initial behavior of the
iterates, but it destroys the quadratic convergence of the method. Further,
it is hard to guess what the softening factor should be. There
are tricks to soften the iteration in such a way that when it starts to
converge, the softening goes away. One such trick is to compute
You should be able to see how you might prove that this softening
strategy does not destroy the quadratic convergence rate.
Another trick is to choose
so that the objective function
necessarily decreases on each iteration. This latter idea can be quite
effective, but offers no advantage in the present case.
-
- Exercise 9:
- Starting from your snewton.m file, copy it to a new file
named snewton1.m, change its signature to
function [x,numIts]=snewton1(f,x)
and change it to conform with Equation (5). Don't
forget to change the comments.
- Consider the number of iterations used by snewton in the
previous exercise for the convergent case with largest first component.
Does snewton1 take fewer iterations? How many fewer?
- Using snewton1.m, to the nearest 0.01, how large can the first
component of the initial guess get before the iteration diverges? (Leave
the other three values at their correct values.)
Back to MATH2070 page.
Mike Sussman
2006-09-02