Test matrices ................... Exercise 2
The linear system ``problem'' ... Exercise 3
Gauss factorization ............. Exercise 4
Permutation matrices ............ Exercise 5
PLU factorization ............... Exercise 6
PLU solution .................... Exercise 7
A system of ODEs ................ Exercise 8
In many numerical applications, notably the solution of ordinary and partial differential equations later this semester, the single most time-consuming step is the solution of a system of linear equations. We now begin a major topic of Numerical Analysis: Numerical Linear Algebra. We assume you are familiar with the concepts of vectors and matrices, matrix multiplication, vector dot products, and the theory of the solution of systems of linear equations, the matrix inverse, eigenvalues, and so on. We now concentrate on the practical aspects of carrying out the computations required to do linear algebra in a numerical setting.
This lab will take four sessions. If you print this lab, you may prefer to use the pdf version.
We begin by noting three special test matrices that we will refer to when trying out algorithms. Then we state the linear system problem and consider three methods of solution, using the determinant, the inverse matrix, or Gauss factorization. We will find that factorization into three simply-solved factors is the best way to go, and we will write an m-file to perform the factorization and another to solve systems given the factorization. We will end up with an example using our m-files as part of the numerical solution of a partial differential equation.
We will now consider a few simple test matrices that we can use to study the behavior of the algorithms for solution of a linear system. We can use such problems as a quick test that we haven't made a serious programming error, but more importantly, we also use them to study the accuracy of the algorithms. We want test problems which can be made any size, and for which the exact values of the determinant, inverse, or eigenvalues can be determined. Matlab itself offers a ``rogues' gallery'' of special test matrices through the Matlab function gallery, but we will mainly be using our own m-files for these test matrices.
The three test matrices we will be using are the second difference matrix (dif2), the Frank matrix (frank), and the Hilbert matrix (hilbert).
Row i of the n by n Frank matrix has the formula:

  A(i,j) = 0,               for j < i-1
  A(i,j) = n+1-max(i,j),    for j >= i-1
The Hilbert matrix arises in interpolation and approximation contexts because it happens that

  H(i,j) = integral from 0 to 1 of x^(i-1) * x^(j-1) dx = 1/(i+j-1).

The Hilbert matrix is at once nice because its inverse has integer elements and also not nice because it is extremely difficult to compute the inverse accurately using the usual formulæ for matrix inverses.
Matlab has special functions for the Hilbert matrix and its inverse, called hilb(n) and invhilb(n), but we will use the m-file hilbert.m.
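As a quick illustration of the difficulty, even a modestly sized Hilbert matrix is numerically close to singular:

H = hilb(10);   % built-in 10 by 10 Hilbert matrix
det(H)          % astonishingly tiny (roughly 1e-53)
cond(H)         % astonishingly large (roughly 1e13)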
Download all three m-files, dif2.m, frank.m, and hilbert.m now.
The linear system ``problem'' can be posed in the following way. Find an n-vector x that satisfies the matrix equation

  A x = b                                                   (1)

where A is a given m by n matrix and b is a given m-vector.
You probably already know that there is ``usually'' a solution if the matrix is square (that is, if m = n). We will concentrate on this case for now. But you may wonder if there is an intelligent response to this problem for the cases of a square but singular matrix, or of a rectangular system, whether overdetermined or underdetermined.
At one time or another, you have probably been introduced to several algorithms for producing a solution to the linear system problem, including Cramer's rule (using determinants), constructing the inverse matrix, Gauss-Jordan elimination and Gauss factorization. We will see that it is usually not a good idea to construct the inverse matrix and we will focus on Gauss factorization for solution of linear systems.
We will be concerned about several topics:
A classic result from linear algebra is the following:
Theorem. The linear system problem (1) is uniquely solvable for arbitrary b if and only if the inverse matrix A^(-1) exists. In this case the solution can be expressed as x = A^(-1) b.
So what's the catch? There are a few:
In the following exercises, you will see that constructing the inverse matrix takes time proportional to n^3 (for an n by n matrix), as does simply solving the system, but that solving the system is several times faster than constructing the inverse matrix. You will also see that matrices with special formats, such as tridiagonal matrices, can be solved very efficiently. And you will see that even simple matrices can be difficult to invert numerically.
You will be measuring elapsed time in order to see how long these calculations take. The problem sizes are designed to take a modest amount of time on the computers in GSCC.
Matlab provides the commands tic and toc to measure computational time. The tic command starts the timer, and the toc command stops it, either printing the elapsed time or returning its value as in the expression elapsedTime=toc;. The times are in seconds. (Note: tic and toc measure elapsed time. When a computer is doing more than one thing, the cputime function can be used to measure how much computer time is being used for this task alone.)
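For example, the timing pattern used throughout this lab looks like this:

tic;                    % start the stopwatch
A = rand(500);          % some work to be timed
x = A \ ones(500,1);
elapsedTime = toc       % stop it; no semicolon, so the time is printed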
function elapsedTime=exer1(n)
% elapsedTime=exer1(n)
% comments
% your name and the date

A = dif2(n);
b = ones(n,1);    % the right side vector doesn't change the time
tic;
Ainv = ???        % compute the inverse matrix
xSolution = ???   % compute the solution
elapsedTime=toc;
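If you are unsure of the blanks, the intent is presumably Matlab's built-in inv followed by a matrix-vector product:

Ainv = inv(A);        % compute the inverse matrix
xSolution = Ainv*b;   % compute the solution from the inverse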
Time to compute inverse matrices

     n       time        ratio
   160    _________    _________
   320    _________    _________
   640    _________    _________
  1280    _________    _________
  2560    _________
x=A\b;

It may help you to remember this operator if you think of the A as being ``underneath'' the slash. The effect of this operator is to find the solution of the system of equations A*x=b.
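The backslash solve and the inverse-matrix solve agree to roundoff; a quick check using the dif2 matrix from earlier:

A = dif2(10);
b = ones(10,1);
x1 = A\b;          % solve directly with backslash
x2 = inv(A)*b;     % solve via the inverse matrix
norm(x1-x2)        % difference is roundoff-sized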
Time to compute solutions

     n       time        ratio
   160    _________    _________
   320    _________    _________
   640    _________    _________
  1280    _________    _________
  2560    _________
Comparison of times

     n     (time for inverse)/(time for solution)
   160              _________
   320              _________
   640              _________
  1280              _________
  2560              _________
You should be getting the message: ``You should never compute an inverse when all you want to do is solve a system.''
Warning: the ``\'' symbol in Matlab will work even when the matrix A is not square. In fact, sometimes it will work when A actually is a vector. The results are not usually what you expect, and no error message is given. Be careful of this potential for error when using ``\''. A similar warning applies to the ``/'' (division) symbol, because Matlab will try to ``interpret'' it if you use it with matrices or vectors, and you will get ``answers'' that are not what you intend.
Time to compute solutions

   Size      time        ratio
   2560   _________    _________
   5120   _________    _________
  10240   _________    _________
  20480   _________    _________
  40960   _________    _________
  81920   _________
N=10;
A = dif2(N);
xKnown = ones(N,1);
b = ???           % compute the right side corresponding to xKnown
xSolution = ???   % compute the solution using "\"
err=norm(xSolution-xKnown,inf)
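If the two blanks are unclear, the intent is presumably to manufacture b from the known solution and then solve, along these lines:

b = A*xKnown;       % right side corresponding to xKnown
xSolution = A\b;    % solve using "\"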
dif2 matrix

  size      error     determinant    cond. no.
   10    _________    __________    __________
   40    _________    __________    __________
  160    _________    __________    __________

The dif2 matrix is about as nicely behaved as a matrix can be. It is neither close to singular nor difficult to invert.
Hilbert matrix

  size      error     determinant    cond. no.
   10    _________    __________    __________
   15    _________    __________    __________
   20    _________    __________    __________

The Hilbert matrix is so close to being singular that numerical approximations cannot distinguish it from a singular matrix.
Frank matrix

  size      error     determinant    cond. no.
   10    _________    __________    __________
   15    _________    __________    __________
   20    _________    __________    __________

The Frank matrix can be proved to have a determinant equal to 1.0 because its eigenvalues come in reciprocal pairs. Thus, it is not close to singular. Nonetheless, roundoff errors conspire to make it appear singular on a computer. The condition number can alert you to this possibility. You can find some information about the Frank matrix at the Matrix Market, and in the references therein.
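Each row of these tables can be filled in with built-in Matlab functions; for instance, a sketch for the size-10 row of the Frank matrix table:

A = frank(10);
xKnown = ones(10,1);
b = A*xKnown;                    % right side with known solution
err = norm(A\b - xKnown, inf)    % error column
d = det(A)                       % determinant column (1.0 in exact arithmetic)
c = cond(A)                      % condition number column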
In the following sections, you will write your own routine that does the same task as the Matlab ``\'' command.
The standard method for computing the inverse of a matrix is Gaussian elimination. You already know this algorithm, perhaps by another name such as row reduction. You have seen it discussed in class, and you can find discussions on the web at, for example, http://en.wikipedia.org/wiki/Gaussian_elimination or http://www.ping.be/~ping1339/stelsels.htm#The-Gauss-method.
First, we will prefer to think of the process as having two separable parts: the ``factorization'' (forward elimination) step and the ``back substitution'' step. The important things to recall include:
The simplest version of Gauss factorization for a matrix A with entries a(i,j) is called Gauss factorization with no pivoting, because it has a very simple way of choosing the pivot equation: the pivot for step k is just the diagonal entry a(k,k). If on step k this diagonal entry is zero, the method will fail, even though a solution may actually exist. The forward substitution steps begin at the upper left of the matrix and proceed down the diagonal, converting all the entries below the diagonal to zero. The backward substitution steps begin at the lower right and work their way back up the diagonal.
The method of ``Gauss factorization with partial pivoting'' chooses as pivot for step k the entry a(i,k) of largest magnitude among rows i >= k. To keep the calculation orderly, this pivot row is actually moved into the k-th row of the matrix. In this case, the matrix gradually is transformed into upper triangular shape. In exact arithmetic, if the system has a solution, this method is guaranteed to reach it. Moreover, in inexact arithmetic, this method has better accuracy properties than no pivoting.
The method of ``Gauss factorization with complete pivoting'' chooses as pivot for step k the entry a(i,j) of largest magnitude over all i >= k and j >= k. This method involves more bookkeeping than the partial pivoting method and doesn't produce much improvement, so it is little used.
We now turn to writing code for the Gauss factorization. Although we assume you are familiar with Gaussian elimination (sometimes called ``row reduction''), the discussion in this lab is entirely self-contained. Because interchanging rows is a source of confusion, however, we will assume at first that no row interchanges are necessary.
The idea is that we are going to use the row reduction operations to turn a given matrix A into an upper triangular matrix U (all of whose elements below the main diagonal are zero) and at the same time keep track of the multipliers in the lower triangular matrix L. These matrices will be built in such a way that A can be reconstructed from the product L*U at any time during the factorization procedure.
Alert: The fact that A = L*U even before the computation is completed can be a powerful debugging technique. If you get the wrong answer at the end of a computation, you can test each step of the computation to see where it has gone wrong. Usually you will find an error in the first step.
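A sketch of this idea, placed at the bottom of the factorization loop (a similar ``temporary debugging'' block appears later in this lab):

% verify the invariant A = L*U after processing column Jcol
if norm(L*U - A, 'fro') > 1.e-12 * norm(A, 'fro')
  fprintf('A = L*U fails after column %d\n', Jcol);
end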
Let's do a simple example first. Consider the matrix

  A = [ 2  4
        1  9 ]

One step of row reduction (subtract 1/2 of the first row from the second) produces the two factors

  L = [ 1    0       U = [ 2  4
        0.5  1 ]           0  7 ]

Notice that the matrix U is what you would obtain from a row reduction operation applied to A. Also notice that L*U = A.
Now, suppose you have a 5 by 5 matrix (switching to Matlab notation) A that is the transpose of the Frank matrix, and you wish to perform the row reduction steps that turn all the entries in the first column below the first row into zero, keeping track of the operations in the matrix L and of the result in U. Convince yourself that the following code does the trick.
n=5;
Jcol=1;
A=frank(5)';   % TRANSPOSE: so A is not too easy
L=eye(n);      % square n by n identity matrix
U=A;
for Irow=Jcol+1:n
  % compute Irow multiplier and save in L(Irow,Jcol)
  L(Irow,Jcol)=U(Irow,Jcol)/U(Jcol,Jcol);
  % multiply row "Jcol" by L(Irow,Jcol) and subtract from row "Irow"
  U(Irow,Jcol:n)=U(Irow,Jcol:n)-L(Irow,Jcol)*U(Jcol,Jcol:n);
end
function [L,U]=gauss_lu(A)
% function [L,U]=gauss_lu(A)
% performs an LU factorization of the matrix A using
% Gaussian reduction.
% A is the matrix to be factored.
% L is a lower triangular factor with 1's on the diagonal
% U is an upper triangular factor.
% A = L * U
% your name and the date
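If you get stuck, the body of gauss_lu is essentially the column-1 code above wrapped in an outer loop; a minimal sketch (no pivoting, so it fails on a zero pivot):

n = size(A,1);
L = eye(n);    % multipliers accumulate here
U = A;         % will become upper triangular
for Jcol = 1:n-1
  for Irow = Jcol+1:n
    % multiplier for this row, saved in L
    L(Irow,Jcol) = U(Irow,Jcol)/U(Jcol,Jcol);
    % subtract L(Irow,Jcol) times row Jcol from row Irow
    U(Irow,Jcol:n) = U(Irow,Jcol:n) - L(Irow,Jcol)*U(Jcol,Jcol:n);
  end
end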
It turns out that omitting pivoting leaves the function you just wrote vulnerable to failure.
A1 = [ -2  1  0  0  0
        1  0  1 -2  0
        0  0  0  1 -2
        1 -2  1  0  0
        0  1 -2  1  0 ]

The method should fail for this matrix.
In Gaussian elimination it is entirely possible to end up with a zero on the diagonal, as the previous exercise demonstrates. Since you cannot divide by zero, the procedure fails. It is also possible to end up with small numbers on the diagonal and big numbers off the diagonal. In this case, the algorithm doesn't fail, but roundoff errors can overwhelm the procedure and result in wrong answers. One solution to these difficulties is to switch rows and columns around so that the largest remaining entry in the matrix ends up on the diagonal. Instead of this ``full pivoting'' strategy, it is almost as effective to switch the rows only, instead of both rows and columns. This is called ``partial pivoting.'' In the following section, permutation matrices are introduced and in the subsequent section they are used to express Gaussian elimination with partial pivoting as a ``PLU'' factorization.
In general, a matrix that looks like the identity matrix but with two rows interchanged is a matrix that causes the interchange of those two rows when multiplied by another matrix. A permutation matrix can actually be a little more complicated than that, but all permutation matrices are products of matrices with a single interchanged pair of rows.
P1=[1 0 0 0
    0 0 1 0
    0 1 0 0
    0 0 0 1];
P2=[1 0 0 0
    0 0 0 1
    0 0 1 0
    0 1 0 0];
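You can check the effect of these matrices directly in Matlab; multiplying on the left by a permutation matrix permutes rows:

A = magic(4);   % any 4 by 4 matrix will do
P1*A            % compare with A: which rows were interchanged?
P2*A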
You have already written gauss_lu.m to do Gauss elimination without pivoting. Now, you are going to add pivoting to the process. The previous section should have convinced you that pivoting is the same thing as multiplying by a permutation matrix. Just as in gauss_lu.m, where you started out with L equal to the identity matrix and saved the row operations in it, you will start off with a permutation matrix P=eye(n) and save the row interchanges in it.
function [P,L,U]=gauss_plu(A)

and change the comments to describe the new output matrix P.
% first, choose a pivot row by finding the largest entry
% on or below position Jcol on the diagonal
[ colMax, pivotShifted ] = max ( abs ( U(Jcol:n, Jcol) ) );
% The value of pivotShifted from max needs to be adjusted.
% See help max for more information.
pivotRow = Jcol+pivotShifted-1;

if pivotRow ~= Jcol   % no pivoting needed if pivotRow==Jcol
  U([Jcol, pivotRow], :)        = U([pivotRow, Jcol], :);
  L([Jcol, pivotRow], 1:Jcol-1) = L([pivotRow, Jcol], 1:Jcol-1);
  P(:,[Jcol, pivotRow])         = P(:,[pivotRow, Jcol]);
end

The reason that pivotShifted must be adjusted to get pivotRow is that, for example, if U(Jcol+1,Jcol) happens to be the largest entry in the column, then the max function will return pivotShifted=2, not Jcol+1. This is because (Jcol+1) indexes the second entry in the vector (Jcol:n).
% TEMPORARY DEBUGGING CHECK
if norm(P*L*U-A,'fro') > 1.e-12 * norm(A,'fro')
  error('If you see this message, there is a bug!')
end
We have seen how to factor a matrix A into the form:

  A = P*L*U
Let's consider the factored linear system once more:

  P*L*U*x = b

It helps to make up some names for intermediate variables and look at the problem in a different way. Set z = L*U*x and y = U*x; then the single system P*L*U*x = b becomes the three systems

  P*z = b
  L*y = z
  U*x = y

In summary, the PLU Solution Algorithm is: to solve A*x = b after finding the factors P, L, and U, and given the right hand side b,

  1. Solve P*z = b for z.
  2. Solve L*y = z for y.
  3. Solve U*x = y for x.
Ouch! I promised a simple algorithm, but now instead of having to solve one system, we have to solve three, and we have three different solution vectors running around. But things actually have gotten better, because these are really simple systems to solve:
P=[0 1
   1 0];
L=[1   0
   0.5 1];
U=[2 4
   0 7];

confirm that

  P*L*U = [ 1  9
            2  4 ]
Step    Result
  0     b = [ 19; 10 ]
  1     z = [ __________; __________ ]
  2     y = [ __________; __________ ]
  3     x = [ __________; __________ ]
In the following exercise, you will be writing a routine plu_solve.m, to solve the linear system (1) by solving the permutation, lower and upper triangular systems. Because this is a three stage process, we will divide it into four separate m-files: u_solve.m, l_solve.m, p_solve.m, and plu_solve.m. The plu_solve routine will look like:
function x = plu_solve ( P, L, U, b )
% function x = plu_solve ( P, L, U, b )
% solve a factored system
% P=permutation matrix
% L=lower-triangular
% U=upper-triangular
% b=right side
% x=solution
% your name and the date

% Solve P*z=b for z using p_solve
z = p_solve(P,b);

% Solve L*y=z for y using l_solve
y = l_solve(L,z);

% Solve U*x=y for x using u_solve
x = u_solve(U,y);
function z=p_solve(P,b)
% z=p_solve(P,b)
% P is an orthogonal matrix
% b is the right hand side
% z is the solution of P*z=b
% your name and the date

z= ???
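Hint: a permutation matrix is orthogonal, so its inverse is its transpose; presumably the single intended line is:

z = P'*b;   % inverse of an orthogonal matrix is its transpose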
function y=l_solve(L,z)
% y=l_solve(L,z)
% L is a lower-triangular matrix whose diagonal entries are 1
% z is the right hand side
% y is the solution of L*y=z
% your name and the date

% set n for convenience and simplicity
n=length(z);

% initialize y to zero and make sure it is a column vector
y=zeros(n,1);

% first row is really an easy one, especially since the diagonal
% entries of L are equal to 1
Irow=1;
y(Irow)=z(Irow);

for Irow=2:n
  rhs = z(Irow);
  for Jcol=1:Irow-1
    rhs = rhs - ???
  end
  y(Irow) = ???
end
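If the blanks are unclear, the intended lines are presumably the forward substitution update and assignment (recall the diagonal of L is 1, so no division is needed):

rhs = rhs - L(Irow,Jcol)*y(Jcol);   % inside the Jcol loop
y(Irow) = rhs;                      % after the Jcol loop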
function x = u_solve(U,y)
% x=u_solve(U,y)
% U is an upper-triangular matrix
% y is the right hand side
% x is the solution of U*x=y
% your name and the date

% set n for convenience and simplicity
n=length(y);

% initialize x to zero and make sure it is a column vector
x=zeros(n,1);

% last row is the easy one
Irow=n;
x(Irow)=y(Irow)/U(Irow,Irow);

% the -1 in the middle means it is going up from the bottom
for Irow=n-1:-1:1
  rhs = y(Irow);
  % the following loop could also be written as a single
  % vector statement.
  for Jcol=Irow+1:n
    rhs = rhs - ???
  end
  x(Irow) = ???
end
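Similarly, the back substitution blanks presumably become the analogous update plus a division by the diagonal entry of U:

rhs = rhs - U(Irow,Jcol)*x(Jcol);   % inside the Jcol loop
x(Irow) = rhs/U(Irow,Irow);         % after the Jcol loop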
x = plu_solve(P,L,U,b)

for the matrices and right side vector in the previous exercise.
A = dif2(3);
x = [1;1;1];
b = A*x;
[P L U]=gauss_plu(A);
xSolution = plu_solve ( ??? );
relErr = norm(x-xSolution)/norm(x)

What are P, L, U, and xSolution? You should see that the error is roundoff-sized.
A = rand(100,100);
x = rand(100,1);
b = A*x;

and the rest of the code as in the previous part of this exercise. The error relErr will usually be a roundoff-sized number. There is always some chance that the randomly generated matrix A will end up poorly conditioned or singular, though. If that happens, generate A and x again: they will be different this time.
You might wonder why we don't combine the m-files gauss_plu.m and plu_solve.m into a single routine. One reason is that there are other things you might want to do with the factorization instead of solving a system. For example, one of the best ways of computing a determinant is to factor the matrix. (The determinant of any permutation matrix is +1 or -1, the determinant of L is 1 because L has all diagonal entries equal to 1, and the determinant of U is the product of its diagonal entries.) Another reason is that the factorization is much more expensive (i.e., time-consuming) than the solve is, and you might be able to solve the system many times using the same factorization. This latter case is presented in the following section.
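For example, once gauss_plu has produced the factors, the determinant costs only a product of diagonal entries (a sketch; det(P) is +1 or -1):

[P,L,U] = gauss_plu(A);
detA = det(P)*prod(diag(U))   % det(L)=1, so only P's sign and U's diagonal matter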
You saw how to numerically solve certain ordinary differential equations in Lab 1. In this section you will see that the implicit Euler method can be used to solve a system of ordinary differential equations (or, what is the same thing, an ordinary differential equation involving vectors and matrices). At each time step in that solution, you need to solve a linear system of equations, and you will use gauss_plu and plu_solve for this task. You will also see the advantage of splitting the solution into a ``factor'' step and a ``solve'' step.
Consider a system of ordinary differential equations

  dx/dt = A*x,    x(0) = x_0,

where x(t) is a vector of N components and A is an N by N matrix. The implicit Euler method with time step dt defines x_{k+1} in terms of x_k through x_{k+1} = x_k + dt*A*x_{k+1}, or, equivalently, through the linear system

  (eye(N) - dt*A) * x_{k+1} = x_k,

which must be solved at each time step.
In Matlab, we will represent the vector x_k as x(:,k), where the ``:'' means ``all the subscripts'' and is equivalent to x(1:end,k). In addition, the Matlab function eye (that is a pun-get it?) generates the identity matrix.
The following code embodies the procedure described above. Copy it into a script m-file called back_euler_system.m. This particular system, using the dif2 matrix, turns out to be a discretization of the time-dependent heat equation. The variable x represents temperature and is a function of time and also of an (here unspecified) one-dimensional spatial variable that is a multiple of its subscript. The plot is a compact way to illustrate the vector x, with its subscript measured along the horizontal axis.
% ntimes = number of temporal steps from t=0 to t=1 .
ntimes=20;
% dt = time increment
dt=1/ntimes;
t(1,1)=0;
% N is the dimension of the space
N=400;
% initial values (sawtooth)
x(1:N/2,1) = (1:N/2)';
x(N/2+1:N,1) = (N/2+1:N)'-N;
% discretization matrix is dif2 multiplied
% by a constant to make the solution interesting
A=N^2/20*dif2(N);
EulerMatrix=eye(N)-dt*A;
tic;
for k = 1 : ntimes
  t(1,k+1) = t(1,k) + dt;
  [P,L,U]=gauss_plu(EulerMatrix);
  x(:,k+1) = plu_solve(P,L,U,x(:,k));
end
CalculationTime=toc
% plot the solution at a sequence of times
plot(x(:,1))
hold on
for k=2:2:ntimes
  plot(x(:,k))
end
title 'Solution at selected times'
hold off
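As written, the loop above factors EulerMatrix at every time step even though the matrix never changes. A sketch of the improvement that motivates the factor/solve split (same results, much less work) simply hoists the factorization out of the loop:

[P,L,U]=gauss_plu(EulerMatrix);   % factor once, before the time loop
tic;
for k = 1 : ntimes
  t(1,k+1) = t(1,k) + dt;
  x(:,k+1) = plu_solve(P,L,U,x(:,k));   % solve many times with the same factors
end
CalculationTime=toc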
You should be aware that the system matrix in this exercise is sparse, but we have not taken advantage of that fact here. A lot of time could be saved if we modified the code to take advantage of sparseness.
If you happened to try Exercise 11 using the ``\'' form, you would find that it is substantially faster than our routines. It is faster because it takes advantage of programming efficiencies that are beyond the scope of the course, not because it uses different methods. One such efficiency would be the replacement of as many loops as possible with vector statements.