Matrices

Solving a system of linear equations is not technically difficult: just eliminate the variables in a systematic fashion. When there are only two or three variables, this is easy to manage. But for a bigger system, things can quickly get confusing. We need to develop a systematic method.

The first thing to notice is that the names of the variables don’t matter. Consider, for example, the two systems

 

\begin{array}{cc}  x + y &=3\\  2x-y &= 4  \end{array}

 

and

 

\begin{array}{cc}  u + v &=3\\  2u-v &= 4  \end{array}

 

It’s clear that if we ignore the names of the variables — x and y versus u and v — these two systems are the same. The reason we can tell that they’re the same is because the {\em coefficients} of the variables are the same and the numbers on the right hand side are the same. These are really the only things about a system of linear equations that matter, and so what we can do is strip the system down to its bare bones and rewrite it like this:

 

\left(  \begin{array}{cc|c}  1&1&3\\  2&-1&4  \end{array}  \right)

 

This is an augmented coefficient matrix (in general, a rectangular array of numbers, like the above, is called a matrix; a matrix with an additional vertical line, which plays the same role as the equals signs in the original equations, is augmented). This matrix has two horizontal rows (1\ 1\ 3) and (2\ -1\ 4), one row for each equation in the system. It also has three vertical columns \begin{array}{c}1\\2\end{array}, \begin{array}{c}1\\-1\end{array}, and \begin{array}{c} 3\\4 \end{array} (one column for each of the two variables and one for the numbers on the right hand side of each linear equation). A matrix with two rows and three columns is called a 2 \times 3 matrix.

For example, the augmented coefficient matrix of the system

 

\begin{array}{cc}  x + 3y -z &=-1\\  -x+2y+2z &= -4\\  x - y -z &= 3  \end{array}

 

is the 3 \times 4 matrix

 

\left (  \begin{array}{ccc|c}  1&3&-1&-1\\  -1&2&2&-4\\  1&-1&-1&3  \end{array}  \right )

 

The augmented coefficient matrix

 

\left (  \begin{array}{ccccc|c}  2&1&0&-1&1&9\\  0&3&3&0&7&8\\  -5&2&1&1&4&3  \end{array}  \right )

 

corresponds to the system of linear equations

 

\begin{array}{cc}  2x_1+x_2-x_4+x_5 &= 9\\  3x_2+3x_3+7x_5 &= 8\\  -5x_1 + 2x_2 + x_3 + x_4 + 4x_5 &= 3  \end{array}

 

We now know how to rewrite a system of linear equations as an augmented coefficient matrix. But how does that help us find the solution of that system? As we’ve already discussed, to find the solution of a simple system of linear equations like

 

\begin{array}{cc}  x + y &=3\\  2x-y &= 4  \end{array}

 

we can rewrite the first equation as y=3-x and substitute this back into the second equation to get 2x-(3-x)=4, an equation from which y has been eliminated (and which is easy to solve). Here’s another way of achieving the same thing: Instead of substituting to eliminate a variable, let’s combine the equations in a way that produces a simpler set of equations with the same solution.

We will first do this with the equations themselves, then come up with a set of rules for manipulating the equations, then we will see how these rules correspond to operations on the augmented coefficient matrices. Eventually we will come up with a way to solves the equations, without playing with the equations at all, but simply by using a series of moves to get the matrix into exactly the form we want.

Consider the system

 

\begin{array}{cc}  x + y &=3\\  2x-y &= 4  \end{array}

 

1) Our first step to make the system simpler is to eliminate x from the second equation. To do this, notice that if we multiply the whole of the first equation, x+y=3, by -2 and then add it to the second equation, 2x-y=4, then

 

\begin{array}{rcl}  -2(x+y) &= &-2 (3)\\  2x-y &= &4\\ \hline  -2x - 2y + 2x - y &= &-6 + 4  \end{array}

 

which after simplification becomes the equation y = \frac{2}{3}, which does not contain x.

2) Replace the original second equation, 2x-y=4, with the equation y=\frac{2}{3} to get a modified system of linear equations:

 

\begin{array}{cc}  x + y &=3\\  y &= \frac{2}{3}  \end{array}

 

3) Now we combine the two equations in this modified system to eliminate y from the first one. All we have to do is subtract
the second equation from the first:

 

\begin{array}{rcl}  x+y &= & 3\\  -y &= &-2/3\\ \hline  x + y - y &= &3 - 2/3  \end{array}

 

which becomes, after simplification, the equation x =\frac{7}{3}.

4) Replace the first equation, x+y=3, in the modified system of linear equations with this new equation, which does not contain y, and we have the (further modified) system

 

\begin{array}{cc}  x &=\frac{7}{3}\\  y &= \frac{2}{3}  \end{array}

 

which is also, of course, the solution of the system.

It’s interesting to see what all of this corresponds to in terms of the geometrical picture of what’s going on. Of course taking a single equation and multiplying it by a constant doesn’t change the line that it corresponds to (line, in this case as we’re in 2d). However, adding together equations does give us new lines. This might sound a bit strange. However, what we saw before was that the equations we get in the end, which are really like different combinations of the original graphs, correspond to two new lines (vertical and horizontal) which tell us the solution to the original equations.

What makes the method from this example preferable to the other one (substitution) is that we can employ essentially the same procedure with an augmented coefficient matrix. When we do, the most important thing to remember is that every row of an augmented coefficient matrix is essentially an equation. Anything which we can `legally’ do to a system of linear equations, we can do to its augmented coefficient matrix.

So what can we `legally’ do to a system of linear equations? We have to be clearer about that word `legal’: What we want to make sure of, when we do something to a system of linear equations, is that the resulting system has exactly the same solution as the original system. Two systems of linear equations with this property are called equivalent. It can be shown that if we do any of the following to a system of linear equations, then the resulting system is equivalent to the original:

 

1) Swap two equations.
2) Multiply an equation by a non-zero constant.
3) Replace an equation with the sum of that equation and a constant multiple of some other equation.

 

We can now translate these three operations on equations into an equivalent set of operations on the corresponding matrices. These operations are called the Elementary Row Operations:

If any of the following is done to an augmented coefficient matrix, then the resulting matrix has the same solution as the original:

 

E1) Swap two rows.
E2) Multiply a row by a non-zero constant.
E3) Replace any row with itself plus a constant multiple of any other row.

 

For clarity, we should mention that when we talk about the `solution’ of a matrix, we mean the solution of the associated system of linear equations.

The augmented coefficient matrix of the system

 

\begin{array}{cc}  x + y &=3\\  2x-y &= 4  \end{array}

 

is

 

\left (  \begin{array}{cc|c}  1&1&3\\  2&-1&4  \end{array}  \right ).

 

We shall now use elementary row operations to reduce this system to a simpler system with the same solution. When we worked with equations, we produced a simpler system by eliminating variables. When working with an augmented coefficient matrix, this is equivalent to trying to reduce the matrix to one with lots of 0s. For convenience, we shall write R_1 and R_2 for the first and second rows of the augmented coefficient matrix, respectively.

1) Our first step in the previous example was to eliminate x from the second equation. Our first step here will be to reduce the augmented coefficient matrix to one that has a 0 in the first element of the second row (the position in the matrix corresponding to the first variable, x, in the second equation). To make sure that the reduced matrix has the same solution as the original, we have to use an elementary row operation, and so we replace R_2 with R_2-2R_1 (an elementary row operation of the third kind). This gives us the matrix

 

\left (  \begin{array}{cc|c}  1&1&3\\  0&-3&-2  \end{array}  \right )  \begin{array}{l}  \\  R_2-2R_1  \end{array}

 

(the notation R_2-2R_1 indicates that the second row of this matrix is equal to the second row of the previous matrix minus two times the first row of the previous matrix).

It’ll make things easier if the first non-zero entry in row 2 is 1 (rather than 3), so let’s fix that using an elementary row operation of the second kind:

 

\left (  \begin{array}{cc|c}  1&1&3\\  0&1&2/3  \end{array}  \right )  \begin{array}{l}  \\  R_2/(-3)  \end{array}

 

(notice that R_2 now refers to the second row of the previous matrix, and not to the second row of the original matrix).

In the previous example, our next step was to eliminate the variable y from the first equation, i.e., replacing R_1 with a row that has a 0 in the 2nd column. To do this with an elementary row operation, we replace R_1 with R_1-R_2.

 

\left (  \begin{array}{cc|c}  1&0&7/3\\  0&1&2/3  \end{array}  \right )  \begin{array}{l}  R_1-R_2\\  \  \end{array}

 

We’ve now got a very simple augmented coefficient matrix and we can just read off the solution:

 

\begin{array}{cc} x&=7/3\\  y&=2/3\end{array}

 

Let’s try the same technique — using elementary row operations to reduce an augmented coefficient matrix to a simpler one with the same solution — with a slightly bigger system:

 

\begin{array}{cc}  x + 3y -z &=-1\\  -x+2y+2z &= -4\\  x - y -z &= 3  \end{array}

 

In this case this corresponds to a system of three planes in three dimensions which look like:

p3d1

The augmented coefficient matrix is

 

\left(  \begin{array}{ccc|c}  1&3&-1&-1\\  -1&2&2&-4\\  1&-1&-1&3  \end{array}  \right)

 

We begin by `clearing’ the first column (i.e., eliminating the variable x from every equation except the first):

 

\left(  \begin{array}{ccc|c}  1&3&-1&-1\\  0&5&1&-5\\  0&-4&0&4  \end{array}  \right )  \begin{array}{l}  \ \\  R_2+R_1\\  R_3-R_1  \end{array}

 

Now we make things neater by making the first nonzero entry in each row equal to 1:

 

\left(  \begin{array}{ccc|c}  1&3&-1&-1\\  0&1&1/5&-1\\  0&1&0&-1  \end{array}  \right)  \begin{array}{l}  \ \\  R_2/5\\  R_3/(-4)  \end{array}

 

The next step is to eliminate the variable y from the first and third equations by clearing the second column:

 

\left(  \begin{array}{ccc|c}  1&0&-8/5&2\\  0&1&1/5&-1\\  0&0&-1/5&0  \end{array}  \right)  \begin{array}{l}  R_1-3R_2 \\  \\  R_3-R_2  \end{array}

 

Usually, we would now multiply R_3 by -5 so that the first nonzero entry (1/5) becomes 1. However, that will just make the process of clearing the third column more laborious. We therefore proceed to clear the third column without first tidying up:

 

\left(  \begin{array}{ccc|c}  1&0&0&2\\  0&1&0&-1\\  0&0&-1/5&0  \end{array}  \right)  \begin{array}{l}  R_1-8R_3\\  R_2+R_3\\  \  \end{array}

 

Finally, we tidy up row 3:

 

\left (  \begin{array}{ccc|c}  1&0&0&2\\  0&1&0&-1\\  0&0&1&0  \end{array}  \right )  \begin{array}{l}  \\  \\  (-5)R_3\  \end{array}

 

This is what we wanted — a simple augmented coefficient matrix from which we can read off the solution:

 

\begin{array}{cc}  x&=2\\  y&=-1\\  z&=0  \end{array}

Try and go through each of the matrices and work out what they correspond to in terms of both equations, and planes in 3d space. You will find that however much the equations change, they always correspond to three planes intersecting at the same point.

Reduced row echelon form

In the previous section, we used elementary row operations to reduce a system of linear equations to a simpler system in which it was easy to see the solution. This idea can be turned into a rigorous, systematic method — Gauss reduction — for finding the solution of a system of linear equations. But to start, we must decide what goal we have in mind when we use elementary row operations. We would like to produce a matrix that is simple enough that we can just read off the solution. With that in mind, we give some definitions.

Definition

The first non-zero entry in each row of a matrix is called a pivot.

In the matrix below

 

\left(  \begin{array}{cccc}  1 & 2 & 0 & -1 \\  0 & 5 & 7 & 0 \\  0 & 0 & 0 & 3 \\  0 & 0 & 0 & 0 \\  \end{array}  \right)

 

the numbers 1, 5, and 3 are the pivots in the 1st, 2nd, and 3rd rows, respectively. The fourth row does not have a pivot.

Definition

A matrix is said to be in reduced row echelon form if it satisfies all four of the following conditions:

 

1) All the non-zero rows (i.e., rows with at least one non-zero element) are above any rows that contain only zeroes.
2) The pivot in every non-zero row is 1.
3) Every pivot is (strictly) to the right of every pivot in the rows above it.
4) Every pivot is the only non-zero entry in its column.

 

Below is an example of a matrix that is in reduced row echelon form and several examples of matrices that are not.

 

\left(  \begin{array}{cccccc}  1&0&0&0&2&-11\\  0&1&0&0&3&0\\  0&0&0&1&-2&3\\  0&0&0&0&0&0\\  0&0&0&0&0&0\\  \end{array}  \right)

 

This matrix is in reduced row echelon form.

 

\left(  \begin{array}{cccccc}  1&0&0&0&2&-11\\  0&1&0&0&3&0\\  0&0&0&0&0&0\\  0&0&0&1&-2&3\\  0&0&0&0&0&0\\  \end{array}  \right)

 

Not in reduced row echelon form: there is a row of zeroes above a non-zero row.

 

\left(  \begin{array}{cccccc}  1&0&0&0&2&-11\\  0&3&0&0&3&0\\  0&0&0&-1&-2&3\\  0&0&0&0&0&0\\  0&0&0&0&0&0\\  \end{array}  \right)

 

Not in reduced row echelon form: some of the pivots are not equal to 1.

 

\left(  \begin{array}{cccccc}  1&0&0&0&2&-11\\  0&0&0&1&-2&3\\  0&1&0&0&3&0\\  0&0&0&0&0&0\\  0&0&0&0&0&0\\  \end{array}  \right)

 

Not in reduced row echelon form: the pivot in row 3 is to the left of the pivot in row 2.

 

\left(  \begin{array}{cccccc}  1&-3&0&0&2&-11\\  0&1&0&2&3&0\\  0&0&0&1&-2&3\\  0&0&0&0&0&0\\  0&0&0&0&0&0\\  \end{array}  \right)

 

Not in reduced row echelon form: the pivot in row 2 is not the only non-zero element in its column. Similarly for the pivot in row 3.

If an augmented coefficient matrix is in reduced row echelon form, then it is easy to read off the solution. We’ve already seen two examples of this: The last augmented coefficient matrix in the first example was in reduced row echelon form, and similarly for the second example. In each of these two examples, the system had a unique solution. In general, when we find the solution of a system of linear equations, there are three possibilities:

 

1) The system has no solution.
2) The system has exactly one solution.
3) The system has infinitely many solutions

 

We shall prove this in a few of lectures.

How clear is this post?