These notes are taken from the resource book and were originally written by Dr Erwin. I will be editing and adding to them throughout. Most mistakes within them can thus be presumed to be mine rather than Dr Erwin’s.

In this section we are going to develop a new set of methods for solving a type of problem with which we are already fairly familiar. We will find a way to translate between three approaches: methods we know well, but which turn out not to be very efficient; methods which are graphically very intuitive, but not very useful for calculation; and methods which are computationally extremely powerful, but appear rather abstract compared with the other two. These three approaches, which we will use in detail in the coming sections, are shown in the following diagram:

[Diagram matrices.001: the three approaches and how they relate]

As we go through, I will try to show how we can move between these apparently different formalisms. To start with we won't use matrices. When we do come on to them, we will first think that they are horrible and abstract, and then realise that the machinery we develop for them is incredibly powerful.

Let's start with a very simple example of the type of question that we will eventually want to answer using matrices.

Suppose that we want to find all values of x and y for which

 

x + y =3
2x-y = 4

 

(to be clear: we wish to find the values of x and y that satisfy both of these equations simultaneously). This is a system of linear equations. Solving this system is easy: rewrite the first equation as y=3-x and substitute this into the second equation to get 2x-(3-x)=4. Simplifying gives 3x=7, so x=\frac{7}{3} and y=3-\frac{7}{3}=\frac{2}{3}.
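The substitution above can be checked with exact arithmetic. What follows is a minimal Python sketch using the standard `fractions` module. The variable names are illustrative and are not part of the original notes.

```python
from fractions import Fraction

# System: x + y = 3 and 2x - y = 4.
# Substituting y = 3 - x into the second equation gives 2x - (3 - x) = 4,
# that is 3x - 3 = 4, so 3x = 7.
x = Fraction(4 + 3, 3)   # x = 7/3
y = 3 - x                # back-substitute into y = 3 - x

# Check both original equations using exact rational arithmetic.
first_ok = (x + y == 3)
second_ok = (2 * x - y == 4)

print(x, y, first_ok, second_ok)  # 7/3 2/3 True True
```

Using `Fraction` rather than floating-point numbers keeps the check exact, so the comparisons with 3 and 4 are not affected by rounding.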

Here we have treated the system purely algebraically, but let's see what we've done graphically. Each of these equations is a constraint on the points in the plane \mathbb{R}^2, and in this case each constraint corresponds to a line. The two lines look like this:

[Figure la1: the lines x+y=3 and 2x-y=4 plotted in the plane]

What we mean by having solved the equations is that we have converted them into two new equations, one which involves only x, and one which involves only y (note that the former, being a vertical line, cannot be thought of as giving y as a function of x, because it is not single valued). These two new equations, x=\frac{7}{3} and y=\frac{2}{3}, correspond to two new lines which we plot on the same graph:

[Figure la2: the original two lines together with the lines x=\frac{7}{3} and y=\frac{2}{3}]

In some way we’ve combined the two equations together to get two new equations which correspond to horizontal and vertical lines, where before we had intersecting slanted lines. The solution corresponds to the point of intersection of the two original lines, and of course to the intersection of the two new lines.

In two dimensions this is relatively easy to see, and the same is true in three dimensions (as we will see), but in higher dimensions it is harder to visualise. Nevertheless, we should still keep in mind what is happening geometrically in higher dimensions.

In principle, we can apply the same method as above (eliminating the variables) to any system of linear equations. However, if we are asked to solve a system like

 

u-v+3w+x-5y+2z=3
9u-3v-w+2x-y+12z=-8
17u+w-21x-y+11z=35
2u+2v+3w+x-6y-3z=0
4u-2v+w-54y+2z=-71

 

then things are going to get messy, fast, unless we have a systematic method. In this section of the course, we shall develop some systematic methods for solving systems of linear equations.

Systems of linear equations

Definition: An equation of the form a_1x_1+a_2x_2+ \cdots + a_nx_n = b (where x_1,x_2,\ldots,x_n are variables and a_1,a_2,\ldots,a_n,b are real numbers) is called a linear equation in the variables x_1,x_2,\ldots,x_n.

Each of the following is either a linear equation or can be rewritten as one:

 

3x-5y=7
y = 2-6x
9x+y-2z=3
x_1 + 2x_2 = 5x_3 - 12
10x_1 - \frac{3}{2} x_7 + x_{21} = 0
2x_1 + 3x_2 +4x_3 -2 x_5 + 9x_6 +11x_8 = \pi

 

Each of the following is not a linear equation:

 

x^2+y = 4
2xyz = 5
\sin (x_1) - 3x_2 + 4x_3 = -11
e^{x_1} \ln (x_2 + x_3) - x_4^3 = 2

 

Suppose we are given several linear equations and asked to determine which values of the variables satisfy all of them simultaneously. To be specific, suppose that we are given m linear equations in the n variables x_1,x_2,\ldots,x_n:

 

a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2
\cdots
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m

 

This is called a system of linear equations. If c_1,c_2,\ldots,c_n are real numbers for which

 

a_{11}c_1 + a_{12}c_2 + \cdots + a_{1n}c_n = b_1
a_{21}c_1 + a_{22}c_2 + \cdots + a_{2n}c_n = b_2
\cdots
a_{m1}c_1 + a_{m2}c_2 + \cdots + a_{mn}c_n = b_m

 

(i.e., if (x_1,x_2,\ldots,x_n) = (c_1,c_2,\ldots,c_n) satisfies every one of these linear equations), then the vector (c_1,c_2,\ldots,c_n) is a solution of this system of linear equations.

Consider the system of linear equations

 

x + 2y -z =5
-x+3y+z = 0
2x - y -2z = 5

 

Since

1(2) + 2(1) -(-1) = 5
-1(2) + 3(1) + (-1) = 0
2(2) - (1) - 2(-1) = 5

the vector (x,y,z) = (2,1,-1) is a solution of this system. Note that here we are moving away from the notation \langle x,y,z \rangle for a vector, and are using a point's coordinates interchangeably with the vector that goes from the origin to that point.
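The definition of a solution translates directly into a check that can be automated. The following is a minimal Python sketch. The function name `is_solution` and the way the system is stored (one list of coefficients per equation) are choices made here for illustration and do not come from the original notes.

```python
def is_solution(coefficients, constants, candidate):
    """Return True if `candidate` satisfies every equation of the system.

    coefficients[i][j] is the coefficient of variable j in equation i,
    and constants[i] is the right-hand side of equation i.
    """
    for row, b in zip(coefficients, constants):
        left_hand_side = sum(a * c for a, c in zip(row, candidate))
        if left_hand_side != b:
            return False
    return True


# The system x + 2y - z = 5, -x + 3y + z = 0, 2x - y - 2z = 5.
coefficients = [
    [1, 2, -1],
    [-1, 3, 1],
    [2, -1, -2],
]
constants = [5, 0, 5]

print(is_solution(coefficients, constants, (2, 1, -1)))  # True
print(is_solution(coefficients, constants, (1, 1, 1)))   # False
```

The exact comparison `!=` is appropriate here because all the numbers are integers. With floating-point data one would compare up to a small tolerance instead.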

Consider the system of linear equations

 

x+y=1
x+y=-1

 

From the first equation, we have y=1-x. Substituting y=1-x into the second equation, we get x+(1-x)=-1 which after simplification becomes 1=-1, which is not true. This system of linear equations therefore has no solution. We can see the reason for this when we plot these equations and see that they look like

[Figure la3: the parallel lines x+y=1 and x+y=-1]

The two lines are parallel, so they do not intersect anywhere, and thus the system has no solution.
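The geometric reason can also be confirmed by putting each equation in the form y = (slope)x + (intercept) and comparing. This is a small Python sketch under the assumption that the coefficient of y is nonzero. The helper `slope_and_intercept` is a hypothetical name introduced here, not something from the notes.

```python
from fractions import Fraction


def slope_and_intercept(a, b, c):
    """Rewrite a*x + b*y = c as y = slope*x + intercept (requires b != 0)."""
    return Fraction(-a, b), Fraction(c, b)


# The two equations x + y = 1 and x + y = -1.
slope1, intercept1 = slope_and_intercept(1, 1, 1)
slope2, intercept2 = slope_and_intercept(1, 1, -1)

same_slope = (slope1 == slope2)
different_intercept = (intercept1 != intercept2)

print(slope1, intercept1)                 # -1 1
print(slope2, intercept2)                 # -1 -1
print(same_slope and different_intercept)  # True
```

Equal slopes with different intercepts mean the lines are parallel and distinct, which matches the contradiction 1=-1 found algebraically.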

 

We can see from the example above that some systems of linear equations have no solution. And, while we found a solution to the system of linear equations in the example before that, we do not know whether (2,1,-1) is the only solution or whether there are others. We shall therefore ask two related questions:

 

Given a system of linear equations:
1) Does the system have at least one solution?
2) If it does, how do we find all the solutions?

When we are asked to solve a system of linear equations, we must either show that the system has no solution, or we must find all the solutions of the system. The set of all vectors that satisfy the system is called the solution of that system.
