I recently saw a post on Quora asking what people generally find exciting about Linear Algebra, and it really took me back, since Linear Algebra was the first thing in the more modern part of mathematics that I fell in love with, thanks to Dr Erwin. I decided to write a Mathemafrica post on concepts that I believe are foundational in Linear Algebra, or at least concepts whose beauty almost brings me to tears (of course this is only a really small part of what you would expect to see in a proper first Linear Algebra course). I did my best to keep it as fluffy as I saw necessary. I hope you will find some beauty in the content as well. If not, then maybe it will be useful for the memes. The post is incomplete as it stands. It has been suggested that it could be made more accessible to a wider audience by building up on it, so I shall work on that, but for now, enjoy this! (I will be happy to explain anything.)

Introduction

So far, there are two posts on Mathemafrica under my name. The first one dealt in a more general sense with counting objects in sets, introducing some ways to do this using functions. The second post gave a very informal introduction to vector spaces and to linear independence of vectors, with examples in 3D space. This post takes some ideas from both, most directly from the second. In that spirit, without much repetition:

Vector Spaces

The set of real numbers together with the usual addition and multiplication of numbers is an example of a class of mathematical structures known as fields. In what follows, fields will be denoted by {F}. Feel free to think of {F} as being {\mathbb{R}}, but remember that this need not be the case.

A vector space {V} over {F} is a non-empty set of elements called vectors, with two laws of combination, vector addition and scalar multiplication, satisfying the following properties:

1. To every pair of vectors {A, B \in V}, there is an associated vector in {V} called their sum, denoted by {A+B}.

2. Addition is associative: {(A+B)+C=A+(B+C)} for all {A, B, C\in V}.

3. There exists a vector {0} such that {A+0=A} for all {A \in V}.

4. Each element {A} in {V} has an additive inverse, {-A}, satisfying {A+(-A)=0}.

5. Addition is commutative: {A+B=B+A}.

6. To every scalar {a} in {F} and vector {A} in {V}, there is a unique vector called the product of {a} and {A}, denoted {aA}.

7. Scalar multiplication is associative: {(ab)A=a(bA)}.

8. Scalar multiplication is distributive with respect to vector addition: {a(A+B)=aA+aB}.

9. Scalar multiplication is distributive with respect to scalar addition: {(a+b)A=aA+bA}.

10. Lastly, {1A=A}, where {1} is the unit of {F}.
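For readers who enjoy seeing things concretely, here is a minimal sketch in Python (using numpy, purely as an illustration) that checks a few of the axioms above for {\mathbb{R}^3} with the usual operations on randomly sampled vectors. It is a numerical sanity check, not a proof.

```python
import numpy as np

# Numerically check some vector space axioms for R^3 with the usual operations.
rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((3, 3))   # three random vectors in R^3
a, b = rng.standard_normal(2)           # two random scalars

assert np.allclose((A + B) + C, A + (B + C))     # associativity of addition
assert np.allclose(A + B, B + A)                 # commutativity of addition
assert np.allclose(A + np.zeros(3), A)           # zero vector
assert np.allclose(A + (-A), np.zeros(3))        # additive inverse
assert np.allclose((a * b) * A, a * (b * A))     # associativity of scalar multiplication
assert np.allclose(a * (A + B), a * A + a * B)   # distributivity over vector addition
assert np.allclose((a + b) * A, a * A + b * A)   # distributivity over scalar addition
assert np.allclose(1.0 * A, A)                   # unit scalar
print("All checked axioms hold for these samples.")
```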

Bases

A basis of a vector space is a set {\Omega} that contains the maximum number of linearly independent vectors in the vector space. As a reminder, a set of vectors {\alpha_i} is linearly independent if the following is true: {\Sigma a_i\alpha_i=0\iff a_i=0\,\forall i,\,a_i\in F,\alpha_i\in V} (i.e. if one takes any one of the vectors, it cannot be written in terms of the others in the same set). If this set has the largest number of linearly independent vectors that one can find in the vector space, then it is said to be a basis of the vector space. Intuitively, this means that any element {\zeta} of the vector space can be written as {\zeta=\Sigma a_i\alpha_i,\,a_i\in F,\alpha_i\in \Omega} (i.e. every element in the vector space can be written using the elements of the basis).
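As a hedged computational aside: one common way to test whether finitely many vectors in {\mathbb{R}^n} are linearly independent is to stack them as the rows of a matrix and compare the rank of that matrix with the number of vectors. The sketch below does this with numpy; the function name is just an illustrative choice.

```python
import numpy as np

def is_linearly_independent(vectors):
    """Return True if the given vectors (as rows) are linearly independent."""
    M = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(M) == len(vectors)

# The standard basis of R^3 is linearly independent (and is in fact a basis).
print(is_linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True

# A fourth vector added to three independent vectors in R^3 must break independence.
print(is_linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 2, 3]]))  # False
```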

Linear Maps

A lot of people know what a linear function is, for instance {f(x)=x}. More generally, linear functions are functions of the form {f(x)=ax}. These functions tend to be nicer than most (for instance, they fix the origin), and extending the idea of a linear function to linear algebra allows the study of a lot of things related to the structure of vector spaces, and of course much else besides. Starting off gently, a short discussion is due.

Let {U, V} be vector spaces over the same field {F}. A linear map is defined as follows: {T: V\to U} such that {T(a\alpha+b\beta)=aT(\alpha)+bT(\beta),\, \alpha,\beta\in V,\, a,b\in F}. This simply means that if one has a linear combination of vectors {a\alpha+b\beta} and applies the linear map to it, the result is the same as first applying the map to each vector, scaling each image by the corresponding scalar, and then adding. It is worth noting that in the above {(a\alpha+b\beta)\in V} while {aT(\alpha)+bT(\beta)\in U}. The theory of such operators is a powerful one that forms the basis of the wonders of linear algebra. Linear maps in this study are usually thought of as transformations of vectors in a vector space (because that is what they are), so this post shall adopt that naming.

The focus of this post is of course not linear transformations themselves, so no worked example shall be provided here, but one can be found at: Linear Transformations.
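That said, for anyone who wants a quick numerical sanity check of the defining property, here is a minimal sketch in Python, assuming only the familiar fact that a matrix acting by matrix-vector multiplication gives a linear map between coordinate spaces.

```python
import numpy as np

# A matrix T acting by matrix-vector multiplication is a linear map R^3 -> R^2.
# Numerically verify T(a*alpha + b*beta) = a*T(alpha) + b*T(beta) on random inputs.
rng = np.random.default_rng(1)
T = rng.standard_normal((2, 3))          # a linear map from R^3 to R^2
alpha, beta = rng.standard_normal((2, 3))
a, b = 2.0, -1.5

lhs = T @ (a * alpha + b * beta)
rhs = a * (T @ alpha) + b * (T @ beta)
assert np.allclose(lhs, rhs)
print("Linearity holds for this sample:", lhs)
```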

Conservation of Dimension

Let {U, V} be vector spaces over the same field as usual. The image and kernel of a linear transformation (from one of the vector spaces to the other) are defined as follows. Considering {T: V\to U}, the image of the transformation is the set of all elements {\beta} of {U} for which there is some element {\alpha} in {V} with {T(\alpha)=\beta}. The kernel is defined as the set of all elements {\alpha} of {V} such that {T(\alpha)=0}, where {0} is the zero vector of {U}.

One of the truly remarkable results of linear algebra states that for any linear transformation {T:V\to U}, dim{V}=dim(Im({T}))+dim(Ker({T})). This result is known as the conservation of dimension (or the rank-nullity theorem). The true power lies in the fact that it makes no mention of the particular form of the linear transformation, nor does it make direct mention of the codomain {U}. This means that by considering some linear transformation, and studying the dimensions of its image and its kernel, one can recover information about the space whose elements are 'acted' upon by the transformation.
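As a hedged illustration, one can watch the conservation of dimension happen numerically: viewing a matrix as a map from {\mathbb{R}^5} to {\mathbb{R}^3}, the dimension of the image (the rank) plus the dimension of the kernel (the nullity) adds up to {5}, the dimension of the domain. The use of scipy's null_space here is purely a convenience.

```python
import numpy as np
from scipy.linalg import null_space

# A random 3x5 matrix is a linear map T: R^5 -> R^3.
rng = np.random.default_rng(2)
T = rng.standard_normal((3, 5))

dim_image = np.linalg.matrix_rank(T)          # dimension of Im(T)
dim_kernel = null_space(T).shape[1]           # dimension of Ker(T)

print(dim_image, dim_kernel)                  # typically 3 and 2
assert dim_image + dim_kernel == T.shape[1]   # dim V = dim Im(T) + dim Ker(T)
```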

Remark: The dimension here represents the number of linearly independent elements in the corresponding sets. In the beginning, the basis {\Omega} of a space was introduced, and the number of elements of such a basis is known as the dimension of the vector space. Correspondingly, given the image or kernel of a transformation, one can single out a maximal set of linearly independent elements and count them to find the dimension of the image or, respectively, the kernel.

Quotient Spaces

If {U \subseteq V} is a subspace (i.e. a subset of a vector space that is itself a vector space), the quotient space {V/U} is built from the relation {v_1 \sim v_2 \,(v_1,v_2\in V)} if {v_1-v_2 \in U}. This means that if two elements have a difference that is some element of {U}, then they are said to be related/equivalent. To those familiar with the notion, it should be clear that this is an equivalence relation.
Set theoretically, {V/U=\{v+U|v\in V\}}. Note that this is a vector space in its own right (one can check that the axioms hold), and the elements of {V/U} can be regarded as the parallel translates {v + U} of the subspace {U}. There is a natural surjective (onto) linear map {M: V \to V/U} given by {M(v)=v+U}, and it is the case that dim{(V/U) = }dim{ V - }dim{ U}.

Intuitively, one can think of this as isolating a subspace inside the vector space, then translating it by considering what happens when one adds (to every element of the subspace) some element of the bigger space; taking all such translates, what one gets is {V/U}. Observe that the dimension of this new space is dim{ V - }dim{ U}, and this emphasises the fact that once one has clearly defined the region of interest, this operation 'invalidates' any translations by elements that are already inside the specific subspace. This happens naturally since subspaces are closed under addition of elements and scalar multiplication. Another, fluffier way to think about this is to consider some solid object making its way through the air (maybe someone threw it). If one wants to analyse the motion of the object, then the internal forces between atoms/molecules are unlikely to give any clear detail about the motion; the only variables that have an effect are external ones. One can now think of translations by elements of the subspace {U} as being internal forces (they keep the space intact), while elements outside do the actual translation.

Example:
Consider {V=\mathbb{R}^3} and let {U=S(\mathbb{R})} be a subspace of {\mathbb{R}^3} isomorphic to {\mathbb{R}} (a line through the origin). Then {V/U=\{\alpha +S(\mathbb{R})|\alpha\in \mathbb{R}^3\}\cong \mathbb{R}^2}, where the last symbol means that the sets {V/U, \mathbb{R}^2} are essentially the same. Observe that, using the idea of the linear map {M} from above, Ker{(M)\cong \mathbb{R}} and Im{(M)\cong \mathbb{R}^2}. Beautiful!
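Here is a small computational sketch of the same example, assuming for concreteness that {S(\mathbb{R})} is the line spanned by {(0,0,1)}: two vectors of {\mathbb{R}^3} represent the same element of {V/U} exactly when their difference lies in {U}, and the dimension count comes out to {3-1=2}.

```python
import numpy as np

# U is the line spanned by (0, 0, 1) inside V = R^3.
u = np.array([0.0, 0.0, 1.0])

def same_coset(v1, v2):
    """v1 ~ v2 in V/U exactly when v1 - v2 is a multiple of u."""
    d = v1 - v2
    # d lies in span(u) iff d is parallel to u, i.e. the cross product vanishes.
    return np.allclose(np.cross(d, u), 0.0)

print(same_coset(np.array([1.0, 2.0, 5.0]), np.array([1.0, 2.0, -7.0])))  # True: they differ along U
print(same_coset(np.array([1.0, 2.0, 5.0]), np.array([1.0, 3.0, 5.0])))   # False

dim_V, dim_U = 3, 1
print("dim(V/U) =", dim_V - dim_U)  # 2, matching V/U being essentially R^2
```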

Decomposition of Vector Spaces

Getting to even more golden grounds, more abstract concepts shall be introduced.

Inner Products

Consider a vector space, {V}, over some field, {F}. An inner product is a beast {\langle .,.\rangle} that satisfies the following four properties. Let {u, v, w\in V} be vectors and {\alpha\in F} be a scalar, then:

1. {\langle u+v,w\rangle=\langle u,w\rangle+\langle v,w\rangle}.

2. {\langle \alpha v,w\rangle=\alpha \langle v,w\rangle}.

3. {\langle v,w\rangle= \langle w,v\rangle}.

4. {\langle v,v\rangle\geq 0} with equality if and only if {v=0}.

The best way to think about inner products is that they are maps that have the exact properties listed above. They might have more, but cannot afford to miss even one of those {4} listed. Of course, {\langle.,.\rangle: V\times V\to \mathbb{R}} for our purposes.
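As a minimal sketch (the standard dot product on {\mathbb{R}^3}, checked numerically on random samples rather than proved), the four properties can be seen in action as follows:

```python
import numpy as np

rng = np.random.default_rng(3)
u, v, w = rng.standard_normal((3, 3))
alpha = rng.standard_normal()

inner = np.dot  # the standard dot product on R^3 is one example of an inner product

assert np.isclose(inner(u + v, w), inner(u, w) + inner(v, w))   # property 1: additivity
assert np.isclose(inner(alpha * v, w), alpha * inner(v, w))     # property 2: homogeneity
assert np.isclose(inner(v, w), inner(w, v))                     # property 3: symmetry
assert inner(v, v) >= 0                                         # property 4: positivity
assert np.isclose(inner(np.zeros(3), np.zeros(3)), 0.0)         # equality exactly at v = 0
print("All four properties hold for these samples.")
```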

Projections within Vector Spaces

Inner products are closely tied to projections in interpretation. For instance, one might consider the dot product, which is an example of an inner product. The dot product has a wonderful intuition behind it: given two vectors, it projects one of the vectors onto the other, then multiplies the magnitude of the projected vector by that of the vector along which it has been projected. The dot product is zero if the vectors in consideration are perpendicular to each other. Let this be the basis of thinking; it might be limiting, but it should help a bit if the above definition seems too mechanical. Moving forward, then:
Consider now any vector space {V} equipped with an inner product. Define, for any subspace {U}, {U^{\perp}:= \{v \in V | \langle v,u\rangle = 0,\, \forall u\in U\}}.
This is called the orthogonal complement of the subspace {U}, and one can think of it as the set of all vectors in {V} that are perpendicular to all elements of {U}. More intuitively, suppose that there is a basis {\Omega'} of {U} which extends to a basis {\Omega} of {V}; then one can roughly think of {U^{\perp}} as corresponding to {\Omega \backslash \Omega'}, i.e. take the basis vectors that make up the whole vector space and take away those that also belong to the subspace. What remains spans a subspace that is different from the one considered initially, and is in some sense 'orthogonal' to it (or rather independent of it, since the actual definition involves inner products and not basis vectors directly).
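A hedged computational sketch: in {\mathbb{R}^n}, the orthogonal complement of the span of finitely many vectors is exactly the null space of the matrix having those vectors as rows, since demanding {\langle v,u_i\rangle=0} for each spanning vector {u_i} is the same as demanding that this matrix sends {v} to zero. Again, scipy's null_space is used purely for convenience.

```python
import numpy as np
from scipy.linalg import null_space

# U is the xy-plane in R^3, spanned by the two rows below.
U_basis = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

# v is orthogonal to every element of U  iff  U_basis @ v = 0,
# so U^perp is the null space of U_basis.
U_perp = null_space(U_basis)
print(U_perp)  # one column, spanning the z-axis

# Every basis vector of U^perp is orthogonal to every basis vector of U.
assert np.allclose(U_basis @ U_perp, 0.0)
```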

Decomposition

Consider {V} as a vector space over {F}, equipped with an inner product. The claim is that if {U} is a subspace of {V} (at least in the finite-dimensional setting considered here), then {V=U\oplus U^{\perp}}.

Think of {\oplus} (direct sum) as a way of adding mathematical spaces together, so that given {U, U^{\perp}}, one can write {U\oplus U^{\perp}=\{u_1+u_2|u_1\in U,\, u_2\in U^{\perp}\}}, with the additional requirement that {U\cap U^{\perp}=\{0\}}, so that every element of the sum is written in this form in exactly one way. This is what makes the sum 'direct'.

The argument of proof goes as follows. The sum is direct because if {v\in U\cap U^{\perp}}, then {\langle v,v\rangle=0}, so {v=0}. Since {U\oplus U^{\perp}} is then a subspace of {V}, dim{(U\oplus U^{\perp})\leq }dim{ V}. The other direction of the inequality comes from the properties of the inner product (in fact, the famous Pythagoras theorem): every element of {V} can be written as an element of {U} plus an element of {U^{\perp}} (via the projection below), so dim{V\leq}dim{(U\oplus U^{\perp})}, which is sufficient for a conclusion.

The nerds will find more peace in the formal version of the Pythagorean argument, which follows. Suppose {V} is an inner product space, {x,y\in V}, {|x| = 1}; let {U = \mathbb{R}x \subseteq V}, and define {P : V \to U} by {Py=\langle x,y \rangle x}. Then {P^2= P}, so call {P} a projection. Generally, {V = }Ker{P} {\oplus }Im{ P}: for all {y\in V } one can write {y = P y + (I - P)y}, and if {v} lies in both the kernel and the image, then {v = Pw} for some {w}, so {v = Pw = P^2w = Pv = 0}. In the case at hand, clearly Ker{ P = U^{\perp}} and Im{ P = U}. So {V = U \oplus U^{\perp}}. Lastly, consider any {y\in V}, and write {y = P y + z} where {P y \in U} and {z \in U^{\perp}}; then {|y|^2 = |Py|^2 + |z|^2 \geq |Py|^2 = |\langle x, y \rangle|^2}, using the definition of the inner product, and from here, Bob's your uncle.
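Finally, here is the same projection argument sketched numerically in Python (again only an illustration, not a proof): take a unit vector {x}, define {Py=\langle x,y \rangle x}, and check that {P^2=P}, that {y-Py} is orthogonal to {U}, and that {|y|^2=|Py|^2+|z|^2}.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(3)
x = x / np.linalg.norm(x)             # |x| = 1, so U = R x is a line in R^3
y = rng.standard_normal(3)

def P(y):
    """Projection onto U = span(x): Py = <x, y> x."""
    return np.dot(x, y) * x

Py = P(y)
z = y - Py                            # the component of y in U^perp

assert np.allclose(P(Py), Py)                                    # P^2 = P
assert np.isclose(np.dot(z, x), 0.0)                             # z is orthogonal to U
assert np.allclose(y, Py + z)                                    # y splits into a U part and a U^perp part
assert np.isclose(np.dot(y, y), np.dot(Py, Py) + np.dot(z, z))   # Pythagoras: |y|^2 = |Py|^2 + |z|^2
print("Decomposition verified for this sample.")
```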

At this point, it could bring some level of joy (to the reader) to play around with the idea of quotient spaces given the discussion above, and see how the ideas unite.

As a last remark: I think linear algebra is absolutely phenomenal, and I believe that it is the beginning of all the things that make life great. I hope that this post was clear and interesting enough. I surely wish I had learnt these concepts, and a couple of other things, in my second-year linear algebra course. Please let me know if you find any errors (especially logical ones), as not much editing went into this. I shall gladly correct them.

[The formatting could be better, but that is WordPress, not me. Also, I converted Latex to WordPress, so there are some glitches surely in terms of formatting. Let me know if you see any remaining. I did my best to keep them minimal.]
