2 Matrices
This chapter of Linear Algebra by Dr JH Klopper is licensed under an Attribution-NonCommercial-NoDerivatives 4.0 International Licence available at http://creativecommons.org/licenses/by-nc-nd/4.0/?ref=chooser-v1 .
2.1 Introduction
Matrices and their use cases are some of the most important aspects of linear algebra. In this chapter, we will set the groundwork for our understanding of matrices.
Matrices and vectors are commonly used in data science. As an example, we can consider a linear regression model. In simple linear regression, we use a single independent variable to determine an estimate of a dependent variable. In (1), we consider such a model.
$$y = \beta_0 + \beta_1 x \tag{1}$$

An example of a simple linear regression model is shown in the table below, simulating the number of hours studied and the percentage points scored on a test for five observations.
Hours | Result |
10 | 95 |
9 | 85 |
8 | 76 |
8.5 | 80 |
9.5 | 89 |
In this example, the hours studied is the independent variable and the test result is the dependent variable. We can use the Wolfram Language to create a simple linear regression model.
In[]:=
model=LinearModelFit[{{{1,10},{1,9},{1,8},{1,8.5},{1,9.5}},{95,85,76,80,89}}]
Out[]=
FittedModel
In[]:=
model["ParameterTable"]
Out[]=
 | Estimate | Standard Error | t-Statistic | P-Value |
#1 | 0.4 | 3.45736 | 0.115695 | 0.915204 |
#2 | 9.4 | 0.382971 | 24.5449 | 0.00014825 |
The ParameterTable property of the model shows the coefficients β₀ = 0.4 and β₁ = 9.4. Our model is written as in (2), where the first grid of five rows and two columns of values in parentheses is a matrix and the second grid of two rows and a single column is a vector, as we explored in the first chapter.
$$A\beta = y$$

$$\begin{pmatrix}1&10\\1&9\\1&8\\1&8.5\\1&9.5\end{pmatrix}\begin{pmatrix}\beta_0\\\beta_1\end{pmatrix}=\begin{pmatrix}95\\85\\76\\80\\89\end{pmatrix},\qquad\begin{pmatrix}1&10\\1&9\\1&8\\1&8.5\\1&9.5\end{pmatrix}\begin{pmatrix}0.4\\9.4\end{pmatrix}=\begin{pmatrix}94.4\\85\\75.6\\80.3\\89.7\end{pmatrix}\approx\begin{pmatrix}95\\85\\76\\80\\89\end{pmatrix}\tag{2}$$

The last line in (2) shows that our model estimates the test results very closely.
We are able to write our model using matrix-vector multiplication, which we will learn about in this chapter.
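The arithmetic in (2) can also be checked outside the Wolfram Language; a minimal NumPy sketch, using the values from the table above:

```python
import numpy as np

# Design matrix: a column of ones (for the intercept) and the hours studied
A = np.array([[1, 10], [1, 9], [1, 8], [1, 8.5], [1, 9.5]])
y = np.array([95, 85, 76, 80, 89])

# Least-squares estimate of beta = (beta0, beta1)
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(beta, 1))      # the coefficients from the parameter table
print(np.round(A @ beta, 1))  # fitted values, close to y
```

The matrix-vector product `A @ beta` reproduces the fitted values shown in (2).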
2.2 Definitions
Definition 2.2.1 A matrix is an array of numbers or elements arranged in rows and columns, shown in (3).
$$A=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\a_{31}&a_{32}&\cdots&a_{3n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{pmatrix}\tag{3}$$

The subscripts denote the row and column number. There are m rows and n columns, so we can also use the notation in (4), where 1 ≤ i ≤ m and 1 ≤ j ≤ n, and a_{ij} ∈ ℝ for real-valued matrices.

$$A=\{a_{ij}\}\tag{4}$$

The element a_{21} is therefore in row 2 and column 1. Note that we usually omit the comma between the row and column numbers.
Definition 2.2.2 The shape of a matrix is the number of rows and columns it has.

To indicate the shape (or dimensions) of the matrix, we denote it as A_{m×n}, that is, m rows and n columns.
We represent a matrix as a list object in the Wolfram Language. Each row is a nested list. In (5), we see a 3 × 2 matrix, which is recreated in code.

$$A=\begin{pmatrix}3&3\\2&-1\\4&0\end{pmatrix}\tag{5}$$

In[]:=
(* Enter values by row *) matrixA = {{3, 3}, {2, -1}, {4, 0}}
Out[]=
{{3,3},{2,-1},{4,0}}
To print this object to the screen, we use the MatrixForm function.
In[]:=
(* Use MatrixForm only for display purposes, not for assigning to a variable *) MatrixForm[matrixA]
Out[]//MatrixForm=
3 | 3 |
2 | -1 |
4 | 0 |
2.3 Matrix arithmetic
2.3.1 Scalar-matrix multiplication
Definition 2.3.1.1 The operation of scalar-matrix multiplication is defined in (6) for k ∈ ℝ and A_{m×n}.

A matrix can be multiplied by a scalar as depicted in (6). Each element is multiplied by the scalar, k ∈ ℝ.
$$kA=k\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\a_{31}&a_{32}&\cdots&a_{3n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{pmatrix}=\begin{pmatrix}k\,a_{11}&k\,a_{12}&\cdots&k\,a_{1n}\\k\,a_{21}&k\,a_{22}&\cdots&k\,a_{2n}\\k\,a_{31}&k\,a_{32}&\cdots&k\,a_{3n}\\\vdots&\vdots&\ddots&\vdots\\k\,a_{m1}&k\,a_{m2}&\cdots&k\,a_{mn}\end{pmatrix}\tag{6}$$

Below we multiply our matrixA object by k = 3 and print the result as a matrix. Note how the result is each element multiplied by 3.
In[]:=
MatrixForm[3 matrixA]
Out[]//MatrixForm=
9 | 9 |
6 | -3 |
12 | 0 |
The properties of scalar-matrix multiplication are listed in (7-11).
$$1A_{m\times n}=A_{m\times n}\tag{7}$$

$$0A_{m\times n}=O_{m\times n}\tag{8}$$

$$(k_1 k_2)A_{m\times n}=k_1(k_2 A_{m\times n}),\quad\forall\ k_1,k_2\in\mathbb{R}\tag{9}$$

$$(k_1+k_2)A_{m\times n}=k_1 A_{m\times n}+k_2 A_{m\times n}\tag{10}$$

$$k(A_{m\times n}+B_{m\times n})=kA_{m\times n}+kB_{m\times n}\tag{11}$$

We will learn more about the zero matrix, O_{m×n}, in matrix addition.
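These properties can be spot-checked numerically; a minimal NumPy sketch, using the matrix from (5) and illustrative scalars and a second matrix of the same shape:

```python
import numpy as np

A = np.array([[3, 3], [2, -1], [4, 0]])
B = np.array([[4, 1], [2, 2], [3, -1]])  # illustrative matrix of the same shape
k1, k2 = 2.0, 5.0                        # illustrative scalars

assert np.array_equal(1 * A, A)                     # (7) multiplying by 1 changes nothing
assert np.array_equal(0 * A, np.zeros((3, 2)))      # (8) multiplying by 0 gives the zero matrix
assert np.allclose((k1 * k2) * A, k1 * (k2 * A))    # (9) associativity of the scalars
assert np.allclose((k1 + k2) * A, k1 * A + k2 * A)  # (10) distributivity over scalar addition
assert np.allclose(k1 * (A + B), k1 * A + k1 * B)   # (11) distributivity over matrix addition
print("all five properties hold")
```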
Problem 2.3.1.1 Calculate the scalar-matrix multiplication of the matrix (list object) assigned to the variable matrixA and the scalar 3.
The list object matrixA representing the matrix is printed in the code cell below.
In[]:=
(* Reprint matrixA using the matrix form *) MatrixForm[matrixA]
Out[]//MatrixForm=
3 | 3 |
2 | -1 |
4 | 0 |
Multiplying the matrix by the scalar 3 is shown below, using the MatrixForm function to display the result in matrix format.
In[]:=
(* Multiplying the matrix by the scalar 3 *) MatrixForm[3 matrixA]
Out[]//MatrixForm=
9 | 9 |
6 | -3 |
12 | 0 |
2.3.2 Matrix addition
Definition 2.3.2.1 The operation of matrix addition is defined in (12) and is only valid for matrices of the same shape.
$$A_{m\times n}=\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1n}\\a_{21}&a_{22}&\cdots&a_{2n}\\a_{31}&a_{32}&\cdots&a_{3n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}&a_{m2}&\cdots&a_{mn}\end{pmatrix},\qquad B_{m\times n}=\begin{pmatrix}b_{11}&b_{12}&\cdots&b_{1n}\\b_{21}&b_{22}&\cdots&b_{2n}\\b_{31}&b_{32}&\cdots&b_{3n}\\\vdots&\vdots&\ddots&\vdots\\b_{m1}&b_{m2}&\cdots&b_{mn}\end{pmatrix}$$

$$(A+B)_{m\times n}=\begin{pmatrix}a_{11}+b_{11}&a_{12}+b_{12}&\cdots&a_{1n}+b_{1n}\\a_{21}+b_{21}&a_{22}+b_{22}&\cdots&a_{2n}+b_{2n}\\a_{31}+b_{31}&a_{32}+b_{32}&\cdots&a_{3n}+b_{3n}\\\vdots&\vdots&\ddots&\vdots\\a_{m1}+b_{m1}&a_{m2}+b_{m2}&\cdots&a_{mn}+b_{mn}\end{pmatrix}\tag{12}$$

Because of inheritance from the properties of real-number addition, it follows that A_{m×n} + B_{m×n} = B_{m×n} + A_{m×n}. This is termed additive commutativity of matrices (see the properties below).
Below, we create another matrix (as a list object) and add it to the matrixA list object.
In[]:=
matrixB = {{4, 1}, {2, 2}, {3, -1}}; MatrixForm[matrixA + matrixB]
Out[]//MatrixForm=
7 | 4 |
4 | 1 |
7 | -1 |
Matrix subtraction is matrix addition where the second matrix is multiplied by the scalar -1 and then added to the first.
In[]:=
MatrixForm[matrixA + (-1)*matrixB]
Out[]//MatrixForm=
-1 | 2 |
0 | -3 |
1 | 1 |
We can simply use the minus operator.
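The same arithmetic can be checked outside the Wolfram Language; in NumPy the `-` operator likewise subtracts elementwise, using the values of matrixA and matrixB from above:

```python
import numpy as np

matrixA = np.array([[3, 3], [2, -1], [4, 0]])
matrixB = np.array([[4, 1], [2, 2], [3, -1]])

# Subtraction is addition of the second matrix scaled by -1
print(matrixA - matrixB)
assert np.array_equal(matrixA - matrixB, matrixA + (-1) * matrixB)
```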
Problem 2.4.1
In (13-15) we see the properties of matrix addition, namely additive commutativity in (13), additive associativity in (14), and the additive identity in (15).

$$A_{m\times n}+B_{m\times n}=B_{m\times n}+A_{m\times n}\tag{13}$$

$$(A_{m\times n}+B_{m\times n})+C_{m\times n}=A_{m\times n}+(B_{m\times n}+C_{m\times n})\tag{14}$$

$$A_{m\times n}+O_{m\times n}=A_{m\times n}\tag{15}$$
2.5 Matrix-vector multiplication
Consider the two equations in (16).
This results in (18).
After the scalar multiplication, we have vector addition, as in (19).
Since vectors are equal if their components are equal, we are back to the two equations in (16). So, we have a new way to look at these two equations. We can take it a bit further and write the problem as in (20).
As mentioned, we could do this multiplication because the number of columns of matrixA is equal to the number of rows of vector x. The result is a column vector whose number of rows equals the number of rows of the matrix. Let’s have a look at (22) to see how this happened.
From (17) we note that we can also view the problem as in (23).
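The column view of a matrix-vector product is easy to verify numerically: each component of the vector scales the corresponding column of the matrix, and the scaled columns are added. A minimal NumPy sketch (the values here are illustrative, not the ones from (16)):

```python
import numpy as np

A = np.array([[3, 3], [2, -1], [4, 0]])  # 3 x 2 matrix
x = np.array([2, 5])                     # vector with 2 components

# Column view: x[0] times the first column plus x[1] times the second column
col_view = x[0] * A[:, 0] + x[1] * A[:, 1]
assert np.array_equal(A @ x, col_view)
print(A @ x)  # a column vector with as many rows as A
```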
2.6 Scaling matrices
We start with our vector x from above and look at its norm.
Now we multiply this vector with our scaling matrix.
This linear scaling transformation can also reverse the direction of the vector. Let’s see how it’s done.
Note that the norm would still be the same, as it measures (absolute) length; this is guaranteed by the fact that we square each component before adding the squares.
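The effect can be sketched numerically; a minimal NumPy example with an illustrative vector and scale factor (a diagonal scaling matrix with a negative factor reverses the direction, while the norm scales by the factor's absolute value):

```python
import numpy as np

x = np.array([3.0, 4.0])   # illustrative vector with norm 5
S = -2 * np.eye(2)         # scaling matrix with factor -2: reverses direction, doubles length

scaled = S @ x
print(scaled)              # points the opposite way from x
assert np.isclose(np.linalg.norm(scaled), abs(-2) * np.linalg.norm(x))
```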
Not only can we scale a vector with a matrix, but we can also rotate it with rotation matrices.
2.7 Rotation matrices
The rotation matrix that will perform the required rotation is shown in (27).
Let’s create this rotation matrix and see if it delivers as promised.
Now we create vRotated and then plot the two vectors, shown in Figure 2.7.1.
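The standard 2D rotation matrix for an angle θ is [[cos θ, −sin θ], [sin θ, cos θ]]; whether this matches the matrix in (27) depends on the angle chosen there, so treat the values below as an illustrative NumPy check rather than a reproduction of the figure:

```python
import numpy as np

theta = np.pi / 2  # rotate 90 degrees counterclockwise (illustrative angle)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 0.0])
vRotated = R @ v
print(np.round(vRotated, 10))  # the x-axis unit vector lands on the y-axis

# A rotation preserves length: the norm of the vector is unchanged
assert np.isclose(np.linalg.norm(vRotated), np.linalg.norm(v))
```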
2.8 Matrix multiplication
We can multiply two matrices too. From what you have learned about matrix-vector multiplication, it should be clear that the number of columns of the first matrix and the number of rows of the second matrix must be equal for matrix-matrix multiplication to be possible. The resultant matrix has as many rows as the first matrix and as many columns as the second matrix. Take a look at (29) below.
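Outside the Wolfram Language, the shape rule can be spot-checked in NumPy; a minimal sketch with illustrative shapes:

```python
import numpy as np

A = np.arange(6).reshape(3, 2)   # 3 x 2
B = np.arange(8).reshape(2, 4)   # 2 x 4: inner dimensions (2 and 2) match

C = A @ B                        # result is 3 x 4: rows of A by columns of B
print(C.shape)

try:
    B @ A                        # inner dimensions (4 and 3) do not match
except ValueError as e:
    print("incompatible shapes:", e)
```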
Let’s use the Wolfram Language for a practical example.
The properties of matrix-matrix multiplication are listed in (31-35).
2.9 Identity matrix
Let’s see why this is so in (35).
The IdentityMatrix function in the Wolfram Language creates an identity matrix of the required size.
Let’s look at a practical example.
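A quick numerical illustration of the identity matrix leaving a matrix unchanged, sketched in NumPy (np.eye plays the role of IdentityMatrix; the matrix values are the ones from (5)):

```python
import numpy as np

A = np.array([[3, 3], [2, -1], [4, 0]])  # 3 x 2

I3 = np.eye(3)  # the identity must match the side it multiplies on
I2 = np.eye(2)

assert np.allclose(I3 @ A, A)  # left identity is 3 x 3
assert np.allclose(A @ I2, A)  # right identity is 2 x 2
print("identity leaves A unchanged")
```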
2.10 Matrix transpose
Note that we can do the same with a vector. The transpose of a column vector is a row vector. In a row vector, all the elements are listed in a single row. The transpose of a column vector and the resultant row vector is shown in (40).
We list the properties of a matrix transpose in (41-44).
The proofs of these properties follow from the definitions of the transpose, matrix addition, and scalar-matrix multiplication, and from inheritance from the properties of real numbers. The property in (44) can be proven by considering that each entry in the matrix-matrix product is a dot product of the associated rows and columns.
Example 2.10.1
Show that the property in (44) holds for the matrices in (45).
The equality in the property in (44) is shown below.
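The property in (44), that the transpose of a product reverses the order of the factors, can also be spot-checked in NumPy; the matrices below are illustrative, not the ones from (45):

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # 3 x 2, illustrative
B = np.array([[7, 8, 9], [0, 1, 2]])     # 2 x 3, illustrative

# (AB)^T = B^T A^T: the transposes multiply in reverse order
assert np.array_equal((A @ B).T, B.T @ A.T)
print("(AB)^T equals B^T A^T")
```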
2.11 Diagonal and triangular matrices
The main diagonal consists of the entries from the top left (first element) toward the bottom right.
We create a matrix in the code cell below, assigned to the variable A and then use the DiagonalMatrixQ function to determine if A is a diagonal matrix.
While we reference square matrices in the definition, diagonal matrices need not be square.
We can create diagonal matrices using the DiagonalMatrix function. These will be square matrices.
A matrix has more than just a main diagonal. The matrix below, assigned to the variable B, has a first super-diagonal.
Super-diagonal and sub-diagonal matrices do not have the property that they are equal to their transposes.
The super- and sub-diagonals can be even further apart from the main diagonal.
Triangular matrices have entries of all zeros below or above the main diagonal.
We can convert a matrix to upper and lower triangular form using the UpperTriangularize and LowerTriangularize functions. These simply convert the required values to 0. We can also specify the diagonal if required.
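NumPy's np.triu and np.tril play the role of UpperTriangularize and LowerTriangularize, zeroing the entries below or above a chosen diagonal; a minimal sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

upper = np.triu(A)    # zeros below the main diagonal
lower = np.tril(A)    # zeros above the main diagonal
print(upper)
print(lower)

# A second argument shifts the diagonal, like the optional diagonal specification
print(np.triu(A, 1))  # zeros the main diagonal as well
```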
2.12 Trace of a matrix
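The trace of a square matrix is the sum of the entries on its main diagonal; a minimal NumPy illustration (np.trace corresponds to Tr in the Wolfram Language, and the matrix values are illustrative):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Sum of the main-diagonal entries: 1 + 5 + 9
print(np.trace(A))
assert np.trace(A) == np.sum(np.diag(A))
```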
2.13 Symmetric matrices
Symmetric matrices are square matrices whose element values are mirrored across the main diagonal. An example is created in the code below.
The SymmetricMatrixQ function checks whether a matrix is symmetric. Below, we create a symmetric matrix with -2, 0, and 3 symmetric across the main diagonal.
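The symmetry check amounts to comparing a matrix with its transpose; a NumPy sketch using the -2, 0, and 3 off-diagonal values mentioned above (the remaining entries are illustrative, since only the mirrored values are given):

```python
import numpy as np

# Symmetric: entries mirror across the main diagonal, so A equals its transpose
A = np.array([[ 1, -2,  0],
              [-2,  5,  3],
              [ 0,  3,  9]])

assert np.array_equal(A, A.T)  # the NumPy analogue of SymmetricMatrixQ
print("A is symmetric")
```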