8 | Inner product spaces
This chapter of Linear Algebra by Dr JH Klopper is licensed under an Attribution-NonCommercial-NoDerivatives 4.0 International Licence available at http://creativecommons.org/licenses/by-nc-nd/4.0/?ref=chooser-v1 .
8.1 Introduction
In this chapter we review the idea of the inner product and use it to develop the notion of an inner product space.
8.1.1 Inner product
The scalar or dot product is a function that maps u, v ∈ ℝⁿ to ℝ, shown in (1).

$$\mathbf{u} \cdot \mathbf{v} = \mathbf{v}^{T}\mathbf{u} = \sum_{i=1}^{n} u_i v_i \tag{1}$$

In the previous notebooks we showed that vectors in ℝⁿ are only one of many examples of vector spaces, all satisfying the axioms of vector spaces under the binary operations of vector addition, ⊕, and scalar-vector multiplication, ⊙. Here we use the term vector in a broader sense, as an element of a set V.
In this notebook, we extend the notion of a dot product to an inner product. An inner product is defined on a vector space, V, with elements u, v ∈ V. It is a mapping of these vectors to a scalar, denoted by ⟨u, v⟩. A vector space with such an inner product defined on it is termed an inner product space if the four axioms shown in (2) are satisfied.

$$\begin{aligned} \langle u, v \rangle &= \langle v, u \rangle \\ \langle u + v, w \rangle &= \langle u, w \rangle + \langle v, w \rangle \\ \langle ku, v \rangle &= k\langle u, v \rangle \\ \langle u, u \rangle &\ge 0, \quad \langle u, u \rangle = 0 \Leftrightarrow u = 0 \end{aligned} \tag{2}$$

These four axioms are named (in order) the symmetry axiom, the additivity axiom, the homogeneity axiom, and the positivity axiom.
8.2 Weighted Euclidean inner product
For w₁, w₂, …, wₙ ∈ ℝ⁺ (positive real numbers), we define the weighted Euclidean inner product in (3).

$$\langle u, v \rangle = \sum_{i=1}^{n} w_i u_i v_i \tag{3}$$

Note that this is different from the dot product, where u·v = ∑ uᵢvᵢ.
We can view this as an example in which we conduct an experiment with a single, random outcome. This outcome can be mapped to any one of the elements x₁, x₂, …, xₙ. If we conduct this experiment m times, we will have a varying frequency (number of occurrences) of each of the listed elements. If these frequencies are f₁, f₂, …, fₙ for each corresponding element, then f₁ + f₂ + … + fₙ = m. If the xᵢ are numerical values, then we can calculate the arithmetic mean, x̄, of our outcomes, shown in (4).

$$\bar{x} = \frac{1}{m} \sum_{i=1}^{n} f_i x_i \tag{4}$$

Below we verify this with code.
In[]:=
SeedRandom[12];outcome=RandomInteger[{1,5},100];
In[]:=
Sort@Tally[outcome]
Out[]=
{{1,14},{2,32},{3,16},{4,15},{5,23}}
We note 14 instances of x=1, 32 instances of x=2, 16 of x=3, 15 of x=4, and 23 of x=5.
In[]:=
f=Sort@Tally[outcome]〚;;,2〛
Out[]=
{14,15,16,23,32}
We can multiply each xᵢ by its frequency fᵢ.
In[]:=
totals=Times@@@Sort@Tally[outcome]
Out[]=
{14,64,48,60,115}
The sum total of these products is 301.
In[]:=
Total[totals]
Out[]=
301
Dividing this sum total by the sample size returns the expected value (the mean in this case).
In[]:=
N@(Total[totals]/100)
Out[]=
3.01
The Mean function returns the same value.
In[]:=
N@Mean[outcome]
Out[]=
3.01
In the example below, we investigate whether the function ⟨u, v⟩ = 4u₁v₁ + 3u₂v₂, with u, v, w ∈ ℝ², satisfies the four axioms of an inner product space.
Since 4, 3, uᵢ, and vᵢ are all elements of the field of real numbers, we inherit the properties of addition and multiplication of a field and the symmetry axiom holds. Below in (5), we look at the additivity axiom. Note that the same weights apply to the inner product of any pair of the vectors.
$$\begin{aligned} \langle u + v, w \rangle &= 4(u_1 + v_1)w_1 + 3(u_2 + v_2)w_2 \\ &= 4u_1 w_1 + 4v_1 w_1 + 3u_2 w_2 + 3v_2 w_2 \\ &= (4u_1 w_1 + 3u_2 w_2) + (4v_1 w_1 + 3v_2 w_2) \\ &= \langle u, w \rangle + \langle v, w \rangle \end{aligned} \tag{5}$$

For the homogeneity axiom we again inherit the properties of the field of real numbers, (6).
$$\begin{aligned} \langle ku, v \rangle &= 4(ku_1)v_1 + 3(ku_2)v_2 \\ &= k(4u_1 v_1 + 3u_2 v_2) \\ &= k\langle u, v \rangle \end{aligned} \tag{6}$$

For the positivity axiom we also inherit the field properties, (7).
$$\langle u, u \rangle = 4u_1 u_1 + 3u_2 u_2 = 4u_1^2 + 3u_2^2 \ge 0 \tag{7}$$

Since 4, 3, u₁², and u₂² are all non-negative scalars, (7) will clearly only equal 0 if u is the zero vector.
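A quick symbolic check of these axioms for this inner product in Mathematica (the function name ip and the symbolic vectors are ours):
In[]:=
ip[u_,v_]:=4 u[[1]] v[[1]]+3 u[[2]] v[[2]];u={u1,u2};v={v1,v2};w={w1,w2};
(* symmetry, additivity, and homogeneity, each expanded symbolically *)
{ip[u,v]==ip[v,u],Expand[ip[u+v,w]]==Expand[ip[u,w]+ip[v,w]],Expand[ip[k u,v]]==Expand[k ip[u,v]]}
Out[]=
{True,True,True}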
8.3 Length and distance in inner product spaces
For u, v ∈ ℝ², we review the norm of a vector and the distance between two vectors in (8).

$$u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \quad v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$

$$\begin{aligned} u \cdot v &= v^{T}u = u_1 v_1 + u_2 v_2 \\ \lVert u \rVert &= \sqrt{u \cdot u} = \sqrt{u_1^2 + u_2^2} \\ d(u, v) &= \lVert u - v \rVert = \sqrt{(u - v) \cdot (u - v)} = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2} \end{aligned} \tag{8}$$

If we now consider the weighted inner product in (9), we will have a different norm and distance between the vectors.
$$\langle u, v \rangle = a u_1 v_1 + b u_2 v_2 \tag{9}$$

The norm-squared of u is shown in (10).
$$\lVert u \rVert^2 = a u_1 u_1 + b u_2 u_2 = a u_1^2 + b u_2^2 \tag{10}$$

For the distance between two vectors, we first define the difference as u − v = p, to make the derivation clear.
$$\begin{aligned} d(u, v) &= \sqrt{\langle u - v, u - v \rangle} = \sqrt{\langle p, p \rangle} = \sqrt{a p_1 p_1 + b p_2 p_2} \\ &= \sqrt{a p_1^2 + b p_2^2} = \sqrt{a(u_1 - v_1)^2 + b(u_2 - v_2)^2} \end{aligned} \tag{11}$$

So, for a concrete example, we look at (12).
$$u = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad v = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad a = 4, \quad b = 3$$

$$\lVert u \rVert^2 = 4(1)^2 + 3(0)^2 = 4, \quad \lVert u \rVert = 2$$

$$d(u, v) = \sqrt{4(1)^2 + 3(-1)^2} = \sqrt{4 + 3} = \sqrt{7} \tag{12}$$
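A quick numerical check of (12) in Mathematica (the helper names wNorm and wDist are ours):
In[]:=
wNorm[u_,{a_,b_}]:=Sqrt[a u[[1]]^2+b u[[2]]^2];wDist[u_,v_,w_]:=wNorm[u-v,w];
{wNorm[{1,0},{4,3}],wDist[{1,0},{0,1},{4,3}]}
Out[]=
{2,Sqrt[7]}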
8.4 Inner product spaces and matrices
The Euclidean inner product (dot product) and the weighted Euclidean inner product are special cases of a more general class of inner products on ℝⁿ, called the inner product on ℝⁿ generated by an n×n matrix A. It is defined in (13).

$$\langle u, v \rangle = Au \cdot Av \tag{13}$$

In a previous example, we defined the inner product ⟨u, v⟩ = au₁v₁ + bu₂v₂. This can be rewritten as in (14).
$$A = \begin{bmatrix} \sqrt{a} & 0 \\ 0 & \sqrt{b} \end{bmatrix}$$

$$\begin{aligned} \langle u, v \rangle &= \begin{bmatrix} \sqrt{a} & 0 \\ 0 & \sqrt{b} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \cdot \begin{bmatrix} \sqrt{a} & 0 \\ 0 & \sqrt{b} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \\ &= \begin{bmatrix} \sqrt{a}\,u_1 \\ \sqrt{b}\,u_2 \end{bmatrix} \cdot \begin{bmatrix} \sqrt{a}\,v_1 \\ \sqrt{b}\,v_2 \end{bmatrix} \\ &= a u_1 v_1 + b u_2 v_2 \end{aligned} \tag{14}$$

Since a and b are weights, we can rewrite them as w₁, w₂, …. This gives a general diagonal n×n matrix, A, shown in (15).
$$A = \begin{bmatrix} \sqrt{w_1} & 0 & \cdots & 0 \\ 0 & \sqrt{w_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{w_n} \end{bmatrix} \tag{15}$$

8.4.1 Inner product on M2×2
We define an inner product for two 2×2 matrices in (16).

$$U = \begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix}, \quad V = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix}, \quad \langle U, V \rangle = \operatorname{tr}\!\left(U^{T}V\right) \tag{16}$$

The result is shown in (17).
$$U^{T}V = \begin{bmatrix} u_1 & u_3 \\ u_2 & u_4 \end{bmatrix} \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix} = \begin{bmatrix} u_1 v_1 + u_3 v_3 & u_1 v_2 + u_3 v_4 \\ u_2 v_1 + u_4 v_3 & u_2 v_2 + u_4 v_4 \end{bmatrix}$$

$$\operatorname{tr}\!\left(U^{T}V\right) = u_1 v_1 + u_2 v_2 + u_3 v_3 + u_4 v_4 = \sum_{i=1}^{4} u_i v_i \tag{17}$$

We simply have elementwise multiplication followed by the sum of all the products. As an example, we consider the two matrices below.
In[]:=
A={{2,4},{3,4}};B={{-1,2},{3,2}};
MatrixForm[A]
MatrixForm[B]
Out[]//MatrixForm=
2	4
3	4
Out[]//MatrixForm=
-1	2
3	2
Elementwise multiplication is done by multiplying the matrices with the Times operator (A B), which multiplies corresponding entries.
Since A B returns a nested list, we can create a single list from it by using the Flatten function. The Total function will sum all the elements.
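A minimal sketch of that computation (the input cell is ours), which reproduces the trace inner product ⟨A, B⟩ = tr(AᵀB) for the two matrices above:
In[]:=
{Total[Flatten[A B]],Tr[Transpose[A].B]}
Out[]=
{23,23}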
8.4.2 Inner products on polynomials of degree 2
8.4.3 Inner product of functions in a closed interval
The norm of this inner product space is shown in (24).
The unit circle is lastly shown in (25).
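The definitions referenced in (24) and (25) are not reproduced here. Assuming the usual integral inner product ⟨f, g⟩ = ∫ₐᵇ f(x)g(x)dx on functions continuous over [a, b], a minimal sketch of the norm computation follows (the names innerProduct and norm are ours):
In[]:=
innerProduct[f_,g_,{a_,b_}]:=Integrate[f[x] g[x],{x,a,b}];norm[f_,{a_,b_}]:=Sqrt[innerProduct[f,f,{a,b}]];
norm[Cos,{0,Pi}]
Out[]=
Sqrt[Pi/2]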
8.5 Properties of inner product spaces
As an example, we consider (27).
Note that this is very similar to the multiplication we are familiar with in algebra.
8.6 Lengths, angles and orthogonality
Here, we recall the equation for the angle between two vectors, using the dot product, shown in (29). Zero vectors are excluded from the assignment of an angle between vectors.
An equation for the angle between vectors in an inner product space is shown in (30).
From the constraints on the cosine function, (31) must be satisfied.
The numerator must therefore be less than or equal to the denominator. To show that this is so, we use the Cauchy-Schwarz inequality, shown in (32).
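As a quick numerical illustration of the inequality, here using the Euclidean dot product with vectors of our own choosing:
In[]:=
u={1,2,3};v={-2,0,5};
Abs[u.v]<=Norm[u] Norm[v]
Out[]=
True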
8.6.1 Angle between vectors in inner product space
8.6.2 Orthogonality
While we have considered this in Euclidean space where the inner product was defined as the dot product, a generalization to inner product spaces allows us to consider orthogonality in a broader sense.
In the example below, we look at orthogonality of two square matrices, shown in (38).
We conclude that these two matrices are orthogonal.
For interest, we also calculate the norm of each, shown in (42).
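The matrices in (38) are not reproduced here, so as an illustrative stand-in (the matrices U and V below are ours), orthogonality under the trace inner product, and the corresponding norms, can be computed as follows:
In[]:=
U={{1,0},{0,-1}};V={{0,1},{1,0}};
(* <U,V> = tr(U^T V); a value of zero means the matrices are orthogonal *)
Tr[Transpose[U].V]
(* the norm of each matrix under the same inner product *)
{Sqrt[Tr[Transpose[U].U]],Sqrt[Tr[Transpose[V].V]]}
Out[]=
0
Out[]=
{Sqrt[2],Sqrt[2]}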
8.6.3 Pythagorean theorem
We can check on this using the inner product of the two vectors, (46).
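The vectors in (46) are not shown here; with a stand-in orthogonal pair of our own, the Pythagorean relation ‖u + v‖² = ‖u‖² + ‖v‖² can be checked as follows:
In[]:=
u={3,0};v={0,4};
{u.v,Norm[u+v]^2,Norm[u]^2+Norm[v]^2}
Out[]=
{0,25,25}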
8.6.4 Orthogonal complements
8.7 Null space and row space of matrices
We require vectors that span a space whose basis vectors are perpendicular to these four vectors. We are therefore interested in the transpose of this matrix.
The solution is shown in (52).
The basis for the space spanned is represented by the first three column vectors.
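A minimal sketch of this kind of computation with a stand-in matrix (the matrix A below is ours): every basis vector of the null space is orthogonal to every row of A.
In[]:=
A={{1,2,0,1},{0,1,1,0},{1,3,1,1}};
ns=NullSpace[A];
(* multiplying A by each null-space vector returns only zero vectors *)
Map[A.#&,ns]
Out[]=
{{0,0,0},{0,0,0}}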
8.8 Orthonormal basis
In inner product spaces, our aim is to work with orthogonal basis vectors, as this can simplify solutions. A set of orthogonal basis vectors requires all pairs of vectors to be orthogonal to each other.
We further reduce these orthogonal basis vectors to unit length, producing orthonormal vectors. A set of orthonormal vectors is one in which every vector is normalized and all pairs are orthogonal.
Let’s consider the set of vectors in (53) and assume that the Euclidean inner product is defined on them.
We normalize a vector by dividing it by its norm, as shown in ().
The Normalize function performs this task. We see all three as column vectors below.
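The vectors in (53) are not reproduced here; with stand-in vectors of our own, the normalization looks like this:
In[]:=
vecs={{2,0,0},{0,3,0},{0,0,1/2}};
Normalize/@vecs
Out[]=
{{1,0,0},{0,1,0},{0,0,1}}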
An orthonormal basis is then a basis of orthonormal vectors for an inner product space.
8.8.1 Expressing a vector by its orthonormal basis
They are mutually orthogonal, which we can check using the Euclidean dot product.
This is simply done using Gauss-Jordan elimination on the augmented matrix for this system.
The coordinate vector lists the coefficient of each basis vector in the linear combination that reproduces the original vector.
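A minimal sketch of that Gauss-Jordan approach with a stand-in basis and vector (both ours):
In[]:=
b1={1,0,0};b2={0,1,1};b3={0,-1,1};v={2,3,5};
(* augmented matrix [b1 b2 b3 | v], reduced to read off the coordinate vector {2, 4, 1} *)
RowReduce[Transpose[{b1,b2,b3,v}]]
Out[]=
{{1,0,0,2},{0,1,0,4},{0,0,1,1}}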
8.9 Projections
8.10 Gram-Schmidt orthonormalization
All non-zero finite-dimensional inner product spaces have orthonormal bases. This is a very important theorem. There is a sequence of steps that will turn a basis into an orthonormal basis. This sequence of steps relies on the projection of a vector onto another and considering the orthogonal component of that projection.
Let's check whether these two vectors are indeed orthogonal.
We now only need to normalize them to obtain an orthonormal basis.
In Figure 8.10.1 we see the three basis vectors in orange, blue and green.
They are indeed independent.
We can use Gauss-Jordan elimination on the matrix of coefficients, together with a visualization, to show that these vectors are mutually orthogonal.
Figure 8.10.3 visualizes the vectors.
In Figure 8.10.4 we visualize this orthonormal basis.
We can use the Orthogonalize function to calculate the orthonormal basis vectors.
While we used the dot product as our inner product in our example, this applies to any defined inner product space.
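A minimal sketch of the Orthogonalize call with stand-in basis vectors (ours); by default it uses the dot product, and a different inner product function can be supplied as a second argument:
In[]:=
basis={{1,1,0},{1,0,1},{0,1,1}};
Orthogonalize[basis]
Out[]=
{{1/Sqrt[2],1/Sqrt[2],0},{1/Sqrt[6],-(1/Sqrt[6]),Sqrt[2/3]},{-(1/Sqrt[3]),1/Sqrt[3],1/Sqrt[3]}}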
8.10.1 QR-decomposition
From matrix-matrix multiplication, we have (83).
The orthonormal basis is calculated using the Gram-Schmidt process, shown in (86).
Now we do QR-decomposition using the QRDecomposition function.
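A minimal sketch with a stand-in matrix (ours); note that QRDecomposition returns the orthonormal vectors as the rows of q, so the original matrix is recovered as Transpose[q].r:
In[]:=
A={{1,1},{0,1},{1,0}};
{q,r}=QRDecomposition[A];
Simplify[Transpose[q].r==A]
Out[]=
True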
8.11 Approximations
8.11.1 Least squares approximation of a linear system
We can solve this system using Gauss-Jordan elimination.
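A minimal sketch of this route on a small inconsistent system of our own: solving the normal equations AᵀAx = Aᵀb (which is what Gauss-Jordan elimination of the augmented normal equations produces) and comparing with the built-in LeastSquares function.
In[]:=
A={{1,1},{1,2},{1,3}};b={1,2,2};
x=LinearSolve[Transpose[A].A,Transpose[A].b]
LeastSquares[A,b]==x
Out[]=
{2/3,1/2}
Out[]=
True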
8.11.2 Simulated example data
Least squares approximation finds its use in building linear regression models. Feature variables make up the column vectors of a matrix, and a vector of actual observations serves as the target that must be predicted by the model. Coefficients for the linear approximation (a linear combination of the feature column vectors in the matrix) must be calculated so as to minimize the difference between the predicted vector and the vector of actual observations. There are usually many more observations (rows) than columns, so the column space of the feature matrix is a much smaller subspace and the target vector generally does not lie in it. Least squares finds the vector in the space spanned by the columns of the matrix that is closest to the target vector.
The first three columns are the features variables, including the column of ones.
The last column is the target vector.
These errors (also called residuals) are plotted below.
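A minimal sketch of this workflow with simulated data (the variable names and the simulation below are ours):
In[]:=
SeedRandom[12];
x1=RandomReal[{0,10},50];x2=RandomReal[{0,10},50];
y=2+0.5 x1-1.2 x2+RandomVariate[NormalDistribution[0,1],50];
X=Transpose[{ConstantArray[1,50],x1,x2}];(*design matrix with a column of ones*)
beta=LeastSquares[X,y];(*least squares coefficients*)
residuals=y-X.beta;
ListPlot[residuals,PlotLabel->"Residuals"]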
8.12 Change of basis
Depending on the problem at hand, different bases might be more appropriate and it is useful to be able to change a basis.
We now have the coordinate vector for the original basis, (97).
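A minimal sketch of a change of basis in ℝ² (the bases and the vector are ours): the columns of the transition matrix are the coordinate vectors of the old basis vectors relative to the new basis.
In[]:=
b1={1,1};b2={1,-1};(*old basis B*)
c1={1,0};c2={1,1};(*new basis C*)
P=Transpose[{LinearSolve[Transpose[{c1,c2}],b1],LinearSolve[Transpose[{c1,c2}],b2]}];
vB={2,3};(*coordinate vector relative to B*)
vC=P.vB
vB[[1]] b1+vB[[2]] b2==vC[[1]] c1+vC[[2]] c2(*both describe the same vector*)
Out[]=
{6,-1}
Out[]=
True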
8.13 Orthogonal matrices
Below, we check on the properties of orthogonal matrices.
The OrthogonalMatrixQ function returns True when a matrix is orthogonal.
We have seen matrices as operators on vectors. Consider the rotation matrix in ().
Now we investigate orthogonal matrices as operators. From the dot product, we recall (107).
We also recall (109) for dot products.
Now we can derive (110).
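A minimal sketch with a rotation matrix (the angle and vectors are ours): an orthogonal matrix satisfies QᵀQ = I and, acting as an operator, preserves dot products, and therefore lengths and angles.
In[]:=
Q=RotationMatrix[Pi/6];
OrthogonalMatrixQ[Q]
u={1,2};v={3,-1};
Simplify[(Q.u).(Q.v)]==u.v
Out[]=
True
Out[]=
True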
8.14 Positive definite matrices
The PositiveDefiniteMatrixQ function returns True if a given matrix is positive definite.
This is a positive definite matrix.
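A minimal sketch with a stand-in matrix (ours): a symmetric matrix is positive definite when xᵀAx > 0 for every nonzero x, which is equivalent to all of its eigenvalues being positive.
In[]:=
A={{2,-1,0},{-1,2,-1},{0,-1,2}};
{PositiveDefiniteMatrixQ[A],Eigenvalues[A]}
Out[]=
{True,{2+Sqrt[2],2,2-Sqrt[2]}}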