13 Linear Algebra in Julia
using LinearAlgebra
The LinearAlgebra package provides, among other things:
- additional subtypes of AbstractMatrix which are usable like other matrices, among them: Tridiagonal, SymTridiagonal, Symmetric, UpperTriangular
- additional/extended functions: norm, opnorm, cond, inv, det, exp, tr, dot, cross, …
- a universal solver for systems of linear equations: x = A \ b solves \(A \mathbf{x}=\mathbf{b}\) by an appropriate matrix factorization and forward/backward substitution (a short example follows this list)
- matrix factorizations: LU, QR, Cholesky, SVD, …
- Computation of eigenvalues/eigenvectors: eigen, eigvals, eigvecs
- Access to BLAS/LAPACK functions
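As a first taste, here is the universal solver in action, a minimal sketch (the matrix and right-hand side are chosen arbitrarily for illustration):
A = [2.0 1.0
     1.0 3.0]
b = [3.0, 5.0]
x = A \ b     # solves A*x = b via an appropriate factorization
A * x ≈ b     # true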
13.1 Special Matrix Types
A = SymTridiagonal(fill(1.0, 4), fill(-0.3, 3))
4×4 SymTridiagonal{Float64, Vector{Float64}}:
1.0 -0.3 ⋅ ⋅
-0.3 1.0 -0.3 ⋅
⋅ -0.3 1.0 -0.3
⋅ ⋅ -0.3 1.0
B = UpperTriangular(A)
4×4 UpperTriangular{Float64, SymTridiagonal{Float64, Vector{Float64}}}:
1.0 -0.3 0.0 0.0
⋅ 1.0 -0.3 0.0
⋅ ⋅ 1.0 -0.3
⋅ ⋅ ⋅ 1.0
These types are stored space-efficiently. The usual arithmetic operations are implemented:
A + B
4×4 Matrix{Float64}:
2.0 -0.6 0.0 0.0
-0.3 2.0 -0.6 0.0
0.0 -0.3 2.0 -0.6
0.0 0.0 -0.3 2.0
Read-only index access is possible:
A[1,4]
0.0
Write operations are not necessarily possible:
A[1,3] = 17
ArgumentError: cannot set off-diagonal entry (1, 3)
Stacktrace:
[1] setindex!(A::SymTridiagonal{Float64, Vector{Float64}}, x::Int64, i::Int64, j::Int64)
@ LinearAlgebra ~/.julia/juliaup/julia-1.12.5+0.x64.linux.gnu/share/julia/stdlib/v1.12/LinearAlgebra/src/tridiag.jl:486
[2] top-level scope
@ ~/Julia/Book26/JuliaBook/chapters/11_LinAlg.qmd:91
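Writes that preserve the special structure are allowed, though. For a SymTridiagonal matrix, for instance, diagonal entries can be set (a small sketch; the exact behavior may vary across LinearAlgebra versions):
C = copy(A)    # work on a copy so A stays unchanged
C[2,2] = 5.0   # setting a diagonal entry preserves the symmetric tridiagonal structure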
Conversion to a ‘normal’ matrix is possible using collect():
A2 = collect(A)
4×4 Matrix{Float64}:
1.0 -0.3 0.0 0.0
-0.3 1.0 -0.3 0.0
0.0 -0.3 1.0 -0.3
0.0 0.0 -0.3 1.0
13.1.1 The Identity Matrix I
I denotes an identity matrix (square, diagonal elements = 1, all others = 0) of the required size.
A + 4I
4×4 SymTridiagonal{Float64, Vector{Float64}}:
5.0 -0.3 ⋅ ⋅
-0.3 5.0 -0.3 ⋅
⋅ -0.3 5.0 -0.3
⋅ ⋅ -0.3 5.0
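Behind the scenes, I is not a stored matrix but a UniformScaling object that adapts to whatever size the expression requires; when an explicit matrix is needed, it can be materialized. A short sketch:
I isa UniformScaling   # true
Matrix(1.0I, 3, 3)     # explicit 3×3 identity matrix of Float64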
13.2 Norms
To study questions such as conditioning or convergence of an algorithm, we need a metric. For linear spaces, it is appropriate to define the metric via a norm: \[ d(x,y) := \lVert x-y\rVert \]
13.2.1 \(p\)-Norms
A simple class of norms in \(ℝ^n\) are the \(p\)-norms \[ \lVert \mathbf{x} \rVert_p = \left(\sum |x_i|^p\right)^\frac{1}{p}, \] which generalize the Euclidean norm \(p=2\).
Let \(x_{\text{max}}\) be the component of \(\mathbf{x}\in ℝ^n\) with the largest absolute value. Then always \[ \lvert x_{\text{max}}\rvert \le \lVert\mathbf{x}\rVert_p \le n^\frac{1}{p} \lvert x_{\text{max}}\rvert. \] (For the upper bound, compare \(\mathbf{x}\) with the vector whose components are all equal to \(x_{\text{max}}\); for the lower bound, with the vector whose components are all zero except for \(x_{\text{max}}\).)
It follows that \[ \lim_{p \rightarrow \infty} \lVert\mathbf{x}\rVert_p = \lvert x_{\text{max}}\rvert =: \lVert\mathbf{x}\rVert_\infty. \]
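This limit can be checked numerically with the norm function from LinearAlgebra (details follow below); the vector is chosen arbitrarily:
x = [-1.0, 2.0, 33.2]
[norm(x, p) for p in (2, 10, 100)]   # decreases toward 33.2
norm(x, Inf)                         # 33.2 = |x_max|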
In Julia, the LinearAlgebra package defines a function norm(v, p).
v = [3, 4]
w = [-1, 2, 33.2]
@show norm(v) norm(v, 2) norm(v, 1) norm(v, 4) norm(w, Inf);
norm(v) = 5.0
norm(v, 2) = 5.0
norm(v, 1) = 7.0
norm(v, 4) = 4.284572294953817
norm(w, Inf) = 33.2
- If the 2nd argument p is missing, p=2 is used.
- The 2nd argument can also be Inf (i.e., \(+\infty\)).
- The 1st argument can be any numerical container. The sum \(\sum \lvert x_i\rvert^p\) extends over all elements of the container.
- Thus, for a matrix, norm(A) is equal to the Frobenius norm of the matrix A.
A = [1 2 3
4 5 6
7 8 9]
norm(A)   # Frobenius norm
16.881943016134134
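Since the matrix is treated as a plain container of its entries, the Frobenius norm coincides with the 2-norm of the flattened matrix, which can be verified directly:
norm(A) ≈ norm(vec(A))         # true: 2-norm of the vectorized matrix
norm(A) ≈ sqrt(sum(abs2, A))   # true: √(Σᵢⱼ aᵢⱼ²)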
Since norms are homogeneous under multiplication with scalars, \(\lVert\lambda \mathbf{x}\rVert = |\lambda|\cdot\lVert\mathbf{x}\rVert\), they are completely determined by the specification of the unit ball. Subadditivity of the norm (triangle inequality) is equivalent to the convexity of the unit ball (code visible by clicking).
using Plots
colors = [:purple, :green, :red, :blue, :aqua, :black]
x = LinRange(-1, 1, 1000)
y = LinRange(-1, 1, 1000)
fig1 = plot()
for p ∈ (0.8, 1, 1.5, 2, 3.001, 1000)
    # the level set  p⋅‖(x,y)‖ₚ = p  is the boundary of the unit ball, labeled with p
    contour!(x, y, (x, y) -> p * norm([x, y], p), levels=[p], aspect_ratio=1,
             cbar=false, color=[pop!(colors)], contour_labels=true, ylim=[-1.1, 1.1])
end
fig1
Unit balls in \(ℝ^2\) for different \(p\)-norms: \(p\)=0.8; 1; 1.5; 2; 3.001 and 1000
We see that \(p\ge 1\) must hold for the unit ball to be convex and for \(\lVert\cdot\rVert_p\) to be a norm.
However, the Julia function norm(v, p) also works for parameters p<1.
13.2.2 Induced Norms (Operator Norms)
Matrices \(A\) represent linear mappings \(\mathbf{v}\mapsto A\mathbf{v}\). The matrix norm induced by a vector norm answers the question:
“By what factor can a vector be maximally stretched by the transformation \(A\)?”
Due to the homogeneity of the norm under multiplication with scalars, it is sufficient to consider the image of the unit ball under the transformation \(A\).
Let \(V\) be a vector space with dimension \(0<n<\infty\) and \(A\) an \(n\times n\)-matrix. Then \[ \lVert A\rVert_p = \max_{\lVert\mathbf{v}\rVert_p=1} \lVert A\mathbf{v}\rVert_p \]
Induced norms are difficult to calculate for general \(p\). Exceptions are the cases
- \(p=1\): column sum norm
- \(p=2\): spectral norm and
- \(p=\infty\): row sum norm
These 3 cases are implemented in Julia in the function opnorm(A, p) from the LinearAlgebra package, where again opnorm(A) = opnorm(A, 2) holds.
A = [ 0 1
1.2 1.5 ]
@show opnorm(A, 1) opnorm(A, Inf) opnorm(A, 2) opnorm(A);
opnorm(A, 1) = 2.5
opnorm(A, Inf) = 2.7
opnorm(A, 2) = 2.0879899930905124
opnorm(A) = 2.0879899930905124
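These values can be checked against the explicit formulas for the three special cases (the spectral norm via svdvals):
opnorm(A, 1)   ≈ maximum(sum(abs.(A), dims=1))   # maximum absolute column sum
opnorm(A, Inf) ≈ maximum(sum(abs.(A), dims=2))   # maximum absolute row sum
opnorm(A, 2)   ≈ maximum(svdvals(A))             # largest singular value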
The following picture shows the effect of \(A\) on unit vectors. Vectors of the same color are mapped onto each other. (Code visible by clicking):
using CairoMakie
A = [ 0   1
      1.2 1.5 ]
t = LinRange(0, 1, 100)
# parametrize the unit circle ...
xs = sin.(2π*t)
ys = cos.(2π*t)
# ... and compute its image under A
Axys = A * [xs, ys]
# twelve unit vectors and their images
u = [sin(n*π/6) for n = 0:11]
v = [cos(n*π/6) for n = 0:11]
x = y = zeros(12)
Auv = A * [u, v]
fig2 = Figure(size=(800, 400))
lines(fig2[1, 1], xs, ys, color=t, linewidth=5, colormap=:hsv,
      axis=(; aspect=1, limits=(-2, 2, -2, 2), title=L"$\mathbf{v}$", titlesize=30))
arrows2d!(fig2[1, 1], x, y, u, v, tipwidth=10, colormap=:hsv, shaftcolor=range(0, 11), shaftwidth=3)
Legend(fig2[1, 2], MarkerElement[], String[], L"⟹", width=40, height=30, titlesize=30, framevisible=false)
lines(fig2[1, 3], Axys[1], Axys[2], color=t, linewidth=5, colormap=:hsv,
      axis=(; aspect=1, limits=(-2, 2, -2, 2), title=L"$A\mathbf{v}$", titlesize=30))
arrows2d!(fig2[1, 3], x, y, Auv[1], Auv[2], tipwidth=10, colormap=:hsv, shaftcolor=range(0, 11),
          shaftwidth=3)
fig2
13.2.3 Condition Number
For p = 1, p = 2 (default) or p = Inf, cond(A,p) returns the condition number in the \(p\)-norm \[
\text{cond}_p(A) = \lVert A\rVert_p \cdot \lVert A^{-1}\rVert_p
\]
@show cond(A, 1) cond(A, 2) cond(A) cond(A, Inf);
cond(A, 1) = 5.625
cond(A, 2) = 3.633085176038432
cond(A) = 3.633085176038432
cond(A, Inf) = 5.625000000000001
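In the 2-norm the condition number equals the ratio of the largest to the smallest singular value, which can be verified with svdvals:
σ = svdvals(A)
cond(A) ≈ maximum(σ) / minimum(σ)   # true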
13.3 Matrix Factorizations
The basic tasks of numerical linear algebra:
- Solve a system of linear equations \(A\mathbf{x} = \mathbf{b}\).
- If no solution exists, find the best approximation, i.e., the vector \(\mathbf{x}\) that minimizes \(\lVert A\mathbf{x} - \mathbf{b}\rVert\).
- Find eigenvalues and eigenvectors \(A\mathbf{x} = \lambda \mathbf{x}\) of \(A\).
These tasks can be solved using matrix factorizations. Some fundamental matrix factorizations:
- LU decomposition \(A=L\cdot U\)
- factorizes a matrix as a product of a lower and an upper triangular matrix
- always works (possibly after row exchanges - pivoting)
- Cholesky decomposition \(A=L\cdot L^*\)
- the upper triangular matrix is the conjugate transpose of the lower,
- half the effort compared to LU
- only works if \(A\) is Hermitian and positive definite
- QR decomposition \(A=Q\cdot R\)
- decomposes \(A\) as a product of an orthogonal (or unitary in the complex case) matrix and an upper triangular matrix
- \(Q\) is length-preserving (rotations and/or reflections); the scalings are described by \(R\)
- always works
- SVD (Singular value decomposition): \(A = U\cdot D \cdot V^*\)
- \(U\) and \(V\) are orthogonal (or unitary), \(D\) is a diagonal matrix with entries on the diagonal \(σ_i\ge 0\), the so-called singular values of \(A\).
- Every linear transformation \(\mathbf{v} \mapsto A\mathbf{v}\) can thus be represented as a rotation (and/or reflection) \(V^*\), followed by a pure scaling \(v_i \mapsto \sigma_i v_i\) and another rotation \(U\).
13.3.1 LU Factorization
LU factorization is Gaussian elimination. The result of Gaussian elimination is the upper triangular matrix \(U\). The lower triangular matrix \(L\) contains ones on the diagonal and the non-diagonal entries \(l_{ij}\) are equal to minus the coefficients by which row \(Z_j\) is multiplied and added to row \(Z_i\) in Gaussian elimination:
\[ A=\left[ \begin{array}{ccc} 1 &2 &2 \\ 3 &-4& 4 \\ -2 & 1 & 5 \end{array}\right] ~ \begin{array}{c} ~\\ Z_2 \mapsto Z_2 \mathbin{\color{red}-}\textcolor{red}{3} Z_1\\ Z_3 \mapsto Z_3 + \textcolor{red}{2} Z_1 \end{array} \quad \Longrightarrow\ \left[ \begin{array}{ccc} 1 &2 &2 \\ &-10& -2 \\ & 5 & 9 \end{array}\right] ~ \begin{array}{c} ~\\ ~\\ Z_3 \mapsto Z_3 + \textcolor{red}{\frac{1}{2}} Z_2 \end{array} \quad \Longrightarrow\ \left[ \begin{array}{ccc} 1 &2 &2 \\ &-10& -2 \\ & & 8 \end{array}\right] = U \]
\[ A = \left[ \begin{array}{ccc} 1 && \\ \mathbin{\color{red}+}\textcolor{red}{3} &1 & \\ \mathbin{\color{red}-}\textcolor{red}{2} & \mathbin{\color{red}-}\textcolor{red}{\frac{1}{2}}& 1 \end{array} \right] \cdot \left[ \begin{array}{ccc} 1 &2 &2 \\ &-10& -2 \\ & & 8 \end{array}\right] \]
- In practice, it is often necessary to solve \(A\mathbf{x}=\mathbf{b}\) for a fixed matrix \(A\) and multiple right-hand sides \(\mathbf{b}\).
- The factorization of \(A\), which has a cubic complexity \(\sim n^3\) relative to the size \(n\) of the matrix, needs to be performed only once.
- For each subsequent right-hand side \(\mathbf{b}\), the forward and backward substitution steps have quadratic complexity \(\sim n^2\).
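A minimal sketch of this factor-once, solve-many pattern, using the lu function introduced below (matrix and right-hand sides chosen arbitrarily):
M  = [4.0 1.0
      1.0 3.0]
F  = lu(M)             # O(n³): factorization, performed once
x1 = F \ [1.0, 2.0]    # O(n²): forward/backward substitution per right-hand side
x2 = F \ [5.0, 6.0]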
The LinearAlgebra package of Julia contains the function lu(A, options) for calculating an LU decomposition:
A = [ 1 2 2
3 -4 4
-2 1 5]
L, U, _ = lu(A, NoPivot())
display(L)
display(U)
3×3 Matrix{Float64}:
1.0 0.0 0.0
3.0 1.0 0.0
-2.0 -0.5 1.0
3×3 Matrix{Float64}:
1.0 2.0 2.0
0.0 -10.0 -2.0
0.0 0.0 8.0
Pivoting
Let’s look at one step of Gaussian elimination:
\[ \left[ \begin{array}{cccccc} * &\cdots & * & * & \cdots & * \\ &\ddots & \vdots &\vdots && \vdots \\ && * & * &\cdots & * \\ && & \textcolor{red}{a_{ij}}&\cdots & a_{in}\\ && & \textcolor{blue}{a_{i+1,j}}&\cdots & a_{i+1,n}\\ &&& \textcolor{blue}{\vdots} &&\vdots \\ && & \textcolor{blue}{a_{mj}}&\cdots & a_{mn} \end{array} \right] \]
The goal is to make the entry \(a_{i+1,j}\) disappear by adding an appropriate multiple of row \(Z_i\) to row \(Z_{i+1}\). This only works if the pivot element \({\color{red}a_{ij}}\) is not zero. If \({\color{red}a_{ij}}=0\), we must exchange rows to fix this.
Furthermore, the conditioning of the algorithm is best if we arrange the matrix at each step so that the pivot element is the largest in absolute value in the corresponding column of the remaining submatrix. In (row) pivoting, rows are exchanged to ensure that
\[ |\textcolor{red}{a_{ij}}|=\max_{k=i,...,m} |\textcolor{blue}{a_{kj}}| \]
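A small sketch of what goes wrong without pivoting: for a matrix with a zero in the (1,1) position, unpivoted LU fails, while pivoted LU simply exchanges the rows. (check=false suppresses the exception so the failure can be inspected with issuccess.)
A0 = [0.0 1.0
      1.0 0.0]
F0 = lu(A0, NoPivot(); check=false)
issuccess(F0)   # false: zero pivot encountered
F1 = lu(A0)     # pivoting exchanges the two rows
F1.p            # [2, 1]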
LU in Julia
- The factorizations in Julia return a special object that contains the matrix factors and additional information.
- The Julia function lu(A) performs an LU factorization with pivoting.
F = lu(A)
typeof(F)
LU{Float64, Matrix{Float64}, Vector{Int64}}
Elements of the object:
@show F.L F.U F.p;
F.L = [1.0 0.0 0.0; 0.3333333333333333 1.0 0.0; -0.6666666666666666 -0.5 1.0]
F.U = [3.0 -4.0 4.0; 0.0 3.333333333333333 0.6666666666666667; 0.0 0.0 8.0]
F.p = [2, 1, 3]
One can also use an appropriate tuple on the left side:
L, U, p = lu(A);
p
3-element Vector{Int64}:
2
1
3
The permutation vector indicates how the rows of the matrix have been permuted. With the permutation matrix \(P\) corresponding to \(p\), it holds that \[ L\cdot U = P A.\] Julia's indirect-indexing syntax allows applying the row permutation with the notation A[p,:]:
display(A)
display(A[p,:])
display(L*U)
3×3 Matrix{Int64}:
1 2 2
3 -4 4
-2 1 5
3×3 Matrix{Int64}:
3 -4 4
1 2 2
-2 1 5
3×3 Matrix{Float64}:
3.0 -4.0 4.0
1.0 2.0 2.0
-2.0 1.0 5.0
Forward/backward substitution with an LU object is performed by the \ operator:
b = [1, 2, 3]
x = F \ b
3-element Vector{Float64}:
-0.10000000000000002
-0.012500000000000006
0.5625
Verification:
A * x - b
3-element Vector{Float64}:
0.0
0.0
0.0
In Julia, the \ operator hides a quite universal “matrix solver” which implicitly performs an appropriate matrix factorization:
A \ b
3-element Vector{Float64}:
-0.10000000000000002
-0.012500000000000006
0.5625
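The solver dispatches on the matrix type: for a triangular matrix, for example, \ skips the factorization entirely and goes straight to substitution. A small sketch:
Ut = UpperTriangular([2.0 1.0
                      0.0 3.0])
Ut \ [1.0, 6.0]   # [-0.5, 2.0], computed by backward substitution alone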
13.3.2 QR Decomposition
The function qr() returns a special QR object that contains the components \(Q\) and \(R\). The orthogonal (or unitary) matrix \(Q\) is in an optimized form. Conversion to a “normal” matrix is, as always, possible with collect() if needed.
F = qr(A)
@show typeof(F) typeof(F.Q)
display(collect(F.Q))
display(F.R)
typeof(F) = LinearAlgebra.QRCompactWY{Float64, Matrix{Float64}, Matrix{Float64}}
typeof(F.Q) = LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}
3×3 Matrix{Float64}:
-0.267261 0.872872 0.408248
-0.801784 -0.436436 0.408248
0.534522 -0.218218 0.816497
3×3 Matrix{Float64}:
-3.74166 3.20713 -1.06904
0.0 3.27327 -1.09109
0.0 0.0 6.53197
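A typical application of the QR decomposition is the linear least-squares problem: for an overdetermined system, the \ operator applied to a QR object returns the \(\mathbf{x}\) minimizing \(\lVert A\mathbf{x}-\mathbf{b}\rVert_2\). A sketch with arbitrarily chosen data:
B = [1.0 1.0
     1.0 2.0
     1.0 3.0]     # overdetermined: 3 equations, 2 unknowns
c = [1.0, 2.0, 2.0]
x = qr(B) \ c     # least-squares solution
norm(B*x - c)     # the minimized residual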
13.3.3 Appropriate Factorization
The function factorize() returns a factorization appropriate for the given matrix type; see documentation for details. This factorization can subsequently be utilized for multiple right-hand sides \(\mathbf{b_1}, \mathbf{b_2},\ldots\).
Af = factorize(A)
LU{Float64, Matrix{Float64}, Vector{Int64}}
L factor:
3×3 Matrix{Float64}:
1.0 0.0 0.0
0.333333 1.0 0.0
-0.666667 -0.5 1.0
U factor:
3×3 Matrix{Float64}:
3.0 -4.0 4.0
0.0 3.33333 0.666667
0.0 0.0 8.0
Af \ [1, 2, 3]
3-element Vector{Float64}:
-0.10000000000000002
-0.012500000000000006
0.5625
Af \ [5, 7, 9]
3-element Vector{Float64}:
0.4000000000000001
0.42500000000000004
1.875
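For special matrix types, factorize picks a correspondingly specialized factorization. For the symmetric tridiagonal matrix from the beginning of this chapter, for instance, it returns an LDLᵀ-type factorization rather than a generic LU (a sketch; the concrete type may depend on the Julia version):
S = SymTridiagonal(fill(1.0, 4), fill(-0.3, 3))
Sf = factorize(S)   # a factorization exploiting the tridiagonal structure
Sf \ ones(4)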