
Notebook for 2023-02-10

Block Gaussian elimination

The main topic for this lecture was thinking about Gaussian elimination in a blockwise fashion. In particular, we considered solving block 2-by-2 linear systems with matrices of the form

$$A = \begin{bmatrix} B & V \\ W^T & C \end{bmatrix}$$

A particular instance of this is solving the linear system

$$\begin{bmatrix} B & V \\ W^T & -I \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} d \\ 0 \end{bmatrix}$$

In terms of two block equations, this is

$$\begin{aligned} Bx + Vy &= d \\ W^T x - y &= 0 \end{aligned}$$

and substituting $y = W^T x$ (from the latter equation) into the former equation gives

$$(B + VW^T)x = d$$

Going the other way, we can eliminate the $x$ variables to get an equation just in $y$:

$$(-I-W^T B^{-1} V) y = -W^T B^{-1} d$$

and then substitute into $Bx = d-Vy$ to solve the system.
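Carrying this elimination through gives an explicit formula for the solution: writing $y = (I + W^T B^{-1} V)^{-1} W^T B^{-1} d$ and substituting back into $Bx = d - Vy$, we recover a form of the Sherman-Morrison-Woodbury formula:

$$x = B^{-1} d - B^{-1} V (I + W^T B^{-1} V)^{-1} W^T B^{-1} d$$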

Let's go ahead and illustrate how to use this to solve $(B + VW^T) x = d$ given a solver for systems with $B$.

using LinearAlgebra
# Solve the system (B+VW') x = d given a solver solveB(g) = B\g,
# by eliminating y from the bordered system [B V; W' -I] [x; y] = [d; 0]
function solve_bordered(solveB, V, W, d)
    invBV = solveB(V)        # B\V
    invBd = solveB(d)        # B\d
    S = I + W'*invBV         # (negated) Schur complement
    y = S\(W'*invBd)         # eliminate x, solve for y
    invBd - invBV*y          # x = B\(d - V*y)
end
function test_solve_bordered()
    B = rand(10,10)
    V = rand(10,2)
    W = rand(10,2)
    x = rand(10)
    d = B*x+V*(W'*x)
    solveB(y) = B\y
    err = x - solve_bordered(solveB, V, W, d)
    norm(err)/norm(x)
end
Running test_solve_bordered() gives a relative error near machine precision (e.g. 2.802295651434949e-15 in one run).

Iterative refinement

The trick of forming a bordered system is interesting in part because it allows us to take advantage of a fast solver for one system (from an existing factorization, for example) in order to solve a slightly modified system. But there's a cost! Even if the modified system is well-conditioned, poor conditioning in the reference problem (the $B$ matrix) can lead to quite bad results.

function test_sensitive_bordered()
    # Make B ill-conditioned (smallest singular value around 1e-12)
    s = ones(10)
    s[10] = 1e-12
    B = rand(10,10)*Diagonal(s)*rand(10,10)
    V = rand(10,2)
    W = rand(10,2)
    x = rand(10)
    d = B*x+V*(W'*x)
    solveB(y) = B\y
    err = x - solve_bordered(solveB, V, W, d)
    norm(err)/norm(x)
end
Running test_sensitive_bordered() now gives a much larger relative error (e.g. 0.15406594322491596 in one run).

A technique in the notes (but not discussed in lecture) can partly fix up this bad behavior. When we have a somewhat sloppy solver like the one here, but we are able to compute an accurate residual (and the problem we are solving is not terribly ill-conditioned), a few steps of iterative refinement can make a big difference.
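Concretely, if $\hat{A}^{-1}$ denotes the sloppy approximate solver for $A = B + VW^T$, iterative refinement repeats the update

$$x_{k+1} = x_k + \hat{A}^{-1} (d - A x_k)$$

so the error $e_k = x_k - x$ satisfies $e_{k+1} = (I - \hat{A}^{-1} A) e_k$, and shrinks geometrically as long as $\|I - \hat{A}^{-1} A\| < 1$.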

function test_itref()
    # Same ill-conditioned setup as before
    s = ones(10)
    s[10] = 1e-12
    B = rand(10,10)*Diagonal(s)*rand(10,10)
    V = rand(10,2)
    W = rand(10,2)
    x = rand(10)
    d = B*x+V*(W'*x)
    solveB(y) = B\y

    # Initial (sloppy) solve, then refine with residual corrections
    xx = solve_bordered(solveB, V, W, d)
    relerrs = zeros(5)
    relerrs[1] = norm(x-xx)/norm(x)
    for j = 2:5
        xx += solve_bordered(solveB, V, W, d-B*xx-V*(W'*xx))
        relerrs[j] = norm(x-xx)/norm(x)
    end
    relerrs
end

Sensitivity

We also spent a little time in this lecture talking about the sensitivity of linear systems to perturbations. In particular, we differentiated the equation

$$Ax = b$$

which, using variational notation, gave us

$$(\delta A) x + A (\delta x) = \delta b$$

In addition to being useful for error analysis, this formula is actually something you can use computationally, as we illustrate below.
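Solving the variational equation for $\delta x$ gives $\delta x = A^{-1}(\delta b - (\delta A) x)$, and taking norms yields the familiar condition number bound (to first order in the perturbations):

$$\frac{\|\delta x\|}{\|x\|} \leq \kappa(A) \left( \frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|} \right), \qquad \kappa(A) = \|A\| \, \|A^{-1}\|$$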

function test_perturbed()

    # Set up a test problem and directions to move A and b
    A = rand(10,10)
    δA = rand(10,10)
    b = rand(10)
    δb = rand(10)

    # Compute solutions to a reference and perturbed system
    h = 1e-8
    x = A\b
    xp = (A+h*δA)\(b+h*δb)   # perturbed solution

    # Compute the directional deriv δx and a finite difference estimate
    δx_est = (xp-x)/h
    δx = A\(δb-δA*x)

    # Compare the analytic derivative to the finite difference estimate
    norm(δx-δx_est)/norm(δx)
end
Running test_perturbed() shows the analytic derivative and the finite difference estimate agree to about $10^{-8}$ (e.g. 5.359429109831612e-8 in one run), consistent with the accuracy expected for a one-sided difference with $h = 10^{-8}$.