Final exam
You should be able to solve this exam using only the course notes and previous assignments, but you are welcome to consult any resource you wish except for people outside the course staff. Provide attribution (a citation or link) for any ideas you get. Your final write-up should be your own.
You should do problem 1, and any four out of the remaining five problems. Please indicate which four of the remaining five problems you want graded (we will not grade all five and take the best).
Snacks
(1 pt) Complete the course evaluation, if you have not already done so!
(1 pt) Give an example of a continuous function $f$ with a zero in $[0,1]$ where bisection starting from the interval $[0,1]$ will fail. Explain.
(1 pt) Give an example 1D optimization problem where Newton iteration converges linearly (not quadratically) to a minimum. Explain.
(1 pt) Given
F = cholesky(A)
for spd $A \in \mathbb{R}^{n \times n}$, give a Julia code fragment to evaluate $e_n^T A^{-1} e_n = (A^{-1})_{nn}$ in $O(1)$ time.

(1 pt) Using Newton iteration, solve the simultaneous equations $x^2 + xy^2 = 9$ and $3x^2 y - y^3 = 4$. You may use the initial guess $(x,y) = (1,1)$. Report your results to at least ten digits.
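For the Cholesky snack, one possible approach (a sketch, not the official answer; the helper name `invAnn` is made up) uses the fact that only the $(n,n)$ entry of the triangular factor is needed:

```julia
using LinearAlgebra

# Sketch: with A = RᵀR for upper-triangular R = F.U, we have
# (A⁻¹)ₙₙ = eₙᵀ R⁻¹ R⁻ᵀ eₙ = ‖R⁻ᵀ eₙ‖².  Because Rᵀ is lower triangular,
# forward substitution on Rᵀx = eₙ gives x = eₙ / Rₙₙ, so
# (A⁻¹)ₙₙ = 1 / R[n,n]² -- a single scalar lookup, hence O(1).
function invAnn(F)
    n = size(F, 1)
    return 1 / F.U[n, n]^2
end
```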
Answer
1.0
1.0
1.0
Interesting iterations
Consider the function $w(s)$ defined by the scalar equation
$$we^w = s.$$
(2 points) Argue that the equation has a unique solution for any $s > 0$.
(2 points) By manipulating the equation and some clever uses of Taylor's theorem with remainder, we get the upper and lower bounds $\log((1+\sqrt{1+4s})/2) \leq w \leq \log(1+s)$. For $s \ll 1$, both these expressions have large relative errors. Rewrite each expression accurately using `log1p`, which evaluates $\log(1+z)$ accurately when $z \ll 1$.

(2 points) Consider the iteration $w_{k+1} = s \exp(-w_k)$. Derive an error iteration and analyze its convergence. For what range of $s$ values does the iteration converge?
(2 points) Consider the fixed point iteration $w_{k+1} = G(w_k)$ where $G(w) = s(1+w)/(s+e^w)$. The solution to $we^w = s$ is a fixed point of $G$ (you do not need to show this). Argue that this iteration converges quadratically from good enough initial guesses. In fact, it converges for any initial guess in $[0,\infty)$; you do not have to prove this.
(2 points) Complete the function below to evaluate $w(s)$ and $w'(s)$ for a given $s$. Report values of $w(1)$ and $w'(1)$ to at least ten digits.
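A minimal sketch of how the pieces fit together (the name `compute_w_sketch` is hypothetical, and this is not the graded completion): iterate $G$ to convergence, then obtain $w'(s)$ by implicit differentiation of $we^w = s$, which gives $e^w(1+w)\,w'(s) = 1$.

```julia
# Iterate w_{k+1} = G(w_k) = s(1+w_k)/(s + e^{w_k}) from the upper bound
# w_0 = log1p(s); on convergence, w'(s) = 1/(e^w (1+w)).
function compute_w_sketch(s; maxiter=100, rtol=1e-15)
    w = log1p(s)
    for _ in 1:maxiter
        wnew = s*(1 + w)/(s + exp(w))
        converged = isapprox(wnew, w; rtol=rtol)
        w = wnew
        converged && break
    end
    dw = 1/(exp(w)*(1 + w))
    return w, dw
end
```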
Answer
bracket_w (generic function with 1 method)
compute_w (generic function with 1 method)
We provide sanity checks for the bracketing behavior at small $s$ and for the correctness of the derivative calculation.
Bracket FAILED: 1.0e-16 ∈ [ 0.0, 0.0 ]
Relative error in dw: Inf
Quirky quadratics
Consider the almost-quadratic optimization problem of minimizing (or finding a stationary point of)
$$\phi(x) = \frac{1}{2} x^T H x - x^T d + g(c^T x).$$
Here $H \in \mathbb{R}^{n \times n}$, $x, c, d \in \mathbb{R}^n$, and $g : \mathbb{R} \rightarrow \mathbb{R}$.
(2 points) Write an expression for $\nabla \phi$.
(2 points) Show that any stationary point satisfies $x = H^{-1} (d + \gamma c)$ for some scalar $\gamma$.
(2 points) Show $\gamma + g'(\alpha + \gamma \beta) = 0$ for some $\alpha$ and $\beta$.
(4 points) Using part 3, complete the Newton solver below to compute $\gamma$ and hence to compute a stationary point of $\phi$. For the given test case, report the computed $\gamma$ to at least ten digits. For full credit, you should not factor $H$ more than once, and you should use a minimal number of linear solves with $H$.
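A hedged sketch of the strategy part 4 describes, on a made-up test problem ($g$, $H$, $c$, $d$, and the names `gamma_newton`, `dg`, `d2g` are all assumptions, not the exam's test case): factor $H$ once, precompute $\alpha = c^T H^{-1} d$ and $\beta = c^T H^{-1} c$ with two solves, then run scalar Newton on $f(\gamma) = \gamma + g'(\alpha + \gamma\beta)$.

```julia
using LinearAlgebra

dg(z)  = z^3       # hypothetical g(z) = z^4/4
d2g(z) = 3z^2

function gamma_newton(H, c, d; γ0=0.0, maxiter=50, tol=1e-14)
    F = cholesky(Symmetric(H))   # single factorization of H
    α = dot(c, F\d)              # linear solve 1
    β = dot(c, F\c)              # linear solve 2
    γ = γ0
    for _ in 1:maxiter
        f  = γ + dg(α + γ*β)
        df = 1 + β*d2g(α + γ*β)
        γ -= f/df
        abs(f) < tol && break
    end
    x = F\(d + γ*c)              # linear solve 3: the stationary point
    return γ, x
end
```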
Answer
p3newton (generic function with 1 method)
Block GS
Consider the linear system $Lz = h$ with block structure
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} f \\ g \end{bmatrix}$$
where $A \in \mathbb{R}^{n \times n}$ and $D \in \mathbb{R}^{m \times m}$ are invertible. The block Gauss-Seidel iteration is
$$\begin{align*} Ax^{k+1} &= f - B y^k \\ Dy^{k+1} &= g - C x^{k+1}. \end{align*}$$
(2 points) Write the splitting $L = M-N$ associated with this iteration.
(2 points) Write $R = M^{-1} N$ in terms of the component matrices $A, B, C, D$.
(3 points) Argue that $\rho(R) = \rho(D^{-1} C A^{-1} B)$ where $\rho$ denotes the spectral radius. You may use without argument that the eigenvalues of a block triangular matrix are equal to the eigenvalues of its diagonal blocks.
(3 points) Now consider the nonlinear block Gauss-Seidel iteration $Ax^{k+1} = u(y^k)$ and $D y^{k+1} = w(x^{k+1})$, where $u : \mathbb{R}^m \rightarrow \mathbb{R}^n$ and $w : \mathbb{R}^n \rightarrow \mathbb{R}^m$ satisfy $\|u'\|_2 \leq M_u$ and $\|w'\|_2 \leq M_w$ everywhere. For convenience, we can also write $y^{k+1} = D^{-1} w(A^{-1} u(y^k))$ (a similar iteration relates $x^{k+1}$ to $x^k$ alone). Argue that the $y^k$ iteration converges if $M_u M_w < \sigma_{\min}(A) \sigma_{\min}(D)$.
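For concreteness, the linear block sweep can be sketched as below (a minimal illustration on made-up blocks, not part of the exam); each sweep solves with $A$, then with $D$ using the fresh $x^{k+1}$.

```julia
using LinearAlgebra

# One block Gauss-Seidel sweep per loop iteration:
#   A x^{k+1} = f - B y^k,   D y^{k+1} = g - C x^{k+1}.
# Converges when ρ(D⁻¹ C A⁻¹ B) < 1.
function block_gs(A, B, C, D, f, g; iters=50)
    x = zero(f); y = zero(g)
    for _ in 1:iters
        x = A \ (f - B*y)
        y = D \ (g - C*x)
    end
    return x, y
end
```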
Answer
Lolling linkages
We consider an optimization problem inspired by the equilibrium behavior of a chain of four unit-length rigid beams connected by pivot joints, where the position of the end beams is constrained. Leaving aside the physical model, we have the constrained optimization problem
$$\mbox{minimize } 3 \sin(\theta_1) + \sin(\theta_2) \mbox{ s.t. } \cos(\theta_1) + \cos(\theta_2) = 2 - \delta.$$
(2 points) Argue from Taylor expansion of $\cos(\theta) = \sqrt{1-\sin(\theta)^2}$ that $\cos(\theta) \approx 1 - \frac{1}{2} \sin(\theta)^2$. Hence for small $\delta$, the constraint is approximately $\sin(\theta_1)^2 + \sin(\theta_2)^2 = 2 \delta$. For what values of $\sin(\theta_1)$ and $\sin(\theta_2)$ satisfying this constraint do we minimize the objective?
(4 points) Starting from an initial guess derived from the estimate in the previous problem, write a Newton iteration to find the optimum angles. What are the angles for $\delta = 10^{-2}$, $\delta = 10^{-1}$, and $\delta = 0.25$? Give a semilog convergence plot for the $\delta = 0.25$ case.
(4 points) Assuming the above constrained optimization problem is satisfied, write a Newton solver to find $\delta$ given the additional equation $y = \sin(\theta_1) + \sin(\theta_2)$.
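One way to set up the part-2 Newton iteration (a hedged sketch, not the official `linkage_θ`; the name `linkage_newton` is made up) is to apply Newton to the stationarity conditions of the Lagrangian $L(\theta_1, \theta_2, \mu) = 3\sin(\theta_1) + \sin(\theta_2) + \mu(\cos(\theta_1) + \cos(\theta_2) - (2-\delta))$, seeded with the small-$\delta$ estimate from part 1:

```julia
using LinearAlgebra

function linkage_newton(δ; maxiter=50, tol=1e-12)
    t = sqrt(δ/5)                       # part-1 estimate: sinθ₁ ≈ -3t, sinθ₂ ≈ -t
    z = [asin(-3t), asin(-t), -1/t]     # initial (θ₁, θ₂, μ)
    for _ in 1:maxiter
        θ1, θ2, μ = z
        F = [3cos(θ1) - μ*sin(θ1),      # ∂L/∂θ₁
             cos(θ2) - μ*sin(θ2),       # ∂L/∂θ₂
             cos(θ1) + cos(θ2) - (2-δ)] # constraint
        J = [-3sin(θ1)-μ*cos(θ1)  0.0                 -sin(θ1);
             0.0                  -sin(θ2)-μ*cos(θ2)  -sin(θ2);
             -sin(θ1)             -sin(θ2)             0.0]
        z -= J \ F
        norm(F) < tol && break
    end
    return z
end
```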
Answer
linkage_θ (generic function with 1 method)
linkage_δ (generic function with 1 method)
We provide a consistency check for part 3.
We also provide the code to report the numbers and plots that we want!
Double trouble
Consider the matrix-valued function $G(s) = A+sB$ where $G : [0,1] \rightarrow \mathbb{R}^{n \times n}$. We give a specific case below. In our example, as $s$ moves from zero to one, a pair of real eigenvalues "collide" and become a complex conjugate pair. We are interested in locating this transition. Your task is to complete several analysis codes that can help.
(3 points) Write a bisection routine to approximately compute the $s$ where we go from all real eigenvalues to some complex eigenvalues. Resolve $s$ to at least three digits (bracketing interval of length $10^{-3}$).
(3 points) Complete the pseudo-arclength continuation code to trace $\lambda(\gamma)$ vs $s(\gamma)$ along the curve $(v(\gamma), \lambda(\gamma), s(\gamma))$ defined by $G(s) v = \lambda v$ and $v^T v = 1$. Plot the resulting curve. What is the maximum value of $s$ you see?
(4 points) Write a Gauss-Newton iteration to solve the nonlinear equations $(G(s)-\lambda I)^2 V = 0$ and $V_0^T V = I$, which characterize a point at which we have a double eigenvalue. The unknowns in this problem are $s$, $\lambda$, and $V \in \mathbb{R}^{n \times 2}$; note that we have $2n+4$ equations in $2n+2$ unknowns. Nevertheless, this overdetermined system is solvable, and Gauss-Newton should converge to it quadratically. Demonstrate quadratic convergence with a convergence plot. Use an initial guess of $s = 0.25$, $\lambda = 3$, and $V = V_0$ computed by the previous equation.
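The part-1 bisection can be sketched as follows on a made-up $2 \times 2$ example (the exam's $A$ and $B$ are given in the notebook; `allreal` and `collision_bisect` are hypothetical names). The bracketing indicator is whether the spectrum of $A + sB$ is numerically real:

```julia
using LinearAlgebra

# All eigenvalues real iff the largest imaginary part is (numerically) zero.
allreal(A, B, s) = maximum(abs.(imag.(eigvals(A + s*B)))) < 1e-12

# Bisect on the real/complex indicator to bracket the collision point.
function collision_bisect(A, B; lo=0.0, hi=1.0, tol=1e-3)
    @assert allreal(A, B, lo) && !allreal(A, B, hi)
    while hi - lo > tol
        mid = (lo + hi)/2
        allreal(A, B, mid) ? (lo = mid) : (hi = mid)
    end
    return lo, hi
end
```

With $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 0 \\ -2 & 0 \end{bmatrix}$, the eigenvalues are $\pm\sqrt{1-2s}$, so the bracket closes on $s = 1/2$.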
Answer
p6bisection (generic function with 1 method)
p6continuation (generic function with 1 method)
p6gnsolver (generic function with 1 method)
p6subspace (generic function with 1 method)
Bisection final interval: [0.0, 1.0]
Maximum $s$ in continuation: 0.0
Gauss-Newton converged to s = 0.25, λ = 3.0