HW 4 for CS 4220
You may (and probably should) talk about the problems with each other, with the TAs, and with me, providing attribution for any good ideas you might get. Your final write-up should be your own.
1. Dear colleagues
A smooth function on $[-1, 1]$ can be approximated by a Chebyshev series of up to degree $d$ in $O(d^2)$ time using point values on a Chebyshev mesh (we can actually do this in $O(d \log d)$ using FFT tricks, but won't bother here):
$$\begin{align*} \hat{f}(x) &= \sum_{j=0}^d a_j T_j(x) \\ a_j &= \frac{2}{d+1} \sum_{i=0}^d f(x_i) T_j(x_i) \\ x_i &= \cos\left( \frac{\pi (i+1/2)}{d+1} \right) \end{align*}$$
(with the convention that the $j = 0$ coefficient gets half weight, i.e. $a_0 = \frac{1}{d+1} \sum_{i=0}^d f(x_i)$).
fit_cheb_series (generic function with 1 method)
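Since the code cell itself does not appear in this export, here is one possible sketch of `fit_cheb_series` implementing the formula above directly; the deliberately simple double loop gives the stated $O(d^2)$ cost.

```julia
# A possible sketch of fit_cheb_series (the notebook's code cell is not
# shown in this export).  Computes a_0, ..., a_d from mesh values, using
# T_j(x_i) = cos(j * acos(x_i)) = cos(j * θ_i) with θ_i = π(i+1/2)/(d+1).
function fit_cheb_series(f, d)
    θ = [π*(i+0.5)/(d+1) for i in 0:d]
    fx = f.(cos.(θ))                     # values on the Chebyshev mesh
    a = zeros(d+1)
    for j in 0:d
        s = sum(fx[i+1]*cos(j*θ[i+1]) for i in 0:d)
        a[j+1] = (j == 0 ? 1 : 2)/(d+1) * s   # j = 0 term gets half weight
    end
    return a
end
```

The fit is exact for polynomials of degree at most $d$, which gives a quick sanity check: fitting $T_2(x) = 2x^2 - 1$ should recover a coefficient vector with a one in the $T_2$ slot and zeros elsewhere.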
It is convenient to have a corresponding function to evaluate a Chebyshev series, too.
eval_cheb_series (generic function with 1 method)
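Again, the cell is not shown here; one standard way to write `eval_cheb_series` is Clenshaw's recurrence, which evaluates the series stably in $O(d)$ time:

```julia
# A possible sketch of eval_cheb_series (cell not shown): Clenshaw's
# recurrence b_k = a_k + 2x b_{k+1} - b_{k+2}, with result a_0 + x b_1 - b_2.
function eval_cheb_series(a, x)
    bk1, bk2 = 0.0, 0.0
    for j = length(a):-1:2
        bk1, bk2 = a[j] + 2x*bk1 - bk2, bk1
    end
    return a[1] + x*bk1 - bk2
end
```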
A nice visual check is to compare $\hat{f}(x)$ and $f(x)$ for some smooth test function.
The colleague matrix associated with $\hat{f}(x)$ is the $d$-by-$d$ real matrix
$$C_T = \frac{1}{2} (B - e_1 v^T)$$
where
$$B = \begin{bmatrix} 0 & 1 \\ 1 & 0 & 1 \\ & 1 & 0 & 1 \\ & & \ddots & \ddots & \ddots \\ & & & 1 & 0 & 1 \\ & & & & 2 & 0 \end{bmatrix}$$
and $v^T = \begin{bmatrix} a_{d-1} & a_{d-2} & \ldots & a_0 \end{bmatrix}/a_d$ is the row vector of coefficients for $\hat{f}(x)/a_d$, save for the highest-order coefficient which has been scaled to 1 – we will assume $a_d \neq 0$.
Let $T(x)$ denote the length $d$ vector with entries $T_{d-1}(x)$ through $T_{0}(x)$. We claim that $\hat{f}(x) = 0$ iff $(C_T - xI) T(x) = 0$, i.e. $T(x)$ is an eigenvector of $C_T$ associated with the eigenvalue $x$. If $f(x)$ is well approximated by $\hat{f}(x)$ on $[-1,1]$, this gives a good way of estimating the zeros of $f(x)$ on $[-1,1]$ as well.
Questions
Why is $(C_T - xI) T(x) = 0$ when $\hat{f}(x) = 0$?
Write a function
root_cheb_series(a)
that computes all the real zeros of $\hat{f}(x)$ in $[-1, 1]$ by solving an eigenvalue problem with the colleague matrix. Note that the Julia eigvals routine computes the eigenvalues of a matrix.
Hint for 1: Note that $(B/2-xI) T(x) = -e_1 T_d(x)/2$ and $v^T T(x) = \hat{f}(x)/a_d - T_d(x)$. Conclude that in general $(C_T-xI) T(x) = -e_1 \hat{f}(x)/(2a_d)$.
Hint for 2: You may want to test using the example function above, which has known zeros at $x_j = \log(j\pi/5)$ for $j = 1, 2, 3, 4$. Assuming that the real zeros are returned in sorted order, you should get relative errors below $10^{-6}$ for all the computed roots reported with the disabled code snippet below.
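One possible sketch along the lines of the hints is below. This is not the only reasonable implementation: it assumes $d \geq 2$, and the tolerance used to discard eigenvalues with nonzero imaginary part is a judgment call.

```julia
using LinearAlgebra

# One possible sketch following the hints (assumes d ≥ 2 and a_d ≠ 0;
# the 1e-8 tolerance for discarding complex eigenvalues is a judgment call).
function root_cheb_series(a)
    d = length(a) - 1
    C = zeros(d, d)
    for k = 1:d-1                # C = B/2 first ...
        C[k, k+1] = 0.5
        C[k+1, k] = 0.5
    end
    C[d, d-1] = 1.0              # the "2" entry of B, halved
    v = a[d:-1:1] ./ a[d+1]      # v = [a_{d-1}, ..., a_0] / a_d
    C[1, :] .-= v ./ 2           # ... then subtract e_1 v^T / 2
    λ = eigvals(C)
    return sort!([real(z) for z in λ if abs(imag(z)) < 1e-8 && -1 ≤ real(z) ≤ 1])
end
```

A quick check: the coefficient vector for $T_2$ alone should give back the two zeros $\pm 1/\sqrt{2}$ of $T_2$.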
2. Twirling, twirling
Suppose $A$ is a real matrix with a complex conjugate pair of eigenvalues $\lambda_{\pm} = \alpha \pm i\beta$ and corresponding eigenvectors $z_{\pm} = x \pm iy$.
Show that
$$A \begin{bmatrix} x & y \end{bmatrix} = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \alpha & \beta \\ -\beta & \alpha \end{bmatrix}$$
Writing the polar form $\alpha + i \beta = \rho \exp(i\theta)$, observe that
$$A^k \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \rho^k \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} \cos(k\theta) & \sin(k\theta) \\ -\sin(k\theta) & \cos(k\theta) \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}$$
Note: This shows that if $A$ has a complex conjugate pair of eigenvalues of largest magnitude (and all other eigenvalues are smaller in magnitude), then iterates of the power method will asymptotically "spin around" by an angle of $\theta$ at each step in a two-dimensional space spanned by the real and imaginary parts of the associated eigenvector.
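The first identity is easy to spot-check numerically; here is a small example (the particular matrix $A$ below is an arbitrary choice with a complex conjugate eigenvalue pair):

```julia
using LinearAlgebra

# Numerical spot check of A*[x y] = [x y]*[α β; -β α] on a small example.
A = [1.0 2.0; -3.0 1.5]
λ, V = eigen(A)
k = findfirst(imag.(λ) .> 0)           # pick λ = α + iβ with β > 0
α, β = real(λ[k]), imag(λ[k])
x, y = real.(V[:, k]), imag.(V[:, k])  # real and imaginary eigenvector parts
R = [α β; -β α]
resid = norm(A*[x y] - [x y]*R)        # should be ~machine epsilon
```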
3. Bidiagonal Tikhonov
In the lecture notes, we claimed that solving an upper bidiagonal system with Tikhonov regularization in $O(n)$ time was "left as an exercise for the student." This is perhaps an overly difficult exercise, so here we provide a routine tik_bd_solve
that solves the system $(B^T B + \sigma^2 I) x = B^T f$ where $B$ is a square upper bidiagonal matrix with diagonal entries $d_1, \ldots, d_n$ and superdiagonal entries $e_1, \ldots, e_{n-1}$, i.e.
$$B = \begin{bmatrix} d_1 & e_1 \\ & d_2 & e_2 \\ && \ddots & \ddots \\ &&& d_{n-1} & e_{n-1} \\ &&&& d_n \end{bmatrix}$$
tik_bd_solve (generic function with 1 method)
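The cell defining tik_bd_solve is not shown in this export. Here is one possible $O(n)$ sketch (the signature, taking a Julia `Bidiagonal` in `:U` form, is an assumption). It forms the tridiagonal normal equations directly; a more robust routine would instead apply Givens rotations to the stacked matrix $[B; \sigma I]$, which avoids squaring the condition number.

```julia
using LinearAlgebra

# A possible O(n) sketch of tik_bd_solve (notebook cell not shown;
# signature is a guess).  Forms the symmetric tridiagonal system
# (B'B + σ²I) x = B'f directly and solves it with an O(n) factorization.
function tik_bd_solve(B::Bidiagonal, f, σ)
    d, e = B.dv, B.ev                    # diagonal d_i, superdiagonal e_i
    dv = d.^2 .+ σ^2                     # diagonal of B'B + σ²I
    dv[2:end] .+= e.^2
    ev = d[1:end-1] .* e                 # superdiagonal of B'B
    rhs = d .* f                         # rhs = B'f
    rhs[2:end] .+= e .* f[1:end-1]
    return SymTridiagonal(dv, ev) \ rhs  # O(n) tridiagonal solve
end
```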
We can illustrate the behavior of the solver with a small example:
0.0
-1.8735e-16
-3.33067e-16
-1.11022e-16
Julia provides a wrapper to the LAPACK routine dgebrd
(or the single precision equivalent), which computes a bidiagonal reduction in packed form. We provide a slightly gentler wrapper that computes $Q^T A P = B$ and provides routines to apply $Q$, $Q^T$, $P$, and $P^T$ as well as the diagonal and superdiagonal elements of $B$ packed as a bidiagonal matrix.
bidiagonal_reduction (generic function with 1 method)
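For readers without the notebook cells, here is a simplified dense sketch of the same Golub-Kahan reduction. Unlike the LAPACK-based wrapper described above, it forms $Q$ and $P$ explicitly (so it costs $O(m^3)$ and is for illustration only, not production use); it assumes $m \geq n$.

```julia
using LinearAlgebra

# Illustration only: dense Golub-Kahan bidiagonalization via explicit
# Householder reflectors, so that Q' * A * P = B is upper bidiagonal.
# The course wrapper keeps Q and P in implicit (packed) form instead.
function householder(x)
    v = copy(float(x))
    v[1] += (v[1] >= 0 ? 1 : -1) * norm(x)   # sign choice avoids cancellation
    nv = norm(v)
    nv == 0 && return Matrix{Float64}(I, length(x), length(x))
    v ./= nv
    return I - 2*v*v'
end

function bidiagonalize(A)
    m, n = size(A)
    B = float(copy(A))
    Q = Matrix{Float64}(I, m, m)
    P = Matrix{Float64}(I, n, n)
    for k = 1:n
        H = householder(B[k:m, k])        # zero B[k+1:m, k] from the left
        B[k:m, :] = H * B[k:m, :]
        Q[:, k:m] = Q[:, k:m] * H
        if k < n - 1
            H = householder(B[k, k+1:n])  # zero B[k, k+2:n] from the right
            B[:, k+1:n] = B[:, k+1:n] * H
            P[:, k+1:n] = P[:, k+1:n] * H
        end
    end
    return Q, B, P
end
```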
It's always good to have a sanity check; here's one for our routine above.
2.945754565036118e-15
Your job is to
Write a routine
bd_solve(A, b, σ)
that solves the Tikhonov-regularized least squares problem by bidiagonal reduction followed by a call to tik_bd_solve.
Write a routine
lcurve(A, b, σs)
that computes the solution norm $\|x(\sigma)\|$ and the residual norm $\|r(\sigma)\|$ for each $\sigma$ in the vector σs. Your code will take $O(mn^2)$ time for the initial bidiagonal reduction, but then should take $O(n)$ time per $\sigma$.
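As an independent cross-check (not a substitute for the bidiagonal-based routine the problem asks for), the same two norms can be computed from the thin SVD, also with $O(mn^2)$ setup and $O(n)$ work per $\sigma$. The function name here is hypothetical.

```julia
using LinearAlgebra

# Hypothetical SVD-based reference for checking lcurve: computes ‖x(σ)‖
# and ‖r(σ)‖ via the Tikhonov filter factors s_i²/(s_i² + σ²).
# Assumes A has full column rank (no zero singular values).
function lcurve_svd(A, b, σs)
    F = svd(A)                                # thin SVD, O(mn^2)
    β = F.U'*b
    s = F.S
    r0sq = max(norm(b)^2 - norm(β)^2, 0.0)    # residual outside range(A)
    xnorms = similar(σs); rnorms = similar(σs)
    for (k, σ) in enumerate(σs)
        φ = s.^2 ./ (s.^2 .+ σ^2)             # Tikhonov filter factors
        xnorms[k] = norm(φ .* β ./ s)         # ‖x(σ)‖
        rnorms[k] = sqrt(norm((1 .- φ) .* β)^2 + r0sq)   # ‖r(σ)‖
    end
    return xnorms, rnorms
end
```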
You may use the following tester codes to check your work.