Notebook for 2022-03-27
This lecture is mostly to remind you about some relevant calculus – but it is helpful to be able to sanity check your calculus numerically, so let's do a notebook of examples and finite difference checks.
Directional derivatives and finite differences
Let's consider a concrete example of $f : \mathbb{R}^2 \rightarrow \mathbb{R}^2$:
$$f(x) = \begin{bmatrix} x_1 + 2x_2 - 2 \\ x_1^2 + 4 x_2^2 - 4 \end{bmatrix}$$
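The defining cell is not shown here; a minimal sketch consistent with the output below (the name `ftest` comes from that output):

```julia
# f : R^2 -> R^2 from the formula above
ftest(x) = [x[1] + 2x[2] - 2;
            x[1]^2 + 4x[2]^2 - 4]
```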
ftest (generic function with 1 method)
Now define $g(s) = f(x_0+su)$ for some randomly chosen $x_0$ and direction $u$
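A sketch of what that cell might look like; the particular values of `x0` and `u` are whatever `randn` happens to return:

```julia
ftest(x) = [x[1] + 2x[2] - 2; x[1]^2 + 4x[2]^2 - 4]

x0 = randn(2)           # random base point
u  = randn(2)           # random direction
g(s) = ftest(x0 + s*u)  # restriction of f to a line through x0
```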
g (generic function with 1 method)
We compute the derivative $g'(0)$ both analytically and using a finite difference estimate. So far, we have used one-sided finite differences, but it is actually a little more accurate to use symmetric finite differences:
$$g'(s) = \frac{g(s+h)-g(s-h)}{2h} + O(h^2)$$
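A sketch of the check, with the Jacobian of this particular $f$ written out by hand (as `Jtest`, an assumed name); the analytic $g'(0)$ is $f'(x_0)\,u$:

```julia
using LinearAlgebra

ftest(x) = [x[1] + 2x[2] - 2; x[1]^2 + 4x[2]^2 - 4]
Jtest(x) = [1.0    2.0;       # Jacobian of ftest: row i is the gradient
            2x[1]  8x[2]]     # of component i, transposed

x0, u = randn(2), randn(2)
g(s) = ftest(x0 + s*u)

h = 1e-5
dg_fd = (g(h) - g(-h))/(2*h)  # symmetric difference estimate of g'(0)
dg    = Jtest(x0)*u           # analytic g'(0)
norm(dg_fd - dg)              # tiny: g is at most quadratic in s
```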
2.5452209356438296e-11
We can also compute second derivatives, either analytically or numerically. Numerically, we could finite difference the analytic first derivative, or use the second difference formula
$$g''(s) = \frac{g(s-h) -2 g(s) + g(s+h)}{h^2} + O(h^2).$$
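The corresponding check might look like the following (again with a hand-computed derivative; since each component of $g$ is a polynomial of degree at most two in $s$, $g''$ is constant):

```julia
using LinearAlgebra

ftest(x) = [x[1] + 2x[2] - 2; x[1]^2 + 4x[2]^2 - 4]

x0, u = randn(2), randn(2)
g(s) = ftest(x0 + s*u)

h = 1e-3
d2g_fd = (g(-h) - 2g(0) + g(h))/h^2  # second difference estimate of g''(0)
d2g    = [0.0; 2u[1]^2 + 8u[2]^2]    # analytic g'' (constant in s)
norm(d2g_fd - d2g)                   # only rounding error remains
```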
1.3830592783544882e-9
The second-order Taylor series approximation to $g(s)$ lies directly atop the true function (as it should in this case – can you see why?).
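We can verify the agreement numerically as well; this sketch assumes hand-computed derivative formulas for this particular $f$. The match is exact (modulo rounding) because each component of $g$ is a polynomial of degree at most two in $s$:

```julia
using LinearAlgebra

ftest(x) = [x[1] + 2x[2] - 2; x[1]^2 + 4x[2]^2 - 4]
Jtest(x) = [1.0 2.0; 2x[1] 8x[2]]

x0, u = randn(2), randn(2)
g(s) = ftest(x0 + s*u)

dg  = Jtest(x0)*u                      # g'(0)
d2g = [0.0; 2u[1]^2 + 8u[2]^2]         # g''(0)
gtaylor(s) = g(0) + s*dg + s^2/2*d2g   # second-order Taylor expansion about 0

norm(gtaylor(0.37) - g(0.37))          # zero up to rounding
```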
Derivatives, approximation, and chain rule
The function $f$ is differentiable at $x$ if it has a good affine (constant plus linear) approximation near $x$, i.e.
$$f(x+z) = f(x) + f'(x)z + o(\|z\|)$$
where the Jacobian $f'(x)$ (also written $J(x)$ or $\frac{\partial f}{\partial x}$) is the $m \times n$ matrix whose $(i,j)$ entry is the partial derivative $J_{ij} = \partial f_i/\partial x_j$. If $f$ is differentiable, the Jacobian matrix maps directions to directional derivatives.
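One way to sanity check the affine approximation (and hence a hand-computed Jacobian, written as `Jtest` here – an assumed name): for a small perturbation $z$, the residual $f(x+z) - f(x) - f'(x)z$ should be $O(\|z\|^2)$.

```julia
using LinearAlgebra

ftest(x) = [x[1] + 2x[2] - 2; x[1]^2 + 4x[2]^2 - 4]
Jtest(x) = [1.0 2.0; 2x[1] 8x[2]]

x0 = randn(2)
z  = 1e-8 * randn(2)   # small perturbation
norm(ftest(x0 + z) - ftest(x0) - Jtest(x0)*z)   # O(norm(z)^2)
```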
1.1509785999592623e-16
The chain rule is just about matrix multiplication: the derivative of a composition is the composition of the derivatives. As an example, consider the behavior of $f$ on a circle: if $h(\theta) = f(g(\theta))$ for a parameterization $g(\theta)$ of the circle, then $h'(\theta) = f'(g(\theta)) \, g'(\theta)$.
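A sketch of this check, assuming the unit-circle parameterization $\theta \mapsto (\cos\theta, \sin\theta)$ (called `gcirc` here to avoid clashing with the earlier `g`):

```julia
using LinearAlgebra

ftest(x) = [x[1] + 2x[2] - 2; x[1]^2 + 4x[2]^2 - 4]
Jtest(x) = [1.0 2.0; 2x[1] 8x[2]]

gcirc(t)  = [cos(t); sin(t)]     # unit circle parameterization (an assumption)
dgcirc(t) = [-sin(t); cos(t)]    # its derivative
h(t) = ftest(gcirc(t))

t  = 0.81                                  # arbitrary test angle
dh = Jtest(gcirc(t)) * dgcirc(t)           # chain rule: h'(t) = f'(g(t)) g'(t)
dt = 1e-5
dh_fd = (h(t + dt) - h(t - dt))/(2*dt)     # symmetric difference check
norm(dh_fd - dh)                           # O(dt^2)
```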
6.603894155300888e-9