
md"""
# Notebook for 2023-03-20

As a running example in this lecture, we looked at numerical computations for the reciprocal function computing $r = 1/d$ (assuming that we didn't have division already readily at hand).
"""
using Plots

md"""
## Long division

This is binary long division (the binary version of the algorithm you learned in school). It gets about one bit of additional accuracy per step.
"""
# Approximate x = 1/d by n steps of binary long division
function reciprocal_div(d, n; monitor=(x)->nothing)

    r = 1.0    # Current remainder
    x = 0.0    # Current reciprocal estimate
    bit = 0.5  # Value of a bit in the current place

    for k = 1:n
        if r > d*bit
            x += bit
            r -= d*bit
        end
        bit /= 2
        monitor(x)
    end
    x
end
begin
    rd_errs = []
    reciprocal_div(5, 30, monitor=(x)->push!(rd_errs, abs(x-0.2)))
    plot(rd_errs, yscale=:log10)
end
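The one-bit-per-step claim can be checked directly: after $n$ steps, `x` holds the first $n$ bits of the binary expansion of $1/d$, so the error is below $2^{-n}$. A self-contained sketch (hypothetical helper `recip_longdiv`, the same loop without the monitor hook):

```julia
# Self-contained long-division loop: after n steps x holds the first
# n bits of 1/d (for 0 < 1/d < 1), so the error is below 2.0^(-n).
function recip_longdiv(d, n)
    r, x, bit = 1.0, 0.0, 0.5
    for k = 1:n
        if r > d*bit
            x += bit
            r -= d*bit
        end
        bit /= 2
    end
    x
end

for n in (10, 20, 30)
    println((n, abs(recip_longdiv(5.0, n) - 0.2) < 2.0^(-n)))
end
```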

md"""
## Bisection

Bisection uses the fact that the reciprocal $1/d$ lies between a lower bound $l$ such that $dl-1 < 0$ and an upper bound $h$ such that $dh-1 > 0$. Bisection starts with an initial interval $[l,h]$ and repeatedly replaces it with a smaller interval $[l,m]$ or $[m,h]$ (for $m = (l+h)/2$) such that there continues to be a sign change of the function $dx-1$ between the two endpoints.
"""
function reciprocal_bisect(d, n; monitor=(x)->nothing)
    hi = 1.0  # Upper bound
    lo = 0.0  # Lower bound
    for k = 1:n
        x = (hi+lo)/2
        fx = d*x-1
        monitor(x)
        if fx > 0.0
            hi = x
        else
            lo = x
        end
    end
    (hi+lo)/2
end
begin
    rb_errs = []
    reciprocal_bisect(5, 30, monitor=(x)->push!(rb_errs, abs(x-0.2)))
    plot(rb_errs, yscale=:log10)
end
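The interval-halving argument gives a concrete bound: starting from $[0,1]$, after $n$ steps the bracketing interval has width $2^{-n}$, so the returned midpoint is within $2^{-(n+1)}$ of $1/d$. A standalone sketch (hypothetical helper `recip_bisect`, mirroring the cell above):

```julia
# Self-contained bisection on f(x) = d*x - 1 over [0, 1]: the bracket
# halves each step, so the final midpoint is within 2.0^-(n+1) of 1/d.
function recip_bisect(d, n)
    lo, hi = 0.0, 1.0
    for k = 1:n
        x = (lo + hi)/2
        d*x - 1 > 0 ? (hi = x) : (lo = x)
    end
    (lo + hi)/2
end

println(abs(recip_bisect(5.0, 30) - 0.2) <= 2.0^(-31))
```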

md"""
## A modified Newton approach

This is the fixed point iteration

$$x_{k+1} = x_k-\frac{dx_k-1}{\hat{d}}$$

where $\hat{d}$ is a power of 2 as close as possible to $d$. If we divided by $d$ instead of $\hat{d}$, we would converge in one step --- but we are doing all this under the assumption that we don't know how to divide in general. On the other hand, division by $\hat{d}$ is easy in binary floating point when $\hat{d}$ is a power of two.
"""
# Approximate x = 1/d by n steps of the modified Newton iteration,
# where d̂ is a power of two near d
function reciprocal_newtonish(d, d̂, n; monitor=(x)->nothing)
    x = 1/d̂
    for k = 1:n
        fx = d*x-1
        x -= fx/d̂
        monitor(x)
    end
    x
end
begin
    rn1_errs = []
    reciprocal_newtonish(5, 4, 20, monitor=(x)->push!(rn1_errs, abs(x-0.2)))
    plot(rn1_errs, yscale=:log10)
end
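The linear rate here has a closed form: with error $e_k = x_k - 1/d$, one step gives $e_{k+1} = (1 - d/\hat{d})\,e_k$, so for $d = 5$, $\hat{d} = 4$ the error shrinks by exactly a factor of 4 per step. A standalone check (hypothetical helper `recip_modnewton`):

```julia
# Self-contained modified Newton: x <- x - (d*x - 1)/d̂ has error
# e_{k+1} = (1 - d/d̂)*e_k, a fixed factor (here magnitude 1/4) per step.
function recip_modnewton(d, d̂, n)
    x = 1/d̂
    for k = 1:n
        x -= (d*x - 1)/d̂
    end
    x
end

e5 = abs(recip_modnewton(5, 4, 5) - 0.2)
e6 = abs(recip_modnewton(5, 4, 6) - 0.2)
println(e6/e5)  # close to 1/4
```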

md"""
## Newton iteration

Let $g(x) = x^{-1} - d$. Newton iteration for this function gives (with some algebra)

$$x_{k+1} = x_k-\frac{g(x_k)}{g'(x_k)} = x_k (2-d x_k)$$

which we can compute with only a subtraction and two multiplications. The iteration converges *quadratically* from good enough initial guesses; that means that the error at step $k+1$ is proportional to the square of the error at step $k$. On a log scale, the error curve looks like a downward-facing parabola.
"""
# Newton iteration for g(x) = 1/x - d, starting from the guess x
function reciprocal_newton(d, x, n; monitor=(x)->nothing)
    for k = 1:n
        x = x*(2-d*x)
        monitor(x)
    end
    x
end
begin
    rn2_errs = []
    reciprocal_newton(5, 0.25, 5, monitor=(x)->push!(rn2_errs, abs(x-0.2)))
    plot(rn2_errs, yscale=:log10)
end
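The quadratic rate also has a clean closed form: with $e_k = 1/d - x_k$, a little algebra gives $e_{k+1} = d\,e_k^2$ exactly. A standalone check (hypothetical helper `recip_newton`):

```julia
# Self-contained Newton step x <- x*(2 - d*x); with e_k = 1/d - x_k
# the error satisfies e_{k+1} = d*e_k^2.
function recip_newton(d, x, n)
    for k = 1:n
        x = x*(2 - d*x)
    end
    x
end

e1 = abs(recip_newton(5, 0.25, 1) - 0.2)
e2 = abs(recip_newton(5, 0.25, 2) - 0.2)
println((e1, e2, 5*e1^2))  # e2 matches 5*e1^2
```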

md"""
## Comparing methods

Let's wrap up by looking at all four methods we've considered on a common axis. It will take something like 53 iterations for bisection or long division to reach full accuracy; our modified Newton takes something like 25 steps; and our Newton iteration reaches correct rounding in only 6 steps.
"""
begin
    plot(rd_errs, yscale=:log10, label="Long div")
    plot!(rb_errs, yscale=:log10, label="Bisection")
    plot!(rn1_errs, yscale=:log10, label="Mod Newton")
    plot!(rn2_errs, yscale=:log10, label="Newton")
end
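The quoted step counts can be spot-checked with standalone counters (hypothetical helpers `steps_bisect` and `steps_newton`; the exact counts depend on the tolerance, here taken near double precision). Dividing by `d` inside the check is fine, since we only use it to measure the error.

```julia
# Count iterations until the estimate of 1/d is within tol of the truth.
function steps_bisect(d, tol)
    lo, hi = 0.0, 1.0
    for k = 1:200
        x = (lo + hi)/2
        abs(x - 1/d) <= tol && return k
        d*x - 1 > 0 ? (hi = x) : (lo = x)
    end
    200
end

function steps_newton(d, x, tol)
    for k = 1:200
        x = x*(2 - d*x)
        abs(x - 1/d) <= tol && return k
    end
    200
end

println((steps_bisect(5.0, 1e-15), steps_newton(5.0, 0.25, 1e-15)))
```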