
md"""
# Notebook for 2023-03-20

As a running example in this lecture, we looked at numerical computations for the reciprocal function computing $r = 1/d$ (assuming that we didn't have division already readily at hand).
"""
using Plots

md"""
## Long division

This is binary long division (the binary version of the algorithm you learned in school). It gets about one bit of additional accuracy per step.
"""
# Approximate x = 1/d by n steps of binary long division
function reciprocal_div(d, n; monitor=(x)->nothing)

    r = 1.0    # Current remainder
    x = 0.0    # Current reciprocal estimate
    bit = 0.5  # Value of a bit in the current place

    for k = 1:n
        if r > d*bit
            x += bit
            r -= d*bit
        end
        bit /= 2
        monitor(x)
    end
    x
end
begin
    rd_errs = []
    reciprocal_div(5, 30, monitor=(x)->push!(rd_errs, abs(x-0.2)))
    plot(rd_errs, yscale=:log10)
end
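The one-bit-per-step claim can be checked directly: after $n$ steps, `x` holds the first $n$ bits of the binary expansion of $1/d$, so the error is below $2^{-n}$. A self-contained sketch (hypothetical helper `recip_longdiv`, the same loop without the monitor hook):

```julia
# Self-contained long-division loop: after n steps x holds the first
# n bits of 1/d (for 0 < 1/d < 1), so the error is below 2.0^(-n).
function recip_longdiv(d, n)
    r, x, bit = 1.0, 0.0, 0.5
    for k = 1:n
        if r > d*bit
            x += bit
            r -= d*bit
        end
        bit /= 2
    end
    x
end

for n in (10, 20, 30)
    println((n, abs(recip_longdiv(5.0, n) - 0.2) < 2.0^(-n)))
end
```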

md"""
## Bisection

Bisection uses the fact that the reciprocal $1/d$ lies between a lower bound $l$ such that $dl-1 < 0$ and an upper bound $h$ such that $dh-1 > 0$. Bisection starts with an initial interval $[l,h]$ and repeatedly replaces it with a smaller interval $[l,m]$ or $[m,h]$ (for $m = (l+h)/2$) such that there continues to be a sign change of the function $dx-1$ between the two endpoints.
"""
function reciprocal_bisect(d, n; monitor=(x)->nothing)
    hi = 1.0  # Upper bound
    lo = 0.0  # Lower bound
    for k = 1:n
        x = (hi+lo)/2
        fx = d*x-1
        monitor(x)
        if fx > 0.0
            hi = x
        else
            lo = x
        end
    end
    (hi+lo)/2
end
begin
    rb_errs = []
    reciprocal_bisect(5, 30, monitor=(x)->push!(rb_errs, abs(x-0.2)))
    plot(rb_errs, yscale=:log10)
end
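The interval-halving argument gives a concrete bound: starting from $[0,1]$, after $n$ steps the bracketing interval has width $2^{-n}$, so the returned midpoint is within $2^{-(n+1)}$ of $1/d$. A standalone sketch (hypothetical helper `recip_bisect`, mirroring the cell above):

```julia
# Self-contained bisection on f(x) = d*x - 1 over [0, 1]: the bracket
# halves each step, so the final midpoint is within 2.0^-(n+1) of 1/d.
function recip_bisect(d, n)
    lo, hi = 0.0, 1.0
    for k = 1:n
        x = (lo + hi)/2
        d*x - 1 > 0 ? (hi = x) : (lo = x)
    end
    (lo + hi)/2
end

println(abs(recip_bisect(5.0, 30) - 0.2) <= 2.0^(-31))
```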

md"""
## A modified Newton approach

This is the fixed point iteration

$$x_{k+1} = x_k-\frac{dx_k-1}{\hat{d}}$$

where $\hat{d}$ is a power of 2 as close as possible to $d$. If we divided by $d$ instead of $\hat{d}$, we would converge in one step --- but we are doing all this under the assumption that we don't know how to divide in general. On the other hand, division by $\hat{d}$ is easy in binary floating point when $\hat{d}$ is a power of two.
"""
# Approximate x = 1/d by n steps of the modified Newton iteration,
# where d̂ is a power of two near d
function reciprocal_newtonish(d, d̂, n; monitor=(x)->nothing)
    x = 1/d̂
    for k = 1:n
        fx = d*x-1
        x -= fx/d̂
        monitor(x)
    end
    x
end
begin
    rn1_errs = []
    reciprocal_newtonish(5, 4, 20, monitor=(x)->push!(rn1_errs, abs(x-0.2)))
    plot(rn1_errs, yscale=:log10)
end
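The linear rate here has a closed form: with error $e_k = x_k - 1/d$, one step gives $e_{k+1} = (1 - d/\hat{d})\,e_k$, so for $d = 5$, $\hat{d} = 4$ the error shrinks by exactly a factor of 4 per step. A standalone check (hypothetical helper `recip_modnewton`):

```julia
# Self-contained modified Newton: x <- x - (d*x - 1)/d̂ has error
# e_{k+1} = (1 - d/d̂)*e_k, a fixed factor (here magnitude 1/4) per step.
function recip_modnewton(d, d̂, n)
    x = 1/d̂
    for k = 1:n
        x -= (d*x - 1)/d̂
    end
    x
end

e5 = abs(recip_modnewton(5, 4, 5) - 0.2)
e6 = abs(recip_modnewton(5, 4, 6) - 0.2)
println(e6/e5)  # close to 1/4
```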

md"""
## Newton iteration

Let $g(x) = x^{-1} - d$. Newton iteration for this function gives (with some algebra)

$$x_{k+1} = x_k-\frac{g(x_k)}{g'(x_k)} = x_k (2-d x_k)$$

which we can compute with only a subtraction and two multiplications. The iteration converges *quadratically* from good enough initial guesses; that means that the error at step $k+1$ is proportional to the square of the error at step $k$. On a log scale, the error curve looks like a downward-facing parabola.
"""
# Newton iteration for g(x) = 1/x - d, starting from the guess x
function reciprocal_newton(d, x, n; monitor=(x)->nothing)
    for k = 1:n
        x = x*(2-d*x)
        monitor(x)
    end
    x
end
begin
    rn2_errs = []
    reciprocal_newton(5, 0.25, 5, monitor=(x)->push!(rn2_errs, abs(x-0.2)))
    plot(rn2_errs, yscale=:log10)
end
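The quadratic rate also has a clean closed form: with $e_k = 1/d - x_k$, a little algebra gives $e_{k+1} = d\,e_k^2$ exactly. A standalone check (hypothetical helper `recip_newton`):

```julia
# Self-contained Newton step x <- x*(2 - d*x); with e_k = 1/d - x_k
# the error satisfies e_{k+1} = d*e_k^2.
function recip_newton(d, x, n)
    for k = 1:n
        x = x*(2 - d*x)
    end
    x
end

e1 = abs(recip_newton(5, 0.25, 1) - 0.2)
e2 = abs(recip_newton(5, 0.25, 2) - 0.2)
println((e1, e2, 5*e1^2))  # e2 matches 5*e1^2
```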

md"""
## Comparing methods

Let's wrap up by looking at all four methods we've considered on a common axis. It will take something like 53 iterations for bisection or long division to reach full accuracy; our modified Newton takes something like 25 steps; and our Newton iteration reaches correct rounding in only 6 steps.
"""
begin
    plot(rd_errs, yscale=:log10, label="Long div")
    plot!(rb_errs, yscale=:log10, label="Bisection")
    plot!(rn1_errs, yscale=:log10, label="Mod Newton")
    plot!(rn2_errs, yscale=:log10, label="Newton")
end
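The quoted step counts can be spot-checked with standalone counters (hypothetical helpers `steps_bisect` and `steps_newton`; the exact counts depend on the tolerance, here taken near double precision). Dividing by `d` inside the check is fine, since we only use it to measure the error.

```julia
# Count iterations until the estimate of 1/d is within tol of the truth.
function steps_bisect(d, tol)
    lo, hi = 0.0, 1.0
    for k = 1:200
        x = (lo + hi)/2
        abs(x - 1/d) <= tol && return k
        d*x - 1 > 0 ? (hi = x) : (lo = x)
    end
    200
end

function steps_newton(d, x, tol)
    for k = 1:200
        x = x*(2 - d*x)
        abs(x - 1/d) <= tol && return k
    end
    200
end

println((steps_bisect(5.0, 1e-15), steps_newton(5.0, 0.25, 1e-15)))
```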