%matplotlib notebook
import math
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
This notebook shows some examples of Monte Carlo integration which we'll look at in class.
If you are looking at the HTML export of this notebook, the source file is here: monte-carlo.ipynb and you can download it and run it in Jupyter Notebook if you have it. (If you'd like to get it, installing Anaconda is a quick way.)
Here is an integral we'd like to compute:
$$I = \int_0^1 \left(1 - \sqrt{1 - x^4}\right)\,dx$$
It doesn't have a closed form, but the value (computed by Wolfram Alpha as an elliptic integral) is approximately 0.125981.
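If you'd like to double-check that reference value without Wolfram Alpha, ordinary numerical quadrature also handles this integral fine (this assumes SciPy is installed; it isn't used anywhere else in the notebook):
from scipy.integrate import quad
quad(lambda x: 1 - np.sqrt(1 - x**4), 0, 1)[0]  # should come out very close to 0.125981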
def f1(x):
return 1 - np.sqrt(1 - x**4)
plt.figure()
xs = np.linspace(0, 1, 501)
plt.plot(xs, f1(xs));
Here is a strategy to estimate this integral using uniform sampling.
strategy1 = {
'eval': lambda x: f1(x),
'sample': lambda xi: xi,
'pdf': lambda x: np.ones(x.shape)
}
def estimate(strategy, sampler, N):
"""Integrate a function using N independent samples drawn from a provided pdf."""
xis = sampler(N)
xs = strategy['sample'](xis)
return np.mean(strategy['eval'](xs) / strategy['pdf'](xs))
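The estimate function implements the generic importance-sampling estimator: if the $X_i$ are drawn from a density $p$, then
$$I = \int f(x)\,dx \approx \frac{1}{N}\sum_{i=1}^N \frac{f(X_i)}{p(X_i)},$$
which is unbiased as long as $p(x) > 0$ wherever $f(x) \neq 0$. For the uniform strategy above, $p(x) = 1$ and this is just the average of $f$ at $N$ random points.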
def independent_sampler(N):
return np.random.random_sample(N)
estimate(strategy1, independent_sampler, 10000) - 0.125981
0.0017636856234602705
Let's see how the error depends on the number of samples...
def stdev_of(fn, M):
"""Compute the standard deviation of repeated calls to a function."""
results = [fn() for i in range(M)]
return np.std(results)
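Note that stdev_of measures the spread of the estimator across repeated independent runs, not its error against the true value of the integral; that's convenient because it lets us study convergence even when we don't know the exact answer.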
stdev_of(lambda: estimate(strategy1, independent_sampler, 1000000), 100)
0.00020658421995445867
Ns = np.int32(np.logspace(2, 5, 7))
stdevs = [stdev_of(lambda: estimate(strategy1, independent_sampler, N), 100) for N in Ns]
plt.figure()
plt.loglog(Ns, stdevs, '-')
plt.xlabel('# samples')
plt.ylabel('standard deviation');
plt.axis('equal');
This log-log plot comes out as a straight line, showing that the noise and the number of samples have a power-law relationship; and since its slope is $-1/2$, the noise is inversely proportional to the square root of the number of samples, which is what we expected.
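This is the familiar behavior of independent Monte Carlo sampling: the estimator is an average of $N$ independent, identically distributed terms $f(X_i)/p(X_i)$, so if a single term has variance $\sigma^2$ then the average has variance $\sigma^2/N$ and standard deviation
$$\frac{\sigma}{\sqrt{N}},$$
which means halving the error requires four times as many samples.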
Here is another sampling strategy (importance sampling) that tries to mimic the integrand with a pdf that is easy to sample and integrate analytically. Near zero the integrand behaves like $x^4/2$, so a natural choice is the pdf $p(x) = 5x^4$, which is proportional to $x^4$ and normalized to integrate to 1 on $[0, 1]$.
strategy2 = {
'eval': lambda x: f1(x),
'sample': lambda xi: xi ** (1/5),
'pdf': lambda x: 5 * x**4
}
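The 'sample' entry comes from inverting the CDF of this pdf: $p(x) = 5x^4$ on $[0, 1]$ has CDF $P(x) = x^5$, so solving $P(x) = \xi$ for $x$ gives $x = \xi^{1/5}$, which turns a uniform random $\xi$ into a sample with density $p$.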
A quick verification that this sampling strategy is internally consistent, i.e. that the points produced by 'sample' really are distributed according to 'pdf': histogram a large number of samples and overlay the pdf.
def histcheck(strategy, N):
xis = np.random.random_sample(N)
xs = strategy['sample'](xis)
ps, edges, patches = plt.hist(xs, math.ceil(math.sqrt(N)/10), density=True)
plt.plot(edges, strategy['pdf'](edges), 'k')
plt.figure()
histcheck(strategy2, 100000)
Ns = np.int32(np.logspace(2, 5, 7))
stdevs1 = [stdev_of(
lambda: estimate(strategy1, independent_sampler, N), 100
) for N in Ns]
stdevs2 = [stdev_of(
lambda: estimate(strategy2, independent_sampler, N), 100
) for N in Ns]
plt.figure()
plt.loglog(Ns, stdevs1, Ns, stdevs2)
plt.xlabel('# samples')
plt.ylabel('standard deviation');
plt.axis('equal');
From this plot we can see that importance sampling has significantly decreased the noise (by a factor of around 10 in this case) but since the slope is unchanged, the convergence rate is still the same ($1/\sqrt{N}$).
Another strategy for reducing variance is to change the set of random samples being fed into the process. We can do that in our example just by replacing the function that generates the samples with one that generates stratified samples.
def stratified_sampler(N):
return (np.arange(N) + np.random.random_sample(N)) / N
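For example, stratified_sampler(5) returns one uniformly jittered point in each of the intervals $[0, 0.2), [0.2, 0.4), \ldots, [0.8, 1)$: element $i$ always lands in $[i/N, (i+1)/N)$, so every part of the domain is guaranteed to receive a sample.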
Ns = np.int32(np.logspace(2, 5, 7))
stdevs1 = [stdev_of(
lambda: estimate(strategy1, independent_sampler, N), 100
) for N in Ns]
stdevs2 = [stdev_of(
lambda: estimate(strategy2, independent_sampler, N), 100
) for N in Ns]
stdevs1s = [stdev_of(
lambda: estimate(strategy1, stratified_sampler, N), 100
) for N in Ns]
stdevs2s = [stdev_of(
lambda: estimate(strategy2, stratified_sampler, N), 100
) for N in Ns]
plt.figure()
plt.loglog(Ns, stdevs1, color='tab:blue')
plt.loglog(Ns, stdevs2, color='tab:orange')
plt.loglog(Ns, stdevs1s, color='tab:blue', dashes=[4,2])
plt.loglog(Ns, stdevs2s, color='tab:orange', dashes=[4,2])
plt.loglog([1e1,1e6], [1e-2,1e-7], 'k:')
plt.xlabel('# samples')
plt.ylabel('standard deviation');
plt.axis('equal');
This plot shows that in this example stratification has a huge benefit. Comparing to the reference line at slope $-1$, you can see the slope of the dashed lines on that plot is about $-1.5$, i.e. the error falls like $N^{-3/2}$. Again this is as expected for a nice smooth 1D integrand: each stratum has width $1/N$, so the integrand varies only by $O(1/N)$ within a stratum, the variance of each sample drops from $O(1)$ to $O(1/N^2)$, and the standard deviation of the estimate drops from $O(N^{-1/2})$ to $O(N^{-3/2})$.
Let's try a messier integrand to see if it takes more samples for stratification to start winning.
def f2(x):
return np.cos(10*x**2) + np.cos(21*x-5) + np.cos(38.7*x+2) + np.cos(82.4*x-3) + np.cos(154.2*x)*2 + np.cos(301.1*x)*4
plt.figure()
xs = np.linspace(0,1,5001)
plt.plot(xs, f2(xs));
strategy1_f2 = {
'eval': lambda x: f2(x),
'sample': lambda xi: xi,
'pdf': lambda x: np.ones(x.shape)
}
Ns = np.int32(np.logspace(1, 5, 7))
stdevs1 = [stdev_of(
lambda: estimate(strategy1_f2, independent_sampler, N), 100
) for N in Ns]
stdevs1s = [stdev_of(
lambda: estimate(strategy1_f2, stratified_sampler, N), 100
) for N in Ns]
plt.figure()
plt.loglog(Ns, stdevs1, color='tab:blue')
plt.loglog(Ns, stdevs1s, color='tab:blue', dashes=[4,2])
plt.loglog([1e0,1e6], [1e-0,1e-6], 'k:')
plt.xlabel('# samples')
plt.ylabel('standard deviation');
plt.axis('equal');
Now you can see that we have to get past 100 samples before the better convergence rate of stratified sampling starts to take over: the fastest term in the integrand, $\cos(301.1x)$, has a period of about $2\pi/301.1 \approx 0.02$, so the integrand only starts to look smooth within a stratum once the strata are narrower than that, i.e. once there are on the order of 50-100 of them.
I just wanted to illustrate the appearance of stratified point sets in 2D.
def independent_sampler2D(N):
"""generates N^2 samples"""
return (np.random.random_sample(N*N), np.random.random_sample(N*N))
def stratified_sampler2D(N):
"""generates N^2 samples"""
x,y = np.mgrid[0:N,0:N]
return ((x + np.random.random_sample((N,N)))/N, (y + np.random.random_sample((N,N)))/N)
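Note that this sampler returns a pair of arrays of shape $(N, N)$ (a jittered $N \times N$ grid) rather than flat arrays, which is why the plotting code below flattens them; np.mean in estimate averages over all elements regardless of shape, so the estimator itself needs no changes.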
plt.figure(figsize=(9.5,4.75))
plt.subplot(1,2,1)
x,y = independent_sampler2D(30)
plt.plot(x, y, '.')
plt.axis('equal')
plt.subplot(1,2,2)
x,y = stratified_sampler2D(30)
plt.plot(x.flatten(), y.flatten(), '.')
plt.axis('equal');
Let's try using these point sets to integrate a Gaussian over the unit square. Per Wolfram Alpha the answer is about 0.557746.
def f2d(x,y):
return np.exp(-(x**2+y**2))
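Because the integrand separates into a product, the reference value can also be checked by hand: $\int_0^1\!\int_0^1 e^{-(x^2+y^2)}\,dy\,dx = \left(\int_0^1 e^{-x^2}\,dx\right)^2 = \left(\tfrac{\sqrt{\pi}}{2}\operatorname{erf}(1)\right)^2 \approx 0.557746$. If you have SciPy available (again, it isn't needed elsewhere) you can evaluate that expression directly:
from scipy.special import erf
(np.sqrt(np.pi) / 2 * erf(1)) ** 2  # should be approximately 0.557746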
strategy1_2d = {
'eval': lambda x: f2d(x[0],x[1]),
'sample': lambda xi: xi,
'pdf': lambda x: 1
}
estimate(strategy1_2d, independent_sampler2D, 100) - 0.557746
0.003802829952953113
estimate(strategy1_2d, stratified_sampler2D, 100) - 0.557746
-1.5176825898577384e-05
Stratification is getting us a more accurate answer here with $N = 100$ (which is $100^2 = 10{,}000$ samples, since these samplers generate $N^2$ points); let's make the plot to see the trend...
Ns = np.int32(np.logspace(1, 3, 5))
stdevs1 = [stdev_of(
lambda: estimate(strategy1_2d, independent_sampler2D, N), 100
) for N in Ns]
stdevs1s = [stdev_of(
lambda: estimate(strategy1_2d, stratified_sampler2D, N), 100
) for N in Ns]
plt.figure()
plt.loglog(Ns*Ns, stdevs1, color='tab:blue')
plt.loglog(Ns*Ns, stdevs1s, color='tab:blue', dashes=[4,2])
plt.loglog([1e1,1e7], [1e-1,1e-7], 'k:')
plt.xlabel('# samples')
plt.ylabel('standard deviation');
plt.axis('equal');
Now we see that stratification gets us from $N^{-1/2}$ to $N^{-1}$: in $d$ dimensions, stratifying a smooth integrand improves the convergence rate from $N^{-1/2}$ to roughly $N^{-1/2 - 1/d}$, which is $N^{-3/2}$ in 1D and $N^{-1}$ in 2D.