
Lecture 17: Chebychev's inequality and the weak law of large numbers

Chebychev's inequality

Claim (Chebychev's inequality): For any random variable X, Pr(|X - E(X)| \geq a) \leq \frac{Var(X)}{a^2}

Proof: Note that |X - E(X)| \geq a if and only if (X - E(X))^2 \geq a^2. Therefore Pr(|X - E(X)| \geq a) = Pr((X - E(X))^2 \geq a^2). Applying Markov's inequality to the variable (X - E(X))^2 gives

Pr(|X - E(X)| \geq a) = Pr((X - E(X))^2 \geq a^2) \leq \frac{E((X - E(X))^2)}{a^2} = \frac{Var(X)}{a^2}

by definition.
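
As a sanity check on the claim, the following sketch computes both sides of Chebychev's inequality exactly for a small finite distribution. The fair six-sided die and the helper name chebyshev_check are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction

def chebyshev_check(pmf, a):
    """Return (exact Pr(|X - E(X)| >= a), Chebychev bound Var(X)/a^2).

    pmf maps each value of X to its probability (hypothetical helper).
    """
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    tail = sum(p for x, p in pmf.items() if abs(x - mean) >= a)
    return tail, var / Fraction(a) ** 2

# Fair six-sided die: an illustrative choice, not from the lecture.
die = {x: Fraction(1, 6) for x in range(1, 7)}
tail, bound = chebyshev_check(die, 2)
print("exact tail probability:", tail)
print("Chebychev bound:", bound)
```

For the die with a = 2, only the outcomes 1 and 6 deviate from the mean by at least 2, so the exact probability is well below the bound. Chebychev's inequality is often loose like this, since it uses nothing about X beyond its variance.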

Example: Last time we used Markov's inequality and the fact that the average height is 5.5 feet to show that if a door is 55 feet high, then we are guaranteed that at least 90% of people can fit through it.

If we also know that the standard deviation of height is σ = 0.2 feet, we can use Chebychev's inequality to build a smaller door. Let X be the height random variable. Var(X) = σ^2 = 0.04.

If X - E(X) \geq a then |X - E(X)| \geq a. Therefore, the event (X - E(X) \geq a) is a subset of the event (|X - E(X)| \geq a), and thus Pr(X - E(X) \geq a) \leq Pr(|X - E(X)| \geq a). This lets us apply Chebychev's inequality to conclude Pr(X - E(X) \geq a) \leq \frac{Var(X)}{a^2}.

Solving \frac{0.04}{a^2} \leq 0.10 for a, we see that if a \geq \sqrt{0.4} \approx 0.63, then Pr(X - E(X) \geq a) \leq 0.10. This in turn gives us Pr(X \lt a + E(X)) = Pr(X - E(X) \lt a) \geq 0.9. Thus, if the door is at least 6.14 feet tall, then at least 90% of the people can fit through.
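
The door calculation can be sketched numerically as follows. The function names markov_door and chebyshev_door are hypothetical helpers; each returns the smallest door height for which the corresponding bound on the fraction of people who do not fit drops to fail_prob.

```python
import math

def markov_door(mean, fail_prob):
    # Markov: Pr(X >= h) <= mean / h, so h = mean / fail_prob suffices.
    return mean / fail_prob

def chebyshev_door(mean, std, fail_prob):
    # Chebychev: Pr(X - mean >= a) <= std**2 / a**2,
    # so a = std / sqrt(fail_prob) suffices, and the door height is mean + a.
    return mean + std / math.sqrt(fail_prob)

# Values from the example: mean height 5.5 feet, standard deviation 0.2 feet.
print("Markov door height:", markov_door(5.5, 0.10))
print("Chebychev door height:", chebyshev_door(5.5, 0.2, 0.10))
```

With these inputs the Markov door is about 55 feet and the Chebychev door is a little over 6.13 feet, which shows how much the extra information about the variance tightens the guarantee.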

Weak law of large numbers

Suppose we wish to estimate the average height of a population by sampling n people from the population and averaging their heights. The weak law of large numbers says that, when n is large, this sample average is likely to be close to the "real" average.

Formally, we can model this experiment by letting our outcomes be sequences of n people. We can define several random variables: X_1 is the height of the first person sampled, X_2 is the height of the second person sampled, X_3 is the height of the third, and so forth.

Since these are all measures of height, E(X_1) = E(X_2) = \cdots = E(X_n) (let's call this value \mu) and Var(X_1) = \cdots = Var(X_n) (let's call this value \sigma^2). The result of our sampled average is given by the random variable (X_1 + X_2 + \cdots + X_n)/n. The weak law of large numbers says that this variable is likely to be close to the real expected value:

Claim (weak law of large numbers): If X_1, X_2, \dots, X_n are independent random variables with the same expected value \mu and the same variance σ^2, then Pr\left(\left|\frac{X_1 + X_2 + \cdots + X_n}{n} - μ\right| \geq a\right) \leq \frac{σ^2}{na^2}

Proof: By Chebychev's inequality, we have Pr\left(\left|\sum X_i/n - E(\sum X_i/n)\right| \geq a\right) \leq \frac{Var(\sum X_i/n)}{a^2}

Now, by linearity of the expectation, we have E(\sum X_i/n) = \sum E(X_i)/n = nμ/n = μ

As was shown in homework 5, Var(cX) = c^2Var(X), and we also know that if X and Y are independent, then Var(X + Y) = Var(X) + Var(Y). Therefore, since the X_i are independent, we have Var(\sum X_i/n) = Var(\sum X_i)/n^2 = \sum Var(X_i)/n^2 = nσ^2/n^2 = σ^2/n

Plugging these into the result from Chebychev's, we have Pr\left(\left|\sum X_i/n - μ\right| \geq a\right) \leq \frac{σ^2}{na^2}

which is what we were trying to show.
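
To see the bound in action, here is a small simulation sketch. It assumes, purely for illustration, that each X_i is uniform on [0, 1] (so μ = 1/2 and σ^2 = 1/12); the function names, the trial count, and the seed are likewise assumptions and not part of the lecture.

```python
import random

def deviation_frequency(n, a, trials=1000, seed=0):
    """Estimate Pr(|sample mean - mu| >= a) for n independent Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    mu = 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - mu) >= a:
            hits += 1
    return hits / trials

def wlln_bound(var, n, a):
    # Right-hand side of the weak law: sigma^2 / (n a^2).
    return var / (n * a ** 2)

a = 0.1
for n in (10, 100, 1000):
    print(n, deviation_frequency(n, a), wlln_bound(1 / 12, n, a))
```

Both columns shrink as n grows: the bound falls like 1/n, and the observed frequency of large deviations should stay below it. The simulation only illustrates the claim for one distribution; the proof above is what covers every distribution with finite variance.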