definitions: random variable, expectation
Claim: (law of total probability) If A1,…,An partition the sample space S (that is, if Ai∩Aj=∅ for i≠j and S=∪Ai), then
Pr(B)=∑iPr(B|Ai)Pr(Ai)
Proof sketch: Write B=∪i(B∩Ai); these events are disjoint because the Ai are. Apply the third axiom to conclude Pr(B)=∑iPr(B∩Ai), then apply the definition of Pr(B|Ai) to rewrite each term as Pr(B|Ai)Pr(Ai).
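Aside: here is one way to sanity-check the claim in Python. The sample space, partition, and event below are made up for illustration; they are not from lecture.

    from fractions import Fraction

    # Toy equiprobable sample space (a fair 6-sided die, chosen arbitrarily).
    S = {1, 2, 3, 4, 5, 6}

    def Pr(E):
        # Equiprobable measure: Pr(E) = |E| / |S|.
        return Fraction(len(E), len(S))

    # A1, A2, A3 partition S; B is an arbitrary event.
    A = [{1, 2}, {3, 4}, {5, 6}]
    B = {2, 3, 5}

    def cond(B, Ai):
        # Pr(B|Ai) = Pr(B ∩ Ai) / Pr(Ai)
        return Pr(B & Ai) / Pr(Ai)

    # Law of total probability: Pr(B) = ∑i Pr(B|Ai) Pr(Ai).
    assert Pr(B) == sum(cond(B, Ai) * Pr(Ai) for Ai in A)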
Suppose we are given a test for a condition. Let A be the event that a patient has the condition, and let B be the event that the test comes back positive.
The probability that a patient has the condition is Pr(A)=1/10000. The test has a false positive rate of Pr(B|Ā)=1/100 (a false positive is when the test says "yes" despite the fact that the patient does not have the disease), and a false negative rate of Pr(B̄|A)=5/100.
Suppose a patient tests positive. What is the probability that they have the disease? In other words, what is Pr(A|B)?
Bayes's rule tells us Pr(A|B)=Pr(B|A)Pr(A)/Pr(B). We can find Pr(B|A) using the fact from last lecture: Pr(B|A)=1−Pr(B̄|A)=95/100. Pr(A) is given. We can use the law of total probability to find Pr(B): Pr(B)=Pr(B|A)Pr(A)+Pr(B|Ā)Pr(Ā).
Plugging everything in, we have
Pr(A|B) = Pr(B|A)Pr(A) / [Pr(B|A)Pr(A) + Pr(B|Ā)Pr(Ā)]
        = (95/100)(1/10000) / [(95/100)(1/10000) + (1/100)(9999/10000)]
        = 95/(95 + 9999)
        ≈ 1/100
This is a surprising result: we take a test that fails <5% of the time, and it says we have the disease, yet we have only about a 1% chance of having the disease.
However, note that our chances have grown from 0.0001 to about 0.01, so we did learn quite a bit from the test.
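Aside: if you want to double-check the arithmetic, here is a short Python version of the computation (the variable names and the use of exact fractions are my own choices, not part of the lecture):

    from fractions import Fraction

    pr_A = Fraction(1, 10000)           # Pr(A): patient has the condition
    pr_B_given_notA = Fraction(1, 100)  # false positive rate Pr(B|Ā)
    pr_notB_given_A = Fraction(5, 100)  # false negative rate Pr(B̄|A)

    pr_B_given_A = 1 - pr_notB_given_A  # 95/100, from last lecture's fact
    pr_notA = 1 - pr_A                  # 9999/10000

    # Law of total probability for Pr(B), then Bayes's rule for Pr(A|B).
    pr_B = pr_B_given_A * pr_A + pr_B_given_notA * pr_notA
    pr_A_given_B = pr_B_given_A * pr_A / pr_B

    print(pr_A_given_B)          # 95/10094
    print(float(pr_A_given_B))   # ≈ 0.0094, i.e. roughly 1/100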
Definition: An (R-valued) random variable X is a function X:S→R.
Definition: The expected value of X, written E(X), is given by E(X)::=∑k∈S X(k)⋅Pr({k})
Definition: Given a random variable X and a real number x, the poorly-named event (X=x) is defined by (X=x)::={k∈S∣X(k)=x}.
This definition is useful because it allows us to ask "what is the probability that X=x?"
Claim: (alternate definition of E(X)) E(X)=∑x∈R x⋅Pr(X=x)
Proof sketch: this is just grouping together the terms of the original definition that correspond to outcomes with the same value of X; each such group of terms sums to x⋅Pr(X=x).
Note: You may be concerned about "∑x∈R". In discrete examples, Pr(X=x)=0 for all but countably many x, so this sum reduces to a finite or at least countable sum. In non-discrete examples, the summation can be replaced by an integral. Measure theory is a branch of mathematics that puts this distinction on firmer theoretical footing by replacing both the summation and the integral with the so-called "Lebesgue integral". In this course, we will simply use "∑" with the understanding that it becomes an integral when the random variable is continuous.
Example: Suppose I roll a fair 6-sided die. On an even roll, I win $10. On an odd roll, I lose however much money is shown. We can model the experiment (rolling a die) using the sample space S={1,2,3,4,5,6} and an equiprobable measure. The result of the experiment is given by the random variable X:S→R given by X(1)::=−1, X(2)::=10, X(3)::=−3, X(4)::=10, X(5)::=−5, and X(6)::=10.
According to the definition,
E(X) = (1/6)X(1) + (1/6)X(2) + (1/6)X(3) + (1/6)X(4) + (1/6)X(5) + (1/6)X(6)
     = (1/6)(−1) + (1/6)(10) + (1/6)(−3) + (1/6)(10) + (1/6)(−5) + (1/6)(10)
     = 21/6 = 3.5
According to the alternate definition, E(X) is given by
E(X) = (−1)Pr(X=−1) + (−3)Pr(X=−3) + (−5)Pr(X=−5) + 10⋅Pr(X=10)
     = (−1)(1/6) + (−3)(1/6) + (−5)(1/6) + (10)(1/6+1/6+1/6)
     = 21/6 = 3.5
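Aside: a short Python check (the dict encoding of X is just an illustrative convention) that the two definitions give the same number for this example:

    from fractions import Fraction

    # The die example: sample space S and the random variable X as a dict.
    S = {1, 2, 3, 4, 5, 6}
    X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}

    def Pr(E):
        # Equiprobable measure on S.
        return Fraction(len(E), len(S))

    # Original definition: sum over outcomes k ∈ S.
    E1 = sum(X[k] * Pr({k}) for k in S)

    # Alternate definition: sum over values x, weighted by Pr(X = x).
    E2 = sum(x * Pr({k for k in S if X[k] == x}) for x in set(X.values()))

    assert E1 == E2 == Fraction(7, 2)   # both give 21/6 = 3.5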
Definition: The probability mass function (PMF) of X is the function PMF_X:R→R given by PMF_X(x)=Pr(X=x).
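Aside: continuing the die example, the PMF can be written as a small Python function (again, just an illustrative encoding):

    from fractions import Fraction

    S = {1, 2, 3, 4, 5, 6}
    X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}

    def pmf_X(x):
        # PMF_X(x) = Pr(X = x) = Pr({k ∈ S : X(k) = x}), equiprobable measure.
        event = {k for k in S if X[k] == x}
        return Fraction(len(event), len(S))

    print(pmf_X(10))   # 1/2
    print(pmf_X(-1))   # 1/6
    print(pmf_X(7))    # 0 (values X never takes have probability 0)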