definitions: random variable, expectation
Claim: (law of total probability) If A1,…,An partition the sample space S (that is, if Ai∩Aj=∅ for i≠j and S=∪Ai), then
Pr(B)=∑iPr(B|Ai)Pr(Ai)
Proof sketch: Write B=∪i(B∩Ai); these events are disjoint because the Ai are. Apply the third axiom to conclude Pr(B)=∑iPr(B∩Ai), then apply the definition of Pr(B|Ai) to rewrite each term as Pr(B|Ai)Pr(Ai).
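Aside: here is one way to sanity-check the claim in Python. The sample space, partition, and event below are made up for illustration; they are not from lecture.

    from fractions import Fraction

    # Toy equiprobable sample space (a fair 6-sided die, chosen arbitrarily).
    S = {1, 2, 3, 4, 5, 6}

    def Pr(E):
        # Equiprobable measure: Pr(E) = |E| / |S|.
        return Fraction(len(E), len(S))

    # A1, A2, A3 partition S; B is an arbitrary event.
    A = [{1, 2}, {3, 4}, {5, 6}]
    B = {2, 3, 5}

    def cond(B, Ai):
        # Pr(B|Ai) = Pr(B ∩ Ai) / Pr(Ai)
        return Pr(B & Ai) / Pr(Ai)

    # Law of total probability: Pr(B) = ∑i Pr(B|Ai) Pr(Ai).
    assert Pr(B) == sum(cond(B, Ai) * Pr(Ai) for Ai in A)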
Suppose we are given a test for a condition. Let A be the event that a patient has the condition, and let B be the event that the test comes back positive.
The probability that a patient has the condition is Pr(A)=1/10000. The test has a false positive rate of Pr(B|Ā)=1/100 (a false positive is when the test says "yes" despite the fact that the patient does not have the disease), and a false negative rate of Pr(B̄|A)=5/100.
Suppose a patient tests positive. What is the probability that they have the disease? In other words, what is Pr(A|B)?
Bayes's rule tells us Pr(A|B)=Pr(B|A)Pr(A)/Pr(B). We can find Pr(B|A) using the fact from last lecture: Pr(B|A)=1−Pr(B̄|A)=95/100. Pr(A) is given. We can use the law of total probability to find Pr(B): Pr(B)=Pr(B|A)Pr(A)+Pr(B|Ā)Pr(Ā).
Plugging everything in, we have
Pr(A|B) = Pr(B|A)Pr(A) / [Pr(B|A)Pr(A) + Pr(B|Ā)Pr(Ā)]
        = (95/100)(1/10000) / [(95/100)(1/10000) + (1/100)(9999/10000)]
        = 95/(95 + 9999)
        ≈ 1/100
This is a surprising result: we take a test that fails <5% of the time, and it says we have the disease, yet we have only about a 1% chance of having the disease.
However, note that our chances have grown from 0.0001 to about 0.01, so we did learn quite a bit from the test.
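Aside: if you want to double-check the arithmetic, here is a short Python version of the computation (the variable names and the use of exact fractions are my own choices, not part of the lecture):

    from fractions import Fraction

    pr_A = Fraction(1, 10000)           # Pr(A): patient has the condition
    pr_B_given_notA = Fraction(1, 100)  # false positive rate Pr(B|Ā)
    pr_notB_given_A = Fraction(5, 100)  # false negative rate Pr(B̄|A)

    pr_B_given_A = 1 - pr_notB_given_A  # 95/100, from last lecture's fact
    pr_notA = 1 - pr_A                  # 9999/10000

    # Law of total probability for Pr(B), then Bayes's rule for Pr(A|B).
    pr_B = pr_B_given_A * pr_A + pr_B_given_notA * pr_notA
    pr_A_given_B = pr_B_given_A * pr_A / pr_B

    print(pr_A_given_B)          # 95/10094
    print(float(pr_A_given_B))   # ≈ 0.0094, i.e. roughly 1/100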
Definition: An (R-valued) random variable X is a function X:S→R.
Definition: The expected value of X, written E(X), is given by E(X)::=∑k∈S X(k)⋅Pr({k})
Definition: Given a random variable X and a real number x, the poorly-named event (X=x) is defined by (X=x)::={k∈S∣X(k)=x}.
This definition is useful because it allows us to ask "what is the probability that X=x?"
Claim: (alternate definition of E(X)) E(X)=∑x∈R x⋅Pr(X=x)
Proof sketch: this is just grouping together the terms of the original definition that correspond to outcomes with the same value of X; each such group of terms sums to x⋅Pr(X=x).
Note: You may be concerned about "∑x∈R". In discrete examples, Pr(X=x)=0 for all but countably many x, so this sum reduces to a finite or at least countable sum. In non-discrete examples, the summation can be replaced by an integral. Measure theory is a branch of mathematics that puts this distinction on firmer theoretical footing by replacing both the summation and the integral with the so-called "Lebesgue integral". In this course, we will simply use "∑" with the understanding that it becomes an integral when the random variable is continuous.
Example: Suppose I roll a fair 6-sided die. On an even roll, I win $10. On an odd roll, I lose however much money is shown. We can model the experiment (rolling a die) using the sample space S={1,2,3,4,5,6} and an equiprobable measure. The result of the experiment is given by the random variable X:S→R given by X(1)::=−1, X(2)::=10, X(3)::=−3, X(4)::=10, X(5)::=−5, and X(6)::=10.
According to the definition,
E(X) = (1/6)X(1) + (1/6)X(2) + (1/6)X(3) + (1/6)X(4) + (1/6)X(5) + (1/6)X(6)
     = (1/6)(−1) + (1/6)(10) + (1/6)(−3) + (1/6)(10) + (1/6)(−5) + (1/6)(10)
     = 21/6 = 3.5
According to the alternate definition, E(X) is given by
E(X) = (−1)Pr(X=−1) + (−3)Pr(X=−3) + (−5)Pr(X=−5) + 10⋅Pr(X=10)
     = (−1)(1/6) + (−3)(1/6) + (−5)(1/6) + (10)(1/6+1/6+1/6)
     = 21/6 = 3.5
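Aside: a short Python check (the dict encoding of X is just an illustrative convention) that the two definitions give the same number for this example:

    from fractions import Fraction

    # The die example: sample space S and the random variable X as a dict.
    S = {1, 2, 3, 4, 5, 6}
    X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}

    def Pr(E):
        # Equiprobable measure on S.
        return Fraction(len(E), len(S))

    # Original definition: sum over outcomes k ∈ S.
    E1 = sum(X[k] * Pr({k}) for k in S)

    # Alternate definition: sum over values x, weighted by Pr(X = x).
    E2 = sum(x * Pr({k for k in S if X[k] == x}) for x in set(X.values()))

    assert E1 == E2 == Fraction(7, 2)   # both give 21/6 = 3.5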
Definition: The probability mass function (PMF) of X is the function PMF_X:R→R given by PMF_X(x)=Pr(X=x).
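Aside: continuing the die example, the PMF can be written as a small Python function (again, just an illustrative encoding):

    from fractions import Fraction

    S = {1, 2, 3, 4, 5, 6}
    X = {1: -1, 2: 10, 3: -3, 4: 10, 5: -5, 6: 10}

    def pmf_X(x):
        # PMF_X(x) = Pr(X = x) = Pr({k ∈ S : X(k) = x}), equiprobable measure.
        event = {k for k in S if X[k] == x}
        return Fraction(len(event), len(S))

    print(pmf_X(10))   # 1/2
    print(pmf_X(-1))   # 1/6
    print(pmf_X(7))    # 0 (values X never takes have probability 0)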