Statement of Bayes's rule
Definition: Two events A and B are independent if Pr(A∩B)=Pr(A)Pr(B).
Important: Do not assume events are independent unless given a good reason to do so.
Example: Suppose I roll two 6-sided dice. For either die, the probability of getting any number from 1...6 is 1/6. What is the probability of getting a pair of twos?
Answer: It could be anything from 0 to 1/6. For instance, the dice could be taped together so that (2,2) is impossible, while each die on its own still shows each number with probability 1/6.
If we are also told that the two rolls are independent, then we can conclude that Pr({(2,2)})=Pr({(2,n)∣n∈{1,…,6}}∩{(n,2)∣n∈{1,…,6}})=Pr({(2,n)∣n∈{1,…,6}})⋅Pr({(n,2)∣n∈{1,…,6}})=(1/6)(1/6)
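The range of possible answers can be checked by writing down a few joint distributions explicitly. The sketch below is only an illustration: the two "taped" couplings (`always_equal` and `never_equal`) are hypothetical choices made for this example, not the only ones possible. Each one gives both dice uniform marginals, yet Pr({(2,2)}) differs.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Independent fair dice: every ordered pair has probability 1/36.
independent = {(a, b): Fraction(1, 36) for a, b in product(faces, faces)}

# Hypothetical taping where both dice always show the same face.
always_equal = {(a, b): Fraction(1, 6) if a == b else Fraction(0)
                for a, b in product(faces, faces)}

# Hypothetical taping where the second die shows the next face (6 wraps to 1),
# so a pair of equal faces never occurs.
never_equal = {(a, b): Fraction(1, 6) if b == a % 6 + 1 else Fraction(0)
               for a, b in product(faces, faces)}


def marginals(joint):
    """Return the distributions of the first die and of the second die."""
    first = {n: sum(p for (a, _), p in joint.items() if a == n) for n in faces}
    second = {n: sum(p for (_, b), p in joint.items() if b == n) for n in faces}
    return first, second


for name, joint in [("independent", independent),
                    ("always_equal", always_equal),
                    ("never_equal", never_equal)]:
    first, second = marginals(joint)
    uniform = all(p == Fraction(1, 6)
                  for p in list(first.values()) + list(second.values()))
    print(name, "| uniform marginals:", uniform, "| Pr((2,2)) =", joint[(2, 2)])
```

Only the first distribution satisfies the definition of independence; the other two show why the marginal probabilities alone do not determine Pr({(2,2)}).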
Definition: If A and B are events, then the probability of A given B, written Pr(A|B), is given by Pr(A|B)::=Pr(A∩B)/Pr(B). Note that Pr(A|B) is only defined if Pr(B)≠0.
Intuitively, Pr(A|B) is the probability of A in a new sample space created by restricting our attention to the subset of the sample space where B occurs. We divide by Pr(B) so that Pr(B|B)=1.
Note: A|B is not defined, only Pr(A|B); this is an abuse of notation, but is standard.
Conditional probability can be used to give an equivalent definition of independence:
Claim: If Pr(B)≠0, then A and B are independent if and only if Pr(A|B)=Pr(A).
This is perhaps a more intuitive notion of independence: if I tell you that B happened, it doesn't change your estimate of the probability that A happens.
Proof: (⇒) Suppose A and B are independent. We wish to show Pr(A|B)=Pr(A). Well, by definition, Pr(A|B)=Pr(A∩B)/Pr(B). Since A and B are independent, Pr(A∩B)=Pr(A)Pr(B). Plugging this in, we see that Pr(A|B)=Pr(A)Pr(B)/Pr(B)=Pr(A).
(⇐) Suppose Pr(A|B)=Pr(A). Then Pr(A)=Pr(A∩B)/Pr(B). Multiplying both sides by Pr(B) gives Pr(A)Pr(B)=Pr(A∩B), which is the definition of independence.
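The claim can be spot-checked on a small finite sample space. The sketch below assumes two fair, independent dice, so that all 36 ordered pairs are equally likely; the helper names `pr`, `pr_given`, and `independent` are introduced here only for illustration. It compares the product definition of independence with the conditional-probability version for one independent pair of events and one dependent pair.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of rolls, assumed equally likely.
S = set(product(range(1, 7), repeat=2))


def pr(event):
    """Probability of an event (a subset of S) under the uniform measure."""
    return Fraction(len(event & S), len(S))


def pr_given(a, b):
    """Pr(A|B) = Pr(A ∩ B) / Pr(B); undefined when Pr(B) = 0."""
    if pr(b) == 0:
        raise ValueError("Pr(A|B) is undefined when Pr(B) = 0")
    return pr(a & b) / pr(b)


def independent(a, b):
    """Product definition: Pr(A ∩ B) = Pr(A) Pr(B)."""
    return pr(a & b) == pr(a) * pr(b)


first_is_2 = {s for s in S if s[0] == 2}
second_is_2 = {s for s in S if s[1] == 2}
sum_is_8 = {s for s in S if s[0] + s[1] == 8}

# Both tests agree for an independent pair of events...
print(independent(first_is_2, second_is_2),
      pr_given(first_is_2, second_is_2) == pr(first_is_2))
# ...and for a dependent pair: knowing the first roll changes Pr(sum is 8).
print(independent(sum_is_8, first_is_2),
      pr_given(sum_is_8, first_is_2) == pr(sum_is_8))
```

In both cases the two tests give the same verdict, as the claim predicts. This is a check on two examples, not a substitute for the proof.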
Bayes's rule is a simple way to compute Pr(A|B) from Pr(B|A).
Claim: (Bayes's rule): Pr(A|B)=Pr(B|A)Pr(A)/Pr(B).
Proof: left as exercise.
Example: See next lecture
Using conditional probability, we can draw a tree to help discover the probabilities of various events. Each branch of the tree partitions part of the sample space into smaller parts.
For example: suppose that it rains with probability 30%. Suppose that when it rains, I bring my umbrella 3/4 of the time, while if it is not raining, I bring my umbrella with probability 1/10. Given that I bring my umbrella, what is the probability that it is raining?
One way to model this problem is with the sample space
S={raining (r), not raining (nr)}×{umbrella (u), no umbrella (nu)}={(r,u),(nr,u),(r,nu),(nr,nu)}
Let R be the event "it is raining". Then R={(r,u),(r,nu)}. Let U be the event "I bring my umbrella". Then U={(r,u),(nr,u)}.
The problem tells us that Pr(R)=3/10. It also states that Pr(U|R)=3/4 while Pr(U|R̄)=1/10, where R̄ denotes the complement of R (it is not raining). We can use the following fact:
Fact: Pr(Ā|B)=1−Pr(A|B). Proof left as exercise.
to conclude that Pr(Ū|R)=1/4 and Pr(Ū|R̄)=9/10.
We can draw a tree:
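[Figure: probability tree branching first on whether it rains (R or its complement), then on whether I bring my umbrella (U or its complement)]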
We can compute the probabilities of the events at the leaves by multiplying along the paths. For example, Pr({(r,u)})=Pr(U∩R)=Pr(R)Pr(U|R)=(3/10)(3/4)=(9/40)
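To answer our question, we are interested in Pr(R|U)=Pr(U∩R)/Pr(U). The event U consists of the two leaves (r,u) and (nr,u), so Pr(U)=Pr({(r,u)})+Pr({(nr,u)})=(3/10)(3/4)+(7/10)(1/10)=9/40+7/100=59/200. Therefore Pr(R|U)=(9/40)/(59/200)=45/59≈0.76.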
Note we could also answer this using Bayes's rule and the law of total probability; it would amount to exactly the same calculation. The tree just helps organize all of the variables.
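The tree computation can be reproduced with exact fractions. The following is a minimal sketch; the variable names are chosen here for illustration. It multiplies along each path to get the leaf probabilities, sums the two umbrella leaves to get Pr(U), and then computes Pr(R|U) both from the leaves and from Bayes's rule.

```python
from fractions import Fraction

# Numbers given in the problem statement.
pr_rain = Fraction(3, 10)
pr_umbrella_given_rain = Fraction(3, 4)
pr_umbrella_given_no_rain = Fraction(1, 10)

# Leaf probabilities: multiply along each path of the tree.
leaves = {
    ("r", "u"): pr_rain * pr_umbrella_given_rain,
    ("r", "nu"): pr_rain * (1 - pr_umbrella_given_rain),
    ("nr", "u"): (1 - pr_rain) * pr_umbrella_given_no_rain,
    ("nr", "nu"): (1 - pr_rain) * (1 - pr_umbrella_given_no_rain),
}
assert sum(leaves.values()) == 1  # the leaves partition the sample space

# Law of total probability: U is the union of the two umbrella leaves.
pr_umbrella = leaves[("r", "u")] + leaves[("nr", "u")]

# Definition of conditional probability, read off the tree.
pr_rain_given_umbrella = leaves[("r", "u")] / pr_umbrella

# Bayes's rule gives the same value.
bayes = pr_umbrella_given_rain * pr_rain / pr_umbrella

print("Pr(U)   =", pr_umbrella)
print("Pr(R|U) =", pr_rain_given_umbrella, "(tree),", bayes, "(Bayes)")
```

Using `Fraction` rather than floating point keeps every intermediate value exact, which makes it easy to compare against a hand computation.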