Lecture 12: probability

Aside: proving facts about sets

Occasionally I assert properties of sets. For example, in the previous lecture I asserted that $A \cup B = (A \setminus B) \cup (A \cap B) \cup (B \setminus A)$, while in today's lecture I asserted that if $A \subseteq S$, then $A \cup (S \setminus A) = S$.

On your homework, you may assert these kinds of properties without proof as long as:

  1. They are clearly stated.
  2. They are true.
  3. They do not trivialize the problem.

For example, if asked to prove that $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, it is not enough to say "this is obvious", but it is fine to say that this is obvious in the context of another proof.

To show you how such a proof would go, I gave the following example:

Claim: If $E \subseteq S$ then $E \cup (S \setminus E) = S$.

Proof: We must show that if $x$ is in the left-hand side, then it is in the right-hand side, and vice versa; this is what it means for two sets to be equal.

First, choose an arbitrary $x \in E \cup (S \setminus E)$. By definition of $\cup$, either $x \in E$ or $x \in S \setminus E$. In the former case, since $E \subseteq S$, we see that $x \in S$, while in the latter case, $x \in S$ because $S \setminus E$ is the set of elements of $S$ that don't appear in $E$. In either case, $x \in S$, completing the proof in this direction.

For the other direction, assume $x \in S$. Then either $x \in E$ or $x \notin E$. In the former case, $x \in E \cup (S \setminus E)$ by definition of $\cup$. In the latter case, by definition of $\setminus$, we see that $x \in S \setminus E$, so that $x \in E \cup (S \setminus E)$, again by definition of $\cup$. In either case, $x$ is in $E \cup (S \setminus E)$, completing the proof in this direction.

Here is another example:

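Claim: $E \cap (S \setminus E) = \emptyset$.

Proof: by contradiction. Suppose $E \cap (S \setminus E) \neq \emptyset$. Then there exists some $x \in E \cap (S \setminus E)$. By definition of $\cap$, we see $x \in E$ and $x \in S \setminus E$. By definition of $\setminus$, we see that $x \notin E$, but this contradicts the fact that $x \in E$.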
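As a sanity check (not a substitute for the proofs above), both claims can be tested on a small finite example. The sketch below uses Python's built-in sets; the particular sets `S` and `E` are arbitrary choices made for illustration.

```python
# Arbitrary example sets (chosen for illustration); E must be a subset of S.
S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}
assert E <= S  # E is a subset of S

complement = S - E  # S \ E: the elements of S that do not appear in E

union_is_S = (E | complement) == S                 # E ∪ (S \ E) = S
intersection_is_empty = (E & complement) == set()  # E ∩ (S \ E) = ∅

print(union_is_S, intersection_is_empty)
```

Passing this check for one choice of `S` and `E` only shows the claims are plausible; the proofs are what establish them for every set.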

Definitions

I will use $2^S$ and $\mathrm{Pow}(S)$ interchangeably to refer to the power set of $S$; I prefer $2^S$ (it's shorter) but did not want to use it before we proved that $|2^S| = 2^{|S|}$. Recall that the power set of $S$ is the set of all subsets of $S$.
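For a small finite set, the power set can be enumerated directly and its size compared with $2^{|S|}$. This is only an illustrative sketch; the helper name `power_set` and the example set are assumptions of this example, not notation from the lecture.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the set of all subsets of s, each represented as a frozenset."""
    items = list(s)
    # Take all subsets of size 0, 1, ..., len(items).
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(sub) for sub in subsets}

S = {"a", "b", "c"}  # arbitrary example set
P = power_set(S)
print(len(P), 2 ** len(S))
```

Subsets are stored as `frozenset` values because ordinary Python sets are not hashable and so cannot be elements of another set.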

A probability space is a set $S$ (called the sample space) paired with a function $\Pr : 2^S \to \mathbb{R}$, satisfying:

  1. For all $E \subseteq S$, $\Pr(E) \geq 0$.
  2. $\Pr(S) = 1$.
  3. If $E_1$ and $E_2$ are disjoint, then $\Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2)$.

Pr is called the probability function or probability measure.

The elements of S are called outcomes; the subsets of S are called events. Thus the probability measure assigns a (non-negative) real number to every event.
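For a finite sample space, the three rules in the definition can be checked by brute force over all events. The following is a minimal sketch under assumptions of my own: the helper names `events` and `is_probability_space` are hypothetical, and the biased coin is an invented example. Exact fractions are used so that the equality tests are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import chain, combinations

def events(S):
    """Return every event (subset of S) as a frozenset."""
    items = list(S)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(sub) for sub in subsets]

def is_probability_space(S, Pr):
    """Check rules 1-3 by brute force over all events of a finite S."""
    all_events = events(S)
    rule1 = all(Pr(E) >= 0 for E in all_events)
    rule2 = Pr(frozenset(S)) == 1
    rule3 = all(
        Pr(E1 | E2) == Pr(E1) + Pr(E2)
        for E1 in all_events
        for E2 in all_events
        if not (E1 & E2)  # rule 3 only constrains disjoint pairs
    )
    return rule1 and rule2 and rule3

# Hypothetical example: a biased coin described by per-outcome weights.
weights = {"H": Fraction(2, 3), "T": Fraction(1, 3)}
S = set(weights)

def Pr(E):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[x] for x in E), Fraction(0))

print(is_probability_space(S, Pr))                          # biased coin
print(is_probability_space(S, lambda E: Fraction(len(E))))  # violates rule 2
```

The second call uses the counting function $|E|$ in place of a probability measure; it is non-negative and additive on disjoint events, but it assigns 2 rather than 1 to the whole sample space.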

Important: The probability of $E$ is not, in general, $|E|/|S|$. That formula holds for some probability spaces, but not all. Assuming that $\Pr(E) = |E|/|S|$ will lead to incorrect answers for most problems.

Examples

To model the throw of a single six-sided die, we could choose the sample space $S = \{1, 2, \dots, 6\}$. If we wanted to assume that all outcomes were equally likely, we could define $\Pr(E) = |E|/6$, but this is only one possible definition; we could certainly model a die with different likelihoods for different sides, which would give a different function.
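To make the contrast concrete, here is a sketch of two probability functions on the same sample space. The fair die follows the formula $|E|/6$ from the text; the loaded die's weights are invented purely for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}

def pr_fair(E):
    """Equally likely outcomes: Pr(E) = |E| / 6."""
    return Fraction(len(E), 6)

# Hypothetical loaded die: 6 comes up half the time,
# and the other five faces share the remaining half equally.
loaded_weights = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
                  4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

def pr_loaded(E):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((loaded_weights[x] for x in E), Fraction(0))

E = {6}
print(pr_fair(E), pr_loaded(E), pr_loaded(S))
```

Both functions assign probability 1 to the whole sample space, but they disagree on the event $\{6\}$, so at most one of them can equal $|E|/|S|$.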

There are many ways to model a throw of two dice. One possible sample space is

$S_1 = \{1, 2, 3, \dots, 12\}$

Another possible sample space is $S_2 = N \times N$ where $N = \{1, 2, \dots, 6\}$.
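The two sample spaces lead to quite different probability functions. The sketch below rests on two assumptions that the text does not state: that the outcomes of $S_1$ stand for the sum of the two faces, and that the dice are fair and independent, so that every pair in $S_2$ is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N = range(1, 7)
S2 = list(product(N, N))  # all 36 ordered pairs (first die, second die)

def pr_S2(E):
    """Assumed fair, independent dice: every pair has probability 1/36."""
    return Fraction(len(E), len(S2))

# Assumption: an outcome of S1 is the sum of the two faces.
# Count how many pairs in S2 produce each sum.
sum_counts = Counter(a + b for a, b in S2)
S1 = range(1, 13)
pr_sum = {s: Fraction(sum_counts[s], len(S2)) for s in S1}

print(pr_sum[7], pr_sum[2], pr_sum[1])
print(sum(pr_sum.values()))
```

Under these assumptions the function on $S_2$ is just $|E|/36$, while on $S_1$ the outcomes have unequal probabilities (and the outcome 1 has probability 0), so $|E|/12$ would be wrong there.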

There are a few things that determine a good choice:

Another example: suppose we wanted to perform an experiment by selecting a student from the room uniformly at random and sampling their height. Possible sample spaces include:

Again, these are all perfectly reasonable ways to model the experiment (they will of course have different probability functions). However, some of them make it easier to write down the probability function.

Properties of probability spaces

Everything else that we know about probability is derived from the definition. Here are some examples:

Notation: if there is a sample space that is clear from context, I will write $\bar{E}$ (read "E complement") for $S \setminus E$.

Claims about probability all assume that S and Pr form a probability space; I will not explicitly write this down.

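Claim: $\Pr(E) + \Pr(\bar{E}) = 1$ (alternatively, $\Pr(\bar{E}) = 1 - \Pr(E)$).

Proof: As shown above, $E$ and $\bar{E}$ are disjoint, so

$$\begin{aligned}
\Pr(E) + \Pr(\bar{E}) &= \Pr(E \cup \bar{E}) && \text{by rule 3} \\
&= \Pr(S) && \text{since } E \cup \bar{E} = S \\
&= 1 && \text{by rule 2}
\end{aligned}$$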

Claim: For all $E$, $\Pr(E) \leq 1$.

Proof: For the sake of contradiction, suppose there were some $E$ with $\Pr(E) > 1$. By rule 1, we know $\Pr(\bar{E}) \geq 0$. Adding these inequalities together, we see that $\Pr(E) + \Pr(\bar{E}) > 1 + 0 = 1$. But by the previous claim, we know that $\Pr(E) + \Pr(\bar{E}) = 1$; this is a contradiction.
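Both derived properties can be spot-checked on a small example by enumerating every event. This is only a sketch: the three-outcome space and its weights are invented for illustration, and the helper name `events` is hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical probability space given by per-outcome weights that sum to 1.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
S = frozenset(weights)

def Pr(E):
    """Probability of an event: the sum of the weights of its outcomes."""
    return sum((weights[x] for x in E), Fraction(0))

def events(S):
    """Return every event (subset of S) as a frozenset."""
    items = list(S)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [frozenset(sub) for sub in subsets]

# Pr(E) + Pr(complement of E) = 1 for every event E.
complement_rule_holds = all(Pr(E) + Pr(S - E) == 1 for E in events(S))
# Pr(E) <= 1 for every event E.
upper_bound_holds = all(Pr(E) <= 1 for E in events(S))

print(complement_rule_holds, upper_bound_holds)
```

A check like this can catch a mistake in a particular model, but only the proofs show that the properties hold in every probability space.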