Probability Models
In this section we will learn how to mathematically represent and reason about randomness. One benefit of having an explicit mathematical model, as opposed to simply applying some set list of rules to probability situations, is that the intuitive approach to probability has serious limitations when analyzing tricky or sophisticated phenomena. Consider the following example.
Example (Exchange paradox)
Two envelopes are placed on the table in front of you, containing $x$ and $2x$ dollars for some unknown positive number $x$ (you don't know which envelope is which). You choose one of the envelopes and discover $10 inside. You have a choice to switch envelopes; should you?
On one hand, your chance of getting the better envelope was 50% to begin with, and opening the envelope did not provide any information on whether you succeeded. From this perspective, you should be indifferent to switching.
On the other hand, you might reason that the unopened envelope contains either $20 or $5, with a 50% chance of each. So on average the other envelope contains $12.50. From this perspective, you should switch.
How can we adjudicate between these contradictory analyses? It would be very helpful to have a model for the situation, that is, a mathematical object together with a way to translate questions about the situation into unambiguous questions about the object. This provides separation of concerns: questions about the model are math questions and can be answered with mathematical certainty. Any remaining uncertainty about the applicability of conclusions will pertain to whether the model suitably reflects reality.
Our first probability model
Let's set the exchange paradox aside and develop a model for the following simple experiment: two flips of a fair coin. We begin by observing that we can write down all four possible outcomes:
$$\{(H,H), (H,T), (T,H), (T,T)\}.$$
This is clearly an important set; let's call it the sample space and denote it as $\Omega$.
Furthermore, we need a way to specify how likely each outcome is to occur. It seems reasonable in this scenario to believe that each of the four outcomes is equally likely, in which case we should assign a probability value of $\frac{1}{4}$ to each outcome. The general mathematical object which assigns a particular value to each element in a set is a function.
So all together, we have
- the sample space $\Omega$, which contains the possible outcomes of the experiment, and
- the probability mass function $m$ from $\Omega$ to $[0,1]$ which indicates the probability of each outcome in $\Omega$.
The pair $(\Omega, m)$ is already enough to specify the experiment, but we need a few more translations for the model to be useful.
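As a minimal sketch of this model in Python, the sample space can be held in a set and the probability mass function in a dictionary. The names `omega` and `pmf` and the tuple-of-characters representation of outcomes are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of flips, e.g. ("H", "T").
omega = set(product("HT", repeat=2))

# Probability mass function: equal mass on each of the four outcomes.
pmf = {outcome: Fraction(1, len(omega)) for outcome in omega}

print(sorted(omega))      # the four outcomes
print(pmf[("H", "T")])    # 1/4
print(sum(pmf.values()))  # 1
```

Exact fractions are used instead of floats so that the masses sum to exactly 1.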
Events
In the context of the experiment, an event is a predicate whose occurrence can be determined from the outcome of the experiment.
Exercise
Identify a mathematical object in our model which can be said to correspond to the phrase "the first flip turns up heads". How is this object related to the sample space?
Solution. The outcomes $(H,H)$ and $(H,T)$ are the ones which satisfy the condition "the first flip turns up heads". Therefore, the event corresponds to a subset of $\Omega$, namely the subset $\{(H,H), (H,T)\}$.
Exercise
Explain how to obtain the probability of an event from the probability mass function.
For concreteness, consider $\Omega = \{(H,H), (H,T), (T,H), (T,T)\}$, a probability mass function which assigns mass $\frac{1}{4}$ to each outcome, and the event $\{(H,H), (H,T)\}$.
Solution. The probability of the event is the sum of the probabilities of the two outcomes in the event, namely $\frac{1}{4} + \frac{1}{4} = \frac{1}{2}$.
In general, we sum all of the probability masses of the outcomes in the event to find the probability of the event.
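The summation rule can be sketched in a few lines of Python. The helper name `probability` and the dictionary representation of the mass function are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

omega = set(product("HT", repeat=2))
pmf = {outcome: Fraction(1, 4) for outcome in omega}

def probability(event, pmf):
    """Sum the probability masses of the outcomes in the event."""
    return sum((pmf[outcome] for outcome in event), Fraction(0))

# The event "the first flip turns up heads" as a subset of omega.
first_flip_heads = {outcome for outcome in omega if outcome[0] == "H"}

print(probability(first_flip_heads, pmf))  # 1/2
```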
Some common terms for combining and modifying predicates include and, or, and not. For example, we might be interested in the event "the first flip comes up heads and the second does not come up heads, or the first flip comes up tails". Each of these terms corresponds to one of the set-theoretic operations we have learned:
Exercise
Match each term to its corresponding set-theoretic operation by appropriately sorting the items in the second list. Assume that $E$ and $F$ are events.
For concreteness, you can think about the events "first flip comes up heads" and "second flip comes up heads" for the two-flip probability space we've been considering.
- the event that $E$ and $F$ both occur
- the event that $E$ does not occur
- the event that either $E$ occurs or $F$ occurs

- the intersection $E \cap F$
- the union $E \cup F$
- the complement $E^c$
Solution. The event that both $E$ and $F$ occur is $E \cap F$, since $E \cap F$ is the set of outcomes in both $E$ and $F$.
The event that $E$ does not occur is $E^c$, since the complement of $E$ includes all the outcomes that are not in $E$.
The event that either $E$ or $F$ occurs is $E \cup F$, since $E \cup F$ is the set of outcomes which are in either $E$ or $F$.
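Python's built-in set operators mirror these three operations directly. The sketch below assumes the two-flip sample space with outcomes stored as tuples; the variable names are illustrative.

```python
from itertools import product

omega = set(product("HT", repeat=2))
E = {w for w in omega if w[0] == "H"}  # first flip comes up heads
F = {w for w in omega if w[1] == "H"}  # second flip comes up heads

both = E & F       # "and" is intersection
not_E = omega - E  # "not" is complement relative to omega
either = E | F     # "or" is union

# "(first heads and second not heads) or first tails"
compound = (E & (omega - F)) | (omega - E)

print(sorted(both))      # [('H', 'H')]
print(sorted(compound))  # [('H', 'T'), ('T', 'H'), ('T', 'T')]
```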
Exercise
Suppose a group of $n$ friends enter the lottery. For $i \in \{1, \ldots, n\}$ let $E_i$ be the event that the $i$th friend wins. Express the following events using set notation.
- At least one friend loses.
- All friends win.
- At least one friend wins.
Solution.
- The event that at least one friend loses is $\bigcup_{i=1}^{n} E_i^c$.
- The event that all friends win is $\bigcap_{i=1}^{n} E_i$.
- The event that at least one friend wins is $\bigcup_{i=1}^{n} E_i$.
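These three events can be checked on a small instance. The sketch below is a toy model under stated assumptions: three friends, with each outcome recorded as a tuple of booleans saying who wins. Neither the group size nor this representation comes from the exercise.

```python
from itertools import product

n = 3  # hypothetical number of friends
# Each outcome records, for every friend, whether that friend wins.
omega = set(product([True, False], repeat=n))

# E[i] is the event that friend i wins.
E = [{w for w in omega if w[i]} for i in range(n)]

all_win = set.intersection(*E)
at_least_one_wins = set.union(*E)
at_least_one_loses = set.union(*(omega - E_i for E_i in E))

# "At least one friend loses" is the complement of "all friends win".
print(at_least_one_loses == omega - all_win)  # True
```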
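Since events play a more prominent role than individual outcomes in discussions of probability, we will demote the probability mass function to auxiliary status and instead focus on the function $\mathbb{P}$ from the set of events to $[0,1]$ which assigns to each event the total probability mass therein. For example, for our two-flip experiment, the function $\mathbb{P}$ satisfies
$$\mathbb{P}(\{(H,T)\}) = \tfrac{1}{4}, \quad \mathbb{P}(\{\}) = 0, \quad \mathbb{P}(\{(H,H), (H,T), (T,T)\}) = \tfrac{3}{4}, \quad \mathbb{P}(\Omega) = 1,$$
and so on.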
Exercise
If $\Omega = \{(H,H), (H,T), (T,H), (T,T)\}$, how many elements are in the domain of $\mathbb{P}$?
Solution. The domain of $\mathbb{P}$ is the set of subsets of $\Omega$. Since $\Omega$ has 4 elements, there are $2^4 = 16$ elements in the domain of $\mathbb{P}$.
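The count can be confirmed by listing every subset of the sample space. The helper name `all_events` is an assumption made for this sketch.

```python
from itertools import combinations, product

omega = sorted(product("HT", repeat=2))

def all_events(outcomes):
    """Return every subset of the outcomes, each as a frozenset."""
    return [
        frozenset(subset)
        for size in range(len(outcomes) + 1)
        for subset in combinations(outcomes, size)
    ]

events = all_events(omega)
print(len(events))  # 16
```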
We call $\mathbb{P}$ the probability measure associated with the probability mass function $m$. The pair $(\Omega, \mathbb{P})$ is called a probability space. Probability measures satisfy the following properties.
Theorem (Properties of a probability measure)
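If $(\Omega, \mathbb{P})$ is a probability space, then
- $\mathbb{P}(\Omega) = 1$ ("something has to happen")
- $\mathbb{P}(E) \geq 0$ for all $E \subset \Omega$ ("probabilities are non-negative")
- $\mathbb{P}(E \cup F) = \mathbb{P}(E) + \mathbb{P}(F)$ if $E$ and $F$ are mutually exclusive events ("probability is additive")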
These are the fundamental properties of a probability measure on a finite sample space $\Omega$, in the sense that functions from the set of events to $[0,1]$ satisfying the above properties are in one-to-one correspondence with probability mass functions.
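For the two-flip experiment, the three properties can be checked by brute force over all 16 events. This is only a sanity check on one small example, not a proof; the function name `P` and the dictionary `pmf` are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

omega = frozenset(product("HT", repeat=2))
pmf = {w: Fraction(1, 4) for w in omega}

def P(event):
    """Probability measure: total mass of the outcomes in the event."""
    return sum((pmf[w] for w in event), Fraction(0))

events = [
    frozenset(c)
    for r in range(len(omega) + 1)
    for c in combinations(sorted(omega), r)
]

total_is_one = P(omega) == 1
nonnegative = all(P(E) >= 0 for E in events)
# Additivity is only required for mutually exclusive (disjoint) events.
additive = all(
    P(E | F) == P(E) + P(F)
    for E in events for F in events if not (E & F)
)
print(total_is_one, nonnegative, additive)  # True True True
```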
One further important property is a consequence of the fundamental ones. It says that if $E$'s occurrence implies $F$'s occurrence, then $\mathbb{P}(E) \leq \mathbb{P}(F)$.
Exercise (Monotonicity)
Use the additivity property and the fact that $F = E \cup (F \setminus E)$ to show that if $E \subset F \subset \Omega$ then $\mathbb{P}(E) \leq \mathbb{P}(F)$.
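Solution. Since $E$ and $F \setminus E$ are disjoint, we have $\mathbb{P}(F) = \mathbb{P}(E) + \mathbb{P}(F \setminus E)$ by additivity. Since probabilities are non-negative, $\mathbb{P}(F \setminus E) \geq 0$, and it follows that
$$\mathbb{P}(E) \leq \mathbb{P}(F),$$
as required.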
Exercise (Subadditivity)
Show that $\mathbb{P}(E \cup F) \leq \mathbb{P}(E) + \mathbb{P}(F)$ for all events $E$ and $F$.
Use this property to show that if $E$ occurs with probability zero and $F$ occurs with probability zero, then the probability that $E$ or $F$ occurs is also zero.
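Solution. Define $\tilde{E}$ to be the set of $\omega$'s which are in $E$ but not $F$, and let $\tilde{F}$ be the set of $\omega$'s which are in $F$ but not $E$. Then
$$\mathbb{P}(E \cup F) = \mathbb{P}(\tilde{E}) + \mathbb{P}(\tilde{F}) + \mathbb{P}(E \cap F),$$
since $\tilde{E}$, $\tilde{F}$, and $E \cap F$ are disjoint and together make up $E \cup F$. Furthermore, since $\mathbb{P}(E) = \mathbb{P}(\tilde{E}) + \mathbb{P}(E \cap F)$ and similarly for $F$, we have
$$\mathbb{P}(E \cup F) = \mathbb{P}(E) + \mathbb{P}(F) - \mathbb{P}(E \cap F) \leq \mathbb{P}(E) + \mathbb{P}(F),$$
as desired.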
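We have $0 \leq \mathbb{P}(E \cup F) \leq \mathbb{P}(E) + \mathbb{P}(F) = 0$ if both $E$ and $F$ have probability zero, so $\mathbb{P}(E \cup F) = 0$ in that case.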
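Monotonicity and subadditivity can also be spot-checked numerically. The sketch below uses a hypothetical, deliberately non-uniform mass function on the two-flip sample space, with one outcome given mass zero so that the probability-zero case actually arises. These masses are an assumption for illustration and do not describe a fair coin.

```python
from fractions import Fraction
from itertools import combinations, product

omega = frozenset(product("HT", repeat=2))
# Hypothetical masses (not a fair coin); ("T", "T") has mass zero.
pmf = {
    ("H", "H"): Fraction(1, 2),
    ("H", "T"): Fraction(1, 4),
    ("T", "H"): Fraction(1, 4),
    ("T", "T"): Fraction(0),
}

def P(event):
    return sum((pmf[w] for w in event), Fraction(0))

events = [
    frozenset(c)
    for r in range(len(omega) + 1)
    for c in combinations(sorted(omega), r)
]

# E <= F is Python's subset test on frozensets.
monotone = all(P(E) <= P(F) for E in events for F in events if E <= F)
subadditive = all(P(E | F) <= P(E) + P(F) for E in events for F in events)

# Unions of probability-zero events still have probability zero.
null_events = [E for E in events if P(E) == 0]
null_unions = all(P(E | F) == 0 for E in null_events for F in null_events)

print(monotone, subadditive, null_unions)  # True True True
```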
Countable additivity
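If $\Omega$ is countably infinite, then the additivity property extends to countable additivity: if $E_1, E_2, \ldots$ is a pairwise disjoint sequence of events, then $\mathbb{P}(E_1 \cup E_2 \cup \cdots) = \mathbb{P}(E_1) + \mathbb{P}(E_2) + \cdots$.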
Exercise (Countable additivity)
Suppose that $\Omega$ is the set of ordered pairs of positive integers, with probability mass $m((i,j)) = 2^{-i-j}$ at each pair $(i,j)$. Show that the probability of the event $\{(i,j) \in \Omega : i > 2\}$ is equal to the sum of the probabilities of the events $\{(i,j) \in \Omega : i = t\}$ as $t$ ranges over $\{3, 4, 5, \ldots\}$.
Solution. The probability of the event $\{(i,j) \in \Omega : i > 2\}$ is the sum of the probability masses of all the points in $\Omega$ which lie to the right of the line $x = 2.5$. The probability of the event $\{(i,j) \in \Omega : i = t\}$ is the sum of the probability masses of all of the points on the vertical line $x = t$. So summing these probabilities over each value $t$ in $\{3, 4, 5, \ldots\}$ amounts to totalling the probability mass right of the line $x = 2.5$ in columns. Since positive quantities may be summed in any order, this column-wise sum will indeed yield the total mass right of the line $x = 2.5$.
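A truncated numerical check illustrates the column-wise argument. The sketch assumes that the mass at $(i,j)$ is $2^{-i-j}$ and that the event in question is $i > 2$, and it cuts both indices off at a hypothetical level `N`, so it approximates the infinite sums rather than computing them.

```python
from fractions import Fraction

N = 40  # truncation level (assumption: the true sums are infinite)

def mass(i, j):
    """Assumed probability mass at the pair (i, j)."""
    return Fraction(1, 2 ** (i + j))

def column(t):
    """Mass on the vertical line i = t, truncated to j <= N."""
    return sum((mass(t, j) for j in range(1, N + 1)), Fraction(0))

# Sum the columns t = 3, 4, ..., N one at a time.
column_wise = sum((column(t) for t in range(3, N + 1)), Fraction(0))

# Sum the same points in a different order (row by row).
row_wise = sum(
    (mass(i, j) for j in range(1, N + 1) for i in range(3, N + 1)),
    Fraction(0),
)

print(column_wise == row_wise)  # True
print(float(column_wise))       # close to 0.25
```

Under these assumptions the exact value is $\frac{1}{4}$, since each column $i = t$ carries total mass $2^{-t}$.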
Exercise
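Show that the function $m((i,j)) = 2^{-i-j}$ sums to 1 as $(i,j)$ ranges over the set of ordered pairs of positive integers.
Solution. The sum along the first column is $2^{-1}(2^{-1} + 2^{-2} + \cdots) = \frac{1}{2}$, and the sum along the second column is $2^{-2}(2^{-1} + 2^{-2} + \cdots) = \frac{1}{4}$, and so on. Summing these column sums, we get $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 1$, as desired.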
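Assuming the mass function is $2^{-i-j}$, the same computation can be written as a single chain of equalities, using the geometric series $\sum_{j=1}^{\infty} 2^{-j} = 1$ twice:

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} 2^{-i-j} = \sum_{i=1}^{\infty} 2^{-i} \left( \sum_{j=1}^{\infty} 2^{-j} \right) = \sum_{i=1}^{\infty} 2^{-i} \cdot 1 = 1.$$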