Topics:
• The three types of sample spaces (according to how many outcomes they contain)
These are:
Two types of discrete sample spaces:
a finite sample space – contains a finite number of outcomes
an infinite discrete, or countably infinite, sample space – contains an infinite number of outcomes which are countable in the sense that they can be put in a one-to-one correspondence with the natural numbers.
One type which is not discrete:
a continuous sample space – for us usually consists of a (possibly infinite) interval on the real line or a region in the plane – or could be in more dimensions. What distinguishes this type is that there is an uncountable number of outcomes.
These types of sample spaces will require different ways of defining probabilities and different methods of working with probabilities. You will see that in infinite sample spaces, whether they are discrete or continuous, some rather strange things can happen.
• Defining a probability distribution on a sample space. (For now, we are just treating the probabilities as numbers which satisfy the axioms given below, without worrying about what they mean.)
Definition: For a finite sample space S, a probability distribution on S assigns a real number P(A) to every event A in S, so that the following axioms hold:
Axioms:
- For every event A in S, $P(A) \ge 0$
- $P(S) = 1$
- If A and B are mutually exclusive events in S, $P(A\cup B) = P(A) + P(B)$
[If S is a countably infinite sample space, we will have to add another axiom to this list. For continuous sample spaces we will have to make some changes!]
A number of theorems follow from these axioms: (Proofs were done for some of these in class, and you can read them in the textbook. I’ve given hints for how to prove them below.)
Note: I will be using the prime notation for complements in these notes.
Theorem: $P(A') = 1 - P(A)$
The proof comes from the fact that $A$ and $A'$ are mutually exclusive and their union is the whole sample space.
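One way to write out the steps of this hint, using axioms 2 and 3 (a sketch, not necessarily the textbook's exact wording):
\[ 1 = P(S) = P(A \cup A') = P(A) + P(A') \quad\Longrightarrow\quad P(A') = 1 - P(A) \]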
Theorem: $P(\varnothing) = 0$
You might think this is obvious, but it does have to be proved! The proof comes from the fact that the null set is the complement of S, and we use the previous theorem.
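Written out as a one-line computation, the hint amounts to:
\[ P(\varnothing) = P(S') = 1 - P(S) = 1 - 1 = 0 \]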
Theorem: If $A \subset B$, then $P(A) \le P(B)$
The proof comes from breaking down B into two mutually exclusive pieces: $B = A\cup (B\cap A')$. (Draw a Venn diagram to see why this is true.)
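A sketch of how the decomposition finishes the proof: apply axiom 3 to the two pieces, then use axiom 1 on the second piece.
\[ P(B) = P(A) + P(B \cap A') \ge P(A), \quad \text{since } P(B \cap A') \ge 0 \]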
Theorem: For any event A, $P(A) \le 1$
The proof comes from the fact that all probabilities are non-negative (axiom 1) together with the fact that $P(A) + P(A') = 1$ (which was used in proving the first theorem).
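In symbols, the argument can be sketched as:
\[ P(A) = 1 - P(A') \le 1, \quad \text{since } P(A') \ge 0 \]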
This theorem is very important to keep in mind. It often happens that people make errors in computing probabilities and end up with a probability which is larger than 1. You should instantly realize that this must be wrong. (Similarly if you end up with a probability which is negative.)
Theorem: For a finite collection of events $A_1$, $A_2$, … , $A_n$ which are pairwise mutually exclusive (that is, every possible pair of them is mutually exclusive),
$P(A_1 \cup A_2 \cup \dots \cup A_n) = P(A_1) + P(A_2) + \cdots + P(A_n)$
In shorthand, this is written as
\[ P\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n P(A_i)\]
Consequence of all the above: Every event in a finite sample space is a union of finitely many single outcomes, which are mutually exclusive, so by the theorem just above its probability is the sum of the probabilities of the outcomes it contains. This means that we can define a probability distribution on a finite sample space by just giving its values for the outcomes in the sample space, and this is how we will do it in the future. The probabilities of the outcomes can be chosen in any way at all as long as they are all non-negative and they add up to 1.
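Here is a minimal Python sketch of this idea. The function name `make_distribution` and the loaded-die example are made up for illustration and are not from the textbook. The sketch checks the two conditions on the outcome probabilities and then computes $P(A)$ for any event by adding up the probabilities of the outcomes in it.

```python
from fractions import Fraction

def make_distribution(outcome_probs):
    """Check the two conditions from the text and return a function P(event)."""
    if any(p < 0 for p in outcome_probs.values()):
        raise ValueError("outcome probabilities must be non-negative")
    if sum(outcome_probs.values()) != 1:
        raise ValueError("outcome probabilities must add up to 1")

    def P(event):
        # P(A) is the sum of the probabilities of the outcomes in A.
        return sum((outcome_probs[s] for s in event), Fraction(0))

    return P

# Hypothetical example: a loaded die where 6 is three times as likely
# as each of the other faces.
loaded_die = {face: Fraction(1, 8) for face in range(1, 6)}
loaded_die[6] = Fraction(3, 8)

P = make_distribution(loaded_die)
S = set(loaded_die)   # the whole sample space {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
low = {1, 2, 3}

print(P(even))                                            # 5/8
print(P(S))                                               # 1
print(P(S - even) == 1 - P(even))                         # True
print(P(even | low) == P(even) + P(low) - P(even & low))  # True
```

Exact fractions are used instead of decimals so that the check "adds up to 1" is not affected by rounding.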
Theorem: $P(A\cup B) = P(A) + P(B) - P(A\cap B)$
Look at the Venn diagram to see why this is true.
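For those who want more than the picture, one possible derivation splits $A\cup B$ and $B$ into mutually exclusive pieces and applies axiom 3 to each:
\[ \begin{aligned} P(A\cup B) &= P(A) + P(B\cap A') \\ P(B) &= P(A\cap B) + P(B\cap A') \end{aligned} \]
Subtracting the second line from the first gives $P(A\cup B) - P(B) = P(A) - P(A\cap B)$, which rearranges to the statement of the theorem.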
Examples discussed in class: From the textbook, Example 2.3.3, and problems 2.3.1 and 2.3.4
Please note that the problems intend you to use the axioms and theorems given above, plus some very basic ideas of where probabilities come from. Several problems also assume that you will use “equally likely outcomes”. For instance, Example 2.3.3 assumes that, because there are 3 cards of the same rank as the card you drew first and 51 cards left in the deck, you can conclude that the probability of drawing a card of the same rank is 3/51. I want to point out that the textbook has not yet justified defining a probability this way, so don’t get too used to it: it needs to be justified! In particular, Example 2.3.3 should have specified that the cards were being chosen at random, or used some other wording that would mean that every card was equally likely to be chosen.
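To see what the "equally likely" assumption is doing, here is a small Python sketch that counts outcomes directly. It assumes a standard 52-card deck and that every ordered pair of two different cards is equally likely. That assumption is exactly the one the example leaves unstated, and the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))   # 52 cards as (rank, suit) pairs

# Sample space: ordered pairs (first card, second card) of different cards.
# ASSUMPTION: all of these pairs are equally likely.
draws = list(permutations(deck, 2))

# Event: the second card has the same rank as the first.
same_rank = [pair for pair in draws if pair[0][0] == pair[1][0]]

prob = Fraction(len(same_rank), len(draws))
print(len(draws), len(same_rank), prob)   # 2652 156 1/17
```

The result 156/2652 reduces to 1/17, which is the same number as 3/51. Without the equally likely assumption, the count alone would not tell us the probability.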
I’ll try to write up my discussion of the examples but it may not happen today!
Homework:
• Look for your invitation to join the Piazza discussion board for this class, which should have been sent to whatever email address Blackboard has for you. It would be from “The Piazza Team”. (New students, it will be sent sometime on Tuesday.) Or you can sign up by going directly to Piazza and using your City Tech email address to verify that you are a student here.
• The homework problems and information about the quiz next time are posted here.
Don’t forget, if you get stuck on a problem, you can post a question on Piazza. Make sure to give your question a good subject line and tell us the problem itself – we need this information in order to answer your question. And please only put one problem per posted question!
Note on my schedule: for the time being, I will only be able to be online to read and reply to emails at certain times of the day. (It is possible that I may be online at other times but I cannot guarantee it.) The times are roughly:
Monday – Friday early morning
Monday-Thursday around 2:30-3:00 PM
Sunday-Thursday evenings around 9-10 PM
Please be aware of this if you need to contact me by email. Thanks.