class: center, middle, inverse, title-slide

# Lecture 7: Probability Rules and Conditional Probabilities
## Data Analysis for Psychology in R 1
### ALEX DOUMAS & TOM BOOTH

### Department of Psychology
The University of Edinburgh

### AY 2020-2021

---

## Today

- Brief recap of probability (from last lecture) stated as rules of probability
- Rule for conditional probabilities
- Bayes rule
- Contingency tables and conditional probability
- Our first test (without the actual test!)

---

## Learning objectives

- Understand the use of probability rules
- Understand the basics of Bayes's equation and how to use it
- Understand the use of probability to test independence

---

## Rules of probability

1. For any event A:
`$$0 \leq p(A) \leq 1$$`

2. The sum of the probabilities of all possible outcomes = 1
`$$p(A_1) + p(A_2) + \dots + p(A_i) = 1$$`

3. Probability of the complement of A (Not A):
`$$p(\sim A) = 1 - p(A)$$`

---

## Rules of probability

4. For mutually exclusive events, probability of A or B:
`$$p(A \bigcup B) = p(A) + p(B)$$`

- The simple addition rule (works only for mutually exclusive events).

---

## Rules of probability

5. General addition rule:
`$$p(A \bigcup B) = p(A) + p(B) - p(A \bigcap B)$$`

--

- The general addition rule subsumes rule 4.
- It works when an event can fall into both A and B.
    - E.g., think of a playing card being red or a 7 (there are red 7s).
- It also works for mutually exclusive events.
    - If two events are mutually exclusive, then `\(p(A \bigcap B)=0\)`

---

## Rules of probability

6. If A and B are independent, probability of A and B:
`$$p(A \bigcap B) = p(A)p(B)$$`

--

- The probability of A and B will always be less than or equal to the probability of either single event.
- Why?

---

## Question

- Why is the probability of event A *and* event B occurring arrived at using multiplication?

--

- As per the "Why?" above, the probability that two things both occur, conceptually, has to be equal to or less than the probability of either single event.
- As probabilities are numbers between 0 and 1, multiplying two of them can never give a result larger than either one, so multiplication is our tool.

--

- But let's think about it in another way.

---

## (Hopefully) an intuitive answer

- Suppose we have two events.
- Event A occurs `\(\frac{1}{4}\)` of the time
- Event B occurs `\(\frac{1}{2}\)` of the time
- If the events are independent (the probability of one does not impact the probability of the other), then event B will happen with equal probability whether or not A occurs.
- So, of the total number of times A occurs, B will occur half the time.
- And of the total number of times A does not occur, B will occur half the time.

---

## (Hopefully) an intuitive answer

| | | | | | | | |
|---|---|---|---|---|---|---|---|
| o | o | o | o | o | o | o | o |
| o | o | o | o | o | o | o | o |
| o | o | o | o | o | o | o | o |
| o | o | o | o | o | o | o | o |

- Each `o` above is an event (32 in total).

---

## (Hopefully) an intuitive answer

| | | | | | | | |
|---|---|---|---|---|---|---|---|
| A | A | o | o | o | o | o | o |
| A | A | o | o | o | o | o | o |
| A | A | o | o | o | o | o | o |
| A | A | o | o | o | o | o | o |

- Event A occurs `\(\frac{1}{4}\)` of the time.
- The A's above (8 times)

---

## (Hopefully) an intuitive answer

| | | | | | | | |
|---|---|---|---|---|---|---|---|
| o | o | o | o | o | o | o | o |
| o | o | o | o | o | o | o | o |
| B | B | B | B | B | B | B | B |
| B | B | B | B | B | B | B | B |

- Event B occurs `\(\frac{1}{2}\)` of the time.
- The B's above (16 times)

---

## (Hopefully) an intuitive answer

| | | | | | | | |
|---|---|---|---|---|---|---|---|
| A | A | o | o | o | o | o | o |
| A | A | o | o | o | o | o | o |
| AB | AB | B | B | B | B | B | B |
| AB | AB | B | B | B | B | B | B |

- Events A and B occur together (intersect) on 4 occasions.
- 4/32 = 1/8
- 1/8 = 1/2 * 1/4

---

## Rules of probability

7. General multiplication rule:
`$$p(A \bigcap B) = p(A)p(B|A)$$`

--

so,
`$$p(B \bigcap A) = p(B)p(A|B)$$`

- Notice that *and* is commutative (i.e., `\(A \bigcap B = B \bigcap A\)`).
    - The probability of rain and Tuesday can't be different from the probability of Tuesday and rain, right?

--

- But what is `\(p(B|A)\)`?
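---

## (Hopefully) an intuitive answer: a sketch in code

The counting in the grid slides above can be reproduced in a few lines. This is a minimal Python sketch rather than part of the original lecture material. The grid layout (4 rows by 8 columns, A in the first two columns, B in the bottom two rows) is taken from the slides, and all variable names are illustrative assumptions.

```python
from fractions import Fraction

# Grid from the slides: 4 rows x 8 columns = 32 equally likely events.
ROWS, COLS = 4, 8
cells = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Assumed layout, matching the slides: A fills the first two columns,
# B fills the bottom two rows.
in_a = {(r, c) for (r, c) in cells if c < 2}
in_b = {(r, c) for (r, c) in cells if r >= 2}

total = len(cells)
p_a = Fraction(len(in_a), total)                # 8/32
p_b = Fraction(len(in_b), total)                # 16/32
p_a_and_b = Fraction(len(in_a & in_b), total)   # cells marked "AB"

print(p_a, p_b, p_a_and_b)

# For independent events the intersection equals the product.
assert p_a_and_b == p_a * p_b
```

Using `Fraction` keeps the arithmetic exact, so the final comparison is not affected by floating-point rounding.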
---

## Conditional probability

- But what is `\(p(B|A)\)`?
- This is referred to as **conditional probability**.
- Probability of B **given** A, which is written `\(p(B|A)\)`

--

- Note though that when A and B are independent, `\(p(B|A) = p(B)\)`
- Hence the simple multiplication rule for independent events.

---

## Conditional probability

- We can calculate conditional probability as:
`$$p(B|A) = \frac{p(A \bigcap B)}{p(A)}$$`

- or, the other way around:
`$$p(A|B) = \frac{p(A \bigcap B)}{p(B)}$$`

---

## Where does this come from?

- It comes from re-arranging our general multiplication rule.
- Remember:
`$$p(A \bigcap B) = p(A)p(B|A)$$`

--

- Divide both sides by `\(p(A)\)`
`$$\frac{p(A \bigcap B)}{p(A)} = \frac{p(A)p(B|A)}{p(A)}$$`

---

## Where does this come from?

- Cancel `\(p(A)\)` from the right-hand side
`$$\frac{p(A \bigcap B)}{p(A)} = p(B|A)$$`

--

- Re-arrange
`$$p(B|A) = \frac{p(A \bigcap B)}{p(A)}$$`

---

## Enter Bayes rule

--

- If:
`$$p(B|A) = \frac{p(A \bigcap B)}{p(A)}$$`

--

and
`$$p(A \bigcap B) = p(B \bigcap A)$$`

--

and
`$$p(B \bigcap A) = p(B)p(A|B)$$`

--

- Replace `\(p(A \bigcap B)\)` with `\(p(B)p(A|B)\)` in the top equation and you get:
`$$p(B|A) = \frac{p(B)p(A|B)}{p(A)}$$`

---

## Enter Bayes rule

- From last slide:
`$$p(B|A) = \frac{p(B)p(A|B)}{p(A)}$$`

--

- Rearrange to get:
`$$p(B|A) = \frac{p(A|B)p(B)}{p(A)}$$`

--

- That's Bayes rule!

---

## Who cares?

1. Turns out to be a big deal in psychology and statistics
    - Many psychological models are based on Bayes rule.
    - Bayesian statistics are becoming more prominent (don't worry about this point now, we'll return to it in future years).
2. Can be really useful for calculating conditional probabilities when you don't know things like `\(p(A \bigcap B)\)`

- But for now, let's look at an example of conditional probability in action...
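---

## Bayes rule: a sketch in code

The derivation above can be checked numerically. This is a minimal Python sketch under stated assumptions: the three input probabilities are hypothetical values chosen only for illustration (they do not come from the lecture), and the function names are made up for this example.

```python
from fractions import Fraction


def conditional(p_joint, p_given):
    """p(X|Y) = p(X and Y) / p(Y), the definition of conditional probability."""
    return p_joint / p_given


def bayes(p_a_given_b, p_b, p_a):
    """p(B|A) = p(A|B) p(B) / p(A), i.e. Bayes rule."""
    return p_a_given_b * p_b / p_a


# Hypothetical probabilities (assumed for illustration only).
p_a = Fraction(1, 4)
p_b = Fraction(1, 2)
p_a_and_b = Fraction(3, 16)

# Route 1: straight from the definition.
p_b_given_a = conditional(p_a_and_b, p_a)

# Route 2: via Bayes rule, starting from p(A|B).
p_a_given_b = conditional(p_a_and_b, p_b)
p_b_given_a_bayes = bayes(p_a_given_b, p_b, p_a)

print(p_b_given_a, p_b_given_a_bayes)

# Both routes must agree, because Bayes rule is just a rearrangement.
assert p_b_given_a == p_b_given_a_bayes
```

With these assumed values the events are not independent, since `p_a * p_b` is 1/8 while `p_a_and_b` is 3/16, so `p(B|A)` differs from `p(B)`.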
---

## Conditional probability: an example

|          | Left   | Right   | Marginal |
|----------|--------|---------|----------|
| Boys     | 25     | 24      | 49       |
| Girls    | 10     | 41      | 51       |
| Marginal | 35     | 65      | 100      |

- We have data on the handedness of boys and girls in a class.
- Let's say we select one student at random, and they are a girl.
- What is the probability they are left-handed?
    - `\(p(left | girl)\)`

---

## Conditional probability: an example

|          | Left   | Right   | Marginal |
|----------|--------|---------|----------|
| Boys     | 25     | 24      | 49       |
| **Girls**| **10** | **41**  | **51**   |
| Marginal | 35     | 65      | 100      |

`$$p(Left|Girl) = \frac{0.10}{0.51} = 0.196$$`

---

## Conditional probability: an example

- We can also work this through using intersections:
    - `\(p(left \bigcap girl)\)` = `\(\frac{10}{100}\)` = 0.10
    - `\(p(girl) = \frac{51}{100}\)` = 0.51

`$$p(Left|Girl) = \frac{p(Left \bigcap Girl)}{p(Girl)}$$`

`$$p(Left|Girl) = \frac{0.10}{0.51} = 0.196$$`

---

## Conditional probability: an example

- Or using Bayes equation.

`$$p(Left|Girl) = \frac{p(Girl|Left)p(Left)}{p(Girl)}$$`

|          | Left   | Right   | Marginal |
|----------|--------|---------|----------|
| Boys     | **25** | 24      | 49       |
| Girls    | **10** | 41      | 51       |
| Marginal | **35** | 65      | 100      |

`$$p(Girl|Left) = \frac{10}{35} = 0.2857$$`

`$$p(Left|Girl) = \frac{p(Girl|Left)p(Left)}{p(Girl)} = \frac{0.2857*.35}{.51} = 0.196$$`

---

## Are events independent?

- Together, the multiplication rule and contingency tables provide a way to assess whether events are independent.
- **Example**: Consider a different school with 100 pupils.
    - 55 male and 45 female.
    - Suppose we ask those students if they like stats. 40 say yes, 60 say no.
- We can tabulate the proportions and think about the probabilities.

---

## Are events independent?
|          | Yes             | No            | Marginal |
|----------|-----------------|---------------|----------|
| Male     | p(Male, Yes)    | p(Male, No)   | 0.55     |
| Female   | p(Female, Yes)  | p(Female, No) | 0.45     |
| Marginal | 0.40            | 0.60          | 1.00     |

---

## Are events independent?

|          | Yes                    | No                     | Marginal |
|----------|------------------------|------------------------|----------|
| Male     | `\(0.55*0.40 = 0.22\)` | `\(0.55*0.60 = 0.33\)` | 0.55     |
| Female   | `\(0.45*0.40 = 0.18\)` | `\(0.45*0.60 = 0.27\)` | 0.45     |
| Marginal | 0.40                   | 0.60                   | 1.00     |

- If we assume gender and liking stats to be independent, we can calculate the expected value of each cell.
- These are the products of the two marginal probabilities.
- Notice that here we do not have actual frequencies like in example 1. We have cell probabilities.

---

## Are events independent?

|          | Yes    | No    | Marginal |
|----------|--------|-------|----------|
| Male     | 0.22   | 0.33  | 0.55     |
| Female   | 0.18   | 0.27  | 0.45     |
| Marginal | 0.40   | 0.60  | 1.00     |

- From this table, **if** preference and gender are independent, then the probability of liking stats should be the same for males and females, and equal to the marginal `\(p(yes)\)` = 0.40:
    - `\(p(yes|male)\)` = `\(\frac{.22}{.55}\)` = 0.40
    - `\(p(yes|female)\)` = `\(\frac{.18}{.45}\)` = 0.40

---

## Are events independent?

|          | Yes  | No   | Marginal |
|----------|------|------|----------|
| Male     | 0.10 | 0.45 | 0.55     |
| Female   | 0.30 | 0.15 | 0.45     |
| Marginal | 0.40 | 0.60 | 1.00     |

- But what if we observed the above...

---

## Are events independent?
|          | Yes  | No   | Marginal |
|----------|------|------|----------|
| Male     | 0.10 | 0.45 | 0.55     |
| Female   | 0.30 | 0.15 | 0.45     |
| Marginal | 0.40 | 0.60 | 1.00     |

- Given this outcome, our conditional probabilities are:
    - `\(p(yes|male) = \frac{.10}{.55}\)` = 0.18
    - `\(p(yes|female) = \frac{.30}{.45}\)` = 0.67

--

- Using Bayes:
    - `\(p(yes|male) = \frac{p(male|yes)p(yes)}{p(male)} = \frac{(.1/.4)*(.4)}{.55}\)` = 0.18
    - `\(p(yes|female) = \frac{p(female|yes)p(yes)}{p(female)} = \frac{(.3/.4)*(.4)}{.45}\)` = 0.67
- These conditional probabilities differ from each other and from `\(p(yes)\)` = 0.40. In other words, gender and whether you like stats are related.

---

## Our first statistical test

- What we have just done (more or less) is all the background work needed to understand a `\(\chi^2\)` test of independence.
- This is a test of the independence of two nominal (categorical) variables.
- To use this in practice, we have a couple of extra steps, but the above is the fundamental principle.

---

# Summary of today

1. Review and extension of the rules of probability.
2. Bayes rule can be used to calculate conditional probabilities.
3. How to do a simple test for independence.

---

# Next tasks

+ Next week, we will start work on probability distributions.
+ This week:
    + Complete your lab
    + Come to office hours
    + Come to Q&A session
    + Weekly quiz - on week 7 (lect 6) content
        + Open Monday 09:00
        + Closes Sunday 17:00
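---

## Are events independent? A sketch in code

The independence check from the "Are events independent?" slides can be sketched as follows. This is a minimal Python illustration, not the course's own code. The observed cell probabilities are the ones from the slides, and the dictionary layout and variable names are assumptions made for this example.

```python
from fractions import Fraction

# Observed cell probabilities from the slides (out of 100 pupils).
observed = {
    ("male", "yes"): Fraction(10, 100),
    ("male", "no"): Fraction(45, 100),
    ("female", "yes"): Fraction(30, 100),
    ("female", "no"): Fraction(15, 100),
}
genders = ("male", "female")
answers = ("yes", "no")

# Marginal probabilities: sum across each row and each column.
p_gender = {g: sum(observed[(g, a)] for a in answers) for g in genders}
p_answer = {a: sum(observed[(g, a)] for g in genders) for a in answers}

# Expected cell probabilities if gender and answer were independent.
expected = {(g, a): p_gender[g] * p_answer[a] for g in genders for a in answers}

# Conditional probability of "yes" within each gender.
p_yes_given = {g: observed[(g, "yes")] / p_gender[g] for g in genders}

# Exact comparison of observed and expected cells (a simplification).
independent = all(observed[cell] == expected[cell] for cell in observed)

for g in genders:
    print(g, float(expected[(g, "yes")]), round(float(p_yes_given[g]), 2))
print("independent:", independent)
```

The exact comparison on the last step is only for illustration. With real samples, observed and expected values almost never match exactly, which is why the `\(\chi^2\)` test adds the extra steps mentioned in the slides.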