Week 7: Introduction to Probability

class: center, middle, inverse, title-slide

.title[
# Week 7: Introduction to Probability 
]
.subtitle[
## Data Analysis for Psychology in R 1 
]
.author[
### dapR1 Team
]
.institute[
### Department of Psychology, The University of Edinburgh
]

---

## Learning objectives
- Understand the link between probability, models and data
- Understand the basic rules of probability
- (From lab) Be able to apply these rules

---
## Today
- Why do we need probability?
- Some concepts about sets
- Assigning probability - basic rules

---
## Why probability?
+ Consider the statements:

  + There is a 50/50 chance of a fair coin landing on Heads
  
--
  
  + There is a `$\frac{13}{52}$` chance of drawing a heart from a standard deck of cards
  
--

+ In both cases, we are presenting something based on known information about the world.
  
  + *We have a model for the world*.
  
--

+ But we do not have data.
  
  + We do not have, for example, any results from tossing an actual coin.

---
## Why probability?
- In statistics and data analysis, often the opposite is true.

- We have some data, but we do not know the *truth of the world*
  - We have to make inferences about it.

- In order to make these inferences, we are going to use probability, and models of the world based on it.

- But before we do, we need to build up some concepts.

---
## What is probability?

+ The likelihood of an event’s occurrence

+ There are two ways to conceptualise probability:

--
  1. Relative frequency:
  
      + `$P(x)$`, or probability of x, is the proportion of times you would observe `$x$` if you took an infinite number of samples.
    
      + If I roll a fair die an infinite number of times, the proportion of rolls that land on 4 would be exactly 1/6.
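
We cannot roll a die an infinite number of times, but a simulation gives a feel for the relative-frequency idea. The following is a minimal sketch in R, assuming a fair six-sided die; the seed, the number of rolls, and the object names are illustrative choices rather than part of the lecture.

```r
# Simulate many rolls of a fair six-sided die
# (assumption: every face is equally likely)
set.seed(1)                # arbitrary seed, only for reproducibility
n_rolls <- 100000
rolls <- sample(1:6, size = n_rolls, replace = TRUE)

# Relative frequency of rolling a 4
prop_four <- mean(rolls == 4)
prop_four                  # close to, but not exactly, 1/6

1 / 6                      # the long-run value for comparison
```

With a finite number of rolls the proportion only approximates 1/6; increasing `n_rolls` should usually bring it closer.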

---
## What is probability?
+ The likelihood of an event’s occurrence

+ There are two ways to conceptualise probability:

  1. Relative frequency
  2. Analytic definition: 
      + Probability of an event = `$\frac{total \hspace{.1cm}successful\hspace{.1cm}outcomes}{all\hspace{.1cm}possible\hspace{.1cm}outcomes}$`
      
      + `$P(x) = \frac{a}{a+b}$`
          + `$a$` = ways that event `$x$` can occur
          + `$b$` = ways that event `$x$` can fail to occur
  
---
## What is probability?

.center[
### `$P(x) = \frac{a}{a+b}$`
]

.pull-left[
.center[
**x = Drawing a black card**
]
`$a$` = # of black cards

`$b$` = # of red cards

`$P(x) = \frac{26}{26+26}$`

`$P(x) = \frac{1}{2} = 0.50$`

]

.pull-right[
.center[
**x = Drawing a spade**
]
`$a$` = # of spades

`$b$` = # of diamonds + hearts + clubs

`$P(x) = \frac{13}{13+39}$`

`$P(x) = \frac{13}{52} = 0.25$`

]
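
The same counts can be obtained by building the deck and counting rows. This is a sketch only: the `deck` data frame and its column names are hypothetical and are not objects used elsewhere in the course.

```r
# Build a standard 52-card deck: 13 ranks in each of 4 suits
ranks <- c("A", 2:10, "J", "Q", "K")
suits <- c("spades", "clubs", "hearts", "diamonds")
deck <- expand.grid(rank = ranks, suit = suits, stringsAsFactors = FALSE)

# Spades and clubs are black; hearts and diamonds are red
deck$colour <- ifelse(deck$suit %in% c("spades", "clubs"), "black", "red")

# P(x) = a / (a + b) for x = drawing a black card
a <- sum(deck$colour == "black")   # ways x can occur
b <- sum(deck$colour != "black")   # ways x can fail to occur
p_black <- a / (a + b)             # 26 / 52 = 0.5

# x = drawing a spade
p_spade <- sum(deck$suit == "spades") / nrow(deck)   # 13 / 52 = 0.25
```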

---
## What is probability?
- The long run, or the *law of large numbers*

- Given an event `$A$` with probability `$P(A)$`, the probability that the relative frequency of `$A$` over `$N$` trials differs from `$P(A)$` by more than any fixed amount approaches 0 as `$N$` approaches infinity

.center[
![](dapR1_lec6_IntroProbability_files/figure-html/coinFlipDat-1.svg)
]
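
A running proportion of heads shows the law of large numbers at work. The sketch below assumes a fair coin and uses an arbitrary seed and number of flips; it is not necessarily the code that produced the figure.

```r
# Simulate flips of a fair coin (assumption: P(Heads) = 0.5)
set.seed(7)                # arbitrary seed, only for reproducibility
n_flips <- 10000
flips <- sample(c("Heads", "Tails"), size = n_flips, replace = TRUE)

# Proportion of heads after 1 flip, 2 flips, ..., n_flips flips
running_prop <- cumsum(flips == "Heads") / seq_len(n_flips)

# Snapshots of the running proportion as the number of flips grows
running_prop[c(10, 100, 1000, 10000)]
```

Early values can be far from 0.5, while later values tend to settle near it. Passing `running_prop` to `plot()` with `type = "l"` would draw the proportion against the number of flips.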

---
## What is probability?
- What does all that stuff get us?

- Basic idea: when all outcomes are equally likely, the probability of *x* is the number of ways *x* can happen divided by the number of ways that all things (including *x*) can happen.

- So, in its most essential sense, the business of probability is figuring out what those two numbers are (ways *x* can happen; ways that all things can happen).

- Generally, we base these calculations on structures called *sets*...

---
## Sets
- Probability is built on the theory of sets.

- **Set**: Well-defined collection of objects; composed of **elements** or **members**
    - `$A$` = {Element 1, Element 2, Element 3,...Element `$i$`}
    - `$A$` = { `$x$` | `$x$` is a student at the University of Edinburgh}
    
--

- Elements in a set are represented with the following notation:
    - `$x\in A$` = `$x$` is an element of set `$A$`
    - `$x\notin A$` = `$x$` is not an element of set `$A$`
    - `$A$` = { `$x$` | `$x$` is an integer, `$1 \leq x \leq 10$`}
        - Set `$A$` includes all elements `$x$` **such that** `$x$` is an integer greater than or equal to 1 _and_ less than or equal to 10.
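
In R, a vector can stand in for a set, and `%in%` plays the role of `$\in$`. A minimal sketch, assuming the set of integers from 1 to 10 described above:

```r
# A = { x | x is an integer, 1 <= x <= 10 }
A <- 1:10

4 %in% A        # TRUE:  4 is an element of A
11 %in% A       # FALSE: 11 is not an element of A
!(11 %in% A)    # TRUE:  the "not an element of" statement holds for 11
```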

---
## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[
 A = Set 
 
]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[
 A = Set

U = Universal Set : All possible elements in a category of interest
 
]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[
 A = Set

U = Universal Set : All possible elements in a category of interest

B = Subset 
 - If `$B$` is a subset of `$A$` :
 - All elements of `$B$` must also be in `$A$`
 - However, not every element of `$A$` has to be in `$B$` (although they all can be)
 - E.g., `$x \in B$` and `$x \in A$`. `$y \in A$`, but `$y \notin B$`
 
]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[
 A = Set

U = Universal Set : All possible elements in a category of interest

]

---
## Set Notation

.pull-left[
![](./figures/Sets.png)
]

.pull-right[

`$B \subseteq A$`
 - `$B$` is a subset of `$A$`
 
`$B \subset A$`
 - `$B$` is a **proper** subset of `$A$`
 - At least one element of `$A$` is **not** a member of `$B$`
 - `$B$` is not identical to `$A$`

`$A \not\subset B$`
  - `$A$` is not a subset of `$B$`

]
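
Base R has no single operator for `$\subseteq$`, but the definitions translate directly. The helper functions `is_subset()` and `is_proper_subset()` below are hypothetical names written for this sketch, and the example sets are arbitrary.

```r
A <- 1:10
B <- c(2, 4, 6)

# x is a subset of y if every element of x is also in y
is_subset <- function(x, y) all(x %in% y)

# x is a proper subset of y if it is a subset and the two sets are not identical
is_proper_subset <- function(x, y) is_subset(x, y) && !setequal(x, y)

is_subset(B, A)          # TRUE
is_proper_subset(B, A)   # TRUE:  A has elements that are not in B
is_subset(A, B)          # FALSE
is_proper_subset(A, A)   # FALSE: A is a subset of itself, but not a proper one
```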

---
## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[

`$U$` = All DapR1 students

]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[

`$U$` = All DapR1 students

`$A$` = Vegetarians in DapR1

]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[

`$U$` = All DapR1 students

`$A$` = Vegetarians in DapR1

`$B$` = Vegans in DapR1

]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[

`$U$` = All DapR1 students

`$A$` = Vegetarians in DapR1

`$B$` = Vegans in DapR1

`$A^c$` = DapR1 students who are not vegetarian

]

---
count: false

## Sets

.pull-left[
![](./figures/Sets.png)
]

.pull-right[

`$U$` = All DapR1 students

`$A$` = Vegetarians in DapR1

`$B$` = Vegans in DapR1

`$A^c$` = DapR1 students who are not vegetarian

`$B \subseteq A$` because all vegans are vegetarian

`$A \not\subset B$` because not all vegetarians are vegan
 
]

---
## Set Operations

+ **Union** - when an object is a member of either set `$A$` _or_ set `$B$` (or both)

.center[
<img src="./figures/Unions.png" width="55%" />

`$A\bigcup B$` = { `$x$`| `$x\in A$` **or** `$x\in B$` (or both) }
]

---
## Set Operations

+ **Intersection** - when an object is a member of set `$A$` _and_ set `$B$`

.center[
<img src="./figures/Unions.png" width="55%" />

`$A\bigcap B$` = { `$x$`| `$x\in A$` **and** `$x\in B$` }
]

---
## Set Operations

+ **Difference** - when an object is a member of set `$A$` _but not_ set `$B$`, or vice versa

.center[
<img src="./figures/Difference.png" width="50%" />
]

---
## Set Operations

+ **Empty Set** - Sets `$A$` and `$B$` are _mutually exclusive_; when `$A$` occurs, `$B$` cannot occur

.center[
<img src="./figures/emptySet.png" width="55%" />

`$A\bigcap B = \emptyset$`

]
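
The set operations on the last few slides correspond to the base R functions `union()`, `intersect()` and `setdiff()`. A short sketch with three arbitrary example sets:

```r
A <- c(1, 2, 3, 4)
B <- c(3, 4, 5, 6)
C <- c(7, 8)

union(A, B)       # 1 2 3 4 5 6  (in A or B, or both)
intersect(A, B)   # 3 4          (in A and B)
setdiff(A, B)     # 1 2          (in A but not in B)
setdiff(B, A)     # 5 6          (in B but not in A)

# A and C share no elements, so their intersection is the empty set
length(intersect(A, C)) == 0   # TRUE
```

Note that `setdiff()` is directional: swapping its arguments gives the other half of the difference.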

---
## Does some of it seem familiar?
- Working with sets is often referred to as Boolean algebra

- You will come across this when you work with logicals in R.
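
As a rough illustration of that link, logical vectors combined with `|`, `&` and `!` behave like union, intersection and complement. The conditions defining `in_A` and `in_B` are arbitrary choices for this sketch.

```r
x <- 1:10           # the universal set for this example
in_A <- x <= 6      # TRUE where a value belongs to A
in_B <- x >= 4      # TRUE where a value belongs to B

x[in_A | in_B]      # union:        1 2 3 4 5 6 7 8 9 10
x[in_A & in_B]      # intersection: 4 5 6
x[in_A & !in_B]     # difference:   1 2 3
x[!in_A]            # complement:   7 8 9 10
```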

---
## More probability terminology

- In probability, we talk about a *sample space* ( `$S$` ), which refers to all possible outcomes of a random experiment

- In the case of tossing two coins: 
  - `$S$` = {(Heads, Heads), (Heads, Tails), (Tails, Heads), (Tails, Tails)}

- An event ( `$A$` ) is a subset of the outcomes in `$S$` ( `$A \subseteq S$` )

- If the event is at least one of the two coins landing on heads:
  - `$A$` = {(Heads, Heads), (Heads, Tails), (Tails, Heads)}

- A *simple event* ( `$a$` ) = single point such that `$a \in S$`
  - `$a$` = (Heads, Heads)

---
## More probability terminology

- To further illustrate these terms, suppose we have the following experiment:

- We roll two six-sided dice.
  - The outcome is the sum of the numbers on the upward-facing sides.

- Sample Space:

|   | **1** | **2** | **3** | **4**  | **5**  | **6**  |
|---|---|---|---|----|----|----|
| **1** | 2 | 3 | 4 | 5  | 6  | 7  |
| **2** | 3 | 4 | 5 | 6  | 7  | 8  |
| **3** | 4 | 5 | 6 | 7  | 8  | 9  |
| **4** | 5 | 6 | 7 | 8  | 9  | 10 |
| **5** | 6 | 7 | 8 | 9  | 10 | 11 |
| **6** | 7 | 8 | 9 | 10 | 11 | 12 |

---
## More probability terminology

- To further illustrate these terms, suppose we have the following experiment:

- We roll two six-sided dice.
  - The outcome is the sum of the numbers on the upward-facing sides.

- Event: At least one of the dice landing on 4

| | **1** | **2** | **3** | **4** | **5** | **6** |
|---|---|---|---|----|----|----|
| **1** | 2 | 3 | 4 | 5 | 6 | 7 |
| **2** | 3 | 4 | 5 | 6 | 7 | 8 |
| **3** | 4 | 5 | 6 | 7 | 8 | 9 |
| **4** | 5 | 6 | 7 | 8 | 9 | 10 |
| **5** | 6 | 7 | 8 | 9 | 10 | 11 |
| **6** | 7 | 8 | 9 | 10 | 11 | 12 |
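
The event can also be counted by listing every ordered outcome. This is a sketch assuming two fair dice; the `rolls` data frame and its column names are illustrative.

```r
# Every ordered outcome of rolling two six-sided dice: 36 rows
rolls <- expand.grid(die1 = 1:6, die2 = 1:6)
rolls$total <- rolls$die1 + rolls$die2

# Event: at least one die lands on 4
event <- rolls$die1 == 4 | rolls$die2 == 4

sum(event)                 # 11 outcomes belong to the event
sum(event) / nrow(rolls)   # 11/36
```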
---
## More probability terminology

- To further illustrate these terms, suppose we have the following experiment:

- We roll two six-sided dice.
  - The outcome is the sum of the numbers on the upward-facing sides.

- Simple Event: The dice summing to 12

| **Event** | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|------------------|---|---|---|---|---|---|---|---|----|----|----|
| **Frequency**    | 1 | 2 | 3 | 4 | 5 | 6 | 5 | 4 | 3  | 2  | 1  |
| **Probability**  | `$\frac{1}{36}$` |  `$\frac{2}{36}$` |  `$\frac{3}{36}$` | `$\frac{4}{36}$`  |  `$\frac{5}{36}$` | `$\frac{6}{36}$`  |  `$\frac{5}{36}$` |  `$\frac{4}{36}$` |  `$\frac{3}{36}$`  |   `$\frac{2}{36}$` |  `$\frac{1}{36}$`  |

.center[
![](dapR1_lec6_IntroProbability_files/figure-html/unnamed-chunk-15-1.svg)
]
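
The frequency and probability rows can be reproduced by tabulating the 36 sums. A minimal sketch, assuming two fair dice; the object names are illustrative.

```r
# 6 x 6 matrix of sums, laid out like the sample-space table
totals <- outer(1:6, 1:6, "+")

# Number of ways to obtain each sum, and the matching probabilities
freq <- table(totals)
prob <- freq / 36

freq
round(prob, 3)

# The simple event "the dice sum to 12" can happen in only one way
prob["12"]     # 1/36
```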

???
- A probability distribution maps each value of a random variable to the probability of that value occurring.

---
## Probability from sets
- `$P(i)$` = probability of event `$i$`

- `$0 \leq P(i) \leq 1$` = probability of event `$i$` is between 0 and 1

- `$P(i_1) + P(i_2) + ... + P(i_n) = P(S) = 1$` = the probabilities of all simple events in sample space `$S$` sum to 1

- `$P(i) + P(\sim i) = 1$`, so `$P( \sim i) = 1 - P(i)$`
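
As a worked instance of the last two rules, assume a fair six-sided die (an illustrative example, not one taken from the slide). The six simple events each have probability `$\frac{1}{6}$`, so:

`$$P(1) + P(2) + \dots + P(6) = 6 \times \frac{1}{6} = 1$$`

and the probability of not rolling a 4 follows from the complement rule:

`$$P(\sim 4) = 1 - P(4) = 1 - \frac{1}{6} = \frac{5}{6}$$`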

???
Note from sets we know:
1. Probability can't be less than 0 because there are at minimum 0 ways x can occur; and probability can't be more than 1 because there cannot be more ways x can occur than possible outcomes. 
2. All of the possible outcomes make up all the ways that things can occur. Therefore, if you add up all the outcomes you get ways (i.e., way1+way2+way3+... == ways), and ways/ways == 1. 
3. If the probability of x is x/ways, and the probability of notx is notx/ways, and x and notx make up all the possible ways, then by definition x+notx == ways so (x+notx)/ways == 1

---
## Joint Probability
- Probability of A **and** B = `$p(A \bigcap B)$`
    - Joint event or *intersection* of A and B

- Probability of A **or** B = `$p(A \bigcup B)$`
    - *Union* of A and B

- Probability of **not** A = `$p(\sim A)$`
    - We describe the event not A as the *complement* of A.

---
## Relations between events
- **Mutually exclusive** events: If A occurs, B cannot occur.
- **Independent** events: The occurrence of event A does not impact event B.
- **Dependent** events: The occurrence of event A **does** impact event B.
    - Impact means that the occurrence of A changes the probability of B.

---
## Probabilities of joint events
- For mutually exclusive events, the probability of the union of those events is the sum of the individual probabilities.

- Think of the coin example:

`$$p(\text{Heads} \bigcup \text{Tails}) = p(\text{Heads}) + p(\text{Tails}) = 0.5 + 0.5 = 1.0$$`

---
## Probabilities of joint events
- For events that are not mutually exclusive:

`$$p(A \bigcup B) = p(A) + p(B) - p(A \bigcap B)$$`

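- Why? Outcomes that belong to both A and B would otherwise be counted twice. For example, the king of hearts is both a king and a heart:

`$$p(\text{king or heart}) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13} \approx 0.31$$`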
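
The addition rule can be checked against a direct count of the cards that are a king, a heart, or both. A sketch assuming a standard 52-card deck; the `deck` data frame and its column names are hypothetical.

```r
# Build a standard 52-card deck
ranks <- c("A", 2:10, "J", "Q", "K")
suits <- c("spades", "clubs", "hearts", "diamonds")
deck <- expand.grid(rank = ranks, suit = suits, stringsAsFactors = FALSE)

is_king  <- deck$rank == "K"
is_heart <- deck$suit == "hearts"

p_king  <- mean(is_king)              # 4/52
p_heart <- mean(is_heart)             # 13/52
p_both  <- mean(is_king & is_heart)   # 1/52, the king of hearts

# Addition rule, then a direct count of the union for comparison
p_king + p_heart - p_both             # 16/52
mean(is_king | is_heart)              # 16/52, the same value
```

Without subtracting `p_both`, the king of hearts would be counted once as a king and once as a heart.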

---
## Probabilities of joint events
- So far we have looked at rules for union (or) events, but what about intersection (and) events?

- If events are independent, then:

`$$p(A \bigcap B) = p(A)p(B)$$`

- Consider getting a success (Head) in a coin flip (event A), and cutting the deck on a heart (event B).

`$$p(A \bigcap B) = \frac{1}{2} \times \frac{1}{4} = \frac{1}{8} = 0.125 = 12.5\%$$`
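
One way to see why the probabilities multiply is to list the joint sample space. The sketch below assumes a fair coin and a deck cut at random, so that all eight coin-and-suit combinations are equally likely; the `outcomes` data frame is illustrative.

```r
# All combinations of one coin flip and the suit of one cut card
outcomes <- expand.grid(coin = c("Heads", "Tails"),
                        suit = c("spades", "clubs", "hearts", "diamonds"),
                        stringsAsFactors = FALSE)

p_head  <- mean(outcomes$coin == "Heads")     # 1/2
p_heart <- mean(outcomes$suit == "hearts")    # 1/4

# Only one of the eight combinations is (Heads, hearts)
p_both <- mean(outcomes$coin == "Heads" & outcomes$suit == "hearts")   # 1/8

p_both == p_head * p_heart                    # TRUE
```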

- What if events are not independent?

- We'll cover that next lecture. 
  - For now let's have a quick but important aside...

---
## Sampling: With and without replacement
- What does it mean to sample with or without replacement?

- Imagine a bag of 10 balls: 6 red and 4 blue.

- We take 1 ball:
    - `$p(red) = \frac{6}{10} = \frac{3}{5}$` = 0.60
    - `$p(blue) = \frac{4}{10} = \frac{2}{5}$` = 0.40

- If we put the ball back, then the probabilities are the same next draw.

- If we keep the ball, then the probabilities change...

- If we assume it was red, then:
    - `$p(red) = \frac{5}{9}$` = 0.56
    - `$p(blue) = \frac{4}{9}$` = 0.44

---
## Sampling: With and without replacement
- Whether or not we replace matters when we think about multiple trials.

- If we draw two balls from the bag, what is the probability they are both red?

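- The probability for the second draw depends on whether the first ball was replaced:
    - **With replacement** (the draws are independent) = `$0.6 \times 0.6$` = 0.36 = 36%
    - **Without replacement** (the second draw depends on the first) = `$\frac{6}{10} \times \frac{5}{9}$` = 0.33 = 33%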
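
Both cases can be computed directly from the counts in the bag. In this sketch the variable names are illustrative, and the exact fraction `$\frac{5}{9}$` is used for the second draw rather than the rounded 0.56, which gives about 0.33.

```r
n_red   <- 6
n_blue  <- 4
n_total <- n_red + n_blue

# With replacement: the bag is the same for both draws
p_with <- (n_red / n_total) * (n_red / n_total)                   # 0.36

# Without replacement: one red ball is gone before the second draw
p_without <- (n_red / n_total) * ((n_red - 1) / (n_total - 1))    # 30/90 = 1/3

# Cross-check by counting pairs: red pairs out of all possible pairs
choose(n_red, 2) / choose(n_total, 2)                             # 15/45 = 1/3
```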

---
# Summary of today
1. `$P(x)$` is `$x/y$` where *x* is the number of ways *x* can happen and *y* is the number of ways all things (including *x*) can happen. 
2. Sets help us conceptualise *x* and *y*. 
3. Rules for sets help us define rules for probability.

---
# Next tasks
+ Next week, we will continue to look at probability

+ This week:
  + Complete your lab
  + Come to office hours
  + Weekly quiz
      + Opens Monday 09:00
      + Closes Sunday 17:00