class: center, middle, inverse, title-slide

.title[
# Week 11: Some Kind of End to the Course
]
.subtitle[
## Univariate Statistics and Methodology using R
]
.author[
### 
]
.institute[
### Department of Psychology
The University of Edinburgh
]

---
class: inverse, center, middle

# Part 1
## Weeks 7-10 Recap

---
# Describing a pattern with a line

.pull-left[
![](lecture_11_files/figure-html/lm1-1.svg)
]
.pull-right[

```r
ggplot(df, aes(x = x1, y = y)) +
  geom_point()
```
]
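
One way to draw the describing line over the points (a minimal sketch, assuming the same hypothetical `df`, `x1` and `y` as in the plot code above) is to add a `geom_smooth()` layer with `method = "lm"`:

```r
# overlay the least-squares line on the scatterplot
# (df, x1 and y are hypothetical names, as above)
ggplot(df, aes(x = x1, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
```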

---
# Defining a line

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-3-1.svg)
]
.pull-right[
`$$\Large y_i = b_0 + b_1(x_{1i}) + \varepsilon_i$$`

A line can be defined by two values:

- A starting point (Intercept)
- A slope (`\(y\)` across `\(x_1\)`)

Fitting a linear model to some data provides coefficient estimates `\(\hat{b}_0\)` and `\(\hat{b}_1\)` that minimise `\(\sigma_\varepsilon\)`.
]

---
# Testing the coefficients (1)

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-4-1.svg)
]
.pull-right[
`$$\Large \hat{y} = \color{red}{\hat{b}_0} + \hat{b}_1(x_{1})$$`

In the "null universe" where `\(b_0 = 0\)`, when sampling this many people, what is the probability that we will find an intercept at least as extreme as the one we _have_ found?

```
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
*## (Intercept)    6.982      1.026    6.81  1.5e-08 ***
## x1               0.806      0.234    3.45   0.0012 ** 
```
]

---
# Testing the coefficients (2)

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-6-1.svg)
]
.pull-right[
`$$\Large \hat{y} = \hat{b}_0 + \color{red}{\hat{b}_1}(x_{1})$$`

In the "null universe" where `\(b_1 = 0\)`, when sampling this many people, what is the probability that we will find a relationship at least as extreme as the one we _have_ found?

```
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    6.982      1.026    6.81  1.5e-08 ***
*## x1               0.806      0.234    3.45   0.0012 ** 
```
]

---
# Coefficient Sampling Variability

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-8-1.svg)
]
.pull-right[
`$$\Large \hat{y} = \hat{b}_0 + \hat{b}_1(x_{1})$$`

Plausible range of values for `\(b_0\)` and `\(b_1\)`:

```r
confint(model)
```

```
##             Estimate 2.5 % 97.5 %
## (Intercept)   6.9816 4.919  9.044
## x1            0.8057 0.336  1.275
```
]

---
# Testing the model

![](lecture_11_files/figure-html/unnamed-chunk-11-1.svg)

.pull-left[
$$
\small
`\begin{align}
& F_{df_{model},df_{residual}} = \frac{MS_{Model}}{MS_{Residual}} = \frac{SS_{Model}/df_{Model}}{SS_{Residual}/df_{Residual}} \\
& df_{model} = \text{nr predictors} \\
& df_{residual} = \text{sample size} - \text{nr predictors} - 1 \\
\end{align}`
$$
]
.pull-right[
```
## 
## Multiple R-squared:  0.199,  Adjusted R-squared:  0.182 
*## F-statistic: 11.9 on 1 and 48 DF,  p-value: 0.00118
```
]

---
# More predictors

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-14-1.svg)
]
.pull-right[
![](lecture_11_files/figure-html/unnamed-chunk-15-1.svg)
]

---
# More predictors (2)

<br><br>

`$$\Large y_i = b_0 + b_1(x_{1i}) + b_2(x_{2i}) + \varepsilon_i$$`

- A starting point (Intercept)
- A slope (across `\(x_1\)`)
- _Another_ slope (across `\(x_2\)`)

Coefficient estimates `\(\hat{b}_0\)`, `\(\hat{b}_1\)`, `\(\hat{b}_2\)` minimise `\(\sigma_\varepsilon\)`.

---
# More predictors (3)

![](lecture_11_files/figure-html/unnamed-chunk-16-1.svg)

---
# _Even_ more predictors...

<br><br>

`$$\Large y_i = b_0 + b_1(x_{1i}) + b_2(x_{2i}) + \, ... \, + b_k(x_{ki}) + \varepsilon_i$$`

- A starting point (Intercept)
- A slope (across `\(x_1\)`)
- A slope (across `\(x_2\)`)
- ...
- ...
- A slope (across `\(x_k\)`)
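
A minimal sketch of fitting such a model in R, assuming a hypothetical data frame `df` with an outcome `y` and predictors `x1` and `x2` (further predictors are added to the formula with `+`):

```r
# fit a linear model with two predictors
# (df, y, x1 and x2 are hypothetical names)
model <- lm(y ~ x1 + x2, data = df)

summary(model)   # coefficient estimates, standard errors, t-tests, F-test
confint(model)   # 95% confidence intervals for the coefficients
```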

---
# Associations that depend on other things

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-18-1.svg)
]
.pull-right[
![](lecture_11_files/figure-html/unnamed-chunk-19-1.svg)
]

---
# Interactions

<br><br>

`$$\Large y_i = b_0 + b_1(x_{1i}) + b_2(x_{2i}) + b_3(x_{1i}\cdot x_{2i}) + \varepsilon_i$$`

- A starting point (Intercept)
- A slope (across `\(x_1\)`)
- A slope (across `\(x_2\)`)
- ...
- How the slope across `\(x_1\)` changes across `\(x_2\)`

---
# Interactions

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-20-1.svg)
]
.pull-right[
]

---
# Interactions (2)

.pull-left[
![](lecture_11_files/figure-html/unnamed-chunk-21-1.svg)
]
.pull-right[
![](lecture_11_files/figure-html/unnamed-chunk-22-1.svg)
]

---
# Other outcomes

.pull-left[
`$$ln(\frac{p}{1-p}) = b_0 + b_1(x_{1i}) + b_2(x_{2i})$$`

![](lecture_11_files/figure-html/unnamed-chunk-23-1.svg)
]
.pull-right[
$$ ln(\frac{p}{1-p}) \, \Rightarrow \, \frac{p}{1-p} \, \Rightarrow \, p $$

![](lecture_11_files/figure-html/unnamed-chunk-24-1.svg)
]
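
The same modelling approach extends to a binary outcome via `glm()`. This is a minimal sketch, assuming a hypothetical data frame `df` with a 0/1 outcome `ybin` and predictors `x1` and `x2`:

```r
# logistic regression: model the log-odds of ybin being 1
# (df, ybin, x1 and x2 are hypothetical names)
mdl <- glm(ybin ~ x1 + x2, data = df, family = binomial)

summary(mdl)                     # coefficients are on the log-odds scale
exp(coef(mdl))                   # exponentiate to interpret as odds ratios
predict(mdl, type = "response")  # fitted probabilities (log-odds -> odds -> p)
```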

---
# Checking Assumptions: Linear Models

.pull-left[
### required

- linearity of relationship
- for the _residuals_:
  + normality
  + homogeneity of variance
  + independence
]
.pull-right[
### desirable

- uncorrelated predictors
- no "bad" (overly influential) observations
]

---
# Checking Assumptions: Logit Models

.pull-left[
### required

- linearity of relationship .red[between IVs and log-odds]
- for the _residuals_:
  + .red[~~normality~~]
  + .red[~~homogeneity of variance~~]
  + independence
]
.pull-right[
### desirable

- uncorrelated predictors
- no "bad" (overly influential) observations
- .red[large samples (due to maximum likelihood fitting)]
]

---
class: inverse, center, middle, animated, flipInY

# End of Part 1

---
class: inverse, center, middle

# Part 2
## Common Tests as Linear Models

```r
library(tidyverse)  # provides read_csv()
usmr <- read_csv("https://uoepsy.github.io/data/usmr2022.csv")
```

---
# lm vs correlation

.pull-left[
#### regression, continuous predictor

```r
summary(lm(sleeprating ~ height, data = usmr))
```

```
## 
## Call:
## lm(formula = sleeprating ~ height, data = usmr)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -65.42 -11.66   5.52  16.96  36.16 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   18.785     46.879     0.4     0.69
## height         0.279      0.278     1.0     0.32
## 
## Residual standard error: 22.6 on 76 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.013,  Adjusted R-squared:  4.73e-05 
## F-statistic: 1 on 1 and 76 DF,  p-value: 0.32
```
]
.pull-right[
#### correlation

```r
cor.test(usmr$height, usmr$sleeprating)
```

```
## 
##  Pearson's product-moment correlation
## 
## data:  usmr$height and usmr$sleeprating
## t = 1, df = 76, p-value = 0.3
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1112  0.3284
## sample estimates:
##    cor 
## 0.1142
```
]

---
# lm vs t.test

.pull-left[
#### regression, intercept

```r
summary(lm(height ~ 1, data = usmr))
```

```
## 
## Call:
## lm(formula = height ~ 1, data = usmr)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.407  -7.607  -0.107   7.523  20.893 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   168.11       1.04     162   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.22 on 78 degrees of freedom
##   (1 observation deleted due to missingness)
```
]
.pull-right[
#### one sample t.test

```r
t.test(usmr$height, mu = 0)
```

```
## 
##  One Sample t-test
## 
## data:  usmr$height
## t = 162, df = 78, p-value <2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  166.0 170.2
## sample estimates:
## mean of x 
##     168.1
```
]

---
# lm vs t.test (2)

.pull-left[
#### regression, binary predictor

```r
summary(lm(height ~ catdog, data = usmr))
```

```
## 
## Call:
## lm(formula = height ~ catdog, data = usmr)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -13.655  -7.501   0.499   8.399  19.499 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   166.36       1.55  107.60   <2e-16 ***
## catdogdog       3.15       2.07    1.52     0.13    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.15 on 77 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.0291,  Adjusted R-squared:  0.0165 
## F-statistic: 2.31 on 1 and 77 DF,  p-value: 0.133
```
]
.pull-right[
#### two sample t.test

```r
t.test(height ~ catdog, data = usmr, var.equal = TRUE)
```

```
## 
##  Two Sample t-test
## 
## data:  height by catdog
## t = -1.5, df = 77, p-value = 0.1
## alternative hypothesis: true difference in means between group cat and group dog is not equal to 0
## 95 percent confidence interval:
##  -7.2706  0.9796
## sample estimates:
## mean in group cat mean in group dog 
##             166.4             169.5
```
]

---
# lm vs Traditional ANOVA

> If you should say to a mathematical statistician that you have discovered that linear multiple regression and the analysis of variance (and covariance) are identical systems, he would mutter something like "Of course—general linear model," and you might have trouble maintaining his attention. If you should say this to a typical psychologist, you would be met with incredulity, or worse. Yet it is true, and in its truth lie possibilities for more relevant and therefore more powerful research data.

.tr[
Cohen (1968)
]

---
# History

.pull-left[
.br3.pa2.bg-gray.white[
### .white[Multiple Regression]

- introduced c. 1900 in biological and behavioural sciences
- aligned to "natural variation" in observations
- tells us that means `\((\bar{y})\)` are related to groups `\((g_1,g_2,\ldots,g_n)\)`
]]
.pull-right[
.br3.pa2.bg-gray.white[
### .white[ANOVA]

- introduced c. 1920 in agricultural research
- aligned to experimentation and manipulation
- tells us that groups `\((g_1,g_2,\ldots,g_n)\)` have different means `\((\bar{y})\)`
]]

.pt2[
- both produce `\(F\)`-ratios which, although discussed in different language, are identical
]

---
# lm vs Traditional ANOVA

.pull-left[
#### regression, categorical predictor

```r
summary(lm(height ~ eyecolour, data = usmr))
```

```
## 
## lm(formula = height ~ eyecolour, data = usmr)
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     170.254      2.183   77.99   <2e-16 ***
## eyecolourbrown   -3.682      2.609   -1.41     0.16    
## eyecolourgreen    2.717      4.125    0.66     0.51    
## eyecolourgrey    -0.254      9.515   -0.03     0.98    
## eyecolourhazel   -2.404      3.935   -0.61     0.54    
## eyecolourother   -4.834      5.775   -0.84     0.41    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.26 on 73 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.0563,  Adjusted R-squared:  -0.00837 
## F-statistic: 0.871 on 5 and 73 DF,  p-value: 0.505
```
]
.pull-right[
#### anova

```r
summary(aov(height ~ eyecolour, data = usmr))
```

```
##             Df Sum Sq Mean Sq F value Pr(>F)
## eyecolour    5    373    74.7    0.87   0.51
## Residuals   73   6261    85.8               
## 1 observation deleted due to missingness
```
]
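
The identity can also be seen directly from the fitted linear model: `anova()` on an `lm` object produces the same sums of squares, degrees of freedom, and `\(F\)`-ratio that `summary(aov(...))` reports. A brief sketch using the same model as above (output omitted here):

```r
# ANOVA table from the linear model: identical F-test to summary(aov(...))
anova(lm(height ~ eyecolour, data = usmr))
```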

---
# Why Teach LM/Regression?

- LM has less restrictive assumptions
  + especially true for unbalanced designs/missing data
- LM is far better at dealing with covariates
  + can arbitrarily mix continuous and discrete predictors
- LM is the gateway to other powerful tools
  + mixed models and factor analysis (→ MSMR)
  + structural equation models

---
class: inverse, center, middle, animated, flipInY

# End

---
background-image: url(lecture_11_files/img/playmo_goodbye.jpg)
background-size: contain

# Goodbye!