class: center, middle, inverse, title-slide .title[ #
F-tests & Model Comparison
] .subtitle[ ## Data Analysis for Psychology in R 2
] .author[ ### dapR2 Team ] .institute[ ### Department of Psychology
The University of Edinburgh ] --- # Course Overview .pull-left[ <table style="border: 1px solid black;"> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1;text-align:center;vertical-align: middle"> <b>Introduction to Linear Models</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Intro to Linear Regression</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Interpreting Linear Models</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Testing Individual Predictors</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> <b>Model Testing & Comparison</b></td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Linear Model Analysis</td></tr> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Analysing Experimental Studies</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Categorical Predictors & Dummy Coding</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Effects Coding & Coding Specific Contrasts</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Assumptions & Diagnostics</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Bootstrapping</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Categorical Predictor Analysis</td></tr> </table> ] .pull-right[ <table style="border: 1px solid black;"> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Interactions</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interactions I</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interactions II</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interactions III</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Analysing Experiments</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interaction Analysis</td></tr> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Advanced Topics</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Power Analysis</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Binary Logistic Regression I</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Binary Logistic Regression II</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Logistic Regression Analysis</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Exam Prep and Course Q&A</td></tr> </table> ] --- # This Week's Learning Objectives 1. Understand the use of `\(F\)` and incremental `\(F\)` tests 2. Be able to run and interpret `\(F\)`-tests in R 3. Understand how to use model comparisons to test different types of questions 4. 
Understand the difference between nested and non-nested models, and the appropriate statistics to use for comparison in each case --- class: inverse, center, middle # Part 1: Recap and `\(F\)`-tests --- # Where we left off... + Last week we looked at: + The significance of individual predictors + Overall model evaluation through `\(R^2\)` and adjusted `\(R^2\)` to see how much variance in the outcome has been explained + Today we will: + Look at significance tests of the overall model + Discuss how we can use the same tools to do incremental tests (how much does my model improve when I add variables) --- # Statistical significance of the overall model + Does our combination of `\(x\)`'s significantly improve prediction of `\(y\)`, compared to not having any predictors? -- + Some indications that the model might be significant: + Slopes for individual predictors associated with significant `\(p\)`-values + High `\(R^2\)` + But these do not directly show model significance -- + To test the significance of the model as a whole, we conduct an `\(F\)`-test --- # `\(F\)`-test & `\(F\)`-ratio + An `\(F\)`-test involves testing the statistical significance of a test statistic called the `\(F\)`-ratio (also called `\(F\)`-statistic) + The `\(F\)`-ratio tests the null hypothesis that all the regression slopes in a model are zero -- + In other words, our predictors tell us nothing about our outcome + They explain no variance -- + The more variance our predictors explain, the bigger our `\(F\)`-ratio + As with `\(t\)`-values and the `\(t\)`-distribution, we compare the `\(F\)`-statistic to the `\(F\)`-distribution to obtain a `\(p\)`-value --- # Our results (significant `\(F\)`) ``` r performance <- lm(score ~ hours + motivation, data = test_study2); summary(performance) ``` ``` ## ## Call: ## lm(formula = score ~ hours + motivation, data = test_study2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -12.9548 -2.8042 -0.2847 2.9344 13.8240 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 6.86679 0.65473 10.488 <2e-16 *** ## hours 1.37570 0.07989 17.220 <2e-16 *** ## motivation 0.91634 0.38376 2.388 0.0182 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 4.386 on 147 degrees of freedom ## Multiple R-squared: 0.6696, Adjusted R-squared: 0.6651 ## F-statistic: 148.9 on 2 and 147 DF, p-value: < 2.2e-16 ``` --- # F-ratio: Some details + `\(F\)`-ratio is a ratio of the explained to unexplained variance: `$$F = \frac{\frac{SS_{model}}{df_{model}}}{\frac{SS_{residual}}{df_{residual}}} = \frac{MS_{Model}}{MS_{Residual}}$$` + Where MS = mean squares -- + **What are mean squares?** + Mean squares are sums of squares calculations divided by the associated degrees of freedom + We saw how to calculate model and residual sums of squares last week + But what are model and residual degrees of freedom? 
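---
# The `\(F\)`-ratio by hand

+ Before we unpack the degrees of freedom, a minimal sketch (assuming the `performance` model fitted above) showing that the `\(F\)`-statistic reported by `summary()` is simply `\(MS_{Model} / MS_{Residual}\)` :

``` r
# anova() gives the SS and df for each predictor, plus a Residuals row
aov_tab <- anova(performance)
ss  <- aov_tab$`Sum Sq`
dfs <- aov_tab$Df

# MS_model pools SS and df across the two predictor rows;
# the final row of the table is the residual term
ms_model <- sum(ss[1:2]) / sum(dfs[1:2])
ms_resid <- ss[3] / dfs[3]

ms_model / ms_resid  # should reproduce the F-statistic of 148.9
```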
--- # Degrees of freedom + The degrees of freedom are defined as the number of independent values associated with the different calculations + Conceptually, how many values in the calculation can vary if we keep the outcome of the calculation fixed -- + `\(df\)` are typically linked to: + the amount of data you have (sample size, `\(n\)`) + and the number of things you need to calculate/estimate based on that data (in our case the number of `\(\beta\)`s) --- # Degrees of freedom + **Model degrees of freedom = `\(k\)` ** + `\(SS_{model}\)` depends on the estimated `\(\beta\)`s, of which there are `\(k + 1\)` (the number of predictors plus the intercept) + From `\(k + 1\)`, we subtract `\(1\)` as not all the estimates can vary while holding the outcome constant + This gives us `\(k\)` for Model `\(df\)` -- + **Residual degrees of freedom = `\(n-k-1\)` ** + The `\(SS_{residual}\)` calculation is based on our individual data points and our model (in which we estimate `\(k + 1\)` `\(\beta\)` terms, i.e. the slopes and an intercept) + For each coefficient estimated, we lose a degree of freedom, as we're fitting the model to the data and reducing the flexibility in how much the residuals (errors) can vary -- + **Total degrees of freedom = `\(n-1\)` ** + The `\(SS_{total}\)` calculation is based on the observed `\(y_i\)` and `\(\bar{y}\)` + In order to estimate `\(\bar{y}\)` , all but one value of `\(y\)` are free to vary, hence `\(n-1\)` --- # Our example (note the `\(df\)` at the bottom) ``` r summary(performance) ``` ``` ## ## Call: ## lm(formula = score ~ hours + motivation, data = test_study2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -12.9548 -2.8042 -0.2847 2.9344 13.8240 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 6.86679 0.65473 10.488 <2e-16 *** ## hours 1.37570 0.07989 17.220 <2e-16 *** ## motivation 0.91634 0.38376 2.388 0.0182 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 4.386 on 147 degrees of freedom ## Multiple R-squared: 0.6696, Adjusted R-squared: 0.6651 ## F-statistic: 148.9 on 2 and 147 DF, p-value: < 2.2e-16 ``` --- # `\(F\)`-ratio + Bigger `\(F\)`-ratios indicate better-fitting models + It means the variance explained by the model is big compared to the residual variance -- + `\(H_0\)` for the model says that the best guess of any individual's `\(y\)` value is `\(\bar{y}\)` (plus error) + Or, that the `\(x\)` variables collectively carry no information about `\(y\)` + All slopes = 0 -- `$$F = \frac{MS_{Model}}{MS_{Residual}}$$` + `\(F\)`-ratio will be close to 1 when `\(H_0\)` is true + If model and residual variation are equivalent ( `\(MS_{model} = MS_{residual}\)` ), then `\(F\)` = 1 + If there is more model than residual variation, then `\(F\)` > 1 --- # Testing the significance of `\(F\)` + The `\(F\)`-ratio is our test statistic for the significance of our model + As with all statistical inferences, we first select an `\(\alpha\)` level + We then identify the proper null `\(F\)`-distribution and calculate the critical value of `\(F\)` associated with our chosen `\(\alpha\)` + We compare our `\(F\)`-statistic to this critical value 
+ If our value is more extreme than the critical value, it is considered significant --- # Sampling distribution for the null .pull-left[ + Similar to the `\(t\)`-distribution, the `\(F\)`-distribution changes shape based on `\(df\)` + With an `\(F\)`-statistic, we have to consider both the `\(df_{model}\)` and `\(df_{residual}\)` + In parentheses, `\(df_{model}\)` is shown before `\(df_{residual}\)` ] .pull-right[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-7-1.png" width="504" /> ] --- # A decision about the null + We have an `\(F\)`-statistic (from our model output summary): + `\(F = 148.9\)` -- + We consider `\(df_{model}\)` and `\(df_{residual}\)` to get our null distribution: + `\(df_{model}=k=2\)` + `\(df_{residual}=n-k-1=150-2-1=147\)` -- + We need to set our `\(\alpha\)` level + `\(\alpha = .05\)` -- + Now we can compute our critical value for `\(F\)` --- # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-8-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) ] --- count: false # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-9-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) + Our critical value (using the `qf` function) ``` r (Crit = round(qf(0.95, 2, 147), 3)) ``` ``` ## [1] 3.058 ``` ] --- count: false # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-11-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) + Our critical value (using the `qf` function) ``` r (Crit = round(qf(0.95, 2, 147), 3)) ``` ``` ## [1] 3.058 ``` + We can calculate the probability of an `\(F\)`-statistic at least as extreme as ours, given `\(H_0\)` is true (our `\(p\)`-value): ``` r (pVal = 1-pf(148.9, 2, 147)) ``` ``` ## [1] 0 ``` ] --- count: false # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-14-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) + Our critical value (using the `qf` function) ``` r (Crit = round(qf(0.95, 2, 147), 3)) ``` ``` ## [1] 3.058 ``` + We can calculate the probability of an `\(F\)`-statistic at least as extreme as ours, given `\(H_0\)` is true (our `\(p\)`-value): ``` r (pVal = 1-pf(148.9, 2, 147)) ``` ``` ## [1] 0 ``` + Our model significantly predicted the variance in test score, `\(F(2,147) = 148.90, p < .001\)` ] --- class: center, middle # Questions? --- class: inverse, center, middle # Part 2: Model Comparison & Incremental `\(F\)`-tests --- # Model comparisons + So far, our questions have been _is our overall model better than nothing?_ ( `\(F\)`-test ) or _which variables, specifically, are good predictors of the outcome variable?_ ( `\(t\)`-tests of `\(\beta\)` estimates ) -- + But what if instead we wanted to ask: > **When I make a change to my model, does it improve or not?** + This question is the core of model comparison -- + We can adapt this to our models in a more specific way: + E.g. is a model with `\(x_1\)`, `\(x_2\)`, and `\(x_3\)` as predictors better than the model with just `\(x_1\)`? 
-- + So far: + We have tested individual predictors + and we have tested overall models + **but we have not tested the improvement when we add predictors** + We have not looked at the combined performance of a _subset_ of predictors --- # `\(F\)`-test as an incremental test + One important way we can think about the `\(F\)`-test and the `\(F\)`-ratio is as an incremental test against an "empty" or null model + A null or empty model is a linear model with only the intercept + In this model, our predicted value of the outcome for every case in our data set is the mean of the outcome ( `\(\bar{y}\)`) + That is, with no predictors, we have no information that may help us predict the outcome + So we will be "least wrong" by guessing the mean of the outcome + An empty model is the same as saying all `\(\beta\)` = 0 + And remember, this was the null hypothesis of the `\(F\)`-test -- + So in this way, the `\(F\)`-test can be seen as **comparing two models** -- + We can extend this idea and use the `\(F\)`-test to compare two models that contain fewer or more predictors + This is the **incremental `\(F\)`-test** --- # Incremental `\(F\)`-test .pull-left[ + The incremental `\(F\)`-test evaluates the statistical significance of the improvement in variance explained in an outcome with the addition of further predictor(s) + It is based on the difference in the residual sums of squares of the two models + We call the model with the additional predictor(s) the **full model** + We call the model without additional predictors the **restricted model** ] .pull-right[ `$$F_{(df_R-df_F),df_F} = \frac{(SSR_R-SSR_F)/(df_R-df_F)}{SSR_F / df_F}$$` $$ `\begin{align} & \text{Where:} \\ & SSR_R = \text{residual sums of squares for the restricted model} \\ & SSR_F = \text{residual sums of squares for the full model} \\ & df_R = \text{residual degrees of freedom from the restricted model} \\ & df_F = \text{residual degrees of freedom from the full model} \\ \end{align}` $$ ] --- # Example of model comparison + Consider this example based on data from the Midlife in the United States (MIDUS2) study: + Outcome: self-rated health + Covariates: Age, sex + Predictors: Big Five traits and Purpose in Life + Research Question: Does personality predict self-rated health over and above age and sex? --- # The data ``` r midus <- read_csv("data/MIDUS2.csv") midus2 <- midus %>% select(1:4, 31:42) %>% mutate( PIL = rowMeans(.[grep("PIL", names(.))],na.rm=T) ) %>% select(1:4, 12:17) %>% drop_na(.) slice(midus2, 1:3) ``` ``` ## # A tibble: 3 × 10 ## ID age sex health O C E A N PIL ## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 10002 69 MALE 8 2.14 2.8 2.6 3.4 2 5.86 ## 2 10019 51 MALE 8 3.14 3 3.4 3.6 1.5 5.71 ## 3 10023 78 FEMALE 4 3.57 3.4 3.6 4 1.75 5.14 ``` --- # The models + Does personality significantly predict self-rated health over and above the effects of age and sex? + The first step here is to run the two models: + M1: We predict from age and sex + M2: We add in the FFM (personality) traits ``` r m1 <- lm(health ~ age + sex, data = midus2) ``` ``` r m2 <- lm(health ~ age + sex + O + C + E + A + N, data = midus2) ``` --- # Model 1 output (age + sex) ``` ## ## Call: ## lm(formula = health ~ age + sex, data = midus2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -7.3057 -1.0496 0.5796 0.8533 2.9769 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 7.755974 0.183649 42.232 < 2e-16 *** ## age -0.008829 0.003117 -2.833 0.00467 ** ## sexMALE 0.035288 0.078619 0.449 0.65359 ## --- ## Signif. 
codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 1.642 on 1758 degrees of freedom ## Multiple R-squared: 0.004627, Adjusted R-squared: 0.003494 ## F-statistic: 4.086 on 2 and 1758 DF, p-value: 0.01697 ``` --- # Model 2 output (age + sex + personality) ``` ## ## Call: ## lm(formula = health ~ age + sex + O + C + E + A + N, data = midus2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -6.7723 -0.7921 0.2532 1.0097 3.9550 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 6.66172 0.45100 14.771 < 2e-16 *** ## age -0.01310 0.00298 -4.396 1.17e-05 *** ## sexMALE -0.09571 0.07955 -1.203 0.229 ## O 0.09308 0.08306 1.121 0.263 ## C 0.57147 0.08507 6.717 2.49e-11 *** ## E 0.56771 0.08061 7.043 2.70e-12 *** ## A -0.40380 0.09025 -4.474 8.15e-06 *** ## N -0.56493 0.06189 -9.128 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 1.521 on 1753 degrees of freedom ## Multiple R-squared: 0.1484, Adjusted R-squared: 0.145 ## F-statistic: 43.65 on 7 and 1753 DF, p-value: < 2.2e-16 ``` --- # Incremental `\(F\)`-test in R + The second step is to compare the two models with an incremental `\(F\)`-test + In order to apply the `\(F\)`-test for model comparison in R, we use the `anova()` function + `anova()` takes as its arguments the models that we wish to compare + Here we see an example with 2 models, but we could use more ``` r anova(m1, m2) ``` --- # Incremental `\(F\)`-test in R ``` r anova(m1, m2) ``` ``` ## Analysis of Variance Table ## ## Model 1: health ~ age + sex ## Model 2: health ~ age + sex + O + C + E + A + N ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 1758 4740.2 ## 2 1753 4055.4 5 684.85 59.208 < 2.2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` --- class: center, middle # Questions? --- class: inverse, center, middle # Part 3: Non-nested models and alternatives to `\(F\)`-tests --- # Nested vs non-nested models + The `\(F\)`-ratio depends on the comparison models being nested + Nested means that the predictors in one model are a subset of the predictors in the other + We also require the models to be computed on the same data + Be careful when the data contain NA's + The `lm` function excludes the whole row of data if the `\(y\)` or any of the `\(x\)`'s specified in the model is missing on that row + If the additional variables you add to the full model have NA's, the data sets used by the two models could end up different! -- > **You can only do an `\(F\)`-test if the models are nested: the variables are nested and the data are identical** --- # Nested vs non-nested models Assuming no NA's in `data`: .pull-left[ **Nested** ``` r m0 <- lm(outcome ~ x1 + x2 , data = data) m1 <- lm(outcome ~ x1 + x2 + x3, data = data) ``` + These models are nested + `x1` and `x2` appear in both models ] .pull-right[ **Non-nested** ``` r m0 <- lm(outcome ~ x1 + x2 + x4, data = data) m1 <- lm(outcome ~ x1 + x2 + x3, data = data) ``` + These models are non-nested + There are variables unique to each model + `x4` in `m0` + `x3` in `m1` ] --- # Model comparison for non-nested models + So what happens when we have non-nested models? 
+ There are two commonly used alternatives: + AIC + BIC + Unlike the incremental `\(F\)`-test, AIC and BIC do not require two models to be nested + Smaller (more negative) values indicate better-fitting models + So we compare values and choose the model with the smaller AIC or BIC value --- # AIC & BIC .pull-left[ `$$AIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + 2k$$` $$ `\begin{align} & \text{Where:} \\ & SS_{residual} = \text{sum of squares residuals} \\ & n = \text{sample size} \\ & k = \text{number of explanatory variables} \\ & \text{ln} = \text{natural log function} \end{align}` $$ ] .pull-right[ `$$BIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + k\,\text{ln}(n)$$` $$ `\begin{align} & \text{Where:} \\ & SS_{residual} = \text{sum of squares residuals} \\ & n = \text{sample size} \\ & k = \text{number of explanatory variables} \\ & \text{ln} = \text{natural log function} \end{align}` $$ ] --- # Parsimony corrections + Both AIC and BIC contain something called a parsimony correction + In essence, they penalise models for being complex + This is to help us avoid overfitting (adding predictors arbitrarily to improve fit) `$$AIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + 2k$$` `$$BIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + k\,\text{ln}(n)$$` + For the sample sizes typical of linear model applications, BIC has a harsher parsimony penalty than AIC + Whenever `\(\text{ln}(n) > 2\)` , i.e. whenever `\(n \geq 8\)` , BIC penalises complexity more severely (so essentially all the time!) --- # In R + Let's use AIC and BIC on our `m1` and `m2` models from previously: .pull-left[ ``` r AIC(m1, m2) ``` ``` ## df AIC ## m1 4 6749.246 ## m2 9 6484.457 ``` ] .pull-right[ ``` r BIC(m1, m2) ``` ``` ## df BIC ## m1 4 6771.141 ## m2 9 6533.719 ``` ] --- # Let's consider a different example + Our previous models were nested + `m1` had just covariates + `m2` added personality + Using the same data, let's consider a non-nested example + Suppose we want to compare a model that: + predicts self-rated health from just the 5 personality variables (`nn1`: non-nested model 1) + to a model that predicts from age, sex, and a variable called Purpose in Life (PIL) (`nn2`) --- # Applied to non-nested models ``` r nn1 <- lm(health ~ O + C + E + A + N, data=midus2) nn2 <- lm(health ~ age + sex + PIL, data = midus2) ``` ``` r AIC(nn1, nn2) ``` ``` ## df AIC ## nn1 7 6501.524 ## nn2 5 6564.953 ``` ``` r BIC(nn1, nn2) ``` ``` ## df BIC ## nn1 7 6539.840 ## nn2 5 6592.321 ``` --- # Considerations for use of AIC and BIC + AIC and BIC can be used for both nested and non-nested models -- + The AIC and BIC for a single model are not meaningful + They only make sense for model comparisons + We evaluate these comparisons by looking at the difference, `\(\Delta\)`, between the two values -- + There are no specific thresholds for `\(\Delta AIC\)` to suggest how big a difference in two models is needed to conclude that one is substantively better than the other -- + The following `\(\Delta BIC\)` cutoffs have been suggested (Raftery, 1995): | Value | Interpretation | |-------------------|---------------------------------------------------| | `\(\Delta < 2\)` | No evidence of difference between models | | `\(2 < \Delta < 6\)` | Positive evidence of difference between models | | `\(6 < \Delta < 10\)` | Strong evidence of difference between models | | `\(\Delta > 10\)` | Very strong evidence of difference between models | --- class: center, middle # Questions? 
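---
# AIC & BIC by hand (a quick check)

+ A minimal sketch, assuming the `m1` and `m2` models fitted earlier; `aic_hand()` is a small helper defined here purely for illustration
+ Note that R's `AIC()` and `BIC()` are computed from the model log-likelihood, so their values differ from the formulas above by a constant; that constant depends only on `\(n\)`, so the `\(\Delta\)` between two models fitted to the same data is the same either way

``` r
# n * ln(SS_residual / n) + 2k, as in the AIC formula above
aic_hand <- function(model, k){
  n  <- length(residuals(model))
  ss <- sum(residuals(model)^2)
  n * log(ss / n) + 2 * k
}

# k = number of explanatory variables: 2 for m1, 7 for m2
aic_hand(m2, 7) - aic_hand(m1, 2)  # should match AIC(m2) - AIC(m1)
AIC(m2) - AIC(m1)                  # approx. -264.8, from the output above

# Delta-BIC for the Raftery (1995) table: here it is well over 10,
# i.e. very strong evidence of a difference, favouring m2 (smaller BIC)
BIC(m1) - BIC(m2)                  # approx. 237.4, from the output above
```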
--- # Pause to summarise what we know so far + So far we have seen how to: + run a linear model with a single predictor + extend this and add predictors + interpret these coefficients either in original units or standardised units + test the significance of `\(\beta\)` coefficients + test the significance of the overall model + estimate the amount of variance explained by our model + evaluate improvements to model fit when variables are added + select a better-fitting model between two nested or non-nested models + You can now run and interpret linear models with continuous predictors + Next week, we will put this into action by constructing and implementing an analysis plan for a linear model on a real example --- ## This week .pull-left[ ### Tasks <img src="figs/labs.svg" width="10%" /> **Attend your lab and work together on the exercises** <br> <img src="figs/exam.svg" width="10%" /> **Complete the weekly quiz** ] .pull-right[ ### Support <img src="figs/forum.svg" width="10%" /> **Help each other on the Piazza forum** <br> <img src="figs/oh.png" width="10%" /> **Attend office hours (see Learn page for details)** ] --- class: inverse, center, middle # Thanks for listening