class: center, middle, inverse, title-slide .title[ #
Introduction to the Linear Model
] .subtitle[ ## DPUK Spring Academy
] .author[ ### Josiah King, Umberto Noe, (and credits to Tom Booth) ] .institute[ ### Department of Psychology
The University of Edinburgh ] .date[ ### April 2025 ] --- # Overview - Day 2: What is a linear model? - Day 3: But I have more variables, what now? - Day 4: Interactions - Day 5: Is my model any good? --- class: center, middle # Day 4 **Interactions (uh-oh)** --- class: inverse, center, middle <h2>Part 1: What is an interaction and why are we talking about it? </h2> <h2 style="text-align: left;opacity:0.3;">Part 2: Continuous*binary interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 3: Continuous*Continuous interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 4: Categorical*categorical interactions </h2> --- # Lecture notation + For today, we will work with the following equation and notation: `$$y_i = \beta_0 + \beta_1 x_{i} + \beta_2 z_{i} + \beta_3 xz_{i} + \epsilon_i$$` + `\(y\)` is a continuous outcome + `\(x\)` is our first predictor + `\(z\)` is our second predictor + `\(xz\)` is their product, or interaction, predictor --- # General definition + When the effects of one predictor on the outcome differ across levels of another predictor. + **Important**: When we have an interaction, we can no longer talk about the overall effect of a variable, *"holding the others constant"*. + The effect changes across values of the interacting variable + The effects at specific values of the interacting variable are called marginal effects (sometimes also "simple effects") + Note that interactions are symmetrical. + What does this mean? + We can talk about the interaction of X with Z, or of Z with X. --- # Interactions with different types of variables + Categorical*continuous interaction: + The slope of the regression line between a continuous predictor and the outcome is different across levels of a categorical predictor. -- + Continuous*continuous interaction: + The slope of the regression line between a continuous predictor and the outcome changes as the values of a second continuous predictor change. + You may have heard this referred to as moderation. 
-- + Categorical*categorical interaction: + There is a difference in the differences between groups across levels of a second factor. --- # Why are we interested in interactions? + Often we have theories/ideas/questions that relate to an interaction. + For example: + different relationships of mood state to cognitive score dependent on disease status + different rates of cognitive decline by disease status. + effect of spending time with partner on relationship satisfaction depends on relationship quality + Questions like these would be tested via inclusion of an interaction term in our model. --- # When should I include an interaction? .pull-left[ + If your research question pre-supposes one + If theory suggests it is necessary in order to reflect the underlying data generating process + If data visualisations suggest it may be necessary ] -- .pull-right[ + An interaction term is another predictor. Additional predictors always explain *some* additional variability. + In the real world, everything interacts with everything else + But interactions add complexity and make effects more difficult to interpret because they require additional context. + Be wary of testing for and including interactions based on significance alone, as you'll risk over-fitting to your specific sample. ] --- # How do I include an interaction in R? ``` r lm(y ~ x + z + x:z, data = dat) ``` shorthand: ``` r lm(y ~ x*z, data = dat) ``` --- class: inverse, center, middle <h2 style="text-align: left;opacity:0.3;">Part 1: What is an interaction and why are we talking about it? 
</h2> <h2>Part 2: Continuous*binary interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 3: Continuous*Continuous interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 4: Categorical*categorical interactions </h2> --- # Interpretation: Categorical*Continuous `$$y_i = \beta_0 + \beta_1 x_{i} + \beta_2 z_{i} + \beta_3 xz_{i} + \epsilon_i$$` + Where `\(z\)` is a binary predictor + `\(\beta_0\)` = Value of `\(y\)` when `\(x\)` and `\(z\)` are 0 + `\(\beta_1\)` = Effect of `\(x\)` (slope) when `\(z\)` = 0 (reference group) + `\(\beta_2\)` = Difference in intercepts between `\(z\)` = 0 and `\(z\)` = 1, when `\(x\)` = 0. + `\(\beta_3\)` = Difference in slope across levels of `\(z\)` --- # Example: Categorical*Continuous .pull-left[ + Suppose I am conducting a study on how years of service within an organisation predicts salary in two different departments, accounts and store managers. + y = salary (unit = thousands of pounds) + x = years of service + z = Department (0=Store managers, 1=Accounts) ] .pull-right[ ``` r salary1 %>% slice(1:10) ``` ``` ## # A tibble: 10 × 3 ## service salary dept ## <dbl> <dbl> <fct> ## 1 6.2 60.5 Accounts ## 2 2.7 22.9 StoreManager ## 3 4.6 48.9 Accounts ## 4 5.4 49.9 Accounts ## 5 3.5 28.2 StoreManager ## 6 5.6 54.1 Accounts ## 7 5.7 37.8 StoreManager ## 8 2.6 37.9 Accounts ## 9 5.9 36.5 StoreManager ## 10 4.9 28.4 StoreManager ``` ] --- # Visualise the data .pull-left[ ``` r salary1 %>% ggplot(., aes(x = service, y = salary, colour = dept)) + geom_point() + xlim(0,8) + labs(x = "Years of Service", y = "Salary (1000gbp)") ``` ] .pull-right[ <!-- --> ] --- # Example: Categorical*Continuous ``` r int <- lm(salary ~ service + dept + service*dept, data = salary1) summary(int) ``` ``` ## ## Call: ## lm(formula = salary ~ service + dept + service * dept, data = salary1) ## ## Residuals: ## Min 1Q Median 3Q Max ## -10.196 -2.812 -0.316 2.927 10.052 ## ## Coefficients: ## Estimate Std. 
Error t value Pr(>|t|) ## (Intercept) 16.8937 4.4638 3.785 0.000444 *** ## service 2.7364 0.9166 2.986 0.004524 ** ## deptAccounts 4.4887 6.3111 0.711 0.480523 ## service:deptAccounts 3.1174 1.2698 2.455 0.017928 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 4.61 on 46 degrees of freedom ## Multiple R-squared: 0.867, Adjusted R-squared: 0.8583 ## F-statistic: 99.93 on 3 and 46 DF, p-value: < 2.2e-16 ``` --- # Interpretation: Categorical*Continuous .pull-left[ + **Intercept** ( `\(\beta_0\)` ): Predicted salary for a store manager (`dept`=0) with 0 years of service is £16,894. + **Service** ( `\(\beta_1\)` ): For each additional year of service for a store manager (`dept` = 0), salary increases by £2,736. + **Dept** ( `\(\beta_2\)` ): Difference in salary between store managers (`dept` = 0) and accounts (`dept` = 1) with 0 years of service is £4,489. + **Service:dept** ( `\(\beta_3\)` ): The difference in slope. For each year of service, those in accounts (`dept` = 1) increase by an additional £3,117. ] .pull-right[ <!-- --> ] --- # Centering predictors **Why centre?** + Meaningful interpretation. + Interpretation of models with interactions involves evaluation when other variables = 0. + This makes it quite important that 0 is meaningful in some way. + Note that this is simple with categorical variables. + We code our reference group as 0 in all dummy variables. + For continuous variables, we need a meaningful 0 point. --- # Example of age + Suppose I have age as a variable in my study with a range of 30 to 85. + Age = 0 is not that meaningful. + It essentially means all my parameters are evaluated at the point of birth. + So what might be meaningful? + Average age? (mean centering) + A fixed point? (e.g. 66 if studying retirement) --- # Initial model .pull-left[ ``` r int <- lm(salary ~ service + dept + service:dept, data = salary1) summary(int) ``` ``` Coefficients: Estimate Std. 
Error t value (Intercept) 16.8937 4.4638 3.785 service 2.7364 0.9166 2.986 deptAccounts 4.4887 6.3111 0.711 service:deptAccounts 3.1174 1.2698 2.455 ``` ] .pull-right[ <!-- --> ] --- # Mean-centered .pull-left[ ``` r salary1 <- salary1 |> mutate( service_m = service - mean(service) ) int_a <- lm(salary ~ service_m + dept + service_m:dept, data = salary1) summary(int_a) ``` ``` Coefficients: Estimate Std. Error t value (Intercept) 30.1982 0.9081 33.256 service_m 2.7364 0.9166 2.986 deptAccounts 19.6455 1.3106 14.989 service_m:deptAccounts 3.1174 1.2698 2.455 ``` ] .pull-right[ <!-- --> ] --- class: inverse, center, middle <h2 style="text-align: left;opacity:0.3;">Part 1: What is an interaction and why are we talking about it? </h2> <h2 style="text-align: left;opacity:0.3;">Part 2: Continuous*binary interactions </h2> <h2>Part 3: Continuous*Continuous interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 4: Categorical*categorical interactions </h2> --- # Interpretation: Continuous*Continuous `$$y_i = \beta_0 + \beta_1 x_{i} + \beta_2 z_{i} + \beta_3 xz_{i} + \epsilon_i$$` + Lecture notation: + `\(\beta_0\)` = Value of `\(y\)` when `\(x\)` and `\(z\)` are 0 + `\(\beta_1\)` = Effect of `\(x\)` (slope) when `\(z\)` = 0 + `\(\beta_2\)` = Effect of `\(z\)` (slope) when `\(x\)` = 0 + `\(\beta_3\)` = Change in slope of `\(x\)` on `\(y\)` across values of `\(z\)` (and vice versa). + Or how the effect of `\(x\)` depends on `\(z\)` (and vice versa) --- # Example: Continuous*Continuous + Conducting a study on how years of service and employee performance ratings predict salary in a sample of managers. `$$y_i = \beta_0 + \beta_1 x_{i} + \beta_2 z_{i} + \beta_3 xz_{i} + \epsilon_i$$` + `\(y\)` = Salary (unit = thousands of pounds). + `\(x\)` = Years of service. + `\(z\)` = Average performance ratings. 
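To make this setup concrete, here is a minimal sketch in R of simulating data of this shape and fitting the interaction model. The data below are invented purely for illustration (they are not the `salary2` data used on the following slides, and the data-generating values are made up):

``` r
# Minimal sketch with simulated data (NOT the salary2 dataset used later):
# salary depends on service, performance, and their product
set.seed(42)
n <- 100
serv <- runif(n, 1, 8)   # years of service
perf <- runif(n, 1, 7)   # average performance rating
salary <- 90 - 10 * serv + 3 * perf + 3 * serv * perf + rnorm(n, 0, 15)
salary_sim <- data.frame(salary, serv, perf)

# salary ~ serv*perf is shorthand for salary ~ serv + perf + serv:perf
mod <- lm(salary ~ serv * perf, data = salary_sim)
coef(mod)  # intercept, serv, perf, and serv:perf coefficients
```

The `serv:perf` coefficient estimates how the slope of `serv` changes for each one-point increase in `perf` (and, symmetrically, how the slope of `perf` changes per year of `serv`).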
--- # Example: Continuous*Continuous ``` ## ## Call: ## lm(formula = salary ~ serv * perf, data = salary2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -43.008 -9.710 -1.068 8.674 48.494 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 87.920 16.376 5.369 5.51e-07 *** ## serv -10.944 4.538 -2.412 0.01779 * ## perf 3.154 4.311 0.732 0.46614 ## serv:perf 3.255 1.193 2.728 0.00758 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 17.55 on 96 degrees of freedom ## Multiple R-squared: 0.5404, Adjusted R-squared: 0.5261 ## F-statistic: 37.63 on 3 and 96 DF, p-value: 3.631e-16 ``` --- # Example: Continuous*Continuous .pull-left[ + **Intercept**: a manager with 0 years of service and 0 performance rating earns £87,920 + **Service**: for a manager with 0 performance rating, for each year of service, salary decreases by £10,940 + slope when performance = 0 + **Performance**: for a manager with 0 years of service, for each point of performance rating, salary increases by £3,150. + slope when service = 0 + **Interaction**: for every year of service, the relationship between performance and salary increases by £3,250. ] .pull-right[ ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 87.92 16.38 5.37 0.00 ## serv -10.94 4.54 -2.41 0.02 ## perf 3.15 4.31 0.73 0.47 ## serv:perf 3.25 1.19 2.73 0.01 ``` ] ??? + What do you notice here? + 0 performance and 0 service are odd values + let's mean centre both, so 0 = average, and look at this again. --- # Mean centering ``` r salary2 <- salary2 %>% mutate( * perfM = c(scale(perf, scale = F)), * servM = c(scale(serv, scale = F)) ) int3 <- lm(salary ~ servM*perfM, data = salary2) ``` --- # Mean centering ``` ## ## Call: ## lm(formula = salary ~ servM * perfM, data = salary2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -43.008 -9.710 -1.068 8.674 48.494 ## ## Coefficients: ## Estimate Std. 
Error t value Pr(>|t|) ## (Intercept) 104.848 1.757 59.686 < 2e-16 *** ## servM 1.425 1.364 1.044 0.29890 ## perfM 14.445 1.399 10.328 < 2e-16 *** ## servM:perfM 3.255 1.193 2.728 0.00758 ** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 17.55 on 96 degrees of freedom ## Multiple R-squared: 0.5404, Adjusted R-squared: 0.5261 ## F-statistic: 37.63 on 3 and 96 DF, p-value: 3.631e-16 ``` --- # Example: Continuous*Continuous .pull-left[ + **Intercept**: a manager with average years of service and average performance rating earns £104,850 + **Service**: for a manager with average performance rating, for every year of service, salary increases by £1,420 + slope when performance = 0 (mean centered) + **Performance**: for a manager with average years of service, for each point of performance rating, salary increases by £14,450. + slope when service = 0 (mean centered) + **Interaction**: for every year of service, the relationship between performance and salary increases by £3,250. ] .pull-right[ ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 104.85 1.76 59.69 0.00 ## servM 1.42 1.36 1.04 0.30 ## perfM 14.45 1.40 10.33 0.00 ## servM:perfM 3.25 1.19 2.73 0.01 ``` ] --- # Plotting interactions + In our last block we saw we could produce a line for each group of a binary (extends to categorical) variable. + These are called simple slopes: + **Regression of the outcome Y on a predictor X at specific values of an interacting variable Z.** + For a continuous variable, we could choose any values of Z. 
+ Typically we plot at the mean and +/- 1SD --- # `sjPlot`: Simple Slopes .pull-left[ ``` r library(sjPlot) plot_model(int3, type = "int") ``` ] .pull-right[ <!-- --> ] --- # `sjPlot`: Simple Slopes .pull-left[ ``` r library(sjPlot) plot_model(int3, type = "pred", terms = c("servM","perfM [-2,0,2]")) ``` ] .pull-right[ <!-- --> ] --- # `sjPlot`: Simple Slopes .pull-left[ ``` r library(sjPlot) plot_model(int3, type = "pred", terms = c("servM","perfM [-2,-1,0,1,2]")) ``` ] .pull-right[ <!-- --> ] --- class: inverse, center, middle <h2 style="text-align: left;opacity:0.3;">Part 1: What is an interaction and why are we talking about it? </h2> <h2 style="text-align: left;opacity:0.3;">Part 2: Continuous*binary interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 3: Continuous*Continuous interactions </h2> <h2>Part 4: Categorical*categorical interactions </h2> --- # General definition + When the effects of one predictor on the outcome differ across levels of another predictor. + Categorical*categorical interaction: + There is a difference in the differences between groups across levels of a second factor. + This idea of a difference in differences can be quite tricky to think about. + So we will start with some visualisation, and then look at two examples. --- # Difference in differences (1) .pull-left[ | | London| Birmingham| |:--------|------:|----------:| |Accounts | 50| 40| |Manager | 30| 20| + In each plot we look at, think about subtracting the average store managers' salary (blue triangle) from the average accounts salary (red circle) + In both cases, it is £20,000. + Note, the lines are parallel + Remember what we have said about parallel lines...no interaction ] .pull-right[ <!-- --> ] --- # Difference in differences (2) .pull-left[ | | London| Birmingham| |:--------|------:|----------:| |Accounts | 50| 40| |Manager | 40| 20| + This time we can see the difference differs. + £20,000 in Birmingham + £10,000 in London. 
+ Note the lines are no longer parallel. + Suggests interaction. + But not crossing (so ordinal interaction) ] .pull-right[ <!-- --> ] --- # Difference in differences (3) .pull-left[ | | London| Birmingham| |:--------|------:|----------:| |Accounts | 40| 40| |Manager | 60| 20| + This time we can see the difference differs. + £20,000 in Birmingham + -£20,000 in London + Note the lines are no longer parallel. + Suggests interaction. + Now crossing (so disordinal interaction) ] .pull-right[ <!-- --> ] --- # Interpretation: Categorical*categorical interaction (dummy codes) `$$y_i = \beta_0 + \beta_1 x_{i} + \beta_2 z_{i} + \beta_3 xz_{i} + \epsilon_i$$` + `\(\beta_0\)` = Value of `\(y\)` when `\(x\)` and `\(z\)` are 0 + Expected salary for Accounts in London. + `\(\beta_1\)` = Difference between levels of `\(x\)` when `\(z\)` = 0 + The difference in salary between Accounts in London and Birmingham + `\(\beta_2\)` = Difference between levels of `\(z\)` when `\(x\)` = 0. + The difference in salary between Accounts and Store managers in London. + `\(\beta_3\)` = Difference between levels of `\(x\)` across levels of `\(z\)` + How the difference in salary between Accounts and Store managers differs between London and Birmingham --- # Example: Categorical*categorical ``` r salary3 <- read_csv("https://uoepsy.github.io/scs/dpuk/data/salary3.csv") salary3 <- salary3 |> mutate( location = factor(location, levels = c("London","Birmingham")), department = factor(department) ) int4 <- lm(salary ~ location*department, salary3) ``` --- # Example: Categorical*categorical ``` ## ## Call: ## lm(formula = salary ~ location * department, data = salary3) ## ## Residuals: ## Min 1Q Median 3Q Max ## -12.8256 -3.3005 0.4854 2.6548 12.9525 ## ## Coefficients: ## Estimate Std. 
Error t value Pr(>|t|) ## (Intercept) 50.4814 0.9922 50.879 < 2e-16 *** ## locationBirmingham -1.8947 1.4032 -1.350 0.18009 ## departmentManager -3.9563 1.4032 -2.820 0.00584 ** ## locationBirmingham:departmentManager -9.6013 1.9844 -4.838 4.99e-06 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 4.961 on 96 degrees of freedom ## Multiple R-squared: 0.6047, Adjusted R-squared: 0.5923 ## F-statistic: 48.95 on 3 and 96 DF, p-value: < 2.2e-16 ``` --- # Example: Categorical*categorical .pull-left[ ``` r plot_model(int4, type = "int") ``` ] .pull-right[ <!-- --> ] --- # Example: Categorical*categorical ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 50.481 0.992 50.879 0.000 ## locationBirmingham -1.895 1.403 -1.350 0.180 ## departmentManager -3.956 1.403 -2.820 0.006 ## locationBirmingham:departmentManager -9.601 1.984 -4.838 0.000 ``` .pull-left[ + `\(\beta_0\)` = Value of `\(y\)` when `\(x\)` and `\(z\)` are 0 + Expected salary for Accounts in London is £50,481. ] .pull-right[ |location |department | Salary| |:----------|:----------|------:| |London |Accounts | 50.481| |London |Manager | 46.525| |Birmingham |Accounts | 48.587| |Birmingham |Manager | 35.029| ] --- # Example: Categorical*categorical ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 50.481 0.992 50.879 0.000 ## locationBirmingham -1.895 1.403 -1.350 0.180 ## departmentManager -3.956 1.403 -2.820 0.006 ## locationBirmingham:departmentManager -9.601 1.984 -4.838 0.000 ``` .pull-left[ + `\(\beta_1\)` = Difference between levels of `\(x\)` when `\(z\)` = 0 + The difference in salary between Accounts in London and Birmingham is £1,895. The salary is lower in Birmingham. ] .pull-right[ |location |department | Salary| |:----------|:----------|------:| |London |Accounts | 50.481| |London |Manager | 46.525| |Birmingham |Accounts | 48.587| |Birmingham |Manager | 35.029| ] --- # Example: Categorical*categorical ``` ## Estimate Std. 
Error t value Pr(>|t|) ## (Intercept) 50.481 0.992 50.879 0.000 ## locationBirmingham -1.895 1.403 -1.350 0.180 ## departmentManager -3.956 1.403 -2.820 0.006 ## locationBirmingham:departmentManager -9.601 1.984 -4.838 0.000 ``` .pull-left[ + `\(\beta_2\)` = Difference between levels of `\(z\)` when `\(x\)` = 0. + The difference in salary between Accounts and Store managers in London is £3,956. The salary is lower for Store Managers. ] .pull-right[ |location |department | Salary| |:----------|:----------|------:| |London |Accounts | 50.481| |London |Manager | 46.525| |Birmingham |Accounts | 48.587| |Birmingham |Manager | 35.029| ] --- # Example: Categorical*categorical ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 50.481 0.992 50.879 0.000 ## locationBirmingham -1.895 1.403 -1.350 0.180 ## departmentManager -3.956 1.403 -2.820 0.006 ## locationBirmingham:departmentManager -9.601 1.984 -4.838 0.000 ``` .pull-left[ + `\(\beta_3\)` = Difference between levels of `\(x\)` across levels of `\(z\)` + The difference in salary between Accounts and Store managers differs by £9,601 between London and Birmingham. The difference is greater in Birmingham than in London. ] .pull-right[ |location |department | Salary| |:----------|:----------|------:| |London |Accounts | 50.481| |London |Manager | 46.525| |Birmingham |Accounts | 48.587| |Birmingham |Manager | 35.029| ] --- class: inverse, center, middle <h2 style="text-align: left;opacity:0.3;">Part 1: What is an interaction and why are we talking about it? 
</h2> <h2 style="text-align: left;opacity:0.3;">Part 2: Continuous*binary interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 3: Continuous*Continuous interactions </h2> <h2 style="text-align: left;opacity:0.3;">Part 4: Categorical*categorical interactions </h2> <h2>Pulling it together</h2> --- # Non-parallel .pull-left[ <img src="data:image/png;base64,#DPUK_D_2025_files/figure-html/unnamed-chunk-47-1.png" height="350px" /> ] .pull-right[ <img src="data:image/png;base64,#DPUK_D_2025_files/figure-html/unnamed-chunk-48-1.png" height="350px" /> ] --- # Non-parallel .pull-left[ <img src="data:image/png;base64,#DPUK_D_2025_files/figure-html/unnamed-chunk-49-1.png" height="350px" /> ] .pull-right[ <img src="data:image/png;base64,#DPUK_D_2025_files/figure-html/unnamed-chunk-50-1.png" height="350px" /> ] --- # coefficients in the presence of an interaction **Y ~ A + B + A:B** ``` Coefficients: Estimate (Intercept) - when A and B are zero A - slope of A when B is zero B - slope of B when A is zero A:B - ``` --- # the interaction term is an adjustment **Y ~ A + B + A:B** ``` Coefficients: Estimate (Intercept) - when A and B are zero A - slope of A when B is zero B - slope of B when A is zero A:B - how the slope of A changes when B increases by 1 ``` --- # the interaction term is an adjustment **Y ~ A + B + A:B** ``` Coefficients: Estimate (Intercept) - when A and B are zero A - slope of A when B is zero B - slope of B when A is zero A:B - how the slope of B changes when A increases by 1 ``` --- # coefficients in the presence of an interaction **Y ~ A + B + A:B + C** ``` Coefficients: Estimate (Intercept) - when A, B and C are zero A - slope of A when B is zero, holding C constant B - slope of B when A is zero, holding C constant C - slope of C, holding A and B constant A:B - how the slope of B changes when A increases by 1, holding C constant ``` --- class: center, middle # Thanks all!