
Model Building

Data Analysis for Psychology in R 3

Josiah King

Department of Psychology
The University of Edinburgh


The poke in the eye

It's a trade-off...

Accurately representing the world

everything that can vary is modelled as varying

Being able to fit the model

in our sample, some things will not vary enough to fit x|g

  • not enough groups in g
    • fit +g instead of (1|g)
  • predictors on different scales
    • mm|g vs km|g
      • can be fixed by rescaling the predictor (see the sketch after this list)
  • not enough variance in y~x between groups
  • model estimation hitting boundaries
    • variances of 0
    • correlations of -1 or 1
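
A minimal sketch of the rescaling fix, using hypothetical names (a predictor dist_mm in millimetres, outcome y, grouping factor g) rather than the lecture data:

# dist_mm is in millimetres, so its slope (and slope variance) is tiny
# relative to the intercept variance, which can make estimation struggle
df$dist_km <- df$dist_mm / 1e6               # re-express in kilometres, or
df$dist_z  <- as.numeric(scale(df$dist_mm))  # standardise (mean 0, sd 1)

m <- lmer(y ~ dist_z + (1 + dist_z | g), data = df)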

Maximal Structures

"maximal model" = the most complex random effect structure that you can fit to the data

  • everything that can be modelled as a random effect is modelled as a random effect
  • requires sufficient variance at all levels (for both intercepts and slopes where relevant), which is often not the case

Simple(ish) design:
20 participants, each with approx 10 observations

library(tidyverse)  # for read_csv()
library(lme4)

d3 <- read_csv("https://uoepsy.github.io/data/dapr3_mindfuldecline.csv")
maxmodel <- lmer(ACE ~ visit * condition +
                   (1 + visit | ppt),
                 data = d3)
isSingular(maxmodel)
## [1] FALSE

Complicated design:
14 study sites, each with approx. 18 ppts, each with approx. 10 observations

d3full <- read_csv("https://uoepsy.github.io/data/dapr3_mindfuldeclineFULL.csv")
maxmodelfull <- lmer(ACE ~ visit * condition +
                       (1 + visit * condition | sitename) +
                       (1 + visit | sitename:ppt),
                     data = d3full)
# boundary (singular) fit: see help('isSingular')
isSingular(maxmodelfull)
## [1] TRUE

Model Estimation

Don't report results from a model that doesn't converge. You will probably not be able to trust the estimates.

  • Check model specification! Do your random effects make sense?
  • Try a different optimiser, adjust the max iterations, or the stopping tolerances (see the sketch after this list)
  • Try all optimisers at once! summary(allFit(model))
    • look for the 'messages'
    • for those optimisers that successfully converge, compare estimates.
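
For reference, a minimal sketch of what these options look like in lme4, applied to the maxmodelfull fit from earlier (the optimiser name and iteration limit are illustrative, not recommendations):

# refit with a different optimiser and a higher iteration limit
maxmodelfull2 <- update(maxmodelfull,
                        control = lmerControl(optimizer = "bobyqa",
                                              optCtrl = list(maxfun = 1e5)))

# or try all available optimisers (more become available if the
# optimx / dfoptim packages are installed)
allfits <- allFit(maxmodelfull)
summary(allfits)$msgs    # convergence messages per optimiser
summary(allfits)$fixef   # compare fixed effect estimates across optimisers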

This probably won't help. In most cases, it is just that our model is too complex to fit to our available data.


Deciding on the optimal random effect structure

Keep it maximal

  1. Start with maximal model
  2. Remove random effects with least variance until model converges (see Barr et al., 2013)

Selection based

Use a criterion for model selection (e.g. LRT, AIC, BIC) to choose a random effect structure that is supported by the data (see Matuschek et al., 2017; a sketch of an LRT comparison follows below)
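
A minimal sketch of an LRT comparison, reusing maxmodelfull from earlier (the reduced structure shown is just one plausible candidate, not a recommendation):

# a simpler candidate random effect structure, fitted to the same data
m_red <- lmer(ACE ~ visit * condition +
                (1 + visit | sitename) +
                (1 + visit | sitename:ppt),
              data = d3full)

# refit = FALSE keeps the REML fits when comparing random effect structures
anova(m_red, maxmodelfull, refit = FALSE)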

No right answer
  • Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of memory and language, 68(3), 255-278.
  • Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing Type I error and power in linear mixed models. Journal of memory and language, 94, 305-315.

how to simplify

One thing at a time!

# to extract random effects
VarCorr(maxmodel)
##  Groups   Name        Std.Dev. Corr
##  ppt      (Intercept) 0.383
##           visit       0.277    -0.49
##  Residual             0.495



Look for:

Variances/standard deviations of 0 (or very small, e.g. 3.56e-05)

Correlations of -1 or 1
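
A convenience sketch (not from the lecture) for scanning these programmatically; the thresholds are arbitrary:

vc <- as.data.frame(VarCorr(maxmodel))  # columns: grp, var1, var2, vcov, sdcor
# rows where var2 is NA hold standard deviations; the others hold correlations
subset(vc, (is.na(var2) & sdcor < 1e-4) | (!is.na(var2) & abs(sdcor) > 0.999))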


how to simplify (2)

Consider removing the most complex random effects first (see the sketch below).

(1 + x1 * x2 | g)
  ↓
(1 + x1 + x2 | g)
  ↓
(1 + x1 | g) OR (1 + x2 | g)
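
A minimal sketch of one step of that process, with hypothetical predictors x1, x2, outcome y and grouping factor g (not the lecture data):

# drop the interaction from the random effects, then re-check
m_simpler <- lmer(y ~ x1 * x2 + (1 + x1 + x2 | g), data = df)
isSingular(m_simpler)   # still at a boundary?
VarCorr(m_simpler)      # which term now has the least variance?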


how to simplify (3)

Categorical predictors with >2 levels are "more complex" (because they require more parameters: a random slope of a k-level factor adds k-1 slope variances plus their correlations; see the sketch below).
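
For illustration only, with a hypothetical 3-level factor cond and grouping factor g:

# a random slope of a 3-level factor adds 2 slope terms, giving
# 3 variances (intercept + 2 slopes) and 3 correlations in total
m_cat <- lmer(y ~ cond + (1 + cond | g), data = df)
VarCorr(m_cat)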


how to simplify (4)

If there are multiple levels of nesting, you'll have fewer groups as you go up the levels (so there is less data with which to estimate the higher-level random effects).

Sometimes (but not always) it makes sense to remove higher-level random effects first.

(1 + x1 + x2 | g1/g2)
   is the same as
(1 + x1 + x2 | g1) + (1 + x1 + x2 | g1:g2)
  ↓
(1 + x1 | g1) + (1 + x1 + x2 | g1:g2)


how to simplify (5)

You will be faced with subjective choices:

which simplification can you most easily accept?


random effect correlations

I've kind of been ignoring these so far...

m1 <- lmer(y ~ 1 + x1 +
             (1 + x1 | clusterid), data = df)
VarCorr(m1)
##  Groups    Name        Std.Dev. Corr
##  clusterid (Intercept) 0.759
##            x1          0.117    1.00
##  Residual              0.428


random effect correlations (2)

maxmodel <- lmer(ACE ~ visit * condition +
                   (1 + visit | ppt),
                 data = d3)
VarCorr(maxmodel)
##  Groups   Name        Std.Dev. Corr
##  ppt      (Intercept) 0.383
##           visit       0.277    -0.49
##  Residual             0.495

plot the predictions:

plot(ranef(maxmodel)$ppt)


random effect correlations (3)

maxmodelZC <- lmer(ACE ~ visit * condition +
                     (1 + visit || ppt),
                   data = d3)
VarCorr(maxmodelZC)
##  Groups   Name        Std.Dev.
##  ppt      (Intercept) 0.343
##  ppt.1    visit       0.263
##  Residual             0.498

Removing correlations between random effects simplifies your model (the double bar || estimates the intercept and slope variances but assumes their correlation is zero).

plot the predictions:

plot(ranef(maxmodelZC)$ppt)

cor(ranef(maxmodelZC)$ppt)
##             (Intercept)   visit
## (Intercept)      1.0000 -0.2384
## visit           -0.2384  1.0000

random effect correlations (4)

Different data! (here I have simulated the data so that people do have a very strong correlation in their intercepts and slopes)

maxmodel_COR <- lmer(ACE ~ visit * condition +
                       (1 + visit | ppt),
                     data = d3COR)
VarCorr(maxmodel_COR)
##  Groups   Name        Std.Dev. Corr
##  ppt      (Intercept) 0.206
##           visit       0.364    -1.00
##  Residual             0.105
maxmodel_CORsuppress <- lmer(ACE ~ visit * condition +
                               (1 + visit || ppt),
                             data = d3COR)
VarCorr(maxmodel_CORsuppress)
##  Groups   Name        Std.Dev.
##  ppt      (Intercept) 0.199
##  ppt.1    visit       0.361
##  Residual             0.107
ggplot(d3COR, aes(x = visit, y = ACE)) +
  geom_line(aes(group = ppt, col = condition), alpha = .7) +
  theme(legend.position = "bottom")

# compare the predicted random effects from the two models
plot(ranef(maxmodel_COR)$ppt)
plot(ranef(maxmodel_CORsuppress)$ppt)

cor(ranef(maxmodel_CORsuppress)$ppt)
##             (Intercept)   visit
## (Intercept)      1.0000 -0.9678
## visit           -0.9678  1.0000

Summary

  • random effect structures can get complicated quite quickly

  • there may be multiple ways in which things can be clustered.

    • clusters can themselves be clustered (nesting).
    • we can have different ways of clustering the same things (crossed)
  • the maximal random effect structure is the most complex structure we can fit to the data.

    • it often leads to problems with model convergence
  • building lmers is a balancing act between accounting for different sources of variation and attempting to fit a model that can be supported by our data.


End

