multilevel modelling: working with group-structured data
regression refresher
the multilevel model
more complex groupings
centering, assumptions, and diagnostics
recap
factor analysis: working with multi-item measures
measurement and dimensionality
exploring underlying constructs (EFA)
testing theoretical models (CFA)
reliability and validity
recap & exam prep
This week
introduction to the multilevel/mixed effects model
how to think about it (model structure)
how to fit it (in R)
Terminology
(size weighted by hits on Google Scholar)
Building up the idea - in pictures
back to our example
Are older people more satisfied with life? We have data on 112 people from 12 different dwellings (cities/towns) in Scotland: their ages and some measure of life satisfaction.
their value of \(\color{red}{y}\) =
some number (\(\color{blue}{b_0}\)) +
some amount (\(\color{blue}{b_1}\)) times their value of \(x\) +
their residual \(\varepsilon_j\)
their value of \(\color{red}{y}\) =
some number for group \(i\) (\(\color{blue}{b_{0i}}\)) +
some amount (\(\color{blue}{b_1}\)) times their value of \(x\) +
their residual \(\varepsilon_{ij}\)
their value of \(\color{red}{y}\) =
some number for group \(i\) (\(\color{blue}{b_{0i}}\)) +
some amount (\(\color{blue}{b_1}\)) times their value of \(x\) +
their residual \(\varepsilon_{ij}\)
group \(i\)’s intercept (\(\color{blue}{b_{0i}}\)) =
the intercept for the average of the population of groups (\(\gamma_{00}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{0i}}\)) from \(\gamma_{00}\)
their value of \(\color{red}{y}\) =
some number for group \(i\) (\(\color{blue}{b_{0i}}\)) +
some amount (\(\color{blue}{b_1}\)) times their value of \(x\) +
their residual \(\varepsilon_{ij}\)
group \(i\)’s intercept (\(\color{blue}{b_{0i}}\)) =
the intercept for the average of the population of groups (\(\gamma_{00}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{0i}}\)) from \(\gamma_{00}\)
We are now assuming \(\color{orange}{\zeta_0}\) and \(\varepsilon\) to be normally distributed with a mean of 0, and we denote their variances as \(\color{orange}{\sigma_0^2}\) and \(\sigma_\varepsilon^2\) respectively.
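Putting the two levels together with these assumptions, the random intercept model in equation form is:
\(\color{red}{y_{ij}} = \color{blue}{b_{0i}} + \color{blue}{b_{1}} \cdot x_{ij} + \varepsilon_{ij}\)
\(\color{blue}{b_{0i}} = \gamma_{00} + \color{orange}{\zeta_{0i}}\)
\(\color{orange}{\zeta_{0i}} \sim N(0, \color{orange}{\sigma^2_0}), \qquad \varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)\)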
multi-level (mixed-effects) regression
Sometimes, you will see the levels collapsed into one equation, as it might make for more intuitive reading:
The \(\color{orange}{\zeta}\) components are also termed the “random effects” part of the model, hence names like “random effects model”, etc.
“random” because they are considered a random sample from a larger population, such that \(\color{orange}{\zeta_{0i}} \sim N(0, \color{orange}{\sigma^2_0})\).
No: Edinburgh does not just “randomly” happen to be higher than average
Yes: Edinburgh is just one of a “random sample” of dwellings. It is high in lifesat for various reasons (it’s a nice city, there’s less rain than the west coast, etc.).
fixed and random (2)
Should variable g be fixed or random?
Should we…
estimate a bunch of specific differences between dwellings?
estimate the variability between dwellings?
Fixed (\(y \sim \,...\, + \,g\)): if the experiment were repeated, the same groups would be used; the conclusions refer to the groups used.
Random (\(y \sim \,...\, + \,(\,...\,|\,g)\)): if the experiment were repeated, different groups would be used; the conclusions refer to a population from which the groups used are just a (random) sample.
their value of \(\color{red}{y}\) =
some number for group \(i\) (\(\color{blue}{b_{0i}}\)) +
some amount (\(\color{blue}{b_1}\)) times their value of \(x\) +
their residual \(\varepsilon_{ij}\)
group \(i\)’s intercept (\(\color{blue}{b_{0i}}\)) =
the intercept for the average of the population of groups (\(\gamma_{00}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{0i}}\)) from \(\gamma_{00}\)
their value of \(\color{red}{y}\) =
some number for group \(i\) (\(\color{blue}{b_{0i}}\)) +
some amount for group \(i\) (\(\color{blue}{b_{1i}}\)) times their value of \(x\) +
their residual \(\varepsilon_{ij}\)
group \(i\)’s intercept (\(\color{blue}{b_{0i}}\)) =
the intercept for the average of the population of groups (\(\gamma_{00}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{0i}}\)) from \(\gamma_{00}\)
group \(i\)’s slope (\(\color{blue}{b_{1i}}\)) =
the slope for the average of the population of groups (\(\gamma_{10}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{1i}}\)) from \(\gamma_{10}\)
their value of \(\color{red}{y}\) =
some number for group \(i\) (\(\color{blue}{b_{0i}}\)) +
some amount for group \(i\) (\(\color{blue}{b_{1i}}\)) times their value of \(x\) +
their residual \(\varepsilon_{ij}\)
group \(i\)’s intercept (\(\color{blue}{b_{0i}}\)) =
the intercept for the average of the population of groups (\(\gamma_{00}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{0i}}\)) from \(\gamma_{00}\)
group \(i\)’s slope (\(\color{blue}{b_{1i}}\)) =
the slope for the average of the population of groups (\(\gamma_{10}\)) +
the deviation of group \(i\) (\(\color{orange}{\zeta_{1i}}\)) from \(\gamma_{10}\)
group deviations for intercepts and slopes are normally distributed with a mean of 0 and variances of \(\color{orange}{\sigma^2_0}\) and \(\color{orange}{\sigma^2_1}\) respectively, and with a correlation of \(\color{orange}{\rho_{01}}\).
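Written compactly, the pair of group deviations follows a bivariate normal distribution:
\(\begin{bmatrix} \color{orange}{\zeta_{0i}} \\ \color{orange}{\zeta_{1i}} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \color{orange}{\sigma^2_0} & \rho_{01}\sigma_0\sigma_1 \\ \rho_{01}\sigma_0\sigma_1 & \color{orange}{\sigma^2_1} \end{bmatrix} \right)\)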
random intercepts, random slopes
the multilevel model: partial pooling
In a no-pooling approach, information is not combined in any way (data from cluster \(i\) contributes to the slope for cluster \(i\), but nothing else).
Information from Glasgow, Edinburgh, Perth, Dundee etc doesn’t influence what we think about Stirling
In the multilevel model, we model group deviations as normally distributed with a variance of \(\color{orange}{\sigma^2_b}\).
clusters’ contributions to the model depend on:
\(\color{orange}{\sigma^2_b}\) relative to \(\sigma^2_\varepsilon\)
the amount of data in each cluster
the multilevel model: partial pooling
In a no-pooling approach, information is not combined in any way (data from cluster \(i\) contributes to the slope for cluster \(i\), but nothing else).
Information from Glasgow, Edinburgh, Perth, Dundee etc doesn’t influence what we think about Stirling
In the multilevel model, we model group deviations as normally distributed with a variance of \(\color{orange}{\sigma^2_b}\).
clusters’ contributions to the model depend on:
how distinct the clustering is
how big a cluster is
cluster-specific predictions are shrunk towards the average depending on these same two things.
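A rough R sketch of the distinction, assuming a hypothetical data frame d with columns lifesat, age, and dwelling (the grouping variable):
library(lme4)
# complete pooling: one line for everyone; clustering ignored
mod_pool <- lm(lifesat ~ age, data = d)
# no pooling: separate intercepts and slopes estimated for each dwelling;
# information from one dwelling does not influence estimates for another
mod_nopool <- lm(lifesat ~ age * dwelling, data = d)
# partial pooling: dwelling-level deviations are modelled as draws from a normal
# distribution, so estimates for small or indistinct clusters are shrunk
# towards the overall average
mod_partial <- lmer(lifesat ~ age + (1 + age | dwelling), data = d)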
Multi-level models can be used to answer multi-level questions!
Do phenomena at Level X influence effects at Level Y?
example: \(n\) children’s grades are recorded every year throughout school
Q: Does being mono/bi-lingual influence children’s grades over the duration of their schooling?
Multi-level models can be used to answer multi-level questions!
Do phenomena at Level X influence effects at Level Y?
example: \(n\) children’s grades are recorded every year throughout school
Q: Does being mono/bi-lingual influence children’s grades over the duration of their schooling?
Multi-level models can be used to answer multi-level questions!
Do random variances covary?
example: \(n\) participants’ cognitive ability is measured across time.
Q: Do people who have higher cognitive scores at start of study show less decline over the study period than those who have lower scores?
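A sketch of how such a question maps onto a model (placeholder variable and data names): the correlation between the random intercepts and random slopes is the parameter that speaks to it.
library(lme4)
# cog = cognitive score, time = measurement occasion, ppt = participant id (placeholders)
m <- lmer(cog ~ 1 + time + (1 + time | ppt), data = dat)
VarCorr(m)  # the Corr column shows how intercept and slope deviations covary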
The lme4 package (many others are available, but lme4 is the most popular).
The lmer() function’s syntax is similar to lm()’s, in that we specify:
[outcome variable] ~ [explanatory variables], data = [name of dataframe]
in lmer(), we also have to specify the random effect structure in parentheses:
[outcome variable] ~ [explanatory variables] + ([vary this] | [by this grouping variable]),
data = [name of dataframe],
REML = [TRUE/FALSE]
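For example (placeholder variable and data names), varying both the intercept and the slope of x across levels of a grouping variable g:
library(lme4)
mod <- lmer(y ~ 1 + x + (1 + x | g), data = mydata, REML = TRUE)
summary(mod)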
A new example
In a study examining how cognition changes over time, a sample of 20 participants took the Addenbrooke’s Cognitive Examination (ACE) every 2 years from age 60 to age 78.
# A tibble: 6 × 7
sitename ppt condition visit age ACE imp
<chr> <chr> <chr> <dbl> <dbl> <dbl> <chr>
1 Sncbk PPT_1 control 1 60 84.5 unimp
2 Sncbk PPT_1 control 2 62 85.6 imp
3 Sncbk PPT_1 control 3 64 84.5 imp
4 Sncbk PPT_1 control 4 66 83.1 imp
5 Sncbk PPT_1 control 5 68 82.3 imp
6 Sncbk PPT_1 control 6 70 83.3 imp
pptplots <- ggplot(d3, aes(x = visit, y = ACE, col = ppt)) +
  geom_point() +
  facet_wrap(~ppt) +
  guides(col = "none") +
  labs(x = "visit", y = "cognition")
pptplots
Fitting an lm
lm_mod <- lm(ACE ~ 1 + visit, data = d3)
summary(lm_mod)
Call:
lm(formula = ACE ~ 1 + visit, data = d3)
Residuals:
Min 1Q Median 3Q Max
-4.868 -1.054 -0.183 1.146 6.632
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 85.6259 0.3022 283.3 < 2e-16 ***
visit -0.3857 0.0488 -7.9 2.9e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.88 on 175 degrees of freedom
Multiple R-squared: 0.263, Adjusted R-squared: 0.259
F-statistic: 62.5 on 1 and 175 DF, p-value: 2.9e-13
pptplots + geom_line(aes(y = fitted(lm_mod)), col = "blue", lwd = 1)
random intercept
vary the intercept by participant.
library(lme4)
ri_mod <- lmer(ACE ~ 1 + visit + (1 | ppt), data = d3)
summary(ri_mod)
Linear mixed model fit by REML ['lmerMod']
Formula: ACE ~ 1 + visit + (1 | ppt)
Data: d3
REML criterion at convergence: 595
Scaled residuals:
Min 1Q Median 3Q Max
-2.8830 -0.5445 -0.0232 0.6491 3.0494
Random effects:
Groups Name Variance Std.Dev.
ppt (Intercept) 2.25 1.5
Residual 1.21 1.1
Number of obs: 177, groups: ppt, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 85.6497 0.3800 225.4
visit -0.3794 0.0287 -13.2
Correlation of Fixed Effects:
(Intr)
visit -0.411
pptplots + geom_line(aes(y = fitted(ri_mod)), col = "red", lwd = 1)
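random intercepts and random slopes
We can let the effect of visit vary by participant too, and include the condition variable. The output below comes from a call of this form (reconstructed from the Formula line in the output; the object name rs_mod is just a placeholder):
rs_mod <- lmer(ACE ~ 1 + visit * condition + (1 + visit | ppt), data = d3)
summary(rs_mod)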
Linear mixed model fit by REML ['lmerMod']
Formula: ACE ~ 1 + visit * condition + (1 + visit | ppt)
Data: d3
REML criterion at convergence: 362
Scaled residuals:
Min 1Q Median 3Q Max
-2.296 -0.686 -0.041 0.686 2.420
Random effects:
Groups Name Variance Std.Dev. Corr
ppt (Intercept) 0.1467 0.383
visit 0.0769 0.277 -0.49
Residual 0.2445 0.495
Number of obs: 177, groups: ppt, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 85.7353 0.1688 507.79
visit -0.5319 0.0897 -5.93
conditionmindfulness -0.2979 0.2362 -1.26
visit:conditionmindfulness 0.3459 0.1269 2.73
Correlation of Fixed Effects:
(Intr) visit cndtnm
visit -0.471
cndtnmndfln -0.715 0.337
vst:cndtnmn 0.333 -0.707 -0.473
lmer output (1)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.7890 0.2858 6.26
x -0.6250 0.0996 -6.27
lmer output (2)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.7890 0.2858 6.26
x -0.6250 0.0996 -6.27
lmer output (3)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.7890 0.2858 6.26
x -0.6250 0.0996 -6.27
lmer output (4)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.7890 0.2858 6.26
x -0.6250 0.0996 -6.27
Model Parameters
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.7890 0.2858 6.26
x -0.6250 0.0996 -6.27
Fixed effects:
fixef(model)
(Intercept) x
1.789 -0.625
Variance components:
VarCorr(model)
Groups Name Std.Dev. Corr
group (Intercept) 1.152
x 0.390 -0.88
Residual 0.512
Model Predictions: ranef, coef
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.7890 0.2858 6.26
x -0.6250 0.0996 -6.27
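As the slide title suggests, the cluster-specific parts of the model can be extracted with ranef() and coef() (a sketch; the printed values depend on the fitted model):
fixef(model)   # the fixed intercept and slope (the gammas)
ranef(model)   # each group's deviations from them (the zetas)
coef(model)    # fixed effects + deviations: each group's own intercept and slope
fitted(model)  # predictions that include the group-level deviations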
Degrees of freedom for multilevel models are not clearly defined
Lots and lots of different methods
Inference
Degrees of freedom for multilevel models are not clearly defined
Lots and lots of different methods
Inference
Degrees of freedom for multilevel models are not clearly defined
Lots and lots of different methods
Inference - lmerTest
Degrees of freedom for multilevel models are not clearly defined
Lots and lots of different methods
using the {lmerTest} package will overwrite the lmer() function, and give you hypothesis tests for the fixed effects:
library(lmerTest)
model <- lmer(y ~ x + (1 + x | group), my_data)
summary(model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: y ~ x + (1 + x | group)
Data: my_data
REML criterion at convergence: 335
Scaled residuals:
Min 1Q Median 3Q Max
-2.1279 -0.7009 0.0414 0.6645 2.1010
Random effects:
Groups Name Variance Std.Dev. Corr
group (Intercept) 1.326 1.152
x 0.152 0.390 -0.88
Residual 0.262 0.512
Number of obs: 170, groups: group, 20
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.7890 0.2858 16.4478 6.26 0.000010 ***
x -0.6250 0.0996 15.4780 -6.27 0.000013 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
x -0.894
Inference - LRTs
Degrees of freedom for multilevel models are not clearly defined
Lots and lots of different methods
The anova() function will perform a “likelihood ratio test” to allow you to compare two models:
restricted_model <- lmer(y ~ 1 + (1 + x | group), my_data)
model <- lmer(y ~ 1 + x + (1 + x | group), my_data)
anova(restricted_model, model)
the vast majority of research questions are about fixed effects, not random effects
model comparisons should differ only in their fixed effects, and keep the same random effects
testing random effects is difficult
and generally not necessary, as we motivate their inclusion based on our understanding of the study design (and any simplifications)
Summary
We can extend our linear model equation to model certain parameters as random cluster-level adjustments around a fixed center.
\(\color{red}{y_i} = \color{blue}{b_0 \cdot{} 1 \; + \; b_1 \cdot{} x_{i} } + \varepsilon_i\)
If we allow the intercept to vary by cluster, this becomes: \(\color{red}{y_{ij}} = \color{blue}{b_{0i} \cdot 1 + b_{1} \cdot x_{ij}} + \varepsilon_{ij}\), where \(\color{blue}{b_{0i}} = \gamma_{00} + \color{orange}{\zeta_{0i}}\)
We can express this as one equation if we prefer: \(\color{red}{y_{ij}} = \underbrace{(\gamma_{00} + \color{orange}{\zeta_{0i}})}_{\color{blue}{b_{0i}}} \cdot 1 + \color{blue}{b_{1} \cdot x_{ij}} + \varepsilon_{ij}\)
We can allow slopes to vary by cluster too.
This allows us to model cluster-level variation around the intercept (“random intercepts”) and around slopes (“random slopes”).
We can fit these models using the lme4 package in R.
This week
Tasks
Complete readings
Attend your lab and work together on the exercises