class: center, middle, inverse, title-slide .title[ #
Week 2: Longitudinal Data Analysis using Multilevel Modeling
] .subtitle[ ## Multivariate Statistics and Methodology using R (MSMR)
] .author[ ### Dan Mirman ] .institute[ ### Department of Psychology
The University of Edinburgh ]

---

# Longitudinal data are a natural application domain for MLM

* Longitudinal measurements are *nested* within subjects (by definition)
* Longitudinal measurements are related by a continuous variable, spacing can be uneven across participants, and data can be missing
    + These are problems for rmANOVA
* Trajectories of longitudinal change can be nonlinear (we'll get to that next week)

---

# Another example

`web_cbt.rda`: Effect of a 6-week web-based CBT intervention for depression, with and without a weekly 10-minute phone call (simulated data based on Farrer et al., 2011, https://doi.org/10.1371/journal.pone.0028099).

```r
library(tidyverse)
library(lme4)
library(lmerTest)
library(broom.mixed)
library(effects)
load("./data/web_cbt.rda")
#str(web_cbt)
```

* `Participant`: Participant ID
* `Wave`: Week of data collection
* `Web`: Whether the participant was told to access an online CBT programme
* `Phone`: Whether the participant received a weekly 10-minute phone call to address issues with web programme access or discuss environmental and lifestyle factors (no therapy or counselling was provided over the phone)
* `CES_D`: Center for Epidemiologic Studies Depression Scale; scores range from 0 to 60, with higher scores indicating more severe depression

---

# Inspect data

```r
ggplot(web_cbt, aes(Wave, CES_D, color=Web, shape=Phone)) +
  stat_summary(fun.data=mean_se, geom="pointrange", position=position_dodge(width=0.2)) +
  stat_summary(fun=mean, geom="line") +
  scale_shape_manual(values=c(16, 21))
```

![](msmr_lec02-2_LinearLDA_files/figure-html/unnamed-chunk-2-1.png)<!-- -->

---

# Research questions

1. Did the randomisation work? Participants were randomly assigned to groups, so there should be no group differences at baseline, but it's always a good idea to check that the randomisation was successful.
2. Did the web-based CBT programme reduce depression?
3. Did the phone calls reduce depression?
4. Did the phone calls interact with the web-based CBT programme?
That is, did getting a phone call change the efficacy of the web-based programme? (Or, equally, did the CBT programme change the effect of phone calls?)

---

# What is the baseline?

**Did the randomisation work? Were there any group differences at baseline?**

We can answer this question using the **intercept** coefficients

--

But those will be estimated at `Wave = 0`, which is 1 week *before* the baseline, which was at `Wave = 1`. So we need to adjust the time variable so that baseline corresponds to time 0:

```r
web_cbt$Time <- web_cbt$Wave - 1
```

Now `Time` is a variable just like `Wave`, but going from 0 to 5 instead of 1 to 6.

--

In this case, we made a very minor change, but the more general point is that it's important to think about intercepts. Lots of common predictor variables (calendar year, age, height, etc.) typically do not start at 0 and will produce meaningless intercept estimates (e.g., in the year Jesus was born, or for someone who is 0cm tall) unless properly adjusted or centered.

---

# Fit the models

Sometimes it is helpful to build the model step-by-step so you can compare whether adding one (or more) parameters improved the model fit.

--

Start with a simple, baseline model of how CES_D changed over time across all conditions.
```r
m <- lmer(CES_D ~ Time + (1 + Time | Participant),
          data=web_cbt, REML=FALSE)
```

--

Adding the condition terms `Phone * Web`, but not their interaction with `Time`, will allow us to test overall differences between conditions.

```r
m.int <- lmer(CES_D ~ Time + (Phone * Web) + (1 + Time | Participant),
              data=web_cbt, REML=FALSE)
```

--

This is the full model, including Time-by-Condition interactions, which will allow us to test whether the conditions differed in terms of the slope of the recovery.

```r
m.full <- lmer(CES_D ~ Time*Phone*Web + (1 + Time | Participant),
               data=web_cbt, REML=FALSE)
```

---

# Compare models

To compare the models, we use the `anova()` function, but recall that this is *not* an ANOVA; we are doing a likelihood ratio test using a `\(\chi^2\)` test statistic.

```r
anova(m, m.int, m.full)
```

```
## Data: web_cbt
## Models:
## m: CES_D ~ Time + (1 + Time | Participant)
## m.int: CES_D ~ Time + (Phone * Web) + (1 + Time | Participant)
## m.full: CES_D ~ Time * Phone * Web + (1 + Time | Participant)
##        npar  AIC  BIC logLik deviance  Chisq Df Pr(>Chisq)    
## m         6 4516 4545  -2252     4504                         
## m.int     9 4518 4560  -2250     4500   4.48  3       0.21    
## m.full   12 4420 4477  -2198     4396 104.10  3     <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

No differences overall ( `\(\chi^2(3) = 4.48, p = 0.21\)` ), but differences in slopes ( `\(\chi^2(3) = 104.10, p < 0.0001\)` ). So the groups differed in how their depression symptoms changed over the 6 weeks of the study.

---

# Examine Fixed Effects

```r
gt(tidy(m.full, effects="fixed"))
```
| effect | term | estimate | std.error | statistic | df | p.value |
|:-------|:-----|---------:|----------:|----------:|---:|--------:|
| fixed | (Intercept) | 36.86259 | 1.5772 | 23.3724 | 140 | 3.549e-50 |
| fixed | Time | -2.67837 | 0.1261 | -21.2403 | 140 | 1.247e-45 |
| fixed | PhoneNo | 0.52245 | 2.2305 | 0.2342 | 140 | 8.151e-01 |
| fixed | WebNo | -0.58776 | 2.2305 | -0.2635 | 140 | 7.925e-01 |
| fixed | Time:PhoneNo | 0.23673 | 0.1783 | 1.3275 | 140 | 1.865e-01 |
| fixed | Time:WebNo | 1.58939 | 0.1783 | 8.9126 | 140 | 2.405e-15 |
| fixed | PhoneNo:WebNo | 1.76327 | 3.1544 | 0.5590 | 140 | 5.771e-01 |
| fixed | Time:PhoneNo:WebNo | -0.07102 | 0.2522 | -0.2816 | 140 | 7.787e-01 |
Parameters were estimated for the "No" groups relative to the "Yes" groups, which is the reverse of how we framed our research questions. We can fix this by reversing the factor levels and re-fitting the model.

---

# Fix factor coding and refit model

```r
web_cbt <- web_cbt %>%
  mutate(Web = fct_rev(Web), Phone = fct_rev(Phone))
m.full <- lmer(CES_D ~ Time*Phone*Web + (1 + Time | Participant),
               data=web_cbt, REML=FALSE)
```

.pull-left[
| term | estimate | std.error | statistic | df | p.value |
|:-----|---------:|----------:|----------:|---:|--------:|
| (Intercept) | 38.56054 | 1.5772 | 24.4488 | 140 | 2.228e-52 |
| Time | -0.92327 | 0.1261 | -7.3218 | 140 | 1.752e-11 |
| PhoneYes | -2.28571 | 2.2305 | -1.0248 | 140 | 3.072e-01 |
| WebYes | -1.17551 | 2.2305 | -0.5270 | 140 | 5.990e-01 |
| Time:PhoneYes | -0.16571 | 0.1783 | -0.9293 | 140 | 3.544e-01 |
| Time:WebYes | -1.51837 | 0.1783 | -8.5144 | 140 | 2.347e-14 |
| PhoneYes:WebYes | 1.76327 | 3.1544 | 0.5590 | 140 | 5.771e-01 |
| Time:PhoneYes:WebYes | -0.07102 | 0.2522 | -0.2816 | 140 | 7.787e-01 |
]

--

.pull-right[

Now the control group (no phone, no web) is the reference level, and it did show some improvement over time (`Time`: `\(Est = -0.92, SE = 0.13, t(140)=-7.32, p < 0.0001\)`). The web intervention group improved even faster (`Time:WebYes`: `\(Est = -1.52, SE = 0.18, t(140)=-8.51, p < 0.0001\)`).

`PhoneYes`, `WebYes`, and `PhoneYes:WebYes` are all non-significant `\((p > 0.3)\)`, indicating no differences at baseline; i.e., the randomisation worked.

The effect of the phone calls is not significant on its own (`Time:PhoneYes`: `\(p > 0.3\)`) and does not change the effect of the web intervention (`Time:PhoneYes:WebYes`: `\(p > 0.7\)`).

]

---

# Visualize effects (1): model-predicted values

```r
ggplot(augment(m.full), aes(Time, CES_D, color=Web, shape=Phone)) +
  stat_summary(fun.data=mean_se, geom="pointrange", position=position_dodge(width=0.1)) +
  stat_summary(aes(y=.fitted, linetype=Phone), fun=mean, geom="line") +
  scale_shape_manual(values=c(16, 21)) +
  theme_bw() + scale_color_brewer(palette = "Set1")
```

![](msmr_lec02-2_LinearLDA_files/figure-html/unnamed-chunk-11-1.png)<!-- -->

---

# Visualize effects (2): model-based lines

The `effects::effect()` function produces predicted outcome values from a model for the specified predictor(s).

```r
ef <- as.data.frame(effect("Time:Phone:Web", m.full))
#summary(ef)
head(ef, 10)
```

```
##    Time Phone Web   fit    se lower upper
## 1     0    No  No 38.56 1.577 35.46 41.66
## 2     1    No  No 37.64 1.558 34.58 40.70
## 3     2    No  No 36.71 1.549 33.67 39.75
## 4     4    No  No 34.87 1.562 31.80 37.93
## 5     5    No  No 33.94 1.583 30.84 37.05
## 6     0   Yes  No 36.27 1.577 33.18 39.37
## 7     1   Yes  No 35.19 1.558 32.13 38.24
## 8     2   Yes  No 34.10 1.549 31.06 37.14
## 9     4   Yes  No 31.92 1.562 28.85 34.98
## 10    5   Yes  No 30.83 1.583 27.72 33.94
```

---

# Visualize effects (2): model-based lines

```r
ggplot(ef, aes(Time, fit, color=Web, linetype=Phone)) +
  geom_line(linewidth=2) +
  theme_bw() + scale_color_brewer(palette = "Set1")
```
![](msmr_lec02-2_LinearLDA_files/figure-html/unnamed-chunk-13-1.png)<!-- -->

---

# Random effects

<img src="./figs/RandomEffectsLinDemo.png" width="75%" />

--

**Keep it maximal**: A full or maximal random effect structure is one in which all of the factors that could hypothetically vary across individual observational units are allowed to do so.

* Incomplete random effects can inflate false alarm (Type I error) rates
* Full random effects can produce convergence problems and may need to be simplified

---

# Participants as fixed vs. random effects

**In general**, participants should be treated as random effects. This captures the typical assumption of random sampling from some population to which we wish to generalise.

**Pooling**

<img src="./figs/partial-pooling-vs-others-1.png" width="30%" />

http://tjmahr.github.io/plotting-partial-pooling-in-mixed-effects-models/

---

# Participants as fixed vs. random effects

**In general**, participants should be treated as random effects. This captures the typical assumption of random sampling from some population to which we wish to generalise.

**Shrinkage**

<img src="./figs/shrinkage-plot-1.png" width="30%" />

http://tjmahr.github.io/plotting-partial-pooling-in-mixed-effects-models/

Another explanation of shrinkage: https://m-clark.github.io/posts/2019-05-14-shrinkage-in-mixed-models/

---

# Participants as fixed vs. random effects

**In general**, participants should be treated as random effects. This captures the typical assumption of random sampling from some population to which we wish to generalise.

**Exceptions are possible**: e.g., neurological/neuropsychological case studies where the goal is to characterise the pattern of performance for each participant, not generalise to a population.
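To make the simplification advice concrete, here is a hedged sketch of how one might step down from a maximal random effect structure using the `web_cbt` data; the model names (`m.max`, `m.nocorr`, `m.noslope`) are illustrative, not from these slides:

```r
# Maximal: by-participant random intercepts and Time slopes (correlated)
m.max <- lmer(CES_D ~ Time*Phone*Web + (1 + Time | Participant),
              data=web_cbt, REML=FALSE)

# If m.max fails to converge, a common first simplification is to drop
# the slope-intercept correlation (lme4's double-bar syntax):
m.nocorr <- lmer(CES_D ~ Time*Phone*Web + (1 + Time || Participant),
                 data=web_cbt, REML=FALSE)

# A further (riskier) simplification drops the random slope entirely:
m.noslope <- lmer(CES_D ~ Time*Phone*Web + (1 | Participant),
                  data=web_cbt, REML=FALSE)

# Likelihood ratio tests: does the random slope (and its correlation)
# improve the fit enough to keep?
anova(m.noslope, m.nocorr, m.max)
```

Remember that dropping a needed random slope tends to inflate false alarm rates for the corresponding fixed effect, so simplify only as far as convergence requires.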
---

# Key points

.pull-left[

**Logistic MLM**

* Binary outcomes have particular distributional properties; use binomial (logistic) models to capture their generative process
* Very simple extension of `lmer` code
    + `glmer()` instead of `lmer()`
    + add `family=binomial`
    + Outcome can be binary 1s and 0s or aggregated counts, e.g., `cbind(NumCorrect, NumError)`
* Logistic MLMs are slower to fit and are prone to convergence problems

]

.pull-right[

**Longitudinal Data Analysis (linear)**

* LDA is a natural application for MLM
* Use random effects to capture within-participant (longitudinal) nesting
    + Keep it maximal, simplify as needed (more on this next week)
    + Partial pooling and shrinkage
* Don't forget about contrast coding and centering

]

---

# Live R

### Longitudinal Data Analysis is not just for longitudinal data

LDA methods are useful whenever there is a continuous predictor, even if that predictor is not 'Time'.

Example: the time to find a target object among a bunch of distractors (visual search) is a linear function of the number of distractors.

![](msmr_lec02-2_LinearLDA_files/figure-html/unnamed-chunk-14-1.png)<!-- -->
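Such an analysis follows the same template as the CES_D models above. A hedged sketch, assuming a hypothetical data frame `visual_search` with columns `RT` (response time in ms), `Set_Size` (number of items in the display), and `Participant`:

```r
# 'Set_Size' plays the role that 'Time' played in the CBT example:
# a continuous predictor with by-participant random intercepts
# (baseline speed) and slopes (search rate, ms per additional item).
m.search <- lmer(RT ~ Set_Size + (1 + Set_Size | Participant),
                 data=visual_search, REML=FALSE)
summary(m.search)
```

As with `Time`, it is worth thinking about where zero is: a display with 0 items is not meaningful, so centering `Set_Size` (e.g., at the smallest display size used) gives the intercept a sensible interpretation.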