LMM analysis workflow
You do not need to memorise these steps. This page is here so you can refer to it while working through LMM analyses for your DAPR reports, minidissertations, and dissertations.
You won’t need to use every step below for every analysis, and they don’t need to be in this specific order either. This is just the order Elizabeth usually uses 😊
Think of these steps like a buffet to pick and choose from, depending on what your analysis needs.
Phase 0: Before collecting data (if applicable)
- Run a power analysis to see how many people you’ll need to gather data from, given the effect size you expect to see or the smallest effect size that’s still theoretically interesting.
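One common way to run a power analysis for an LMM is by simulation: generate many datasets with the effect size you expect, fit the model to each, and count how often the effect comes out significant. The sketch below is only an illustration under made-up assumptions: the design (one continuous predictor, random intercepts by participant), the effect size of 0.3, the variance values, and the helper name simulate_once() are all hypothetical placeholders you would replace with values from your own study.

```r
library(lmerTest)  # also loads lme4; adds p-values to lmer() summaries

set.seed(1)

# Hypothetical helper: simulate one dataset, fit the model, and report
# whether the fixed effect of x is significant at the chosen alpha level.
simulate_once <- function(n_subj, n_obs = 10, beta_x = 0.3,
                          sd_subj = 1, sd_resid = 1, alpha = 0.05) {
  # One row per observation, n_obs observations per participant
  dat <- expand.grid(obs = seq_len(n_obs), subj = factor(seq_len(n_subj)))
  dat$x <- rnorm(nrow(dat))

  # Each participant gets their own random intercept adjustment
  subj_int <- rnorm(n_subj, mean = 0, sd = sd_subj)
  dat$y <- subj_int[as.integer(dat$subj)] +
    beta_x * dat$x +
    rnorm(nrow(dat), mean = 0, sd = sd_resid)

  fit <- lmerTest::lmer(y ~ 1 + x + (1 | subj), data = dat)
  coef(summary(fit))["x", "Pr(>|t|)"] < alpha
}

# Estimated power = proportion of simulated datasets with a significant effect.
# 100 simulations keeps this quick; a real analysis would use many more.
power_est <- mean(replicate(100, simulate_once(n_subj = 30)))
power_est
```

To find the sample size you need, you could rerun the last step for several values of n_subj and see where the estimated power reaches your target.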
Phase 1: Before model fitting
1a: Set up your code and data
- Load the required R packages. You’ll probably need at least:
  - tidyverse (for managing and wrangling data)
  - lme4 (for fitting LMMs)
  - lmerTest (for displaying p-values for the fixed effect coefficients)
  - stats (for xtabs(), useful for identifying possible random slopes)
  - HLMdiag (for computing influence diagnostics)
  - effects (for plotting model-fitted values)
- Read in your data.
- Tidy the data (e.g., is anything missing? are there any implausible values? are data types set correctly?).
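As a rough sketch, the setup steps above might look like this. The file name in the commented-out read_csv() line is hypothetical. So that the example runs as written, it uses the sleepstudy dataset that ships with lme4 as a stand-in for your own data.

```r
library(tidyverse)  # managing and wrangling data
library(lme4)       # fitting LMMs
library(lmerTest)   # p-values for fixed effect coefficients
library(HLMdiag)    # influence diagnostics
library(effects)    # plotting model-fitted values
# stats (which provides xtabs()) is attached by default in every R session.

# With your own data you would read in a file, for example:
# dat <- read_csv("my_data.csv")   # hypothetical file name
# Here, a built-in dataset is used as a stand-in.
dat <- as_tibble(lme4::sleepstudy)

# Check the structure and data types
glimpse(dat)

# Count missing values in each column
n_missing <- colSums(is.na(dat))
n_missing

# Look for implausible values (e.g., negative reaction times)
summary(dat$Reaction)

# Make sure grouping variables are stored as factors
dat <- dat %>%
  mutate(Subject = factor(Subject))
```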
1b: Set up the fixed effects
- Based on the RQ, identify your outcome variable and the predictors (aka the fixed effects).
- Based on the RQ, decide whether your model requires an interaction between predictors.
- Based on the outcome variable, decide whether to use linear regression (for a continuous outcome) or logistic regression (for a binary outcome).
- Set up categorical predictors. (e.g., factor levels? which a priori coding scheme?)
- Set up continuous predictors. (e.g., any transformations?)
- Explore patterns in the fixed effects by plotting outcome and predictor variables together. (It’s most reader-friendly to plot the transformed version of the data, the version that you’ll also be modelling.)
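The fixed effect setup steps above could be sketched as follows. Everything about the data here is invented for illustration: the variable names (rt, condition, age), the simulated values, and the choice of a log transformation and sum coding are assumptions, not recommendations for your particular analysis.

```r
library(ggplot2)

set.seed(42)

# Hypothetical toy data: 20 participants, 6 trials each
dat <- data.frame(
  subj      = factor(rep(1:20, each = 6)),
  condition = rep(c("control", "treatment"), times = 60),
  age       = rep(round(runif(20, min = 18, max = 60)), each = 6)
)
dat$rt <- exp(rnorm(nrow(dat),
                    mean = 6 + 0.1 * (dat$condition == "treatment"),
                    sd = 0.3))

# Categorical predictor: set the factor levels explicitly,
# so you know which level is the reference level
dat$condition <- factor(dat$condition, levels = c("control", "treatment"))

# Choose a coding scheme. Treatment coding is R's default;
# here sum coding is set instead, purely as an example.
contrasts(dat$condition) <- contr.sum(2)
contrasts(dat$condition)

# Continuous predictor: mean-centre age
dat$age_c <- dat$age - mean(dat$age)

# Outcome: log-transform the (right-skewed) reaction times
dat$log_rt <- log(dat$rt)

# Plot the transformed data, i.e., the version that will be modelled
p <- ggplot(dat, aes(x = age_c, y = log_rt, colour = condition)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```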
1c: Set up the random effects
- Identify grouping variables which contribute random variability to your data. Your model will have random intercepts for each of these grouping variables.
- Describe the randomly-varying grouping variables (so that the reader knows how well your analysis might generalise).
- Identify the random slopes that your model can contain.
- Write out the formula for your maximal model.
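As a small illustration of the random effect steps above, the sketch below uses xtabs() to check whether a predictor varies within each level of a grouping variable (which is what makes a random slope possible), and then writes out a maximal model formula. It uses lme4's built-in sleepstudy data as a stand-in for your own data, so the variable names Reaction, Days, and Subject are specific to that example.

```r
data("sleepstudy", package = "lme4")

# Describe the grouping variable: how many participants are there?
n_subjects <- nlevels(sleepstudy$Subject)
n_subjects

# Cross-tabulate grouping variable by predictor.
# If each Subject has observations at more than one value of Days,
# then the effect of Days can vary by Subject (a possible random slope).
tab <- xtabs(~ Subject + Days, data = sleepstudy)
tab

# Number of distinct values of Days observed for each Subject
days_per_subject <- rowSums(tab > 0)
all(days_per_subject > 1)

# Maximal model for this design: random intercepts by Subject,
# plus random slopes over Days by Subject
maximal_formula <- Reaction ~ 1 + Days + (1 + Days | Subject)
maximal_formula
```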
Phase 2: Model fitting and troubleshooting
- Try fitting the maximal model to the data.
- If you get a convergence warning: Try a different optimiser and fit the model again.
- If you get a convergence or singular fit warning: Simplify the random effect structure and fit the model again.
- If you get a suggestion to rescale variables: Convert the continuous predictor(s) to z-scores and fit the model again.
- Even if you get no warnings, manually check variance components for sneaky singularities.
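The troubleshooting steps above might look roughly like this in code. This is a sketch on lme4's built-in sleepstudy data, which fits without warnings, so the alternative fits are shown only to illustrate the syntax. The choice of the "bobyqa" optimiser and the order of simplification (first removing the correlation, then the random slope) are assumptions for the example, not the only reasonable options.

```r
library(lme4)

data("sleepstudy", package = "lme4")

# Try fitting the maximal model
m_max <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), data = sleepstudy)

# Convergence warning? Try a different optimiser.
m_bobyqa <- update(m_max, control = lmerControl(optimizer = "bobyqa"))

# Convergence or singular fit warning? Simplify the random effect structure.
# One option: remove the correlation between intercepts and slopes...
m_nocorr <- lmer(Reaction ~ 1 + Days + (1 + Days || Subject), data = sleepstudy)
# ...and if that is not enough, remove the random slope.
m_int <- lmer(Reaction ~ 1 + Days + (1 | Subject), data = sleepstudy)

# Suggestion to rescale variables? Convert continuous predictors to z-scores.
sleepstudy$Days_z <- as.numeric(scale(sleepstudy$Days))
m_z <- lmer(Reaction ~ 1 + Days_z + (1 + Days_z | Subject), data = sleepstudy)

# Even with no warnings, check the variance components by hand:
# look for SDs very close to 0 or correlations very close to -1 or 1.
VarCorr(m_max)

# isSingular() runs a similar check automatically
isSingular(m_max)
```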
Phase 3: After model fitting
3a: Check assumptions and diagnostics
- Check model assumptions.
  - see Check assumptions flash card
- If model assumptions about normality/equal variance of errors are violated, bootstrap model estimates.
  - see DAPR2 bootstrapping flash card for how to bootstrap simple linear models.
  - for bootstrapping LMMs (not covered in DAPR3), you could try lme4::bootMer() (see its documentation).
- Run diagnostics for multicollinearity.
- Run diagnostics for influential observations and levels of grouping variables.
- If you find extreme influential observations/groups: run sensitivity analysis.
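A minimal sketch of some of these checks is below, again using lme4's sleepstudy data as a stand-in. A few caveats: the influence diagnostics here use lme4's own influence() function rather than HLMdiag, simply to keep the example to one package; the bootstrap uses bootMer()'s default settings with a small number of simulations to keep it quick; and multicollinearity is not shown because this example model has only one predictor.

```r
library(lme4)

data("sleepstudy", package = "lme4")
m <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), data = sleepstudy)

# Assumptions: residuals vs fitted values (look for roughly equal spread)
plot(m)

# Assumptions: normality of residuals and of random intercepts
qqnorm(resid(m)); qqline(resid(m))
re <- ranef(m)$Subject
qqnorm(re[["(Intercept)"]]); qqline(re[["(Intercept)"]])

# Bootstrapped fixed effect estimates with percentile 95% intervals.
# nsim = 100 is only for speed; use many more in a real analysis.
set.seed(2024)
boot_fixef <- bootMer(m, FUN = fixef, nsim = 100)
boot_ci <- apply(boot_fixef$t, 2, quantile, probs = c(0.025, 0.975))
boot_ci

# Influence of each level of the grouping variable:
# refit the model leaving out one Subject at a time
infl_subj <- influence(m, groups = "Subject")
cooks_subj <- cooks.distance(infl_subj)
sort(cooks_subj, decreasing = TRUE)

# Sensitivity analysis: refit without the most influential Subject and
# compare the estimates. This assumes the Cook's distances are in the
# order that Subjects first appear in the data.
subj_ids <- as.character(unique(sleepstudy$Subject))
most_influential <- subj_ids[which.max(cooks_subj)]
m_sens <- update(m, data = subset(sleepstudy, Subject != most_influential))
cbind(full = fixef(m), without_most_influential = fixef(m_sens))
```

If the estimates from the two fits lead to the same conclusions, the influential group is less of a concern; if they differ meaningfully, report both.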
3b: Plot and interpret model estimates
- Plot model-fitted values.
- Interpret the fixed effect estimates.
- Plot random effects.
- Interpret SD of random effects relative to fixed effects.
- Interpret correlations between intercept and slope adjustments.
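The plotting and interpretation steps above could be sketched like this, once more with lme4's sleepstudy data standing in for your own. The "roughly 95% of participants" range is a rule of thumb that assumes the random slopes are normally distributed.

```r
library(lme4)
library(effects)
library(ggplot2)

data("sleepstudy", package = "lme4")
m <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), data = sleepstudy)

# Plot model-fitted values for the fixed effect of Days
eff_df <- as.data.frame(effect("Days", m))
ggplot(eff_df, aes(x = Days, y = fit)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
  geom_line()

# Fixed effect estimates
fixef(m)

# Plot random effects: by-Subject adjustments to the intercept and slope
lattice::dotplot(ranef(m, condVar = TRUE))

# Variance components as a data frame (SDs and correlations are in sdcor)
vc <- as.data.frame(VarCorr(m))
vc

# Interpret the SD of the random slopes relative to the fixed effect:
# roughly 95% of participants are expected to have a Days slope in this range
slope_fixed <- unname(fixef(m)["Days"])
slope_sd <- vc$sdcor[vc$var1 %in% "Days" & is.na(vc$var2)]
slope_range <- slope_fixed + c(-1.96, 1.96) * slope_sd
slope_range

# Correlation between intercept and slope adjustments:
# positive means participants with higher intercepts tend to have steeper slopes
int_slope_cor <- vc$sdcor[!is.na(vc$var2)]
int_slope_cor
```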
3c: Write up methods and results
- Write up your analysis and results.