Week 1 Exercises: Intro to MLM

These are the main packages we’re going to use in this block. It might make sense to install them now if you do not have them already

  • tidyverse : for organising data
  • lme4 : for fitting generalised linear mixed effects models
  • broom.mixed : tidying methods for mixed models
  • effects : for tabulating and graphing effects in linear models
  • lmerTest: for quick p-values from mixed models
  • parameters: various inferential methods for mixed models

A Toy Example

For our first foray into the multilevel model, we’re going to start with little toy example, and we’re just going to ask you to plot the predictions from a) a simple linear model, b) a model with a random intercept, and c) a model with random intercepts and slopes.

This is to build the understanding of the structure of multilevel models. When it comes to actually building models for research purposes, it is not necessary to slowly build up the complexity in this way.

Data: New Toys!

Recall the example from last semesters’ USMR course, where the lectures explored linear regression with a toy dataset of how practice influences the reading age of toy characters (see USMR Week 7 Lecture). We’re going to now broaden our scope to the investigation of how practice affects reading age for all toys (not just Martin’s Playmobil characters).

You can find a dataset at https://uoepsy.github.io/data/toy2.csv containing information on 129 different toy characters that come from a selection of different families/types of toy. You can see the variables in the table below1.

variable description
toy_type Type of Toy
year Year Released
toy Character
hrs_week Hours of practice per week
R_AGE Reading Age
Question 1

Read in the data and plot the relationship between hours-per-week practice (hrs_week) and reading age (R_AGE).
Facet the plot by the type of toy.

“facet” is the key word here! See 1A #visualisations

Question 2

Below is the code to fit a simple linear model and produce some diagnostic plots.
After running the code, do you think that we have violated any assumptions?

mod1 <- lm(R_AGE ~ hrs_week, data = toy2)
plot(mod1)

Question 3

There are lots of ways in R to get predicted values for a linear model (e.g. we saw augment() a lot in USMR).

The simplest way is to use functions like predict() or fitted() (they do the same thing), which gives us a vector of predicted values, which we can append to our dataframe (provided we don’t have missing data):

data$predictedvalues <- predict(model)

Add the predictions from the linear model in the previous question to the facetted plot that you created in the earlier question. How well does the model fit each type of toy?

Question 4

Load the lme4 package, and fit a model with random intercepts for each toy type.

Using either predict() again, or this time you can use augment() from the broom.mixed package, plot the predicted values and the observations.

You can see a model with a random intercept fitted in 1B# fitting-multilevel-models-in-r.

Question 5

Now fit a model with random intercepts and slopes for each toy type.

As before, plot the predicted values of this model alongside the observations

Question 6

Finally, add a geom_smooth to the plot from the previous question (making sure that this is has y=R_AGE).
This will add a separate linear model lm() line for each of the facets in the plot.

What differences (look closely!) do you notice between the predictions from the model with random intercepts and slopes, and the simple geom_smooths?

You could try changing col, lty, and lwd to make things easier to see.

What we’re doing here is showing to ourselves ‘partial pooling’ in action (1B #partial-pooling).

Question 7

Here is the model with the random intercepts (but not random slopes) that we fitted in an earlier question.

Below is the code that produces a plot of the fitted values:

$$ \[\begin{align} \text{For Toy }j\text{ of Type }i & \\ \text{Level 1 (Toy):}& \\ \text{R\_AGE}_{ij} &= b_{0i} + b_1 \cdot \text{hrs\_week}_{ij} + \epsilon_{ij} \\ \text{Level 2 (Type):}& \\ b_{0i} &= \gamma_{00} + \zeta_{0i} \\ \text{Where:}& \\ \zeta_{0i} &\sim N(0,\sigma_{0}) \\ \varepsilon &\sim N(0,\sigma_{e}) \\ \end{align}\]

$$

mod2 <- lmer(R_AGE ~ 1 + hrs_week + (1 | toy_type), 
             data = toy2)
summary(mod2)
Linear mixed model fit by REML ['lmerMod']
Formula: R_AGE ~ 1 + hrs_week + (1 | toy_type)
   Data: toy2

REML criterion at convergence: 544.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.8809 -0.4687  0.0541  0.5579  3.3220 

Random effects:
 Groups   Name        Variance Std.Dev.
 toy_type (Intercept) 7.278    2.698   
 Residual             2.550    1.597   
Number of obs: 129, groups:  toy_type, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)   4.2594     0.9157   4.652
hrs_week      0.7118     0.1651   4.312

Correlation of Fixed Effects:
         (Intr)
hrs_week -0.736
augment(mod2) |>
  ggplot(aes(x=hrs_week, col=toy_type))+
  geom_point(aes(y=R_AGE),alpha=.3) + # observations
  geom_line(aes(y=.fitted)) + # predictions 
  geom_abline(intercept = fixef(mod2)[1], 
              slope = fixef(mod2)[2], lwd=1) +  # fixed effect line
  guides(col="none") + # remove legend
  xlim(0,7) # extent to x=0

Match the parameters from the model equation, as well as coefficients from model output, to the corresponding points on the plot of fitted values.

Model Equation

  • A1: \(\sigma_{0}\)
  • A2: \(\sigma_{\varepsilon}\)
  • A3: \(\gamma_{00}\)
  • A4: \(b_{1}\)

Model Output

  • B1: 0.7118
  • B2: 2.698
  • B3: 1.597
  • B4: 4.2594

Plot of fitted values

  • C1: the standard deviation of the distances from all the individual toy types lines to the black line
  • C2: where the black line cuts the y axis
  • C3: the slope of the black line
  • C4: the standard deviation of the distances from all the individual observations to the line for the toy type to which it belongs.


Audio Interference in Executive Functioning

Data: Audio interference in executive functioning

This data is from a simulated study that aims to investigate the following research question:

How do different types of audio interfere with executive functioning, and does this interference differ depending upon whether or not noise-cancelling headphones are used?

30 healthy volunteers each completed the Symbol Digit Modalities Test (SDMT) - a commonly used test to assess processing speed and motor speed - a total of 15 times. During the tests, participants listened to either no audio (5 tests), white noise (5 tests) or classical music (5 tests). Half the participants listened via active-noise-cancelling headphones, and the other half listened via speakers in the room. Unfortunately, lots of the tests were not administered correctly, and so not every participant has the full 15 trials worth of data.

The data is available at https://uoepsy.github.io/data/efsdmt.csv.

variable description
PID Participant ID
audio Audio heard during the test ('no_audio', 'white_noise','music')
headphones Whether the participant listened via speakers in the room or via noise cancelling headphones
SDMT Symbol Digit Modalities Test (SDMT) score
Question 8

How many participants are there in the data?
How many have complete data (15 trials)?
What is the average number of trials that participants completed? What is the minimum?
Does every participant have some data for each type of audio?

Functions like table() and count() will likely be useful here.

Question 9

Consider the following questions about the study:

  • What is our outcome of interest?
  • What variables are we seeking to investigate in terms of their impact on the outcome?
  • What are the units of observations?
  • Are the observations clustered/grouped? In what way?
  • What varies within these clusters?
  • What varies between these clusters?

Question 10

Calculate the ICC, using the ICCbare() function from the ICC package.

How much of the variation in SDMT scores is attributable to participant level differences?

See 1A #icc, or look up the help documentation for ?ICCbare().

Question 11

The multilevel model that has only an intercept (and the grouping structure) specified is sometimes referred to as the “null model” (or “intercept only model”).
Because there are no predictors in the fixed effects there is just a single value (the intercept). All of the variance in the outcome gets modelled in the random effects part, and is partitioned into either ‘variance between groups’ or ‘residual variance’. This means we can just use those estimates to calculate the ICC.

For our executive functioning study, fit the null model use the output to calculate the ICC.
Compare it to the answer from the previous question (it should be pretty close!)

The formula for the ICC is:
\[ ICC = \frac{\sigma^2_{b}}{\sigma^2_{b} + \sigma^2_e} = \frac{\text{between-group variance}}{\text{between-group variance}+\text{within-group variance}} \]

Question 12

Make factors and set the reference levels of the audio and headphones variables to “no audio” and “speakers” respectively.

Can’t remember about setting factors and reference levels? Check back to USMR materials!

Question 13

Fit a multilevel model to address the aims of the study (copied below)

How do different types of audio interfere with executive functioning, and does this interference differ depending upon whether or not noise-cancelling headphones are used?

Specifying the model may feel like a lot, but try splitting it into three parts:

\[ \text{lmer(}\overbrace{\text{outcome }\sim\text{ fixed effects}}^1\, + \, (1 + \underbrace{\text{slopes}}_3\, |\, \overbrace{\text{grouping structure}}^2 ) \]

  1. Just like the lm()s we have used in USMR, think about what we want to test. This should provide the outcome and the structure of our fixed effects.
  2. Think about how the observations are clustered/grouped. This should tell us how to specify the grouping structure in the random effects.
  3. Think about which slopes (i.e. which terms in our fixed effects) could feasibly vary between the clusters. This provides you with what to put in as random slopes.

Question 14

We now have a model, but we don’t have any p-values or confidence intervals or anything - i.e. we have no inferential criteria on which to draw conclusions. There are a whole load of different methods available for drawing inferences from multilevel models, which means it can be a bit of a never-ending rabbit hole. For now, we’ll just use the ‘quick and easy’ approach provided by the lmerTest package seen in the lectures.

Using the lmerTest package, re-fit your model, and you should now get some p-values!

If you use library(lmerTest) to load the package, then every single model you fit will show p-values calculated with the Satterthwaite method.
Personally, I would rather this is not the case, so I often opt to fit specific models with these p-values without ever loading the package:
modp <- lmerTest::lmer(y ~ 1 + x + ....

If we want to go down the model comparison route, we just need to isolate the relevant part(s) of the model that we are interested in.

Remember, as we saw in USMR, model comparison is sometimes a useful way of testing a set of coefficients. For instance, in this example the interaction involves estimating two terms: audiomusic:headphonesanc_headphones and audiowhite_noise:headphonesanc_headphones.

To test the interaction as a whole, we can create a model without the interaction, and then compare it. The SATmodcomp() function from the pbkrtest package provides a way of conducting an F test with the same Satterthwaite method of approximating the degrees of freedom:

sdmt_mod <- lmer(SDMT ~ audio * headphones + 
              (1 + audio | PID), data = efdat)
sdmt_res <- lmer(SDMT ~ audio + headphones + 
                   (1 + audio | PID), data = efdat)
library(pbkrtest)
SATmodcomp(largeModel = sdmt_mod, smallModel = sdmt_res)
large : SDMT ~ audio * headphones + (1 + audio | PID)
small (restriction matrix) : 
                             
 0 0 0 0 0.8936904 -0.4486841
 0 0 0 0 0.4486841  0.8936904
     statistic    ndf    ddf   p.value    
[1,]    11.051  2.000 26.909 0.0003136 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Question 15

We’ve already seen in the example with the toys (above) that we can visualise the fitted values (model predictions) using things like augment() from the broom.mixed package. But these were plotting all the cluster-specific values (i.e. our random effects), and what we are really interested in are the estimates of (and uncertainty around) our fixed effects.

Using tools like the effects package can provide us with the values of the outcome across levels of a specific fixed predictor (holding other predictors at their mean).

Question 16

Now we have some p-values and a plot, try to create a short write-up of the analysis and results.

Footnotes

  1. Image sources:
    http://tophatsasquatch.com/2012-tmnt-classics-action-figures/
    https://www.dezeen.com/2016/02/01/barbie-dolls-fashionista-collection-mattel-new-body-types/
    https://www.wish.com/product/5da9bc544ab36314cfa7f70c
    https://www.worldwideshoppingmall.co.uk/toys/jumbo-farm-animals.asp
    https://www.overstock.com/Sports-Toys/NJ-Croce-Scooby-Doo-5pc.-Bendable-Figure-Set-with-Scooby-Doo-Shaggy-Daphne-Velma-and-Fred/28534567/product.html
    https://tvtropes.org/pmwiki/pmwiki.php/Toys/Furby
    https://www.fun.com/toy-story-4-figure-4-pack.html
    https://www.johnlewis.com/lego-minifigures-71027-series-20-pack/p5079461↩︎