Recap of multilevel models

No specific exercises this week

This week, there aren’t any exercises, but there is a small recap reading of multilevel models, followed by some ‘flashcard’ type boxes to help you test your understanding of some of the key concepts.

Please use the lab sessions to go over exercises from previous weeks, and to ask any questions about the content below.

Flashcards: lm to lmer

In a simple linear regression, only one source of random variability is considered: any variability left unexplained by a set of predictors (which are modelled as fixed estimates) is captured in the model residuals.

Multi-level (or ‘mixed-effects’) approaches involve modelling more than one source of random variability - as well as variance resulting from taking a random sample of observations, we can identify random variability across different groups of observations. For example, if we are studying a patient population in a hospital, we would expect there to be variability across our sample of patients, but also across the doctors who treat them.

We can account for this variability by allowing the outcome to be lower/higher for each group (a random intercept) and by allowing the estimated effect of a predictor to vary across groups (random slopes).

Before you expand each of the boxes below, think about how comfortable you feel with each concept.
This content is very cumulative, which often means going back through earlier material to isolate exactly where we need to focus our efforts in learning.

MLM in a nutshell

Model Structure & Notation

MLM allows us to model effects in the linear model as varying between groups. The coefficients we remember from simple linear models (the β’s) are modelled as coming from a distribution with an overall mean, around which our groups vary. We can see this in Figure 1, where both the intercept and the slope of the line are modelled as varying by group. Figure 1 shows the overall line in blue, with a given group’s line in green.

Figure 1: Multilevel Model. Each group (e.g. the group shown by the green line) deviates from the overall fixed effects (the blue line), and the individual observations (green points) deviate from their group’s line

The formula notation for these models involves separating out our effects $\beta$ into two parts: the overall effect $\gamma$ plus the group deviations $\zeta_i$:

$$
\begin{aligned}
& \text{for observation } j \text{ in group } i \\[6pt]
& \text{Level 1:} \\
& \qquad y_{ij} = \beta_{0i} \cdot 1 + \beta_{1i} \cdot x_{ij} + \varepsilon_{ij} \\
& \text{Level 2:} \\
& \qquad \beta_{0i} = \gamma_{00} + \zeta_{0i} \\
& \qquad \beta_{1i} = \gamma_{10} + \zeta_{1i}
\end{aligned}
$$

Where:

  • $\gamma_{00}$ is the population intercept, and $\zeta_{0i}$ is the deviation of group $i$ from $\gamma_{00}$
  • $\gamma_{10}$ is the population slope, and $\zeta_{1i}$ is the deviation of group $i$ from $\gamma_{10}$

The group-specific deviations $\zeta_{0i}$ from the overall intercept are assumed to be normally distributed with mean 0 and variance $\sigma_0^2$. Similarly, the deviations $\zeta_{1i}$ of the slope for group $i$ from the overall slope are assumed to come from a normal distribution with mean 0 and variance $\sigma_1^2$. The correlation between random intercepts and slopes is $\rho = \mathrm{Cor}(\zeta_{0i}, \zeta_{1i}) = \frac{\sigma_{01}}{\sigma_0 \sigma_1}$:

$$
\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix}
\sim N \left(
\begin{bmatrix} 0 \\ 0 \end{bmatrix},
\begin{bmatrix} \sigma_0^2 & \rho \sigma_0 \sigma_1 \\ \rho \sigma_0 \sigma_1 & \sigma_1^2 \end{bmatrix}
\right)
$$

The random errors are assumed to be independent of the random effects, and normally distributed with a mean of zero:

$$
\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
$$
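
To make the notation concrete, here is a minimal simulation sketch of this two-level model. All object names and parameter values here are invented for illustration, and MASS::mvrnorm() is just one way of drawing the correlated group deviations:

library(MASS)   # for mvrnorm()
set.seed(42)

n_groups <- 20; n_per_group <- 6
gamma00 <- 1.7; gamma10 <- 1.2            # population intercept and slope
sigma0  <- 1.5; sigma1  <- 0.9; rho <- 0.5 # random effect SDs and correlation
sigma_e <- 2                               # residual standard deviation

# draw correlated group deviations (zeta_0i, zeta_1i) from a bivariate normal
Sigma <- matrix(c(sigma0^2,              rho * sigma0 * sigma1,
                  rho * sigma0 * sigma1, sigma1^2), nrow = 2)
zeta <- mvrnorm(n_groups, mu = c(0, 0), Sigma = Sigma)

group <- rep(seq_len(n_groups), each = n_per_group)
x     <- rnorm(n_groups * n_per_group)

# Level 2: group-specific coefficients; Level 1: individual observations
beta0 <- gamma00 + zeta[group, 1]
beta1 <- gamma10 + zeta[group, 2]
y     <- beta0 + beta1 * x + rnorm(length(x), 0, sigma_e)
df    <- data.frame(y, x, group = factor(group))

Fitting lmer(y ~ x + (1 + x | group), data = df) to data generated this way should recover estimates close to the parameters used in the simulation.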

MLM in R

We can fit these models using the R package lme4, and the function lmer(). Think of it like building your linear model lm(y ~ 1 + x), and then allowing effects (i.e. things on the right hand side of the ~ symbol) to vary by the grouping of your data. We specify these by adding (vary these effects | by these groups) to the model:

library(lme4)
m1 <- lmer(y ~ x + (1 + x | group), data = df)
summary(m1)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 + x | group)
   Data: df

REML criterion at convergence: 637.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.49449 -0.57223 -0.01353  0.62544  2.39122 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 group    (Intercept) 2.2616   1.5038       
          x           0.7958   0.8921   0.55
 Residual             4.3672   2.0898       
Number of obs: 132, groups:  group, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)   1.7261     0.9673   1.785
x             1.1506     0.2968   3.877

Correlation of Fixed Effects:
  (Intr)
x -0.552

The summary of the lmer output returns estimated values for:

Fixed effects:

  • $\hat{\gamma}_{00} = 1.726$
  • $\hat{\gamma}_{10} = 1.151$

Variability of random effects:

  • $\hat{\sigma}_0 = 1.504$
  • $\hat{\sigma}_1 = 0.892$

Correlation of random effects:

  • $\hat{\rho} = 0.546$

Residuals:

  • $\hat{\sigma}_\varepsilon = 2.09$
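
Rather than reading these off the printed summary, you can extract them directly from the fitted model. A quick sketch using the m1 object fitted above (these are all standard lme4 extractor functions):

fixef(m1)                   # fixed effects: gamma_00 (intercept) and gamma_10 (slope)
VarCorr(m1)                 # random effect SDs and their correlation
as.data.frame(VarCorr(m1))  # the same information as a data frame
sigma(m1)                   # residual standard deviation, sigma_epsilon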

Flashcards: Getting p-values/CIs for Multilevel Models

Inference for multilevel models is tricky, and there are lots of different approaches.
You might find it helpful to think of the majority of these techniques as either:

A. Fitting a full model and a reduced model, which differ only with respect to the relevant effect. Comparing the overall fit of these models by some metric allows us to isolate the improvement due to our effect of interest.

B. Fitting the full model and either:

  • using some method of approximating the degrees of freedom for the tests of whether a coefficient is zero, or
  • constructing confidence intervals (e.g. via bootstrapping) in order to gain a range of plausible values for the estimate (typically we then ask whether zero is within the interval).

Neither of these approaches is right or wrong; they each have different advantages. In brief:

  • Satterthwaite df = Very easy to implement, can fit with REML
  • Kenward-Roger df = Good when working with small samples, can fit with REML
  • Likelihood Ratio Tests = better with bigger samples (of groups, and of observations within groups), requires REML = FALSE
  • Parametric Bootstrap = takes time to run, but in general a reliable approach; requires REML = FALSE if doing model comparisons, but not for confidence intervals
  • Non-Parametric Bootstrap = takes time and needs careful thought about which levels to resample, but means we can relax distributional assumptions (e.g. about normality of residuals)

For the examples below, we’re investigating how “hours per week studied” influences a child’s reading age. We have data from 132 children from 20 schools, and we have fitted the model:

full_model <- lmer(reading_age ~ 1 + hrs_week + (1 + hrs_week | school), 
                   data = childread)
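
As a first sketch of a couple of the approaches listed above (the reduced model, the number of bootstrap samples, and the use of the lmerTest package here are illustrative choices, not the only options):

# Likelihood ratio test: refit with ML (REML = FALSE) and compare against
# a reduced model that drops the fixed effect of interest
full_ml    <- lmer(reading_age ~ 1 + hrs_week + (1 + hrs_week | school),
                   data = childread, REML = FALSE)
reduced_ml <- lmer(reading_age ~ 1 + (1 + hrs_week | school),
                   data = childread, REML = FALSE)
anova(reduced_ml, full_ml)

# Parametric bootstrap confidence intervals (takes time to run):
confint(full_model, method = "boot", nsim = 500)

# Satterthwaite df: loading lmerTest and refitting gives a summary()
# whose fixed effects table includes df and p-values
library(lmerTest)
full_satt <- lmer(reading_age ~ 1 + hrs_week + (1 + hrs_week | school),
                  data = childread)
summary(full_satt)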