4. Random Effect Structures | Logistic MLM
The same principle we have seen for one level of clustering can be extended to clustering at different levels (for instance, observations are clustered within subjects, which are in turn clustered within groups).
Consider the example where we have observations for each student in every class within a number of schools:
Question: Is “Class 1” in “School 1” the same as “Class 1” in “School 2”?
No.
The classes in one school are distinct from the classes in another even though they are named the same.
The classes-within-schools example is a good case of nested random effects - one factor level (one group in a grouping variable) appears only within a particular level of another grouping variable.
In R, we can specify this using:
(1 | school) + (1 | class:school)
or, more succinctly:
(1 | school/class)
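To make the equivalence of these two forms concrete, here is a minimal sketch (assuming a hypothetical data frame df with columns y, school, and class, where class labels repeat across schools):

```r
library(lme4)
# class labels repeat across schools, so class must be nested within school
m1 <- lmer(y ~ 1 + (1 | school) + (1 | class:school), data = df)
m2 <- lmer(y ~ 1 + (1 | school/class), data = df)
# m1 and m2 specify exactly the same random effect structure
```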
Consider another example, where we administer the same set of tasks to every participant.
Question: Are tasks nested within participants?
No.
Tasks are seen by multiple participants (and participants see multiple tasks).
We could visualise this as a grid in which every participant is paired with every task. Because these are not nested, they are called crossed random effects.
In R, we can specify this using:
(1 | subject) + (1 | task)
Nested vs Crossed
Nested: Each group belongs uniquely to a higher-level group.
Crossed: Groups are not uniquely nested; levels of one grouping factor occur in combination with multiple levels of the other.
Note that in the schools and classes example, had the data been coded such that each class had a unique ID (e.g., see below), then the structures (1 | school) + (1 | class) and (1 | school/class) would give the same results.
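One way to see this in practice is to construct the unique IDs explicitly (a sketch, using the same hypothetical df as above):

```r
# create a class identifier that is unique across schools, e.g. "school1.class1"
df$classID <- interaction(df$school, df$class)
# now (1 | school) + (1 | classID) is equivalent to (1 | school/class)
m3 <- lmer(y ~ 1 + (1 | school) + (1 | classID), data = df)
```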
Below is a selection of formulas for specifying different random effect structures, taken from the lme4 vignette. This might look like a lot, but over time and with repeated use of multilevel models you will get used to reading these, in much the same way as you got used to reading the y ~ x1 + x2 formula structure in all our linear models.
Formula | Alternative | Meaning |
---|---|---|
\(\text{(1 | g)}\) | \(\text{1 + (1 | g)}\) | Random intercept with fixed mean |
\(\text{(1 | g1/g2)}\) | \(\text{(1 | g1) + (1 | g1:g2)}\) | Intercept varying among \(g1\), and \(g2\) within \(g1\) |
\(\text{(1 | g1) + (1 | g2)}\) | \(\text{1 + (1 | g1) + (1 | g2)}\) | Intercept varying among \(g1\) and \(g2\) |
\(\text{x + (x | g)}\) | \(\text{1 + x + (1 + x | g)}\) | Correlated random intercept and slope |
\(\text{x + (x || g)}\) | \(\text{1 + x + (1 | g) + (0 + x | g)}\) | Uncorrelated random intercept and slope |
Table 1: Examples of the right-hand-sides of mixed effects model formulas. \(g\), \(g1\), \(g2\) are grouping factors, \(x\) is a predictor variable.
You may have noticed that some of our models over the last few weeks have been giving a warning: boundary (singular) fit: see ?isSingular.
Up to now, we’ve been largely ignoring these warnings. However, this week we’re going to look at how to deal with this issue.
boundary (singular) fit: see ?isSingular
The warning is telling us that our model has resulted in a ‘singular fit’. Singular fits often indicate that the model is ‘overfitted’ - that is, the random effects structure which we have specified is too complex to be supported by the data.
Perhaps the most intuitive advice would be to remove the most complex part of the random effects structure (i.e. random slopes). This leads to a simpler model that is not over-fitted. In other words, start simplifying from the top (where the most complexity is) to the bottom (where the lowest complexity is). Additionally, when the variance estimate for a specific random effect term is very low, the model is not estimating that parameter to differ much between the levels of your grouping variable. In some experimental designs, it might be perfectly acceptable to remove it, or simply to include it as a fixed effect.
A key point here is that when fitting a mixed model, we should think about how the data are generated. Asking yourself questions such as “do we have good reason to assume subjects might vary over time, or to assume that they will have different starting points (i.e., different intercepts)?” can help you in specifying your random effect structure.
You can read in depth about what this means in the help documentation for ?isSingular. For our purposes, a relevant section is copied below:
… intercept-only models, or 2-dimensional random effects such as intercept + slope models, singularity is relatively easy to detect because it leads to random-effect variance estimates of (nearly) zero, or estimates of correlations that are (almost) exactly -1 or 1.
Issues of non-convergence can be caused by many things. If your model doesn’t converge, it does not necessarily mean the fit is incorrect; however, it is cause for concern and should be addressed, else you may end up reporting inferences which do not hold.
There are lots of different things which you could do which might help your model to converge. A select few are detailed below:
- Double-check the model specification and the data.
- Adjust stopping (convergence) tolerances for the nonlinear optimizer, using the optCtrl argument to [g]lmerControl (see ?convergence for convergence controls). What is “tolerance”? Remember that our optimizer is the method by which the computer finds the best-fitting model, by iteratively assessing and trying to maximise the likelihood (or minimise the loss).
- Center and scale continuous predictor variables (e.g. with scale()).
- Change the optimization method (for example, here we change it to bobyqa):
  lmer(..., control = lmerControl(optimizer="bobyqa"))
  glmer(..., control = glmerControl(optimizer="bobyqa"))
- Increase the number of optimization steps:
  lmer(..., control = lmerControl(optimizer="bobyqa", optCtrl=list(maxfun=50000)))
  glmer(..., control = glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=50000)))
- Use allFit() to try the fit with all available optimizers (see the sketch after this list). This will of course be slow, but is considered ‘the gold standard’; “if all optimizers converge to values that are practically equivalent, then we would consider the convergence warnings to be false positives.”
- Consider simplifying your model, for example by removing random effects with the smallest variance (but be careful not to simplify more than necessary, and ensure that your write-up details these changes).
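As an illustration of the allFit() approach, here is a minimal sketch (m is a placeholder for an already-fitted model):

```r
library(lme4)
m_all <- allFit(m)   # refit `m` with every available optimizer (slow!)
ss <- summary(m_all)
ss$which.OK          # which optimizers completed without error
ss$fixef             # do the fixed-effect estimates practically agree?
```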
Random effect structures can get pretty complicated quite quickly. Very often it is not the random effects part that is of specific interest to us, but we wish to estimate random effects in order to more accurately partition up the variance in our outcome variable and provide better estimates of fixed effects.
It is a fine balance between fitting the most sophisticated model structure that we possibly can, and fitting a model that converges without too much simplification. Typically for many research designs, the following steps will keep you mostly on track to finding the maximal model:
lmer(outcome ~ fixed effects + (random effects | grouping structure))
Specify the outcome ~ fixed effects bit first:
- The outcome variable should be clear: it is the variable we are wishing to explain/predict.
- The fixed effects are the things we want to use to explain/predict variation in the outcome variable. These will often be the things that are of specific inferential interest, and other covariates. Just like the simple linear model.

If there is a grouping structure to your data, and those groups (preferably n>7 or 8) are perceived as a random sample of a wider population (the specific groups aren’t interesting to you), then consider including random intercepts (and possibly random slopes of predictors) for those groups: (1 | grouping).

If there are multiple different grouping structures, is one nested within another? If so, we can specify this as (1 | high_level_grouping/low_level_grouping). If the grouping structures are not nested, we specify them as crossed: (1 | grouping1) + (1 | grouping2).

If any of the things in the fixed effects vary within the groups, it might be possible to also include them as random effects.
- As a general rule, don’t specify random effects that are not also specified as fixed effects (an exception could be specifically for model comparison, to isolate the contribution of the fixed effect).
- For things that do not vary within the groups, it rarely makes sense to include them as random effects. For instance, if we had a model lmer(score ~ genetic_status + (1 + genetic_status | patient)), then we would be trying to model a process where “the effect of genetic_status on scores is different for each patient”. But if you consider an individual patient, their genetic status never changes. For patient \(i\), what is “the effect of genetic status on score”? It’s undefined. This is because genetic status only varies between patients.
- Sometimes, things will vary within one grouping, but not within another. E.g. with patients nested within hospitals, (1 | hospital/patient), genetic_status varies between patients, but within hospitals. Therefore we could theoretically fit a random effect of genetic_status | hospital, but not one for genetic_status | patient. We can include these while preserving the nested structure of the data by separating out the groupings: (1 + genetic_status | hospital) + (1 | patient:hospital).
A common approach to fitting multilevel models is to fit the most complex model that the study design allows. This means including random effects for everything that makes theoretical sense to be included. Because this “maximal model” will often not converge, we then simplify our random effect structure to obtain a convergent model.
Candidate random effect terms for removal are:
- overly complex terms (e.g. (1 + x1 * x2 | group) could be simplified to (1 + x1 + x2 | group))
- those with little variance
- those with near-perfect correlations (we can alternatively remove just the correlation)

Once you have obtained a final model, proceed to checking assumptions, checking influence, plotting, and eventually writing up.
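For illustration, the simplification sequence might look something like this hedged sketch (outcome, x, subject, and df are placeholder names, not from this week’s datasets):

```r
library(lme4)
m_max <- lmer(outcome ~ x + (1 + x | subject), data = df)   # maximal: correlated intercept & slope
VarCorr(m_max)                                              # variances near 0? correlations near ±1?
m_zcp <- lmer(outcome ~ x + (1 + x || subject), data = df)  # drop the intercept-slope correlation
m_int <- lmer(outcome ~ x + (1 | subject), data = df)       # random intercepts only
```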
Exercises: Crossed Ranefs
Data: Test-enhanced learning
An experiment was run to conceptually replicate “test-enhanced learning” (Roediger & Karpicke, 2006): two groups of 25 participants were presented with material to learn. One group studied the material twice (StudyStudy), the other group studied the material once then did a test (StudyTest). Recall was tested immediately (one minute) after the learning session and one week later. The recall tests were composed of 175 items identified by a keyword (Test_word).
The researchers are interested in how test-enhanced learning influences time-to-recall. The critical (replication) prediction is that the StudyTest group will retain the material better (lower reaction times) on the 1-week follow-up test compared to the StudyStudy group.
variable | description |
---|---|
Subject_ID | Unique Participant Identifier |
Group | Group denoting whether the participant studied the material twice (StudyStudy), or studied it once then did a test (StudyTest) |
Delay | Time of recall test ('min' = Immediate, 'week' = One week later) |
Test_word | Word being recalled (175 different test words) |
Rtime | Time to respond (milliseconds) |
The following code loads the data into your R environment by creating a variable called tel:
load(url("https://uoepsy.github.io/data/RTtestlearning.RData"))
Load and plot the data.
For this week, we’ll use Reaction Time as our proxy for test performance, so you’ll probably want that variable on the y-axis.
Does it look like the effect was replicated?
Our outcome variable is reaction time, so we’ll want to plot that on the y-axis.
As described in the data summary, this is a replication, and the prediction is that the StudyTest group will retain the material better (have lower reaction times) on the 1-week follow-up test compared to the StudyStudy group.
So we want to show the reaction times for each group, at each test (min or week delay). Do we want to show all the datapoints, or maybe just the average for each group at each test (in which case, does stat_summary() help?).
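One possible starting point for the plot (a sketch only, assuming the data have been loaded as tel):

```r
library(ggplot2)
# mean (± SE) reaction time for each group, at each delay
ggplot(tel, aes(x = Delay, y = Rtime, colour = Group)) +
  stat_summary(fun.data = mean_se, geom = "pointrange",
               position = position_dodge(width = 0.2)) +
  labs(x = "Delay", y = "Time to respond (ms)")
```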
The critical (replication) prediction is that the StudyStudy group should perform somewhat better on the immediate recall test, but the StudyTest group will retain the material better and thus perform better on the 1-week follow-up test.
Test the critical hypothesis using a multi-level model.
Try to fit the maximally complex random effect structure that is supported by the experimental design.
NOTE: Your model probably won’t converge. We’ll deal with that in the next question.
- We can expect variability across subjects (some people are better at learning than others) and across items (some of the recall items are harder than others). How should this be represented in the random effects?
- If a model takes ages to fit, you might want to cancel it by pressing the escape key. It is normal for complex models to take time, but for the purposes of this task, give up after a couple of minutes, and try simplifying your model.
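If you want something to check your attempt against, here is one hedged sketch of the maximal structure implied by the design (treat it as a plausible specification, not the definitive answer):

```r
library(lme4)
# Delay varies within subjects (Group does not);
# both Delay and Group vary within test words
m_max <- lmer(Rtime ~ Delay * Group +
                (1 + Delay | Subject_ID) +
                (1 + Delay * Group | Test_word),
              data = tel)
```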
Often, models with maximal random effect structures will not converge, or will obtain a singular fit. So we need to simplify the model until we achieve convergence (Barr et al., 2013).
Incrementally simplify your model from the previous question until you obtain a model that converges and is not a singular fit.
You can look at the variance estimates and correlations easily by using the VarCorr() function. What jumps out?
There’s a bit of subjectivity here about which simplifications we can live with, so you might end up with a slightly different model from the solutions, and that’s okay.
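For example (a sketch, continuing with the placeholder model m_max from above):

```r
VarCorr(m_max)   # look for variances near zero, or correlations of ±1
# if, say, the Group terms for words contribute little, one candidate simplification is:
m_simpler <- lmer(Rtime ~ Delay * Group +
                    (1 + Delay | Subject_ID) +
                    (1 + Delay | Test_word),
                  data = tel)
```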
Exercises: Nested Random Effects
Data: Naming
74 children from 10 schools were administered the full Boston Naming Test (BNT-60) on a yearly basis for 5 years to examine development of word retrieval. Five of the schools taught lessons in a bilingual setting with English as one of the languages, and the remaining five schools taught in monolingual English.
The data is available at https://uoepsy.github.io/data/bntmono.csv.
variable | description |
---|---|
child_id | unique child identifier |
school_id | unique school identifier |
BNT60 | score on the Boston Naming Test-60. Scores range from 0 to 60 |
schoolyear | Year of school |
mlhome | Mono/Bi-lingual School. 0 = Bilingual, 1 = Monolingual |
Let’s start by thinking about our clustering - we’d like to know how much of the variance in BNT60 scores is due to the clustering of data within children, who are themselves within schools. One easy way of assessing this is to fit an intercept only model, which has the appropriate random effect structure.
Using the model below, calculate the proportion of variance attributable to the clustering of data within children within schools.
bnt_null <- lmer(BNT60 ~ 1 + (1 | school_id/child_id), data = bnt)
The random intercept variances are the building blocks here. There are no predictors in this model, so all the variance in the outcome gets attributed either to school-level nesting, to child-level nesting, or else is lumped into the residual.
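A sketch of one way to do this (assuming the model above has been fitted as bnt_null):

```r
vc <- as.data.frame(VarCorr(bnt_null))
vc                        # child-within-school, school, and residual variances
vc$vcov / sum(vc$vcov)    # proportion of total variance for each component
```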
Fit a model examining the interaction between the effects of school year and mono/bilingual teaching on word retrieval (via BNT test), with random intercepts only for children and schools.
Make sure your variables are of the right type first (e.g. numeric, factor, etc.).
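A hedged sketch of one such model (assuming the data have been read in as bnt, as for the null model above, and that mlhome has been converted to a factor):

```r
bnt$mlhome <- factor(bnt$mlhome, levels = c(0, 1), labels = c("bilingual", "monolingual"))
bnt_full <- lmer(BNT60 ~ schoolyear * mlhome + (1 | school_id/child_id), data = bnt)
summary(bnt_full)
```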
Examine the fit and consider your model assumptions, and assess what might be done to improve the model in order to make better statistical inferences.
Using a method of your choosing, conduct inferences (i.e. obtain p-values or confidence intervals) from your final model and write up the results.
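For example (a sketch; bnt_full is the placeholder model name used above, and both methods can be slow):

```r
confint(bnt_full, method = "profile")   # profile-likelihood confidence intervals
confint(bnt_full, method = "boot")      # or: parametric bootstrap intervals
```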
Optional Exercises: Logistic MLM
Don’t forget to look back at other materials!
Back in DAPR2, we introduced logistic regression in semester 2, week 8. The lab contained some simulated data based on a hypothetical study about inattentional blindness. That content will provide a lot of the groundwork for this week, so we recommend revisiting it if you feel like it might be useful.
lmer() to glmer()
Remember how we simply used glm() and could specify family = "binomial" in order to fit a logistic regression? Well, it’s much the same thing for multilevel models!
- Gaussian model:
  lmer(y ~ x1 + x2 + (1 | g), data = data)
- Binomial model:
  glmer(y ~ x1 + x2 + (1 | g), data = data, family = binomial(link='logit'))
  - or just glmer(y ~ x1 + x2 + (1 | g), data = data, family = "binomial")
  - or glmer(y ~ x1 + x2 + (1 | g), data = data, family = binomial)
Data: Memory Recall & Finger Tapping
Research Question: After accounting for effects of sentence length, does the rhythmic tapping of fingers aid memory recall?
Researchers recruited 40 participants. Each participant was tasked with studying and then recalling 10 randomly generated sentences between 1 and 14 words long. For 5 of these sentences, participants were asked to tap their fingers along with speaking the sentence in both the study period and in the recall period. For the remaining 5 sentences, participants were asked to sit still.
The data are available at https://uoepsy.github.io/data/memorytap.csv, and contain information on the length (in words) of each sentence, the condition (static vs tapping) under which it was studied and recalled, and whether the participant was correct in recalling it.
variable | description |
---|---|
ppt | Participant Identifier (n=40) |
slength | Number of words in sentence |
condition | Condition under which sentence is studied and recalled ('static' = sitting still, 'tap' = tapping fingers along to sentence) |
correct | Whether or not the sentence was correctly recalled |
Research Question: After accounting for effects of sentence length, does the rhythmic tapping of fingers aid memory recall?
Fit an appropriate model to answer the research question.
- our outcome is conceptually ‘memory recall’, and it’s been measured by “Whether or not a sentence was correctly recalled”. This is a binary variable.
- we have multiple observations for each ????? This will define our ( | ??? ) bit.
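If you want to check your thinking, here is one hedged sketch (memtap is a name chosen here for illustration, and the random effect structure is one plausible option among several):

```r
library(lme4)
memtap <- read.csv("https://uoepsy.github.io/data/memorytap.csv")
# slength and condition both vary within participants
mod_tap <- glmer(correct ~ slength + condition +
                   (1 + slength + condition | ppt),
                 data = memtap, family = binomial)
```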
Take some time to remind yourself from DAPR2 of the interpretation of logistic regression coefficients.
In family = binomial(link='logit'), we are modelling the log-odds. We can obtain estimates on this scale using:
- fixef(model)
- summary(model)$coefficients
- tidy(model) from broom.mixed
- (there are probably more ways, but I can’t think of them right now!)

We can use exp() to get these back into odds and odds ratios.
Interpret each of the fixed effect estimates from your model.
Checking the assumptions in non-Gaussian models in general (i.e. those where we set the family to some other error distribution) can be a bit tricky, and this is especially true for multilevel models.
For the logistic MLM, the standard assumptions of normality etc. for our Level 1 residuals, residuals(model), do not hold. However, it is still useful to quickly plot the residuals and check that \(|residuals| \leq 2\) (or \(|residuals| \leq 3\) if you’re more relaxed). We don’t need to worry too much about the pattern though.
While we’re more relaxed about Level 1 residuals, we do still want our random effects ranef(model) to look fairly normally distributed.
- Plot the level 1 residuals and check whether any are greater than 3 in magnitude (a sketch of both checks follows this list).
- Plot the random effects (the level 2 residuals) and assess their normality.
- The HLMdiag package doesn’t support diagnosing influential points/clusters for glmer, but there is a package called influence.ME which might help: https://journal.r-project.org/archive/2012/RJ-2012-011/RJ-2012-011.pdf
- There are packages which aim to create more interpretable residual plots for these models via simulation, such as the DHARMa package: https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html
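A minimal sketch of the first two checks (again using the placeholder model mod_tap):

```r
# Level 1: are any residuals greater than 2 (or 3) in absolute value?
plot(residuals(mod_tap))
abline(h = c(-3, 3), lty = 2)

# Level 2: do the random intercepts look fairly normal?
qqnorm(ranef(mod_tap)$ppt[, 1])
qqline(ranef(mod_tap)$ppt[, 1])
```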
Practice Datasets Weeks 4 and 5
Below are various datasets on which you can try out your new-found modelling skills. Read the descriptions carefully, keeping in mind the explanation of how the data is collected and the research question that motivates the study design.
These data are simulated to represent data from a fake experiment, in which participants were asked to drive around a route in a 30mph zone. Each participant completed the route 3 times (i.e. “repeated measures”), but each time they were listening to different audio (either speech, classical music, or rap music). Their average speed across the route was recorded. This is a fairly simple design that we might use to ask “how is the type of audio being listened to associated with driving speeds?”
The data are available at https://uoepsy.github.io/data/drivingmusicwithin.csv.
variable | description |
---|---|
pid | Participant Identifier |
speed | Avg Speed Driven on Route (mph) |
music | Music listened to while driving (classical music / rap music / spoken word) |
These data are simulated to represent data from 50 participants, each measured at 3 different time-points (pre, during, and post) on a measure of stress. Participants were randomly allocated such that half received some cognitive behavioural therapy (CBT) treatment, and half did not. This study is interested in assessing whether the two groups (control vs treatment) differ in how stress changes across the 3 time points.
The data are available at https://uoepsy.github.io/data/stressint.csv.
variable | description |
---|---|
ppt | Participant Identifier |
stress | Stress (range 0 to 100) |
time | Time (pre/post/during) |
group | Whether participant is in the CBT group or control group |
These data are simulated to represent data from 30 participants who took part in an experiment designed to investigate whether fluency of speech influences how believable an utterance is perceived to be.
Each participant listened to the same 20 statements, with 10 being presented in fluent speech, and 10 being presented with a disfluency (an “erm, …”). Fluency of the statements was counterbalanced such that 15 participants heard statements 1 to 10 as fluent and 11 to 20 as disfluent, and the remaining 15 participants heard statements 1 to 10 as disfluent, and 11 to 20 as fluent. The order of the statements presented to each participant was random. Participants rated each statement on how believable it is on a scale of 0 to 100.
The data are available at https://uoepsy.github.io/data/erm_belief.csv.
variable | description |
---|---|
ppt | Participant Identifier |
trial_n | Trial number |
sentence | Statement identifier |
condition | Condition (fluent v disfluent) |
belief | belief rating (0-100) |
statement | Statement |
These data are simulated to represent a large scale international study of cognitive aging, for which data from 17 research centers has been combined. The study team are interested in whether different cognitive domains have different trajectories as people age. Do all cognitive domains decline at the same rate? Do some decline more steeply, and some less? The literature suggests that scores on cognitive ability are predicted by educational attainment, so they would like to control for this.
Each of the 17 research centers recruited a minimum of 14 participants (Median = 21, Range 14-29) at age 45, and recorded their level of education (in years). Participants were then tested on 5 cognitive domains: processing speed, spatial visualisation, memory, reasoning, and vocabulary. Participants were contacted for follow-up on a further 9 occasions (resulting in 10 datapoints for each participant), and at every follow-up they were tested on the same 5 cognitive domains. Follow-ups were on average 3 years apart (Mean = 3, SD = 0.8).
The data are available at https://uoepsy.github.io/data/cogdecline.csv.
variable | description |
---|---|
cID | Center ID |
pptID | Participant Identifier |
educ | Educational attainment (years of education) |
age | Age at visit (years) |
processing_speed | Score on Processing Speed domain task |
spatial_visualisation | Score on Spatial Visualisation domain task |
memory | Score on Memory domain task |
reasoning | Score on Reasoning domain task |
vocabulary | Score on Vocabulary domain task |
Footnotes
It’s always going to be debatable what counts as ‘too high’, because in certain situations you might expect correlations close to 1. It’s best to think through whether it is a feasible value given the study itself.