DAPR3 Lab Exercises
W3 Exercises: Nested and Crossed Structures

Psychoeducation Treatment Effects

Data: gadeduc.csv

This is synthetic data from a randomised controlled trial in which 30 therapists randomly assigned their patients (each therapist saw between 2 and 28 patients) to either a control or a treatment group, and monitored their scores over time on a measure of generalised anxiety disorder (the GAD7, a 7-item questionnaire with 5-point Likert scales).

The control group of patients received the standard sessions offered by the therapists. For the treatment group, 10 minutes of each session was replaced with a specific psychoeducational component, and patients were given relevant tasks to complete between sessions. All patients had monthly therapy sessions. Generalised Anxiety Disorder was assessed at baseline and then at every visit over 4 months of sessions (5 assessments in total).

The data are available at https://uoepsy.github.io/data/lmm_gadeduc.csv

You can find a data dictionary below:

Table 1: Data Dictionary: lmm_gadeduc.csv
  • patient: A patient code in which the labels take the form <Therapist initials>_<group>_<patient number>.
  • visit_0: Score on the GAD7 at baseline.
  • visit_1: GAD7 at 1 month assessment.
  • visit_2: GAD7 at 2 month assessment.
  • visit_3: GAD7 at 3 month assessment.
  • visit_4: GAD7 at 4 month assessment.
Question 1

Uh-oh… these data aren’t in the same shape as the other datasets we’ve been giving you.

Can you get it into a format that is ready for modelling?

Hints
  • It’s wide, and we want it long.
  • Once it’s long, “visit_0”, “visit_1”, … need to become the numbers 0, 1, …
  • One variable (patient) contains lots of information that we want to separate out. There’s a handy function in the tidyverse called separate() - check out the help docs!

Solution 1. Here’s the data. We have one row per patient, but multiple observations for each patient spread across the columns.

geduc = read_csv("https://uoepsy.github.io/data/lmm_gadeduc.csv")
head(geduc)
# A tibble: 6 × 6
  patient      visit_0 visit_1 visit_2 visit_3 visit_4
  <chr>          <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 VC_Control_1      24      24      26      29      28
2 VC_Control_2      24      26      28      29      30
3 VC_Control_3      25      29      27      29      30
4 VC_Control_4      24      25      25      26      26
5 VC_Control_5      28      28      27      29      28
6 VC_Control_6      26      28      25      27      28

We can make it long by taking all the columns from visit_0 to visit_4 (that is, from the second column to the last column, last_col()) and putting their values into one variable called GAD, keeping the name of the column each value came from as another variable called visit:

geduc |> 
  pivot_longer(2:last_col(), names_to="visit",values_to="GAD")
# A tibble: 2,410 × 3
   patient      visit     GAD
   <chr>        <chr>   <dbl>
 1 VC_Control_1 visit_0    24
 2 VC_Control_1 visit_1    24
 3 VC_Control_1 visit_2    26
 4 VC_Control_1 visit_3    29
 5 VC_Control_1 visit_4    28
 6 VC_Control_2 visit_0    24
 7 VC_Control_2 visit_1    26
 8 VC_Control_2 visit_2    28
 9 VC_Control_2 visit_3    29
10 VC_Control_2 visit_4    30
# ℹ 2,400 more rows

This is step 1 of our data wrangling. In the next step, we’ll pipe the result of pivot_longer() into mutate().

In general, building up and running your data wrangling pipeline step by step, the way we’re illustrating here, is a good way to make sure each step of your code really is doing what you think it’s doing.

Solution 2. Now that we know how to get our data long, we need to sort out our time variable (visit) and turn it into numbers.
We can replace all occurrences of the string "visit_" in our data with nothingness "", and then convert what remains (the visit numbers) to numeric.

geduc |> 
  pivot_longer(2:last_col(), names_to="visit",values_to="GAD") |>
  mutate(
    visit = as.numeric(gsub("visit_","",visit))
  ) 
# A tibble: 2,410 × 3
   patient      visit   GAD
   <chr>        <dbl> <dbl>
 1 VC_Control_1     0    24
 2 VC_Control_1     1    24
 3 VC_Control_1     2    26
 4 VC_Control_1     3    29
 5 VC_Control_1     4    28
 6 VC_Control_2     0    24
 7 VC_Control_2     1    26
 8 VC_Control_2     2    28
 9 VC_Control_2     3    29
10 VC_Control_2     4    30
# ℹ 2,400 more rows

Solution 3. Finally, we need to sort out the patient variable. It contains 3 bits of information that we want to separate out: the therapist (their initials), then the group (treatment or control), and then the patient number, all separated by an underscore "_".

The separate() function splits one column into several new columns (as many as we name), dividing the values at some user-defined separator such as an underscore:

geduc_long <- geduc |> 
  pivot_longer(2:last_col(), names_to="visit",values_to="GAD") |>
  mutate(
    visit = as.numeric(gsub("visit_","",visit))
  ) |>
  separate(patient, into=c("therapist","group","patient"), sep="_")

And we’re ready to go!

geduc_long
# A tibble: 2,410 × 5
   therapist group   patient visit   GAD
   <chr>     <chr>   <chr>   <dbl> <dbl>
 1 VC        Control 1           0    24
 2 VC        Control 1           1    24
 3 VC        Control 1           2    26
 4 VC        Control 1           3    29
 5 VC        Control 1           4    28
 6 VC        Control 2           0    24
 7 VC        Control 2           1    26
 8 VC        Control 2           2    28
 9 VC        Control 2           3    29
10 VC        Control 2           4    30
# ℹ 2,400 more rows

Question 2

Visualise the data. Does it look like the treatment had an effect over time? Does it look like the treatment worked when used by every therapist?

Hints
  • remember, stat_summary() is very useful for aggregating data inside a plot.

Solution 4. Here’s the overall picture. The average GAD7 score at each visit diverges more and more between the two groups. The treatment looks effective.

ggplot(geduc_long, aes(x = visit, y = GAD, col = group)) +
  stat_summary(geom="pointrange")

Let’s split this up by therapist, so we can see the averages across each therapist’s set of patients.
There’s clear variability between therapists in how well the treatment worked. For instance, therapists EU and OD don’t seem to show much difference between their two groups of patients.

ggplot(geduc_long, aes(x = visit, y = GAD, col = group)) +
  stat_summary(geom="pointrange") +
  facet_wrap(~therapist)

Question 3

Fit a model to test if the psychoeducational treatment is associated with greater improvement in anxiety over time.

Step 1: Choose the appropriate fixed effects.

Step 2: Think about the grouping structure in the data.

Step 3: Choose the appropriate random effects.

Note that the patient variable does not uniquely specify the individual patients. That is, patient “1” from therapist “AO” is a different person from patient “1” from therapist “BJ”.

Solution 5. We want to know whether the way anxiety (GAD) changes over time (visit) differs between treatment and control (group).

This should come as no surprise - it’s an interaction!

lmer(GAD ~ visit * group + ...
       ...
     data = geduc_long)

Solution 6. We have multiple observations for each of the 482 patients, and those patients are nested within 30 therapists.

Note that in our data, the patient variable does not uniquely specify the individual patients - i.e. patient “1” from therapist “AO” is a different person from patient “1” from therapist “BJ”. To correctly group the observations into different patients (and not just ‘patient numbers’), we need therapist:patient.
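As a quick base-R sketch (using a made-up mini-dataset, not the real gadeduc data), pasting the two identifiers together - which is what the therapist:patient grouping does - distinguishes patients who happen to share a number:

```r
# Hypothetical mini-dataset: two therapists each have a "patient 1" and a "patient 2"
d <- data.frame(
  therapist = c("AO", "AO", "BJ", "BJ"),
  patient   = c("1", "2", "1", "2")
)

# Grouping on patient alone lumps AO's patient 1 together with BJ's patient 1:
length(unique(d$patient))                                 # 2 "patients" - wrong!

# Combining therapist and patient identifies each person uniquely,
# mirroring the therapist:patient grouping in an lmer() formula:
length(unique(paste(d$therapist, d$patient, sep = ":")))  # 4 patients
```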

So we capture therapist-level differences in ( ... | therapist) and the patients-within-therapist-level differences in ( ... | therapist:patient):

lmer(GAD ~ visit * group + ...
       ( ... | therapist) + 
       ( ... | therapist:patient),
     data = geduc_long)

Solution 7. Note that each patient can change differently in their anxiety levels over time - i.e. the slope of visit could vary by patient.

Likewise, one therapist’s patients might, on average, change differently over time from another therapist’s patients, so visit|therapist can also be included.

Each patient is in one of the two groups - they’re either treatment or control. So we can’t say that “differences in anxiety due to treatment vary between patients”, because for any one patient the “difference in anxiety due to treatment” is not defined in our study design.

However, therapists see multiple different patients, some of which are in the treatment group, and some of which are in the control group. So the treatment effect could be different for different therapists!

mod1 <- lmer(GAD ~ visit*group + 
               (1+visit*group|therapist)+
               (1+visit|therapist:patient),
             geduc_long)

Solution 8. The question asked whether the psychoeducational treatment (in the variable group) is associated with greater improvement in anxiety (GAD) over time (visit).

Let’s take a look at the model summary to find out.

summary(mod1)
Linear mixed model fit by REML ['lmerMod']
Formula: GAD ~ visit * group + (1 + visit * group | therapist) + (1 +  
    visit | therapist:patient)
   Data: geduc_long

REML criterion at convergence: 8696

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.7051 -0.5341  0.0147  0.5412  2.9954 

Random effects:
 Groups            Name                 Variance Std.Dev. Corr             
 therapist:patient (Intercept)          1.448    1.203                     
                   visit                1.014    1.007    0.07             
 therapist         (Intercept)          1.602    1.266                     
                   visit                0.211    0.459    -0.09            
                   groupTreatment       0.082    0.286    -0.13  0.02      
                   visit:groupTreatment 0.154    0.392    -0.14 -0.20 -0.49
 Residual                               0.706    0.840                     
Number of obs: 2410, groups:  therapist:patient, 482; therapist, 30

Fixed effects:
                     Estimate Std. Error t value
(Intercept)           25.1910     0.2528   99.63
visit                 -0.5780     0.1120   -5.16
groupTreatment         0.0313     0.1377    0.23
visit:groupTreatment  -0.8375     0.1233   -6.79

Correlation of Fixed Effects:
            (Intr) visit  grpTrt
visit       -0.073              
groupTrtmnt -0.278  0.029       
vst:grpTrtm -0.067 -0.442 -0.160

The question focuses on the fixed effects, so let’s look at those coefficients and their t-values.

  • visit: We see a negative association between visit and GAD: the more therapy visits you have - when you are in the control group - the lower your anxiety. The t-value here is more extreme than 2 (as a rule of thumb, |t| > 2 roughly corresponds to an estimate whose 95% CI excludes zero). So we likely have a significant negative association between time and anxiety, meaning we can reject the null hypothesis that there’s no change in anxiety over time in the control group.

  • groupTreatment: We see a very small positive association between group and GAD, but the standard error is much bigger than the estimate itself, and consequently the t-value is pretty close to 0. I doubt we can reject the null hypothesis that, on average, there’s no difference between the treatment and control groups at baseline (when visit = 0).

  • visit:groupTreatment: We see a negative coefficient for the interaction between visit and groupTreatment, accompanied by a fairly extreme t-value. We likely have a significant negative interaction, meaning we can reject the null hypothesis that the two groups change at the same rate over time. The negative coefficient means that, as visits increase, the GAD of the treatment group decreases more than the GAD of the control group does. (Note: interactions are REALLY hard to interpret just from model coefficients. The best way to interpret them is to look at plots of the data.)
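To make that rule of thumb concrete, here is a rough sketch of the approximate 95% confidence intervals, computed as estimate ± 1.96 standard errors (values copied from the fixed-effects table above):

```r
# Estimates and SEs copied from the fixed-effects table above
est <- c(visit = -0.5780, groupTreatment = 0.0313, `visit:groupTreatment` = -0.8375)
se  <- c(visit =  0.1120, groupTreatment = 0.1377, `visit:groupTreatment` =  0.1233)

# Approximate 95% CI: estimate +/- 1.96 * SE
round(cbind(lower = est - 1.96 * se, upper = est + 1.96 * se), 3)
# The visit and visit:groupTreatment intervals exclude zero;
# the groupTreatment interval spans zero.
```

These are only quick approximations - confint() on the fitted model gives more principled intervals.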

To see whether our guesses about significance based on the t-values were on the right track, we can re-fit the model using lmerTest.

mod1_test <- lmerTest::lmer(
  GAD ~ visit*group + 
    (1+visit*group|therapist) +
    (1+visit|therapist:patient),
  geduc_long
)
summary(mod1_test)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: GAD ~ visit * group + (1 + visit * group | therapist) + (1 +  
    visit | therapist:patient)
   Data: geduc_long

REML criterion at convergence: 8696

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.7051 -0.5341  0.0147  0.5412  2.9954 

Random effects:
 Groups            Name                 Variance Std.Dev. Corr             
 therapist:patient (Intercept)          1.448    1.203                     
                   visit                1.014    1.007    0.07             
 therapist         (Intercept)          1.602    1.266                     
                   visit                0.211    0.459    -0.09            
                   groupTreatment       0.082    0.286    -0.13  0.02      
                   visit:groupTreatment 0.154    0.392    -0.14 -0.20 -0.49
 Residual                               0.706    0.840                     
Number of obs: 2410, groups:  therapist:patient, 482; therapist, 30

Fixed effects:
                     Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)           25.1910     0.2528 28.7236   99.63  < 2e-16 ***
visit                 -0.5780     0.1120 25.7026   -5.16  2.3e-05 ***
groupTreatment         0.0313     0.1377 21.4424    0.23     0.82    
visit:groupTreatment  -0.8375     0.1233 18.8876   -6.79  1.8e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) visit  grpTrt
visit       -0.073              
groupTrtmnt -0.278  0.029       
vst:grpTrtm -0.067 -0.442 -0.160

Yes, the effects deemed “significant” based on Satterthwaite’s method are the ones we expected.

So yes, it looks like the treatment group does improve more over time, compared to the control group, taking into account all the variability introduced by individual patients and therapists.

Question 4

For each of the models below, what is wrong with the random effect structure?

modelA <- lmer(GAD ~ visit*group + 
               (1+visit*group|therapist)+
               (1+visit|patient),
             geduc_long)
modelB <- lmer(GAD ~ visit*group + 
               (1+visit*group|therapist/patient),
             geduc_long)

Solution 9.

modelA <- lmer(GAD ~ visit*group + 
               (1+visit*group|therapist)+
               (1+visit|patient),
             geduc_long)

The patient variable doesn’t capture the different patients within therapists, so this actually fits crossed random effects and treats all data where patient==1 as coming from the same group (even though this includes several different patients’ worth of data from different therapists!).

modelB <- lmer(GAD ~ visit*group + 
               (1+visit*group|therapist/patient),
             geduc_long)

Using the / here means we have the same random slopes fitted for therapists and for patients-within-therapists.

Concretely, (1+visit*group|therapist/patient) is shorthand for (1 + visit*group | therapist) (which is fine) + (1 + visit*group | therapist:patient) (which is not fine, because no patient is observed in both groups).

In other words, the effect of group can’t vary by patient, so this doesn’t work - which is why we need to split things up into (...|therapist)+(...|therapist:patient).
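A small base-R sketch (with hypothetical patients) of the underlying problem: a slope can only vary over a grouping factor if the predictor actually varies within each group, and group never varies within a patient:

```r
# Hypothetical data: each patient is in exactly one group for the whole study
d <- data.frame(
  patient = rep(c("p1", "p2"), each = 3),
  group   = rep(c("Control", "Treatment"), each = 3)
)

# Count the distinct values of group observed within each patient:
tapply(d$group, d$patient, function(g) length(unique(g)))
# With only one group value per patient, a by-patient effect of group
# cannot be estimated.
```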

Question 5

Let’s suppose that I don’t want the psychoeducation treatment, I just want the standard therapy sessions that the ‘Control’ group received. Which therapist should I go to?

Hints

You don’t need to fit a new model here, you can use the one you fitted above.

ranef() and dotplot.ranef.mer() will help! You can read about ranef in Chapter 2 #making-model-predictions.

Solution 10. The three best therapists to go to are SZ, AO, or IT.

Why?

  • We said we don’t care about getting the treatment, so we can ignore the parameters groupTreatment and visit:groupTreatment.
  • (Intercept) doesn’t tell us about the effect each therapist has over time, just their patients’ estimated anxiety levels at baseline (visit 0).
  • visit is the parameter that tells us about how anxiety changes over time. And because lower GAD is better, we want scores to decrease over time.

So we can look at which therapists have the most negative slope estimated for visit:

ranef(mod1)$therapist |>
  arrange(visit)
   (Intercept)   visit groupTreatment visit:groupTreatment
SZ      0.9726 -0.6888        0.24166             -0.22320
AO      1.1433 -0.6187        0.12344             -0.36512
IT     -1.1505 -0.5899       -0.07236              0.31511
YS     -0.4870 -0.5669       -0.06644             -0.03350
GW      1.2185 -0.4012        0.08380             -0.22423
YF      1.6864 -0.3062       -0.21447              0.27606
BT      0.3180 -0.2006        0.02727             -0.27354
CX     -2.0326 -0.1830        0.07968              0.04233
YE     -0.0753 -0.1195       -0.11639              0.37761
BJ      0.4484 -0.1189        0.13246             -0.17041
OD      0.3721 -0.1095       -0.00822              0.01029
LI      1.2295 -0.1060        0.05089             -0.30193
WB     -1.3593 -0.0762        0.12760             -0.22965
XA     -0.5062 -0.0623       -0.03204              0.28860
OI     -0.2767 -0.0181       -0.21767              0.33782
TV     -1.9303  0.0405       -0.04086              0.11940
DF     -0.4995  0.0754        0.12759              0.03544
DJ     -1.4707  0.0854        0.06194              0.02057
PM      0.2754  0.1224       -0.02785              0.04959
MV      0.9641  0.1595       -0.01262              0.15925
OE     -0.2371  0.1873        0.13168              0.01462
RW      0.3232  0.2004        0.10068             -0.31089
LO      1.8762  0.2146       -0.05419              0.00335
MY     -0.5791  0.2166       -0.02693              0.08901
KD      0.4017  0.2669        0.01437             -0.36765
EU     -1.3717  0.3149       -0.03285              0.26114
KI      2.7613  0.3277       -0.29719              0.17487
CS     -1.2525  0.4870       -0.00583             -0.00779
XQ     -1.3365  0.5703       -0.27314              0.33068
VC      0.5744  0.8969        0.19601             -0.39784

Double check using the plot:

dotplot.ranef.mer(ranef(mod1))$therapist

Question 6

Recreate this plot.

The faint lines represent the model estimated lines for each patient. The points and ranges represent our fixed effect estimates and their uncertainty.

Make sure you’re plotting model estimates, not the raw data.

Hints
  • you can get the patient-specific lines using augment() from the broom.mixed package, and the fixed effects estimates using effect() from the effects package.
  • remember that the “patient” column doesn’t group observations into unique patients.
  • remember you can pull multiple datasets into ggplot:
ggplot(data = dataset1, aes(x=x,y=y)) + 
  geom_point() + # points from dataset1
  geom_line(data = dataset2) # lines from dataset2
  • see more in Chapter 2 #visualising-models

Solution 11. The effects package will give us the fixed effect estimates:

library(effects)
library(broom.mixed)
effplot <- effect("visit*group",mod1) |>
  as.data.frame()

We want to get the fitted values for each patient, which we can do using augment(). But the patient variable doesn’t capture the unique patients, it just captures their numbers (which aren’t unique to each therapist).
So we create a new column called upatient which pastes together the therapist’s initials and the patient number:

augment(mod1) |> 
  mutate(
    upatient = paste0(therapist,patient),
    .after = patient # place the column next to the patient col
  )
# A tibble: 2,410 × 17
     GAD visit group   therapist patient upatient .fitted .resid  .hat .cooksd
   <dbl> <dbl> <fct>   <fct>     <fct>   <chr>      <dbl>  <dbl> <dbl>   <dbl>
 1    24     0 Control VC        1       VC1         24.2 -0.198 0.454 0.0210 
 2    24     1 Control VC        1       VC1         25.3 -1.28  0.239 0.239  
 3    26     2 Control VC        1       VC1         26.4 -0.360 0.186 0.0128 
 4    29     3 Control VC        1       VC1         27.4  1.56  0.294 0.508  
 5    28     4 Control VC        1       VC1         28.5 -0.522 0.563 0.284  
 6    24     0 Control VC        2       VC2         24.8 -0.843 0.454 0.383  
 7    26     1 Control VC        2       VC2         26.2 -0.171 0.239 0.00426
 8    28     2 Control VC        2       VC2         27.5  0.502 0.186 0.0250 
 9    29     3 Control VC        2       VC2         28.8  0.174 0.294 0.00633
10    30     4 Control VC        2       VC2         30.2 -0.153 0.563 0.0246 
# ℹ 2,400 more rows
# ℹ 7 more variables: .fixed <dbl>, .mu <dbl>, .offset <dbl>, .sqrtXwt <dbl>,
#   .sqrtrwt <dbl>, .weights <dbl>, .wtres <dbl>

Solution 12.

library(effects)
library(broom.mixed)
effplot <- effect("visit*group",mod1) |>
  as.data.frame()

augment(mod1) |> 
  mutate(
    upatient = paste0(therapist,patient),
    .after = patient # place the column next to the patient col
  ) |>
  ggplot(aes(x=visit,y=.fitted,col=group))+
  stat_summary(geom="line", aes(group=upatient,col=group), alpha=.1)+
  geom_pointrange(data=effplot, aes(y=fit,ymin=lower,ymax=upper,col=group))+
  labs(x="- Month -",y="GAD7")


Jokes

Data: lmm_laughs.csv

These data are simulated to imitate an experiment investigating the effect of visual non-verbal communication (i.e. gestures, facial expressions) on joke appreciation. 90 participants took part in the experiment, each rating how funny they found a set of 30 jokes, presented in a randomised order. For each participant, the set of jokes was randomly split into two halves, with the first half presented in audio-only and the second half in audio and video. This meant that each participant saw 15 jokes with video and 15 without, and each joke was presented with video roughly half of the time.
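The random split described above can be sketched in a few lines of base R (purely illustrative - this is not the researchers’ code):

```r
set.seed(1)  # arbitrary seed, just for reproducibility

jokes <- 1:30
# For one hypothetical participant: sample 15 jokes for audio-only,
# and present the remaining 15 in audio+video.
audio_only <- sample(jokes, 15)
with_video <- setdiff(jokes, audio_only)

length(audio_only)  # 15
length(with_video)  # 15
```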

The researchers want to investigate whether the delivery (audio/audiovideo) of jokes is associated with differences in humour-ratings.

Data are available at https://uoepsy.github.io/data/lmm_laughs.csv

Table 2: Data Dictionary: lmm_laughs.csv
  • ppt: Participant Identification Number.
  • joke_label: The text of the joke.
  • joke_id: Joke Identification Number.
  • delivery: Experimental manipulation: whether the joke was presented in audio-only ('audio') or in audio and video ('video').
  • rating: Humour rating chosen on a slider from 0 to 100.
Question 7

Don’t look at the actual data yet!

Even before getting hold of any data, we should be able to write out the structure of our ideal “maximal” model based only on the information above.

Can you do so?

Hints

Don’t know where to start? Try following the steps in Chapter 8 #maximal-model.

Solution 13. We want to estimate the effect of delivery on humour rating of jokes:
rating ~ delivery

We have 30 observations for each participant. Participants are just another sampling unit here.
rating ~ delivery + (1 | ppt)

We have 90 observations for each joke. We’re not interested in these specific jokes; we can think of them as a random set of experimental items that we might choose differently next time we conduct an experiment assessing the effect of delivery on ratings:
rating ~ delivery + (1 | ppt) + (1 | joke_id)

Participants each see 15 jokes without video and 15 with, so the delivery variable is “within” participant. Some participants might respond a lot to having video, whereas others might not rate jokes any differently. The effect of delivery on rating might vary by participant:
rating ~ delivery + (1 + delivery | ppt) + (1 | joke_id)

Each joke is presented both with and without the video. Some jokes might really benefit from gestures and facial expressions, whereas others might not. The effect of delivery on rating might vary by joke:
rating ~ delivery + (1 + delivery | ppt) + (1 + delivery | joke_id)

Question 8

Read in and clean the data (if necessary).

Create some plots showing:

  1. the average rating for audio vs audio+video for each joke
  2. the average rating for audio vs audio+video for each participant
Hints
  • you could use facet_wrap, or even stat_summary!
  • you might want to use joke_id, rather than joke_label (the labels are very long!)

Solution 14. First, read in the data (they seem ready to use as-is):

laughs <- read_csv("https://uoepsy.github.io/data/lmm_laughs.csv")

Here is a plot for each joke, using facet_wrap:

ggplot(laughs, aes(x = delivery, y = rating)) +
  geom_boxplot()+
  facet_wrap(~joke_id)

And one using stat_summary() for participants:

ggplot(laughs, aes(x = delivery, y = rating)) +
  stat_summary(geom="pointrange", aes(group = ppt),
               position = position_dodge(width=.2))+
  stat_summary(geom="line", aes(group = ppt),
               position = position_dodge(width=.2))

Question 9

Fit an appropriate model to address the research aims of the study.

Hints

This should be the one you came up with a couple of questions ago!

Solution 15.

mod <- lmer(rating ~ delivery + 
              (1 + delivery | joke_id) +
              (1 + delivery | ppt), data = laughs)

Question 10

Which joke is funniest when presented just in audio? For which joke does the video make the most difference to ratings?

Hints

These can all be answered by examining the random effects with ranef().
See Chapter 2 #making-model-predictions.

If you’re using joke_id, can you find out the actual joke that these correspond to?

Solution 16. Which joke is funniest in audio?

Audio is the reference level of delivery, so we only need to look at the intercept adjustments.

ranef(mod)$joke_id |>
  arrange(desc(`(Intercept)`))
   (Intercept) deliveryvideo
19       4.227        1.5179
9        3.384        0.3888
27       3.317        2.4125
17       2.776        1.3163
20       2.291        2.8854
10       1.761       -1.2474
29       1.232       -1.5029
2        1.066       -1.0460
25       0.963        3.2455
28       0.807        3.3737
23       0.755        0.0733
22       0.672       -0.9136
26       0.616       -1.2609
8        0.383        1.3332
24       0.248        1.5378
21      -0.157        1.6145
5       -0.190        0.1775
13      -0.606        2.9318
18      -0.760       -0.6693
6       -0.810        1.0232
7       -1.295       -0.2572
12      -1.327       -0.4446
1       -1.433       -1.8001
30      -1.872       -3.9547
3       -1.955        0.4889
4       -1.999       -1.0400
16      -2.468       -4.1913
15      -2.497       -1.2938
14      -3.386       -1.4002
11      -3.743       -3.2984
dotplot.ranef.mer(ranef(mod))$joke_id

Joke 19 is the funniest apparently!

Lots of ways to find what the joke actually is. Here is one way:

laughs |> count(joke_id, joke_label) |>
  filter(joke_id==19) |>
  pull(joke_label)
[1] "How many psychiatrists does it take to change a lightbulb? Just one, but the lightbulb really has to want to change."

(not sure I agree)

To find out which joke benefits most from video, we should look at the by-joke adjustments to the slope over delivery.

We can see the biggest slope adjustment in the plot above, which shows us that Joke 28 benefits most from video. We can also quickly check this with something like:

ranef(mod)$joke_id |>
  filter(deliveryvideo == max(deliveryvideo))
   (Intercept) deliveryvideo
28       0.807          3.37

Which joke is it?

laughs |> count(joke_id, joke_label) |>
  filter(joke_id==28) |>
  pull(joke_label)
[1] "An Alsatian went to a telegram office, took out a blank form and wrote:\n\"Woof. Woof. Woof. Woof. Woof. Woof. Woof. Woof. Woof.\"\nThe clerk examined the paper and politely told the dog: \"There are only nine\nwords here. You could send another 'Woof' for the same price.\"\n\"But,\" the dog replied, \"that would make no sense at all.\""

The joke itself is a bit weird… but I can imagine that the video really helped!

Question 11

Do jokes that are rated funnier when presented in audio-only tend to also benefit more from the addition of video?

Hints

Think carefully about this question. The random effects show us that jokes vary in their intercepts (ratings in audio-only) and in their effects of delivery (the random slopes). We want to know if these are related… one might even say … co-related …

Solution 17.

VarCorr(mod)
 Groups   Name          Std.Dev. Corr
 ppt      (Intercept)   3.51         
          deliveryvideo 1.78     0.00
 joke_id  (Intercept)   2.15         
          deliveryvideo 2.24     0.39
 Residual               5.83         

It’s the correlation between joke_id’s (Intercept) and deliveryvideo adjustments that addresses the question. The correlation is 0.39, telling us that jokes rated as funnier when presented audio-only also tend to benefit more from the addition of video.

We can see this in a plot if we like. Every dot is a joke. The x-axis shows each joke’s deviation from the average audio-only rating (the intercept): 0 is the average, positive values are above-average, negative values below. The y-axis shows the same for the effect of video.

plot(ranef(mod)$joke_id)

Question 12

Create a plot of the estimated effect of video on humour ratings. Try to plot not only the fixed effects, but the raw data too.

Hints

See e.g. Chapter 2 #visualising-models

Solution 18. Many ways! There’s no one correct answer.

library(effects)

# get the model's predictions
plotdatf <- effect("delivery",mod) |>
  as.data.frame()

Josiah’s version:

ggplot(data = laughs, aes(x = delivery)) +
  geom_jitter(aes(y = rating), width = .1, height = 0, alpha = .1) +
  geom_pointrange(data = plotdatf,
                  aes(y = fit, ymin = lower, ymax = upper),
                  position=position_nudge(x=.2))

Elizabeth’s version:

laughs |>
  ggplot(aes(x = delivery, y = rating, colour = delivery, fill = delivery)) +
  geom_violin(alpha = 0.2) +
  geom_jitter(alpha = 0.2) +
  geom_pointrange(data = plotdatf, aes(y = fit, ymin = lower, ymax = upper), colour = 'black') +
  theme(legend.position = 'none') 


Extra: Test Enhanced Learning

Data: Test-enhanced learning

An experiment was run to conceptually replicate “test-enhanced learning” (Roediger & Karpicke, 2006): two groups of 25 participants were presented with material to learn. One group studied the material twice (StudyStudy), the other group studied the material once then did a test (StudyTest). Recall was tested immediately (one minute) after the learning session and one week later. The recall tests were composed of 175 items identified by a keyword (Test_word).

The critical (replication) prediction is that the StudyStudy group will perform better on the immediate test, but the StudyTest group will retain the material better and thus perform better on the 1-week follow-up test.

Test performance is measured as the time taken to correctly recall a given word.

The following code loads the data into your R environment by creating a variable called tel:

load(url("https://uoepsy.github.io/data/testenhancedlearning.RData"))
Table 3: Data Dictionary: testenhancedlearning.RData
  • Subject_ID: Unique Participant Identifier.
  • Group: Whether the participant studied the material twice (StudyStudy), or studied it once then did a test (StudyTest).
  • Delay: Time of recall test ('min' = immediate, 'week' = one week later).
  • Test_word: Word being recalled (175 different test words).
  • Correct: Whether or not the word was correctly recalled.
  • Rtime: Time to recall word (milliseconds).
Question 13

Here is the beginning of our modelling.

Code
# load in the data
load(url("https://uoepsy.github.io/data/testenhancedlearning.RData"))

# performance is measured by time taken to *correctly*
# recall a word.
# so we're going to have to discard all the incorrects:
tel <- tel |> filter(Correct == 1)

# preliminary plot
# makes it look like studytest are better at immediate (contrary to prediction)
# both groups get slower from immediate > week, 
# but studystudy slows more than studytest
ggplot(tel, aes(x = Delay, y = Rtime, col = Group)) + 
  stat_summary(geom="pointrange")

mmod <- lmer(Rtime ~ Delay*Group +
             (1 + Delay | Subject_ID) +
             (1 + Delay * Group | Test_word),
             data=tel)

Solution 19.

This is what I did. You might do something else!

First, I removed the interaction from the random effects:

mod1 <- lmer(Rtime ~ Delay*Group +
             (1 + Delay | Subject_ID) +
             (1 + Delay + Group | Test_word),
             data=tel)
boundary (singular) fit: see help('isSingular')

This model is a singular fit, suggesting it needs further simplification. Unfortunately, the variance-covariance matrix of the random effects isn't giving us many pointers:

# examine vcov
VarCorr(mod1)
 Groups     Name           Std.Dev. Corr       
 Test_word  (Intercept)     21.5               
            Delayweek       14.5     0.17      
            GroupStudyTest  23.4    -0.77 -0.76
 Subject_ID (Intercept)     27.1               
            Delayweek       10.2    0.01       
 Residual                  240.1               

There are various things we could try here. See Chapter 8 #simplifying-random-effect-structures for some of the more in-depth options.
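One of those options is a principal components analysis of the random effect variance-covariance matrices, which lme4 provides via `rePCA()`. A quick sketch (using `mod1` from above):

```r
# PCA of the random-effect variance-covariance matrices.
# components with (near-)zero standard deviation suggest the
# random effect structure is over-specified and can be simplified
summary(rePCA(mod1))
```

If a grouping's components include one with essentially zero variance, that is a hint that we can drop a term for that grouping.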

However, sometimes it is simplest just to trial-and-error the removal of different possible terms. Here we try removing Delay|Test_word (mod2a) and removing Delay|Subject_ID (mod2b):

mod2a <- lmer(Rtime ~ Delay*Group +
             (1 + Delay | Subject_ID) +
             (1 + Group | Test_word),
             data=tel)
mod2b <- lmer(Rtime ~ Delay*Group +
             (1 | Subject_ID) +
             (1 + Delay + Group | Test_word),
             data=tel)
boundary (singular) fit: see help('isSingular')

The second model is a singular fit, but the first one is not. Just for safety, let’s check:

isSingular(mod2a)
[1] FALSE

All looks good there.

Sometimes it can be useful to check how estimates of fixed effects and their standard errors differ across possible candidate models with different random effect structures. More often than not, this simply provides us with reassurance that the removal of random effects hasn’t actually had too much of an impact on anything we’re going to conduct inferences on. If they differ a lot, then we have a lot more to discuss!

Here are the fixed effects from each model:

term mod1 mod2a mod2b
(Intercept) 740.55 (7.17) 740.57 (7.21) 740.69 (7.23)
Delayweek 27.65 (6.97) 27.64 (6.87) 27.23 (6.64)
GroupStudyTest -31.82 (10.26) -31.75 (10.23) -31.73 (10.35)
Delayweek:GroupStudyTest -17.2 (9.69) -17.26 (9.7) -17.18 (9.19)

Across all these models, the fixed effect estimates are pretty similar, suggesting that these parameter estimates have been largely invariant to our refinement of the random effects. This makes me feel better: there's less worry that our final conclusions hinge on the specific inclusion or exclusion of one of these random effect terms.
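For reference, a comparison table like the one above can be assembled with `broom.mixed::tidy()`. A sketch, assuming the broom.mixed package is installed (the rounding and layout choices here are just one option):

```r
library(broom.mixed)  # tidy() methods for lmer models
library(purrr)
library(dplyr)
library(tidyr)

# fixed effects (estimate + SE) from each candidate model, side by side
list(mod1 = mod1, mod2a = mod2a, mod2b = mod2b) |>
  map_dfr(~ tidy(.x, effects = "fixed"), .id = "model") |>
  mutate(est = paste0(round(estimate, 2), " (", round(std.error, 2), ")")) |>
  select(model, term, est) |>
  pivot_wider(names_from = model, values_from = est)
```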

I would settle on mod2a because it is the one that fits without singularity issues, but we could add a footnote, if we wanted, to mention that mod2b finds the same pattern of results.

Footnotes

  1. If it does, head back to where we learned about interactions in single-level regression with lm(). It’s just the same here.↩︎