Data Analysis for Psychology in R 3
Psychology, PPLS
University of Edinburgh
multilevel modelling working with group structured data |
regression refresher |
introducing multilevel models | |
more complex groupings | |
centering, assumptions, and diagnostics | |
recap | |
factor analysis working with multi-item measures |
what is a psychometric test? |
using composite scores to simplify data (PCA) | |
uncovering underlying constructs (EFA) | |
more EFA | |
recap |
How do reaction times change with increasing sleep deprivation? Participants measured over 10 days of sleep deprivation.
In 2010 A US state’s commissioner for education was faced with growing community concern about rising levels of adolescent antisocial behaviours.
After a series of focus groups, the commissioner approved the trialing of an intervention in which yearly Parent Management Training (PMT) group sessions were offered to the parents of a cohort of students entering 10 different high schools. Every year, the parents were asked to fill out an informant-based version of the Aggressive Behaviour Scale (ABS), measuring verbal and physical abuse, socially inappropriate behavior, and resisting care. Where possible, the same parents were followed up throughout the child’s progression through high school. Alongside this, parents from a cohort of students entering 10 further high schools in the state were recruited to also complete the same informant-based ABS, but were not offered the PMT group sessions.
The commissioner has two main questions: Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
The data is available at https://uoepsy.github.io/data/abs_intervention.csv
variable | description |
---|---|
schoolid | Name of school |
ppt | Participant number |
age | Age of participant (in years) at observation |
interv | Informant-based Aggressive Behaviour Scale (ABS) score (range 0 to 100) |
ABS | Whether or not the school was part of the intervention group in which Parent Management Training (PMT) group sessions were offered to the parents |
# A tibble: 6 × 5
schoolid ppt age interv ABS
<chr> <dbl> <dbl> <fct> <dbl>
1 Blue River High School 1 12 1 47.5
2 Blue River High School 1 13 1 47.4
3 Blue River High School 1 14 1 53.0
4 Blue River High School 1 15 1 70.7
5 Blue River High School 1 16 1 66.3
6 Blue River High School 1 17 1 NA
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
What are we interested in explaining/predicting?
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
What are we interested in explaining/predicting?
What variables are we interested in explaining this by?
schoolid ppt age interv ABS
Blue River High School 1 12 1 47.5
Blue River High School 1 13 1 47.4
Blue River High School 1 14 1 53
Blue River High School 1 15 1 70.7
... ... ... ... ...
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
Are our questions about the effects of our predictors specifically in reference to group means of predictors?
“the effect of being higher on \(x\) for a group”
“the effect of a group being on average higher on \(x\)”
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
schoolid ppt age interv ABS
Blue River High School 1 12 1 47.5
Blue River High School 1 13 1 47.4
Blue River High School 1 14 1 53
Blue River High School 1 15 1 70.7
... ... ... ... ...
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
schoolid ppt age interv ABS
Blue River High School 1 12 1 47.5
Blue River High School 1 13 1 47.4
Blue River High School 1 14 1 53
Blue River High School 1 15 1 70.7
... ... ... ... ...
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
schoolid ppt age interv ABS
Blue River High School 1 12 1 47.5
Blue River High School 1 13 1 47.4
Blue River High School 1 14 1 53
Blue River High School 1 15 1 70.7
... ... ... ... ...
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
Of the different ways we can group our data, which groupings do we think of as a random sample from a general population of groups?
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
Of the different ways we can group our data, which groupings do we think of as a random sample from a general population of groups?
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
Which of our fixed effects can vary across our groupings?
“does a single group have multiple distinct values for \(x\)?”
“for the data from only one of our groups, can we estimate \(y \sim x\)?”
All the data:
Data from School == “Central High”:
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
Which of our fixed effects can vary across our groupings?
“does a single group have multiple distinct values for \(x\)?”
“for the data from only one of our groups, can we estimate \(y \sim x\)?”
All the data:
Data from School == “Central High”, Participant == “1”:
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
lmer(ABS ~ age * interv + (1 + age | schoolid ) +
(1 + age | schoolid:ppt )
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
(g)lmer(outcome ~ fixed effects + (random effects | grouping structure), family = …)
lmer(ABS ~ ageC * interv + (1 + ageC | schoolid ) +
(1 + ageC | schoolid:ppt )
schoolid | ppt | age | interv | ABS | ageC |
---|---|---|---|---|---|
Blue River High School | 1 | 12 | 1 | 47.5 | 0 |
Blue River High School | 1 | 13 | 1 | 47.4 | 1 |
Blue River High School | 1 | 14 | 1 | 53 | 2 |
Blue River High School | 1 | 15 | 1 | 70.7 | 3 |
… | … | … | … | … | … |
Do tests if you want, but beware. Multilevel data tends to have bigger \(n\), and these tests are overly sensitive.
[1] "Shapiro-Wilk normality test resid(absmod) W=1, p=0.98"
[2] "Shapiro-Wilk normality test ranef(absmod)$schoolid$`(Intercept)` W=0.97, p=0.69"
[3] "Shapiro-Wilk normality test ranef(absmod)$schoolid$ageC W=0.95, p=0.34"
[4] "Shapiro-Wilk normality test ranef(absmod)$`schoolid:ppt`$`(Intercept)` W=1, p=0.95"
[5] "Shapiro-Wilk normality test ranef(absmod)$`schoolid:ppt`$ageC W=0.99, p=0.44"
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
Options
Model comparison
Parameters
Methods
lmerTest::lmer()
and pbkrtest::SATmodcomp(model2,model1)
anova(model1, model2, ...)
Does the presentation of aggressive behaviours increase as children enter the secondary school system? If so, is there any evidence for the effectiveness of Parent Management Training (PMT) group sessions in curbing the rise of aggressive behaviors during a child’s transition into the secondary school system?
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula:
ABS ~ ageC * interv + (1 + ageC | schoolid) + (1 + ageC | schoolid:ppt)
Data: absint
REML criterion at convergence: 7132
Scaled residuals:
Min 1Q Median 3Q Max
-3.0748 -0.5767 0.0061 0.5712 3.0350
Random effects:
Groups Name Variance Std.Dev. Corr
schoolid:ppt (Intercept) 17.379 4.169
ageC 4.913 2.217 0.69
schoolid (Intercept) 2.505 1.583
ageC 0.448 0.669 0.26
Residual 49.465 7.033
Number of obs: 992, groups: schoolid:ppt, 196; schoolid, 20
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 46.461 0.862 17.914 53.88 <2e-16 ***
ageC 1.673 0.385 20.361 4.34 0.0003 ***
interv1 1.096 1.207 16.818 0.91 0.3766
ageC:interv1 -1.352 0.543 19.560 -2.49 0.0218 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr) ageC intrv1
ageC 0.003
interv1 -0.714 -0.002
ageC:intrv1 -0.002 -0.709 -0.003
Don’t just ignore. They add context to results
e.g. they can give a descriptive answer to “should we expect all children get more aggressive in secondary school?”
often it’s about the amount of variability relative to the corresponding fixed effect.
Don’t just ignore. They add context to results
Data cleaning outlier/data removal, transformations prior to analysis.
Unplanned transformations and data removal which are carried out in order to meet assumptions.
Specify all fixed effects (explanatory variables & covariates).
Link them to explicitly stated research questions/hypotheses.
Explicitly state the hierarchical structure of the data and of the model.
Specify random effects according to the sampling units (schools/children etc) with which they interact.
Planned structure of random effects to be fitted
Procedure to be used to decide on final random effect structure.
State clearly the relevant test/comparison/parameter estimate of interest.
Link to explicitly stated research questions/hypotheses.
Model fitting (strategy? results? ??)
Software packages and versions used to fit the model(s), along with the estimation method (ML/REML) and optimiser used.
If proposed model failed to converge:
Results
For final model:
all parameter estimates for fixed effects.
random effects
Tables help a lot!
But they are not a substitute for interpretation
ABS | ||||
---|---|---|---|---|
Predictors | Estimates | Statistic | p | df |
(Intercept) | 46.46 | 53.88 | <0.001 | 17.91 |
ageC | 1.67 | 4.34 | <0.001 | 20.36 |
interv [1] | 1.10 | 0.91 | 0.377 | 16.82 |
ageC × interv [1] | -1.35 | -2.49 | 0.022 | 19.56 |
Random Effects | ||||
σ2 | 49.47 | |||
τ00 schoolid:ppt | 17.38 | |||
τ00 schoolid | 2.50 | |||
τ11 schoolid:ppt.ageC | 4.91 | |||
τ11 schoolid.ageC | 0.45 | |||
ρ01 schoolid:ppt | 0.69 | |||
ρ01 schoolid | 0.26 | |||
ICC | 0.64 | |||
N schoolid | 20 | |||
N ppt | 16 | |||
Observations | 992 | |||
Marginal R2 / Conditional R2 | 0.034 / 0.650 |
(Intercept) ageC interv1 ageC:interv1
46.46 1.67 1.10 -1.35
Groups Name Std.Dev. Corr
schoolid:ppt (Intercept) 4.169
ageC 2.217 0.69
schoolid (Intercept) 1.583
ageC 0.669 0.26
Residual 7.033
library(effects)
library(broom.mixed)
efplot <- effect("ageC*interv", absmod) |>
as.data.frame()
augment(absmod) |>
mutate(uppt = paste0(schoolid, ppt)) |>
ggplot(aes(x=ageC, col = interv)) +
geom_line(aes(y = .fitted, group = uppt), alpha=.1) +
geom_line(data = efplot, aes(y = fit, col= interv)) +
geom_ribbon(data = efplot, aes(y = fit,
ymin = lower, ymax = upper,
fill = interv), alpha = .3) +
scale_x_continuous("Age",breaks=0:7,labels=12:19)
But plotting manually gives you more control:
Complete readings
Attend your lab and work together on the exercises
Complete the weekly quiz
Piazza forum!
Office hours (see Learn page for details)