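library(tidyverse)

# simulate 100 people measured on two variables (ex, mood) at four waves;
# `pos` is a person-level variable that influences both variables at wave 1
df <- tibble(
  id = 1:100,
  pos = rnorm(100),
  ex = rnorm(100, .5*pos),
  mood = rnorm(100, .3*pos),
  # each later wave depends on both variables at the previous wave
  ex2 = rnorm(100, ex+mood),
  mood2 = rnorm(100, ex+mood),
  ex3 = rnorm(100, ex2+mood2),
  mood3 = rnorm(100, ex2+mood2),
  ex4 = rnorm(100, ex3+mood3),
  mood4 = rnorm(100, ex3+mood3)
)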
mod <- "
"
9: Further SEM
This reading:
- more uses of SEM!
Introduction
There is a ~50 year history of structural equation modelling in behavioural and social sciences, with its component parts having originated over 100 years ago (factor analysis from Charles Spearman in 1904, and path analysis from Sewall Wright in 1918).
One key benefit of SEM is that it can force us as researchers to face the theoretical assumptions that underpin our analysis. This may be, for instance, having to explicitly define how our observed measurements relate to the underlying concepts we believe they represent. Similarly, representing our models as diagrams forces transparency of our theorised relationships, the variables we have chosen to include, and, perhaps more importantly, those we have omitted.
Despite this, there has been a tendency to treat certain structures as a set of ‘plug-and-play’ models that can be applied to data without having to think too deeply about the underlying theory. If there is one piece of advice for any work you do in statistics, it is to always think carefully and deliberately, and to always ask “how might I be misleading myself here?”
Below are examples of some of the more common model structures that you will see in the literature, intended to illustrate “what can be done” in the SEM framework.
- measurement invariance
- other types of indicators (CCC)
- moderated nonlinear factor analysis (MNLFA)
- multi-trait multi-method models
- multi-group analysis
- multi-level SEM
- cross-lagged panel models
- RI-CLPM
- latent class mixture models
- growth curves
- random effects as latent variables?
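# simulated data; sim_basicrs() is from the `junk` package.
# keep x <= 5 and shift x down by one so that time starts at 0
df <- junk::sim_basicrs() |> filter(x<=5) |> mutate(x=x-1)

# multilevel model: random intercepts and random slopes of x for each g
library(lme4)
m1 <- lmer(y ~ 1 + x + (1 + x | g), df, REML=FALSE)

# reshape to wide: one row per g, one column per time point (t0 to t4)
dfw <-
  df |> select(x,g,y) |>
  pivot_wider(names_from=x,values_from=y,names_prefix = "t")

# the analogous latent growth curve: loadings fixed to 1 define the
# intercept, loadings fixed to 0, 1, 2, 3, 4 define the slope
mod <- "
int =~ 1*t0 + 1*t1 + 1*t2 + 1*t3 + 1*t4
slope =~ 0*t0 + 1*t1 + 2*t2 + 3*t3 + 4*t4
"
library(lavaan)
m2 <- growth(mod, data = dfw)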
summary(m2)
lavaan 0.6-20 ended normally after 36 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        10

  Number of observations                            20

Model Test User Model:

  Test statistic                                23.400
  Degrees of freedom                                10
  P-value (Chi-square)                           0.009

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  int =~
    t0                1.000
    t1                1.000
    t2                1.000
    t3                1.000
    t4                1.000
  slope =~
    t0                0.000
    t1                1.000
    t2                2.000
    t3                3.000
    t4                4.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  int ~~
    slope             1.607    0.549    2.927    0.003

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
    int               0.848    0.391    2.169    0.030
    slope             0.950    0.262    3.619    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .t0                1.402    0.548    2.560    0.010
   .t1                0.877    0.321    2.735    0.006
   .t2                0.193    0.194    0.994    0.320
   .t3                1.494    0.592    2.525    0.012
   .t4                2.824    1.106    2.553    0.011
    int               2.353    0.982    2.397    0.017
    slope             1.208    0.437    2.762    0.006
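lavaan reports the variances of the latent variables and their covariance, whereas lme4's VarCorr() reports standard deviations and a correlation. To put the growth model estimates on that scale, a minimal sketch using values copied by hand from the summary output above (the object names here are purely illustrative):

```r
# estimates copied from the lavaan summary output
var_int   <- 2.353   # variance of int
var_slope <- 1.208   # variance of slope
cov_is    <- 1.607   # covariance between int and slope

# standard deviations are the square roots of the variances
sd_int   <- sqrt(var_int)
sd_slope <- sqrt(var_slope)

# a correlation is the covariance divided by the product of the two SDs
cor_is <- cov_is / (sd_int * sd_slope)

round(c(sd_int = sd_int, sd_slope = sd_slope, cor = cor_is), 3)
```

When the fitted model object is available, `lavInspect(m2, "cor.lv")` should return the latent variable correlations directly, without copying numbers by hand.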
fixef(m1)
(Intercept)           x
  0.8895569   0.9972677
VarCorr(m1)
 Groups   Name        Std.Dev. Corr
 g        (Intercept) 1.3490
          x           1.0784   0.919
 Residual             1.1026
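The two sets of estimates are close but not identical. One likely reason is that the growth model above estimates a separate residual variance at each time point, whereas lmer() estimates a single residual variance. The sketch below constrains the residual variances to be equal by giving them a shared label. It uses freshly simulated data rather than the data above: the object names, the label `res`, and all of the simulation values are assumptions made for illustration. With complete, balanced data and maximum likelihood in both packages, the two sets of estimates should then agree closely.

```r
library(lme4)
library(lavaan)
library(tidyr)

set.seed(123)
n_g <- 200                               # number of groups (hypothetical)
sim_long <- data.frame(
  g = rep(1:n_g, each = 5),
  x = rep(0:4, times = n_g)
)
b0 <- rnorm(n_g, mean = 1, sd = 1.3)     # group-specific intercepts
b1 <- rnorm(n_g, mean = 1, sd = 1)       # group-specific slopes
sim_long$y <- b0[sim_long$g] + b1[sim_long$g] * sim_long$x +
  rnorm(nrow(sim_long), sd = 1)

# multilevel model, fitted with ML rather than REML
m_lmer <- lmer(y ~ 1 + x + (1 + x | g), data = sim_long, REML = FALSE)

# wide format: one row per group, columns t0 to t4
sim_wide <- pivot_wider(sim_long, names_from = x, values_from = y,
                        names_prefix = "t")

mod_eq <- "
  int   =~ 1*t0 + 1*t1 + 1*t2 + 1*t3 + 1*t4
  slope =~ 0*t0 + 1*t1 + 2*t2 + 3*t3 + 4*t4
  # the shared label 'res' constrains the residual variances to be equal
  t0 ~~ res*t0
  t1 ~~ res*t1
  t2 ~~ res*t2
  t3 ~~ res*t3
  t4 ~~ res*t4
"
m_growth <- growth(mod_eq, data = sim_wide)

# pull out the latent means and the common residual variance
pe <- parameterEstimates(m_growth)
growth_means <- c(pe$est[pe$op == "~1" & pe$lhs == "int"],
                  pe$est[pe$op == "~1" & pe$lhs == "slope"])
growth_res <- pe$est[pe$label == "res"][1]

# compare fixed effects and residual variance across the two fits
rbind(lmer = fixef(m_lmer), growth = growth_means)
c(lmer = sigma(m_lmer)^2, growth = growth_res)
```

Removing the five labelled lines from the model string gives back the unconstrained growth model, and comparing the two fits (for example with `anova()`) is one way to ask whether equal residual variances are reasonable.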