9: Further SEM

This reading:

more uses of SEM!

Introduction

There is a ~50 year history of structural equation modelling in behavioural and social sciences, with its component parts having originated over 100 years ago (factor analysis from Charles Spearman in 1904, and path analysis from Sewall Wright in 1918).

One key benefit of SEM is that it can force us as researchers to face the theoretical assumptions that underpin our analysis. This may be, for instance, having to explicitly define how our observed measurements relate to the underlying concepts we believe they represent. Similarly, representing our models as diagrams forces transparency of our theorised relationships, the variables we have chosen to include, and, perhaps more importantly, those we have omitted.

Despite this, there has been a tendency to treat certain structures as a set of ‘plug-and-play’ models that can be applied to data without us having to think too deeply about the underlying theory. If there is one bit of advice for any work you do in statistics, it would be to always think carefully and deliberately, and to always ask “how might I be misleading myself here?”

Below are examples of some of the more common model structures that you will see in the literature. They are intended as illustrations of “what can be done” in the SEM framework.

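library(tidyverse)  # tibble(), filter(), mutate(), pivot_wider()

# simulate 100 people with two variables (ex, mood) measured at four waves
df <- tibble(
  id = 1:100,
  # a shared influence on both variables at wave 1
  pos = rnorm(100),
  ex = rnorm(100,.5*pos),
  mood = rnorm(100,.3*pos),
  # each later wave depends on both variables at the previous wave
  ex2 = rnorm(100,ex+mood),
  mood2 = rnorm(100,ex+mood),
  ex3 = rnorm(100,ex2+mood2),
  mood3 = rnorm(100,ex2+mood2),
  ex4 = rnorm(100,ex3+mood3),
  mood4 = rnorm(100,ex3+mood3)
)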

mod <- "


"
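The simulated data above has two variables measured at four waves, which is the kind of structure a cross-lagged panel model (one of the topics listed below) is often applied to. The following is only a rough sketch of what such a model could look like in lavaan, not necessarily the model intended here. The object names (`dat`, `clpm`, `fit`), the sample size, and the path coefficients used to simulate the data are all assumptions made for illustration.

```r
library(lavaan)

set.seed(2024)
n <- 500

# Hypothetical data: two variables at four waves, each wave depending on
# both variables at the previous wave (coefficients chosen arbitrarily)
ex    <- rnorm(n)
mood  <- rnorm(n, 0.3 * ex)
ex2   <- rnorm(n, 0.5 * ex  + 0.2 * mood)
mood2 <- rnorm(n, 0.2 * ex  + 0.5 * mood)
ex3   <- rnorm(n, 0.5 * ex2 + 0.2 * mood2)
mood3 <- rnorm(n, 0.2 * ex2 + 0.5 * mood2)
ex4   <- rnorm(n, 0.5 * ex3 + 0.2 * mood3)
mood4 <- rnorm(n, 0.2 * ex3 + 0.5 * mood3)
dat <- data.frame(ex, mood, ex2, mood2, ex3, mood3, ex4, mood4)

clpm <- "
  # autoregressive (same variable) and cross-lagged (other variable) paths
  ex2   ~ ex  + mood
  mood2 ~ ex  + mood
  ex3   ~ ex2 + mood2
  mood3 ~ ex2 + mood2
  ex4   ~ ex3 + mood3
  mood4 ~ ex3 + mood3

  # residual covariances within each later wave
  ex2 ~~ mood2
  ex3 ~~ mood3
  ex4 ~~ mood4
"

fit <- sem(clpm, data = dat)

# the twelve regression paths: six autoregressive, six cross-lagged
est <- parameterEstimates(fit)
est[est$op == "~", c("lhs", "rhs", "est", "se")]
```

Each wave is regressed only on the wave immediately before it, so any association between, say, wave 1 and wave 3 has to pass through wave 2. That is itself a theoretical assumption, and exactly the kind that writing the model out makes visible.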

- measurement invariance
- other types of indicators (CCC)
- moderated nonlinear factor analysis (MNLFA)
- multi-trait multi-method models
- multi-group analysis
- multi-level SEM
- cross-lagged panel models
  - RI-CLPM
- latent class mixture models
- growth curves
    - random effects as latent variables?

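# simulated repeated-measures data: outcome y at time x, for each unit g
# keep time points up to 5 and recode so that time starts at 0
df <- junk::sim_basicrs() |> filter(x<=5) |> mutate(x=x-1)
library(lme4)
# multilevel growth model: random intercepts and random slopes of time, fit by ML
m1 <- lmer(y ~ 1 + x + (1 + x | g), df, REML=FALSE)

# reshape to wide: one row per unit, one column per time point (t0 to t4)
dfw <- 
  df |> select(x,g,y) |>
  pivot_wider(names_from=x,values_from=y,names_prefix = "t")

# latent growth model: loadings fixed to 1 (intercept) and to the time scores (slope)
mod <- "
int =~ 1*t0 + 1*t1 + 1*t2 + 1*t3 + 1*t4
slope =~ 0*t0 + 1*t1 + 2*t2 + 3*t3 + 4*t4
"
library(lavaan)
m2 <- growth(mod, data = dfw)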

summary(m2)

lavaan 0.6-20 ended normally after 36 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        10

  Number of observations                            20

Model Test User Model:
                                                      
  Test statistic                                23.400
  Degrees of freedom                                10
  P-value (Chi-square)                           0.009

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  int =~                                              
    t0                1.000                           
    t1                1.000                           
    t2                1.000                           
    t3                1.000                           
    t4                1.000                           
  slope =~                                            
    t0                0.000                           
    t1                1.000                           
    t2                2.000                           
    t3                3.000                           
    t4                4.000                           

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  int ~~                                              
    slope             1.607    0.549    2.927    0.003

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)
    int               0.848    0.391    2.169    0.030
    slope             0.950    0.262    3.619    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .t0                1.402    0.548    2.560    0.010
   .t1                0.877    0.321    2.735    0.006
   .t2                0.193    0.194    0.994    0.320
   .t3                1.494    0.592    2.525    0.012
   .t4                2.824    1.106    2.553    0.011
    int               2.353    0.982    2.397    0.017
    slope             1.208    0.437    2.762    0.006
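
To see what the fixed loadings are doing, it can help to rebuild the model-implied means and covariances by hand. The sketch below uses base R only and copies the estimates from the summary output above. The object names (`Lambda`, `alpha`, `Psi`, `Theta`, `mu`, `Sigma`) are labels chosen here for illustration, following common SEM notation, and are not objects created by lavaan.

```r
# Time scores used as the slope loadings in the model syntax
time <- 0:4

# Loading matrix: a column of 1s (intercept) and the time scores (slope)
Lambda <- cbind(int = 1, slope = time)

# Estimates copied from the summary output above
alpha <- c(int = 0.848, slope = 0.950)                 # latent means
Psi   <- matrix(c(2.353, 1.607,
                  1.607, 1.208), nrow = 2)             # latent (co)variances
Theta <- diag(c(1.402, 0.877, 0.193, 1.494, 2.824))    # residual variances

# Model-implied mean at each time point: intercept mean + slope mean * time
mu <- drop(Lambda %*% alpha)

# Model-implied covariance matrix of t0 to t4
Sigma <- Lambda %*% Psi %*% t(Lambda) + Theta

round(mu, 3)
round(Sigma, 3)
```

The loading matrix here plays the same role as the fixed-effects design matrix (a column of 1s and a column of time) in the multilevel model, which is why the two approaches can describe the same trajectories.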

fixef(m1)

(Intercept)           x 
  0.8895569   0.9972677

VarCorr(m1)

 Groups   Name        Std.Dev. Corr 
 g        (Intercept) 1.3490        
          x           1.0784   0.919
 Residual             1.1026
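
`VarCorr()` reports standard deviations and a correlation, whereas the lavaan summary reports variances and a covariance, so the two sets of numbers are easier to compare once they are on the same scale. A small sketch of that conversion, using the values printed above (the object names are chosen here for illustration):

```r
# Values copied from the VarCorr(m1) output above (standard deviation scale)
sd_int   <- 1.3490
sd_slope <- 1.0784
r        <- 0.919
sd_resid <- 1.1026

# Convert to the variance/covariance scale that lavaan reports
lmer_scale <- c(
  int_var       = sd_int^2,
  slope_var     = sd_slope^2,
  int_slope_cov = r * sd_int * sd_slope,
  resid_var     = sd_resid^2
)
round(lmer_scale, 3)
```

This gives roughly 1.820, 1.163, 1.337 and 1.216, against 2.353, 1.208 and 1.607 for the intercept variance, slope variance and their covariance in the growth model. The two fits are not expected to agree exactly here: `lmer()` estimates a single residual variance, whereas `growth()` by default estimates a separate residual variance at each time point (0.193 to 2.824 above). Constraining those five residual variances to be equal in the lavaan model would be one way to bring the two specifications closer together.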