Week 10 Exercises: Structural Equation Modelling (SEM)

You have probably heard the term “Structural Equation Modelling (SEM)” for a few weeks now, but we haven’t been very clear on what exactly it is. Is it CFA? Is it Path Analysis? In fact it is both - it is the overarching framework of which CFA and Path Analysis are just particular cases. The beauty comes in when we put the CFA and Path Analysis approaches together.

Path analysis, as we saw last week, offers a way of specifying and evaluating a structural model, in which variables relate to one another in various ways, via different (sometimes indirect) paths. Common models like our old friend multiple regression can be expressed in a Path Analysis framework.

Factor Analysis, on the other hand, brings something absolutely crucial to the table - it allows us to mitigate some of the problems which are associated with measurement error by specifying the existence of some latent variable which is measured via some observed variables. No question can perfectly measure someone’s level of “anxiety”, but if we take a set of 10 carefully chosen questions, we can consider the shared covariance between those 10 questions to represent the construct that is common between all of them (they all ask, in different ways, about “anxiety”), also modeling the unique error with which each individual question fails to perfectly represent the entire construct.

Combine them and we can reap the rewards of having both a structural model and a measurement model. The measurement model is our specification of the relationships between the items we directly observed and the latent variables of which we consider these items to be manifestations. The structural model is our specified model of the relationships between the latent variables.

Figure 1: SEM diagram. Measurement model in orange, Structural model in purple

You can’t test the structural model if the measurement model is bad

If you test the relationships between a set of latent factors, and they are not reliably measured by the observed items, then this error propagates up to influence the fit of the structural model.
To test the measurement model, it is typical to saturate the structural model (i.e., allow all the latent variables to correlate with one another). This way any misfit is due to the measurement model only.

Alternatively, we can fit individual CFA models for each construct and assess their fit (making any reasonable adjustments if necessary) prior to then fitting the full SEM.

Exercising Exercises

Dataset: tpb2

The “Theory of Planned Behaviour” is a theory about why people engage in physical activity (i.e. why people exercise).

The theory is represented in the diagram in Figure 2 (only the latent variables and not the measured items are shown). Attitudes refer to the extent to which a person has a favourable view of exercising; subjective norms refer to whether they believe others whose opinions they care about believe exercise to be a good thing; and perceived behavioural control refers to the extent to which they believe exercising is under their control. Intentions refer to whether a person intends to exercise and behaviour is a measure of the extent to which they exercised. Each construct is measured using four items, except intentions, which is measured using five (see Table 1).

Figure 2: Theory of planned behaviour (latent variables only)

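The data are available either as an .Rdata file at https://uoepsy.github.io/data/tpb2.Rdata, or as a tab-separated .txt file at https://uoepsy.github.io/data/tpb2.txt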

Table 1:

Data Dictionary for TPB data

variable question
SN1 When I think about people whose opinions matter to me, I believe they value and support regular exercise
SN2 I feel pressure from those I care about to exercise regularly
SN3 Most people who are important to me approve of my exercising
SN4 Most people like me exercise regularly
PBC1 My exercise routine is up to me and only me
PBC2 I am confident that if I want to then I can exercise regularly
PBC3 I believe I have the ability to overcome any obstacles that may prevent me from exercising regularly.
PBC4 I feel capable of sticking to a consistent exercise schedule, even when faced with challenges or distractions
attitude1 I see exercising as an enjoyable and rewarding activity.
attitude2 I believe that exercising contributes positively to my overall well-being and health.
attitude3 I view exercising as an important part of maintaining a healthy lifestyle.
attitude4 I feel energized and invigorated after engaging in physical exercise.
int1 I am determined to take concrete steps towards establishing a consistent exercise habit
int2 I intend to exercise for at least 20 minutes, three times per week for the next three months.
int3 I have made a firm decision to prioritize exercise and allocate time for it in my schedule
int4 I intend to be in shape within the next three months.
int5 I am committed to incorporating regular exercise into my weekly routine.
beh1 I currently engage in physical activity for at least 20 minutes, three times per week, as recommended.
beh2 I already allocate time for exercise in my weekly schedule and adhere to it regularly.
beh3 I track my exercise sessions and ensure I meet my weekly goals
beh4 I do not currently exercise enough
Question 1

Load in the various packages you will probably need (tidyverse, lavaan), and read in the data using the appropriate function.

We’ve given you .csv files for a long time now, but it’s good to be prepared to encounter all sorts of weird filetypes. Can you successfully read the data in from both file types?

Either one or the other of:

library(tidyverse)
library(lavaan)
load(url("https://uoepsy.github.io/data/tpb2.Rdata"))

TPB_data <- read.table("https://uoepsy.github.io/data/tpb2.txt", header = TRUE, sep = "\t")
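If the difference between the two file types is unclear, the following sketch may help. It uses a small made-up data frame (`toy`, purely hypothetical) and temporary files rather than the real data. The point it tries to illustrate is that read.table() hands back a data frame which you name yourself, whereas load() restores objects under whatever names they were saved with, and returns those names.

```r
# Hypothetical toy data standing in for the real file
toy <- data.frame(SN1 = c(3, 5, 4), SN2 = c(2, 4, 4))

# Tab-separated text: write it out, then read it back with read.table()
txt_path <- tempfile(fileext = ".txt")
write.table(toy, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
from_txt <- read.table(txt_path, header = TRUE, sep = "\t")

# .Rdata: objects are restored under the names they were saved with
rdata_path <- tempfile(fileext = ".Rdata")
save(toy, file = rdata_path)
rm(toy)
loaded_names <- load(rdata_path)  # load() returns the restored object names
print(loaded_names)
```

So after using load() on an unfamiliar .Rdata file, capturing its return value (or checking your environment) is a quick way to find out what the object is actually called.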

Question 2

Test separate one-factor models for each construct.
Are the measurement models satisfactory? (check their fit measures).

Here we specify our one factor CFA model for attitudes:

att_mod <- "
  att =~ attitude1 + attitude2 + attitude3 + attitude4
  "

And we estimate the model using cfa()

att_mod.est <- cfa(att_mod, data=TPB_data, std.lv = TRUE)

Let’s first inspect the fit measures:

fitmeasures(att_mod.est)[c("rmsea","srmr","tli","cfi")]
      rmsea        srmr         tli         cfi 
0.007237593 0.010669452 0.999299418 0.999766473 

Our fit is good: RMSEA<.05, SRMR<.05, TLI>0.95 and CFI>.95.
We should also check that all loadings are significant and \(>.30\) in absolute value.
To save space I am going to not show the entire summary output here, but just pull out the parameter estimates:

parameterestimates(att_mod.est)
        lhs op       rhs   est    se      z pvalue ci.lower ci.upper
1       att =~ attitude1 0.682 0.051 13.355      0    0.582    0.782
2       att =~ attitude2 0.617 0.045 13.656      0    0.528    0.705
3       att =~ attitude3 0.681 0.049 13.928      0    0.585    0.777
4       att =~ attitude4 0.644 0.048 13.415      0    0.550    0.738
5 attitude1 ~~ attitude1 1.097 0.069 15.883      0    0.961    1.232
6 attitude2 ~~ attitude2 0.837 0.054 15.498      0    0.731    0.943
7 attitude3 ~~ attitude3 0.959 0.063 15.121      0    0.835    1.084
8 attitude4 ~~ attitude4 0.966 0.061 15.809      0    0.847    1.086
9       att ~~       att 1.000 0.000     NA     NA    1.000    1.000

They all look good!
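Strictly speaking, the 0.30 rule of thumb is usually applied to standardised loadings, whereas with std.lv = TRUE only the latent variable is standardised (the items keep their original scale). As a rough sketch, we can convert the estimates by hand using the numbers printed above. This assumes a one-factor model with factor variance 1 and no residual covariances, so that each item's model-implied variance is its squared loading plus its residual variance.

```r
# Unstandardised estimates copied from the parameterestimates() output above
loadings  <- c(attitude1 = 0.682, attitude2 = 0.617,
               attitude3 = 0.681, attitude4 = 0.644)
resid_var <- c(attitude1 = 1.097, attitude2 = 0.837,
               attitude3 = 0.959, attitude4 = 0.966)

# With the factor variance fixed to 1, the model-implied item variance is
# loading^2 + residual variance
implied_var <- loadings^2 + resid_var

# Standardised loading = loading / model-implied item SD
std_loadings <- loadings / sqrt(implied_var)
round(std_loadings, 2)
```

All four standardised loadings come out a little above 0.5, so the conclusion does not change. In practice you would more likely ask lavaan for these directly, for example with standardizedsolution(att_mod.est) or summary(att_mod.est, standardized = TRUE).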

Following the same logic as for Attitudes, let’s fit the CFA for Subjective norms. Again, all fit measures are very good, and all loadings are significant and greater than 0.3.

sn_mod <- "
  SubjN =~ SN1 + SN2 + SN3 + SN4
  "

sn_mod.est <- cfa(sn_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(sn_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.03058032 0.01444391 0.98605752 0.99535251 
parameterestimates(sn_mod.est)
    lhs op   rhs   est    se      z pvalue ci.lower ci.upper
1 SubjN =~   SN1 0.644 0.048 13.524      0    0.550    0.737
2 SubjN =~   SN2 0.585 0.049 11.923      0    0.489    0.681
3 SubjN =~   SN3 0.584 0.044 13.266      0    0.498    0.670
4 SubjN =~   SN4 0.615 0.049 12.578      0    0.519    0.711
5   SN1 ~~   SN1 0.850 0.058 14.560      0    0.735    0.964
6   SN2 ~~   SN2 1.041 0.062 16.736      0    0.919    1.163
7   SN3 ~~   SN3 0.749 0.050 14.985      0    0.651    0.847
8   SN4 ~~   SN4 0.985 0.062 15.974      0    0.864    1.106
9 SubjN ~~ SubjN 1.000 0.000     NA     NA    1.000    1.000

All good with Perceived Behavioural Control!
Almost too good (TLI>1, and RMSEA is coming out at exactly 0!), but this is most probably because of this being fake data.
When data is simulated based on a specific model, then fitting that same model structure to the data will obviously fit extremely well!

pbc_mod <- "
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  "

pbc_mod.est <- cfa(pbc_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(pbc_mod.est)[c("rmsea","srmr","tli","cfi")]
      rmsea        srmr         tli         cfi 
0.000000000 0.003369084 1.010079110 1.000000000 
parameterestimates(pbc_mod.est)
   lhs op  rhs   est    se      z pvalue ci.lower ci.upper
1  PBC =~ PBC1 0.696 0.043 16.258      0    0.612    0.780
2  PBC =~ PBC2 0.627 0.038 16.346      0    0.551    0.702
3  PBC =~ PBC3 0.592 0.041 14.619      0    0.513    0.672
4  PBC =~ PBC4 0.676 0.045 15.058      0    0.588    0.764
5 PBC1 ~~ PBC1 0.765 0.052 14.794      0    0.663    0.866
6 PBC2 ~~ PBC2 0.609 0.041 14.677      0    0.527    0.690
7 PBC3 ~~ PBC3 0.768 0.046 16.601      0    0.678    0.859
8 PBC4 ~~ PBC4 0.919 0.057 16.184      0    0.808    1.030
9  PBC ~~  PBC 1.000 0.000     NA     NA    1.000    1.000

Uh-oh, it’s looking less good with Intentions.
The loadings all look okay, but the fit indices aren’t great: RMSEA and SRMR are both above .05, and TLI and CFI are both below .95.

int_mod <- "
  intent =~ int1 + int2 + int3 + int4 + int5
  "

int_mod.est <- cfa(int_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(int_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.14128950 0.05266866 0.84561107 0.92280553 
parameterestimates(int_mod.est)
      lhs op    rhs   est    se      z pvalue ci.lower ci.upper
1  intent =~   int1 0.698 0.043 16.363      0    0.614    0.781
2  intent =~   int2 0.801 0.035 23.173      0    0.733    0.869
3  intent =~   int3 0.684 0.039 17.407      0    0.607    0.761
4  intent =~   int4 0.868 0.038 22.746      0    0.793    0.943
5  intent =~   int5 0.581 0.037 15.518      0    0.508    0.655
6    int1 ~~   int1 1.046 0.056 18.555      0    0.936    1.157
7    int2 ~~   int2 0.487 0.036 13.665      0    0.417    0.557
8    int3 ~~   int3 0.858 0.047 18.108      0    0.765    0.951
9    int4 ~~   int4 0.614 0.043 14.145      0    0.529    0.699
10   int5 ~~   int5 0.827 0.044 18.874      0    0.741    0.913
11 intent ~~ intent 1.000 0.000     NA     NA    1.000    1.000
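Since we are applying the same four cutoffs to every model, it can be handy to wrap them in a small helper. The function below (check_fit, a hypothetical name, not part of lavaan) is just a sketch of the rule of thumb used in these exercises. It is applied here to fit values copied from the output above, but it should work equally well on the named vector returned by fitmeasures().

```r
# Hypothetical helper: TRUE means the index meets the cutoff used in these exercises
check_fit <- function(fit) {
  c(rmsea = fit[["rmsea"]] < .05,
    srmr  = fit[["srmr"]]  < .05,
    tli   = fit[["tli"]]   > .95,
    cfi   = fit[["cfi"]]   > .95)
}

# Fit measures copied from the attitudes and intentions outputs above
att_fit <- c(rmsea = 0.007237593, srmr = 0.010669452,
             tli = 0.999299418, cfi = 0.999766473)
int_fit <- c(rmsea = 0.14128950, srmr = 0.05266866,
             tli = 0.84561107, cfi = 0.92280553)

check_fit(att_fit)  # every cutoff met
check_fit(int_fit)  # none of the cutoffs met
```

Keep in mind that these cutoffs are conventions rather than hard rules, so a helper like this is a convenience and not a substitute for looking at the model.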

Let’s examine the modification indices:

modindices(int_mod.est, sort = TRUE)
    lhs op  rhs     mi    epc sepc.lv sepc.all sepc.nox
17 int2 ~~ int4 97.630  0.414   0.414    0.757    0.757
13 int1 ~~ int3 50.107  0.270   0.270    0.285    0.285
16 int2 ~~ int3 21.423 -0.159  -0.159   -0.246   -0.246
20 int3 ~~ int5 18.787  0.145   0.145    0.172    0.172
19 int3 ~~ int4 17.657 -0.158  -0.158   -0.217   -0.217
12 int1 ~~ int2 16.578 -0.148  -0.148   -0.207   -0.207
14 int1 ~~ int4 10.596 -0.129  -0.129   -0.161   -0.161
21 int4 ~~ int5 10.532 -0.111  -0.111   -0.156   -0.156
15 int1 ~~ int5  4.521  0.077   0.077    0.083    0.083
18 int2 ~~ int5  3.438 -0.058  -0.058   -0.091   -0.091

It looks like correlating the residuals for items int2 and int4 would improve our model. The expected correlation is 0.757, which is fairly large (remember correlations are between -1 and 1).
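Where does the 0.757 come from? The numbers in the output are consistent with the expected parameter change (epc) for the residual covariance being divided by the product of the two residual standard deviations from the current model. The sketch below reproduces that by hand; treat it as a check on the printed values rather than a statement of exactly how lavaan computes sepc.all internally.

```r
# Values copied from the modindices() and parameterestimates() output above
epc            <- 0.414  # expected residual covariance between int2 and int4
resid_var_int2 <- 0.487  # int2 ~~ int2
resid_var_int4 <- 0.614  # int4 ~~ int4

# Covariance -> correlation: divide by the product of the two (residual) SDs
expected_resid_cor <- epc / sqrt(resid_var_int2 * resid_var_int4)
round(expected_resid_cor, 3)
```

This gives 0.757, matching the sepc.all column. The same calculation for int1 and int3 (0.270 with residual variances 1.046 and 0.858) gives roughly 0.285, again matching the table.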

Note that the items have a possible theoretical link too, beyond just “intention to exercise”. It looks like both int2 and int4 are specifically about intentions in the next three months. It might make sense that responses to these two items are related more than just representing general ‘intention’.

When we include this covariance, our model fit looks much better!

int_mod <- "
  intent =~ int1 + int2 + int3 + int4 + int5
  int2 ~~ int4
  "

int_mod.est <- cfa(int_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(int_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.02753383 0.01475947 0.99413687 0.99765475 
parameterestimates(int_mod.est)
      lhs op    rhs   est    se      z pvalue ci.lower ci.upper
1  intent =~   int1 0.795 0.044 18.018      0    0.708    0.881
2  intent =~   int2 0.633 0.039 16.341      0    0.557    0.709
3  intent =~   int3 0.800 0.041 19.607      0    0.720    0.880
4  intent =~   int4 0.682 0.043 15.916      0    0.598    0.766
5  intent =~   int5 0.629 0.039 16.205      0    0.553    0.705
6    int2 ~~   int4 0.343 0.039  8.748      0    0.266    0.419
7    int1 ~~   int1 0.902 0.057 15.734      0    0.789    1.014
8    int2 ~~   int2 0.727 0.044 16.589      0    0.641    0.813
9    int3 ~~   int3 0.686 0.049 13.927      0    0.589    0.782
10   int4 ~~   int4 0.902 0.054 16.827      0    0.797    1.007
11   int5 ~~   int5 0.769 0.045 17.201      0    0.681    0.856
12 intent ~~ intent 1.000 0.000     NA     NA    1.000    1.000

Finally, the behaviour model looks absolutely fine.
Note that beh4 has a negative loading, which is perfectly okay. In fact, if you look at the items, you’ll notice that this is the only item that is reversed (higher scores on the item reflect less exercising).

beh_mod <- "
  behav =~ beh1 + beh2 + beh3 + beh4
  "

beh_mod.est <- cfa(beh_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(beh_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.02896260 0.01285583 0.99191922 0.99730641 
parameterestimates(beh_mod.est)
    lhs op   rhs    est    se       z pvalue ci.lower ci.upper
1 behav =~  beh1  0.659 0.045  14.593      0    0.571    0.748
2 behav =~  beh2  0.735 0.045  16.275      0    0.647    0.824
3 behav =~  beh3  0.787 0.045  17.331      0    0.698    0.875
4 behav =~  beh4 -0.724 0.046 -15.626      0   -0.815   -0.633
5  beh1 ~~  beh1  0.987 0.058  16.962      0    0.873    1.101
6  beh2 ~~  beh2  0.889 0.058  15.312      0    0.775    1.003
7  beh3 ~~  beh3  0.819 0.059  13.908      0    0.703    0.934
8  beh4 ~~  beh4  0.980 0.061  16.029      0    0.860    1.099
9 behav ~~ behav  1.000 0.000      NA     NA    1.000    1.000
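We don’t need to do anything about the negative loading, but if you prefer all loadings to point in the same direction, one option is to reverse-score the item before fitting the model. The sketch below uses made-up responses on an assumed 1 to 5 scale (the actual response scale of the items is not stated here, so scale_min and scale_max are assumptions you would need to change).

```r
set.seed(1)

# Assumed response scale (hypothetical; check the real scale before using this)
scale_min <- 1
scale_max <- 5

# Made-up responses to a reversed item like beh4
beh4_toy <- sample(scale_min:scale_max, size = 20, replace = TRUE)

# Reverse-score: the lowest response becomes the highest and vice versa
beh4_reversed <- (scale_max + scale_min) - beh4_toy

# The reversed item is a perfect negative linear function of the original
cor(beh4_toy, beh4_reversed)
```

Reverse-scoring only flips the sign of that item’s loading; the model fit and the size of the loading stay the same.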

Question 3

Using lavaan syntax, specify the full structural equation model that corresponds to the model in Figure 2. For each construct use the measurement models from the previous question.

This involves specifying the measurement models for all the latent variables, and then also specifying the relationships between those latent variables. All in the same model!

TPB_model<-'
  # measurement models  
  att =~ attitude1 + attitude2 + attitude3 + attitude4
  SN =~ SN1 + SN2 + SN3 + SN4
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  intent =~ int1 + int2 + int3 + int4 + int5
  beh =~ beh1 + beh2 + beh3 + beh4
  
  # covariances between items
  int2 ~~ int4

  # regressions  
  beh ~ intent + PBC
  intent ~ att + SN + PBC

  # covariances between attitudes, SN, and PBC
  att ~~ SN    
  att ~~ PBC
  SN ~~ PBC
'
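A useful sanity check on a model like this is to count its parameters and degrees of freedom by hand. The sketch below assumes only the covariance structure is modelled (no means). The count of 21 for loadings and latent variances is the same under either scaling method: with latent variances fixed to 1, all 21 loadings are free; with the first loading of each factor fixed to 1, there are 16 free loadings plus 5 free latent (residual) variances.

```r
# Number of observed items per construct: att, SN, PBC, intent, beh
n_items <- 4 + 4 + 4 + 5 + 4

# Unique variances and covariances among the observed items
n_moments <- n_items * (n_items + 1) / 2

# Free parameters in TPB_model
n_par <- c(
  loadings_and_latent_var = 21,  # see the note on scaling above
  item_resid_var          = 21,  # one residual variance per item
  item_resid_cov          = 1,   # int2 ~~ int4
  regressions             = 5,   # beh ~ intent + PBC; intent ~ att + SN + PBC
  latent_cov              = 3    # att ~~ SN, att ~~ PBC, SN ~~ PBC
)

total_par <- sum(n_par)
model_df  <- n_moments - total_par
c(parameters = total_par, df = model_df)
```

This gives 51 parameters and 180 degrees of freedom, which is what lavaan reports in the summary output for Question 4. Allowing all five latent variables to covary freely instead would need 10 parameters among the latent variables rather than 8, so only 2 of the 180 degrees of freedom come from the structural part of the model (the absent direct paths from attitudes and subjective norms to behaviour).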

Question 4

Estimate and evaluate the model

  • Does the model fit well?
  • Are the hypothesised paths significant?

We can estimate the model using the sem() function.
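As with cfa(), by default the sem() function will scale the latent variables by fixing the loading of the first item for each latent variable to 1. Here we instead set std.lv=TRUE, as we did for the CFAs, so that the latent variables are scaled by fixing their variances to 1 (for intent and beh, which are predicted by other latent variables, it is their residual variances that are fixed to 1).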

TPB_model.est<-sem(TPB_model, data=TPB_data, std.lv=TRUE)

fitmeasures(TPB_model.est)[c("rmsea","srmr","tli","cfi")]
    rmsea      srmr       tli       cfi 
0.0108428 0.0268991 0.9935594 0.9944795 

We can see that the model fits well according to RMSEA, SRMR, TLI and CFI.
From the output below, all of the hypothesised paths in the theory of planned behaviour are statistically significant.

summary(TPB_model.est, standardized=TRUE)
lavaan 0.6-18 ended normally after 21 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        51

  Number of observations                           890

Model Test User Model:
                                                      
  Test statistic                               198.834
  Degrees of freedom                               180
  P-value (Chi-square)                           0.160

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  att =~                                                                
    attitude1         0.687    0.050   13.854    0.000    0.687    0.550
    attitude2         0.614    0.044   14.018    0.000    0.614    0.557
    attitude3         0.662    0.047   13.977    0.000    0.662    0.555
    attitude4         0.660    0.047   14.140    0.000    0.660    0.561
  SN =~                                                                 
    SN1               0.644    0.045   14.213    0.000    0.644    0.572
    SN2               0.595    0.047   12.573    0.000    0.595    0.505
    SN3               0.573    0.042   13.637    0.000    0.573    0.548
    SN4               0.620    0.047   13.202    0.000    0.620    0.531
  PBC =~                                                                
    PBC1              0.687    0.041   16.555    0.000    0.687    0.615
    PBC2              0.617    0.037   16.606    0.000    0.617    0.616
    PBC3              0.608    0.039   15.412    0.000    0.608    0.575
    PBC4              0.681    0.044   15.581    0.000    0.681    0.581
  intent =~                                                             
    int1              0.648    0.039   16.793    0.000    0.777    0.628
    int2              0.554    0.034   16.519    0.000    0.664    0.626
    int3              0.643    0.036   17.876    0.000    0.771    0.670
    int4              0.586    0.037   15.818    0.000    0.703    0.601
    int5              0.534    0.034   15.879    0.000    0.641    0.594
  beh =~                                                                
    beh1              0.556    0.038   14.461    0.000    0.678    0.568
    beh2              0.588    0.039   15.203    0.000    0.718    0.600
    beh3              0.635    0.039   16.208    0.000    0.775    0.646
    beh4             -0.604    0.040  -15.220    0.000   -0.737   -0.601

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  beh ~                                                                 
    intent            0.465    0.057    8.166    0.000    0.457    0.457
    PBC               0.251    0.063    3.967    0.000    0.206    0.206
  intent ~                                                              
    att               0.242    0.061    3.986    0.000    0.202    0.202
    SN                0.335    0.064    5.233    0.000    0.279    0.279
    PBC               0.338    0.059    5.726    0.000    0.282    0.282

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .int2 ~~                                                               
   .int4              0.307    0.037    8.403    0.000    0.307    0.397
  att ~~                                                                
    SN                0.316    0.050    6.300    0.000    0.316    0.316
    PBC               0.245    0.049    5.056    0.000    0.245    0.245
  SN ~~                                                                 
    PBC               0.275    0.049    5.628    0.000    0.275    0.275

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .attitude1         1.089    0.067   16.231    0.000    1.089    0.697
   .attitude2         0.840    0.052   16.057    0.000    0.840    0.690
   .attitude3         0.985    0.061   16.101    0.000    0.985    0.692
   .attitude4         0.946    0.059   15.925    0.000    0.946    0.685
   .SN1               0.850    0.055   15.385    0.000    0.850    0.672
   .SN2               1.030    0.060   17.099    0.000    1.030    0.745
   .SN3               0.762    0.047   16.061    0.000    0.762    0.699
   .SN4               0.979    0.059   16.516    0.000    0.979    0.718
   .PBC1              0.777    0.050   15.682    0.000    0.777    0.622
   .PBC2              0.621    0.040   15.630    0.000    0.621    0.620
   .PBC3              0.750    0.045   16.715    0.000    0.750    0.670
   .PBC4              0.912    0.055   16.576    0.000    0.912    0.663
   .int1              0.929    0.055   16.918    0.000    0.929    0.606
   .int2              0.687    0.041   16.675    0.000    0.687    0.609
   .int3              0.730    0.046   15.843    0.000    0.730    0.551
   .int4              0.873    0.051   17.115    0.000    0.873    0.639
   .int5              0.754    0.043   17.618    0.000    0.754    0.648
   .beh1              0.962    0.056   17.226    0.000    0.962    0.677
   .beh2              0.915    0.055   16.517    0.000    0.915    0.640
   .beh3              0.837    0.055   15.238    0.000    0.837    0.582
   .beh4              0.961    0.058   16.498    0.000    0.961    0.639
    att               1.000                               1.000    1.000
    SN                1.000                               1.000    1.000
    PBC               1.000                               1.000    1.000
   .intent            1.000                               0.695    0.695
   .beh               1.000                               0.672    0.672
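To see how the RMSEA relates to the chi-square test in this output, here is a rough sketch that recomputes it by hand. It assumes one common form of the formula, with the sample size N in the denominator (some software uses N - 1 instead); rmsea_from_chisq is a hypothetical helper name, not a lavaan function.

```r
# Hypothetical helper: RMSEA from a chi-square statistic, its df, and sample size
rmsea_from_chisq <- function(chisq, df, n) {
  # Misfit beyond what is expected by chance (chisq - df), truncated at zero
  sqrt(max(chisq - df, 0) / (df * n))
}

# Values copied from the summary() output above
rmsea <- rmsea_from_chisq(chisq = 198.834, df = 180, n = 890)
round(rmsea, 4)

# If the chi-square is smaller than its df, RMSEA is exactly 0
# (the chi-square of 1.2 on 2 df here is made up for illustration)
rmsea_from_chisq(chisq = 1.2, df = 2, n = 890)
```

The first value agrees with the RMSEA of about 0.0108 returned by fitmeasures(). The second call shows why an RMSEA of exactly 0 can appear: it happens whenever the chi-square statistic is below its degrees of freedom.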

Question 5

Examine the modification indices and expected parameter changes - are there any additional parameters you would consider including?

modindices(TPB_model.est, sort = TRUE) |> head()
     lhs op  rhs     mi    epc sepc.lv sepc.all sepc.nox
316 int1 ~~ int3 12.167  0.143   0.143    0.174    0.174
103  PBC =~ int5 11.028  0.151   0.151    0.140    0.140
303 PBC3 ~~ beh2  8.351  0.095   0.095    0.114    0.114
336 int4 ~~ beh1  7.886  0.087   0.087    0.095    0.095
291 PBC2 ~~ int5  7.847  0.076   0.076    0.111    0.111
263  SN4 ~~ PBC4  6.978 -0.098  -0.098   -0.104   -0.104

In this case, none of the expected parameter changes are large enough that we would consider including any additional parameters.

Question 6

Test the indirect effect of attitudes, subjective norms, and perceived behavioural control on behaviour via intentions.

Remember, when you fit the model with sem(), use se='bootstrap' to get bootstrapped standard errors (it may take a few minutes). When you inspect the model using summary(), get the 95% confidence intervals for parameters with ci = TRUE.

First, let’s name the paths in the structural equation model:

To test these indirect effects we create a new parameter for each indirect effect:

TPB_model2 <- '
  # measurement models  
  att =~ attitude1 + attitude2 + attitude3 + attitude4
  SN =~ SN1 + SN2 + SN3 + SN4
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  intent =~ int1 + int2 + int3 + int4 + int5
  beh =~ beh1 + beh2 + beh3 + beh4
  
  # covariances between items
  int2 ~~ int4

  # regressions  
  beh ~ b*intent + PBC
  intent ~ a1*att + a2*SN + a3*PBC

  # covariances between attitudes, SN, and PBC
  att ~~ SN    
  att ~~ PBC
  SN ~~ PBC

  # indirect effects:  
  ind1 := a1*b  #indirect effect of attitudes via intentions
  ind2 := a2*b  #indirect effect of SN via intentions
  ind3 := a3*b  #indirect effect of PBC via intentions
'

When we estimate the model, we request bootstrapped standard errors:

TPB_model2.est<-sem(TPB_model2, std.lv=TRUE, se='bootstrap', data=TPB_data)
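If it is unclear what bootstrapping is doing here, the sketch below shows the general idea in a deliberately simplified setting. It is not what lavaan does internally: it uses simulated observed variables (x, m, y, all made up) and ordinary regressions via lm() instead of a latent variable model, and it builds a percentile interval for the indirect effect a*b by refitting the regressions to 1000 resamples of the rows.

```r
set.seed(123)

# Simulated data with a known mediation structure (hypothetical values)
n <- 300
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)            # a path: x -> m
y <- 0.5 * m + 0.2 * x + rnorm(n)  # b path: m -> y, plus a direct effect of x
dat <- data.frame(x, m, y)

# Indirect effect = (effect of x on m) * (effect of m on y, controlling for x)
indirect <- function(d) {
  a <- coef(lm(m ~ x, data = d))[["x"]]
  b <- coef(lm(y ~ m + x, data = d))[["m"]]
  a * b
}

# Resample rows with replacement, re-estimate the indirect effect each time
boot_est <- replicate(1000, {
  resampled <- dat[sample(nrow(dat), replace = TRUE), ]
  indirect(resampled)
})

# Percentile 95% interval: the middle 95% of the bootstrap estimates
ci <- quantile(boot_est, c(.025, .975))
ci
```

The reason for bootstrapping is that a product of two coefficients generally does not have a symmetric, normal sampling distribution, so an interval built from the resampled estimates tends to be more trustworthy than estimate ± 1.96 × SE.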

When we inspect the model, we request the 95% confidence intervals for parameters:

summary(TPB_model2.est, ci=TRUE)
lavaan 0.6.17 ended normally after 21 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        51

  Number of observations                           890

Model Test User Model:
                                                      
  Test statistic                               198.834
  Degrees of freedom                               180
  P-value (Chi-square)                           0.160

Parameter Estimates:

  Standard errors                            Bootstrap
  Number of requested bootstrap draws             1000
  Number of successful bootstrap draws            1000

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
  att =~                                                                
    attitude1         0.687    0.049   14.147    0.000    0.585    0.780
    attitude2         0.614    0.044   13.899    0.000    0.528    0.700
    attitude3         0.662    0.046   14.327    0.000    0.567    0.745
    attitude4         0.660    0.045   14.639    0.000    0.566    0.745
  SN =~                                                                 
    SN1               0.644    0.046   13.946    0.000    0.552    0.734
    SN2               0.595    0.046   12.962    0.000    0.501    0.687
    SN3               0.573    0.044   12.955    0.000    0.487    0.657
    SN4               0.620    0.048   13.001    0.000    0.524    0.717
  PBC =~                                                                
    PBC1              0.687    0.039   17.604    0.000    0.614    0.765
    PBC2              0.617    0.036   17.152    0.000    0.540    0.683
    PBC3              0.608    0.040   15.052    0.000    0.524    0.687
    PBC4              0.681    0.043   15.872    0.000    0.594    0.761
  intent =~                                                             
    int1              0.648    0.040   16.116    0.000    0.571    0.728
    int2              0.554    0.033   16.651    0.000    0.485    0.615
    int3              0.643    0.038   16.874    0.000    0.566    0.715
    int4              0.586    0.037   15.918    0.000    0.515    0.655
    int5              0.534    0.031   17.381    0.000    0.472    0.594
  beh =~                                                                
    beh1              0.556    0.039   14.358    0.000    0.477    0.628
    beh2              0.588    0.039   15.130    0.000    0.512    0.662
    beh3              0.635    0.039   16.230    0.000    0.558    0.709
    beh4             -0.604    0.040  -15.181    0.000   -0.684   -0.528

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
  beh ~                                                                 
    intent     (b)    0.465    0.056    8.242    0.000    0.354    0.576
    PBC               0.251    0.066    3.794    0.000    0.126    0.383
  intent ~                                                              
    att       (a1)    0.242    0.066    3.672    0.000    0.116    0.369
    SN        (a2)    0.335    0.064    5.196    0.000    0.213    0.476
    PBC       (a3)    0.338    0.062    5.483    0.000    0.228    0.468

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
 .int2 ~~                                                               
   .int4              0.307    0.038    8.153    0.000    0.233    0.383
  att ~~                                                                
    SN                0.316    0.050    6.313    0.000    0.216    0.412
    PBC               0.245    0.050    4.942    0.000    0.148    0.342
  SN ~~                                                                 
    PBC               0.275    0.049    5.563    0.000    0.175    0.370

Variances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
   .attitude1         1.089    0.062   17.481    0.000    0.966    1.204
   .attitude2         0.840    0.052   16.198    0.000    0.741    0.940
   .attitude3         0.985    0.060   16.344    0.000    0.862    1.098
   .attitude4         0.946    0.062   15.353    0.000    0.817    1.061
   .SN1               0.850    0.055   15.489    0.000    0.742    0.954
   .SN2               1.030    0.059   17.516    0.000    0.914    1.148
   .SN3               0.762    0.051   15.082    0.000    0.671    0.858
   .SN4               0.979    0.060   16.353    0.000    0.860    1.100
   .PBC1              0.777    0.049   15.821    0.000    0.679    0.873
   .PBC2              0.621    0.037   16.567    0.000    0.547    0.693
   .PBC3              0.750    0.042   17.901    0.000    0.671    0.831
   .PBC4              0.912    0.052   17.498    0.000    0.811    1.015
   .int1              0.929    0.053   17.564    0.000    0.822    1.030
   .int2              0.687    0.042   16.534    0.000    0.600    0.771
   .int3              0.730    0.050   14.687    0.000    0.628    0.830
   .int4              0.873    0.050   17.561    0.000    0.772    0.970
   .int5              0.754    0.042   18.084    0.000    0.670    0.836
   .beh1              0.962    0.053   18.042    0.000    0.857    1.071
   .beh2              0.915    0.057   16.080    0.000    0.802    1.035
   .beh3              0.837    0.057   14.574    0.000    0.717    0.947
   .beh4              0.961    0.061   15.767    0.000    0.837    1.085
    att               1.000                               1.000    1.000
    SN                1.000                               1.000    1.000
    PBC               1.000                               1.000    1.000
   .intent            1.000                               1.000    1.000
   .beh               1.000                               1.000    1.000

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
    ind1              0.113    0.032    3.478    0.001    0.055    0.185
    ind2              0.156    0.035    4.470    0.000    0.091    0.229
    ind3              0.157    0.034    4.685    0.000    0.099    0.231

We can see that all of the indirect effects are statistically significant at p<.05 as none of the 95% confidence intervals for the coefficients include zero.
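As a quick check on what the defined parameters represent, we can multiply the relevant path estimates by hand using the values from the Regressions section of the output above. This is only arithmetic on the printed (rounded) estimates, so tiny discrepancies in the last decimal place would not be surprising.

```r
# Path estimates copied from the summary() output above
a <- c(att = 0.242, SN = 0.335, PBC = 0.338)  # predictors -> intentions (a1, a2, a3)
b <- 0.465                                    # intentions -> behaviour (b)

# Each indirect effect is the product of the two paths it travels along
indirect <- a * b
round(indirect, 3)
```

These come out as 0.113, 0.156 and 0.157, matching ind1, ind2 and ind3 in the Defined Parameters section.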

Question 7

Write up your analysis as if you were presenting the work in an academic paper, with brief separate ‘Method’ and ‘Results’ sections.

Method

We tested a theory of planned behaviour model of physical activity by fitting a structural equation model in which attitudes, subjective norms, perceived behavioural control, intentions and behaviour were latent variables, each defined by four items (five in the case of intentions). We first tested the measurement models for each construct by fitting a one-factor CFA model. Latent variables were scaled by fixing their variances to 1 (in the full structural equation model, the residual variances of intentions and behaviour were fixed to 1).

Within the SEM, behaviour was regressed on intentions and perceived behavioural control, and intentions were regressed on attitudes, subjective norms, and perceived behavioural control. In addition, attitudes, subjective norms, and perceived behavioural control were allowed to covary. The indirect effects of attitudes, subjective norms and perceived behavioural control on behaviour were calculated as the product of the effect of the relevant predictor on the mediator (intentions) and the effect of the mediator on the outcome. The statistical significance of the indirect effects was evaluated using bootstrapped 95% confidence intervals with 1000 resamples.

In all cases models were fit using maximum likelihood estimation and model fit was judged to be good if CFI and TLI were \(>.95\) and RMSEA and SRMR were \(<.05\). Modification indices and expected parameter changes were inspected to identify any areas of local mis-fit but model modifications were only made if they could be justified on substantive grounds.

Results

All measurement models fit well (CFI and TLI \(>.95\) and RMSEA and SRMR \(<.05\)) with the exception of the measurement model for intentions. Modification indices suggested the inclusion of residual covariance between two items on the intentions scale (int2 and int4) that both made specific reference to short term intentions. The addition of this parameter resulted in a good fit. The full structural equation model fit well (CFI = 0.99, TLI = 0.99, RMSEA = 0.01, SRMR = 0.03). Unstandardised parameter estimates are provided in Table 2. All of the hypothesised paths were statistically significant at \(p<.05\). The significant indirect effects suggested that intentions mediate the effects of attitudes, subjective norms, and perceived behavioural control on behaviour whilst perceived behavioural control also has a direct effect on behaviour. Results thus provide support for a theory of planned behaviour model of physical activity.

Table 2:

Unstandardised parameter estimates for structural equation model for a theory of planned behaviour model of physical activity. Note: PBC = Perceived Behavioural Control, CI = Confidence Interval

Parameter Estimate SE z p 95% CI
Loadings
Attitudes attitude1 0.69 0.05 14.15 <0.001 [0.59, 0.78]
attitude2 0.61 0.04 13.90 <0.001 [0.53, 0.70]
attitude3 0.66 0.05 14.33 <0.001 [0.57, 0.75]
attitude4 0.66 0.05 14.64 <0.001 [0.57, 0.75]
Subjective Norms SN1 0.64 0.05 13.95 <0.001 [0.55, 0.73]
SN2 0.59 0.05 12.96 <0.001 [0.50, 0.69]
SN3 0.57 0.04 12.95 <0.001 [0.49, 0.66]
SN4 0.62 0.05 13.00 <0.001 [0.52, 0.72]
PBC PBC1 0.69 0.04 17.60 <0.001 [0.61, 0.77]
PBC2 0.62 0.04 17.15 <0.001 [0.54, 0.68]
PBC3 0.61 0.04 15.05 <0.001 [0.52, 0.69]
PBC4 0.68 0.04 15.87 <0.001 [0.59, 0.76]
Intentions int1 0.65 0.04 16.12 <0.001 [0.57, 0.73]
int2 0.55 0.03 16.65 <0.001 [0.49, 0.62]
int3 0.64 0.04 16.87 <0.001 [0.57, 0.72]
int4 0.59 0.04 15.92 <0.001 [0.51, 0.66]
int5 0.53 0.03 17.38 <0.001 [0.47, 0.59]
Behaviours beh1 0.56 0.04 14.36 <0.001 [0.48, 0.63]
beh2 0.59 0.04 15.13 <0.001 [0.51, 0.66]
beh3 0.64 0.04 16.23 <0.001 [0.56, 0.71]
beh4 -0.60 0.04 -15.18 <0.001 [-0.68, -0.53]
Covariances
int2 with int4 0.31 0.04 8.15 <0.001 [0.23, 0.38]
Attitudes with Subjective Norms 0.32 0.05 6.31 <0.001 [0.22, 0.41]
Attitudes with PBC 0.25 0.05 4.94 <0.001 [0.15, 0.34]
Subjective Norms with PBC 0.27 0.05 5.56 <0.001 [0.18, 0.37]
Regressions
Behaviours on Intentions 0.47 0.06 8.24 <0.001 [0.35, 0.58]
Behaviours on PBC 0.25 0.07 3.79 <0.001 [0.13, 0.38]
Intentions on Attitudes 0.24 0.07 3.67 <0.001 [0.12, 0.37]
Intentions on Subjective Norms 0.33 0.06 5.20 <0.001 [0.21, 0.48]
Intentions on PBC 0.34 0.06 5.48 <0.001 [0.23, 0.47]
Indirect effects
Attitudes via Intentions 0.11 0.03 3.48 0.001 [0.05, 0.19]
Subjective Norms via Intentions 0.16 0.03 4.47 <0.001 [0.09, 0.23]
PBC via Intentions 0.16 0.03 4.68 <0.001 [0.10, 0.23]