Week 10 Exercises: Structural Equation Modelling (SEM)

Exercising Exercises

Dataset: tpb2

The “Theory of Planned Behaviour” is a theory about why people engage in certain behaviours. It has been applied in many contexts, and here we are testing the theory as a model of why people exercise.

The theory is represented in the diagram in Figure 1 (only the latent variables and not the measured items are shown). Attitudes refer to the extent to which a person has a favourable view of exercising; subjective norms refer to whether they believe others whose opinions they care about believe exercise to be a good thing; and perceived behavioural control refers to the extent to which they believe exercising is under their control. Intentions refer to whether a person intends to exercise and behaviour is a measure of the extent to which they exercised. Each construct is measured using four items, except intentions, which is measured using five.

Figure 1: Theory of planned behaviour (latent variables only)

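The data are available either as an .Rdata file at https://uoepsy.github.io/data/tpb2.Rdata or as a tab-separated .txt file at https://uoepsy.github.io/data/tpb2.txt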

Table 1: Data Dictionary for TPB data
variable question
SN1 When I think about people whose opinions matter to me, I believe they value and support regular exercise
SN2 I feel pressure from those I care about to exercise regularly
SN3 Most people who are important to me approve of my exercising
SN4 Most people like me exercise regularly
PBC1 My exercise routine is up to me and only me
PBC2 I am confident that if I want to then I can exercise regularly
PBC3 I believe I have the ability to overcome any obstacles that may prevent me from exercising regularly.
PBC4 I feel capable of sticking to a consistent exercise schedule, even when faced with challenges or distractions
attitude1 I see exercising as an enjoyable and rewarding activity.
attitude2 I believe that exercising contributes positively to my overall well-being and health.
attitude3 I view exercising as an important part of maintaining a healthy lifestyle.
attitude4 I feel energized and invigorated after engaging in physical exercise.
int1 I am determined to take concrete steps towards establishing a consistent exercise habit
int2 I intend to exercise for at least 20 minutes, three times per week for the next three months.
int3 I have made a firm decision to prioritize exercise and allocate time for it in my schedule
int4 I intend to be in shape within the next three months.
int5 I am committed to incorporating regular exercise into my weekly routine.
beh1 I currently engage in physical activity for at least 20 minutes, three times per week, as recommended.
beh2 I already allocate time for exercise in my weekly schedule and adhere to it regularly.
beh3 I track my exercise sessions and ensure I meet my weekly goals
beh4 I do not currently exercise enough

Question 1

Load in the various packages you will probably need (tidyverse, lavaan), and read in the data using the appropriate function.

We’ve given you .csv files for a long time now, but it’s good to be prepared to encounter all sorts of weird filetypes. Can you successfully read in the data from both file types?

Either one or the other of:

library(tidyverse)
library(lavaan)
load(url("https://uoepsy.github.io/data/tpb2.Rdata"))

TPB_data <- read.table("https://uoepsy.github.io/data/tpb2.txt", header = TRUE, sep = "\t")
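One thing worth knowing when moving between these two file types: load() restores objects under the names they were saved with, whereas read.table() returns a data frame that you name yourself. The sketch below illustrates this with a small made-up data frame written to temporary files. The object toy and its values are invented for illustration and are not part of the tpb2 data.

```r
# Toy data frame standing in for the real data (hypothetical values)
toy <- data.frame(SN1 = c(1, 2, 3), SN2 = c(3, 2, 1))

# .Rdata: save() stores named objects; load() restores them under the same names
rdata_path <- tempfile(fileext = ".Rdata")
save(toy, file = rdata_path)
env <- new.env()
loaded_names <- load(rdata_path, envir = env)  # returns the names of the restored objects
loaded_names

# Tab-separated text: read.table() with header = TRUE and sep = "\t"
txt_path <- tempfile(fileext = ".txt")
write.table(toy, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)
toy_txt <- read.table(txt_path, header = TRUE, sep = "\t")
head(toy_txt)
```

So after load() it is worth checking what has actually appeared in your environment (the value returned by load(), or ls()), because the object name is decided by whoever saved the file.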

Question 2

Before we test the theory of planned behaviour, we want to think about the measurement models for each of the constructs we are trying to capture.

Test separate one-factor models for each construct.
Are the measurement models satisfactory? (check their fit measures).

This isn’t anything new - this is just back to cfa()! So all the same as in the CFA reading, only we need to do it five times over.

Here we specify our one factor CFA model for attitudes:

att_mod <- "
  att =~ attitude1 + attitude2 + attitude3 + attitude4
  "

And we estimate the model using cfa()

att_mod.est <- cfa(att_mod, data=TPB_data, std.lv = TRUE)

Let’s first inspect the fit measures:

fitmeasures(att_mod.est)[c("rmsea","srmr","tli","cfi")]
      rmsea        srmr         tli         cfi 
0.007237593 0.010669452 0.999299418 0.999766473 

Our fit is good: RMSEA<.05, SRMR<.05, TLI>0.95 and CFI>.95.
We should also check that all loadings are significant and \(>|.30|\).
To save space I am not going to show the entire summary output here, but will just pull out the parameter estimates:

parameterestimates(att_mod.est)
        lhs op       rhs   est    se      z pvalue ci.lower ci.upper
1       att =~ attitude1 0.682 0.051 13.355      0    0.582    0.782
2       att =~ attitude2 0.617 0.045 13.656      0    0.528    0.705
3       att =~ attitude3 0.681 0.049 13.928      0    0.585    0.777
4       att =~ attitude4 0.644 0.048 13.415      0    0.550    0.738
5 attitude1 ~~ attitude1 1.097 0.069 15.883      0    0.961    1.232
6 attitude2 ~~ attitude2 0.837 0.054 15.498      0    0.731    0.943
7 attitude3 ~~ attitude3 0.959 0.063 15.121      0    0.835    1.084
8 attitude4 ~~ attitude4 0.966 0.061 15.809      0    0.847    1.086
9       att ~~       att 1.000 0.000     NA     NA    1.000    1.000

They all look good!
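A small caveat: with std.lv = TRUE the factor is standardised but the items are not, so the estimates above are in the units of each item. The |.30| rule of thumb is usually applied to standardised loadings. As a rough check, assuming a one-factor model with the factor variance fixed to 1, we can work them out by hand from the numbers printed above (in practice standardizedsolution(att_mod.est) reports them directly).

```r
# Unstandardised estimates copied from the parameterestimates() output above
loading   <- c(attitude1 = 0.682, attitude2 = 0.617, attitude3 = 0.681, attitude4 = 0.644)
resid_var <- c(attitude1 = 1.097, attitude2 = 0.837, attitude3 = 0.959, attitude4 = 0.966)

# With the factor variance fixed to 1, the model-implied item variance is
# loading^2 + residual variance, so the standardised loading is:
std_loading <- loading / sqrt(loading^2 + resid_var)
round(std_loading, 3)

# Check each one against the |.30| rule of thumb
abs(std_loading) > 0.30
```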

Following the same logic as for the Attitudes, let’s fit the CFA for Subjective norms. Again, all fit measures are very good, and all loadings are significant and greater than 0.3.

sn_mod <- "
  SubjN =~ SN1 + SN2 + SN3 + SN4
  "

sn_mod.est <- cfa(sn_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(sn_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.03058032 0.01444391 0.98605752 0.99535251 
parameterestimates(sn_mod.est)
    lhs op   rhs   est    se      z pvalue ci.lower ci.upper
1 SubjN =~   SN1 0.644 0.048 13.524      0    0.550    0.737
2 SubjN =~   SN2 0.585 0.049 11.923      0    0.489    0.681
3 SubjN =~   SN3 0.584 0.044 13.266      0    0.498    0.670
4 SubjN =~   SN4 0.615 0.049 12.578      0    0.519    0.711
5   SN1 ~~   SN1 0.850 0.058 14.560      0    0.735    0.964
6   SN2 ~~   SN2 1.041 0.062 16.736      0    0.919    1.163
7   SN3 ~~   SN3 0.749 0.050 14.985      0    0.651    0.847
8   SN4 ~~   SN4 0.985 0.062 15.974      0    0.864    1.106
9 SubjN ~~ SubjN 1.000 0.000     NA     NA    1.000    1.000

All good with Perceived Behavioural Control!
Almost too good (TLI>1, and RMSEA is coming out at exactly 0!), but this is most probably because this is fake data.
When data is simulated based on a specific model, then fitting that same model structure to the data will obviously fit extremely well!

pbc_mod <- "
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  "

pbc_mod.est <- cfa(pbc_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(pbc_mod.est)[c("rmsea","srmr","tli","cfi")]
      rmsea        srmr         tli         cfi 
0.000000000 0.003369084 1.010079110 1.000000000 
parameterestimates(pbc_mod.est)
   lhs op  rhs   est    se      z pvalue ci.lower ci.upper
1  PBC =~ PBC1 0.696 0.043 16.258      0    0.612    0.780
2  PBC =~ PBC2 0.627 0.038 16.346      0    0.551    0.702
3  PBC =~ PBC3 0.592 0.041 14.619      0    0.513    0.672
4  PBC =~ PBC4 0.676 0.045 15.058      0    0.588    0.764
5 PBC1 ~~ PBC1 0.765 0.052 14.794      0    0.663    0.866
6 PBC2 ~~ PBC2 0.609 0.041 14.677      0    0.527    0.690
7 PBC3 ~~ PBC3 0.768 0.046 16.601      0    0.678    0.859
8 PBC4 ~~ PBC4 0.919 0.057 16.184      0    0.808    1.030
9  PBC ~~  PBC 1.000 0.000     NA     NA    1.000    1.000

Uh-oh, it’s looking less good with Intentions.
The loadings all look okay, but the fit indices aren’t great: RMSEA and SRMR are above .05, and TLI and CFI are below .95.

int_mod <- "
  intent =~ int1 + int2 + int3 + int4 + int5
  "

int_mod.est <- cfa(int_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(int_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.14128950 0.05266866 0.84561107 0.92280553 
parameterestimates(int_mod.est)
      lhs op    rhs   est    se      z pvalue ci.lower ci.upper
1  intent =~   int1 0.698 0.043 16.363      0    0.614    0.781
2  intent =~   int2 0.801 0.035 23.173      0    0.733    0.869
3  intent =~   int3 0.684 0.039 17.407      0    0.607    0.761
4  intent =~   int4 0.868 0.038 22.746      0    0.793    0.943
5  intent =~   int5 0.581 0.037 15.518      0    0.508    0.655
6    int1 ~~   int1 1.046 0.056 18.555      0    0.936    1.157
7    int2 ~~   int2 0.487 0.036 13.665      0    0.417    0.557
8    int3 ~~   int3 0.858 0.047 18.108      0    0.765    0.951
9    int4 ~~   int4 0.614 0.043 14.145      0    0.529    0.699
10   int5 ~~   int5 0.827 0.044 18.874      0    0.741    0.913
11 intent ~~ intent 1.000 0.000     NA     NA    1.000    1.000
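Since we keep comparing the same four indices against the same cut-offs, it can help to write the comparison down once. This is only a sketch: meets_cutoffs() is a hypothetical helper (not a lavaan function), and the numbers are copied from the fitmeasures() output for the attitudes and intentions models above.

```r
# Fit indices copied from the fitmeasures() output for two of the models above
att_fit <- c(rmsea = 0.007237593, srmr = 0.010669452, tli = 0.999299418, cfi = 0.999766473)
int_fit <- c(rmsea = 0.14128950,  srmr = 0.05266866,  tli = 0.84561107,  cfi = 0.92280553)

# Hypothetical helper: TRUE where an index meets the cut-offs used in these exercises
# (RMSEA and SRMR below .05, TLI and CFI above .95)
meets_cutoffs <- function(fit) {
  c(fit[c("rmsea", "srmr")] < 0.05, fit[c("tli", "cfi")] > 0.95)
}

meets_cutoffs(att_fit)  # attitudes model
meets_cutoffs(int_fit)  # intentions model
```

With a fitted model object, the same helper could be applied to fitmeasures(int_mod.est)[c("rmsea","srmr","tli","cfi")].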

Let’s examine the modification indices:

modindices(int_mod.est, sort = TRUE)
    lhs op  rhs     mi    epc sepc.lv sepc.all sepc.nox
17 int2 ~~ int4 97.630  0.414   0.414    0.757    0.757
13 int1 ~~ int3 50.107  0.270   0.270    0.285    0.285
16 int2 ~~ int3 21.423 -0.159  -0.159   -0.246   -0.246
20 int3 ~~ int5 18.787  0.145   0.145    0.172    0.172
19 int3 ~~ int4 17.657 -0.158  -0.158   -0.217   -0.217
12 int1 ~~ int2 16.578 -0.148  -0.148   -0.207   -0.207
14 int1 ~~ int4 10.596 -0.129  -0.129   -0.161   -0.161
21 int4 ~~ int5 10.532 -0.111  -0.111   -0.156   -0.156
15 int1 ~~ int5  4.521  0.077   0.077    0.083    0.083
18 int2 ~~ int5  3.438 -0.058  -0.058   -0.091   -0.091

It looks like correlating the residuals for items int2 and int4 would improve our model. The expected correlation is 0.757, which is fairly large (remember correlations are between -1 and 1).
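To see where that 0.757 comes from, we can reproduce it from numbers already printed above: a covariance divided by the product of the two standard deviations is a correlation, and here the relevant variances are the residual variances of int2 and int4 from the model without the covariance. This is a back-of-the-envelope check using rounded output, not a replacement for modindices().

```r
# Values copied from the output above
epc        <- 0.414  # expected parameter change for int2 ~~ int4
resid_int2 <- 0.487  # residual variance of int2 (model without the covariance)
resid_int4 <- 0.614  # residual variance of int4 (model without the covariance)

# Expected residual covariance rescaled to a correlation
expected_cor <- epc / sqrt(resid_int2 * resid_int4)
round(expected_cor, 3)
```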

Note that the items have a possible theoretical link too, beyond just “intention to exercise”. It looks like both int2 and int4 are specifically about intentions in the next three months. It might make sense that responses to these two items are related more than just representing general ‘intention’.

When we include this covariance, our model fit looks much better!

int_mod <- "
  intent =~ int1 + int2 + int3 + int4 + int5
  int2 ~~ int4
  "

int_mod.est <- cfa(int_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(int_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.02753383 0.01475947 0.99413687 0.99765475 
parameterestimates(int_mod.est)
      lhs op    rhs   est    se      z pvalue ci.lower ci.upper
1  intent =~   int1 0.795 0.044 18.018      0    0.708    0.881
2  intent =~   int2 0.633 0.039 16.341      0    0.557    0.709
3  intent =~   int3 0.800 0.041 19.607      0    0.720    0.880
4  intent =~   int4 0.682 0.043 15.916      0    0.598    0.766
5  intent =~   int5 0.629 0.039 16.205      0    0.553    0.705
6    int2 ~~   int4 0.343 0.039  8.748      0    0.266    0.419
7    int1 ~~   int1 0.902 0.057 15.734      0    0.789    1.014
8    int2 ~~   int2 0.727 0.044 16.589      0    0.641    0.813
9    int3 ~~   int3 0.686 0.049 13.927      0    0.589    0.782
10   int4 ~~   int4 0.902 0.054 16.827      0    0.797    1.007
11   int5 ~~   int5 0.769 0.045 17.201      0    0.681    0.856
12 intent ~~ intent 1.000 0.000     NA     NA    1.000    1.000

Finally, the behaviour model looks absolutely fine.
Note that beh4 has a negative loading, which is perfectly okay. In fact, if you look at the items, you’ll notice that this is the only item that is reversed (higher scores on the item reflect less exercising).

beh_mod <- "
  behav =~ beh1 + beh2 + beh3 + beh4
  "

beh_mod.est <- cfa(beh_mod, data=TPB_data, std.lv = TRUE)

fitmeasures(beh_mod.est)[c("rmsea","srmr","tli","cfi")]
     rmsea       srmr        tli        cfi 
0.02896260 0.01285583 0.99191922 0.99730641 
parameterestimates(beh_mod.est)
    lhs op   rhs    est    se       z pvalue ci.lower ci.upper
1 behav =~  beh1  0.659 0.045  14.593      0    0.571    0.748
2 behav =~  beh2  0.735 0.045  16.275      0    0.647    0.824
3 behav =~  beh3  0.787 0.045  17.331      0    0.698    0.875
4 behav =~  beh4 -0.724 0.046 -15.626      0   -0.815   -0.633
5  beh1 ~~  beh1  0.987 0.058  16.962      0    0.873    1.101
6  beh2 ~~  beh2  0.889 0.058  15.312      0    0.775    1.003
7  beh3 ~~  beh3  0.819 0.059  13.908      0    0.703    0.934
8  beh4 ~~  beh4  0.980 0.061  16.029      0    0.860    1.099
9 behav ~~ behav  1.000 0.000      NA     NA    1.000    1.000
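Because the same steps are repeated for every construct, the fits could also be collected in a list and summarised in one go. The sketch below shows the pattern on simulated toy data so that it runs on its own: the factors f1 and f2, the items x1 to x4 and y1 to y4, and the population loadings are all invented for illustration. With the real data you would swap in the five model strings above and data = TPB_data.

```r
library(lavaan)

# Hypothetical population model, used only to simulate toy data
pop <- "
  f1 =~ 0.7*x1 + 0.7*x2 + 0.7*x3 + 0.7*x4
  f2 =~ 0.6*y1 + 0.6*y2 + 0.6*y3 + 0.6*y4
"
set.seed(10)
toy <- simulateData(pop, sample.nobs = 500)

# One model string per construct
models <- list(
  f1 = "f1 =~ x1 + x2 + x3 + x4",
  f2 = "f2 =~ y1 + y2 + y3 + y4"
)

# Fit each one-factor model, then pull out the same four fit indices from each
fits   <- lapply(models, function(m) cfa(m, data = toy, std.lv = TRUE))
wanted <- c("rmsea", "srmr", "tli", "cfi")
fit_table <- sapply(fits, function(f) unclass(fitmeasures(f)[wanted]))
round(fit_table, 3)
```

The result is a small matrix with one column per construct, which makes it easier to spot the one model that needs attention.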

Question 3

Using lavaan syntax, specify the full structural equation model that corresponds to the model in Figure 1. For each construct use the measurement models from the previous question.

Estimate and evaluate the model

  • Does the model fit well?
  • Are the hypothesised paths significant?

This involves specifying the measurement models for all the latent variables, and then also specifying the relationships between those latent variables. All in the same model!

TPB_model<-'
  # measurement models  
  att =~ attitude1 + attitude2 + attitude3 + attitude4
  SN =~ SN1 + SN2 + SN3 + SN4
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  intent =~ int1 + int2 + int3 + int4 + int5
  beh =~ beh1 + beh2 + beh3 + beh4
  
  # covariances between items
  int2 ~~ int4

  # regressions  
  beh ~ intent + PBC
  intent ~ att + SN + PBC

  # covariances between attitudes, SN, and PBC
  att ~~ SN    
  att ~~ PBC
  SN ~~ PBC
'

We can estimate the model using the sem() function.
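As with cfa(), by default the sem() function will scale the latent variables by fixing the loading of the first item for each latent variable to 1. Here we instead set std.lv=TRUE, as we did for the measurement models, which fixes the variance of each latent variable to 1 (for intent and beh, which have predictors in the model, it is the residual variance that is fixed to 1).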

TPB_model.est<-sem(TPB_model, data=TPB_data, std.lv=TRUE)

fitmeasures(TPB_model.est)[c("rmsea","srmr","tli","cfi")]
    rmsea      srmr       tli       cfi 
0.0108428 0.0268991 0.9935594 0.9944795 

We can see that the model fits well according to RMSEA, SRMR, TLI and CFI.
From the output below, all of the hypothesised paths in the theory of planned behaviour are statistically significant.

summary(TPB_model.est, standardized=TRUE)
lavaan 0.6-20 ended normally after 21 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        51

  Number of observations                           890

Model Test User Model:
                                                      
  Test statistic                               198.834
  Degrees of freedom                               180
  P-value (Chi-square)                           0.160

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  att =~                                                                
    attitude1         0.687    0.050   13.854    0.000    0.687    0.550
    attitude2         0.614    0.044   14.018    0.000    0.614    0.557
    attitude3         0.662    0.047   13.977    0.000    0.662    0.555
    attitude4         0.660    0.047   14.140    0.000    0.660    0.561
  SN =~                                                                 
    SN1               0.644    0.045   14.213    0.000    0.644    0.572
    SN2               0.595    0.047   12.573    0.000    0.595    0.505
    SN3               0.573    0.042   13.637    0.000    0.573    0.548
    SN4               0.620    0.047   13.202    0.000    0.620    0.531
  PBC =~                                                                
    PBC1              0.687    0.041   16.555    0.000    0.687    0.615
    PBC2              0.617    0.037   16.606    0.000    0.617    0.616
    PBC3              0.608    0.039   15.412    0.000    0.608    0.575
    PBC4              0.681    0.044   15.581    0.000    0.681    0.581
  intent =~                                                             
    int1              0.648    0.039   16.793    0.000    0.777    0.628
    int2              0.554    0.034   16.519    0.000    0.664    0.626
    int3              0.643    0.036   17.876    0.000    0.771    0.670
    int4              0.586    0.037   15.818    0.000    0.703    0.601
    int5              0.534    0.034   15.879    0.000    0.641    0.594
  beh =~                                                                
    beh1              0.556    0.038   14.461    0.000    0.678    0.568
    beh2              0.588    0.039   15.203    0.000    0.718    0.600
    beh3              0.635    0.039   16.208    0.000    0.775    0.646
    beh4             -0.604    0.040  -15.220    0.000   -0.737   -0.601

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  beh ~                                                                 
    intent            0.465    0.057    8.166    0.000    0.457    0.457
    PBC               0.251    0.063    3.967    0.000    0.206    0.206
  intent ~                                                              
    att               0.242    0.061    3.986    0.000    0.202    0.202
    SN                0.335    0.064    5.233    0.000    0.279    0.279
    PBC               0.338    0.059    5.726    0.000    0.282    0.282

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .int2 ~~                                                               
   .int4              0.307    0.037    8.403    0.000    0.307    0.397
  att ~~                                                                
    SN                0.316    0.050    6.300    0.000    0.316    0.316
    PBC               0.245    0.049    5.056    0.000    0.245    0.245
  SN ~~                                                                 
    PBC               0.275    0.049    5.628    0.000    0.275    0.275

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .attitude1         1.089    0.067   16.231    0.000    1.089    0.697
   .attitude2         0.840    0.052   16.057    0.000    0.840    0.690
   .attitude3         0.985    0.061   16.101    0.000    0.985    0.692
   .attitude4         0.946    0.059   15.925    0.000    0.946    0.685
   .SN1               0.850    0.055   15.385    0.000    0.850    0.672
   .SN2               1.030    0.060   17.099    0.000    1.030    0.745
   .SN3               0.762    0.047   16.061    0.000    0.762    0.699
   .SN4               0.979    0.059   16.516    0.000    0.979    0.718
   .PBC1              0.777    0.050   15.682    0.000    0.777    0.622
   .PBC2              0.621    0.040   15.630    0.000    0.621    0.620
   .PBC3              0.750    0.045   16.715    0.000    0.750    0.670
   .PBC4              0.912    0.055   16.576    0.000    0.912    0.663
   .int1              0.929    0.055   16.918    0.000    0.929    0.606
   .int2              0.687    0.041   16.675    0.000    0.687    0.609
   .int3              0.730    0.046   15.843    0.000    0.730    0.551
   .int4              0.873    0.051   17.115    0.000    0.873    0.639
   .int5              0.754    0.043   17.618    0.000    0.754    0.648
   .beh1              0.962    0.056   17.226    0.000    0.962    0.677
   .beh2              0.915    0.055   16.517    0.000    0.915    0.640
   .beh3              0.837    0.055   15.238    0.000    0.837    0.582
   .beh4              0.961    0.058   16.498    0.000    0.961    0.639
    att               1.000                               1.000    1.000
    SN                1.000                               1.000    1.000
    PBC               1.000                               1.000    1.000
   .intent            1.000                               0.695    0.695
   .beh               1.000                               0.672    0.672
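A couple of numbers in this output can be checked by hand, which is a useful way to make sure we understand what was estimated. The sketch below counts the degrees of freedom (assuming, as here, that only variances and covariances are modelled and no means) and shows why the Estimate and Std.lv columns differ for intent and beh. All values are copied from the output above.

```r
# Degrees of freedom: unique variances/covariances among the observed items
# minus the number of free parameters
n_items   <- 4 + 4 + 4 + 5 + 4            # att, SN, PBC, intent, beh
n_moments <- n_items * (n_items + 1) / 2
n_free    <- 21 + 21 + 5 + 1 + 3          # loadings, residual variances, regressions,
                                          # int2 ~~ int4, covariances among att/SN/PBC
df <- n_moments - n_free
df

# The residual variance of intent is fixed to 1 and makes up 0.695 of its total
# variance (Std.all column), so the total variance is 1 / 0.695
total_var_intent <- 1 / 0.695
r2_intent <- 1 - 0.695                    # proportion of variance in intent explained

# Std.lv rescales a loading by the standard deviation of its latent variable
std_lv_int1 <- 0.648 * sqrt(total_var_intent)
round(std_lv_int1, 3)
```

The hand-computed values line up with the 180 degrees of freedom, the 51 model parameters, and the Std.lv value for int1 reported in the summary.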

Question 4

Examine the modification indices and expected parameter changes - are there any additional parameters you would consider including?

By making adjustments to our theoretical model in order to better represent this sample, we risk a) over-fitting to the specifics of this sample, and b) testing a theory that we didn’t really have a priori (i.e. we didn’t have this theoretical model before seeing this data).

However, it can still be worth looking at modindices in order to assess any places of local misfit in the model. These can provide useful discussion points and make us pause for thought, even if we are happy with our current model fit.

In this case, none of the expected parameter changes are very large.

modindices(TPB_model.est, sort = TRUE) |> head()
     lhs op  rhs     mi    epc sepc.lv sepc.all sepc.nox
316 int1 ~~ int3 12.167  0.143   0.143    0.174    0.174
103  PBC =~ int5 11.028  0.151   0.151    0.140    0.140
303 PBC3 ~~ beh2  8.351  0.095   0.095    0.114    0.114
336 int4 ~~ beh1  7.886  0.087   0.087    0.095    0.095
291 PBC2 ~~ int5  7.847  0.076   0.076    0.111    0.111
263  SN4 ~~ PBC4  6.978 -0.098  -0.098   -0.104   -0.104
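One way to read this table: a modification index is approximately a chi-square statistic with 1 degree of freedom, so with a sample this large even small amounts of local misfit can exceed the usual critical value. The sketch below re-enters the six rows printed above (typed in by hand) and looks at both the size of the index and the size of the standardised expected change.

```r
# The six rows shown above, typed in by hand
mi_head <- data.frame(
  lhs      = c("int1", "PBC", "PBC3", "int4", "PBC2", "SN4"),
  op       = c("~~", "=~", "~~", "~~", "~~", "~~"),
  rhs      = c("int3", "int5", "beh2", "beh1", "int5", "PBC4"),
  mi       = c(12.167, 11.028, 8.351, 7.886, 7.847, 6.978),
  sepc.all = c(0.174, 0.140, 0.114, 0.095, 0.111, -0.104)
)

# Critical value of a 1-df chi-square at alpha = .05
crit <- qchisq(0.95, df = 1)
mi_head$exceeds_crit <- mi_head$mi > crit

# Largest standardised expected parameter change, in absolute value
largest_sepc <- max(abs(mi_head$sepc.all))

mi_head
largest_sepc
```

Every index here is above the critical value, yet the largest standardised expected change is far smaller than the 0.757 we saw for int2 and int4 in the intentions measurement model.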

Question 5

Test the indirect effect of attitudes, subjective norms, and perceived behavioural control on behaviour via intentions.

Remember, when you fit the model with sem(), use se='bootstrap' to get bootstrapped standard errors (it may take a few minutes). When you inspect the model using summary(), get the 95% confidence intervals for parameters with ci = TRUE.

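First, let’s name the paths in the structural equation model: a1, a2 and a3 for the paths from attitudes, subjective norms and perceived behavioural control to intentions, and b for the path from intentions to behaviour.

To test these indirect effects we create a new parameter for each indirect effect: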

TPB_model2 <- '
  # measurement models  
  att =~ attitude1 + attitude2 + attitude3 + attitude4
  SN =~ SN1 + SN2 + SN3 + SN4
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  intent =~ int1 + int2 + int3 + int4 + int5
  beh =~ beh1 + beh2 + beh3 + beh4
  
  # covariances between items
  int2 ~~ int4

  # regressions  
  beh ~ b*intent + PBC
  intent ~ a1*att + a2*SN + a3*PBC

  # covariances between attitudes, SN, and PBC
  att ~~ SN    
  att ~~ PBC
  SN ~~ PBC

  # indirect effects:  
  ind1 := a1*b  #indirect effect of attitudes via intentions
  ind2 := a2*b  #indirect effect of SN via intentions
  ind3 := a3*b  #indirect effect of PBC via intentions
'

When we estimate the model, we request bootstrapped standard errors:

TPB_model2.est<-sem(TPB_model2, std.lv=TRUE, se='bootstrap', data=TPB_data)
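If bootstrapping feels like a black box, the sketch below shows the resampling idea on a deliberately simplified case: observed variables and lm() rather than latent variables and sem(). The variables x, m and y, their simulated values, and the helper indirect() are all invented for illustration. It uses a percentile interval, which is one common way of forming a bootstrap confidence interval.

```r
set.seed(1)

# Simulated toy data: x -> m -> y, plus a direct path from x to y
n <- 300
x <- rnorm(n)
m <- 0.3 * x + rnorm(n)
y <- 0.5 * m + 0.2 * x + rnorm(n)
dat <- data.frame(x, m, y)

# Hypothetical helper: indirect effect as the product of the a and b paths
indirect <- function(d) {
  a <- coef(lm(m ~ x, data = d))["x"]
  b <- coef(lm(y ~ m + x, data = d))["m"]
  unname(a * b)
}

point_est <- indirect(dat)

# Resample rows with replacement, re-estimate the product each time
boot_est <- replicate(1000, indirect(dat[sample(n, replace = TRUE), ]))

# Bootstrap standard error and percentile 95% confidence interval
boot_se <- sd(boot_est)
ci <- quantile(boot_est, c(0.025, 0.975))

point_est
boot_se
ci
```

The reason for going to this trouble is that a product of two coefficients does not generally have a symmetric, normal sampling distribution, so an interval built from the resampled estimates is often preferred for indirect effects.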

When we inspect the model, we request the 95% confidence intervals for parameters:

summary(TPB_model2.est, ci=TRUE)
lavaan 0.6.17 ended normally after 21 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        51

  Number of observations                           890

Model Test User Model:
                                                      
  Test statistic                               198.834
  Degrees of freedom                               180
  P-value (Chi-square)                           0.160

Parameter Estimates:

  Standard errors                            Bootstrap
  Number of requested bootstrap draws             1000
  Number of successful bootstrap draws            1000

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
  att =~                                                                
    attitude1         0.687    0.049   14.147    0.000    0.585    0.780
    attitude2         0.614    0.044   13.899    0.000    0.528    0.700
    attitude3         0.662    0.046   14.327    0.000    0.567    0.745
    attitude4         0.660    0.045   14.639    0.000    0.566    0.745
  SN =~                                                                 
    SN1               0.644    0.046   13.946    0.000    0.552    0.734
    SN2               0.595    0.046   12.962    0.000    0.501    0.687
    SN3               0.573    0.044   12.955    0.000    0.487    0.657
    SN4               0.620    0.048   13.001    0.000    0.524    0.717
  PBC =~                                                                
    PBC1              0.687    0.039   17.604    0.000    0.614    0.765
    PBC2              0.617    0.036   17.152    0.000    0.540    0.683
    PBC3              0.608    0.040   15.052    0.000    0.524    0.687
    PBC4              0.681    0.043   15.872    0.000    0.594    0.761
  intent =~                                                             
    int1              0.648    0.040   16.116    0.000    0.571    0.728
    int2              0.554    0.033   16.651    0.000    0.485    0.615
    int3              0.643    0.038   16.874    0.000    0.566    0.715
    int4              0.586    0.037   15.918    0.000    0.515    0.655
    int5              0.534    0.031   17.381    0.000    0.472    0.594
  beh =~                                                                
    beh1              0.556    0.039   14.358    0.000    0.477    0.628
    beh2              0.588    0.039   15.130    0.000    0.512    0.662
    beh3              0.635    0.039   16.230    0.000    0.558    0.709
    beh4             -0.604    0.040  -15.181    0.000   -0.684   -0.528

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
  beh ~                                                                 
    intent     (b)    0.465    0.056    8.242    0.000    0.354    0.576
    PBC               0.251    0.066    3.794    0.000    0.126    0.383
  intent ~                                                              
    att       (a1)    0.242    0.066    3.672    0.000    0.116    0.369
    SN        (a2)    0.335    0.064    5.196    0.000    0.213    0.476
    PBC       (a3)    0.338    0.062    5.483    0.000    0.228    0.468

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
 .int2 ~~                                                               
   .int4              0.307    0.038    8.153    0.000    0.233    0.383
  att ~~                                                                
    SN                0.316    0.050    6.313    0.000    0.216    0.412
    PBC               0.245    0.050    4.942    0.000    0.148    0.342
  SN ~~                                                                 
    PBC               0.275    0.049    5.563    0.000    0.175    0.370

Variances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
   .attitude1         1.089    0.062   17.481    0.000    0.966    1.204
   .attitude2         0.840    0.052   16.198    0.000    0.741    0.940
   .attitude3         0.985    0.060   16.344    0.000    0.862    1.098
   .attitude4         0.946    0.062   15.353    0.000    0.817    1.061
   .SN1               0.850    0.055   15.489    0.000    0.742    0.954
   .SN2               1.030    0.059   17.516    0.000    0.914    1.148
   .SN3               0.762    0.051   15.082    0.000    0.671    0.858
   .SN4               0.979    0.060   16.353    0.000    0.860    1.100
   .PBC1              0.777    0.049   15.821    0.000    0.679    0.873
   .PBC2              0.621    0.037   16.567    0.000    0.547    0.693
   .PBC3              0.750    0.042   17.901    0.000    0.671    0.831
   .PBC4              0.912    0.052   17.498    0.000    0.811    1.015
   .int1              0.929    0.053   17.564    0.000    0.822    1.030
   .int2              0.687    0.042   16.534    0.000    0.600    0.771
   .int3              0.730    0.050   14.687    0.000    0.628    0.830
   .int4              0.873    0.050   17.561    0.000    0.772    0.970
   .int5              0.754    0.042   18.084    0.000    0.670    0.836
   .beh1              0.962    0.053   18.042    0.000    0.857    1.071
   .beh2              0.915    0.057   16.080    0.000    0.802    1.035
   .beh3              0.837    0.057   14.574    0.000    0.717    0.947
   .beh4              0.961    0.061   15.767    0.000    0.837    1.085
    att               1.000                               1.000    1.000
    SN                1.000                               1.000    1.000
    PBC               1.000                               1.000    1.000
   .intent            1.000                               1.000    1.000
   .beh               1.000                               1.000    1.000

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
    ind1              0.113    0.032    3.478    0.001    0.055    0.185
    ind2              0.156    0.035    4.470    0.000    0.091    0.229
    ind3              0.157    0.034    4.685    0.000    0.099    0.231

We can see that all of the indirect effects are statistically significant at p<.05 as none of the 95% confidence intervals for the coefficients include zero.
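As a quick sanity check, each defined parameter should simply be the product of the two paths it is built from. Using the (rounded) estimates printed above:

```r
# Path estimates copied from the Regressions section of the output above
b <- 0.465
a <- c(ind1 = 0.242, ind2 = 0.335, ind3 = 0.338)  # att, SN, PBC -> intent

# Indirect effects as products of paths
ind <- a * b
round(ind, 3)

# Lower bounds of the bootstrapped 95% CIs, copied from Defined Parameters
ci_lower <- c(ind1 = 0.055, ind2 = 0.091, ind3 = 0.099)
ci_lower > 0
```

The products match the ind1, ind2 and ind3 estimates up to rounding, and all three lower bounds sit above zero.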

Question 6

Write up your analysis as if you were presenting the work in an academic paper, with brief separate ‘Method’ and ‘Results’ sections.

Method

We tested a theory of planned behaviour model of physical activity by fitting a structural equation model in which attitudes, subjective norms, perceived behavioural control, intentions and behaviour were latent variables, each defined by four items (five for intentions). We first tested the measurement models for each construct by fitting a one-factor CFA model. Latent variables were scaled by fixing their variances to 1 (residual variances in the case of intentions and behaviour in the full model).

Within the SEM, behaviour was regressed on intentions and perceived behavioural control, and intentions were regressed on attitudes, subjective norms, and perceived behavioural control. In addition, attitudes, subjective norms, and perceived behavioural control were allowed to covary. The indirect effects of attitudes, subjective norms and perceived behavioural control on behaviour were calculated as the product of the effect of the relevant predictor on the mediator (intentions) and the effect of the mediator on the outcome. The statistical significance of the indirect effects was evaluated using bootstrapped 95% confidence intervals with 1000 resamples.

In all cases models were fit using maximum likelihood estimation and model fit was judged to be good if CFI and TLI were \(>.95\) and RMSEA and SRMR were \(<.05\). Modification indices and expected parameter changes were inspected to identify any areas of local misfit, but model modifications were only made if they could be justified on substantive grounds.

Results

All measurement models fit well (CFI and TLI \(>.95\) and RMSEA and SRMR \(<.05\)) with the exception of the measurement model for intentions. Modification indices suggested the inclusion of residual covariance between two items on the intentions scale (int2 and int4) that both made specific reference to short term intentions. The addition of this parameter resulted in a good fit. The full structural equation model (with the residual covariance between int2 and int4 included) fit well (CFI = 0.99, TLI = 0.99, RMSEA = 0.01, SRMR = 0.03). Unstandardised parameter estimates are provided in Table 2. All of the hypothesised paths were statistically significant at \(p<.05\). Significant indirect effects suggested that intentions mediate the effects of attitudes, subjective norms, and perceived behavioural control on behaviour whilst perceived behavioural control also has a direct effect on behaviour. Results thus provide support for a theory of planned behaviour model of physical activity.

Table 2: Unstandardised parameter estimates for structural equation model for a theory of planned behaviour model of physical activity. Note: PBC = Perceived Behavioural Control, CI = Confidence Interval
Parameter Estimate SE z p 95% CI
Loadings
Attitudes attitude1 0.69 0.05 14.15 <0.001 [0.59, 0.78]
attitude2 0.61 0.04 13.90 <0.001 [0.53, 0.7]
attitude3 0.66 0.05 14.33 <0.001 [0.57, 0.75]
attitude4 0.66 0.05 14.64 <0.001 [0.57, 0.75]
Subjective Norms SN1 0.64 0.05 13.95 <0.001 [0.55, 0.73]
SN2 0.59 0.05 12.96 <0.001 [0.5, 0.69]
SN3 0.57 0.04 12.95 <0.001 [0.49, 0.66]
SN4 0.62 0.05 13.00 <0.001 [0.52, 0.72]
PBC PBC1 0.69 0.04 17.60 <0.001 [0.61, 0.77]
PBC2 0.62 0.04 17.15 <0.001 [0.54, 0.68]
PBC3 0.61 0.04 15.05 <0.001 [0.52, 0.69]
PBC4 0.68 0.04 15.87 <0.001 [0.59, 0.76]
Intentions int1 0.65 0.04 16.12 <0.001 [0.57, 0.73]
int2 0.55 0.03 16.65 <0.001 [0.49, 0.62]
int3 0.64 0.04 16.87 <0.001 [0.57, 0.72]
int4 0.59 0.04 15.92 <0.001 [0.51, 0.66]
int5 0.53 0.03 17.38 <0.001 [0.47, 0.59]
Behaviours beh1 0.56 0.04 14.36 <0.001 [0.48, 0.63]
beh2 0.59 0.04 15.13 <0.001 [0.51, 0.66]
beh3 0.64 0.04 16.23 <0.001 [0.56, 0.71]
beh4 -0.60 0.04 -15.18 <0.001 [-0.68, -0.53]
Covariances
int2 with int4 0.31 0.04 8.15 <0.001 [0.23, 0.38]
Attitudes with Subjective Norms 0.32 0.05 6.31 <0.001 [0.22, 0.41]
Attitudes with PBC 0.25 0.05 4.94 <0.001 [0.15, 0.34]
Subjective Norms with PBC 0.27 0.05 5.56 <0.001 [0.18, 0.37]
Regressions
Behaviours on Intentions 0.47 0.06 8.24 <0.001 [0.35, 0.58]
Behaviours on PBC 0.25 0.07 3.79 <0.001 [0.13, 0.38]
Intentions on Attitudes 0.24 0.07 3.67 <0.001 [0.12, 0.37]
Intentions on Subjective Norms 0.33 0.06 5.20 <0.001 [0.21, 0.48]
Intentions on PBC 0.34 0.06 5.48 <0.001 [0.23, 0.47]
Indirect effects
Attitudes via Intentions 0.11 0.03 3.48 <0.001 [0.05, 0.19]
Subjective Norms via Intentions 0.16 0.03 4.47 <0.001 [0.09, 0.23]
PBC via Intentions 0.16 0.03 4.68 <0.001 [0.10, 0.23]
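The indirect effects in a table like this can be obtained in lavaan by labelling the relevant paths and defining their products with the `:=` operator. The sketch below shows one way to do that. It assumes simulated stand-in data (every population value is made up), and the latent names `Att`, `SN`, `PBC`, `Int`, `Beh` and the labels `a1`, `a2`, `a3`, `b`, `cp` are illustrative choices rather than anything taken from the original analysis.

```r
library(lavaan)

# Population model used only to simulate stand-in data (values are assumptions)
pop <- "
  Att =~ 0.7*attitude1 + 0.7*attitude2 + 0.7*attitude3 + 0.7*attitude4
  SN  =~ 0.7*SN1 + 0.7*SN2 + 0.7*SN3 + 0.7*SN4
  PBC =~ 0.7*PBC1 + 0.7*PBC2 + 0.7*PBC3 + 0.7*PBC4
  Int =~ 0.7*int1 + 0.7*int2 + 0.7*int3 + 0.7*int4 + 0.7*int5
  Beh =~ 0.7*beh1 + 0.7*beh2 + 0.7*beh3 + 0.7*beh4
  Int ~ 0.3*Att + 0.3*SN + 0.3*PBC
  Beh ~ 0.5*Int + 0.25*PBC
  Att ~~ 0.3*SN + 0.3*PBC
  SN  ~~ 0.3*PBC
  int2 ~~ 0.3*int4
"
set.seed(2)
sim <- simulateData(pop, sample.nobs = 500)

# Analysis model: label the structural paths, then define the indirect effects
mod <- "
  Att =~ attitude1 + attitude2 + attitude3 + attitude4
  SN  =~ SN1 + SN2 + SN3 + SN4
  PBC =~ PBC1 + PBC2 + PBC3 + PBC4
  Int =~ int1 + int2 + int3 + int4 + int5
  Beh =~ beh1 + beh2 + beh3 + beh4
  int2 ~~ int4

  Int ~ a1*Att + a2*SN + a3*PBC
  Beh ~ b*Int + cp*PBC

  ind_att := a1*b
  ind_sn  := a2*b
  ind_pbc := a3*b
"
fit <- sem(mod, data = sim, std.lv = TRUE)

# Rows with op ':=' are the defined (indirect) effects
pe <- parameterEstimates(fit)
ind <- pe[pe$op == ":=", c("lhs", "est", "se", "z", "pvalue", "ci.lower", "ci.upper")]
ind
```

By default these intervals are symmetric and based on delta-method standard errors. Refitting with `se = "bootstrap"` is a common alternative for indirect effects, at the cost of a longer run time.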

Models of pro-environmental behaviour

Warning: ambiguity incoming!!

In some fields, theories are built on top of immutable laws and well defined measures of physical quantities. In much of the behavioural and social sciences, theories can feel a bit more like a “free-for-all”, working with broad, overlapping concepts that are hard to define, let alone measure. It’s not bad, just very difficult!

This next set of exercises is loosely inspired by Kaiser et al. (2006), "Contrasting the Theory of Planned Behavior With the Value-Belief-Norm Model in Explaining Conservation Behavior", and provides an example of how confusing it is to work in this sort of area.

Dataset: consvmodels.csv

The “theory of planned behaviour” (TPB) is a broad psycho-social theory of ‘why people do things’, that you can find applied in all sorts of contexts, from health psychology to business/organisation psychology, to environmental psychology. Broadly speaking, the theory suggests that we do things because they are beneficial, socially acceptable, and do-able.

A contrasting theory, specifically for why people take pro-environmental actions, suggests that we do things because our values inform an ‘environmental worldview’ (a set of beliefs about the state of the world), and this in turn results in taking more pro-environmental actions because it encourages us to consider the consequences of our actions and thus our responsibility and our “Personal Norms” (i.e., our personal moral obligation toward the environment). This theory — the “Value-Belief-Norm (VBN) theory” — contrasts with the TPB idea in that it views behavior as a moral response rather than a rational choice. Essentially, the TPB suggests a decision is made by asking ‘is this action good for me and my social standing?’, where the VBN equivalent question would be ‘is this action the right thing to do based on my duty to the planet?’

We’re going to compare these two theories in terms of how well they predict pro-environmental actions.

We have data from 500 people, all of whom filled out a questionnaire that contained 48 items, measuring each of the constructs involved in both TPB and VBN.

TPB constructs

  • Attitudes - 5 items: att1, att2, att3, att4, att5
  • Pro-environmental Social Norms - 5 items: sn1, sn2, sn3, sn4, sn5
  • Perceived Behavioural Control - 5 items: pbc1, pbc2, pbc3, pbc4, pbc5
  • Pro-environmental Intentions - 5 items: int1, int2, int3, int4, int5

VBN constructs

  • Environmental Worldview (‘New Ecological Paradigm’ questions) - 5 items: nep1, nep2, nep3, nep4, nep5
  • Awareness of Consequences - 5 items: awar1, awar2, awar3, awar4, awar5
  • Environmental Responsibility - 5 items: resp1, resp2, resp3, resp4, resp5
  • Personal Norms - 5 items: pn1, pn2, pn3, pn4, pn5

Outcome (for both TPB and VBN)

  • Conservationist Behaviours - 8 items: cb1, cb2, cb3, cb4, cb5, cb6, cb7, cb8

The data can be found at https://uoepsy.github.io/data/consvmodels.csv

Table 3: Data Dictionary: consvmodels.csv
variable wording
att1 Protecting the environment is beneficial and advantageous for society.
att2 Taking action to help the environment feels satisfying and rewarding to me.
att3 I believe acting in an environmentally friendly way is a sensible and effective thing to do.
att4 Environmental conservation is a wise and productive use of my time.
att5 Overall, I have a highly positive and favorable view of being 'green'.
sn1 I feel social pressure to be more environmentally conscious in my daily life.
sn2 People expect each other to protect the environment.
sn3 People whose opinions I value would approve of people making 'green' choices.
sn4 Many people I look up to take active steps to help the environment.
sn5 Most people who are important to me think I should act environmentally friendly.
pbc1 I am confident that I can perform pro-environmental behaviors if I want to.
pbc2 I have the resources and opportunities I need to protect the environment.
pbc3 For me, living an environmentally friendly lifestyle is easy.
pbc4 Whether or not I act environmentally friendly is entirely up to me.
pbc5 I have complete control over how much I contribute to environmental protection.
int1 I intend to take action to protect the environment in the next month.
int2 I plan to reduce my environmental footprint significantly.
int3 I will make a conscious effort to engage in pro-environmental behaviors.
int4 I am determined to choose 'green' alternatives whenever possible.
int5 I expect to increase my level of environmental conservation in the near future.
nep1 The balance of nature is very delicate and easily upset by human activities.
nep2 Humans are severely abusing the environment.
nep3 Plants and animals have as much right as humans to exist.
nep4 The earth is like a spaceship with very limited room and resources.
nep5 Humans must live in harmony with nature in order to survive.
awar1 If we don't act now, the damage to our ecosystem will be irreversible.
awar2 Climate change will have dangerous consequences for my health and safety.
awar3 I believe that environmental problems have a direct impact on my community.
awar4 Environmental protection will help ensure a better life for future generations.
awar5 Environmental pollution is a major threat to all living things on Earth.
resp1 I feel personally responsible for the environmental problems caused by my lifestyle.
resp2 My individual actions can make a meaningful difference in the environment.
resp3 Every person is responsible for the protection of the natural world.
resp4 I believe I have a duty to help solve the environmental issues we face today.
resp5 I feel a sense of ownership over the environmental impact of my household.
pn1 Protecting the environment is a duty I owe to society and/or the planet.
pn2 I would feel guilty and at fault if I did not take action to help the environment.
pn3 I believe acting in an environmentally friendly way is a morally right and necessary thing to do.
pn4 My conscience would bother me if I ignored environmental issues.
pn5 Overall, I feel that being 'green' is a core requirement of my personal values.
cb1 Consumer Choice: I chose to buy products with less packaging or products made from recycled materials.
cb2 Waste Management: I made a conscious effort to sort and recycle my household waste (paper, plastic, glass).
cb3 Resource Conservation: I reduced my water consumption by taking shorter showers or turning off the tap while brushing teeth.
cb4 Sustainable Shopping: I brought my own reusable bags or containers when shopping to avoid using plastic bags.
cb5 Energy Efficiency: I turned off lights and electronic devices in rooms that were not being used to save electricity.
cb6 Transportation: I opted for public transport, cycling, or walking instead of driving a private car for short trips.
cb7 Chemical Reduction: I used eco-friendly cleaning products or avoided using harsh chemicals in my home/garden.
cb8 Temperature Control: I kept the heating/cooling in my home at a lower/higher setting than usual to save energy.

Question 7

Read in the data. It’s all nice and cleaned and ready to go.

Get some quick plots of item distributions to check things look normal, and get a nice table of descriptive stats for all the variables - stuff like skew and kurtosis.

Functions like multi.hist() and describe() (both from the psych package) are designed for exactly this purpose: quick explorations of lots and lots of variables.

Question 8

In order to compare how well these two theories predict pro-environmental behaviour, we’re going to want to specify and fit two models, one for the TPB and the other for the VBN theory, but with the same outcome.

Note that in our diagram, the last bit of the two models is the same, going from Intentions->Behaviours.

Before you get started with modelling, check your measurement models for the different constructs, and make any modifications that you deem to be justifiable in order to achieve good fit.

It’s a pain having to write these all out, so if you want to save time you can copy-paste these:

Attitudes =~ att1 + att2 + att3 + att4 + att5

SNorms =~ sn1 + sn2 + sn3 + sn4 + sn5

PBControl =~ pbc1 + pbc2 + pbc3 + pbc4 + pbc5

Intentions =~ int1 + int2 + int3 + int4 + int5

EWV =~ nep1 + nep2 + nep3 + nep4 + nep5

Aware =~ awar1 + awar2 + awar3 + awar4 + awar5

Resp =~ resp1 + resp2 + resp3 + resp4 + resp5

PNorms =~ pn1 + pn2 + pn3 + pn4 + pn5

Conserv_Beh =~ cb1 + cb2 + cb3 + cb4 + cb5 + cb6 + cb7 + cb8
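One way to avoid fitting nine measurement models by hand is to keep the syntax strings in a named list and loop over them. The sketch below does this for two of the constructs only. It uses simulated stand-in data (population values are made up) so that it runs on its own; with the real data, `consv` would instead come from reading in consvmodels.csv, and the list would hold all nine models.

```r
library(lavaan)

# Simulated stand-in for the real data (population values are assumptions)
pop <- "
  Attitudes =~ 0.7*att1 + 0.7*att2 + 0.7*att3 + 0.7*att4 + 0.7*att5
  SNorms    =~ 0.7*sn1 + 0.7*sn2 + 0.7*sn3 + 0.7*sn4 + 0.7*sn5
"
set.seed(10)
consv <- simulateData(pop, sample.nobs = 500)

# Named list of measurement models (only two shown here)
mm <- list(
  Attitudes = "Attitudes =~ att1 + att2 + att3 + att4 + att5",
  SNorms    = "SNorms =~ sn1 + sn2 + sn3 + sn4 + sn5"
)

# Fit each model separately; an explicit function keeps the cfa() call simple
fits <- lapply(mm, function(m) cfa(m, data = consv, std.lv = TRUE))

# Collect the global fit indices into one table, one row per construct
wanted <- c("cfi", "tli", "rmsea", "srmr")
fit_table <- t(sapply(fits, function(f) as.numeric(fitMeasures(f, wanted)[wanted])))
colnames(fit_table) <- wanted
round(fit_table, 3)
```

Any construct that falls short can then be followed up individually, for instance with `modindices(fits$SNorms)`.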

Question 9

Okay, let’s now move to specifying and fitting models that reflect our two theories - TPB and VBN. Do they fit well? Are all of the hypothesised paths significant?

Question 10

Our question is about how well these two theories predict conservationist behaviours.

We’re using the same outcome - conservationist behaviours - so what we would like to know is how much variance in the outcome is explained in each of our models.

We can do that!

inspect(model, what = "rsquare")
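The R-squared output covers every endogenous variable, including each individual item, so it helps to pull out just the latent outcome by name. The sketch below shows this on a deliberately small stand-in model with simulated data (all population values are made up). With the real models you would do the same on the fitted TPB and VBN objects.

```r
library(lavaan)

# Simulated stand-in data (population values are assumptions)
pop <- "
  Intentions  =~ 0.7*int1 + 0.7*int2 + 0.7*int3 + 0.7*int4 + 0.7*int5
  Conserv_Beh =~ 0.7*cb1 + 0.7*cb2 + 0.7*cb3 + 0.7*cb4 +
                 0.7*cb5 + 0.7*cb6 + 0.7*cb7 + 0.7*cb8
  Conserv_Beh ~ 0.5*Intentions
"
set.seed(3)
sim <- simulateData(pop, sample.nobs = 500)

mod <- "
  Intentions  =~ int1 + int2 + int3 + int4 + int5
  Conserv_Beh =~ cb1 + cb2 + cb3 + cb4 + cb5 + cb6 + cb7 + cb8
  Conserv_Beh ~ Intentions
"
fit <- sem(mod, data = sim)

# R-squared for every endogenous variable (items and latent outcomes)
r2 <- lavInspect(fit, "rsquare")

# Keep only the latent outcome we care about
r2_outcome <- r2[["Conserv_Beh"]]
r2_outcome
```

Extracting the same element from each fitted model gives the two numbers to compare directly.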

Which theory explains more variability in how people engage in pro-environmental behaviours?

Question 11

Let’s take stock of where we are now. We’ve got two competing theories about why people act in environmentally friendly ways. Both theories provide overall good fit to the data. They explain a similar amount of variance in our final outcome measure of conservationist behaviours, but the TPB provides a better prediction of people’s intentions.

To do some more thorough work, we might want to think a bit more about how exactly these two theories differ. If we take a step back a bit, both of these theories are just saying “Something -> Intentions -> Actions”, and they differ in terms of what they say explains why people have different intentions. TPB says our intentions are driven by 3 things (Attitudes, Social Pressure, and amount of control we think we have over our actions), and VBN says they are driven by a chain of things that results in a Personal sense of moral obligation (“Personal Norms”).

So one way we could start to think about assessing these theories, is to ask if the addition of the “Personal Norms” part of VBN provides explanatory power beyond the other parts of the TPB, i.e.:

Fit the model presented in the diagram above. What do you conclude (if anything)?