Packages for today

lavaan
semPlot or tidySEM

Introducing Path Analysis

Over the last couple of weeks we have applied exploratory and then confirmatory factor analysis to develop and then test factor analysis models of ‘conduct problems.’ Factor models posit the existence of some underlying latent variable which is thought of as resulting in the scores on our measured items. Especially in Week 8 we began to depict the variables and parameters involved in these models visually, in what get called ‘path’ or ‘SEM’ diagrams. Specifically, by using rectangles (observed variables), ovals (latent variables), single-headed arrows (regression paths) and double-headed arrows (covariances), we could draw various model structures.

This week, we are temporarily putting aside the latent variables (no ovals in the drawings today!) and focusing on some of the fundamentals that motivate this modeling approach.

Mountains cannot be surmounted except by winding paths

Over the course of USMR and the first block of this course, you have hopefully become pretty comfortable with the regression world, and can see how it is extended to lots of different types of outcome and data structures.

If we are for the time being ignoring the latent variables, then what exactly do we gain by this approach of drawing out our variables and drawing various lines between them? Surely our regression toolkit can do all the things we need?

Let’s imagine we are interested in people’s intention to get vaccinated, and we observe the following variables:

Intention to vaccinate (scored on a range of 0-100)
Health Locus of Control (HLC) score (average score on a set of items relating to perceived control over one’s own health)
Religiosity (average score on a set of items relating to an individual’s religiosity).

We are assuming here that we do not have the individual items, but only the scale scores (if we had the individual items we might be inclined to model religiosity and HLC as latent variables!).
If we draw out our variables, and think about this in the form of a standard regression model with “Intention to vaccinate” as our outcome variable, then all the lines are filled in for us (see Figure 1).

Figure 1: Multiple regression as a path model

But what if our theory suggests that some other model might be of more relevance? For instance, what if we believe that participants’ religiosity has an effect on their Health Locus of Control score, which in turn affects the intention to vaccinate (see Figure 2)?
In this case, the HLC variable is thought of as a mediator, because it mediates the effect of religiosity on intention to vaccinate. We are specifying the presence of two distinct types of effect: direct and indirect.

Direct vs Indirect

In path diagrams:

Direct effect = one single-headed arrow between the two variables concerned
Indirect effect = An effect transmitted via some other variables

Figure 2: Mediation as a path model

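A single regression model cannot capture this structure, because we now have multiple endogenous variables; within the regression framework, our only option would be to fit several separate regression models. Fortunately, path analysis allows us to do all of this in one go, by fitting the regression equations simultaneously!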
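Written out as equations, the mediation model described above amounts to two regressions that are estimated together. The coefficient names below are our own illustrative labels, and we are assuming that the model keeps a direct path from religiosity to intention alongside the indirect one:

\[
\begin{aligned}
\text{HLC} &= b_0 + b_1 \cdot \text{Religiosity} + \varepsilon_1 \\
\text{Intention} &= c_0 + c_1 \cdot \text{Religiosity} + c_2 \cdot \text{HLC} + \varepsilon_2
\end{aligned}
\]

HLC is the outcome in the first equation and a predictor in the second, which is exactly what makes it both endogenous and a mediator.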

Terminology refresher

Exogenous variables are a bit like what we have been describing with words like “independent variable” or “predictor.” In a path diagram, they have no paths coming from other variables in the system, but have paths going to other variables.
Endogenous variables are more like the “outcome”/“dependent”/“response” variables we are used to. They have some path coming from another variable in the system (and may also - but not necessarily - have paths going out from them).

Recall our way of drawing path diagrams (excluding any mention of latent variables and factors for now):

Observed variables are represented by squares or rectangles. These are the named variables of interest which exist in our dataset - i.e. the ones which we have measured directly.
Covariances are represented by double-headed arrows. In many diagrams these are curved.
Regressions are shown by single-headed arrows (e.g., an arrow from \(x\) to \(y\) for the path \(y \sim x\)).
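These drawing conventions map fairly directly onto lavaan model syntax: `~` for a single-headed arrow and `~~` for a double-headed one. The sketch below uses simulated data and made-up variable names (`toy`, `x1`, `x2`, `y`), and sets `fixed.x = FALSE` as our own choice so that the covariance between the two predictors is estimated as a model parameter.

```r
library(lavaan)

set.seed(1)
n <- 200
# hypothetical simulated data, purely for illustration
x1 <- rnorm(n)
x2 <- 0.4 * x1 + rnorm(n)
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)
toy <- data.frame(x1, x2, y)

toy_model <- "
    y ~ x1 + x2    # single-headed arrows: regressions of y on x1 and x2
    x1 ~~ x2       # double-headed arrow: covariance between x1 and x2
"

# fixed.x = FALSE treats the predictors as random, so x1 ~~ x2 is estimated
toy_fit <- sem(toy_model, data = toy, fixed.x = FALSE)
coef(toy_fit)
```

The regression paths from this fit should agree closely with `coef(lm(y ~ x1 + x2, data = toy))`.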

Some key assumptions

There are a few assumptions of a complete path diagram:

All our exogenous variables are correlated (unless we specifically assume that their correlation is zero)
All models are ‘recursive’ (no two-way causal relations, no feedback loops)
Residuals are uncorrelated with exogenous variables
Endogenous variables are not connected by correlations (we would use correlations between residuals here, because the residuals are not endogenous)
All ‘causal’ relations are linear and additive
‘Causes’ are unitary (if A -> B and A -> C, then it is presumed that this is the same aspect of A resulting in a change in both B and C, and not two distinct aspects of A, which would be better represented by two correlated variables A1 and A2).

Causal??

It is a slippery slope to start using the word ‘cause,’ and personally I am not that comfortable using it here. However, you will likely hear it a lot in resources about path analysis and SEM, so it is best to be warned.

Please keep in mind that we are using a very broad definition of ‘causal,’ simply to reflect the one way nature of the relationship we are modeling. In Figure 3, a change in the variable X1 is associated with a change in Y, but not vice versa.

Figure 3: Paths are still just regressions.

Tracing rules

Thanks to Sewall Wright, we can express the correlation between any two variables in the system as the sum of all compound paths between the two variables.

Compound paths are any paths you can trace between A and B for which there are:

no loops
no going forward then backward
maximum of one curved arrow per path

Let’s consider the example below, for which the paths are all labelled with lower case letters \(a, b, c, \text{and } d\).

Figure 4: A multiple regression model as a path diagram

According to Wright’s tracing rules above, write out the equations corresponding to the 3 correlations between our observed variables (remember that the correlation between two variables is the same in either direction, e.g. \(r_{x1,y} = r_{y,x1}\), so it doesn’t matter at which variable we start the paths).

\(r_{x1,x2} = c\)
\(r_{x1,y} = a + bc\)
\(r_{x2,y} = b + ac\)
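The three equations can also be read as a small function that turns path values into implied correlations. This is only a sketch: the function name `implied_cors` and the input values are made up for illustration, and the argument `c_x1x2` stands for the path labelled \(c\).

```r
# Implied correlations for the model in Figure 4, following the tracing rules.
# a, b: paths from x1 and x2 to y; c_x1x2: the path labelled c (x1 <-> x2).
implied_cors <- function(a, b, c_x1x2) {
  c(r_x1_x2 = c_x1x2,
    r_x1_y  = a + b * c_x1x2,   # direct path a, plus the compound path via x2
    r_x2_y  = b + a * c_x1x2)   # direct path b, plus the compound path via x1
}

# hypothetical path values, chosen only for illustration
implied_cors(a = 0.5, b = 0.3, c_x1x2 = 0.2)
```

With these made-up inputs the implied correlations are 0.20, 0.56 and 0.40.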

Now let’s suppose we observed the following correlation matrix:

library(tidyverse)  # provides read_csv(), %>% and mutate_all() used below
egdat <- read_csv("https://uoepsy.github.io/data/patheg.csv")
round(cor(egdat),2)

##      x1   x2    y
## x1 1.00 0.36 0.75
## x2 0.36 1.00 0.60
## y  0.75 0.60 1.00

We can plug these into our system of equations:

\(r_{x1,x2} = c = 0.36\)
\(r_{x1,y} = a + bc = 0.75\)
\(r_{x2,y} = b + ac = 0.60\)

And with some substituting and rearranging, we can work out the values of \(a\), \(b\) and \(c\).

We’ve hidden the answers so you can test yourself. Grab a piece of paper and solve some equations!
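If you would like to check your algebra afterwards, one way is to treat the last two equations as a linear system in \(a\) and \(b\) and let R solve it. This is a minimal sketch using the rounded correlations printed above; the object names are our own.

```r
# observed correlations, copied from the matrix above
r_x1_x2 <- 0.36
r_x1_y  <- 0.75
r_x2_y  <- 0.60

# c is read off directly; a and b solve the remaining two linear equations:
#   a     + b * c = r_x1_y
#   a * c + b     = r_x2_y
c_est <- r_x1_x2
lhs <- matrix(c(1,     c_est,
                c_est, 1), nrow = 2, byrow = TRUE)
ab <- solve(lhs, c(r_x1_y, r_x2_y))
names(ab) <- c("a", "b")
round(ab, 2)
```

This should give roughly \(a \approx 0.61\) and \(b \approx 0.38\).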

We can even work out what the path labeled \(d\) (the residual variance) is. First we sum up all the equations for the paths from Y to Y. These are:

\(a^2\) (from Y to X1 and back)
\(b^2\) (from Y to X2 and back)
\(acb\) (from Y to X1 to X2 to Y)
\(bca\) (from Y to X2 to X1 to Y)

Summing them all up and solving gives us:
\[
\begin{aligned}
r_{y \cdot x1, x2} & = a^2 + b^2 + acb + bca\\
& = 0.61^2 + 0.38^2 + 2 \times(0.61 \times 0.38 \times 0.36)\\
& = 0.68
\end{aligned}
\]
We can think of this as the portion of the correlation of Y with itself that occurs via the predictors. Put another way, this is the amount of variance in Y explained jointly by X1 and X2, which sounds an awful lot like an \(R^2\)!
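This means that the unexplained (residual) proportion of variance in Y is \(1-R^2\), and the path \(d\), drawn as a coefficient from a unit-variance residual, is \(d = \sqrt{1-R^2}\).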
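The same arithmetic in R, using the rounded path values from the worked equation (the object names are our own):

```r
a <- 0.61; b <- 0.38; c_est <- 0.36   # rounded path values from above

# sum of all tracings from Y back to Y that pass through the predictors
r2 <- a^2 + b^2 + 2 * a * b * c_est
d  <- sqrt(1 - r2)                    # residual path

round(c(r2 = r2, d = d), 2)
```

Because the inputs are rounded to two decimals, this lands at about 0.68 for \(R^2\) and 0.56 for \(d\); an \(R^2\) computed from the raw data can differ in the third decimal.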

Hooray! We’ve just worked out regression coefficients when all we had was the correlation matrix of the variables! It’s important to note that we have been using the correlation matrix, so, somewhat unsurprisingly, our estimates are standardised coefficients.

Because we have the data itself, let’s quickly find them with lm()

# quickly scale all the columns in the data
egdat <- egdat %>% mutate_all(~scale(.)[,1])
# extract the coefs
coef(lm(y~x1+x2, egdat))

##  (Intercept)           x1           x2 
## 1.943428e-16 6.118072e-01 3.816321e-01

# extract the r^2
summary(lm(y~x1+x2, egdat))$r.squared

## [1] 0.6884071

Path Mediation

Now that we’ve seen how path analysis works, we can use that same logic to investigate models which have quite different structures, such as those including mediating variables. So if we can’t fit our theoretical model into a regression framework, let’s just fit it into a framework which is lots of regressions smushed together!
Luckily, we can just get the lavaan package to do all of this for us. So let’s look at fitting the model below.

Figure 5: If you’re interested, you can find the inspiration for this data from the paper [here](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7596314/). I haven’t properly read it though!

First we read in our data:

vax <- read_csv("https://uoepsy.github.io/data/vaxdat.csv")
summary(vax)

##   religiosity          hlc          intention    
##  Min.   :-1.000   Min.   :0.400   Min.   :39.00  
##  1st Qu.: 1.800   1st Qu.:2.000   1st Qu.:59.00  
##  Median : 2.400   Median :3.000   Median :64.00  
##  Mean   : 2.396   Mean   :2.992   Mean   :65.09  
##  3rd Qu.: 3.000   3rd Qu.:3.600   3rd Qu.:74.00  
##  Max.   : 4.600   Max.   :5.800   Max.   :88.00

Then we specify the relevant paths:

med_model <- " 
    intention ~ religiosity
    intention ~ hlc
    hlc ~ religiosity
"

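If we fit this model as it is, we won’t actually be testing the indirect effect; we will simply be fitting a couple of regressions.

To test the indirect effect, we need to define it explicitly in the model, by first creating a label for each of its sub-component paths, and then defining the indirect effect itself as the product of those paths. Why the product? A one-unit change in religiosity shifts HLC by the religiosity-to-HLC path, and each unit of HLC in turn shifts intention by the HLC-to-intention path, so the change passed along the whole chain is the two multiplied together (there is also a lovely pdf explainer from Aja).
To do this, we use a new operator, :=.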

med_model <- " 
    intention ~ religiosity
    intention ~ a*hlc
    hlc ~ b*religiosity
    
    indirect:=a*b
"

This operator ‘defines’ new parameters which take on values that are an arbitrary function of the original model parameters. The function, however, must be specified in terms of the parameter labels that are explicitly mentioned in the model syntax.

(the lavaan project)
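To see why the indirect effect is defined as a product, it can help to check it with ordinary lm() fits. The sketch below uses simulated data with made-up variable names (`x`, `m`, `y`), not the vaccination data. For linear regressions fitted by least squares, the product of the two paths equals the total effect minus the direct effect.

```r
set.seed(42)
n <- 500
# hypothetical simulated variables: x -> m -> y, plus a direct x -> y path
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.3 * x + 0.6 * m + rnorm(n)

x_to_m <- coef(lm(m ~ x))["x"]     # path from x to the mediator
fit_y  <- lm(y ~ x + m)
m_to_y <- coef(fit_y)["m"]         # path from the mediator to the outcome
direct <- coef(fit_y)["x"]         # direct effect of x, holding m constant
total  <- coef(lm(y ~ x))["x"]     # total effect of x, ignoring m

indirect <- unname(x_to_m * m_to_y)
c(indirect = indirect, total_minus_direct = unname(total - direct))
```

The two printed numbers agree up to floating-point error, which is the sense in which the `:=` line in the model "defines" the indirect effect from the labelled paths.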

Note. The labels we use are completely up to us. This would be equivalent:

med_model <- " 
    intention ~ religiosity
    intention ~ peppapig * hlc
    hlc ~ kermit * religiosity
    
    indirect:= kermit * peppapig
"

Estimating the model

It is common to estimate the indirect effect using bootstrapping (a method of resampling the data with replacement, thousands of times, in order to empirically generate a sampling distribution). We can do this easily in lavaan:

mm1.est <- sem(med_model, data=vax, se = "bootstrap") 
summary(mm1.est, ci = TRUE)

## lavaan 0.6-8 ended normally after 26 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
##                                                       
##   Number of observations                           100
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             1000
##   Number of successful bootstrap draws            1000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##   intention ~                                                           
##     religiosty        0.270    1.041    0.260    0.795   -1.700    2.470
##     hlc        (a)    5.971    0.944    6.325    0.000    4.037    7.768
##   hlc ~                                                                 
##     religiosty (b)    0.508    0.087    5.853    0.000    0.337    0.686
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##    .intention        62.090    7.837    7.923    0.000   45.128   75.825
##    .hlc               0.753    0.108    7.009    0.000    0.539    0.951
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##     indirect          3.033    0.678    4.475    0.000    1.844    4.463
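To make the idea of bootstrapping the indirect effect concrete, here is a hand-rolled sketch using lm() on simulated data. It is not what lavaan does internally line for line, and the data, variable names and the helper `indirect_effect()` are all made up for illustration.

```r
set.seed(2021)
n <- 200
# hypothetical simulated data standing in for a real dataset
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.2 * x + 0.6 * m + rnorm(n)
dat <- data.frame(x, m, y)

# indirect effect (product of the two paths) for one dataset
indirect_effect <- function(d) {
  unname(coef(lm(m ~ x, data = d))["x"] * coef(lm(y ~ x + m, data = d))["m"])
}

# resample rows with replacement and recompute the indirect effect each time
n_boot <- 1000
boot_est <- replicate(n_boot, {
  resampled <- dat[sample(nrow(dat), replace = TRUE), ]
  indirect_effect(resampled)
})

indirect_effect(dat)                  # estimate in the original sample
quantile(boot_est, c(0.025, 0.975))   # percentile 95% interval
```

The spread of `boot_est` plays the role of the sampling distribution, so its standard deviation is a bootstrap standard error and its 2.5% and 97.5% quantiles give a percentile interval.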

Exercises

This week’s lab focuses on the technique of path analysis using the same context as previous weeks: conduct problems in adolescence. In this week’s example, a researcher has collected data on n=557 adolescents and would like to know whether there are associations between conduct problems (both aggressive and non-aggressive) and academic performance and whether the relations are mediated by the quality of relationships with teachers.

Question A1

First, read in the dataset from https://uoepsy.github.io/data/cp_teachacad.csv

Solution

cp_teach<-read_csv("https://uoepsy.github.io/data/cp_teachacad.csv")
summary(cp_teach)

##        ID           Acad             Teach_r            Non_agg        
##  Min.   :  1   Min.   :-3.04218   Min.   :-3.68090   Min.   :-3.28306  
##  1st Qu.:140   1st Qu.:-0.73573   1st Qu.:-0.88734   1st Qu.:-0.68964  
##  Median :279   Median :-0.02592   Median : 0.01081   Median :-0.09320  
##  Mean   :279   Mean   :-0.06130   Mean   :-0.09497   Mean   :-0.06278  
##  3rd Qu.:418   3rd Qu.: 0.60620   3rd Qu.: 0.61758   3rd Qu.: 0.56956  
##  Max.   :557   Max.   : 3.05654   Max.   : 3.52253   Max.   : 3.43912  
##       Agg          
##  Min.   :-3.27096  
##  1st Qu.:-0.72768  
##  Median :-0.01646  
##  Mean   :-0.02455  
##  3rd Qu.: 0.68199  
##  Max.   : 3.30539

Question A2

Use the sem() function in lavaan to specify and estimate a straightforward linear regression model to test whether aggressive and non-aggressive conduct problems significantly predict academic performance.

How do your results compare to those you obtain using the lm() function?

Solution

# we can fit the model in lavaan as follows:
# first we specify the model using lavaan syntax
sr_lavaan<-'Acad~Non_agg+Agg'
# next we can estimate the model using the sem() function
sr_lavaan.est<-sem(sr_lavaan, data=cp_teach)
# we can inspect the results using the summary() function
summary(sr_lavaan.est)

## lavaan 0.6-8 ended normally after 13 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         3
##                                                       
##   Number of observations                           557
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   Acad ~                                              
##     Non_agg           0.182    0.057    3.178    0.001
##     Agg               0.318    0.057    5.599    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .Acad              0.943    0.057   16.688    0.000

# the same model can be fit using lm():

sr_lm<-lm(Acad~Non_agg+Agg, data=cp_teach)
summary(sr_lm)

## 
## Call:
## lm(formula = Acad ~ Non_agg + Agg, data = cp_teach)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.89617 -0.59575  0.00731  0.62189  3.13248 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.04204    0.04135  -1.017  0.30973    
## Non_agg      0.18238    0.05755   3.169  0.00161 ** 
## Agg          0.31809    0.05697   5.583  3.7e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9737 on 554 degrees of freedom
## Multiple R-squared:  0.1937, Adjusted R-squared:  0.1908 
## F-statistic: 66.54 on 2 and 554 DF,  p-value: < 2.2e-16

We can see that both non-aggressive and aggressive conduct problems significantly predict academic performance. We can also see that we get essentially the same results when we use the sem() function as we do when we use the lm() function: the coefficient estimates match, and the standard errors and p-values differ only slightly. lavaan will give essentially the same results as lm() for simple and multiple regression problems. However, if we have multiple outcome variables in our model, it is advantageous to fit a path model with lavaan, because this allows us to include all the regressions in a single model.
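One place where the two outputs visibly differ is the residual variance. Our reading, which the output itself does not state, is that maximum likelihood divides the residual sum of squares by n, whereas lm() divides by the residual degrees of freedom. Using only numbers reported in the output above:

```r
n      <- 557      # number of observations
df_res <- 554      # residual degrees of freedom reported by lm()
sigma  <- 0.9737   # residual standard error reported by lm()

ols_var <- sigma^2                 # lm():  RSS / df_res
ml_var  <- ols_var * df_res / n    # ML:    RSS / n
round(ml_var, 3)
```

This reproduces the 0.943 that lavaan reports for the variance of `.Acad`.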

Question A3

Now specify a model in which non-aggressive conduct problems have both a direct and indirect effect (via teacher relationships) on academic performance

Solution

model1<-'
    #we regress academic performance on non-aggressive conduct problems (the direct effect)
    Acad~Non_agg
    
    #we regress academic performance on teacher relationship quality
    Acad~Teach_r
    
    #we regress teacher relationship quality on non-aggressive conduct problems
    Teach_r~Non_agg 
'

Question A4

Now define the indirect effect in order to test the hypothesis that non-aggressive conduct problems have both a direct and an indirect effect (via teacher relationships) on academic performance.

Fit the model and examine the 95% CI.

Solution

#model specification
model1<-'
    Acad~Non_agg
    #we label the two parameters that comprise the indirect effect b and c
    Acad~b*Teach_r    
    Teach_r~c*Non_agg  
    
    # the indirect effect is the product of b and c. We create a new parameter (ind) to estimate the indirect effect
    ind:=b*c   
'

#model estimation
# we request bootstrapped standard errors to assess the significance of the indirect effect
model1.est<-sem(model1, data=cp_teach, se='bootstrap') 

# ci=TRUE adds confidence intervals to the output
summary(model1.est, ci=TRUE)

## lavaan 0.6-8 ended normally after 13 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
##                                                       
##   Number of observations                           557
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             1000
##   Number of successful bootstrap draws            1000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##   Acad ~                                                                
##     Non_agg           0.158    0.060    2.643    0.008    0.035    0.281
##     Teach_r    (b)    0.328    0.050    6.525    0.000    0.223    0.425
##   Teach_r ~                                                             
##     Non_agg    (c)    0.769    0.034   22.783    0.000    0.701    0.833
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##    .Acad              0.919    0.058   15.935    0.000    0.804    1.036
##    .Teach_r           0.713    0.043   16.653    0.000    0.629    0.799
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##     ind               0.252    0.039    6.476    0.000    0.171    0.330

We can see that the 95% bootstrapped confidence interval for the indirect effect of non-aggressive conduct problems on academic performance (‘ind’) does not include zero. We can conclude that the indirect effect is significant at \(p <.05\). The direct effect is also statistically significant at \(p < .05\).
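Rather than reading the interval off the printed summary, we can also pull it out of the fitted object with lavaan's parameterEstimates(). The sketch below uses simulated data and a made-up model (`sim`, `sim_model`) so that it runs on its own, and it requests fewer bootstrap draws than the 1000 shown above purely to keep it quick.

```r
library(lavaan)

set.seed(123)
n <- 300
# hypothetical simulated data
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.2 * x + 0.4 * m + rnorm(n)
sim <- data.frame(x, m, y)

sim_model <- "
    y ~ x
    y ~ b*m
    m ~ c*x
    ind := b*c
"
sim_fit <- sem(sim_model, data = sim, se = "bootstrap", bootstrap = 200)

# percentile bootstrap intervals for every parameter
est <- parameterEstimates(sim_fit, boot.ci.type = "perc", level = 0.95)

# keep only the defined parameter 'ind'
ind_row <- est[est$op == ":=" & est$lhs == "ind",
               c("lhs", "est", "ci.lower", "ci.upper")]
ind_row

# TRUE when the interval lies entirely on one side of zero
ind_row$ci.lower > 0 | ind_row$ci.upper < 0
```

The final line is the programmatic version of checking whether the interval includes zero.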

Question A5

Specify a new parameter which is the total (direct+indirect) effect of non-aggressive conduct problems on academic performance

Solution

We can create a new parameter that is the sum of the direct and indirect effect to evaluate the total effect of non-aggressive conduct problems on academic performance.

#model specification

model1<-'
    # we now also label the direct effect of non-aggressive conduct problems on academic performance
    Acad~a*Non_agg    
    Acad~b*Teach_r    
    Teach_r~c*Non_agg  
    
    ind:=b*c   
    #the total effect is the indirect effect plus the direct effect
    total:=b*c+a
'

#model estimation
# we request bootstrapped standard errors to assess the significance of the indirect and total effects
model1.est<-sem(model1, data=cp_teach,se='bootstrap') 

# ci=TRUE adds confidence intervals to the output
summary(model1.est, ci=TRUE)

## lavaan 0.6-8 ended normally after 13 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
##                                                       
##   Number of observations                           557
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             1000
##   Number of successful bootstrap draws            1000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##   Acad ~                                                                
##     Non_agg    (a)    0.158    0.057    2.765    0.006    0.048    0.273
##     Teach_r    (b)    0.328    0.050    6.562    0.000    0.235    0.423
##   Teach_r ~                                                             
##     Non_agg    (c)    0.769    0.034   22.881    0.000    0.698    0.833
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##    .Acad              0.919    0.058   15.778    0.000    0.805    1.033
##    .Teach_r           0.713    0.042   16.905    0.000    0.631    0.799
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##     ind               0.252    0.039    6.406    0.000    0.180    0.334
##     total             0.410    0.043    9.555    0.000    0.330    0.492

Question A6

Now visualise the estimated model and its parameters using the semPaths() function from the semPlot package.

Solution

#to include the parameter estimates we set what='est'
semPaths(model1.est, what='est')

A more complex model

Question B1

Now specify a model in which both aggressive and non-aggressive conduct problems have both direct and indirect effects (via teacher relationships) on academic performance. Include the parameters for the indirect effects.

Solution

model2<-
   'Acad~Agg+Non_agg+b*Teach_r
    Teach_r~c1*Agg+c2*Non_agg
   
    ind1:=b*c1 #indirect effect for aggressive conduct problems
    ind2:=b*c2 #indirect effect for non-aggressive conduct problems
'

We now have two predictors, one mediator and one outcome (and two indirect effects, one for each predictor). We can represent this in two lines: one where we specify academic performance as the outcome variable and one where we specify teacher relationships (the mediator) as the outcome variable.

Question B2

Now estimate the model and test the significance of the indirect effects

Solution

model2.est<-sem(model2, data=cp_teach, se='bootstrap') 
summary(model2.est, ci=TRUE)

## lavaan 0.6-8 ended normally after 18 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         7
##                                                       
##   Number of observations                           557
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             1000
##   Number of successful bootstrap draws            1000
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##   Acad ~                                                                
##     Agg               0.171    0.061    2.792    0.005    0.049    0.299
##     Non_agg           0.091    0.065    1.392    0.164   -0.036    0.221
##     Teach_r    (b)    0.256    0.055    4.650    0.000    0.146    0.367
##   Teach_r ~                                                             
##     Agg       (c1)    0.574    0.045   12.797    0.000    0.488    0.667
##     Non_agg   (c2)    0.358    0.042    8.557    0.000    0.278    0.438
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##    .Acad              0.908    0.055   16.399    0.000    0.793    1.016
##    .Teach_r           0.540    0.030   18.246    0.000    0.479    0.596
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
##     ind1              0.147    0.033    4.392    0.000    0.082    0.214
##     ind2              0.092    0.022    4.085    0.000    0.049    0.138

We can see that the 95% confidence intervals for both indirect effects do not include zero, therefore, we can conclude that they are significant at \(p < .05\).

Question B3

Write a brief paragraph reporting on the results of the model estimates in Question B2. Include a Figure or Table to display the parameter estimates.

Solution

Optional: Mediation the more manual way: back to lm()

Following Baron & Kenny 1986, we can conduct mediation analysis by using three separate regression models.

\(y \sim x\)
\(m \sim x\)
\(y \sim x + m\)

Step 1. Establish the effect of x on y (y ~ x):
if x predicts y, then there is an effect that could be mediated.

vax <- read_csv("https://uoepsy.github.io/data/vaxdat.csv")

mod1 <- lm(intention ~ religiosity, data = vax)
summary(mod1)$coefficients

##              Estimate Std. Error  t value     Pr(>|t|)
## (Intercept) 57.175330  2.4217180 23.60941 3.167781e-42
## religiosity  3.303285  0.9292332  3.55485 5.840235e-04

Step 2. Establish the effect of x on m (m ~ x): if x predicts m, then m is a candidate mediator.

mod2 <- lm(hlc ~ religiosity, data = vax)
summary(mod2)$coefficients

##              Estimate Std. Error  t value     Pr(>|t|)
## (Intercept) 1.7749083 0.22288830 7.963219 3.039025e-12
## religiosity 0.5079682 0.08552408 5.939475 4.360399e-08

Step 3. Examine the effect of y ~ x + m:
If x no longer predicts y after partialling out effects due to m, then there is full mediation. If the effect of x on y is smaller but still present, then there is partial mediation.

mod3 <- lm(intention ~ religiosity + hlc, data = vax)
summary(mod3)$coefficients

##               Estimate Std. Error    t value     Pr(>|t|)
## (Intercept) 46.5779660  2.6100101 17.8458946 2.102793e-32
## religiosity  0.2703824  0.9100233  0.2971159 7.670134e-01
## hlc          5.9706544  0.9216921  6.4779272 3.850955e-09

Step 4. Test for the mediation.
There are various ways to do this, but the simplest is probably:

library(mediation)
summary(mediate(mod2, mod3, treat='religiosity', mediator='hlc', boot=TRUE, sims=500))

## 
## Causal Mediation Analysis 
## 
## Nonparametric Bootstrap Confidence Intervals with the Percentile Method
## 
##                Estimate 95% CI Lower 95% CI Upper p-value    
## ACME              3.033        1.844         4.42  <2e-16 ***
## ADE               0.270       -1.584         2.57    0.73    
## Total Effect      3.303        1.648         5.22  <2e-16 ***
## Prop. Mediated    0.918        0.488         1.85  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 100 
## 
## 
## Simulations: 500

ACME: Average Causal Mediation Effects
ADE: Average Direct Effects
Total Effect: sum of the mediation (indirect) effect and the direct effect.
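These quantities tie back to the three lm() fits. As a quick check, using only coefficients copied from the output above (the object names are our own):

```r
# coefficients copied from the lm() output above
x_to_m <- 0.5079682   # hlc ~ religiosity                     (mod2)
m_to_y <- 5.9706544   # intention ~ hlc, given religiosity    (mod3)
direct <- 0.2703824   # intention ~ religiosity, given hlc    (mod3)
total  <- 3.303285    # intention ~ religiosity               (mod1)

indirect <- x_to_m * m_to_y
round(c(indirect = indirect,
        direct_plus_indirect = direct + indirect,
        prop_mediated = indirect / total), 3)
```

The product of the two paths (3.033) matches the ACME, the direct effect plus the indirect effect recovers the total effect (3.303), and their ratio gives the proportion mediated (0.918).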