Recap

You have (hopefully) already made a head start on this week’s exercises if you completed the Factorial ANOVA section of last week’s lab. If you haven’t yet completed those two questions, do so before reading any further.

In this week’s exercises, we will further explore questions such as:

Does level \(i\) of the first factor have an effect on the response?
Does level \(j\) of the second factor have an effect on the response?
Is there a combined effect of level \(i\) of the first factor and level \(j\) of the second factor on the response? In other words, is there interaction of the two factors so that the combined effect is not simply the additive effect of level \(i\) of the first factor plus the effect of level \(j\) of the second factor?

Research question and data

As a reminder, we are working with data from a study yielding a \(3 \times 3\) factorial design to test whether there are differences in types of memory deficits for those experiencing different cognitive impairment(s).

	Task
Diagnosis	grammar	classification	recognition
amnesic	44, 63, 76, 72, 45	72, 66, 55, 82, 75	70, 51, 82, 66, 56
huntingtons	24, 30, 51, 55, 40	53, 59, 33, 37, 43	107, 80, 98, 82, 108
control	76, 98, 71, 70, 85	92, 65, 86, 67, 90	107, 80, 101, 82, 105
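If you do not have the data file to hand, the table above can be typed into R directly. The following is a minimal sketch rather than the lab’s own loading code: the names cog, Diagnosis, Task and Score are taken from the model code used in this lab, while the factor level ordering (control and recognition first) is an assumption chosen to match the ordering seen in the emmeans output.

```r
# Scores entered row by row from the table: amnesic, huntingtons, control;
# within each diagnosis: grammar, classification, recognition (5 scores each)
cog <- data.frame(
  Diagnosis = rep(c("amnesic", "huntingtons", "control"), each = 15),
  Task      = rep(rep(c("grammar", "classification", "recognition"), each = 5), times = 3),
  Score     = c(
    44, 63, 76, 72, 45,   72, 66, 55, 82, 75,   70, 51, 82, 66, 56,
    24, 30, 51, 55, 40,   53, 59, 33, 37, 43,   107, 80, 98, 82, 108,
    76, 98, 71, 70, 85,   92, 65, 86, 67, 90,   107, 80, 101, 82, 105
  )
)

# Assumed level ordering: first level = reference level under dummy coding
cog$Diagnosis <- factor(cog$Diagnosis, levels = c("control", "amnesic", "huntingtons"))
cog$Task      <- factor(cog$Task, levels = c("recognition", "grammar", "classification"))

# Quick check: mean score in each of the 9 cells
cell_means <- tapply(cog$Score, list(cog$Diagnosis, cog$Task), mean)
cell_means
```

Each cell contains 5 observations, so the design is balanced with 45 scores in total.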

Interaction Model

Question 1

Let’s look at the summary() and anova() output in detail from the model you should have previously fitted with the sum to zero constraint. As a reminder, the model with interaction is:

\[\begin{aligned} Score &= \beta_0 \\ &+ \beta_1 D_\text{Control} + \beta_2 D_\text{Amnesic} \\ &+ \beta_3 T_\text{Recognition} + \beta_4 T_\text{Grammar} \\ &+ \beta_5 (D_\text{Control} * T_\text{Recognition}) + \beta_6 (D_\text{Amnesic} * T_\text{Recognition}) \\ &+ \beta_7 (D_\text{Control} * T_\text{Grammar}) + \beta_8 (D_\text{Amnesic} * T_\text{Grammar}) \\ &+ \epsilon \end{aligned}\]

Applying the sum to zero constraint (for Diagnosis of ‘Control’ and Task of ‘Recognition’), we would have:

\[\begin{aligned} \text{Intercept (global mean)} &= \beta_0 = \frac{\mu_{1,1} + \mu_{1,2} + \cdots + \mu_{3,3}}{9} \\ \beta_{Huntingtons} &= -(\beta_1 + \beta_2) \\ \beta_{Classification} &= -(\beta_3 + \beta_4) \\ \beta_{Huntingtons:Classification} &= \beta_5 + \beta_6 + \beta_7 + \beta_8 \end{aligned}\]
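To see what the sum to zero constraint means numerically, the effects can be worked out by hand from the nine cell means. This is only an illustrative sketch: it assumes a balanced design, and the cell means are hard-coded here (they are the estimated marginal means that emmeans reports for this model). The object names mu, grand, diag_eff, task_eff and int_eff are made up for this example.

```r
# Cell means: rows = Diagnosis, columns = Task
mu <- matrix(c(95, 80, 80,
               65, 60, 70,
               95, 40, 45),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("control", "amnesic", "huntingtons"),
                             c("recognition", "grammar", "classification")))

grand    <- mean(mu)               # intercept: mean of the 9 cell means
diag_eff <- rowMeans(mu) - grand   # Diagnosis effects (deviations from grand mean)
task_eff <- colMeans(mu) - grand   # Task effects (deviations from grand mean)

# Interaction effects: what is left after removing grand mean and both main effects
int_eff <- mu - grand - outer(diag_eff, task_eff, "+")

grand
diag_eff
task_eff
int_eff

# The constraints: main effects sum to zero, and the interaction effects
# sum to zero within every row and every column
sum(diag_eff)
sum(task_eff)
rowSums(int_eff)
colSums(int_eff)
```

Because the effects sum to zero, the effect for the last level of each factor (huntingtons, classification) does not need its own coefficient: it is minus the sum of the others.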

Solution

Let’s look at the anova() and summary() output:

contrasts(cog$Diagnosis) <- "contr.sum"
contrasts(cog$Task) <- "contr.sum"
mdl_int <- lm(Score ~ Diagnosis * Task, data = cog)
anova(mdl_int)

## Analysis of Variance Table
## 
## Response: Score
##                Df Sum Sq Mean Sq F value    Pr(>F)    
## Diagnosis       2   5250 2625.00 16.6373  7.64e-06 ***
## Task            2   5250 2625.00 16.6373  7.64e-06 ***
## Diagnosis:Task  4   5000 1250.00  7.9225 0.0001092 ***
## Residuals      36   5680  157.78                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Scores significantly differed by both Diagnosis (\(F(2, 36)=16.64, p < .001\)) and by Task (\(F(2, 36)=16.64, p < .001\)). The interaction between Diagnosis and Task was also significant (\(F(4, 36)=7.92, p < .001\)). This provides evidence against the null hypothesis that the effect of Task is constant across the different levels of Diagnosis.
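Each F value in the table is simply the mean square for that term divided by the residual mean square. As a rough check, the arithmetic can be reproduced from the sums of squares and degrees of freedom printed above; the variable names below are made up for this sketch.

```r
# Sums of squares and degrees of freedom copied from the anova() table
ss <- c(Diagnosis = 5250, Task = 5250, Interaction = 5000)
df <- c(Diagnosis = 2,    Task = 2,    Interaction = 4)
ss_res <- 5680
df_res <- 36

ms     <- ss / df          # mean square for each term
ms_res <- ss_res / df_res  # residual mean square

f_vals <- ms / ms_res      # F = MS_term / MS_residual
p_vals <- pf(f_vals, df1 = df, df2 = df_res, lower.tail = FALSE)

round(f_vals, 4)
signif(p_vals, 3)
```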

summary(mdl_int)

## 
## Call:
## lm(formula = Score ~ Diagnosis * Task, data = cog)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##    -16    -12      2     11     18 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        70.000      1.872  37.384  < 2e-16 ***
## Diagnosis1         15.000      2.648   5.664 1.95e-06 ***
## Diagnosis2         -5.000      2.648  -1.888 0.067085 .  
## Task1              15.000      2.648   5.664 1.95e-06 ***
## Task2             -10.000      2.648  -3.776 0.000576 ***
## Diagnosis1:Task1   -5.000      3.745  -1.335 0.190216    
## Diagnosis2:Task1  -15.000      3.745  -4.005 0.000297 ***
## Diagnosis1:Task2    5.000      3.745   1.335 0.190216    
## Diagnosis2:Task2    5.000      3.745   1.335 0.190216    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.56 on 36 degrees of freedom
## Multiple R-squared:  0.7318, Adjusted R-squared:  0.6722 
## F-statistic: 12.28 on 8 and 36 DF,  p-value: 2.844e-08

The F-test for model utility is again significant at the 5% level: \(F(8,36) = 12.28, p < .001\). The probability of an F-statistic this large or larger occurring by chance alone is very small. In the presence of a significant interaction we do not interpret the main effects, as their interpretation changes with the level of the other factor.
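The bottom lines of the summary() output can also be recovered from the ANOVA sums of squares. The sketch below is only a hand check with made-up variable names; it relies on the fact that, in this balanced design, the model sum of squares is the sum of the three term sums of squares.

```r
# Sums of squares from the anova() table
ss_model <- 5250 + 5250 + 5000   # Diagnosis + Task + Diagnosis:Task
ss_res   <- 5680
ss_total <- ss_model + ss_res

df_model <- 8    # 2 + 2 + 4
df_res   <- 36

r2      <- ss_model / ss_total                                   # Multiple R-squared
adj_r2  <- 1 - (ss_res / df_res) / (ss_total / (df_model + df_res))  # Adjusted R-squared
f_model <- (ss_model / df_model) / (ss_res / df_res)             # overall F-statistic
p_model <- pf(f_model, df_model, df_res, lower.tail = FALSE)

round(c(r2 = r2, adj_r2 = adj_r2, f_model = f_model), 4)
p_model
```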

Question 2

Based on previous plotting and the output above, there does seem to be a clear interaction between diagnosis and task. However, we have not statistically compared our interaction model to an additive model (the same model but without the interaction). Until we do so, we cannot confidently progress on the assumption that the interaction model is the most suitable for answering the research question.

We also want to consider which coding scheme best answers the research question: are we interested in whether group X (e.g., Amnesic) differed from group Y (e.g., Huntingtons), or whether group X (e.g., Amnesic) differed from the overall group mean?

Since we are interested in comparing groups, we should reset to dummy coding, and thus should re-run our interaction model. Next, we need to perform a model comparison between the additive model and the interaction model using the anova() function.

After re-running your model with dummy coding, interpret the result of the model comparison.

\[\begin{aligned} \text{Additive Model: } Score &= \beta_0 \\ &+ \beta_1 D_\text{Amnesic} + \beta_2 D_\text{Huntingtons} \\ &+ \beta_3 T_\text{Grammar} + \beta_4 T_\text{Classification} \\ &+ \epsilon \end{aligned}\]

\[\begin{aligned} \text{Interaction Model: } Score &= \beta_0 \\ &+ \beta_1 D_\text{Amnesic} + \beta_2 D_\text{Huntingtons} \\ &+ \beta_3 T_\text{Grammar} + \beta_4 T_\text{Classification} \\ &+ \beta_5 (D_\text{Amnesic} * T_\text{Grammar}) + \beta_6 (D_\text{Huntingtons} * T_\text{Grammar}) \\ &+ \beta_7 (D_\text{Amnesic} * T_\text{Classification}) + \beta_8 (D_\text{Huntingtons} * T_\text{Classification}) \\ &+ \epsilon \end{aligned}\]

Solution

Switch back to dummy coding:

contrasts(cog$Diagnosis) <- "contr.treatment"
contrasts(cog$Task) <- "contr.treatment"

Build additive model and re-run interaction model:

mdl_add <- lm(Score ~ Diagnosis + Task, data = cog)
mdl_int <- lm(Score ~ Diagnosis * Task, data = cog)

The relevant function is anova() with the two models as inputs to conduct a model comparison:

anova(mdl_add, mdl_int)

## Analysis of Variance Table
## 
## Model 1: Score ~ Diagnosis + Task
## Model 2: Score ~ Diagnosis * Task
##   Res.Df   RSS Df Sum of Sq      F    Pr(>F)    
## 1     40 10680                                  
## 2     36  5680  4      5000 7.9225 0.0001092 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We performed an F-test to compare two nested models: an additive two-factor ANOVA against a two-factor model with interaction. The test results are \(F(4, 36) = 7.92, p < .001\).

If the additive model were adequate, the probability of obtaining an F-statistic as large as 7.92 or larger would be < .001, so the result is significant at the 5% level.

Hence, the comparison of nested models provides evidence against the additive effects model, suggesting that we should use the interaction model, as each factor has a different effect on the response depending on the level of the other factor.
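For reference, the F-statistic in the model comparison table can be reproduced by hand from the two residual sums of squares (RSS) and their residual degrees of freedom. The subscripts add and int are labels introduced here for the additive and interaction models:

\[ F = \frac{(RSS_\text{add} - RSS_\text{int}) / (df_\text{add} - df_\text{int})}{RSS_\text{int} / df_\text{int}} = \frac{(10680 - 5680) / (40 - 36)}{5680 / 36} = \frac{1250}{157.78} \approx 7.92 \]

The numerator is the improvement in fit per extra parameter when the four interaction terms are added; the denominator is the residual mean square of the larger model.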

Question 3

Using plot_model() (note that this function is from the sjPlot package, so make sure that you load it), generate a plot showing the predicted mean scores for each combination of levels of the diagnosis and task factors.

Solution

Contrast analysis

We will begin by looking at each factor separately.

In terms of the diagnostic groups, recall that we want to compare the amnesic patients to the Huntingtons individuals. This corresponds to a contrast with coefficients of 0, 1, and −1, for control, amnesic, and Huntingtons, respectively.

Similarly, in terms of the tasks, we want to compare the average of the two implicit memory tasks with the explicit memory task. This corresponds to a contrast with coefficients of 0.5, 0.5, and −1 for the three tasks.

When we are in the presence of a significant interaction, the coefficients for a contrast between the means are found by multiplying each row coefficient with all column coefficients as shown below:

This can be done in R using:

diag_coef  <- c('control' = 0, 'amnesic' = 1, 'huntingtons' = -1)
task_coef  <- c('grammar' = 0.5, 'classification' = 0.5, 'recognition' = -1)
contr_coef <- outer(diag_coef, task_coef)   # or: diag_coef %o% task_coef
contr_coef

##             grammar classification recognition
## control         0.0            0.0           0
## amnesic         0.5            0.5          -1
## huntingtons    -0.5           -0.5           1

The above coefficients correspond to testing the null hypothesis

\[ H_0 : \frac{\mu_{2,1} + \mu_{2,2}}{2} - \mu_{2,3} - \left( \frac{\mu_{3,1} + \mu_{3,2}}{2} - \mu_{3,3} \right) = 0 \]

or, equivalently,

\[ H_0 : \frac{\mu_{2,1} + \mu_{2,2}}{2} - \mu_{2,3} = \frac{\mu_{3,1} + \mu_{3,2}}{2} - \mu_{3,3} \]

which says that, in the population, the difference between the mean implicit memory and the explicit memory score is the same for amnesic patients and Huntingtons individuals. Note that the scores for the grammar and classification tasks have been averaged to obtain a single measure of ‘implicit memory’ score.

Now that we have the coefficients, let’s call the emmeans function (this is helpful to look at the ordering of the groups):

library(emmeans)
emm <- emmeans(mdl_int, ~ Diagnosis*Task)
emm

##  Diagnosis   Task           emmean   SE df lower.CL upper.CL
##  control     recognition        95 5.62 36     83.6    106.4
##  amnesic     recognition        65 5.62 36     53.6     76.4
##  huntingtons recognition        95 5.62 36     83.6    106.4
##  control     grammar            80 5.62 36     68.6     91.4
##  amnesic     grammar            60 5.62 36     48.6     71.4
##  huntingtons grammar            40 5.62 36     28.6     51.4
##  control     classification     80 5.62 36     68.6     91.4
##  amnesic     classification     70 5.62 36     58.6     81.4
##  huntingtons classification     45 5.62 36     33.6     56.4
## 
## Confidence level used: 0.95

Next, from contr_coef, insert the coefficients following the order specified by the rows of emm above. That is, the first one should be for control recognition and have a value of 0, the second for amnesic recognition with a value of -1, and so on…
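Typing nine coefficients by hand in the right order is easy to get wrong. One possible way to let R do the reordering is sketched below; it assumes the emm rows are ordered as shown above (Diagnosis changing fastest within each Task, with tasks in the order recognition, grammar, classification). The names emm_order and contr_vec are made up for this example.

```r
diag_coef  <- c('control' = 0, 'amnesic' = 1, 'huntingtons' = -1)
task_coef  <- c('grammar' = 0.5, 'classification' = 0.5, 'recognition' = -1)
contr_coef <- outer(diag_coef, task_coef)

# Reorder rows and columns by name so they match the emm ordering
emm_order <- contr_coef[c("control", "amnesic", "huntingtons"),
                        c("recognition", "grammar", "classification")]

# as.vector() reads a matrix column by column, so Diagnosis changes fastest,
# just like the rows of emm
contr_vec <- as.vector(emm_order)
contr_vec
```

The resulting vector can then be compared against the coefficients entered manually in the contrast() call.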

We also give a name to this contrast, such as ‘Research Hyp.’

comp_res <- contrast(emm, method = list('Research Hyp' = c(0, -1, 1, 0, 0.5, -0.5, 0, 0.5, -0.5)))
comp_res

##  contrast     estimate   SE df t.ratio p.value
##  Research Hyp     52.5 9.73 36   5.396  <.0001

confint(comp_res)

##  contrast     estimate   SE df lower.CL upper.CL
##  Research Hyp     52.5 9.73 36     32.8     72.2
## 
## Confidence level used: 0.95

or:

summary(comp_res, infer = TRUE)

##  contrast     estimate   SE df lower.CL upper.CL t.ratio p.value
##  Research Hyp     52.5 9.73 36     32.8     72.2   5.396  <.0001
## 
## Confidence level used: 0.95
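The estimate, standard error and confidence interval above can be checked by hand. The sketch below is not how emmeans works internally; it simply applies the usual formulas for a contrast of cell means in a balanced design, using the cell means from the emm output, the residual mean square (5680/36) and n = 5 observations per cell. The variable names are made up for this example.

```r
# Cell means from emm: rows = Diagnosis, columns = Task
mu <- matrix(c(95, 80, 80,
               65, 60, 70,
               95, 40, 45),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("control", "amnesic", "huntingtons"),
                             c("recognition", "grammar", "classification")))

# Contrast weights, written in the same row/column order as mu
diag_coef <- c(control = 0, amnesic = 1, huntingtons = -1)
task_coef <- c(recognition = -1, grammar = 0.5, classification = 0.5)
w <- outer(diag_coef, task_coef)

mse    <- 5680 / 36   # residual mean square from the ANOVA table
n      <- 5           # observations per cell
df_res <- 36

estimate <- sum(w * mu)                 # weighted sum of cell means
se       <- sqrt(mse * sum(w^2) / n)    # standard error of the contrast
t_ratio  <- estimate / se
p_value  <- 2 * pt(abs(t_ratio), df = df_res, lower.tail = FALSE)
ci       <- estimate + c(-1, 1) * qt(0.975, df_res) * se

round(c(estimate = estimate, se = se, t = t_ratio), 3)
round(ci, 1)
p_value
```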

Question 4

Interpret the results of the contrast analysis.

Solution

Simple Effects

By considering the simple effects, we can identify at which levels of the interacting condition we see different effects.

Question 5

Since we have a significant interaction, we should also look at the simple main effects. Simple effects are the effect of one factor (e.g., Task) at each level of another factor (e.g., Diagnosis - Control, Huntingtons, and Amnesic).

Examine the simple effects for Task at each level of Diagnosis; and then the simple effects for Diagnosis at each level of Task.

Solution

mdl_int_simple1 <- pairs(emm, simple = "Task")
mdl_int_simple1

## Diagnosis = control:
##  contrast                     estimate   SE df t.ratio p.value
##  recognition - grammar              15 7.94 36   1.888  0.1567
##  recognition - classification       15 7.94 36   1.888  0.1567
##  grammar - classification            0 7.94 36   0.000  1.0000
## 
## Diagnosis = amnesic:
##  contrast                     estimate   SE df t.ratio p.value
##  recognition - grammar               5 7.94 36   0.629  0.8050
##  recognition - classification       -5 7.94 36  -0.629  0.8050
##  grammar - classification          -10 7.94 36  -1.259  0.4273
## 
## Diagnosis = huntingtons:
##  contrast                     estimate   SE df t.ratio p.value
##  recognition - grammar              55 7.94 36   6.923  <.0001
##  recognition - classification       50 7.94 36   6.294  <.0001
##  grammar - classification           -5 7.94 36  -0.629  0.8050
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

mdl_int_simple2 <- pairs(emm, simple = "Diagnosis")
mdl_int_simple2

## Task = recognition:
##  contrast              estimate   SE df t.ratio p.value
##  control - amnesic           30 7.94 36   3.776  0.0016
##  control - huntingtons        0 7.94 36   0.000  1.0000
##  amnesic - huntingtons      -30 7.94 36  -3.776  0.0016
## 
## Task = grammar:
##  contrast              estimate   SE df t.ratio p.value
##  control - amnesic           20 7.94 36   2.518  0.0424
##  control - huntingtons       40 7.94 36   5.035  <.0001
##  amnesic - huntingtons       20 7.94 36   2.518  0.0424
## 
## Task = classification:
##  contrast              estimate   SE df t.ratio p.value
##  control - amnesic           10 7.94 36   1.259  0.4273
##  control - huntingtons       35 7.94 36   4.406  0.0003
##  amnesic - huntingtons       25 7.94 36   3.147  0.0091
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

From mdl_int_simple1 we can see the differences between tasks for each diagnosis group, and from mdl_int_simple2 the differences between diagnoses for each task group.
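Every comparison above has the same standard error (7.94) because the design is balanced. As a rough hand check of one row (recognition - grammar for the control group), the sketch below uses the residual mean square from the ANOVA table and n = 5 per cell. The Tukey-adjusted p-value is obtained here from the studentized range distribution via ptukey(); treating this as equivalent to the adjustment emmeans applies is an assumption of this sketch.

```r
mse    <- 5680 / 36   # residual mean square
n      <- 5           # observations per cell
df_res <- 36

# Standard error of a difference between two cell means
se_diff <- sqrt(2 * mse / n)

est     <- 15               # recognition - grammar, control group
t_ratio <- est / se_diff

# Tukey adjustment for a family of 3 means: |t| * sqrt(2) follows the
# studentized range distribution under the null hypothesis
p_tukey <- ptukey(sqrt(2) * abs(t_ratio), nmeans = 3, df = df_res,
                  lower.tail = FALSE)

round(c(se = se_diff, t = t_ratio, p = p_tukey), 4)
```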

Question 6

There are various ways we can create an interaction plot, for instance, try this code:

emmip(mdl_int, Diagnosis ~ Task, CIs = TRUE)

Considering the simple effects that we just saw in Question 5, identify the significant effects and match them to the parts of an interaction plot.

Optional: You can change what is plotted on the x-axis

Solution

Pairwise Comparisons

Question 7

Conduct exploratory pairwise comparisons to compare all levels of Diagnosis with all levels of Task.

Solution

pairs_res <- pairs(emm)
pairs_res

##  contrast                                             estimate   SE df t.ratio
##  control recognition - amnesic recognition                  30 7.94 36   3.776
##  control recognition - huntingtons recognition               0 7.94 36   0.000
##  control recognition - control grammar                      15 7.94 36   1.888
##  control recognition - amnesic grammar                      35 7.94 36   4.406
##  control recognition - huntingtons grammar                  55 7.94 36   6.923
##  control recognition - control classification               15 7.94 36   1.888
##  control recognition - amnesic classification               25 7.94 36   3.147
##  control recognition - huntingtons classification           50 7.94 36   6.294
##  amnesic recognition - huntingtons recognition             -30 7.94 36  -3.776
##  amnesic recognition - control grammar                     -15 7.94 36  -1.888
##  amnesic recognition - amnesic grammar                       5 7.94 36   0.629
##  amnesic recognition - huntingtons grammar                  25 7.94 36   3.147
##  amnesic recognition - control classification              -15 7.94 36  -1.888
##  amnesic recognition - amnesic classification               -5 7.94 36  -0.629
##  amnesic recognition - huntingtons classification           20 7.94 36   2.518
##  huntingtons recognition - control grammar                  15 7.94 36   1.888
##  huntingtons recognition - amnesic grammar                  35 7.94 36   4.406
##  huntingtons recognition - huntingtons grammar              55 7.94 36   6.923
##  huntingtons recognition - control classification           15 7.94 36   1.888
##  huntingtons recognition - amnesic classification           25 7.94 36   3.147
##  huntingtons recognition - huntingtons classification       50 7.94 36   6.294
##  control grammar - amnesic grammar                          20 7.94 36   2.518
##  control grammar - huntingtons grammar                      40 7.94 36   5.035
##  control grammar - control classification                    0 7.94 36   0.000
##  control grammar - amnesic classification                   10 7.94 36   1.259
##  control grammar - huntingtons classification               35 7.94 36   4.406
##  amnesic grammar - huntingtons grammar                      20 7.94 36   2.518
##  amnesic grammar - control classification                  -20 7.94 36  -2.518
##  amnesic grammar - amnesic classification                  -10 7.94 36  -1.259
##  amnesic grammar - huntingtons classification               15 7.94 36   1.888
##  huntingtons grammar - control classification              -40 7.94 36  -5.035
##  huntingtons grammar - amnesic classification              -30 7.94 36  -3.776
##  huntingtons grammar - huntingtons classification           -5 7.94 36  -0.629
##  control classification - amnesic classification            10 7.94 36   1.259
##  control classification - huntingtons classification        35 7.94 36   4.406
##  amnesic classification - huntingtons classification        25 7.94 36   3.147
##  p.value
##   0.0149
##   1.0000
##   0.6257
##   0.0026
##   <.0001
##   0.6257
##   0.0711
##   <.0001
##   0.0149
##   0.6257
##   0.9993
##   0.0711
##   0.6257
##   0.9993
##   0.2575
##   0.6257
##   0.0026
##   <.0001
##   0.6257
##   0.0711
##   <.0001
##   0.2575
##   0.0004
##   1.0000
##   0.9367
##   0.0026
##   0.2575
##   0.2575
##   0.9367
##   0.6257
##   0.0004
##   0.0149
##   0.9993
##   0.9367
##   0.0026
##   0.0711
## 
## P value adjustment: tukey method for comparing a family of 9 estimates

#can also plot if you'd like:
plot(pairs_res)

From the above, we can see comparisons for all possible pairs of diagnosis-task combinations. As well as recapping assumption checks, next week we will also explore how we can adjust our pairwise comparisons. Think about how many comparisons were conducted above: without adjusting our \(\alpha\) (or \(p\)-value), why might any inferences drawn be problematic?
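As a hint towards that question, the number of pairwise comparisons among 9 cell means, and a rough upper figure for the chance of at least one false positive, can be computed as follows. This is a back-of-the-envelope sketch: the familywise error calculation assumes independent tests, which pairwise comparisons sharing the same means are not, so the figure is only indicative.

```r
n_cells <- 9                  # 3 diagnoses x 3 tasks
n_comp  <- choose(n_cells, 2) # number of distinct pairs
alpha   <- 0.05

# Probability of at least one Type I error across all comparisons if each
# were tested at alpha without adjustment (independence assumed)
fwer <- 1 - (1 - alpha)^n_comp

n_comp
round(fwer, 3)
```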

References

Maxwell, Scott E, Harold D Delaney, and Ken Kelley. 2017. Designing Experiments and Analyzing Data: A Model Comparison Perspective. Routledge.
