Data Analysis for Psychology in R 2
Department of Psychology
University of Edinburgh
2025–2026
| Introduction to linear Models | Intro to linear regression |
| Interpreting linear models | |
| Testing individual predictors | |
| Model testing & comparison | |
| Linear model analysis | |
| Analysing Experimental Studies | Categorical predictors and dummy coding |
| Effect coding and manual post-hoc contrasts | |
| Assumptions and diagnostics | |
| Bootstrapping and confidence intervals | |
| Categorical predictors: Practice analysis |
| Interactions | Mean-centering and numeric/categorical interactions |
| Numeric/numeric interactions | |
| Categorical/categorical interactions | |
| Manual contrast interactions and multiple comparisons | |
| Interactions: Practice analysis | |
| Advanced Topics | Power analysis |
| Binary logistic regression I | |
| Binary logistic regression II | |
| Logistic regression: Practice analysis | |
| Exam prep and course Q&A |
How do we test an interaction between two manually-coded post-hoc contrasts?
What’s the main danger of running multiple statistical tests on the same data?
What are two common ways that we can correct for running multiple statistical tests on the same data?
The subjective wellbeing (SWB) of patients at two different hospitals:
Hosp1:

Hosp2:

who have undergone one of three different treatments for depression:
Cognitive Behavioural Therapy (CBT)

Eye Movement Desensitisation and Reprocessing therapy (EMDR)

Antidepressant medication
(Meds):

What we want to know: Is the difference in SWB between talk therapy (both CBT and EMDR) and pharmaceutical therapy (Meds) different in each hospital?
TrtmtType.Hospital.Hospital contrast interacts with the TrtmtType contrast.We only care about the post-hoc contrasts, so it’s fine to fit the model using R’s default treatment coding.
Check on the reference levels:
The model in R syntax:
On your own time, practice writing out the model’s linear expression in mathematical terms, and check the appendix for the solution :)
Treatment Hospital emmean SE df lower.CL upper.CL
CBT Hosp1 10.10 0.37 174 9.37 10.83
EMDR Hosp1 9.43 0.37 174 8.70 10.16
Meds Hosp1 10.80 0.37 174 10.07 11.53
CBT Hosp2 7.98 0.37 174 7.25 8.71
EMDR Hosp2 13.12 0.37 174 12.39 13.85
Meds Hosp2 7.85 0.37 174 7.12 8.58
Confidence level used: 0.95
TrtmtTypeStep 1: “Chunk” together the two group(s) that the research question is comparing.
Meds.CBT, EMDR.Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.
Step 3: Assign a plus sign to every group in Chunk 1, and a minus sign to every group in Chunk 2.
Step 4: Count the plus signs and minus signs.
Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).
Step 6: In the coding matrix, replace the plus signs with the positive coding value from Step 5, and replace the minus signs with the negative coding value from Step 5. Done!
Treatment |
TrtmtType |
|---|---|
CBT |
|
EMDR |
|
Meds |
|
wooclap.com
code UBEQCF
HospitalStep 1: “Chunk” together the two group(s) that the research question is comparing.
Hosp2.Hosp1.Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.
Step 3: Assign a plus sign to every group in Chunk 1, and a minus sign to every group in Chunk 2.
Step 4: Count the plus signs and minus signs.
Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).
Step 6: In the coding matrix, replace the plus signs with the positive coding value from Step 5, and replace the minus signs with the negative coding value from Step 5. Done!
Hospital |
Hosp |
|---|---|
Hosp1 |
|
Hosp2 |
|
wooclap.com
code UBEQCF
We want to know: Is there a difference between hospitals in how different treatment types (talk therapy vs. pharmaceutical therapy) are associated with SWB?
When we test differences between groups, we’re testing differences between group means.
So, we will start by defining the group means for all combinations of factor levels in our data.
Next, we’ll combine these group means in the way that the manual contrasts specify.
Within each hospital, we chunk CBT and EMDR together, because they are both talk therapy.
When we chunk two group means together, mathematically we are taking the mean of those group means.
To represent the mean of the blue chunk (top left):
\[ \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} \]
To represent the mean of the green chunk (top right):
\[ \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} \]
RQ: Is the difference in SWB between talk therapy (both CBT and EMDR) and pharmaceutical therapy (Meds) different in each hospital?
Null hypothesis in words: The difference in SWB between talk therapy (both CBT and EMDR) and pharmaceutical therapy (Meds) is the same in each hospital.
In words arranged like the math:
In math arranged like the words:
Version 1: Differences are the same (H0) or not the same (H1):
\[\begin{align} H_0 &: \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} = \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \\ H_1 &: \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \neq \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \end{align}\]
Version 2, also fine: Difference between differences is zero (H0) or not zero (H1):
\[\begin{align} H_0 &: \Big( \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \Big) - \Big( \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \Big) = 0 \\ H_1 &: \Big( \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \Big) - \Big( \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \Big) \neq 0 \\ \end{align}\]
(LaTeX code in appendix)
Treatment |
TrtmtType |
|---|---|
CBT |
–1/2 |
EMDR |
–1/2 |
Meds |
1 |
Hospital |
Hosp |
|---|---|
Hosp1 |
–1 |
Hosp2 |
1 |
In other words: To get the interaction’s contrast, we multiply together the two contrasts for the interacting predictors.
| Treatment | Hospital | TrtmtType | Hosp | TrtmtType:Hosp |
|---|---|---|---|---|
| CBT | Hosp1 | –0.5 | –1 | |
| EMDR | Hosp1 | –0.5 | –1 | |
| Meds | Hosp1 | 1 | –1 | |
| CBT | Hosp2 | –0.5 | 1 | |
| EMDR | Hosp2 | –0.5 | 1 | |
| Meds | Hosp2 | 1 | 1 |
wooclap.com, code UBEQCF
(1) The fancy way:
Here are the two individual manual contrasts we defined a couple slides ago:
Multiply them together using outer(). Write TrtmtType first because we told emmeans ~ Treatment * Hospital, with Treatment first.
Flatten this table using as.vector():
(2) The manual way:
Check the order of items that emmeans uses:
Treatment Hospital emmean SE df lower.CL upper.CL
CBT Hosp1 10.10 0.37 174 9.37 10.83
EMDR Hosp1 9.43 0.37 174 8.70 10.16
Meds Hosp1 10.80 0.37 174 10.07 11.53
CBT Hosp2 7.98 0.37 174 7.25 8.71
EMDR Hosp2 13.12 0.37 174 12.39 13.85
Meds Hosp2 7.85 0.37 174 7.12 8.58
Confidence level used: 0.95
Figure out the correct coding values manually (as we did a few slides ago).
Make sure to match emmeans’ order and type up the contrast coding by hand:
When we test the contrast using contrast(), we are testing the H0 that the estimate is equal to zero. For this interaction, the H0 says there’s no difference between Hospital in how TrtmtType is associated with SWB.
contrast estimate SE df t.ratio p.value
TrtmtType:Hospital -3.73 0.641 174 -5.819 <0.0001
Get the associated 95% CIs using confint():
The null hypothesis:
Can we reject the null hypothesis?
It’s the result of multiplying each group mean (column mean_SWB) by the corresponding value from the interaction’s contrast (column TrtmtType:Hospital), shown in the column product, and then adding all those numbers together.
| Treatment | Hospital | mean_SWB | TrtmtType:Hospital | product |
|---|---|---|---|---|
| CBT | Hosp1 | 10.10 | 0.5 | 5.05 |
| EMDR | Hosp1 | 9.43 | 0.5 | 4.72 |
| Meds | Hosp1 | 10.80 | -1.0 | -10.80 |
| CBT | Hosp2 | 7.98 | -0.5 | -3.99 |
| EMDR | Hosp2 | 13.12 | -0.5 | -6.56 |
| Meds | Hosp2 | 7.85 | 1.0 | 7.85 |
Can we give this number a more intuitive explanation?
Not really tbh :( Interaction coefficients are adjustments to other effects, not effects themselves. But we don’t have any other coefficients to relate this adjustment to, so we can’t compute simple slopes or anything. And it’s not very useful to interpret this single number on its own.
The reason manual interactions are useful is essentially just the significance test.
Hospital at each level of Treatment, what am I looking for?pairs()We’ve seen the math of how to compute simple effects by hand.
But that only gives us the value of the effect, not a test of whether that value is significantly different from zero.
When we had continuous predictors, we could test simple slopes using probe_interactions().
Now that we have categorical predictors, we can test simple effects by giving the pairs() function the model’s emmeans object and specifing the simple= argument.
For example, to get the simple effects of Hospital at each level of Treatment:
Hospital at each level of TreatmentTreatment = CBT:
contrast estimate SE df t.ratio p.value
Hosp1 - Hosp2 2.12 0.523 174 4.059 <0.0001
Treatment = EMDR:
contrast estimate SE df t.ratio p.value
Hosp1 - Hosp2 -3.69 0.523 174 -7.047 <0.0001
Treatment = Meds:
contrast estimate SE df t.ratio p.value
Hosp1 - Hosp2 2.95 0.523 174 5.632 <0.0001
Note on the positive/negative signs:
pairs() computes Hosp1 – Hosp2 because of how factor levels in the factor Hospital are ordered.Hosp2 – Hosp1 instead, we would have to code Hosp2 as the reference level of Hospital before fitting the model.Treatment for each level of HospitalHospital = Hosp1:
contrast estimate SE df t.ratio p.value
CBT - EMDR 0.673 0.523 174 1.287 0.4044
CBT - Meds -0.697 0.523 174 -1.332 0.3796
EMDR - Meds -1.370 0.523 174 -2.619 0.0259
Hospital = Hosp2:
contrast estimate SE df t.ratio p.value
CBT - EMDR -5.137 0.523 174 -9.819 <0.0001
CBT - Meds 0.127 0.523 174 0.242 0.9682
EMDR - Meds 5.263 0.523 174 10.061 <0.0001
P value adjustment: tukey method for comparing a family of 3 estimates
What’s this “P value adjustment: tukey method” thing? We’ll find out soon, but first …
On a twenty-sided die (a d20), the probability of rolling a 1 (or any other individual number) is 1/20 = 5%.
But if we roll the d20 over and over again, the probability that we’ll roll a 1 at least once goes up and up and up.
\(.05\)
\(= 1 – 0.95\)
\(.0975\)
\(= 1 – (0.95 \times 0.95)\)
\(.1426\)
\(= 1 – (0.95 \times 0.95 \times 0.95) = 1 - 0.95^3\)
In general:
\[ P(\text{rolling a 1 in m rolls}) = 1 - (1 - 0.05)^m \]
What’s the probability that we’ll roll a 1 at least once in twenty rolls?
wooclap.com, code UBEQCF
Why did we spend time rolling those dice?
With \(\alpha = 0.05\), we accept the risk that, we will incorrectly reject the null hypothesis 5% of the time.
And if we accept that positive result and ignore all the negative results (green jelly bean style), then we run into the multiple comparisons problem: we are more than 5% likely to see at least one false positive, the more tests (aka, the more comparisons) we run.
How much more likely?
Depends on the number of tests.
\[ \begin{align} P(\text{incorrectly reject H0}) &= \alpha \\ P(\text{correctly reject H0}) &= 1 - \alpha \\ P(\text{correctly reject H0 in} ~m~ \text{tests}) &= (1 - \alpha)^m \\ P(\text{incorrectly reject H0 in} ~m~ \text{tests}) &= 1 - (1 - \alpha)^m \\ \end{align} \]
For the green jelly beans scenario:
\[ \begin{align} P(\text{incorrectly reject H0 in 20 tests}) &= 1 - (1 - 0.05)^{20} \\ &= 1 - (0.95)^{20} \\ &= 1 - 0.36 \\ &= 0.64 \\ \end{align} \]
We accepted the risk of incorrectly rejecting the null hypothesis 5% of the time. But if we run 20 tests and accept any positive result, we will risk incorrectly rejecting H0 64% of the time!
How do we fix it?
Lower the \(\alpha\) level (aka our accepted level of risk, aka the false positive rate, aka the Type I error rate) to compensate for the increased probability of incorrectly rejecting H0.
You only need to know about Tukey and Bonferroni, the two most common.
We’ll start with Bonferroni, because it’s simplest.
The goal is to lower the \(\alpha\) level of our tests so that it matches our desired level of risk.
The Bonferroni correction takes our original \(\alpha\) level and divides it by the number of tests:
\[ \alpha_{\text{Bonferroni}} = \frac{\alpha}{\text{number of tests}} \]
For the jelly bean scenario, with 20 tests:
For our simple effects of Treatment by Hospital, with three tests:
Hospital = Hosp1:
contrast estimate SE df t.ratio p.value
CBT - EMDR 0.673 0.523 174 1.287 0.5993
CBT - Meds -0.697 0.523 174 -1.332 0.5541
EMDR - Meds -1.370 0.523 174 -2.619 0.0288
Hospital = Hosp2:
contrast estimate SE df t.ratio p.value
CBT - EMDR -5.137 0.523 174 -9.819 <0.0001
CBT - Meds 0.127 0.523 174 0.242 1.0000
EMDR - Meds 5.263 0.523 174 10.061 <0.0001
P value adjustment: bonferroni method for 3 tests
Think back to independent-samples t-tests in DAPR1:
At heart, Tukey’s HSD is very similar!
You do not need to know how to calculate this adjustment by hand!
If you want to know, here’s a YouTube video that walks through the calculation. (The main action is between 3:41–7:42.)
Hospital = Hosp1:
contrast estimate SE df t.ratio p.value
CBT - EMDR 0.673 0.523 174 1.287 0.4044
CBT - Meds -0.697 0.523 174 -1.332 0.3796
EMDR - Meds -1.370 0.523 174 -2.619 0.0259
Hospital = Hosp2:
contrast estimate SE df t.ratio p.value
CBT - EMDR -5.137 0.523 174 -9.819 <0.0001
CBT - Meds 0.127 0.523 174 0.242 0.9682
EMDR - Meds 5.263 0.523 174 10.061 <0.0001
P value adjustment: tukey method for comparing a family of 3 estimates
When to choose Bonferroni:
When to choose Tukey:
Check the flashcards for more details.
How do we test an interaction between two manually-coded post-hoc contrasts?
emmeans() toolkit.What’s the main danger of running multiple statistical tests on the same data?
What are two common ways that we can correct for running multiple statistical tests on the same data?
Tasks:
Attend your lab and work together on the exercises
Support:
Help each other on the Piazza forum
Complete the weekly quiz

Attend office hours (see Learn page for details)
Version 1: Differences are the same (H0) or not the same (H1):
\[\begin{align} H_0 &: \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} = \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \\ H_1 &: \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \neq \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \end{align}\]
$$
\begin{align}
H_0 &: \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} =
\frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \\
H_1 &: \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \neq
\frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}}
\end{align}
$$
Version 2, also fine: Difference between differences is zero (H0) or not zero (H1):
\[\begin{align} H_0 &: \Big( \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \Big) - \Big( \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \Big) = 0 \\ H_1 &: \Big( \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \Big) - \Big( \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \Big) \neq 0 \\ \end{align}\]
$$
\begin{align}
H_0 &: \Big( \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \Big) -
\Big( \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \Big) = 0 \\
H_1 &: \Big( \frac{ \mu_{\text{CBT,Hosp1}} + \mu_{\text{EMDR,Hosp1}} }{2} - \mu_{\text{Meds,Hosp1}} \Big) -
\Big( \frac{ \mu_{\text{CBT,Hosp2}} + \mu_{\text{EMDR,Hosp2}} }{2} - \mu_{\text{Meds,Hosp2}} \Big) \neq 0 \\
\end{align}
$$
\[
\begin{align}
\text{SWB} ~=~& \beta_0 +
(\beta_1 \cdot \text{Treatment}_{\text{EMDR}}) +
(\beta_2 \cdot \text{Treatment}_{\text{Meds}}) +
(\beta_3 \cdot \text{Hospital}) ~ + \\
& (\beta_4 \cdot \text{Treatment}_{\text{EMDR}} \cdot \text{Hospital}) +
(\beta_5 \cdot \text{Treatment}_{\text{Meds}} \cdot \text{Hospital}) +
\epsilon
\end{align}
\]
If we compare pairs() with no correction and with the Bonferroni correction, it’s only the p-values that change.
Hospital = Hosp1:
contrast estimate SE df t.ratio p.value
CBT - EMDR 0.673 0.523 174 1.287 0.1998
CBT - Meds -0.697 0.523 174 -1.332 0.1847
EMDR - Meds -1.370 0.523 174 -2.619 0.0096
Hospital = Hosp2:
contrast estimate SE df t.ratio p.value
CBT - EMDR -5.137 0.523 174 -9.819 <0.0001
CBT - Meds 0.127 0.523 174 0.242 0.8090
EMDR - Meds 5.263 0.523 174 10.061 <0.0001
Hospital = Hosp1:
contrast estimate SE df t.ratio p.value
CBT - EMDR 0.673 0.523 174 1.287 0.5993
CBT - Meds -0.697 0.523 174 -1.332 0.5541
EMDR - Meds -1.370 0.523 174 -2.619 0.0288
Hospital = Hosp2:
contrast estimate SE df t.ratio p.value
CBT - EMDR -5.137 0.523 174 -9.819 <0.0001
CBT - Meds 0.127 0.523 174 0.242 1.0000
EMDR - Meds 5.263 0.523 174 10.061 <0.0001
P value adjustment: bonferroni method for 3 tests
So technically, pairs() adjusts p, not \(\alpha\). But whichever one is adjusted, the results are equivalent!
Adjusting \(\alpha\) by dividing by 3 is the same as adjusting p by multiplying by 3 (and if the value ends up above 1, setting it equal to 1).
For example, the unadjusted p-value in Hosp1 for CBT - Meds, 0.1847, multiplied by 3 (the number of tests), is the Bonferroni-adjusted p-value, 0.5541.