Data Analysis for Psychology in R 2
Department of Psychology
University of Edinburgh
2025–2026
| Introduction to Linear Models | Intro to Linear Regression |
| Interpreting Linear Models | |
| Testing Individual Predictors | |
| Model Testing & Comparison | |
| Linear Model Analysis | |
| Analysing Experimental Studies | Categorical Predictors & Dummy Coding |
| Effects Coding & Coding Specific Contrasts | |
| Assumptions & Diagnostics | |
| Bootstrapping | |
| Categorical Predictor Analysis | |
| Interactions | Interactions I |
| Interactions II | |
| Interactions III | |
| Analysing Experiments | |
| Interaction Analysis | |
| Advanced Topics | Power Analysis |
| Binary Logistic Regression I | |
| Binary Logistic Regression II | |
| Logistic Regression Analysis | |
| Exam Prep and Course Q&A | |
Answer the questions in this table as thoroughly as you can FROM MEMORY.
(It’s extremely OK and normal to not remember everything.)
| | Intercept \(\beta_0\) | | Slope \(\beta_j\) | |
|---|---|---|---|---|
| | Meaning | H0 | Meaning | H0 |
| Dummy coding / Treatment coding | What does the intercept mean? | What null hypothesis is tested for the intercept? | What does the slope coefficient mean? | What hypothesis is tested for the slope coefficient? |
Once you’ve written down everything you can remember, look at your notes and fill in the gaps.
Retrieving information from memory is a good study strategy too. According to Brown et al. (2014), if you test your memory first and only afterward look up the information, you’ll end up remembering things much better than if you look up the information without testing yourself first.
Dummy coding is one common scheme for a priori contrast coding. What’s another common scheme, and how is it different?
When we code predictors using this other coding scheme, how do we interpret the linear model’s coefficients?
In this other coding scheme, what hypotheses are tested for each coefficient?
How can we test hypotheses other than the ones tested by a priori coding schemes?
| | Intercept \(\beta_0\) | | Slope \(\beta_j\) | |
|---|---|---|---|---|
| | Meaning | H0 | Meaning | H0 |
| Dummy coding / Treatment coding | Mean outcome of reference level | Mean outcome of ref. level = 0 | Difference between non-ref. level and ref. level | Difference between non-ref. level and ref. level = 0 |
| Effect(s) coding / sum(-to-zero) coding | | | | |
Dummy coding / treatment coding:

- `alone` is coded as 0.
- `others` is coded as 1.

Effect(s) coding / sum-to-zero coding:

- `alone` is coded as 1.
- `others` is coded as –1.

Dummy coding (from last week) uses 0 and 1.
Effect coding (this week) uses 1 and –1.
Effect coding still fits a line through both group means.
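These codings can be inspected directly in R. A minimal sketch, assuming a hypothetical two-level factor `study` with levels `alone` and `others`:

```r
# Hypothetical two-level factor for the study condition
study <- factor(c("alone", "others", "alone", "others"))

# R's default is dummy/treatment coding: alone = 0, others = 1
contrasts(study)

# Switch to effect (sum-to-zero) coding: alone = 1, others = -1
contrasts(study) <- contr.sum(2)
contrasts(study)
```

The model fitted with `lm()` uses whatever coding matrix is attached to the factor at the time of fitting.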
Predict with your neighbours: What are your guesses for each question? Why do you think your guesses are likely to be correct?
`score ~ study`

\[ \text{score}_i = \beta_0 + (\beta_1 \cdot \text{study}_i) + \epsilon_i \]
Call:
lm(formula = score ~ study, data = score_data)
Residuals:
Min 1Q Median 3Q Max
-22.79 -4.79 0.21 5.48 18.21
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.157 0.505 49.85 < 2e-16 ***
study1 -2.367 0.505 -4.69 0.0000045 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.9 on 248 degrees of freedom
Multiple R-squared: 0.0815, Adjusted R-squared: 0.0777
F-statistic: 22 on 1 and 248 DF, p-value: 0.00000453
Does your prediction about the intercept make sense, given the estimate of 25.157?
Does your prediction about the slope make sense, given the estimate of -2.367?
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.16 0.505 49.85 3.12e-131
study1 -2.37 0.505 -4.69 4.53e-06
Observe and explain: Did your guesses match the results? Why are the results the way they are?
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.16 0.505 49.85 3.12e-131
study1 -2.37 0.505 -4.69 4.53e-06
(Intercept) aka \(\hat{\beta_0}\): the grand mean, i.e., the mean of the two group mean scores.
| | Intercept \(\beta_0\) | | Slope \(\beta_j\) | |
|---|---|---|---|---|
| | Meaning | H0 | Meaning | H0 |
| Dummy coding / Treatment coding | Mean outcome of reference level | Mean outcome of ref. level = 0 | Difference between non-ref. level and ref. level | Difference between non-ref. level and ref. level = 0 |
| Effect(s) coding / sum(-to-zero) coding | Grand mean (mean of all group mean outcomes) | | Difference between mean of level coded as 1 and grand mean | |
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.157 0.505 49.85 < 2e-16 ***
study1 -2.367 0.505 -4.69 0.0000045 ***
(Intercept): Tests the H0 that the grand mean is equal to zero.

Can we reject this null hypothesis?

study1: Tests the H0 that the difference between the mean of `alone` and the grand mean is equal to zero.

Can we reject this null hypothesis?
| | Intercept \(\beta_0\) | | Slope \(\beta_j\) | |
|---|---|---|---|---|
| | Meaning | H0 | Meaning | H0 |
| Dummy coding / Treatment coding | Mean outcome of reference level | Mean outcome of ref. level = 0 | Difference between non-ref. level and ref. level | Difference between non-ref. level and ref. level = 0 |
| Effect(s) coding / sum(-to-zero) coding | Grand mean (mean of all group mean outcomes) | Grand mean = 0 | Difference between mean of level coded as 1 and grand mean | Difference between mean of level coded as 1 and grand mean = 0 |
In general, the H0 being tested for a given parameter is always “this parameter is equal to 0”.
Here’s what the coefficients mean in effect coding:
Imagine fitting a model score ~ method. This slide has all the information you need to guess the coefficient values.
Work individually or with your neighbour(s).
- The model will have a coefficient called `method1`. What is its value?
- The model will have a coefficient called `method2`. What is its value?

Here’s what the coefficients mean in effect coding:
score ~ method
Call:
lm(formula = score ~ method, data = score_data)
Residuals:
Min 1Q Median 3Q Max
-23.414 -5.359 -0.196 5.750 17.804
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.062 0.518 48.41 <2e-16 ***
method1 -1.648 0.720 -2.29 0.0229 *
method2 2.514 0.773 3.25 0.0013 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 8.08 on 247 degrees of freedom
Multiple R-squared: 0.0422, Adjusted R-squared: 0.0345
F-statistic: 5.45 on 2 and 247 DF, p-value: 0.00484
| | Intercept \(\beta_0\) | | Slope \(\beta_j\) | |
|---|---|---|---|---|
| | Meaning | H0 | Meaning | H0 |
| Dummy coding / Treatment coding | Mean outcome of reference level | Mean outcome of ref. level = 0 | Difference between non-ref. level and ref. level | Difference between non-ref. level and ref. level = 0 |
| Effect(s) coding / sum(-to-zero) coding | Grand mean (mean of all group mean outcomes) | Grand mean = 0 | Difference between mean of level coded as 1 and grand mean | Difference between mean of level coded as 1 and grand mean = 0 |
Estimate Std. Error t value Pr(>|t|)
(Intercept) 25.062 0.518 48.41 <2e-16 ***
method1 -1.648 0.720 -2.29 0.0229 *
method2 2.514 0.773 3.25 0.0013 **
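These interpretations can be checked by recovering the effect-coded coefficients from the group means. A sketch with simulated stand-in data (the real `score_data` isn't shown, so the numbers here won't match the output above):

```r
# Simulated stand-in for score_data, with a three-level method factor
set.seed(1)
score_data <- data.frame(
  method = factor(rep(c("a", "b", "c"), each = 20)),
  score  = rnorm(60, mean = rep(c(22, 27, 25), each = 20), sd = 5)
)
contrasts(score_data$method) <- contr.sum(3)

m <- lm(score ~ method, data = score_data)

# Effect-coded coefficients recovered by hand from the group means:
group_means <- tapply(score_data$score, score_data$method, mean)
grand_mean  <- mean(group_means)   # equals the intercept
group_means[1] - grand_mean        # equals coefficient method1
group_means[2] - grand_mean        # equals coefficient method2
```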
A priori contrasts are chosen prior to (before) fitting the model.
Every categorical variable must be coded with some kind of a priori contrast.
Your toolkit now includes dummy/treatment coding and effect(s)/sum-to-zero coding.
Post-hoc contrasts are tested post (after) fitting the model.
We can optionally use estimated marginal means to test hypotheses beyond the ones from our a priori contrasts.
We tested our first post-hoc contrasts last week, when we looked at the difference between the two non-reference levels of method.
This session, we’ll learn how to define any contrasts we want.
Currently partnered vs. previously partnered:

\(\rightarrow\) We’ll create the contrasts to address this question together.

Has been married vs. never married:

\(\rightarrow\) You’ll create the contrasts for this question with your neighbours.
If we don’t specify a coding scheme a priori, the model will by default use treatment coding.
The reference level will be the level of the factor that comes first in the alphabet.
To see the dummy variables the model will use, try contrasts().
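A minimal sketch of what `contrasts()` shows, using a hypothetical five-level factor with the same levels as `status` (the real `wb_data` isn't shown):

```r
# Hypothetical five-level factor, as in wb_data$status
status <- factor(c("Cohab", "Divorced", "Married/CP", "Single", "Widowed"))

# Default treatment coding: the alphabetically first level
# ("Cohab") gets all zeroes, i.e. it is the reference level
contrasts(status)
```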
Call:
lm(formula = swb ~ status, data = wb_data)
Residuals:
Min 1Q Median 3Q Max
-11.337 -2.078 -0.002 2.053 11.473
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.438 0.333 34.35 < 2e-16 ***
statusDivorced -2.064 0.577 -3.58 0.00038 ***
statusMarried/CP -0.812 0.389 -2.09 0.03721 *
statusSingle -3.377 0.577 -5.86 8.7e-09 ***
statusWidowed -5.433 0.745 -7.30 1.2e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.33 on 495 degrees of freedom
Multiple R-squared: 0.142, Adjusted R-squared: 0.135
F-statistic: 20.4 on 4 and 495 DF, p-value: 1.38e-15
These hypothesis tests don’t address our research questions! We’ll need to define our own contrasts.
Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?
| group | current |
|---|---|
| Cohab | |
| Divorced | |
| Married/CP | |
| Single | |
| Widowed | |
Step 1: “Chunk” together the two group(s) that the research question is comparing.
- Chunk 1: `Married/CP`.
- Chunk 2: `Divorced`, `Widowed`.

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?
| group | current |
|---|---|
| Cohab | 0 |
| Divorced | |
| Married/CP | |
| Single | 0 |
| Widowed | |
Step 1: “Chunk” together the two group(s) that the research question is comparing.
- Chunk 1: `Married/CP`.
- Chunk 2: `Divorced`, `Widowed`.

Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.

- `Cohab`, `Single`.

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?
| group | current |
|---|---|
| Cohab | 0 |
| Divorced | – |
| Married/CP | + |
| Single | 0 |
| Widowed | – |
Step 1: “Chunk” together the two group(s) that the research question is comparing.
- Chunk 1: `Married/CP`.
- Chunk 2: `Divorced`, `Widowed`.

Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.

- `Cohab`, `Single`.

Step 3: Assign a plus sign to every group in Chunk 1, and a minus sign to every group in Chunk 2.
Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?
| group | current |
|---|---|
| Cohab | 0 |
| Divorced | – |
| Married/CP | + |
| Single | 0 |
| Widowed | – |
Step 4: Count the plus signs and minus signs.
Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).
Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?
| group | current |
|---|---|
| Cohab | 0 |
| Divorced | –1/2 |
| Married/CP | 1 |
| Single | 0 |
| Widowed | –1/2 |
Step 4: Count the plus signs and minus signs.
Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).
Step 6: In the coding matrix, replace the plus signs with the positive coding value from Step 5, and replace the minus signs with the negative coding value from Step 5. Done!
Does it matter which chunk is positive and which chunk is negative? Not really. It’ll change whether the difference that emmeans estimates and tests is positive or negative, but the absolute value of the number should be the same.
Can I compare more than two chunks in a single contrast? No. This is because we’re dealing with the slope of a line between two groups. If you want to compare more than two things, then you need more than one contrast.
Can I use the exact same chunk in more than one contrast? No. Using the exact same chunk in more than one contrast is the categorical predictor equivalent of collinearity/non-independence of predictors.
How many contrasts can I have? If you have \(k\) groups, you can have up to \(k-1\) contrasts. So since we have five groups, we can have up to four contrasts.
Any other questions about manual contrasts?
Compute m3’s estimated marginal means for each level of status.
Next, check the order of levels with levels().
| group | current |
|---|---|
| Cohab | 0 |
| Divorced | –1/2 |
| Married/CP | 1 |
| Single | 0 |
| Widowed | –1/2 |
The estimate we get will be the grand mean of the positive chunk, minus the grand mean of the negative chunk.
Using this comparisons list, test the contrast (i.e., test the H0 that the difference between chunks is equal to 0):
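In code, this is done by passing the coding values (in the order given by `levels()`) to emmeans as a named list. A sketch with simulated stand-in data, since the real `wb_data` isn't shown; the names `m3` and `comparisons` follow the slides, while `emm` and the simulated values are assumptions, so the printed numbers won't match the output below:

```r
library(emmeans)

# Simulated stand-in for wb_data (the real data aren't shown)
set.seed(1)
wb_data <- data.frame(
  status = factor(rep(c("Cohab", "Divorced", "Married/CP", "Single", "Widowed"),
                      times = c(100, 50, 275, 50, 25))),
  swb    = rnorm(500, mean = 10, sd = 3)
)
m3 <- lm(swb ~ status, data = wb_data)

# Estimated marginal means for each level of status
emm <- emmeans(m3, ~ status)

# Coding values in level order: Cohab, Divorced, Married/CP, Single, Widowed
comparisons <- list(current = c(0, -1/2, 1, 0, -1/2))

contrast(emm, method = comparisons)          # tests H0: difference = 0
confint(contrast(emm, method = comparisons)) # associated 95% CIs
```

The same pattern works for any manual contrast: only the coding values in the list change.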
contrast estimate SE df t.ratio p.value
current 2.94 0.455 495 6.460 <.0001
Get the associated 95% CIs:
Where does this number for estimate come from?
It’s the grand mean of the positive chunk, minus the grand mean of the negative chunk.
Can we reject the null hypothesis that there’s no difference in subjective well-being between people who are currently married and people who were previously married?
Is there a difference in subjective well-being between (1) people who are currently OR were previously married or in a civil partnership and (2) people who were never married or in a civil partnership?
| group | evermarried |
|---|---|
| Cohab | |
| Divorced | |
| Married/CP | |
| Single | |
| Widowed | |
Step 1: “Chunk” together the two group(s) that the research question is comparing.
- Chunk 1: `Divorced`, `Married/CP`, `Widowed`.
- Chunk 2: `Cohab`, `Single`.

Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.
Step 3: Assign a plus sign to every group in Chunk 1, and a minus sign to every group in Chunk 2.
Step 4: Count the plus signs and minus signs.
Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).
Step 6: In the coding matrix, replace the plus signs with the positive coding value from Step 5, and replace the minus signs with the negative coding value from Step 5. Done!
- What do you think `emmeans`’ estimate for the `evermarried` comparison will be?
- How can we use `emmeans` to test whether the difference between chunks is significantly different from zero?

Test the contrast (i.e., test the H0 that the difference between chunks is equal to 0):
contrast estimate SE df t.ratio p.value
evermarried -1.08 0.402 495 -2.690 0.0074
Get the associated 95% CIs:
Sense check: The estimate is the grand mean of the positive chunk, minus the grand mean of the negative chunk.
Can we reject the null hypothesis that there’s no difference in subjective well-being between people who’ve ever been married and people who’ve never been married?
“Orthogonal” is a word that makes you sound smart when you really just mean “perpendicular” = “at a 90 degree angle” = “at a right angle”.
Two orthogonal dimensions:
Three orthogonal dimensions:
If you change the value on one dimension, you don’t change the values on orthogonal dimension(s) at all.
Moving on the y dimension doesn’t affect x:
Moving on the z dimension doesn’t affect x or y:
In other words, if you know about y, you know nothing about x. This means that orthogonal dimensions, and orthogonal contrasts, contain completely different information.
Statistical models really like that!
| group | current | evermarried | product |
|---|---|---|---|
| Cohab | 0 | –1/2 | |
| Divorced | –1/2 | 1/3 | |
| Married/CP | 1 | 1/3 | |
| Single | 0 | –1/2 | |
| Widowed | –1/2 | 1/3 | |
| group | current | evermarried | product |
|---|---|---|---|
| Cohab | 0 | –1/2 | 0 |
| Divorced | –1/2 | 1/3 | |
| Married/CP | 1 | 1/3 | |
| Single | 0 | –1/2 | |
| Widowed | –1/2 | 1/3 | |
| group | current | evermarried | product |
|---|---|---|---|
| Cohab | 0 | –1/2 | 0 |
| Divorced | –1/2 | 1/3 | –1/6 |
| Married/CP | 1 | 1/3 | |
| Single | 0 | –1/2 | |
| Widowed | –1/2 | 1/3 | |
| group | current | evermarried | product |
|---|---|---|---|
| Cohab | 0 | –1/2 | 0 |
| Divorced | –1/2 | 1/3 | –1/6 |
| Married/CP | 1 | 1/3 | 1/3 |
| Single | 0 | –1/2 | |
| Widowed | –1/2 | 1/3 | |
| group | current | evermarried | product |
|---|---|---|---|
| Cohab | 0 | –1/2 | 0 |
| Divorced | –1/2 | 1/3 | –1/6 |
| Married/CP | 1 | 1/3 | 1/3 |
| Single | 0 | –1/2 | 0 |
| Widowed | –1/2 | 1/3 | |
| group | current | evermarried | product |
|---|---|---|---|
| Cohab | 0 | –1/2 | 0 |
| Divorced | –1/2 | 1/3 | –1/6 |
| Married/CP | 1 | 1/3 | 1/3 |
| Single | 0 | –1/2 | 0 |
| Widowed | –1/2 | 1/3 | –1/6 |
0 + (–1/6) + 1/3 + 0 + (–1/6) = 0
Orthogonal!
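Equivalently, two contrasts are orthogonal exactly when the sum of the products of their coding values (their dot product) is zero. A quick check in R:

```r
# Coding values in level order: Cohab, Divorced, Married/CP, Single, Widowed
current     <- c(0, -1/2, 1, 0, -1/2)
evermarried <- c(-1/2, 1/3, 1/3, -1/2, 1/3)

sum(current * evermarried)  # 0 -> orthogonal
```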
It affects how we interpret the contrasts.
If contrasts are orthogonal: each contrast captures a completely separate piece of information, so the contrasts can be interpreted independently of one another.

If they’re not orthogonal: the contrasts share information, so their estimates overlap and each must be interpreted with the others in mind.
If at all possible, create orthogonal contrasts. Your life will be easier.
Dummy coding is one common scheme for a priori contrast coding. What’s another common scheme, and how is it different?
When we code predictors using this other coding scheme, how do we interpret the linear model’s coefficients?
In this other coding scheme, what hypotheses are tested for each coefficient?
How can we test hypotheses other than the ones tested by a priori coding schemes?
Attend your lab and work together on the exercises
Help each other on the Piazza forum
Complete the weekly quiz

Attend office hours (see Learn page for details)
The linear expression telling us model predictions:
\[
\widehat{\text{outcome}}= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})
\]
We can combine the estimated betas to compute the means of each level. To do this, we plug in the values for Predictor1 and Predictor2 that correspond to each level of the three-level variable. We get these values from the rows of our coding matrix.
Level 1 is represented as Predictor1 = 1, Predictor2 = 0
(first row of contrast matrix).
\[ \begin{align} \widehat{\text{outcome}}_{\text{level 1}} &= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})\\ &= \hat{\beta_0} + (\hat{\beta_1} \cdot 1) + (\hat{\beta_2} \cdot 0)\\ &= \hat{\beta_0} + \hat{\beta_1}\\ \end{align} \]
Level 2 is represented as Predictor1 = 0, Predictor2 = 1
(second row of contrast matrix).
\[ \begin{align} \widehat{\text{outcome}}_{\text{level 2}} &= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})\\ &= \hat{\beta_0} + (\hat{\beta_1} \cdot 0) + (\hat{\beta_2} \cdot 1)\\ &= \hat{\beta_0} + \hat{\beta_2}\\ \end{align} \]
Level 3 is represented as Predictor1 = -1, Predictor2 = -1
(third row of contrast matrix).
\[ \begin{align} \widehat{\text{outcome}}_{\text{level 3}} &= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})\\ &= \hat{\beta_0} + (\hat{\beta_1} \cdot -1) + (\hat{\beta_2} \cdot -1)\\ &= \hat{\beta_0} - \hat{\beta_1} - \hat{\beta_2}\\ &= \hat{\beta_0} - (\hat{\beta_1} + \hat{\beta_2})\\ \end{align} \]
The grand mean is the mean of the group means.
The grand mean is sometimes, but not always, the overall mean of the observed data.
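A small sketch with made-up numbers showing how the two differ when group sizes are unbalanced:

```r
# Made-up unbalanced data: 10 observations in group A, 90 in group B
d <- data.frame(
  g = rep(c("A", "B"), times = c(10, 90)),
  y = c(rep(0, 10), rep(10, 90))
)

group_means <- tapply(d$y, d$g, mean)  # A = 0, B = 10
mean(group_means)  # grand mean (mean of group means) = 5
mean(d$y)          # overall mean of the observed data = 9
```

The larger group pulls the overall mean towards itself, while the grand mean weights every group equally.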
# A tibble: 5 x 2
# Groups: status [5]
status n
<fct> <int>
1 Cohab 100
2 Divorced 50
3 Married/CP 275
4 Single 50
5 Widowed 25
The \(\times\) shows the grand mean = the model’s intercept.
Once effect coding involves more than two levels, a single line between group means no longer gives us the correct intercept or slope.