Effect coding and manual post-hoc contrasts


Data Analysis for Psychology in R 2

Elizabeth Pankratz (elizabeth.pankratz@ed.ac.uk)


Department of Psychology
University of Edinburgh
2025–2026

Course Overview


Introduction to Linear Models:
  • Intro to Linear Regression
  • Interpreting Linear Models
  • Testing Individual Predictors
  • Model Testing & Comparison
  • Linear Model Analysis

Analysing Experimental Studies:
  • Categorical Predictors & Dummy Coding
  • Effects Coding & Coding Specific Contrasts
  • Assumptions & Diagnostics
  • Bootstrapping
  • Categorical Predictor Analysis

Interactions:
  • Interactions I
  • Interactions II
  • Interactions III
  • Analysing Experiments
  • Interaction Analysis

Advanced Topics:
  • Power Analysis
  • Binary Logistic Regression I
  • Binary Logistic Regression II
  • Logistic Regression Analysis
  • Exam Prep and Course Q&A

Retrieval practice: Coefficients and
null hypotheses (H0s) in dummy coding


Answer the questions in this table as thoroughly as you can FROM MEMORY.

(It’s extremely OK and normal to not remember everything.)


For dummy coding / treatment coding:

  • What does the intercept \(\beta_0\) mean?
  • What null hypothesis is tested for the intercept?
  • What does the slope coefficient \(\beta_j\) mean?
  • What null hypothesis is tested for the slope coefficient?

Once you’ve written down everything you can remember, look at your notes and fill in the gaps.


Retrieving information from memory is a good study strategy too. According to Brown et al. (2014), if you test your memory first and only afterward look up the information, you’ll end up remembering things much better than if you look up the information without testing yourself first.

This week’s learning objectives


Dummy coding is one common scheme for a priori contrast coding. What’s another common scheme, and how is it different?

When we code predictors using this other coding scheme, how do we interpret the linear model’s coefficients?

In this other coding scheme, what hypotheses are tested for each coefficient?

How can we test hypotheses other than the ones tested by a priori coding schemes?

Coefficients and null hypotheses (H0s)


Dummy coding / Treatment coding:

  • Intercept \(\beta_0\) meaning: Mean outcome of reference level
  • Intercept H0: Mean outcome of ref. level = 0
  • Slope \(\beta_j\) meaning: Difference between non-ref. level and ref. level
  • Slope H0: Difference between non-ref. level and ref. level = 0

Effect(s) coding / sum(-to-zero) coding:

  • Intercept meaning: ?
  • Intercept H0: ?
  • Slope meaning: ?
  • Slope H0: ?


Effect coding

Effect coding: Another way of representing categorical predictors as numbers


Dummy coding/treatment coding:

  • Studying alone is coded as 0.
  • Studying with others is coded as 1.

Effect(s) coding / sum-to-zero coding:

  • Studying alone is coded as 1.
  • Studying with others is coded as –1.

Same data as last week: Two study patterns

Same data represented as different numbers

Dummy coding (from last week) uses 0 and 1.

Effect coding (this week) uses 1 and –1.

Effect coding still fits a line through both group means.

  • What will this line’s intercept represent?
  • What will this line’s slope represent?

Predict with your neighbours: What are your guesses for each question? Why do you think your guesses are likely to be correct?

Defining effect coding in R


R uses dummy coding by default.

contrasts(score_data$study)
       others
alone       0
others      1


To make sure that our predictor is effect-coded, we use the function contr.sum().

contrasts(score_data$study) <- contr.sum(2)  # 2 because there are 2 levels
contrasts(score_data$study)
       [,1]
alone     1
others   -1

Model score ~ study

\[ \text{score}_i = \beta_0 + (\beta_1 \cdot \text{study}_i) + \epsilon_i \]

m1 <- lm(score ~ study, data = score_data)
summary(m1)

Call:
lm(formula = score ~ study, data = score_data)

Residuals:
   Min     1Q Median     3Q    Max 
-22.79  -4.79   0.21   5.48  18.21 

Coefficients:
            Estimate Std. Error t value  Pr(>|t|)    
(Intercept)   25.157      0.505   49.85   < 2e-16 ***
study1        -2.367      0.505   -4.69 0.0000045 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.9 on 248 degrees of freedom
Multiple R-squared:  0.0815,    Adjusted R-squared:  0.0777 
F-statistic:   22 on 1 and 248 DF,  p-value: 0.00000453
  • Does your prediction about the intercept make sense, given the estimate of 25.157?

  • Does your prediction about the slope make sense, given the estimate of -2.367?

What does each coefficient represent?

            Estimate Std. Error t value  Pr(>|t|)
(Intercept)    25.16      0.505   49.85 3.12e-131
study1         -2.37      0.505   -4.69  4.53e-06

What does each coefficient represent?

            Estimate Std. Error t value  Pr(>|t|)
(Intercept)    25.16      0.505   49.85 3.12e-131
study1         -2.37      0.505   -4.69  4.53e-06


What does each coefficient represent?

Observe and explain: Did your guesses match the results? Why are the results the way they are?

            Estimate Std. Error t value  Pr(>|t|)
(Intercept)    25.16      0.505   49.85 3.12e-131
study1         -2.37      0.505   -4.69  4.53e-06


(Intercept) aka \(\hat{\beta_0}\):

  • 25.16 is the grand mean: the mean of the group means.
(m1_grand_mean <- mean(c(mean_others, mean_alone)))
[1] 25.2


study1 aka \(\hat{\beta_1}\):

  • -2.37 is the difference between the level coded as 1, alone, and the grand mean.
mean_alone - m1_grand_mean
[1] -2.37

Coefficients and null hypotheses (H0s)


Dummy coding / Treatment coding:

  • Intercept \(\beta_0\) meaning: Mean outcome of reference level
  • Intercept H0: Mean outcome of ref. level = 0
  • Slope \(\beta_j\) meaning: Difference between non-ref. level and ref. level
  • Slope H0: Difference between non-ref. level and ref. level = 0

Effect(s) coding / sum(-to-zero) coding:

  • Intercept meaning: Grand mean (mean of all group mean outcomes)
  • Intercept H0: ?
  • Slope meaning: Difference between mean of level coded as 1 and grand mean
  • Slope H0: ?


What hypotheses does effect coding test?


            Estimate Std. Error t value  Pr(>|t|)    
 (Intercept)   25.157      0.505   49.85   < 2e-16 ***
 study1        -2.367      0.505   -4.69 0.0000045 ***


(Intercept):

  • Null hypothesis: The grand mean is equal to zero.
  • \(p\)-value: the probability of observing a grand mean of 25.157 (or a value more extreme), assuming that the true grand mean is zero.

Can we reject this null hypothesis?


study1:

  • Null hypothesis: The difference between the mean score of alone and the grand mean is equal to zero.
  • \(p\)-value: the probability of observing a difference of -2.367 (or a value more extreme), assuming that the true difference is zero.

Can we reject this null hypothesis?

Coefficients and null hypotheses (H0s)


Dummy coding / Treatment coding:

  • Intercept \(\beta_0\) meaning: Mean outcome of reference level
  • Intercept H0: Mean outcome of ref. level = 0
  • Slope \(\beta_j\) meaning: Difference between non-ref. level and ref. level
  • Slope H0: Difference between non-ref. level and ref. level = 0

Effect(s) coding / sum(-to-zero) coding:

  • Intercept meaning: Grand mean (mean of all group mean outcomes)
  • Intercept H0: Grand mean = 0
  • Slope meaning: Difference between mean of level coded as 1 and grand mean
  • Slope H0: Difference between mean of level coded as 1 and grand mean = 0


In general, the H0 being tested for a given parameter is always “this parameter is equal to 0”.


Effect coding for >2 levels

Same data as last week: Three study methods

Challenge: Guess the coefficient values

Here’s what the coefficients mean in effect coding:

  • Intercept = Grand mean (mean of all group mean outcomes)
  • Slope = Difference between mean of level coded as 1 and grand mean (i.e., group mean – grand mean)

Mean of read:

mean_read
[1] 23.4

Mean of self-test:

mean_self
[1] 27.6

Mean of summarise:

mean_summ
[1] 24.2
contrasts(score_data$method) <- contr.sum(3)
contrasts(score_data$method)
          [,1] [,2]
read         1    0
self-test    0    1
summarise   -1   -1

Imagine fitting a model score ~ method. This slide has all the information you need to guess the coefficient values.

Work individually or with your neighbour(s).

  • What’s the value of the intercept?
  • There’ll be a predictor called method1. What is its value?
  • There’ll also be a predictor called method2. What is its value?

Challenge: Guess the coefficient values

Here’s what the coefficients mean in effect coding:

  • Intercept = Grand mean (mean of all group mean outcomes)
  • Slope = Difference between mean of level coded as 1 and grand mean (i.e., group mean – grand mean)

Mean of read:

mean_read
[1] 23.4

Mean of self-test:

mean_self
[1] 27.6

Mean of summarise:

mean_summ
[1] 24.2
contrasts(score_data$method) <- contr.sum(3)
contrasts(score_data$method)
          [,1] [,2]
read         1    0
self-test    0    1
summarise   -1   -1
  • Intercept: grand mean, so
(m2_grand_mean <- (mean_read + mean_self + mean_summ)/3)
[1] 25.1
  • method1: in column 1 of contrast matrix, read = 1, so
mean_read - m2_grand_mean
[1] -1.65
  • method2: in column 2 of contrast matrix, self-test = 1, so
mean_self - m2_grand_mean
[1] 2.51

Model score ~ method

m2 <- lm(score ~ method, data = score_data)
summary(m2)

Call:
lm(formula = score ~ method, data = score_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-23.414  -5.359  -0.196   5.750  17.804 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   25.062      0.518   48.41   <2e-16 ***
method1       -1.648      0.720   -2.29   0.0229 *  
method2        2.514      0.773    3.25   0.0013 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.08 on 247 degrees of freedom
Multiple R-squared:  0.0422,    Adjusted R-squared:  0.0345 
F-statistic: 5.45 on 2 and 247 DF,  p-value: 0.00484

What hypotheses are being tested?


Dummy coding / Treatment coding:

  • Intercept \(\beta_0\) meaning: Mean outcome of reference level
  • Intercept H0: Mean outcome of ref. level = 0
  • Slope \(\beta_j\) meaning: Difference between non-ref. level and ref. level
  • Slope H0: Difference between non-ref. level and ref. level = 0

Effect(s) coding / sum(-to-zero) coding:

  • Intercept meaning: Grand mean (mean of all group mean outcomes)
  • Intercept H0: Grand mean = 0
  • Slope meaning: Difference between mean of level coded as 1 and grand mean
  • Slope H0: Difference between mean of level coded as 1 and grand mean = 0


            Estimate Std. Error t value Pr(>|t|)    
 (Intercept)   25.062      0.518   48.41   <2e-16 ***
 method1       -1.648      0.720   -2.29   0.0229 *  
 method2        2.514      0.773    3.25   0.0013 ** 


Creating your own post-hoc contrasts

A note on terminology: A priori contrasts?
Post-hoc contrasts?

A priori contrasts are chosen prior to (before) fitting the model.

Every categorical variable must be coded with some kind of a priori contrast.

Your toolkit now includes:

  1. dummy coding = treatment coding (uses 0 and 1)
  2. effect coding = effects coding = sum-to-zero coding = sum coding (uses 1 and –1)
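As a quick R sketch, both a priori schemes have built-in generators whose output is what `contrasts()` assigns to a factor (shown here for a two-level predictor):

```r
# The two built-in a priori coding schemes for a 2-level factor.
contr.treatment(2)  # dummy/treatment coding: 0 and 1
#   2
# 1 0
# 2 1

contr.sum(2)        # effect/sum-to-zero coding: 1 and -1
#   [,1]
# 1    1
# 2   -1
```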

Post-hoc contrasts are tested post (after) fitting the model.

We can optionally use estimated marginal means to test hypotheses beyond the ones from our a priori contrasts.

We tested our first post-hoc contrasts last week, when we looked at the difference between the two non-reference levels of method.

This session, we’ll learn how to define any contrasts we want.

The data: Subjective well-being and partnership

Two research questions


  1. Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?

\(\rightarrow\) We’ll create the contrasts to address this question together.


  2. Is there a difference in subjective well-being between (1) people who are currently OR were previously married or in a civil partnership and (2) people who were never married or in a civil partnership?

\(\rightarrow\) You’ll create the contrasts for this question with your neighbours.

Visualising the data these contrasts will analyse (1)

Currently partnered vs. previously partnered:

Visualising the data these contrasts will analyse (2)

Has been married vs. never married:

Fit model


m3 <- lm(swb ~ status, wb_data)


If we don’t specify a coding scheme a priori, the model will by default use treatment coding.

The reference level will be the level of the factor that comes first in the alphabet.


To see the dummy variables the model will use, try contrasts().

contrasts(wb_data$status)
           Divorced Married/CP Single Widowed
Cohab             0          0      0       0
Divorced          1          0      0       0
Married/CP        0          1      0       0
Single            0          0      1       0
Widowed           0          0      0       1

Fit model

summary(m3)

Call:
lm(formula = swb ~ status, data = wb_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-11.337  -2.078  -0.002   2.053  11.473 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)        11.438      0.333   34.35  < 2e-16 ***
statusDivorced     -2.064      0.577   -3.58  0.00038 ***
statusMarried/CP   -0.812      0.389   -2.09  0.03721 *  
statusSingle       -3.377      0.577   -5.86  8.7e-09 ***
statusWidowed      -5.433      0.745   -7.30  1.2e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.33 on 495 degrees of freedom
Multiple R-squared:  0.142, Adjusted R-squared:  0.135 
F-statistic: 20.4 on 4 and 495 DF,  p-value: 1.38e-15


These hypothesis tests don’t address our research questions! We’ll need to define our own contrasts.

Defining manual contrasts step by step

Defining manual contrasts step by step

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?

group current
Cohab
Divorced
Married/CP
Single
Widowed

Step 1: “Chunk” together the two group(s) that the research question is comparing.

  • Chunk 1: People who are currently married or in a civil partnership: Married/CP.
  • Chunk 2: People who were previously married or in a civil partnership: Divorced, Widowed.

Defining manual contrasts step by step

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?

group current
Cohab 0
Divorced
Married/CP
Single 0
Widowed

Step 1: “Chunk” together the two group(s) that the research question is comparing.

  • Chunk 1: People who are currently married or in a civil partnership: Married/CP.
  • Chunk 2: People who were previously married or in a civil partnership: Divorced, Widowed.

Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.

  • Cohab, Single

Defining manual contrasts step by step

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?

group current
Cohab 0
Divorced
Married/CP +
Single 0
Widowed

Step 1: “Chunk” together the two group(s) that the research question is comparing.

  • Chunk 1: People who are currently married or in a civil partnership: Married/CP.
  • Chunk 2: People who were previously married or in a civil partnership: Divorced, Widowed.

Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.

  • Cohab, Single

Step 3: Assign a plus sign to every group in Chunk 1, and a minus sign to every group in Chunk 2.

Defining manual contrasts step by step

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?

group current
Cohab 0
Divorced –
Married/CP +
Single 0
Widowed –

Step 4: Count the plus signs and minus signs.

  • Plus: \(n_{plus}\) = 1
  • Minus: \(n_{minus}\) = 2

Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).

  • Plus: 1 divided by 1 = 1
  • Minus: –1 divided by 2 = –1/2

Defining manual contrasts step by step

Is there a difference in subjective well-being between (1) people who are currently married or in a civil partnership and (2) people who were previously married or in a civil partnership?

group current
Cohab 0
Divorced –1/2
Married/CP 1
Single 0
Widowed –1/2

Step 4: Count the plus signs and minus signs.

  • Plus: \(n_{plus}\) = 1
  • Minus: \(n_{minus}\) = 2

Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).

  • Plus: 1 divided by 1 = 1
  • Minus: –1 divided by 2 = –1/2

Step 6: In the coding matrix, replace the plus signs with the positive coding value from Step 5, and replace the minus signs with the negative coding value from Step 5. Done!
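The arithmetic in Steps 4 to 6 can be sketched in R. The `signs` vector below is just an illustration encoding the plus/minus assignment from Step 3 (it isn't part of the emmeans workflow itself):

```r
# Sketch of Steps 4-6: turning the +/- signs into contrast weights.
# +1 marks Chunk 1, -1 marks Chunk 2, 0 marks groups in neither chunk.
signs <- c(Cohab = 0, Divorced = -1, "Married/CP" = 1, Single = 0, Widowed = -1)

n_plus  <- sum(signs ==  1)   # Step 4: count the plus signs (here, 1)
n_minus <- sum(signs == -1)   # Step 4: count the minus signs (here, 2)

# Step 5-6: divide 1 by n_plus and -1 by n_minus, fill in the matrix.
weights <- ifelse(signs ==  1,  1 / n_plus,
           ifelse(signs == -1, -1 / n_minus, 0))
weights
#      Cohab   Divorced Married/CP     Single    Widowed
#        0.0       -0.5        1.0        0.0       -0.5
```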

Manual contrasts FAQ


  • Does it matter which chunk is positive and which chunk is negative? Not really. It’ll change whether the difference that emmeans estimates and tests is positive or negative, but the absolute value of the number should be the same.

  • Can I compare more than two chunks in a single contrast? No. This is because we’re dealing with the slope of a line between two groups. If you want to compare more than two things, then you need more than one contrast.

  • Can I use the exact same chunk in more than one contrast? No. Using the exact same chunk in more than one contrast is the categorical predictor equivalent of collinearity/non-independence of predictors.

  • How many contrasts can I have? If you have \(k\) groups, you can have up to \(k-1\) contrasts. So since we have five groups, we can have up to four contrasts.


Any other questions about manual contrasts?

Testing manual contrasts

Compute m3’s estimated marginal means for each level of status.

m3_emm <- emmeans(m3, ~status)


Next, check the order of levels with levels().

levels(wb_data$status)
[1] "Cohab"      "Divorced"   "Married/CP" "Single"     "Widowed"   


group current
Cohab 0
Divorced –1/2
Married/CP 1
Single 0
Widowed –1/2

Then make sure the order of coding values matches the order of levels.

m3_comparison1 <- list(
  "current" = c(
       0,  # Cohab
    -1/2,  # Divorced
       1,  # Married/CP 
       0,  # Single
    -1/2   # Widowed
  )
)

The estimate we get will be the grand mean of the positive chunk, minus the grand mean of the negative chunk.

Testing manual contrasts

Using this comparisons list, test the contrast (i.e., test the H0 that the difference between chunks is equal to 0):

(m3_contrast1 <- contrast(m3_emm, m3_comparison1))
 contrast estimate    SE  df t.ratio p.value
 current      2.94 0.455 495   6.460  <.0001


Get the associated 95% CIs:

(m3_confint1 <- confint(m3_contrast1))
 contrast estimate    SE  df lower.CL upper.CL
 current      2.94 0.455 495     2.04     3.83

Confidence level used: 0.95 


Where does this number for estimate come from?

It’s the grand mean of the positive chunk, minus the grand mean of the negative chunk.

mean_married - mean(c(mean_divorced, mean_widowed))
[1] 2.94


Can we reject the null hypothesis that there’s no difference in subjective well-being between people who are currently married and people who were previously married?

Your turn

Your turn: Research question 2

Is there a difference in subjective well-being between (1) people who are currently OR were previously married or in a civil partnership and (2) people who were never married or in a civil partnership?

group evermarried
Cohab
Divorced
Married/CP
Single
Widowed

Step 1: “Chunk” together the two group(s) that the research question is comparing.

  • Chunk 1: Divorced, Married/CP, Widowed.
  • Chunk 2: Cohab, Single.

Step 2: Assign a 0 to any group(s) that aren’t in one of the chunks from Step 1.

Step 3: Assign a plus sign to every group in Chunk 1, and a minus sign to every group in Chunk 2.

Step 4: Count the plus signs and minus signs.

Step 5: To figure out the actual values for each cell, start with 1 and –1. Divide 1 by \(n_{plus}\), and divide –1 by \(n_{minus}\).

Step 6: In the coding matrix, replace the plus signs with the positive coding value from Step 5, and replace the minus signs with the negative coding value from Step 5. Done!

If you finish early


  • Can you figure out what emmeans’ estimate for the evermarried comparison will be?
  • Can you code up your contrast in emmeans and test whether the difference between chunks is significantly different from zero?


levels(wb_data$status)
[1] "Cohab"      "Divorced"   "Married/CP" "Single"     "Widowed"   


m3_comparison2 <- list(
  "evermarried" = c(
    ?,  # Cohab
    ?,  # Divorced
    ?,  # Married/CP 
    ?,  # Single
    ?   # Widowed
  )
)

The results we should expect

Test the contrast (i.e., test the H0 that the difference between chunks is equal to 0):

(m3_contrast2 <- contrast(m3_emm, m3_comparison2))
 contrast    estimate    SE  df t.ratio p.value
 evermarried    -1.08 0.402 495  -2.690  0.0074

Get the associated 95% CIs:

(m3_confint2 <- confint(m3_contrast2))
 contrast    estimate    SE  df lower.CL upper.CL
 evermarried    -1.08 0.402 495    -1.87   -0.291

Confidence level used: 0.95 

Sense check: The estimate is the grand mean of the positive chunk, minus the grand mean of the negative chunk.

pos_chunk_mean <- mean(c( mean_divorced, mean_married, mean_widowed ))
neg_chunk_mean <- mean(c( mean_cohab, mean_single ))

pos_chunk_mean - neg_chunk_mean
[1] -1.08

Can we reject the null hypothesis that there’s no difference in subjective well-being between people who’ve ever been married and people who’ve never been married?

Are the contrasts orthogonal?

What’s “orthogonal”?


“Orthogonal” is a word that makes you sound smart when you really just mean “perpendicular” =
“at a 90 degree angle” = “at a right angle”.


Two orthogonal dimensions:

Three orthogonal dimensions:

Why is it nice for contrasts to be orthogonal?

If you change the value on one dimension, you don’t change the values on orthogonal dimension(s) at all.

Moving on the y dimension doesn’t affect x:

Moving on the z dimension doesn’t affect x or y:

In other words, if you know about y, you know nothing about x. This means that orthogonal dimensions, and orthogonal contrasts, contain completely different information.

Statistical models really like that!

Figuring out if contrasts are orthogonal


group current evermarried product
Cohab 0 –1/2 0
Divorced –1/2 1/3 –1/6
Married/CP 1 1/3 1/3
Single 0 –1/2 0
Widowed –1/2 1/3 –1/6

  1. Multiply the weights for each level together.
  2. Add all of those products together.

  • If this sum = 0, then the contrasts are orthogonal.
  • If this sum \(\neq\) 0, then the contrasts are non-orthogonal.


0 + (–1/6) + 1/3 + 0 + (–1/6) = 0


Orthogonal!
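The same check is one line in R, using the two weight vectors from the table (levels in the order Cohab, Divorced, Married/CP, Single, Widowed):

```r
current     <- c(   0, -1/2,   1,    0, -1/2)
evermarried <- c(-1/2,  1/3, 1/3, -1/2,  1/3)

# Orthogonality check: sum of elementwise products should be 0.
sum(current * evermarried)
# [1] 0
```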

Why does it matter if contrasts are orthogonal?


It affects how we interpret the contrasts.


If contrasts are orthogonal:

  • They are independent from one another.
  • They test mutually independent hypotheses about the data.
  • “Orthogonal” is the categorical version of “uncorrelated”.


If they’re not orthogonal:

  • “Not orthogonal” is the categorical version of “correlated”.
  • Contrasts contain some of the same information. (this idea will come up again next week).
  • It’s not obvious which contrast is contributing what information to the outcome, so estimates are harder to interpret.


If at all possible, create orthogonal contrasts. Your life will be easier.


Building an analysis workflow


Revisiting this week’s learning objectives

Dummy coding is one common scheme for a priori contrast coding. What’s another common scheme, and how is it different?

  • Effects coding, also called sum-to-zero coding (or just “sum coding”).
  • Dummy coding uses 0/1, and one level of the predictor (the one coded as 0) is the reference level.
  • Effects coding uses –1/1, and there is no reference level.

When we code predictors using this other coding scheme, how do we interpret the linear model’s coefficients?

  • Intercept (also written as \(\beta_0\)): The grand mean of the outcome (grand mean = the mean of every group’s mean).
  • Slope (also written as \(\beta_1\), \(\beta_2\), etc., or for short, \(\beta_j\)): The difference between (1) the mean of a group and (2) the grand mean, when all other predictors are at zero.

Revisiting this week’s learning objectives

In this other coding scheme, what hypotheses are tested for each coefficient?

  • The intercept’s hypothesis test: whether the grand mean of the outcome is different from zero (H0: grand mean = 0).
  • The slopes’ hypothesis tests: whether the difference between (1) the mean of each individual group and (2) the grand mean is different from zero (H0: this difference = 0).

How can we test hypotheses other than the ones tested by a priori coding schemes?

  • Use a linear model to generate the estimated outcome values for every level of our predictors (= the estimated marginal means).
  • We can compare the estimated marginal means of any combination of groups we want by manually creating our own contrasts.
  • There is a step-by-step process we can follow to create any contrasts we want.

This week


Tasks


Attend your lab and work together on the exercises

Support


Help each other on the Piazza forum


Complete the weekly quiz

Attend office hours (see Learn page for details)

Appendix

Prediction equations: Effect coding, three levels

The linear expression telling us model predictions:

\[ \widehat{\text{outcome}}= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2}) \]

We can combine the estimated betas to compute the means of each level. To do this, we plug in the values for Predictor1 and Predictor2 that correspond to each level of the three-level variable. We get these values from the rows of our coding matrix.


contr.sum(3)
  [,1] [,2]
1    1    0
2    0    1
3   -1   -1

Level 1 is represented as Predictor1 = 1, Predictor2 = 0
(first row of contrast matrix).

\[ \begin{align} \widehat{\text{outcome}}_{\text{level 1}} &= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})\\ &= \hat{\beta_0} + (\hat{\beta_1} \cdot 1) + (\hat{\beta_2} \cdot 0)\\ &= \hat{\beta_0} + \hat{\beta_1}\\ \end{align} \]

Prediction equations: Effect coding, three levels

contr.sum(3)
  [,1] [,2]
1    1    0
2    0    1
3   -1   -1

Level 2 is represented as Predictor1 = 0, Predictor2 = 1
(second row of contrast matrix).

\[ \begin{align} \widehat{\text{outcome}}_{\text{level 2}} &= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})\\ &= \hat{\beta_0} + (\hat{\beta_1} \cdot 0) + (\hat{\beta_2} \cdot 1)\\ &= \hat{\beta_0} + \hat{\beta_2}\\ \end{align} \]


Level 3 is represented as Predictor1 = -1, Predictor2 = -1
(third row of contrast matrix).

\[ \begin{align} \widehat{\text{outcome}}_{\text{level 3}} &= \hat{\beta_0} + (\hat{\beta_1} \cdot \text{Predictor1}) + (\hat{\beta_2} \cdot \text{Predictor2})\\ &= \hat{\beta_0} + (\hat{\beta_1} \cdot -1) + (\hat{\beta_2} \cdot -1)\\ &= \hat{\beta_0} - \hat{\beta_1} - \hat{\beta_2}\\ &= \hat{\beta_0} - (\hat{\beta_1} + \hat{\beta_2})\\ \end{align} \]
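As a check, plugging m2’s estimates into these three equations recovers the group means from earlier (coefficient values copied from the summary(m2) output):

```r
# Coefficient estimates from summary(m2).
b0 <- 25.062   # intercept: grand mean
b1 <- -1.648   # method1: mean of read minus grand mean
b2 <-  2.514   # method2: mean of self-test minus grand mean

b0 + b1          # level 1 (read):      23.414
b0 + b2          # level 2 (self-test): 27.576
b0 - (b1 + b2)   # level 3 (summarise): 24.196
```

These match the group means mean_read, mean_self, and mean_summ up to rounding.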

Grand mean vs. overall mean of the data

The grand mean is the mean of the group means.

The grand mean is sometimes, but not always, the overall mean of the observed data.

  • When the groups are all exactly the same size, then the grand mean = the overall mean of observed data.
  • But when the groups are different sizes (like with the subjective well-being data), then the grand mean \(\neq\) the overall mean.
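A tiny made-up example (toy numbers, not the course data) illustrates both cases:

```r
# Toy data: two equal-sized groups, plus one larger group.
g1 <- c(10, 12)
g2 <- c(20, 22)
g3 <- c(20, 22, 24, 26)

# Equal group sizes: grand mean == overall mean.
mean(c(mean(g1), mean(g2))) == mean(c(g1, g2))   # TRUE (both are 16)

# Unequal group sizes: grand mean != overall mean.
mean(c(mean(g1), mean(g3))) == mean(c(g1, g3))   # FALSE (17 vs. 19)
```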
wb_data |>
  group_by(status) |>
  count()
# A tibble: 5 x 2
# Groups:   status [5]
  status         n
  <fct>      <int>
1 Cohab        100
2 Divorced      50
3 Married/CP   275
4 Single        50
5 Widowed       25


The mean of all swb values (not the grand mean):

wb_data$swb |> mean()
[1] 10.2

The grand mean (not the mean of all swb values):

mean(c(mean_cohab, mean_divorced, mean_married, mean_single, mean_widowed))
[1] 9.1

Visualise effect coding with >2 levels

contrasts(score_data$method)
          [,1] [,2]
read         1    0
self-test    0    1
summarise   -1   -1

The \(\times\) shows the grand mean = the model’s intercept.

Once effect coding uses >2 levels, the line between group means does not give us the correct intercept or slope.

evermarried


group evermarried
Cohab –1/2
Divorced 1/3
Married/CP 1/3
Single –1/2
Widowed 1/3
m3_comparison2 <- list(
  "evermarried" = c(
    -1/2,  # Cohab
     1/3,  # Divorced
     1/3,  # Married/CP 
    -1/2,  # Single
     1/3   # Widowed
  )
)