These exercises cover the lecture materials from weeks 1 and 2 of semester 2. Take this lab slowly and make sure you understand each step: it is important material, so don’t rush through it.
You will have those two weeks to work through these exercises.
You might remember that we use a t-test to test whether two population means are equal. But what if we want to compare more than two groups?
ANOVA, which stands for ANalysis Of VAriance, is a method to assess whether the observed differences among sample means are statistically significant. For example, could a variation among three sample means this large be plausibly due to chance, or is it good evidence for a difference among the population means?
In this week’s exercises you will be asked to use one-way ANOVA to test for a difference in means among several groups specified by the levels of a single factor.
The null and alternative hypotheses used for comparing \(g\) population means can be written as follows:¹ \[ H_0 : \mu_1 = \mu_2 = \cdots = \mu_g \qquad \text{vs.} \qquad H_1 : \text{the means are not all equal} \]
Although the name “analysis of variance” may seem an odd choice for a method that compares group means, we will in fact establish whether there is a significant difference among the means by comparing the variability between the group means to the variability within each group.
Consider the two figures below, each corresponding to a different dataset. Each dataset comprises three samples. The sample observations are shown as black dots. Furthermore, each sample mean is displayed as a red diamond.
Even if the null hypothesis were true, and the means of the populations underlying the groups were equal, we would still expect the sample means to vary a bit.
For each dataset, we wish to decide whether the sample means are different enough for us to reject the null hypothesis of equal population means. Conversely, if they are sufficiently close, we will simply attribute the observed differences to the natural variability from sample to sample.
What do you think?
It is fairly easy to see that the means in dataset (B) differ. It’s hard to imagine that those means could be that far apart from natural sampling variability alone.

How about dataset (A)? It looks like these observations could have occurred from three populations having the same mean. The variation among the group means (red diamonds) in plot (A) does seem consistent with the null hypothesis of equal population means.
Believe it or not, both plots have the same means! The group means in both plots are 11.7, 26, and 17.4, respectively.
So what is making the figures look so different?

In plot (B), the variation within each group is so small that the differences between the means stand out. In plot (A), by contrast, the variation within each group is much larger than the variation among the group means. This, in turn, increases the range of the y-axis, making it difficult to spot any differences between the means.
To verify our intuition, we will compute the residuals. These correspond to each observation minus the mean of the group it belongs to.
We might now compare the variability among the group means with the variability of the residuals (observation minus group mean). The following table reports the relevant variances:²
| Variability | Dataset A | Dataset B |
|---|---|---|
| Means | 35.74 | 35.74 |
| Residuals | 591.2 | 1.005 |
For dataset (A), the variability among the group means is smaller than the variability of the residuals.
For dataset (B), the variability among the group means is larger than the variability of the residuals.
This might prompt us to consider the following ratio, \[ \frac{\text{variability among means}}{\text{variability of residuals}} \] also called the F-statistic, and we typically reject the null hypothesis for large values of the ratio (meaning that the variability among the means is larger than the variability of the residuals).
This is the central idea of the F-test used in ANOVA. If the variation of the group means is much larger than the variation of the data within each group, we reject the null hypothesis and conclude that at least two of the population means are different. Conversely, if the variation of the data within each group is much larger than the variation between the group means, we do not have sufficient evidence to reject the null hypothesis of equal population means.
How do we compute the variation among the residuals (or among the group means)? Do you remember the formula for the variance? We will use the quantity in the numerator of the variance, which is called the sum of squares. This is why the method is called analysis of variance.
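To make this concrete, here is a minimal R sketch (with made-up numbers, not the datasets plotted above) computing the two sums of squares and their ratio by hand:

```r
# Three groups of five made-up observations each
y   <- c(10, 12, 11, 13, 12,   # group 1
         26, 25, 27, 26, 26,   # group 2
         17, 18, 17, 16, 18)   # group 3
grp <- factor(rep(1:3, each = 5))

grp_means <- tapply(y, grp, mean)   # one sample mean per group
resids    <- y - grp_means[grp]     # observation minus its group mean

# Sums of squares: the numerators of the two variances
ss_between <- sum(5 * (grp_means - mean(y))^2)  # variability among the means
ss_within  <- sum(resids^2)                     # variability of the residuals

# F-statistic: each sum of squares is scaled by its degrees of freedom
# (g - 1 = 2 between groups, N - g = 15 - 3 = 12 within groups)
f_stat <- (ss_between / 2) / (ss_within / 12)
f_stat
```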
Do you think the type of background music playing in a restaurant affects the amount of money that diners spend on their meal?
Load the tidyverse library.
Read the restaurant spending data into R using the function `read_csv()` and name the data `rest_spend`.
Check for the correct encoding of all variables — that is, categorical variables should be factors and numeric variables should be numeric.
Hint: transform a variable to a factor by using the functions `factor()` or `as.factor()`.
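A minimal sketch of these first steps, assuming the data file is called `restaurant_spending.csv` (a placeholder name; use the path given in your lab instructions):

```r
library(tidyverse)

# Placeholder file name -- replace with the path provided for this lab
rest_spend <- read_csv("restaurant_spending.csv")

# Encode the music type as a factor; amount should already be numeric
rest_spend <- rest_spend %>%
  mutate(type = as.factor(type))

glimpse(rest_spend)  # check that each variable has the correct type
```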
Create a table of summary statistics showing the number of people in each type of background music, along with their average spending and standard deviation.
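One way to build such a table, assuming the variables are named `amount` and `type` as in the models fitted later in this lab:

```r
rest_spend %>%
  group_by(type) %>%
  summarise(n         = n(),            # number of diners per music type
            avg_spend = mean(amount),   # average spending
            sd_spend  = sd(amount))     # standard deviation of spending
```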
Produce a boxplot displaying the relationship between restaurant spending and music type.
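For example, a `ggplot2` sketch along these lines should do:

```r
ggplot(rest_spend, aes(x = type, y = amount)) +
  geom_boxplot() +
  labs(x = "Type of background music", y = "Restaurant spending")
```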
Using the boxplot, comment on any observed differences among the sample means of the three background music types.
Please note that this question just asks you to comment on any visible differences among the groups, not to run any formal test or analysis.
Each music type is considered as a different population with its own mean spending. However, the three populations (there is one for each music type) are assumed to have the same variance.
R by default orders the levels of a categorical variable in alphabetical order:
levels(rest_spend$type)
## [1] "Classical Music" "No Music" "Pop Music"
If we follow the same approach and order the groups by alphabetical ordering, we can represent the data as follows:
| Population Name | Population ID | Sample observations | Population mean |
|---|---|---|---|
| Classical Music | 1 | \(y_{1,1}, y_{1,2}, \ldots, y_{1,n}\) | \(\mu_{1}\) |
| No Music | 2 | \(y_{2,1}, y_{2,2}, \ldots, y_{2,n}\) | \(\mu_{2}\) |
| Pop Music | 3 | \(y_{3,1}, y_{3,2}, \ldots, y_{3,n}\) | \(\mu_{3}\) |
For the restaurant spending data, what is the number of groups (\(g\)) and the numbers of observations per group (\(n\))?
The data are assumed to follow the model: \[ y_{i,j} = \mu_{i} + \epsilon_{i,j} \qquad \begin{cases} i = 1, 2, 3 \\ j = 1, 2, ..., 120 \end{cases} \]
where the \(\epsilon_{i,j}\)s are assumed to be independent and follow a \(N(0, \sigma)\) distribution.
In the model above, \(y_{i,j}\) denotes the \(j\)th observation from the \(i\)th group, \(\mu_i\) is the mean of the \(i\)th population, and \(\epsilon_{i,j}\) is the random error for observation \(j\) in group \(i\).

The model parameters are \(\mu_1, \mu_2, \mu_3\) and \(\sigma\).
There is another rewriting of the above model which is widely used in practice.
To get there, we first need to define: \[ \mu_i = \beta_0 + \underbrace{(\mu_{i} - \beta_0)}_{\beta_i} = \beta_0 + \beta_i \]
That is, each group mean is the sum of a baseline value \(\beta_0\), common to all groups, and a group-specific effect \(\beta_i = \mu_i - \beta_0\).
We have silently introduced one additional parameter (\(\beta_0\)), and now we have 4 parameters (\(\beta_0, \beta_1, \beta_2, \beta_3\)) to model only 3 group means (\(\mu_1, \mu_2, \mu_3\)).
You can think of \(\beta_0\) as the baseline value, and the \(\beta_1, \beta_2, \beta_3\) as group effects.
This might seem like we are overcomplicating the model, but it will make interpretation much easier in the context of multiple factors (to be discussed in the coming weeks).
The effects model is: \[ y_{i,j} = \beta_0 + \beta_i + \epsilon_{i,j}\qquad \begin{cases} i = 1, 2, ..., g \\ j = 1, 2, ..., n \end{cases} \]
with the \(\epsilon_{i,j}\)s independently following a \(N(0, \sigma)\) distribution as usual.
PROBLEM
As we have four \(\beta\)s to model 3 group means (\(\mu\)s), the model is said to be non-identifiable: we can no longer distinguish whether a group mean is due to a group effect or not.
For example, we can get the same group mean \(\mu_i = 10\), say, either by setting \(\beta_0 = 10\) and \(\beta_i = 0\) (baseline only, no group effect), or by setting \(\beta_0 = 0\) and \(\beta_i = 10\) (group effect only).
To make the model identifiable again, we need to introduce a constraint on the parameters, effectively reducing the degrees of freedom to three, which is also the number of group means.
Before continuing, however, let us define the overall mean as \[ \mu = \frac{\mu_1 + \mu_2 + \mu_3}{3} \]
Possible side-constraints on the parameters are:
| Name | Constraint | Meaning of \(\beta_0\) | R |
|---|---|---|---|
| Sum to zero | \(\beta_1 + \beta_2 + \beta_3 = 0\) | \(\beta_0 = \mu\) | `contr.sum` |
| Reference group | \(\beta_1 = 0\) | \(\beta_0 = \mu_1\) | `contr.treatment` |
Under the constraint \(\beta_1 + \beta_2 + \beta_3 = 0\), the model coefficients are interpreted as follows:
The interpretation of \(\beta_0\) follows from the constraint: \[ \begin{aligned} \beta_1 + \beta_2 + \beta_3 &= 0 \\ (\mu_1 - \beta_0) + (\mu_2 - \beta_0) + (\mu_3 - \beta_0) &= 0 \\ \mu_1 + \mu_2 + \mu_3 &= 3 \beta_0 \\ \beta_0 &= \frac{\mu_1 + \mu_2 + \mu_3}{3} = \mu \end{aligned} \] which shows that \(\beta_0\) corresponds to the global mean, i.e. the average of the three group means.
The treatment effect represents the deviation between the mean response produced by treatment \(i\) and the global mean: \[ \beta_i = \mu_i - \beta_0 = \mu_i - \mu \] If the treatment effect is positive (negative), the \(i\)th treatment produces an increase (decrease) in the mean response.
Typically, software doesn’t return the last coefficient \(\beta_3\), as it can be found using the side-constraint: \[ \begin{aligned} \beta_1 + \beta_2 + \beta_3 &= 0 \\ \beta_3 &= -(\beta_1 + \beta_2) \end{aligned} \] Each group mean is then found as: \[ \mu_1 = \beta_0 + \beta_1, \qquad \mu_2 = \beta_0 + \beta_2, \qquad \mu_3 = \beta_0 + \beta_3 = \beta_0 - (\beta_1 + \beta_2) \]
In practice, this is done by fitting the linear regression model
\[ y = b_0 + b_1 \ \text{EffectLevel1} + b_2 \ \text{EffectLevel2} + \epsilon \]
where
\[ \text{EffectLevel1} = \begin{cases} 1 & \text{if observation is from category 1} \\ 0 & \text{if observation is from category 2} \\ -1 & \text{otherwise} \end{cases} \\ \text{EffectLevel2} = \begin{cases} 1 & \text{if observation is from category 2} \\ 0 & \text{if observation is from category 1} \\ -1 & \text{otherwise} \end{cases} \]
Schematically,
\[ \begin{matrix} \textbf{Level} & \textbf{EffectLevel1} & \textbf{EffectLevel2} \\ \hline \text{Classical Music} & 1 & 0 \\ \text{No Music} & 0 & 1 \\ \text{Pop Music} & -1 & -1 \end{matrix} \]
In R this is simply done by saying:
contrasts(rest_spend$type) <- "contr.sum"
mdl_sum <- lm(amount ~ 1 + type, data = rest_spend) # or: lm(amount ~ type, data = rest_spend)
coef(mdl_sum)
## (Intercept) type1 type2
## 22.7381560 1.4359846 -0.5967688
where `type` is a factor. In that case R will automatically create the two variables `EffectLevel1` and `EffectLevel2` for you!
The `summary(mdl_sum)` output will return 3 estimated coefficients: the intercept \(b_0\), plus \(b_1\) (`type1`) and \(b_2\) (`type2`). The third effect \(b_3\) does not appear in the `summary` output, as it is found via the side-constraint as \(b_3 = -(b_1 + b_2) = -0.84\)!

Under the constraint \(\beta_1 = 0\), the first factor level acts as the reference group, and the remaining coefficients are interpreted relative to it.
Roughly speaking, \(\beta_i\) can still be interpreted as the “treatment effect,” but compared to a reference category: typically a control group or placebo.
The interpretation of \(\beta_0\) follows from: \[ \begin{aligned} \beta_1 &= 0 \\ \mu_1 - \beta_0 &= 0 \\ \beta_0 &= \mu_1 \end{aligned} \]
The interpretation of the \(\beta_i\)s follows from: \[ \begin{aligned} \beta_i = \mu_i - \beta_0 = \mu_i - \mu_1 \end{aligned} \] showing that it is the deviation between the mean response in group \(i\) and the mean response in the reference group.
As \(\beta_1 = 0\), each group mean is found as: \[ \mu_1 = \beta_0, \qquad \mu_2 = \beta_0 + \beta_2, \qquad \mu_3 = \beta_0 + \beta_3 \]
In practice, this is done by fitting the linear regression model
\[ y = b_0 + b_1 \ \text{IsTypeNoMusic} + b_2 \ \text{IsTypePopMusic} +\epsilon \]
where
\[ \text{IsTypeNoMusic} = \begin{cases} 1 & \text{if observation is from the No Music category} \\ 0 & \text{otherwise} \end{cases} \\ \text{IsTypePopMusic} = \begin{cases} 1 & \text{if observation is from the Pop Music category} \\ 0 & \text{otherwise} \end{cases} \]
Schematically,
\[ \begin{matrix} \textbf{Level} & \textbf{IsTypeNoMusic} & \textbf{IsTypePopMusic} \\ \hline \text{Classical Music} & 0 & 0 \\ \text{No Music} & 1 & 0 \\ \text{Pop Music} & 0 & 1 \end{matrix} \]
In R this is simply done by saying:
contrasts(rest_spend$type) <- "contr.treatment"
mdl_trt <- lm(amount ~ 1 + type, data = rest_spend) # or: lm(amount ~ type, data = rest_spend)
coef(mdl_trt)
## (Intercept) typeNo Music typePop Music
## 24.174141 -2.032753 -2.275200
where `type` is a factor. In that case R will automatically create the two dummy variables `IsTypeNoMusic` and `IsTypePopMusic` for you!
The `summary(mdl_trt)` output will return 3 estimated coefficients: the intercept \(b_0\) (the mean of the reference group), plus \(b_1\) (`typeNo Music`) and \(b_2\) (`typePop Music`).
IMPORTANT
- By default, R uses the reference group constraint.
- If your factor has \(g\) levels, your regression model will have \(g-1\) dummy variables (R creates them for you).
By default, R uses the reference group constraint. In this section and the next one, we will not specify any side-constraints, meaning that \(\beta_1 = 0\).
Under the reference group constraint, the interpretation of the model coefficients is as follows: \[ \mu_1 = \beta_0, \qquad \mu_2 = \beta_0 + \beta_2, \qquad \mu_3 = \beta_0 + \beta_3 \]
Show that under the reference group constraint, the null hypothesis of the F-test for model utility is equivalent to the ANOVA null hypothesis. In other words, demonstrate that \[ H_0: \beta_2 = \beta_3 = 0 \] is equivalent to \[ H_0 : \mu_1 = \mu_2 = \mu_3 \]
Fit a linear model to the data, having `amount` as response variable and the factor `type` as predictor. Call the fitted model `mdl_rg` (for reference group constraint, which R uses by default!).
Identify the relevant pieces of information from the commands `anova(mdl_rg)` and `summary(mdl_rg)` that can be used to conduct an ANOVA F-test against the null hypothesis that all population means are equal.
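As a sketch, the fit and the two inspection commands might look as follows:

```r
# R uses the reference group (contr.treatment) constraint by default
mdl_rg <- lm(amount ~ type, data = rest_spend)

anova(mdl_rg)    # ANOVA table: F-statistic and p-value for the type factor
summary(mdl_rg)  # estimated coefficients, plus the same F-test at the bottom
```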
Interpret the F-test results in the context of the ANOVA null hypothesis.
Examine and investigate the meaning of the coefficients in the output of `summary(mdl_rg)`.
The function `dummy.coef()` returns the coefficients for all groups, including the reference category.

Run the `dummy.coef()` function on the fitted model, and examine the output.
Obtain the estimated (or predicted) group means for the “Classical Music,” “No Music,” and “Pop Music” groups by using the `predict(mdl_rg, newdata = ...)` function.

Compare the output of the `predict()` function with that from `dummy.coef()`.
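One possible way to carry out this comparison (the name `query_groups` is just an illustrative choice):

```r
dummy.coef(mdl_rg)  # coefficients for all groups, including the reference one

# Predicted group means: newdata must use the same variable name ("type")
# and the same level labels as the original data
query_groups <- data.frame(type = c("Classical Music", "No Music", "Pop Music"))
predict(mdl_rg, newdata = query_groups)
```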
It actually makes more sense to have “No Music” as the reference group.
Open the help page of the function `fct_relevel()`, and look at the examples.

Within the `rest_spend` data, reorder the levels of the factor `type` so that your reference group is “No Music.”
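A minimal sketch using `fct_relevel()` (from the forcats package, loaded as part of the tidyverse):

```r
rest_spend <- rest_spend %>%
  mutate(type = fct_relevel(type, "No Music"))

levels(rest_spend$type)  # "No Music" should now be the first level
```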
NOTE

Because you reordered the levels, “No Music” is now the first level of the factor. Under the reference group constraint, the intercept \(\beta_0\) therefore corresponds to the mean of the “No Music” group, and the remaining coefficients measure deviations from it.
Refit the linear model, and inspect the model summary once more.
Do you notice any change in the estimated coefficients?
Do you notice any change in the F-test for model utility?
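For instance (the name `mdl_rg2` is just a choice, to keep the earlier fit around for comparison):

```r
mdl_rg2 <- lm(amount ~ type, data = rest_spend)
summary(mdl_rg2)  # compare the coefficients and F-test with those of mdl_rg
```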
Let’s compare some population means…
Interpret the results of the t-test for the significance of the model coefficients.
Let’s now change the side-constraint from the R default, the reference group or `contr.treatment` constraint, to the sum-to-zero constraint.

Recall that under this constraint the interpretation of the coefficients becomes: \(\beta_0 = \mu\), the overall mean (i.e. the average of the group means), and \(\beta_i = \mu_i - \mu\), the deviation of the \(i\)th group mean from the overall mean.

We set the sum-to-zero side-constraint by saying:
contrasts(rest_spend$type) <- "contr.sum"
Set the sum-to-zero constraint for the factor `type` of background music.
Fit the linear model again, using `amount` as response variable and `type` as predictor. Call the fitted model using the sum-to-zero constraint `mdl_stz`.
Examine the output of the `summary(mdl_stz)` function.
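Putting these steps together, a sketch might be:

```r
contrasts(rest_spend$type) <- "contr.sum"  # sum-to-zero side-constraint

mdl_stz <- lm(amount ~ type, data = rest_spend)
summary(mdl_stz)  # b0 estimates the overall mean; the others are group effects
```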
Let’s do more hypothesis tests on the population means…
Interpret the results of the t-tests on the coefficients of the model fitted using the sum to zero side-constraint.
We can switch back to the default reference group constraint with either of these commands:
# Option 1
contrasts(rest_spend$type) <- NULL
# Option 2
contrasts(rest_spend$type) <- "contr.treatment"
When the ANOVA F-test against the null hypothesis \(H_0: \mu_1 = \mu_2 = \cdots = \mu_g\) is rejected, it tells us that there are significant differences between at least two of the group means. However, it does not tell us which particular population means are different. Further investigation should be conducted to find out exactly which population means differ. In some cases the investigator will have preplanned comparisons they would like to make; in other situations they may have no idea what differences to look for.
Contrasts represent a wide range of hypotheses about the population means that can be tested using the fitted model.
However, they need to match the following general form. A contrast only allows us to test hypotheses that can be written as a linear combination of the population means with coefficients summing to zero: \[ H_0 : c_1 \mu_1 + c_2 \mu_2 + c_3 \mu_3 = 0 \qquad \text{with} \qquad c_1 + c_2 + c_3 = 0 \]
We have already performed some simple contrasts by looking at the model coefficients under the reference group constraint. Those represented pairwise comparisons between each group and the reference group.
But what if you wanted to test other hypotheses, such as a combination of two groups against a third group?
To do so, install the library `emmeans` (once and for all) via:
install.packages("emmeans")
Then, every time you want to use any of its functions, remember that you need to load it via:
library(emmeans)
We were interested in the following comparisons:

1. the mean spending with no music versus the average of the mean spendings under the two types of music (classical and pop);
2. the mean spending with no music versus the mean spending with pop music.
These are planned comparisons and can be translated to the following research hypotheses: \[ \begin{aligned} 1. \quad H_0 &: \mu_\text{No Music} = \frac{1}{2} (\mu_\text{Classical Music} + \mu_\text{Pop Music}) \\ 2. \quad H_0 &: \mu_\text{No Music} = \mu_\text{Pop Music} \end{aligned} \]
First of all, check the levels of the factor `type`, as the contrast coefficients need to match the order of the levels:
levels(rest_spend$type)
## [1] "No Music" "Classical Music" "Pop Music"
We specify the hypotheses by first giving each one a name, and then specifying the coefficients of the comparison:
comp <- list("No Music - Some Music" = c(1, -1/2, -1/2),
"No Music - Pop Music" = c(1, 0, -1))
Then, we load the `emmeans` package:
library(emmeans)
Use the `emmeans()` function to obtain the estimated treatment means and their uncertainties for your factor:
emm <- emmeans(mdl_stz, ~ type)
emm
## type emmean SE df lower.CL upper.CL
## No Music 22.1 0.259 357 21.6 22.7
## Classical Music 24.2 0.259 357 23.7 24.7
## Pop Music 21.9 0.259 357 21.4 22.4
##
## Confidence level used: 0.95
Next, we run the contrast analysis. To do so, we use the `contrast()` function:
comp_res <- contrast(emm, method = comp)
comp_res
## contrast estimate SE df t.ratio p.value
## No Music - Some Music -0.895 0.318 357 -2.818 0.0051
## No Music - Pop Music 0.242 0.367 357 0.661 0.5090
Finally, we can ask for 95% confidence intervals:
confint(comp_res)
## contrast estimate SE df lower.CL upper.CL
## No Music - Some Music -0.895 0.318 357 -1.520 -0.270
## No Music - Pop Music 0.242 0.367 357 -0.479 0.964
##
## Confidence level used: 0.95
Interpret the result of the contrast analysis.
¹ The alternative hypothesis that “the means are not all equal” should not be confused with “all the means are different.” With 10 groups we could have 9 means equal to each other and 1 different; the null hypothesis would still be false.

² Recall that the variance is the squared standard deviation.