Assumptions for LMMs

Why do models make assumptions?

We can think of a linear model as a description of the process that generated our data.

Models are simplifications of processes going on in the world. To simplify these processes, a model specifies or “hard-codes” certain aspects of the data-generating process, and those aspects cannot be changed. These unchangeable aspects of a model’s data-generating process are what we call “assumptions”.
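To make the idea of a hard-coded data-generating process concrete, here is a minimal sketch in Python of how a simple linear model assumes outcomes arise. The function name, parameter names, and numeric values are illustrative assumptions, not part of the course materials.

```python
import random


def simulate_linear_model(x_values, intercept, slope, sigma, seed=0):
    """Generate outcomes the way a simple linear model assumes they arise."""
    rng = random.Random(seed)
    outcomes = []
    for x in x_values:
        # Hard-coded aspect 1: the expected outcome lies on a straight line.
        expected = intercept + slope * x
        # Hard-coded aspects 2-4: each error is drawn independently, from a
        # normal distribution, with the same standard deviation for every x.
        error = rng.gauss(0.0, sigma)
        outcomes.append(expected + error)
    return outcomes


# Illustrative values only.
x = [i / 10 for i in range(100)]
y = simulate_linear_model(x, intercept=2.0, slope=0.5, sigma=1.0)
```

The intercept, slope, and error standard deviation can be estimated from data, but the structure of the function (a straight line plus independent, normal, equal-variance errors) is fixed. Those fixed parts correspond to the assumptions.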

Assumptions are not like significance tests: it’s not the case that the assumption is either “accepted” or “rejected”. The decision process is blurrier than that. When evaluating a model’s assumptions, we are asking ourselves: Are we satisfied that the unchangeable aspects of the model are reasonable enough simplifications, so that the model will still give us reasonable parameter estimates?

Assumptions of all linear models

Think back to the assumptions of linear models you learned in DAPR2 (see also the DAPR2 assumptions flash cards):

Linearity of association:

  • A linear model can only model associations between predictor and outcome in terms of a straight line. Within the basic linear model machinery, this cannot be changed, and we just have to accept it. That’s what it means to be an “assumption”.
    • If an association doesn’t follow a straight line, then a linear model cannot accurately capture it.
    • Think of predicting height as a function of age, or height ~ age, for example. Height rises steeply at young ages and then eventually tapers off (and may even start to decrease in old age!).
    • If we use a linear model to model height ~ age, then our model will give us the best straight line it can.
  • To be satisfied with a linear model of that data, we have to assume that the best possible straight line is a good representation of the data.
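The height ~ age example can be sketched as follows. This is an illustration under stated assumptions: the growth curve is a made-up, noise-free saturating function (not real height data), and fit_line is a hypothetical helper that does ordinary least squares for one predictor.

```python
import math


def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean(values):
    return sum(values) / len(values)


# Made-up curve: height rises steeply at young ages, then levels off.
ages = list(range(0, 41))
heights = [50 + 125 * (1 - math.exp(-age / 8)) for age in ages]

intercept, slope = fit_line(ages, heights)
residuals = [h - (intercept + slope * a) for a, h in zip(ages, heights)]

# Average residual for the youngest, middle, and oldest ages.
young, middle, old = residuals[:10], residuals[15:26], residuals[-10:]
print(mean(young) < 0, mean(middle) > 0, mean(old) < 0)
```

The best straight line overshoots at both ends of the age range and undershoots in the middle, so the residuals follow a systematic pattern. That kind of pattern is a sign that a straight line may not be a reasonable simplification.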

Independence of errors:

  • A linear model can only model errors as residuals that are independent from one another. Within the basic linear model machinery, this cannot be changed.
    • One common source of non-independence is data points that come from the same source (for example, multiple data points from the same person, or from the same test question; see grouping variables).
  • To be satisfied with a simple linear model of our data, we have to assume that independent residuals are a good representation of the data.
  • If our residuals would not be independent in a simple linear model, then we must fit an LMM, or in other words, include random effects.
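Here is a minimal sketch of what non-independent residuals can look like when data points come from the same source. The participant labels, offsets, and noise level are invented for illustration. The code fits one straight line that ignores which participant each row came from.

```python
import random

rng = random.Random(1)

# Hypothetical data: 5 participants with 10 observations each. Every
# participant has their own baseline offset, which the simple model ignores.
offsets = {"p1": -20.0, "p2": -10.0, "p3": 0.0, "p4": 10.0, "p5": 20.0}
rows = []  # each row is (participant, x, y)
for participant, offset in offsets.items():
    for x in range(10):
        y = 100.0 + offset + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((participant, x, y))

# Fit one straight line to all rows by ordinary least squares.
xs = [x for _, x, _ in rows]
ys = [y for _, _, y in rows]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Average residual per participant.
mean_residual = {}
for participant in offsets:
    res = [y - (intercept + slope * x) for p, x, y in rows if p == participant]
    mean_residual[participant] = sum(res) / len(res)

for participant, value in mean_residual.items():
    print(participant, round(value, 1))
```

In this sketch, residuals from the same participant tend to share a sign and size: knowing one residual from a participant tells you something about that participant's other residuals. That is what it means for residuals not to be independent.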

Normality of errors:

  • A linear model can only model errors as residuals that are normally distributed. Within the basic linear model machinery, this cannot be changed.
  • To be satisfied with a linear model of our data, we have to assume that normally-distributed residuals are a good representation of the data.

Equal variance of errors:

  • A linear model can only model errors as residuals that have equal variance. Within the basic linear model machinery, this cannot be changed.
  • To be satisfied with a linear model of our data, we have to assume that residuals with equal variance are a good representation of the data.
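As a rough illustration of what a violation of equal variance could look like, the sketch below draws errors whose spread grows with the predictor and compares the spread at low and high predictor values. The values are invented, and for simplicity the code inspects the simulated errors directly rather than residuals from a fitted model.

```python
import random
import statistics

rng = random.Random(2)

# Hypothetical errors whose standard deviation grows with the predictor x.
# The equal-variance assumption rules this out: it hard-codes one shared
# standard deviation for all errors.
xs = list(range(1, 101))
errors = [rng.gauss(0.0, 0.2 * x) for x in xs]

sd_low = statistics.stdev(errors[:50])   # spread of errors for x = 1..50
sd_high = statistics.stdev(errors[50:])  # spread of errors for x = 51..100
print(f"SD at low x: {sd_low:.1f}, SD at high x: {sd_high:.1f}")
```

If the spread at high values of x is clearly larger than at low values, a single shared error variance may be a poor simplification of the data.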

A new assumption in LMMs

In LMMs, we additionally assume that the random effects—that is, the group-level adjustments to the fixed effects—are normally distributed.
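As a sketch, this extra assumption can be written out for the simplest case, a model with one predictor and by-group random intercepts. The notation is an assumption made here for illustration (i indexes observations, j indexes groups) and may differ from the notation used elsewhere in the course.

```latex
y_{ij} = (\beta_0 + u_{0j}) + \beta_1 x_{ij} + \varepsilon_{ij},
\qquad u_{0j} \sim N(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
```

Here u_{0j} is group j's adjustment to the fixed intercept β0. The new assumption is the statement that these adjustments follow a normal distribution centred on zero, alongside the familiar assumption about the errors ε_{ij}.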

For how to check assumptions in LMMs, see Check assumptions.
