Preliminaries

Create a new R Script or RMarkdown document (whichever you prefer working with) and give it a title for this week.

Some extra background reading

We have already seen in the last couple of weeks that we can use MLM to study something ‘over the course of X’. In the novel word learning experiment from last week’s lab, we investigated the change in the probability of answering correctly over the course of the experimental blocks.

We’ve talked about how “longitudinal” is the term commonly used to refer to any data in which repeated measurements are taken over a continuous domain. This opens up the potential for observations to be unevenly spaced, or missing at certain points. It also, as will be the focus of this week, opens the door to thinking about how many effects of interest are likely to display non-linear patterns. These exercises focus on including higher-order polynomials in the multi-level model to capture non-linearity.

Linear vs Non-Linear

Suppose we had collected the data in Figure 1, and we wanted to fit a model to predict \(y\) based on the values of \(x\).

Let’s use our old friend linear regression, \(y = \beta_0 + \beta_1 x + \varepsilon\).

We’ll get out some estimated coefficients, some standard errors, and some p-values:

  • The intercept:
    \(\beta_0\) = -110.28, SE = 19.54, p < .001
  • The estimated coefficient of x:
    \(\beta_1\) = 2.71, SE = 0.55, p < .001

Job done? Clearly not - we need only overlay model upon raw data (Figure 2) to see we are missing some key parts of the pattern.

Figure 1: A clearly non-linear pattern

Figure 2: Uh-oh…

Thoughts about Model + Error

All our work here is aimed at making models of the world.

  1. Models are just models. They are simplifications, and so they don’t perfectly fit to the observed world (indeed, how well a model fits to the world is often our metric for comparing models).
  2. \(y - \widehat{y}\). Our observed data minus our model-predicted values (i.e., in linear regression, our “residuals”) reflect everything that we don’t account for in our model.
  3. In an ideal world, our model accounts for all the systematic relationships, and what is left over (our residuals) is just randomness. If our model is mis-specified, or misses out something systematic, then our residuals will reflect this.
  4. We check for this by examining how much like randomness the residuals appear to be (zero mean, normally distributed, constant variance, i.i.d (“independent and identically distributed”) - i.e., what gets referred to as the “assumptions”).
  5. We will never know whether our residuals contain only randomness, because we can never observe everything.

Let’s just do a quick plot(model) to get some diagnostic plots of our linear model:
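Below is a minimal sketch, assuming the data from Figure 1 are in a data frame called df with columns y and x:

model <- lm(y ~ x, data = df)
par(mfrow = c(2, 2))  # arrange the four diagnostic plots in one window
plot(model)           # residuals vs fitted, Q-Q, scale-location, leverage
par(mfrow = c(1, 1))  # reset the plotting layout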

Does it look like the residuals are independently distributed? Not really. We need to find some way of incorporating the non-linear relationship between y and x into our model.

What is a polynomial?

Polynomials are mathematical expressions which involve a sum of powers. For instance:

  • \(y = 4 + x + x^2\) is a second-order polynomial as the highest power of \(x\) that appears is \(x^2\)
  • \(y = 9x + 2x^2 + 4x^3\) is a third-order polynomial as the highest power of \(x\) that appears is \(x^3\)
  • \(y = x^6\) is a sixth-order polynomial as the highest power of \(x\) that appears is \(x^6\)

For our purposes, extending our model to include higher-order terms allows us to capture non-linear relationships between two variables. For instance, fitting models with linear and quadratic terms (\(y_i = \beta_0 + \beta_1 x_i + \beta_2 x^2_i + \varepsilon_i\)), and extending these to cubic (\(y_i = \beta_0 + \beta_1 x_i + \beta_2 x^2_i + \beta_3 x^3_i + \varepsilon_i\)) (or beyond), may aid in modelling non-linear patterns.
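For example, one way to include higher-order terms in lm() is to wrap each power in I() (a sketch, assuming a data frame df with columns y and x):

mod_lin  <- lm(y ~ x, data = df)                     # linear only
mod_quad <- lm(y ~ x + I(x^2), data = df)            # linear + quadratic
mod_cub  <- lm(y ~ x + I(x^2) + I(x^3), data = df)   # up to cubic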

What are we interested in here?

As the order of the polynomial terms increases, we tend to be less interested in them. Linear change is the easiest to think about: are things going up over the course of \(x\), or down (or neither)? Quadratic change is the next most interesting, and it may help to think of it as the change in the rate of change (i.e., the curvature). For instance, in the plot below, it is the quadratic term which differs between the two groups’ trajectories.

Figure: Positive and negative quadratic terms

Raw Polynomials

There are two types of polynomial we can construct. “Raw” (or “natural”) polynomials are the straightforward ones you might be expecting the table below to be filled with.

These are simply the original values of the x variable to the power of 2, 3 and so on.

We can quickly get these in R using the poly() function, with raw = TRUE.

If you want to create “raw” polynomials, make sure to specify raw = TRUE; the default is raw = FALSE, which will not give you what you want here.

x   x^2   x^3
1    ?     ?
2    ?     ?
3    ?     ?
4    ?     ?
5    ?     ?
poly(1:10, degree = 3, raw=TRUE)
##        1   2    3
##  [1,]  1   1    1
##  [2,]  2   4    8
##  [3,]  3   9   27
##  [4,]  4  16   64
##  [5,]  5  25  125
##  [6,]  6  36  216
##  [7,]  7  49  343
##  [8,]  8  64  512
##  [9,]  9  81  729
## [10,] 10 100 1000
## attr(,"degree")
## [1] 1 2 3
## attr(,"class")
## [1] "poly"   "matrix"

Raw polynomials are correlated

With raw (or “natural”) polynomials, the terms poly1, poly2 and poly3 are correlated.
Think about why this might be: by definition, as \(x\) increases, so do \(x^2\), \(x^3\), and so on.

We can visualise them:

matplot(poly(1:10, 3, raw=T), type="l")

And measure the correlation coefficients:

cor(poly(1:10, 3, raw=T)) %>% round(2)
##      1    2    3
## 1 1.00 0.97 0.93
## 2 0.97 1.00 0.99
## 3 0.93 0.99 1.00
Why might this be a problem?

This multicollinearity can lead to estimation problems. It means that our parameter estimates may change considerably depending upon which terms we include in our model, making it more difficult to determine which terms are important, and what the effect sizes are.

Table 1 below shows the coefficients for models fitted to a randomly generated dataset, with poly1, poly1+poly2, and poly1+poly2+poly3 as predictors (where poly1-poly3 are natural polynomials). Notice that they change with the addition of each term.

Table 1: Incremental addition of raw polynomial terms

term          y~poly1   y~poly1+poly2   y~poly1+poly2+poly3
(Intercept)    639.37         -199.68                 -0.12
poly1         -144.01          275.52                 98.52
poly2                          -38.14                  0.24
poly3                                                  -2.33

Orthogonal Polynomials

“Orthogonal” polynomials are uncorrelated (hence the name). We can get these for \(x = 1,2,...,9,10\) using the following code:

poly(1:10, degree = 3, raw = FALSE)
##                 1           2          3
##  [1,] -0.49543369  0.52223297 -0.4534252
##  [2,] -0.38533732  0.17407766  0.1511417
##  [3,] -0.27524094 -0.08703883  0.3778543
##  [4,] -0.16514456 -0.26111648  0.3346710
##  [5,] -0.05504819 -0.34815531  0.1295501
##  [6,]  0.05504819 -0.34815531 -0.1295501
##  [7,]  0.16514456 -0.26111648 -0.3346710
##  [8,]  0.27524094 -0.08703883 -0.3778543
##  [9,]  0.38533732  0.17407766 -0.1511417
## [10,]  0.49543369  0.52223297  0.4534252
## attr(,"coefs")
## attr(,"coefs")$alpha
## [1] 5.5 5.5 5.5
## 
## attr(,"coefs")$norm2
## [1]    1.0   10.0   82.5  528.0 3088.8
## 
## attr(,"degree")
## [1] 1 2 3
## attr(,"class")
## [1] "poly"   "matrix"

Notice that the first-order term has been scaled: instead of the values 1 to 10, we have values ranging from roughly -0.5 to +0.5, centred on 0.

colMeans(poly(1:10, degree = 3, raw = FALSE)) %>%
    round(2)
## 1 2 3 
## 0 0 0

As you can see from the output above, which computes the mean of each column, each predictor has mean 0, so they are mean-centred. This is a key fact and will affect the interpretation of our predictors later on.

Think about what this means for \(x^2\). It will be uncorrelated with \(x\), because \((-0.5)^2 = (0.5)^2\): the squared values are symmetric about the centre of \(x\)!
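We can verify this numerically for a small set of values that is symmetric around zero:

x <- seq(-0.5, 0.5, length.out = 10)
cor(x, x^2)  # effectively zero (within floating-point error)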

matplot(poly(1:10, 3, raw=F), type="l")

The correlations are zero!

cor(poly(1:10, 3, raw=F)) %>% round(2)
##   1 2 3
## 1 1 0 0
## 2 0 1 0
## 3 0 0 1
We can then fit the same models (y~poly1, y~poly1+poly2, and y~poly1+poly2+poly3, where poly1-poly3 are now orthogonal polynomials), and see that the estimated coefficients do not change between models:
Table 2: Incremental addition of orthogonal polynomial terms

term          y~poly1   y~poly1+poly2   y~poly1+poly2+poly3
(Intercept)   -152.66         -152.66               -152.66
poly1        -1307.99        -1307.99              -1307.99
poly2                         -876.36               -876.36
poly3                                               -129.26

Remember what zero is!

With orthogonal polynomials, you need to be careful about interpreting coefficients. For raw polynomials the intercept remains the y-intercept (i.e., where the line hits the y-axis). The higher order terms can then be thought of from that starting point - e.g., “where \(x\) is 2, \(\widehat{y}\) is \(\beta_0 + \beta_1 \cdot 2 + \beta_2 \cdot 2^2 + \beta_3 \cdot 2^3 ...\)”.

For orthogonal polynomials, the interpretation becomes more tricky. The intercept is the overall average of y, the linear predictor is the linear change pivoting around the mean of \(x\) (rather than \(x = 0\)), the quadratic term corresponds to the steepness of the quadratic curvature (“how curvy is it?”), the cubic term to the steepness at the inflection points (“how wiggly is it?”), and so on.
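One way to convince yourself of the difference in intercepts is to fit the same model both ways (a sketch, assuming a data frame df with columns y and x):

coef(lm(y ~ poly(x, 2, raw = TRUE), data = df))["(Intercept)"]  # predicted y where x = 0
coef(lm(y ~ poly(x, 2), data = df))["(Intercept)"]              # the overall mean of y
mean(df$y)                                                      # should match the orthogonal intercept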

Some useful code from Dan

It’s possible to use poly() internally in fitting our linear model, if we want:

lm(y ~ poly(x, 3, raw = T), data = df)


Unfortunately, the coefficients will end up having long messy names like poly(x, 3, raw = T)1, poly(x, 3, raw = T)2, etc.

It is probably nicer if we add the polynomials to our data itself. As it happens, Dan has provided a nice little function which attaches these as columns to our data, naming them poly1, poly2, etc.

# import Dan's code and make it available in our own R session
# you must do this in every script you want to use this function
source("https://uoepsy.github.io/msmr/functions/code_poly.R")

mydata <- code_poly(df = mydata, predictor = 'time', poly.order = 3, 
                    orthogonal = FALSE, draw.poly = FALSE)
head(mydata)
## # A tibble: 6 × 6
##    time     y time.Index poly1 poly2 poly3
##   <int> <dbl>      <dbl> <dbl> <dbl> <dbl>
## 1     1  96.8          1     1     1     1
## 2     2 179.           2     2     4     8
## 3     3 234.           3     3     9    27
## 4     4 248.           4     4    16    64
## 5     5 209.           5     5    25   125
## 6     6  97.2          6     6    36   216

Both will produce the same model output (but Dan’s method produces these nice neat names for the coefficients!), and we can just put the terms into our model directly as lm(y ~ poly1 + poly2 + poly3, data = mydata).

Demonstration

Exercises: Cognitive performance

STOP AND THINK: Are you using R effectively?

There are lots of different ways to organise your file system/your life in R. You might be creating a new project each week, or a new folder, or just a new .Rmd. There’s no best way to organise this - it is personal preference.

However, one thing which is common across most approaches is that having a lot of irrelevant stuff in your environment (top right pane) can get confusing and messy.

We encourage you now to clear your environment for this week, and then load in the data.

If you are worried that you are going to lose some of the objects in your environment, then this may be a sign that you are not using R to its full potential. The idea is that we can recreate all of our analyses by just running the relevant code in our script!

Alongside lme4 and tidyverse (and possibly broom.mixed), we’re going to be using some of Dan’s useful functions for getting p-values and coding polynomials.

The source() function basically takes in R code and evaluates it. You can download an R script with Dan’s code here.

But you can also source this script directly from the URL, which runs the R code and creates the functions in your environment. Typically, we want these at the top of our document, because it’s similar to loading a package:

library(tidyverse)
library(lme4)
library(broom.mixed)
source("https://uoepsy.github.io/msmr/functions/code_poly.R")

Az.rda Data

30 participants with probable Alzheimer’s Disease completed 3 tasks over 10 time points: a memory task, and two scales investigating ability to undertake complex activities of daily living (cADL) and simple activities of daily living (sADL). Performance on all tasks was calculated as a percentage of the total possible score, thereby ranging from 0 to 100.

The data is available at https://uoepsy.github.io/data/Az.rda.

variable      description
Subject       Unique Subject Identifier
Time          Time point of the study (1 to 10)
Task          Task type (Memory, cADL, sADL)
Performance   Score on test
Question A1

Load in the data and examine it.
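Hint: .rda files can be loaded straight from a URL. A sketch (the file contains an object named Az, which is the name used in the models later on):

load(url("https://uoepsy.github.io/data/Az.rda"))
summary(Az)
head(Az)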

Solution

Question A2

No modelling just yet.

Plot the performance over time for each type of task.

Try using stat_summary so that you are plotting the means (and standard errors) for each task, rather than every single data point. Why? Because this way you can see the shape of the overall trajectory of performance over time in each task.
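One possible sketch (assuming the data are loaded as Az):

ggplot(Az, aes(x = Time, y = Performance, color = Task)) +
  stat_summary(fun.data = mean_se, geom = "pointrange") +  # means and standard errors
  stat_summary(fun = mean, geom = "line")                  # connect the means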

Solution

Question A3

Why do you think raw/natural polynomials might be more useful than orthogonal polynomials for these data?

Solution

Question A4

Okay! Let’s do some modeling!

First steps:

  1. Add 1st and 2nd order raw polynomials to the data using the code_poly() function (a sketch follows this list).
  2. Create a “baseline model”, in which performance varies over time (with both linear and quadratic change), but no differences in Task are estimated.
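A sketch of step 1, using the code_poly() function from earlier (assuming the data object is called Az):

Az <- code_poly(df = Az, predictor = "Time", poly.order = 2,
                orthogonal = FALSE, draw.poly = FALSE)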

We need to think about our random effect structure. We’ll talk you through this bit because it’s getting a bit more complicated now.

1. What are the observations grouped by?

2. Are observations nested?

3. What effects can vary by-groups?

4. What should our random effect structure be?

Solution

Question A5

Hopefully, you fitted a model like this (or thereabouts):

m.base <- lmer(Performance ~ (poly1 + poly2) + 
                 (1 + poly1 + poly2 | Subject/Task),
               data=Az, REML=F, control=lmerControl(optimizer="bobyqa"))

Remember what we learned last week about “singular fits”? It looks like this model might be too complex for the data to sustain.

What random effect term might we consider removing? (there isn’t necessarily a “right” answer here - many may be defensible, but your decision should be based on multiple considerations/evidence).
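Some evidence you might gather (assuming the model object m.base from above):

isSingular(m.base)  # TRUE indicates a singular fit
VarCorr(m.base)     # look for variances near zero, or correlations at exactly ±1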

Solution

Question A6

Let’s start adding in our effects of interest.

  1. Create a new model with a fixed effect of Task
  2. Create a new model in which the linear change in performance over time differs between Task types.
  3. Create a new model in which both the linear and quadratic changes in performance over time differ between Task types.
  4. Run model comparisons.

Solution

We’d like to do some inferential tests on our coefficients. We could consider computing some confidence intervals with confint(m.Az.full, method="boot"), but this may take quite a long time with such a complex model (if we bootstrap, we’re essentially fitting thousands of models!).

For now, let’s refit the full model and obtain p-values for our coefficients using the Satterthwaite approximation for the denominator degrees of freedom. The code below does this for us.

By using lmerTest::lmer() we don’t have to load the whole package with library(lmerTest), and it is just this single model that gets fitted this way.

m.full_satter <- 
    lmerTest::lmer(Performance ~ (poly1 + poly2) * Task +
                       (1 + poly1 | Subject) + (1 + poly1 + poly2 | Subject:Task),
                   data=Az, REML=F, control=lmerControl(optimizer="bobyqa"))

tidy(m.full_satter) %>% 
    filter(effect=="fixed")
## # A tibble: 9 × 8
##   effect group term             estimate std.error statistic    df  p.value
##   <chr>  <chr> <chr>               <dbl>     <dbl>     <dbl> <dbl>    <dbl>
## 1 fixed  <NA>  (Intercept)      67.2        0.927     72.5    51.1 3.66e-53
## 2 fixed  <NA>  poly1            -3.29       0.335     -9.80   48.9 3.94e-13
## 3 fixed  <NA>  poly2             0.00947    0.0124     0.762  90.0 4.48e- 1
## 4 fixed  <NA>  TasksADL          0.0950     0.812      0.117  60.0 9.07e- 1
## 5 fixed  <NA>  TaskMemory        1.24       0.812      1.53   60.0 1.32e- 1
## 6 fixed  <NA>  poly1:TasksADL    1.36       0.270      5.05   60.1 4.47e- 6
## 7 fixed  <NA>  poly1:TaskMemory -3.98       0.270    -14.7    60.1 1.32e-21
## 8 fixed  <NA>  poly2:TasksADL   -0.0133     0.0176    -0.754  90.0 4.53e- 1
## 9 fixed  <NA>  poly2:TaskMemory  0.339      0.0176    19.3    90.0 1.07e-33
Optional: Quick refresher of scientific notation

Question A7
term              estimate  std.error  statistic     df  p.value
(Intercept)          67.16       0.93      72.48  51.06  <.001 ***
poly1                -3.29       0.34      -9.80  48.95  <.001 ***
poly2                 0.01       0.01       0.76  90.00   .448
TasksADL              0.10       0.81       0.12  60.05   .907
TaskMemory            1.24       0.81       1.53  60.05   .132
poly1:TasksADL        1.36       0.27       5.04  60.08  <.001 ***
poly1:TaskMemory     -3.98       0.27     -14.73  60.08  <.001 ***
poly2:TasksADL       -0.01       0.02      -0.75  90.00   .453
poly2:TaskMemory      0.34       0.02      19.27  90.00  <.001 ***
  1. For the cADL Task, what is the estimated average performance where x = 0?
  2. For the sADL and Memory Tasks, is the estimated average where x = 0 different from that of the cADL Task?
  3. For the cADL Task, how does performance change for every increasing time point? (what is the estimated linear slope?)
    • Note, the quadratic term poly2 is non-significant, so we’ll ignore it here
  4. For the sADL Task, how is the change in performance over time different from the cADL Task?
  5. The answer to 3. plus the answer to 4. will give you the estimated slope for the sADL Task.
  6. For the Memory task, how does performance change for every increasing time point? In other words, what is the slope for the Memory task and what effect does time have?
    • This is more difficult. The quadratic term is significant.
    • Recall the direction of the quadratic term (positive/negative) and how this relates to the visual curvature (see back in “What is a polynomial?”).

Solution

Question A8

Based on your answers above, can you sketch out (on paper) the model fit?

Then provide a written description.

Solution

Question A9

To what extent do model comparisons (Question A6) and the parameter-specific p-values (Question A7) yield the same results?

Solution

Question A10

Plot the model fitted values. This might be pretty similar to the plot you created in Question A2, and (hopefully) similar to the one you drew on paper for Question A8.
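One possible sketch, extracting fitted values with fitted() and summarising them as in Question A2 (here the full model is assumed to be named m.Az.full):

Az %>%
  mutate(fit = fitted(m.Az.full)) %>%
  ggplot(aes(x = Time, y = fit, color = Task)) +
  stat_summary(fun = mean, geom = "line") +
  labs(y = "Model fitted Performance")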

Solution

Two quotes

“All models are wrong. Some are useful.” (George Box, 1976).

“…it does not seem helpful just to say that all models are wrong. The very word model implies simplification and idealization. The idea that complex physical, biological or sociological systems can be exactly described by a few formulae is patently absurd. The construction of idealized representations that capture important stable aspects of such systems is, however, a vital part of general scientific analysis and statistical models, especially substantive ones, do not seem essentially different from other kinds of model.” (Sir David Cox, 1995).

Optional: TIMTOWTDI