class: center, middle, inverse, title-slide

.title[
# Multilevel Model Inference
]
.subtitle[
## Data Analysis for Psychology in R 3
]
.author[
### Josiah King
]
.institute[
### Department of Psychology
The University of Edinburgh
]

---
class: inverse, center, middle

<h1 style="text-align: left;">This Lecture:</h1>
<h3 style="text-align: left;">1. Inferential methods with `lmer`</h3>

---
# <p></p> you might have noticed...

```r
library(lme4)
library(tidyverse) # for read_csv()
d3 <- read_csv("https://uoepsy.github.io/data/dapr3_mindfuldecline.csv")
cogtime_model <- lmer(ACE ~ 1 + visit + (1 + visit | ppt), data = d3)
summary(cogtime_model)
```

.pull-left[
```
## Linear mixed model fit by REML ['lmerMod']
## Formula: ACE ~ 1 + visit + (1 + visit | ppt)
##    Data: d3
## 
## REML criterion at convergence: 365.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.3120 -0.6818 -0.0526  0.7016  2.4022 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  ppt      (Intercept) 0.157    0.396         
##           visit       0.104    0.322    -0.59
##  Residual             0.244    0.494         
## Number of obs: 177, groups:  ppt, 20
## 
## Fixed effects:
##             Estimate Std. Error t value
*## (Intercept)  85.5843     0.1201  712.67
*## visit        -0.3588     0.0733   -4.89
## 
## Correlation of Fixed Effects:
##       (Intr)
## visit -0.534
```
]
.pull-right[
![](jk_img_sandbox/wotnop.png)
{{content}}
]

--

<br><br><br>
<center><b>Extensive debate about how best to test parameters from MLMs.</b></center>

---
# p-values in `lm()`

In simple LM, we test the reduction in residual sums of squares (SS) with an `\(F\)` statistic, which follows an `\(F\)` distribution with known `\(df\)`.

`\(F\)` tests will have `\(k\)` numerator degrees of freedom (the number of parameters being tested) and `\(n-k-1\)` denominator degrees of freedom. The `\(t\)`-statistics for individual coefficients follow `\(t\)`-distributions with the same `\(n-k-1\)` degrees of freedom.

<br>
$$
`\begin{align}
F \qquad & = \qquad \frac{MS_{model}}{MS_{residual}} \qquad &= \qquad \frac{SS_{model}/df_{model}}{SS_{residual}/df_{residual}} \\
\quad \\
& & df_{model} = k \\
& & df_{residual} = n-k-1 \\
\end{align}`
$$
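
As a minimal sketch of where these `\(df\)` come from, using the `d3` data loaded earlier (`\(n = 177\)`, one predictor): note this single-level model ignores the clustering in the data, so it is for illustration only.

```r
# a single-level lm() purely to illustrate the known df
# (it ignores the nesting of observations within participants)
mod_lm <- lm(ACE ~ visit, data = d3)
df.residual(mod_lm) # n - k - 1 = 177 - 1 - 1 = 175
anova(mod_lm)       # F test of visit, on 1 and 175 df
```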
---
count: false
exclude: true
# p-values in `lm()`

In simple LM, we test the reduction in residual sums of squares (SS) with an `\(F\)` statistic, which follows an `\(F\)` distribution with known `\(df\)`.

`\(F\)` tests will have `\(k\)` numerator degrees of freedom (the number of parameters being tested) and `\(n-k-1\)` denominator degrees of freedom. The `\(t\)`-statistics for individual coefficients follow `\(t\)`-distributions with the same `\(n-k-1\)` degrees of freedom.

<br>
$$
`\begin{align}
F \qquad & = \qquad \frac{\text{variance explained by model}/\text{nr model parameters}}{\text{variance left over}/\text{nr residuals that are free to vary}} & \\
\quad \\
& = \frac{\text{avg variance explained per model parameter}}{\text{avg variance explained per residual}} &
\end{align}`
$$

---
# p-values in `lmer()`?

- In certain *very specific* conditions, we can work out what the degrees of freedom are.
- Parameter estimates from these models are not based on minimising sums of squared residuals
    - they are the ML/REML estimates
    - this is good: it means they can handle unbalanced designs and complex random effect structures
- But:
    - it is unclear how to calculate the denominator `\(df\)`
    - it is unclear whether the test statistics even follow an `\(F\)` distribution

---
# Options for inference

<ol>
<li>approximating ddf</li>
<ul>
<li>satterthwaite</li>
<li>kenward-roger</li>
<li>m-l-1</li>
<li>...</li>
</ul>
<li>likelihood-based methods</li>
<ul>
<li>profile likelihood confidence interval</li>
<li>likelihood ratio tests</li>
</ul>
<li>bootstrap</li>
<ul>
<li>parametric</li>
<ul>
<li>confidence interval</li>
<li>likelihood ratio test</li>
</ul>
<li>case-based</li>
<ul>
<li>confidence interval</li>
</ul>
</ul>
</ol>
<p>(there are others...)</p>

---
# Options for inference

<ol>
<li>approximating ddf</li>
<ul>
<li style="opacity:.4">satterthwaite</li>
<li>kenward-roger</li>
<li style="opacity:.4">m-l-1</li>
<li style="opacity:.4">...</li>
</ul>
<li>likelihood-based methods</li>
<ul>
<li>profile likelihood confidence interval</li>
<li>likelihood ratio tests</li>
</ul>
<li style="opacity:.4">bootstrap</li>
<ul>
<li style="opacity:.4">parametric</li>
<ul>
<li style="opacity:.4">confidence interval: <code>confint(model, method = "boot")</code></li>
<li style="opacity:.4">likelihood ratio test: <code>pbkrtest::PBmodcomp(full_model, reduced_model)</code></li>
</ul>
<li style="opacity:.4">case-based</li>
<ul>
<li style="opacity:.4">confidence interval</li>
</ul>
</ul>
</ol>
<p style="opacity:.4">(there are others...)</p>

---
# Kenward-Roger

- Adjusts standard errors to avoid small-sample bias
- Approximates denominator degrees of freedom
- Models must be fitted with REML (which is good for small samples!)

__fixed effects__

```r
library(parameters)
model_parameters(cogtime_model, ci_method = "kr") # %>% print_html()
```

```
## # Fixed Effects
## 
## Parameter   | Coefficient |   SE |         95% CI |      t |    df |      p
## ---------------------------------------------------------------------------
## (Intercept) |       85.58 | 0.12 | [85.33, 85.84] | 710.54 | 18.51 | < .001
## visit       |       -0.36 | 0.07 | [-0.51, -0.21] |  -4.89 | 18.99 | < .001
## 
## # Random Effects
## 
## Parameter                  | Coefficient |   SE |         95% CI
## ---------------------------------------------------------------
## SD (Intercept: ppt)        |        0.40 | 0.12 | [ 0.22,  0.71]
## SD (visit: ppt)            |        0.32 | 0.05 | [ 0.23,  0.45]
## Cor (Intercept~visit: ppt) |       -0.59 | 0.36 | [-0.94,  0.38]
## SD (Residual)              |        0.49 | 0.03 | [ 0.44,  0.56]
```

---
# Kenward-Roger

- Adjusts standard errors to avoid small-sample bias
- Approximates denominator degrees of freedom
- Models must be fitted with REML (which is good for small samples!)

__Model Comparison__

```r
restricted_model <- lmer(ACE ~ 1 + (1 + visit | ppt), data = d3, REML = TRUE)
full_model <- lmer(ACE ~ 1 + visit + (1 + visit | ppt), data = d3, REML = TRUE)

library(pbkrtest)
KRmodcomp(full_model, restricted_model)
```

```
## large : ACE ~ 1 + visit + (1 + visit | ppt)
## small : ACE ~ 1 + (1 + visit | ppt)
##       stat ndf ddf F.scaling p.value    
## Ftest 23.9 1.0 19.0         1  0.0001 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

---
# Likelihood ratio tests

.pull-left[
- Compares the log-likelihood of two competing models.
- `\(-2 \times\)` the log of the ratio of the two likelihoods is **asymptotically** `\(\chi^2\)` distributed.
    - *this means for small samples (at any level) it may be unreliable*
    - how small is too small? ¯\\\_(ツ)\_/¯ 42?
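
A hand-computed sketch (using the `full_model` and `restricted_model` fitted below):

```r
# the chi-square statistic anova() reports below is -2 times
# the difference in the two models' (ML) log-likelihoods
-2 * (as.numeric(logLik(restricted_model)) -
        as.numeric(logLik(full_model)))
```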
]
.pull-right[
__LRTs and ML/REML__

<ul>
<li>if models differ in fixed effects only, then fit with ML</li>
<li style="opacity:.6">if models differ in random effects only, then fit with REML (or ML if big N)</li>
<li style="opacity:.6">if models differ in both, then fit with ML</li>
</ul>

`anova()` will automatically re-fit models with ML.
]

<br>

```r
restricted_model <- lmer(ACE ~ 1 + (1 + visit | ppt), data = d3, REML = FALSE)
full_model <- lmer(ACE ~ 1 + visit + (1 + visit | ppt), data = d3, REML = FALSE)

anova(restricted_model, full_model)
```

```
## Data: d3
## Models:
## restricted_model: ACE ~ 1 + (1 + visit | ppt)
## full_model: ACE ~ 1 + visit + (1 + visit | ppt)
##                  npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)    
## restricted_model    5 385 401   -188      375                        
## full_model          6 371 390   -180      359  16.3  1   0.000055 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

---
# Likelihood-based CIs

- Evaluates the curvature of the likelihood surface at the estimate.
    - sharp curvature = more certainty in estimate
    - gradual curvature = less certainty in estimate
- models need to be fitted with ML, not REML

```r
confint(cogtime_model, method="profile")
```

```
##                2.5 %   97.5 %
## .sig01       0.1537  0.64010
## .sig02      -0.9368 -0.04886
## .sig03       0.2339  0.44750
## .sigma       0.4413  0.55875
## (Intercept) 85.3421 85.82385
## visit       -0.5057 -0.21155
```

---
# Summary

- Lots of debate around how best to conduct statistical inferences based on multilevel models.

--

- Lots of options.

--

| | df approximations | likelihood-based |
| ---------------- | ------------------------------------------------------------------ | ------------------------------------------------------------------- |
| tests or CIs for model parameters | `library(parameters)`<br>`model_parameters(model, ci_method="kr")` | `confint(model, method="profile")` |
| model comparison<br><small>(different fixed effects, same random effects)</small> | `library(pbkrtest)`<br>`KRmodcomp(model1,model0)` | `anova(model0,model1)` |
| | fit models with `REML=TRUE`.<br>good option for small samples | fit models with `REML=FALSE`.<br>needs large N at both levels (40+) |

- check your sample size! but if in doubt, go for KR.

---
class: inverse, center, middle, animated, rotateInDownLeft
# End

---
class: inverse, center, middle

<h1 style="text-align: left;">This Lecture:</h1>
<h3 style="text-align: left;opacity:.4">1. Inferential methods with `lmer`</h3>
<h3 style="text-align: left; ">Bonus content</h3>

---
# Bootstrap

<ul>
<li>Parametric Bootstrap<br>assumes that explanatory variables are fixed and that model specification and the distributions such as `\(\zeta_i \sim N(0,\sigma_{\zeta})\)` and `\(\varepsilon_i \sim N(0,\sigma_{\varepsilon})\)` are correct.</li><br>
<li style="opacity: .6">Case-based Bootstrap<br>minimal assumptions - we just need to ensure that we correctly specify the hierarchical dependency of data.<br>requires a decision about which levels to resample at.<br>(discussed more next week)</li>
</ul>

---
# Parametric Bootstrap

The basic premise is that we:

1. fit model(s) to data
2. Do many times:
    - simulate data based on the parameters of the fitted model(s)
    - compute some statistic/estimates from the simulated data
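3. Construct a distribution of possible values

A minimal by-hand sketch of this recipe, assuming the `cogtime_model` from earlier (`simulate()` and `refit()` are the `lme4` tools for step 2; `confint(., method="boot")` and `PBmodcomp()` wrap all of this up for us):

```r
# hand-rolled parametric bootstrap of the fixed effect of visit
# (may occasionally give convergence warnings on some resamples)
boot_ests <- replicate(1000, {
  sim_y   <- simulate(cogtime_model)[[1]]           # 2a. simulate data from the fitted model
  sim_mod <- refit(cogtime_model, newresp = sim_y)  # 2b. re-fit model to the simulated data
  fixef(sim_mod)["visit"]                           #     statistic of interest
})
quantile(boot_ests, c(.025, .975))                  # 3.  distribution -> 95% CI
```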
---
# Parametric Bootstrap

.pull-left[
## Confidence Intervals

```r
full_model <- lmer(ACE ~ 1 + visit + (1 + visit | ppt), data = d3)
confint(full_model, method = "boot")
```

```
              2.5 %  97.5 %
.sig01       0.1130  0.6133
.sig02      -1.0000  0.1028
.sig03       0.2084  0.4225
.sigma       0.4253  0.5542
(Intercept) 85.3443 85.8334
visit       -0.5029 -0.2137
```
]
.pull-right[
## LRT

(a bootstrapped version of the likelihood ratio test):

```r
restricted_model <- lmer(ACE ~ 1 + (1 + visit | ppt), data = d3)
full_model <- lmer(ACE ~ 1 + visit + (1 + visit | ppt), data = d3)

library(pbkrtest)
PBmodcomp(full_model, restricted_model)
```

```
Bootstrap test; time: 71.91 sec; samples: 1000; extremes: 0;
large : ACE ~ 1 + visit + (1 + visit | ppt)
ACE ~ 1 + (1 + visit | ppt)
         stat df  p.value    
LRT      16.3  1 0.000055 ***
PBtest   16.3       0.001 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```
]

---
# Adding more options

| | df approximations | likelihood-based | parametric bootstrap |
| ---------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------- |
| model parameters | `library(parameters)`<br>`model_parameters(model, ci_method="kr")` | `confint(model, method="profile")` | `confint(model, method="boot")` |
| model comparison | `library(pbkrtest)`<br>`KRmodcomp(model1,model0)` | `anova(model0,model1)` | `library(pbkrtest)`<br>`PBmodcomp(model1,model0)` |
| | for KR, fit models with `REML=TRUE` (a good option for small samples).<br>Other options available (e.g. Satterthwaite, m-l-1, ...)<br>easy to implement, but debate as to whether statistics actually follow an F distribution | if models differ in fixed effects, we cannot use `\(-2\Delta LL\)` when models are fitted with REML. | Time-consuming, but probably the most reliable option (although can be problematic with unstable models) |
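
Because the parametric bootstrap is time-consuming, it can help to choose the number of simulations and to set a seed for reproducibility. A small sketch (the specific values here are just for illustration; `nsim` is an argument of `pbkrtest::PBmodcomp()`):

```r
set.seed(2023)  # makes the bootstrap result reproducible
# fewer simulations run faster, but give a coarser p-value
PBmodcomp(full_model, restricted_model, nsim = 500)
```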