class: center, middle, inverse, title-slide .title[ #
Assumptions & Diagnostics
] .subtitle[ ## Data Analysis for Psychology in R 3 ] .author[ ### Josiah King ] .institute[ ### Department of Psychology
The University of Edinburgh ] --- --- # Assumptions in LM .pull-left[ #### The general idea - `\(\varepsilon_i \sim N(0,\sigma^2)\)` iid - "zero mean and constant variance" ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-1-1.svg)<!-- --> ] -- .pull-right[ #### Recipe book + **L**inearity + **I**ndependence + **N**ormality + **E**qual Variances ] --- # Question: What's different in MLM? -- ## Answer: Not much! -- - General idea is unchanged: error is random - But we now have residuals at multiple levels! Random effects can be viewed as residuals at another level. --- # Resids <img src="jk_img_sandbox/Slide1.PNG" width="1707" style="display: block; margin: auto auto auto 0;" /> --- # Resids (2) <img src="jk_img_sandbox/Slide2.PNG" width="1707" style="display: block; margin: auto auto auto 0;" /> --- # Resids (3) <img src="jk_img_sandbox/Slide3.PNG" width="1707" style="display: block; margin: auto auto auto 0;" /> --- # Random effects as level 2 residuals <img src="jk_img_sandbox/lmmwodot.png" width="1535" style="display: block; margin: auto auto auto 0;" /> --- count:false # Random effects as level 2 residuals <img src="jk_img_sandbox/lmmwdot.png" width="1535" style="display: block; margin: auto auto auto 0;" /> --- # Random effects as level 2 residuals $$ `\begin{align} & \text{for observation }j\text{ in group }i \\ \quad \\ & \text{Level 1:} \\ & \color{red}{y_{ij}} = \color{blue}{\beta_{0i} \cdot 1 + \beta_{1i} \cdot x_{ij}} + \varepsilon_{ij} \\ & \text{Level 2:} \\ & \color{blue}{\beta_{0i}} = \gamma_{00} + \color{orange}{\zeta_{0i}} \\ & \color{blue}{\beta_{1i}} = \gamma_{10} + \color{orange}{\zeta_{1i}} \\ \quad \\ & \varepsilon, \, \color{orange}{\zeta_0}, \, \text{ and } \, \color{orange}{\zeta_1}\text{ are all assumed to be normally distributed with mean 0.} \end{align}` $$ --- count:false # Random effects as level 2 residuals <!-- > 200 pupils from 20 schools completed a survey containing the Emotion Dysregulation Scale (EDS) and the Child Routines Questionnaire (CRQ). Eight of the schools were taking part in an initiative to specifically teach emotion regulation as part of the curriculum. Data were also gathered regarding the average hours each child slept per night. --> .pull-left[ `\(\varepsilon\)` `resid(model)` mean zero, constant variance <br><br> <img src="dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-8-1.svg" width="400px" /> ] -- .pull-right[ `\(\color{orange}{\zeta}\)` `ranef(model)` mean zero, constant variance ``` ## $cluster ``` <img src="dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-9-1.svg" width="400px" /> ] --- # Assumption Plots: Residuals vs Fitted ```r plot(model, type=c("p","smooth")) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-10-1.svg)<!-- --> --- # Assumption Plots: qqplots ```r qqnorm(resid(model)) qqline(resid(model)) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-11-1.svg)<!-- --> --- # Assumption Plots: Scale-Location ```r plot(model, form = sqrt(abs(resid(.))) ~ fitted(.), type = c("p","smooth")) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-12-1.svg)<!-- --> --- count:false # Assumption Plots: Scale-Location ```r plot(model, form = sqrt(abs(resid(.))) ~ fitted(.) | cluster, type = c("p")) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-13-1.svg)<!-- --> --- # Assumption Plots: Ranefs .pull-left[ ```r qqnorm(ranef(model)$cluster[,1]) qqline(ranef(model)$cluster[,1]) qqnorm(ranef(model)$cluster[,2]) qqline(ranef(model)$cluster[,2]) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-14-1.svg)<!-- --> ] .pull-right[ ```r rans <- as.data.frame(ranef(model)$cluster) ggplot(rans, aes(sample = `(Intercept)`)) + stat_qq() + stat_qq_line() + labs(title="random intercept") ggplot(rans, aes(sample = x1)) + stat_qq() + stat_qq_line() labs(title="random slope") ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-16-1.svg)<!-- --> ] --- # for a quick check ```r performance::check_model(model) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-17-1.svg)<!-- --> --- class: inverse, center, middle <h2><b style="opacity:0.4;">Part 1: Assumptions </b><b>Troubleshooting</b></h2> <h2 style="text-align: left;opacity:0.3;">Part 2: Case Diagnostics in MLM</h2> --- # Our Data .pull-left[ > In a study examining how cognition changes over time, a sample of 20 participants took the Addenbrooke's Cognitive Examination (ACE) every 2 years from age 60 to age 78. <small>Each participant has 10 datapoints. Participants are clusters.</small> ```r d3 <- read_csv("https://uoepsy.github.io/data/dapr3_mindfuldecline.csv") head(d3) ``` ``` ## # A tibble: 6 × 7 ## sitename ppt condition visit age ACE imp ## <chr> <chr> <chr> <dbl> <dbl> <dbl> <chr> ## 1 Sncbk PPT_1 control 1 60 84.5 unimp ## 2 Sncbk PPT_1 control 2 62 85.6 imp ## 3 Sncbk PPT_1 control 3 64 84.5 imp ## 4 Sncbk PPT_1 control 4 66 83.1 imp ## 5 Sncbk PPT_1 control 5 68 82.3 imp ## 6 Sncbk PPT_1 control 6 70 83.3 imp ``` ```r library(ICC) ICCbare(x = ppt, y = ACE, data = d3) ``` ``` ## [1] 0.4799 ``` ] .pull-right[ <img src="dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" /> ] --- # When things look wrong ```r mymodel <- lmer(ACE ~ visit * condition + (1 | ppt), data = d3) performance::check_model(mymodel) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-22-1.svg)<!-- --> --- # When things look wrong ### __Model mis-specification?__ .pull-left[ ```r mymodel <- lmer(ACE ~ visit * condition + (1 | ppt), data = d3) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-24-1.svg)<!-- --> ] .pull-right[ ```r mymodel <- lmer(ACE ~ visit * condition + (1 + visit | ppt), data = d3) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-26-1.svg)<!-- --> ] --- # When things look wrong ### __Transformations?__ - Transforming your outcome variable may help to satisfy model assumptions -- - log(y) - 1/y - sqrt(y) - forecast::BoxCox(y) --- # When things look wrong ### __Transformations?__ - Transforming your outcome variable may help to satisfy model assumptions .pull-left[ ```r lmer(y ~ x1 + g + (1 | cluster), df) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-29-1.svg)<!-- --> ] .pull-right[ ```r lmer(forecast::BoxCox(y,lambda="auto") ~ x1 + g + (1 | cluster), df) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-31-1.svg)<!-- --> ] --- count:false # When things look wrong ### __Transformations?__ - Transforming your outcome variable may help to satisfy model assumptions **but it comes at the expense of interpretability.** .pull-left[ ```r lmer(y ~ x1 + g + (1 | cluster), df) ``` ``` ## (Intercept) x1 g ## 36.137 1.615 10.020 ``` ] .pull-right[ ```r lmer(forecast::BoxCox(y,lambda="auto") ~ x1 + g + (1 | cluster), df) ``` ``` ## (Intercept) x1 g ## 1.733760 0.006857 0.048426 ``` ] --- count:false class: extra exclude: true # When things look wrong ### __robustlmm__ .pull-left[ ```r mymodel <- lmer(ACE ~ visit * condition + (1 | ppt), data = d3) summary(mymodel)$coefficients ``` ] .pull-right[ ```r library(robustlmm) mymodelr <- rlmer(ACE ~ visit * condition + (1 | ppt), data = d3) summary(mymodelr)$coefficients ``` ] --- # When things look wrong ### __Bootstrap?__ basic idea: 1. do many many times:  a. take a sample (e.g. sample with replacement from your data, or simulated from your model parameters)  b. fit the model to the sample 2. then:  a. based on all the models fitted in step 1, obtain a distribution of parameter estimate of interest.  b. based on the bootstrap distribution from 2a, compute a confidence interval for estimate.  c. celebrate __Bootstrapping is not a panacea.__ - If we're worrying because our errors are slightly non-normal or heteroskedastic, _and if we have a large sample size_, then bootstrapping might be a good choice. - If there are issues in mis-specification (e.g the effect is non-linear and we're estimating it as linear), bootstrapping won't help us. --- # Bootstrap: What do we (re)sample? resample based on the estimated distributions of parameters? - assumes explanatory variables are fixed, model specification and the distributions (e.g. `\(\zeta \sim N(0,\sigma_{\zeta})\)` and `\(\varepsilon \sim N(0,\sigma_{\varepsilon})\)`) are correct. -- resample residuals - `\(y* = \hat{y} + \hat{\varepsilon}_{\textrm{sampled with replacement}}\)` - assumes explanatory variables are fixed, and model specification is correct. -- resample cases - **minimal** assumptions - that we have correctly specified the hierarchical structure of data - **But** do we resample: - observations? - clusters? - both? --- # Case Bootstrap ```r mymodel <- lmer(ACE ~ visit * condition + (1 + visit | ppt), data = d3) ``` .pull-left[ ```r library(lmeresampler) # resample only the people, not their observations: mymodelBScase <- bootstrap(mymodel, .f = fixef, type = "case", B = 1000, resample = c(TRUE, FALSE)) summary(mymodelBScase) ``` ``` ## Bootstrap type: case ## ## Number of resamples: 1000 ## ## term observed rep.mean se bias ## 1 (Intercept) 85.7353 85.7321 0.12417 -0.003247 ## 2 visit -0.5319 -0.5297 0.06864 0.002171 ## 3 conditionmindfulness -0.2979 -0.2959 0.18353 0.001930 ## 4 visit:conditionmindfulness 0.3459 0.3472 0.09987 0.001301 ## ## There were 26 messages, 0 warnings, and 0 errors. ## ## The most commonly occurring message was: boundary (singular) fit: see help('isSingular') ``` ] .pull-right[ ```r confint(mymodelBScase, type = "perc") ``` ``` ## # A tibble: 4 × 6 ## term estimate lower upper type level ## <chr> <dbl> <dbl> <dbl> <chr> <dbl> ## 1 (Intercept) 85.7 85.5 86.0 perc 0.95 ## 2 visit -0.532 -0.666 -0.390 perc 0.95 ## 3 conditionmindfulness -0.298 -0.650 0.0736 perc 0.95 ## 4 visit:conditionmindfulness 0.346 0.151 0.555 perc 0.95 ``` ] .footnote[<br><br> <br><br>For a nice how-to guide on the **lmeresampler** package, see http://aloy.github.io/lmeresampler/articles/lmeresampler-vignette.html. For a discussion of different bootstrap methods for multilevel models, see Leeden R.., Meijer E., Busing F.M. (2008) Resampling Multilevel Models. In: Leeuw J.., Meijer E. (eds) Handbook of Multilevel Analysis. Springer, New York, NY. DOI: 10.1007/978-0-387-73186-5_11 ] --- count:false # Case Bootstrap ```r mymodel <- lmer(ACE ~ visit * condition + (1 + visit | ppt), data = d3) ``` .pull-left[ ```r library(lmeresampler) # resample only the people, not their observations: mymodelBScase <- bootstrap(mymodel, .f = fixef, type = "case", B = 1000, resample = c(TRUE, FALSE)) summary(mymodelBScase) ``` ``` ## Bootstrap type: case ## ## Number of resamples: 1000 ## ## term observed rep.mean se bias ## 1 (Intercept) 85.7353 85.7321 0.12417 -0.003247 ## 2 visit -0.5319 -0.5297 0.06864 0.002171 ## 3 conditionmindfulness -0.2979 -0.2959 0.18353 0.001930 ## 4 visit:conditionmindfulness 0.3459 0.3472 0.09987 0.001301 ## ## There were 26 messages, 0 warnings, and 0 errors. ## ## The most commonly occurring message was: boundary (singular) fit: see help('isSingular') ``` ] .pull-right[ ```r plot(mymodelBScase, "visit*conditionmindfulness") ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-45-1.svg)<!-- --> ] --- # Summary - Our assumptions for multi-level models are similar to that of a standard linear model in that we are concerned with the our residuals - in the multi-level case, we have residuals are multiple levels. - When assumptions appear violated, there are various courses of action to consider. - primarily, we should think about whether our model makes theoretical sense - Resampling methods (e.g. Bootstrapping) can __sometimes__ be used to obtain confidence intervals and bias-corrected estimates of model parameters. - There are various forms of the bootstrap, with varying assumptions. --- class: inverse, center, middle, animated, rotateInDownLeft # End of Part 1 --- class: inverse, center, middle <h2 style="text-align: left;opacity:0.3;">Part 1: Assumptions</h2> <h2>Part 2: Case Diagnostics in MLM</h2> --- # Influence Just like standard `lm()`, observations can have unduly high influence on our model through a combination of high leverage and outlyingness. <img src="dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-46-1.svg" style="display: block; margin: auto;" /> --- # but we have multiple levels... - Both observations (level 1 units) __and__ clusters (level 2+ units) can be influential. -- - several packages, but current recommendation is **HLMdiag:** http://aloy.github.io/HLMdiag/index.html --- # Level 1 influential points .pull-left[ ```r mymodel <- lmer(ACE ~ visit * condition + (1 + visit | ppt), data = d3) qqnorm(resid(mymodel)) qqline(resid(mymodel)) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-47-1.svg)<!-- --> ] -- .pull-right[ ```r library(HLMdiag) infl1 <- hlm_influence(mymodel, level = 1) names(infl1) ``` ``` ## [1] "id" "ACE" "visit" "condition" ## [5] "ppt" "cooksd" "mdffits" "covtrace" ## [9] "covratio" "leverage.overall" ``` ```r infl1 ``` ``` ## # A tibble: 177 × 10 ## id ACE visit condition ppt cooksd mdffits covtrace covratio leverage.overall ## <int> <dbl> <dbl> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 1 84.5 1 control PPT_1 0.0119 1.16e-2 0.0221 1.02 0.203 ## 2 2 85.6 2 control PPT_1 0.0164 1.62e-2 0.0145 1.01 0.149 ## 3 3 84.5 3 control PPT_1 0.00158 1.56e-3 0.00901 1.01 0.111 ## 4 4 83.1 4 control PPT_1 0.00138 1.37e-3 0.00508 1.01 0.0898 ## 5 5 82.3 5 control PPT_1 0.00160 1.59e-3 0.00239 1.00 0.0853 ## 6 6 83.3 6 control PPT_1 0.000482 4.82e-4 0.000765 1.00 0.0974 ## 7 7 80.9 7 control PPT_1 0.000227 2.27e-4 0.000150 1.00 0.126 ## 8 8 81.9 8 control PPT_1 0.000112 1.12e-4 0.000619 1.00 0.171 ## 9 9 81.5 9 control PPT_1 0.000804 8.02e-4 0.00239 1.00 0.233 ## 10 10 80.4 10 control PPT_1 0.0000561 5.58e-5 0.00595 1.01 0.312 ## # ℹ 167 more rows ``` ] --- count:false # Level 1 influential points .pull-left[ ```r mymodel <- lmer(ACE ~ visit * condition + (1 + visit | ppt), data = d3) qqnorm(resid(mymodel)) qqline(resid(mymodel)) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-49-1.svg)<!-- --> ] .pull-right[ ```r library(HLMdiag) infl1 <- hlm_influence(mymodel, level = 1) dotplot_diag(infl1$cooksd, cutoff = "internal") ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-50-1.svg)<!-- --> ] --- # Level 2 influential clusters ```r infl2 <- hlm_influence(mymodel, level = "ppt") dotplot_diag(infl2$cooksd, cutoff = "internal", index=infl2$ppt) ``` ![](dapr3_2324_03a_assumptdiag_files/figure-html/unnamed-chunk-52-1.svg)<!-- --> --- # What to do? - In this context (observations within participants), I would be inclined to focus on level-2 (participant) influence _first_, and only then look at the influence of individual observations. -- - It's worth looking into PPT_2 a bit further. - `mdffits` is a measure of multivariate "difference in fixed effects" ```r infl2 %>% arrange(desc(mdffits)) ``` ``` ## # A tibble: 20 × 6 ## ppt cooksd mdffits covtrace covratio leverage.overall ## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 PPT_2 0.213 0.190 0.235 1.25 0.158 ## 2 PPT_18 0.150 0.134 0.229 1.24 0.158 ## 3 PPT_19 0.102 0.0918 0.229 1.24 0.158 ## 4 PPT_5 0.0858 0.0771 0.235 1.25 0.158 ## 5 PPT_13 0.0811 0.0727 0.229 1.24 0.158 ## 6 PPT_3 0.0747 0.0670 0.235 1.25 0.158 ## 7 PPT_7 0.0678 0.0609 0.235 1.25 0.158 ## 8 PPT_12 0.0656 0.0595 0.214 1.23 0.216 ## 9 PPT_15 0.0606 0.0544 0.229 1.24 0.158 ## 10 PPT_10 0.0459 0.0413 0.235 1.25 0.158 ## 11 PPT_20 0.0407 0.0365 0.229 1.24 0.158 ## 12 PPT_14 0.0309 0.0281 0.184 1.19 0.456 ## 13 PPT_4 0.0235 0.0209 0.235 1.25 0.158 ## 14 PPT_16 0.0213 0.0191 0.229 1.24 0.158 ## 15 PPT_17 0.0192 0.0173 0.229 1.24 0.158 ## 16 PPT_8 0.0149 0.0135 0.152 1.16 0.242 ## 17 PPT_9 0.00748 0.00666 0.235 1.25 0.158 ## 18 PPT_6 0.00658 0.00600 0.198 1.21 0.356 ## 19 PPT_1 0.00228 0.00203 0.235 1.25 0.158 ## 20 PPT_11 0.00225 0.00202 0.224 1.24 0.194 ``` --- count:false # What to do? - In this context (observations within participants), I would be inclined to focus on level-2 (participant) influence _first_, and only then look at the influence of individual observations. - It's worth looking into PPT_2 a bit further. - examine fixed effects upon deletion of this participant ```r deleted_ppt <- case_delete(mymodel, level = "ppt", type = "fixef", delete = c("PPT_2")) cbind(deleted_ppt$fixef.original, deleted_ppt$fixef.delete) ``` ``` ## [,1] [,2] ## (Intercept) 85.7353 85.8735 ## visit -0.5319 -0.5251 ## conditionmindfulness -0.2979 -0.4394 ## visit:conditionmindfulness 0.3459 0.3394 ``` --- # Sensitivity Analysis? Would our conclusions change if we excluded these schools? ```r mymodel_rm2 <- lmer(ACE ~ visit * condition + (1 + visit | ppt), data = d3 %>% filter(!ppt %in% c("PPT_2"))) mymodel_rm2BS <- bootstrap(mymodel_rm2, .f = fixef, type = "case", B = 1000, resample = c(TRUE, FALSE)) confint(mymodel_rm2BS, type = "perc") ``` ``` ## # A tibble: 4 × 6 ## term estimate lower upper type level ## <chr> <dbl> <dbl> <dbl> <chr> <dbl> ## 1 (Intercept) 85.9 85.7 86.0 perc 0.95 ## 2 visit -0.525 -0.673 -0.369 perc 0.95 ## 3 conditionmindfulness -0.439 -0.743 -0.0967 perc 0.95 ## 4 visit:conditionmindfulness 0.339 0.126 0.539 perc 0.95 ``` --- # Summary - Influence can be exerted by individual observations and higher lever groups of observations - e.g. by children and by schools, or by individual trials and by participants. - We can get measures of influence at different levels, and consider how estimates and conclusions might change when certain observations (or groups) are excluded - If in doubt, conduct a sensitivity analysis: how do your conclusions change depending upon whether you include/exclude? --- class: inverse, center, middle, animated, rotateInDownLeft # End