class: center, middle, inverse, title-slide .title[ #
Analysing Experiments
] .subtitle[ ## Data Analysis for Psychology in R 2
] .author[ ### dapR2 Team ] .institute[ ### Department of Psychology
The University of Edinburgh ] --- # Week's Learning Objectives 1. Interpretation of interactions with effects coded variables. 2. Distinguish between interactions, main effect contrasts, and simple effect contrasts. 3. Use contrasts to code specific interaction hypotheses. 4. Apply pairwise tests and corrections. --- # Hypotheses Testing in Factorial Designs + **Interactions (categorical-categorical)** + A change in the effect of some condition as a function of another. + Does the effect of `Treatment` differ by `Hospital`? <!-- + With effects coding, we can also think of this as a difference in simple effects. --> + Main effects / main effect contrasts (typically not pursued if the interaction is significant). + An overall, or average, effect of a condition over combined levels of another. + Is there an effect of `Treatment` averaged over `Hospital`? + **Simple Effects** + An effect of one condition at a specific level of another. + Is there an effect of `Hospital` for those receiving `Treatment A`? (...and so on for all combinations.) + In factorial designs with more than two levels of one or more of the conditions, one can also distinguish between *simple effects* and *simple contrasts*. + A simple contrast is a more focused test that compares only two cells (or a particular combination of cells). --- # Our model and coefficients + The effects coded model: `$$y_{ijk} = b_0 + \underbrace{(b_1E_1 + b_2E_2)}_{\text{Treatment}} + \underbrace{b_3E_3}_{\text{Hospital}} + \underbrace{b_4E_{13} + b_5E_{23}}_{\text{Interactions}} + \epsilon_{i}$$` + Where the variables represent the following comparisons: ``` ## # A tibble: 6 × 7 ## Treatment Hospital E1 E2 E3 E13 E23 ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 A Hosp1 1 0 1 1 0 ## 2 A Hosp2 1 0 -1 -1 0 ## 3 B Hosp1 0 1 1 0 1 ## 4 B Hosp2 0 1 -1 0 -1 ## 5 C Hosp1 -1 -1 1 -1 -1 ## 6 C Hosp2 -1 -1 -1 1 1 ``` --- # Our model and coefficients + And our model in R: ```r m1 <- lm(SWB ~ Treatment + Hospital + Treatment*Hospital, data = hosp_tbl, contrasts = list(Treatment = contr.sum, Hospital = contr.sum)) # you can code contrasts within lm ``` --- # Table of means + Useful to keep in mind the group means: <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Hosp1 </th> <th style="text-align:left;"> Hosp2 </th> <th style="text-align:left;"> Marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TreatA </td> <td style="text-align:left;"> 10.80 </td> <td style="text-align:left;"> 7.85 </td> <td style="text-align:left;"> 9.33 </td> </tr> <tr> <td style="text-align:left;"> TreatB </td> <td style="text-align:left;"> 9.43 </td> <td style="text-align:left;"> 13.11 </td> <td style="text-align:left;"> 11.27 </td> </tr> <tr> <td style="text-align:left;"> TreatC </td> <td style="text-align:left;"> 10.10 </td> <td style="text-align:left;"> 7.98 </td> <td style="text-align:left;"> 9.04 </td> </tr> <tr> <td style="text-align:left;"> Marginal </td> <td style="text-align:left;"> 10.11 </td> <td style="text-align:left;"> 9.65 </td> <td style="text-align:left;"> 9.88 </td> </tr> </tbody> </table> --- # Our results ``` ## ## Call: ## lm(formula = SWB ~ Treatment + Hospital + Treatment * Hospital, ## data = hosp_tbl, contrasts = list(Treatment = contr.sum, ## Hospital = contr.sum)) ## ## Residuals: ## Min 1Q Median 3Q Max ## -6.6000 -1.2533 0.1083 1.2650 5.7000 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 9.8806 0.1510 65.425 < 2e-16 *** ## Treatment1 -0.5539 0.2136 -2.593 0.0103 * ## Treatment2 1.3928 0.2136 6.521 7.30e-10 *** ## Hospital1 0.2306 0.1510 1.527 0.1287 ## Treatment1:Hospital1 1.2428 0.2136 5.819 2.79e-08 *** ## Treatment2:Hospital1 -2.0739 0.2136 -9.710 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 2.026 on 174 degrees of freedom ## Multiple R-squared: 0.4476, Adjusted R-squared: 0.4317 ## F-statistic: 28.2 on 5 and 174 DF, p-value: < 2.2e-16 ``` --- # Interpretation with effects coding ``` ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 9.88 0.15 65.42 0.00 ## Treatment1 -0.55 0.21 -2.59 0.01 ## Treatment2 1.39 0.21 6.52 0.00 ## Hospital1 0.23 0.15 1.53 0.13 ## Treatment1:Hospital1 1.24 0.21 5.82 0.00 ## Treatment2:Hospital1 -2.07 0.21 -9.71 0.00 ``` + `\(b_0\)` (`Intercept`) = Grand mean. + `\(b_1\)` (`Treatment1`) = Difference between row marginal for treatment A and the grand mean. + `\(b_2\)` (`Treatment2`) = Difference between row marginal for treatment B and the grand mean. + `\(b_3\)` (`Hospital1`) = Difference between column marginal for Hospital 1 and the grand mean. + `\(b_4\)` (`Treatment1:Hospital1`) = Difference between Treatment A and grand mean, in Hospital 1 and Hospital 2. + `\(b_5\)` (`Treatment2:Hospital1`) = Difference between Treatment B and grand mean, in Hospital 1 and Hospital 2. --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Hosp1 </th> <th style="text-align:left;"> Hosp2 </th> <th style="text-align:left;"> Marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TreatA </td> <td style="text-align:left;"> 10.80 </td> <td style="text-align:left;"> 7.85 </td> <td style="text-align:left;"> 9.33 </td> </tr> <tr> <td style="text-align:left;"> TreatB </td> <td style="text-align:left;"> 9.43 </td> <td style="text-align:left;"> 13.11 </td> <td style="text-align:left;"> 11.27 </td> </tr> <tr> <td style="text-align:left;"> TreatC </td> <td style="text-align:left;"> 10.10 </td> <td style="text-align:left;"> 7.98 </td> <td style="text-align:left;"> 9.04 </td> </tr> <tr> <td style="text-align:left;"> Marginal </td> <td style="text-align:left;"> 10.11 </td> <td style="text-align:left;"> 9.65 </td> <td style="text-align:left;"> 9.88 </td> </tr> </tbody> </table> ] + `\(b_0\)` (`Intercept`) = Grand mean. + `\(b_1\)` (`Treatment1`) = Difference between row marginal for treatment A and the grand mean. + `\(b_2\)` (`Treatment2`) = Difference between row marginal for treatment B and the grand mean. + `\(b_3\)` (`Hospital1`) = Difference between column marginal for Hospital 1 and the grand mean. + `\(b_4\)` (`Treatment1:Hospital1`) = Difference between Treatment A and grand mean, in Hospital 1 and Hospital 2. + `\(b_5\)` (`Treatment2:Hospital1`) = Difference between Treatment B and grand mean, in Hospital 1 and Hospital 2. --- # Interpretation with effects coding `$$y_{ijk} = b_0 + \underbrace{(b_1E_1 + b_2E_2)}_{\text{Treatment}} + \underbrace{b_3E_3}_{\text{Hospital}} + \underbrace{b_4E_{13} + b_5E_{23}}_{\text{Interactions}} + \epsilon_{i}$$` .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Hosp1 </th> <th style="text-align:left;"> Hosp2 </th> <th style="text-align:left;"> Marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TreatA </td> <td style="text-align:left;"> 10.80 </td> <td style="text-align:left;"> 7.85 </td> <td style="text-align:left;"> 9.33 </td> </tr> <tr> <td style="text-align:left;"> TreatB </td> <td style="text-align:left;"> 9.43 </td> <td style="text-align:left;"> 13.11 </td> <td style="text-align:left;"> 11.27 </td> </tr> <tr> <td style="text-align:left;"> TreatC </td> <td style="text-align:left;"> 10.10 </td> <td style="text-align:left;"> 7.98 </td> <td style="text-align:left;"> 9.04 </td> </tr> <tr> <td style="text-align:left;"> Marginal </td> <td style="text-align:left;"> 10.11 </td> <td style="text-align:left;"> 9.65 </td> <td style="text-align:left;"> 9.88 </td> </tr> </tbody> </table> ] ``` ## # A tibble: 6 × 7 ## Treatment Hospital E1 E2 E3 E13 E23 ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 A Hosp1 1 0 1 1 0 ## 2 A Hosp2 1 0 -1 -1 0 ## 3 B Hosp1 0 1 1 0 1 ## 4 B Hosp2 0 1 -1 0 -1 ## 5 C Hosp1 -1 -1 1 -1 -1 ## 6 C Hosp2 -1 -1 -1 1 1 ``` --- # Interpretation with effects coding `$$y_{ijk} = b_0 + b_1E_1 + b_2E_2 + b_3E_3 + b_4E_{13} + b_5E_{23} + \epsilon_{i}$$` + `\(b_0\)` (`Intercept`) = Mean of means ("grand mean"): `\(b_{0} = \mu_{..}\)` + `\(b_1\)` (`Treatment1`) = Difference between row marginal for treatment A and the grand mean: `\(b_1 = \mu_{1.}-\mu_{..}\)` + `\(b_2\)` (`Treatment2`) = Difference between row marginal for treatment B and the grand mean: `\(b_2 = \mu_{2.}-\mu_{..}\)` + `\(b_3\)` (`Hospital1`) = Difference between column marginal for Hospital 1 and the grand mean: `\(b_3 = \mu_{.1}-\mu_{..}\)` + `\(b_4\)` (`Treatment1:Hospital1`) = Difference between Treatment A and grand mean, in Hospital 1 and Hospital 2: `\(b_4 = \mu_{11}-b_0-b_1-b_3\)` + `\(b_5\)` (`Treatment2:Hospital1`) = Difference between Treatment B and grand mean, in Hospital 1 and Hospital 2: `\(b_5 = \mu_{21}-b_0-b_2-b_3\)` .pull-left[ ``` ## # A tibble: 6 × 7 ## Treatment Hospital E1 E2 E3 E13 E23 ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 A Hosp1 1 0 1 1 0 ## 2 A Hosp2 1 0 -1 -1 0 ## 3 B Hosp1 0 1 1 0 1 ## 4 B Hosp2 0 1 -1 0 -1 ## 5 C Hosp1 -1 -1 1 -1 -1 ## 6 C Hosp2 -1 -1 -1 1 1 ``` ] .pull-right[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Hosp1 </th> <th style="text-align:left;"> Hosp2 </th> <th style="text-align:left;"> Marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TreatA </td> <td style="text-align:left;"> 10.80 </td> <td style="text-align:left;"> 7.85 </td> <td style="text-align:left;"> 9.33 </td> </tr> <tr> <td style="text-align:left;"> TreatB </td> <td style="text-align:left;"> 9.43 </td> <td style="text-align:left;"> 13.11 </td> <td style="text-align:left;"> 11.27 </td> </tr> <tr> <td style="text-align:left;"> TreatC </td> <td style="text-align:left;"> 10.10 </td> <td style="text-align:left;"> 7.98 </td> <td style="text-align:left;"> 9.04 </td> </tr> <tr> <td style="text-align:left;"> Marginal </td> <td style="text-align:left;"> 10.11 </td> <td style="text-align:left;"> 9.65 </td> <td style="text-align:left;"> 9.88 </td> </tr> </tbody> </table> ] --- # Interpretation with effects coding `$$y_{ijk} = b_0 + b_1E_1 + b_2E_2 + b_3E_3 + b_4E_{13} + b_5E_{23} + \epsilon_{i}$$` `$$b_{0} = \mu_{..}$$` `$$b_1 = \mu_{1.}-\mu_{..}$$` `$$b_2 = \mu_{2.}-\mu_{..}$$` `$$b_3 = \mu_{.1}-\mu_{..}$$` `$$b_4 = \mu_{11}-b_0-b_1-b_3 = (\mu_{11}-\mu_{..}) - (\mu_{1.}-\mu_{..}) - (\mu_{.1}-\mu_{..})$$` `$$b_5 = \mu_{21}-b_0-b_2-b_3 = (\mu_{21}-\mu_{..}) - (\mu_{2.}-\mu_{..}) - (\mu_{.1}-\mu_{..})$$` --- class: center, middle # Questions.... --- # Interacting manual contrasts + We can create an interaction based on manual contrasts. + Suppose we wanted to test whether there is a difference between Treatment A vs Treatment B and Treatment C across hospitals. + Steps: + Define the contrasts for each factor. + Calculate the product (remember all interactions are products). + Create the contrast code. + Run the contrast. --- # Interacting manual contrasts in R + Step 1: Individual contrasts. ```r treat_con <- c("TreatA" = 1, "TreatB" = -1/2, "TreatC" = -1/2) hosp_con <- c("Hops1" = 1, "Hosp2" = -1) ``` + Step 2: Product. ```r contr_prod <- outer(treat_con, hosp_con) contr_prod ``` ``` ## Hops1 Hosp2 ## TreatA 1.0 -1.0 ## TreatB -0.5 0.5 ## TreatC -0.5 0.5 ``` --- # Interacting manual contrasts in R + Step 3: Create the contrast. ```r m1_emm <- emmeans(m1, ~Treatment*Hospital) m1_emm ``` ``` ## Treatment Hospital emmean SE df lower.CL upper.CL ## TreatA Hosp1 10.80 0.37 174 10.07 11.53 ## TreatB Hosp1 9.43 0.37 174 8.70 10.16 ## TreatC Hosp1 10.10 0.37 174 9.37 10.83 ## TreatA Hosp2 7.85 0.37 174 7.12 8.58 ## TreatB Hosp2 13.12 0.37 174 12.39 13.85 ## TreatC Hosp2 7.98 0.37 174 7.25 8.71 ## ## Confidence level used: 0.95 ``` ```r int_comp <- list("TreatA vs B and C across hospital" = c(1, -1/2, -1/2, -1, 1/2, 1/2)) ``` --- # Interacting manual contrasts in R + Step 4: Run. ```r int_comp_test <- contrast(m1_emm, int_comp) int_comp_test ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA vs B and C across hospital 3.73 0.641 174 5.819 <.0001 ``` --- class: center, middle # Questions.... --- # Simple Effects + We noted previously that simple effects consider the effect of one condition at a specific level of the other. + Is there an effect of `Hospital` for those receiving `Treatment A`? (and so on for all combinations) + Or, put another way, is there a difference in `SWB` between Hospitals 1 and 2 for people receiving Treatment A? --- # Simple Effects with `emmeans` .pull-left[ ```r m1_emm <- emmeans(m1, ~Treatment*Hospital) m1_simple1 <- pairs(m1_emm, simple = "Hospital") m1_simple1 ``` ``` ## Treatment = TreatA: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.95 0.523 174 5.632 <.0001 ## ## Treatment = TreatB: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 -3.69 0.523 174 -7.047 <.0001 ## ## Treatment = TreatC: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.12 0.523 174 4.059 0.0001 ``` ] .pull-right[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Hosp1 </th> <th style="text-align:left;"> Hosp2 </th> <th style="text-align:left;"> Marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TreatA </td> <td style="text-align:left;"> 10.80 </td> <td style="text-align:left;"> 7.85 </td> <td style="text-align:left;"> 9.33 </td> </tr> <tr> <td style="text-align:left;"> TreatB </td> <td style="text-align:left;"> 9.43 </td> <td style="text-align:left;"> 13.11 </td> <td style="text-align:left;"> 11.27 </td> </tr> <tr> <td style="text-align:left;"> TreatC </td> <td style="text-align:left;"> 10.10 </td> <td style="text-align:left;"> 7.98 </td> <td style="text-align:left;"> 9.04 </td> </tr> <tr> <td style="text-align:left;"> Marginal </td> <td style="text-align:left;"> 10.11 </td> <td style="text-align:left;"> 9.65 </td> <td style="text-align:left;"> 9.88 </td> </tr> </tbody> </table> ] --- # Simple Effects with `emmeans` .pull-left[ ```r m1_simple2 <- pairs(m1_emm, simple = "Treatment") m1_simple2 ``` ``` ## Hospital = Hosp1: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB 1.370 0.523 174 2.619 0.0259 ## TreatA - TreatC 0.697 0.523 174 1.332 0.3796 ## TreatB - TreatC -0.673 0.523 174 -1.287 0.4044 ## ## Hospital = Hosp2: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB -5.263 0.523 174 -10.061 <.0001 ## TreatA - TreatC -0.127 0.523 174 -0.242 0.9682 ## TreatB - TreatC 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 3 estimates ``` ] .pull-right[ <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Hosp1 </th> <th style="text-align:left;"> Hosp2 </th> <th style="text-align:left;"> Marginal </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TreatA </td> <td style="text-align:left;"> 10.80 </td> <td style="text-align:left;"> 7.85 </td> <td style="text-align:left;"> 9.33 </td> </tr> <tr> <td style="text-align:left;"> TreatB </td> <td style="text-align:left;"> 9.43 </td> <td style="text-align:left;"> 13.11 </td> <td style="text-align:left;"> 11.27 </td> </tr> <tr> <td style="text-align:left;"> TreatC </td> <td style="text-align:left;"> 10.10 </td> <td style="text-align:left;"> 7.98 </td> <td style="text-align:left;"> 9.04 </td> </tr> <tr> <td style="text-align:left;"> Marginal </td> <td style="text-align:left;"> 10.11 </td> <td style="text-align:left;"> 9.65 </td> <td style="text-align:left;"> 9.88 </td> </tr> </tbody> </table> ] --- # Simple effects with plots .pull-left[ <img src="dapr2_14_analysingexperiments_files/figure-html/unnamed-chunk-23-1.png" width="90%" /> ] .pull-right[ ```r m1_simple1 ``` ``` ## Treatment = TreatA: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.95 0.523 174 5.632 <.0001 ## ## Treatment = TreatB: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 -3.69 0.523 174 -7.047 <.0001 ## ## Treatment = TreatC: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.12 0.523 174 4.059 0.0001 ``` ] --- # Simple effects with plots .pull-left[ ```r m1_simple2 ``` ``` ## Hospital = Hosp1: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB 1.370 0.523 174 2.619 0.0259 ## TreatA - TreatC 0.697 0.523 174 1.332 0.3796 ## TreatB - TreatC -0.673 0.523 174 -1.287 0.4044 ## ## Hospital = Hosp2: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB -5.263 0.523 174 -10.061 <.0001 ## TreatA - TreatC -0.127 0.523 174 -0.242 0.9682 ## TreatB - TreatC 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 3 estimates ``` ] .pull-right[ <img src="dapr2_14_analysingexperiments_files/figure-html/unnamed-chunk-26-1.png" width="90%" /> ] --- # Visualizing the interaction .pull-left[ ![](dapr2_14_analysingexperiments_files/figure-html/unnamed-chunk-27-1.png)<!-- --> ] .pull-right[ ```r emmip(m1, Hospital~Treatment) + geom_hline(yintercept = mean(hosp_tbl$SWB)) ``` ] --- class: center, middle # Questions.... --- # Pairwise comparisons + As the name suggests, pairwise comparisons compare all levels of a given predictor variable with all levels of the other. + A fully exploratory pairwise analysis. ```r pairs_res <- pairs(m1_emm) ``` --- # Pairwise comparisons ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.0982 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.7670 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0002 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.7918 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0346 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0670 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0010 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 0.9999 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 6 estimates ``` --- # Why do pairwise comparisons? + Sometimes we do not have a concrete hypothesis to test. + Sometimes we do, but the exploratory analysis is still useful information for the field. + Pairwise comparisons throws up a statistical issue, namely the problem of multiple comparisons. + When we do lots and lots of tests, the chances of Type I error (false-positives) increase. + How we can adjust our inferences to deal with this? --- # Types of errors + Type I error = False positive + Reject the null when the null is true. + `\(\alpha = P(\text{Type I Error})\)` + Type II error = False negative + Fail to reject the null when the null is false. + `\(\beta = P(\text{Type II Error})\)` --- # A single test + If we perform a single test, our Type I error rate is `\(\alpha\)`. + So if we set `\(\alpha = 0.05\)`, the probability of a false positive is 5%. + However, what if we do multiple tests ( `\(m\)` ), each with the same `\(\alpha\)` level? + What is the probability of a false positive among `\(m\)` tests? --- # Multiple tests `$$P(\text{Type I error}) = \alpha$$` `$$P(\text{not making a Type I error}) = 1 - \alpha$$` `$$P(\text{Not making a Type I error in m tests}) = (1 - \alpha)^m$$` `$$P(\text{Making a Type I error in m tests}) = 1 - (1-\alpha)^m$$` --- # P(Making a Type I error in m tests) + Suppose `\(m=2\)` and `\(\alpha = 0.05\)` ```r 1 - ((1-0.05)^2) ``` ``` ## [1] 0.0975 ``` + Suppose `\(m=5\)` and `\(\alpha = 0.05\)` ```r 1 - ((1-0.05)^5) ``` ``` ## [1] 0.2262191 ``` + Suppose `\(m=10\)` and `\(\alpha = 0.05\)` ```r 1 - ((1-0.05)^10) ``` ``` ## [1] 0.4012631 ``` --- # Why does this matter? + The `\(P(\text{Making a Type I error in m tests}) = 1 - (1-\alpha)^m\)` is referred to as the family-wise error rate. + A "family" is a set of related tests. + When we analyse an experimental design, and we look at lots of specific comparisons, we can think of all these tests as a "family". + The larger the family, the more likely we are to find a false positive (see previous slide). --- # Corrections + There are various methods designed to control for the number of tests. + Here control means to keep the Type I Error rate at an intended `\(\alpha\)`. + Many options. Some of most common: + Bonferroni + Sidak + Tukey + Scheffe + Others you may see: + Holm's step-down + Hochberg's step-up --- # Bonferroni & Sidak + Both are considered "conservative" adjustments. + Each treats individual tests within the family as if they are independent. + Consider an `\(\alpha = 0.05\)` and `\(m=\text{number of tests}=15\)` + **Bonferroni**: `\(\alpha_{Bonferroni} = \frac{\alpha}{m}\)` ```r 0.05/15 ``` ``` ## [1] 0.003333333 ``` + **Sidak**: `\(\alpha_{Sidak} = 1 - (1- \alpha)^{\frac{1}{m}}\)` ```r 1-((1-0.05)^(1/15)) ``` ``` ## [1] 0.003413713 ``` --- # No adjustments ```r pairs(m1_emm, adjust="none") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.0096 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.1847 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 <.0001 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.1998 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0030 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0062 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 <.0001 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0001 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 0.8090 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ``` --- # Bonferroni in action: `emmeans` ```r pairs(m1_emm, adjust="bonferroni") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.1441 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 1.0000 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0003 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 1.0000 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0445 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0928 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0011 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 1.0000 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: bonferroni method for 15 tests ``` --- # Sidak with `emmeans` ```r pairs(m1_emm, adjust = "sidak") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.1348 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.9533 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0003 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.9647 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0436 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0889 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0011 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 1.0000 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: sidak method for 15 tests ``` --- # What about if there were less tests? ```r pairs(m1_emm, simple="Treatment", adjust="bonferroni") ``` ``` ## Hospital = Hosp1: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB 1.370 0.523 174 2.619 0.0288 ## TreatA - TreatC 0.697 0.523 174 1.332 0.5541 ## TreatB - TreatC -0.673 0.523 174 -1.287 0.5993 ## ## Hospital = Hosp2: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB -5.263 0.523 174 -10.061 <.0001 ## TreatA - TreatC -0.127 0.523 174 -0.242 1.0000 ## TreatB - TreatC 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: bonferroni method for 3 tests ``` --- # Scheffe & Tukey + **Scheffe procedure** involves an adjustment to the critical value of `\(F\)`. + The adjustment relates to the number of comparisons being made. + And makes the critical value of `\(F\)` larger for a fixed `\(\alpha\)`. + The more tests, the larger `\(F\)`. + The square-root of the adjusted F provides and adjusted `\(t\)`. -- + **Tukey's HSD** (Honest significant differences) + Compares all pairwise group means. + Each difference is divided by the `\(SE\)` of the sum of means. + This produces a `\(q\)` statistic for each comparison. + And is compared against a studentized range distribution. --- # With `emmeans` ```r pairs(m1_emm, adjust = "tukey") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.0982 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.7670 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0002 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.7918 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0346 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0670 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0010 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 0.9999 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 6 estimates ``` --- # With `emmeans` ```r pairs(m1_emm, adjust = "scheffe") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.2372 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.8787 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0021 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 0.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.8935 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.1119 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.1809 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0033 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0072 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 1.0000 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: scheffe method with rank 5 ``` --- # Summary + We are know finished with discussing testing effects within `lm`. + This week we have looked at ways we can code categorical data to test experimental effects. + In the next block, we will consider: + Non-continuous **dependent** variables + Approaches to modelling and modelling issues + Power + Last week of the course will be all Q&A and discussing exam. --- class: center, middle # Thanks for listening!