class: center, middle, inverse, title-slide .title[ #
Analysing Experiments
] .subtitle[ ## Data Analysis for Psychology in R 2
] .author[ ### dapR2 Team ] .institute[ ### Department of Psychology
The University of Edinburgh ] --- # Course Overview .pull-left[ <!--- I've just copied the output of the Sem 1 table here and removed the bolding on the last week, so things look consistent with the trailing opacity produced by the course_table.R script otherwise. --> <table style="border: 1px solid black;> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1;text-align:center;vertical-align: middle"> <b>Introduction to Linear Models</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Intro to Linear Regression</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Interpreting Linear Models</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Testing Individual Predictors</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Model Testing & Comparison</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Linear Model Analysis</td></tr> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1;text-align:center;vertical-align: middle"> <b>Analysing Experimental Studies</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Categorical Predictors & Dummy Coding</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Effects Coding & Coding Specific Contrasts</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Assumptions & Diagnostics</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Bootstrapping</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Categorical Predictor Analysis</td></tr> </table> ] .pull-right[ <table style="border: 1px solid black;> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1;text-align:center;vertical-align: middle"> <b>Interactions</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Interactions I</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Interactions II</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Interactions III</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> <b>Analysing Experiments</b></td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interaction Analysis</td></tr> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Advanced Topics</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Power Analysis</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Binary Logistic Regression I</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Binary Logistic Regression II</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Logistic Regresison Analysis</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Exam Prep and Course Q&A</td></tr> </table> ] --- # Week's Learning Objectives 1. Interpretation of interactions with effects-coded variables 2. Distinguish between interactions, main effect contrasts, and simple effect contrasts 3. Use contrasts to code specific interaction hypotheses 4. Apply pairwise tests and corrections --- # Hypotheses Testing in Factorial Designs + **Interactions (categorical-categorical)** + A change in the effect of one condition as a function of another + Does the effect of `Treatment` differ by `Hospital`? <!-- + With effects coding, we can also think of this as a difference in simple effects. --> + **Main effects / main effect contrasts** + An overall, or average, effect of a condition over combined levels of another + Is there an effect of `Treatment` averaged over `Hospital`? + **Simple Effects** + An effect of one condition at a specific level of another + Is there an effect of `Hospital` for those receiving `Treatment A`? (...and so on for all combinations) + In factorial designs with more than two levels of one or more of the conditions, one can also distinguish between *simple effects* and *simple contrasts* + A simple contrast is a more focused test that directly compares only two cells (or a particular combination of cells) --- # Our model and coefficients + The effects-coded model: `$$y_{i} = b_0 + \underbrace{(b_1E_1 + b_2E_2)}_{\text{Treatment}} + \underbrace{b_3E_3}_{\text{Hospital}} + \underbrace{b_4E_{13} + b_5E_{23}}_{\text{Interactions}} + \epsilon_{i}$$` + Where the variables represent the following comparisons: ``` ## # A tibble: 6 × 7 ## Treatment Hospital E1 E2 E3 E13 E23 ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 A Hosp1 1 0 1 1 0 ## 2 A Hosp2 1 0 -1 -1 0 ## 3 B Hosp1 0 1 1 0 1 ## 4 B Hosp2 0 1 -1 0 -1 ## 5 C Hosp1 -1 -1 1 -1 -1 ## 6 C Hosp2 -1 -1 -1 1 1 ``` --- # Specifying our model in R + And our model in R: ``` r m1 <- lm(SWB ~ Treatment + Hospital + Treatment*Hospital, data = hosp_tbl, contrasts = list(Treatment = contr.sum, Hospital = contr.sum)) # you can code contrasts within lm ``` + Remember that the `contrasts` argument of the `lm` function will be set to dummy coding by default, unless otherwise specified + Here we have specified the value of the `contrasts` argument within the `lm` function --- # Table of means + Useful to keep in mind the group means: | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | + Marginal means indicate the means of group means by rows and columns (for the two variables) + 9.88 is the grand mean (mean of all group means) --- # What will the model tell us? + **Marginal effects:** Does a marginal mean differ from the grand mean? + **Interactions:** Does the distance of the row or column marginals from the grand mean differ depending on the level of the other variable? | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | --- # Our results ``` ## ## Call: ## lm(formula = SWB ~ Treatment + Hospital + Treatment * Hospital, ## data = hosp_tbl, contrasts = list(Treatment = contr.sum, ## Hospital = contr.sum)) ## ## Residuals: ## Min 1Q Median 3Q Max ## -6.6000 -1.2533 0.1083 1.2650 5.7000 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 9.8806 0.1510 65.425 < 2e-16 *** ## Treatment1 -0.5539 0.2136 -2.593 0.0103 * ## Treatment2 1.3928 0.2136 6.521 7.30e-10 *** ## Hospital1 0.2306 0.1510 1.527 0.1287 ## Treatment1:Hospital1 1.2428 0.2136 5.819 2.79e-08 *** ## Treatment2:Hospital1 -2.0739 0.2136 -9.710 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 2.026 on 174 degrees of freedom ## Multiple R-squared: 0.4476, Adjusted R-squared: 0.4317 ## F-statistic: 28.2 on 5 and 174 DF, p-value: < 2.2e-16 ``` --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] <br> + `\(b_0\)` (`Intercept`) = Grand mean -- <br> + 9.88 --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] <br> + `\(b_1\)` (`Treatment1`) = Difference between row marginal for treatment A and the grand mean -- <br> + 9.33 - 9.88 --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] <br> + `\(b_2\)` (`Treatment2`) = Difference between row marginal for treatment B and the grand mean -- <br> + 11.27 - 9.88 --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] <br> + `\(b_3\)` (`Hospital1`) = Difference between column marginal for Hospital 1 and the grand mean -- <br> + 10.11 - 9.88 --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] <br> + `\(b_4\)` (`Treatment1:Hospital1`) = Difference between Treatment A in Hospital 1 and the grand mean, compared to the overall effect of Treatment A (vs grand mean) and the overall effect of Hospital 1 (vs grand mean) -- <br> + `\(b_4 = (\mu_{TreatmentAHospital1} - b_0) - b_1 - b_3\)` -- <br> + (10.80 - 9.88) - (9.33 - 9.88) - (10.11 - 9.88) --- # Interpretation with effects coding .pull-left[ ``` ## Estimate Std. Error ## (Intercept) 9.88 0.15 ## Treatment1 -0.55 0.21 ## Treatment2 1.39 0.21 ## Hospital1 0.23 0.15 ## Treatment1:Hospital1 1.24 0.21 ## Treatment2:Hospital1 -2.07 0.21 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] <br> + `\(b_5\)` (`Treatment2:Hospital1`) = Difference between Treatment B in Hospital 1 and the grand mean, compared to the overall effect of Treatment B (vs grand mean) and the overall effect of Hospital 1 (vs grand mean) -- <br> + `\(b_5 = (\mu_{TreatmentBHospital1} - b_0) - b_2 - b_3\)` -- <br> + (9.43 - 9.88) - (11.27 - 9.88) - (10.11 - 9.88) --- # Predictions from the regression formula `$$\hat{y} = b_0 + b_1E_1 + b2E_2 + b_3E_3 + b_4E_{13} + b_5E_{23}$$` + Plugging in our `\(\beta\)` estimates: `$$\hat{y} = 9.88 - 0.55 \cdot E_1 + 1.39 \cdot E_2 + 0.23 \cdot E_3 + 1.24 \cdot E_{13} - 2.07 \cdot E_{23}$$` <br> .pull-left[ ``` ## # A tibble: 6 × 7 ## Treatment Hospital E1 E2 E3 E13 E23 ## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 A Hosp1 1 0 1 1 0 ## 2 A Hosp2 1 0 -1 -1 0 ## 3 B Hosp1 0 1 1 0 1 ## 4 B Hosp2 0 1 -1 0 -1 ## 5 C Hosp1 -1 -1 1 -1 -1 ## 6 C Hosp2 -1 -1 -1 1 1 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] --- class: center, middle # Questions? --- # Interacting manual contrasts + We can test an interaction based on manual contrasts + Suppose we wanted to test whether there is a difference between Treatment A vs Treatment B and Treatment C across hospitals + Steps: + Define the contrasts for each factor + Calculate the product (remember all interactions are products) + Create the contrast code + Run the contrast --- # Interacting manual contrasts in R + Step 1: Individual contrasts (Comparing Treatment A to the rest) ``` r treat_con <- c("TreatA" = 1, "TreatB" = -1/2, "TreatC" = -1/2) hosp_con <- c("Hosp1" = 1, "Hosp2" = -1) ``` + Step 2: Calculating the product (with the `outer` function) ``` r contr_prod <- outer(treat_con, hosp_con) contr_prod ``` ``` ## Hosp1 Hosp2 ## TreatA 1.0 -1.0 ## TreatB -0.5 0.5 ## TreatC -0.5 0.5 ``` --- # Interacting manual contrasts in R + Step 3: Create the contrast ``` r m1_emm <- emmeans(m1, ~Treatment*Hospital) m1_emm ``` ``` ## Treatment Hospital emmean SE df lower.CL upper.CL ## TreatA Hosp1 10.80 0.37 174 10.07 11.53 ## TreatB Hosp1 9.43 0.37 174 8.70 10.16 ## TreatC Hosp1 10.10 0.37 174 9.37 10.83 ## TreatA Hosp2 7.85 0.37 174 7.12 8.58 ## TreatB Hosp2 13.12 0.37 174 12.39 13.85 ## TreatC Hosp2 7.98 0.37 174 7.25 8.71 ## ## Confidence level used: 0.95 ``` Input the weights we calculated into an object: ``` r int_comp <- list("TreatA vs B and C across hospital" = c(1, -1/2, -1/2, -1, 1/2, 1/2)) ``` --- # Interacting manual contrasts in R + Step 4: Run the contrast ``` r int_comp_test <- contrast(m1_emm, int_comp) int_comp_test ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA vs B and C across hospital 3.73 0.641 174 5.819 <.0001 ``` <br> + For interpretation, remember the contrast weights: ``` r contr_prod ``` ``` ## Hosp1 Hosp2 ## TreatA 1.0 -1.0 ## TreatB -0.5 0.5 ## TreatC -0.5 0.5 ``` --- class: center, middle # Questions? --- # Simple Effects + We noted previously that simple effects consider the effect of one condition at a specific level of the other + Is there an effect of `Hospital` for those receiving `Treatment A`? (and so on for all combinations) + Or, put another way, is there a difference in `SWB` between Hospitals 1 and 2 for people receiving Treatment A? --- # Simple Effects with `emmeans` .pull-left[ ``` r m1_emm <- emmeans(m1, ~Treatment*Hospital) m1_simple1 <- pairs(m1_emm, simple = "Hospital") m1_simple1 ``` ``` ## Treatment = TreatA: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.95 0.523 174 5.632 <.0001 ## ## Treatment = TreatB: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 -3.69 0.523 174 -7.047 <.0001 ## ## Treatment = TreatC: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.12 0.523 174 4.059 0.0001 ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] --- # Simple Effects with `emmeans` .pull-left[ ``` r m1_simple2 <- pairs(m1_emm, simple = "Treatment") m1_simple2 ``` ``` ## Hospital = Hosp1: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB 1.370 0.523 174 2.619 0.0259 ## TreatA - TreatC 0.697 0.523 174 1.332 0.3796 ## TreatB - TreatC -0.673 0.523 174 -1.287 0.4044 ## ## Hospital = Hosp2: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB -5.263 0.523 174 -10.061 <.0001 ## TreatA - TreatC -0.127 0.523 174 -0.242 0.9682 ## TreatB - TreatC 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 3 estimates ``` ] .pull-right[ | |Hosp1 |Hosp2 |Marginal | |:--------|:-----|:-----|:--------| |TreatA |10.80 |7.85 |9.33 | |TreatB |9.43 |13.11 |11.27 | |TreatC |10.10 |7.98 |9.04 | |Marginal |10.11 |9.65 |9.88 | ] --- # Simple Effects with plots .pull-left[ ``` r emmip(m1, Treatment ~ Hospital) ``` <img src="dapr2_14_analysingexperiments_files/figure-html/unnamed-chunk-32-1.png" width="90%" /> ] .pull-right[ ``` r m1_simple1 ``` ``` ## Treatment = TreatA: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.95 0.523 174 5.632 <.0001 ## ## Treatment = TreatB: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 -3.69 0.523 174 -7.047 <.0001 ## ## Treatment = TreatC: ## contrast estimate SE df t.ratio p.value ## Hosp1 - Hosp2 2.12 0.523 174 4.059 0.0001 ``` ] --- # Simple Effects with plots .pull-left[ ``` r emmip(m1, Hospital ~ Treatment) ``` <img src="dapr2_14_analysingexperiments_files/figure-html/unnamed-chunk-34-1.png" width="90%" /> ] .pull-right[ ``` r m1_simple2 ``` ``` ## Hospital = Hosp1: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB 1.370 0.523 174 2.619 0.0259 ## TreatA - TreatC 0.697 0.523 174 1.332 0.3796 ## TreatB - TreatC -0.673 0.523 174 -1.287 0.4044 ## ## Hospital = Hosp2: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB -5.263 0.523 174 -10.061 <.0001 ## TreatA - TreatC -0.127 0.523 174 -0.242 0.9682 ## TreatB - TreatC 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 3 estimates ``` ] --- class: center, middle # Questions? --- # Pairwise comparisons + As the name suggests, pairwise comparisons compare all levels of a given predictor variable with all levels of the other + A fully exploratory pairwise analysis ``` r pairs_res <- pairs(m1_emm) ``` --- # Pairwise comparisons ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.0982 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.7670 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0002 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.7918 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0346 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0670 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0010 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 0.9999 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 6 estimates ``` --- # Why do pairwise comparisons? + Sometimes we do not have a concrete hypothesis to test + Sometimes we do, but the exploratory analysis still gives useful information + Pairwise comparisons throws up a statistical issue, namely the problem of multiple comparisons + When we do lots and lots of tests, the chances of Type I error (false-positives) increase + How we can adjust our inferences to deal with this? --- # Types of errors + Type I error = False positive + Reject the null when the null is true + `\(\alpha = P(\text{Type I Error})\)` + Type II error = False negative + Fail to reject the null when the null is false + `\(\beta = P(\text{Type II Error})\)` --- # A single test + If we perform a single test, our Type I error rate is `\(\alpha\)` + So if we set `\(\alpha = 0.05\)`, the probability of a false positive is 5% + However, what if we do multiple tests ( `\(m\)` ), each with the same `\(\alpha\)` level? + What is the probability of a false positive among `\(m\)` tests? --- # Multiple tests `$$P(\text{Type I error}) = \alpha$$` `$$P(\text{not making a Type I error}) = 1 - \alpha$$` `$$P(\text{Not making a Type I error in m tests}) = (1 - \alpha)^m$$` `$$P(\text{Making a Type I error in m tests}) = 1 - (1-\alpha)^m$$` --- # P(Making a Type I error in m tests) + Suppose `\(m=2\)` and `\(\alpha = 0.05\)` ``` r 1 - ((1-0.05)^2) ``` ``` ## [1] 0.0975 ``` + Suppose `\(m=5\)` and `\(\alpha = 0.05\)` ``` r 1 - ((1-0.05)^5) ``` ``` ## [1] 0.2262191 ``` + Suppose `\(m=10\)` and `\(\alpha = 0.05\)` ``` r 1 - ((1-0.05)^10) ``` ``` ## [1] 0.4012631 ``` --- # Why does this matter? + The `\(P(\text{Making a Type I error in m tests}) = 1 - (1-\alpha)^m\)` is referred to as the family-wise error rate + A "family" is a set of related tests + When we analyse an experimental design, and we look at lots of specific comparisons, we can think of all these tests as a "family" + The larger the family, the more likely we are to find a false positive (see previous slide) --- # Corrections + There are various methods designed to control for the number of tests + Here control means to keep the Type I Error rate at an intended `\(\alpha\)` + Many options. Some of most common: + Bonferroni + Sidak + Tukey + Scheffe + Others you may see: + Holm's step-down + Hochberg's step-up --- # Bonferroni & Sidak + Both are considered "conservative" adjustments + Each treats individual tests within the family as if they are independent + Consider an `\(\alpha = 0.05\)` and `\(m=\text{number of tests}=15\)` + **Bonferroni**: `\(\alpha_{Bonferroni} = \frac{\alpha}{m}\)` ``` r 0.05/15 ``` ``` ## [1] 0.003333333 ``` + **Sidak**: `\(\alpha_{Sidak} = 1 - (1- \alpha)^{\frac{1}{m}}\)` ``` r 1-((1-0.05)^(1/15)) ``` ``` ## [1] 0.003413713 ``` --- # No adjustments ``` r pairs(m1_emm, adjust="none") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.0096 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.1847 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 <.0001 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.1998 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0030 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0062 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 <.0001 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0001 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 0.8090 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ``` --- # Bonferroni in action: `emmeans` ``` r pairs(m1_emm, adjust="bonferroni") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.1441 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 1.0000 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0003 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 1.0000 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0445 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0928 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0011 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 1.0000 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: bonferroni method for 15 tests ``` --- # Sidak with `emmeans` ``` r pairs(m1_emm, adjust = "sidak") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.1348 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.9533 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0003 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.9647 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0436 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0889 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0011 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 1.0000 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: sidak method for 15 tests ``` --- # What if there were fewer tests? ``` r pairs(m1_emm, simple="Treatment", adjust="bonferroni") ``` ``` ## Hospital = Hosp1: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB 1.370 0.523 174 2.619 0.0288 ## TreatA - TreatC 0.697 0.523 174 1.332 0.5541 ## TreatB - TreatC -0.673 0.523 174 -1.287 0.5993 ## ## Hospital = Hosp2: ## contrast estimate SE df t.ratio p.value ## TreatA - TreatB -5.263 0.523 174 -10.061 <.0001 ## TreatA - TreatC -0.127 0.523 174 -0.242 1.0000 ## TreatB - TreatC 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: bonferroni method for 3 tests ``` --- # Scheffe & Tukey + **Scheffe procedure** involves an adjustment to the critical value of `\(F\)` + The adjustment relates to the number of comparisons being made + And makes the critical value of `\(F\)` larger for a fixed `\(\alpha\)` + The more tests, the larger `\(F\)` + The square-root of the adjusted F provides and adjusted `\(t\)` -- + **Tukey's HSD** (Honest significant differences) + Compares all pairwise group means + Each difference is divided by the `\(SE\)` of the sum of means + This produces a `\(q\)` statistic for each comparison. + And is compared against a studentised range distribution --- # With `emmeans` ``` r pairs(m1_emm, adjust = "tukey") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.0982 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.7670 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0002 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 <.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.7918 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.0346 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.0670 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0004 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0010 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 0.9999 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: tukey method for comparing a family of 6 estimates ``` --- # With `emmeans` ``` r pairs(m1_emm, adjust = "scheffe") ``` ``` ## contrast estimate SE df t.ratio p.value ## TreatA Hosp1 - TreatB Hosp1 1.370 0.523 174 2.619 0.2372 ## TreatA Hosp1 - TreatC Hosp1 0.697 0.523 174 1.332 0.8787 ## TreatA Hosp1 - TreatA Hosp2 2.947 0.523 174 5.632 <.0001 ## TreatA Hosp1 - TreatB Hosp2 -2.317 0.523 174 -4.428 0.0021 ## TreatA Hosp1 - TreatC Hosp2 2.820 0.523 174 5.390 0.0001 ## TreatB Hosp1 - TreatC Hosp1 -0.673 0.523 174 -1.287 0.8935 ## TreatB Hosp1 - TreatA Hosp2 1.577 0.523 174 3.014 0.1119 ## TreatB Hosp1 - TreatB Hosp2 -3.687 0.523 174 -7.047 <.0001 ## TreatB Hosp1 - TreatC Hosp2 1.450 0.523 174 2.772 0.1809 ## TreatC Hosp1 - TreatA Hosp2 2.250 0.523 174 4.301 0.0033 ## TreatC Hosp1 - TreatB Hosp2 -3.013 0.523 174 -5.760 <.0001 ## TreatC Hosp1 - TreatC Hosp2 2.123 0.523 174 4.059 0.0072 ## TreatA Hosp2 - TreatB Hosp2 -5.263 0.523 174 -10.061 <.0001 ## TreatA Hosp2 - TreatC Hosp2 -0.127 0.523 174 -0.242 1.0000 ## TreatB Hosp2 - TreatC Hosp2 5.137 0.523 174 9.819 <.0001 ## ## P value adjustment: scheffe method with rank 5 ``` --- # Summary + This week we have looked at ways we can code categorical data to test experimental effects + Interpreting model output for effects-coded categorical `\(\times\)` categorial interactions + Setting up specific contrasts and comparisons + Correcting for increases in the probability of Type I error --- ## This week .pull-left[ ### Tasks <img src="figs/labs.svg" width="10%" /> **Attend your lab and work together on the exercises** <br> <img src="figs/exam.svg" width="10%" /> **Complete the weekly quiz** ] .pull-right[ ### Support <img src="figs/forum.svg" width="10%" /> **Help each other on the Piazza forum** <br> <img src="figs/oh.png" width="10%" /> **Attend office hours (see Learn page for details)** ]