class: center, middle, inverse, title-slide

# Week 4: Binary Predictors
## Data Analysis for Psychology in R 2
### TOM BOOTH & ALEX DOUMAS
### Department of Psychology
The University of Edinburgh
### AY 2020-2021

---
# Week's Learning Objectives

1. Understand the meaning of model coefficients in the case of a binary predictor.
2. Be able to state the assumptions underlying a linear model.
3. Understand how to assess if a fitted model satisfies the linear model assumptions.
4. Understand how to use transformations when the model violates assumptions.

---
# Topics for today

+ Today we will focus on the linear model with a binary predictor.
+ Recap categorical and binary variables
+ Extend our data example
+ Run through the steps and interpretation for the linear model

---
# Recap: Categorical variables

+ Categorical variables can only take discrete values
  + E.g., animal type: 1 = duck, 2 = cow, 3 = wasp
+ They are mutually exclusive
  + No duck-wasps or cow-ducks!
+ In R, categorical variables should be of class `factor`
  + The discrete values are `levels`
  + Levels can have numeric values (1, 2, 3) and labels (e.g. "duck", "cow", "wasp")

---
# Recap: Binary variable

+ A binary variable is a categorical variable with two levels.
+ Traditionally coded with a 0 and 1
  + Referred to as dummy coding
  + We will come back to this for categorical variables with more than two levels

--

+ Why 0 and 1?
  + Quick version: It has some nice properties when it comes to interpretation.

---
# Extending our example

.pull-left[
+ Our in-class example so far has used test scores and revision time for 10 students.
+ Let's say we collect this data on 150 students.
+ We also collected data on who they studied with:
  + 0 = alone
  + 1 = with others
+ So our variable `study` is a binary variable
]

.pull-right[
+ This data set is available on LEARN

```r
library(tidyverse)  # provides read_csv() and the %>% pipe
df <- read_csv("./dapr2_lec07.csv")
```

<div style="border: 1px solid #ddd; padding: 0px; overflow-y: scroll; height:300px; "><table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;position: sticky; top:0; background-color: #FFFFFF;"> ID </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> score </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> hours </th> <th style="text-align:right;position: sticky; top:0; background-color: #FFFFFF;"> study </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> ID1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID2 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID3 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID4 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID5 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID6 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID7 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> 
<td style="text-align:left;"> ID8 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID9 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID10 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID11 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID12 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 5.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID13 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID14 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.5 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID15 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID16 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID17 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID18 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID19 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.8 </td> 
<td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID20 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.4 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID21 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID22 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID23 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID24 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.5 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID25 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID26 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID27 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.5 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID28 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 1.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID29 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID30 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.5 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID31 </td> <td 
style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID32 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID33 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.5 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID34 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID35 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID36 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID37 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 5.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID38 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 3.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID39 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID40 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.9 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID41 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID42 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.8 </td> <td style="text-align:right;"> 1 </td> 
</tr> <tr> <td style="text-align:left;"> ID43 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID44 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID45 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID46 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID47 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID48 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID49 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 5.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID50 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID51 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID52 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID53 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID54 </td> <td style="text-align:right;"> 7 </td> <td 
style="text-align:right;"> 5.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID55 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID56 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID57 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID58 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID59 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.5 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID60 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 0.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID61 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID62 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID63 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID64 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID65 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.5 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> 
ID66 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID67 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID68 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID69 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID70 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID71 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 5.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID72 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID73 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID74 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 5.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID75 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.2 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID76 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.1 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID77 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.1 </td> <td 
style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID78 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID79 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID80 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 5.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID81 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID82 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID83 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID84 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.1 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID85 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID86 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID87 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID88 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID89 </td> <td style="text-align:right;"> 
6 </td> <td style="text-align:right;"> 3.2 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID90 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID91 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID92 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.9 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID93 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID94 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID95 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID96 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID97 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID98 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 3.1 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID99 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID100 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td 
style="text-align:left;"> ID101 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID102 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.5 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID103 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID104 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID105 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID106 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:right;"> 2.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID107 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID108 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID109 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID110 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.5 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID111 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID112 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 
4.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID113 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID114 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID115 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID116 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.4 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID117 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID118 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 1.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID119 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID120 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 3.2 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID121 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID122 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.3 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID123 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.9 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID124 </td> <td 
style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID125 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID126 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID127 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID128 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID129 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID130 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID131 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 1.8 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID132 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 5.1 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID133 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID134 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID135 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.5 </td> <td style="text-align:right;"> 
0 </td> </tr> <tr> <td style="text-align:left;"> ID136 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.7 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID137 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.8 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID138 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 2.7 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID139 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.4 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID140 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 2.9 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID141 </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 6.5 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID142 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.3 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID143 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID144 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 4.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID145 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID146 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 3.0 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID147 </td> <td style="text-align:right;"> 5 </td> <td 
style="text-align:right;"> 2.4 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID148 </td> <td style="text-align:right;"> 7 </td> <td style="text-align:right;"> 4.2 </td> <td style="text-align:right;"> 0 </td> </tr> <tr> <td style="text-align:left;"> ID149 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 2.9 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> ID150 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 3.6 </td> <td style="text-align:right;"> 0 </td> </tr> </tbody> </table></div>
]

---
# LM with binary predictors

+ In previous lectures we asked the question:
  + **Do students who study more get higher scores on the test?**
+ And we specified a linear model:

`$$y_i = \beta_0 + \beta_1 x_{i} + \epsilon_i$$`

+ Or

`$$score_i = \beta_0 + \beta_1 hours_{i} + \epsilon_i$$`

--

+ And nothing changes with our binary variable. We can ask the question:
  + **Do students who study with others score better than students who study alone?**

`$$score_i = \beta_0 + \beta_1 study_{i} + \epsilon_i$$`

---
# In `R`

```r
res <- lm(score ~ study, data = df)
summary(res)
```

```
## 
## Call:
## lm(formula = score ~ study, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8333 -0.8333  0.1667  0.7778  2.1667 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.2222     0.1076  48.552  < 2e-16 ***
## study         0.6111     0.1492   4.097 6.87e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9127 on 148 degrees of freedom
## Multiple R-squared:  0.1019, Adjusted R-squared:  0.0958 
## F-statistic: 16.79 on 1 and 148 DF,  p-value: 6.866e-05
```

---
# Interpretation

.pull-left[
+ As before, the intercept `\(\hat \beta_0\)` is the expected value of `\(y\)` when `\(x=0\)`
+ What is `\(x=0\)` here?
  + It is the students who study alone.
+ So what about `\(\hat \beta_1\)`?
+ **Look at the output on the right hand side.**
+ What do you notice about the difference in averages?
]

.pull-right[

```r
df %>%
  group_by(., study) %>%
  summarise(
    Average = round(mean(score), 4)
  )
```

```
## # A tibble: 2 x 2
##   study Average
##   <dbl>   <dbl>
## 1     0    5.22
## 2     1    5.83
```
]

---
# Interpretation

+ `\(\hat \beta_0\)` = expected value of `\(y\)` when `\(x = 0\)`
  + Or, the mean of the group coded 0 (those who study alone)
+ `\(\hat \beta_1\)` = predicted difference between the means of the two groups.
  + Group 1 - Group 0 (mean `score` for those who study with others - mean `score` of those who study alone)
+ Notice how this maps to our question.
  + Do students who study with others score better than students who study alone?

---
class: center, middle
# Time for a break

**Have a go at the binary variable quiz**

---
class: center, middle
# Welcome Back!

**Where we left off...**

Let's think about the interpretation by group a little more.

---
# Equations for each group

+ What would our linear model look like if we substituted in the values for `\(x\)`?
`$$\widehat{score} = \hat \beta_0 + \hat \beta_1 study$$`

+ For those who study alone ( `\(study = 0\)` ):

`$$\widehat{score}_{alone} = \hat \beta_0 + \hat \beta_1 \times 0$$`

+ So:

`$$\widehat{score}_{alone} = \hat \beta_0$$`

---
# Equations for each group

+ For those who study with others ( `\(study = 1\)` ):

`$$\widehat{score}_{others} = \hat \beta_0 + \hat \beta_1 \times 1$$`

+ So:

`$$\widehat{score}_{others} = \hat \beta_0 + \hat \beta_1$$`

+ And if we rearrange:

`$$\hat \beta_1 = \widehat{score}_{others} - \hat \beta_0$$`

+ Remembering that `\(\widehat{score}_{alone} = \hat \beta_0\)`, we finally obtain:

`$$\hat \beta_1 = \widehat{score}_{others} - \widehat{score}_{alone}$$`

---
# Visualize the model

<img src="dapR2_lec07_LMcategorical_files/figure-html/unnamed-chunk-6-1.png" width="504" />

---
# Visualize the model

<img src="dapR2_lec07_LMcategorical_files/figure-html/unnamed-chunk-7-1.png" width="504" />

---
# Visualize the model

<img src="dapR2_lec07_LMcategorical_files/figure-html/unnamed-chunk-8-1.png" width="504" />

---
# Visualize the model

<img src="dapR2_lec07_LMcategorical_files/figure-html/unnamed-chunk-9-1.png" width="504" />

---
# Evaluation of model and significance of `\(\beta_1\)`

+ `\(R^2\)` and `\(F\)`-ratio interpretation are identical to their interpretation in models with only continuous predictors.
+ And we assess the significance of predictors in the same way
+ We calculate `\(\hat \beta_1\)` = difference between groups
+ We use the standard error of the coefficient to construct:
  + A `\(t\)`-value and associated `\(p\)`-value for the coefficient
  + Or a confidence interval around the coefficient

---
# Hold on... it's a t-test

```r
df %>%
  t.test(score ~ study, .)
```

```
## 
##  Welch Two Sample t-test
## 
## data:  score by study
## t = -4.0883, df = 145.39, p-value = 7.163e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.9065440 -0.3156782
## sample estimates:
## mean in group 0 mean in group 1 
##        5.222222        5.833333
```

???

Yup!

---
# Standardizing `\(\hat \beta_1\)`

+ When discussing continuous predictors we discussed standardized `\(b_1\)` and unstandardized `\(\hat \beta_1\)`
+ Recall, when we calculated `\(b_1\)` we used the SD of x, `\(s_x\)`.
+ For a binary categorical variable, the SD is not appropriate.
  + 1 unit also has meaning - it is the membership of a different group.
+ As such, we do not standardize

---
# Summary of today

+ Recapped categorical and binary variables
+ Introduced the linear model with a single binary variable
+ Considered the interpretation of the coefficients
+ And saw that it is equivalent to an independent-samples `\(t\)`-test

---
# Next tasks

+ This week:
  + Complete your lab
  + Come to office hours
  + Weekly quiz: Assessed quiz - Week 3 content.
    + Open Monday 09:00
    + Closes Sunday 17:00
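
---
# Checking the equivalence yourself

The claims above (intercept = mean of the group coded 0, slope = difference in group means, and the link to the `\(t\)`-test) can be checked with a short sketch. This is not part of the original lecture code: it uses simulated data, and the names `sim`, `fit` and the assumed group difference of 0.6 are hypothetical, so the numbers will not match the slides. Note that `t.test()` runs the Welch test by default, which is why the slide output (t = -4.0883, df = 145.39) differs slightly from the `lm()` result (t = 4.097 on 148 df). Setting `var.equal = TRUE` requests the classical equal-variance test, which should agree with `lm()` up to the sign of `\(t\)`.

```r
# Hypothetical simulated data standing in for the lecture data set;
# values are illustrative only and will not match the slides.
set.seed(1)
n <- 150
study <- rep(c(0, 1), each = n / 2)          # 0 = alone, 1 = with others
score <- 5 + 0.6 * study + rnorm(n, sd = 1)  # assumed effect, for illustration
sim <- data.frame(score = score, study = study)

# Linear model with a single binary predictor
fit <- lm(score ~ study, data = sim)
b <- coef(fit)

# Group means computed directly (names are "0" and "1")
means <- tapply(sim$score, sim$study, mean)

# Intercept should equal the mean of the group coded 0
check_intercept <- all.equal(unname(b["(Intercept)"]), unname(means["0"]))

# Slope should equal mean(group 1) - mean(group 0)
check_slope <- all.equal(unname(b["study"]), unname(means["1"] - means["0"]))

# Equal-variance t-test: same |t| and p-value as the slope in lm()
tt <- t.test(score ~ study, data = sim, var.equal = TRUE)
lm_t <- summary(fit)$coefficients["study", "t value"]
lm_p <- summary(fit)$coefficients["study", "Pr(>|t|)"]
check_t <- all.equal(abs(unname(tt$statistic)), abs(lm_t))
check_p <- all.equal(tt$p.value, lm_p)

c(check_intercept, check_slope, check_t, check_p)
```

The sign of `\(t\)` differs between the two functions only because `t.test()` reports group 0 minus group 1, whereas `\(\hat \beta_1\)` is group 1 minus group 0.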