class: center, middle, inverse, title-slide

# F-tests & Standardization
## Data Analysis for Psychology in R 2
### dapR2 Team
### Department of Psychology
### The University of Edinburgh

---
# Week's Learning Objectives
1. Understand the use of `\(F\)` and incremental `\(F\)` tests.
2. Be able to run and interpret `\(F\)`-tests in R.
3. Understand how to standardize model coefficients and when this is appropriate to do.
4. Understand the relationship between the correlation coefficient and the regression slope.
5. Be able to calculate standardized coefficients in R.

---
# Where we left off...
+ Last week we looked at:
  + The significance of individual predictors
  + Overall model evaluation through `\(R^2\)` and adjusted `\(R^2\)` to see how much variance in the outcome has been explained

+ Today we will:
  + Look at significance tests of the overall model
  + Discuss how we can use the same tools to do incremental tests (how much does my model improve when I add variables?)
  + Interpret models based on standardized coefficients

---
# Significance of the overall model
+ The tests of the individual predictors (IVs, or `\(x\)`'s) do not tell us whether the overall model is significant.
  + Neither does `\(R^2\)`
  + But both are indicative

+ To test the significance of the model as a whole, we conduct an `\(F\)`-test.

---
# F-test & F-ratio
+ An `\(F\)`-test involves testing the statistical significance of a test statistic called (wait for it) the `\(F\)`-ratio.

+ The `\(F\)`-ratio tests the null hypothesis that all the regression slopes in a model are zero.

--

+ In other words, our predictors tell us nothing about our outcome.
  + They explain no variance.

--

+ If our predictors do explain some variance, our `\(F\)`-ratio will be significant.

---
# Our results (significant F)

```r
performance <- lm(score ~ hours + motivation, data = test_study2)
summary(performance)
```

```
## 
## Call:
## lm(formula = score ~ hours + motivation, data = test_study2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.9548  -2.8042  -0.2847   2.9344  13.8240 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.86679    0.65473  10.488   <2e-16 ***
## hours        1.37570    0.07989  17.220   <2e-16 ***
## motivation   0.91634    0.38376   2.388   0.0182 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.386 on 147 degrees of freedom
## Multiple R-squared:  0.6696, Adjusted R-squared:  0.6651 
## F-statistic: 148.9 on 2 and 147 DF,  p-value: < 2.2e-16
```
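---
# Extracting the F-statistic

+ As a side note, `summary()` stores the `\(F\)`-statistic and its degrees of freedom, so we can pull them out rather than reading them off the printout. A minimal sketch, assuming `performance` has been fitted as above:

```r
# summary.lm() returns the F-statistic as a named vector:
# the value, the numerator df, and the denominator df
fstat <- summary(performance)$fstatistic
fstat

# recompute the p-value reported at the bottom of the summary
pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)
```

+ This should reproduce the `F-statistic: 148.9 on 2 and 147 DF` line from the previous slide.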
---
# F-ratio: Some details
+ The `\(F\)`-ratio is a ratio of the explained to unexplained variance:

`$$F = \frac{\frac{SS_{model}}{df_{model}}}{\frac{SS_{residual}}{df_{residual}}} = \frac{MS_{Model}}{MS_{Residual}}$$`

+ Where MS = mean squares

--

+ **What are mean squares?**
  + Mean squares are sums of squares calculations divided by the associated degrees of freedom.
  + We saw how to calculate model and residual sums of squares last week.

+ But what are degrees of freedom...

---
# Degrees of freedom
+ The degrees of freedom are defined as the number of independent values associated with the different calculations.
  + Df are typically a combination of the amount of data you have (sample size) and the number of things you need to calculate/estimate.

+ **Residual degrees of freedom = n-k-1**
  + The `\(SS_{residual}\)` calculation is based on our model, in which we estimate k `\(\beta\)` terms (-k) and an intercept (-1)

+ **Total degrees of freedom = n-1**
  + The `\(SS_{total}\)` calculation is based on the observed `\(y_i\)` and `\(\bar{y}\)`
  + In order to estimate `\(\bar{y}\)`, all but one value of `\(y\)` is free to vary, hence n-1

+ **Model degrees of freedom = k**
  + `\(SS_{model}\)` depends on the k estimated `\(\beta\)` terms, hence k

---
# F-table

| Source   | df    | MS                        | F-ratio                | p-value                  |
|----------|-------|---------------------------|------------------------|--------------------------|
| Model    | k     | SS model / df model       | MS model / MS residual | F(df model, df residual) |
| Residual | n-k-1 | SS residual / df residual |                        |                          |
| Total    | n-1   |                           |                        |                          |

---
# Our example (note the df at the bottom)

```r
summary(performance)
```

```
## 
## Call:
## lm(formula = score ~ hours + motivation, data = test_study2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.9548  -2.8042  -0.2847   2.9344  13.8240 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.86679    0.65473  10.488   <2e-16 ***
## hours        1.37570    0.07989  17.220   <2e-16 ***
## motivation   0.91634    0.38376   2.388   0.0182 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.386 on 147 degrees of freedom
## Multiple R-squared:  0.6696, Adjusted R-squared:  0.6651 
## F-statistic: 148.9 on 2 and 147 DF,  p-value: < 2.2e-16
```

---
# F-ratio
+ Bigger `\(F\)`-ratios indicate better models.
  + It means the model variance is big compared to the residual variance.

--

+ The null hypothesis for the model says that the best guess of any individual's `\(y\)` value is the mean of `\(y\)` plus error.
  + Or, that the `\(x\)` variables carry no information collectively about `\(y\)`.
  + I.e. the slopes all = 0

--

+ The `\(F\)`-ratio will be close to 1 when the null hypothesis is true
  + If the model and residual variation are equivalent, `\(F\)` = 1
  + If there is more model than residual variation, `\(F\)` > 1

---
# Testing the significance of `\(F\)`
+ The `\(F\)`-ratio is our test statistic for the significance of our model.
  + As with all statistical inferences, we would select an `\(\alpha\)` level.
  + Calculate the critical value of `\(F\)` associated with this.
  + And then compare our value to the critical value.

+ The `\(F\)`-ratio is then evaluated against an `\(F\)`-distribution with `\(df_{Model}\)` and `\(df_{Residual}\)` degrees of freedom and a pre-defined `\(\alpha\)`

+ This provides us with the test of the overall model.

---
# Visualize the test
.pull-left[
<img src="dapr2_04_testinglm2_files/figure-html/unnamed-chunk-6-1.png" width="504" />
]

.pull-right[
+ Critical value and `\(p\)`-value:

```r
tibble(
  Crit = round(qf(0.95, 2, 147), 3),
  Exactp = 1 - pf(148.9, 2, 147)
)
```

```
## # A tibble: 1 x 2
##    Crit Exactp
##   <dbl>  <dbl>
## 1  3.06      0
```

+ From this we would **reject the null**.
]
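---
# The overall F-test as a model comparison

+ A useful way to think about the overall `\(F\)`-test (and a preview of incremental `\(F\)`-tests): it compares our model against an intercept-only (null) model. A minimal sketch, not part of the original materials, assuming `test_study2` and `performance` are available:

```r
# the null model: predict every score from the mean of y alone
m0 <- lm(score ~ 1, data = test_study2)

# compare the null model to our two-predictor model; the F-ratio
# from this comparison is the overall model F-test
anova(m0, performance)
```

+ The `\(F\)`-value returned here should match the 148.9 from `summary(performance)`.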
---
class: center, middle
# Time for a rest

**Any questions....**

---
# Unstandardized vs standardized coefficients
+ So far we have calculated unstandardized `\(\hat \beta_1\)`.
  + We interpreted the slope as the change in `\(y\)` units for a unit change in `\(x\)`
  + Where the unit is determined by how we have measured our variables.

+ In our running example:
  + A unit of study time is 1 hour.
  + A unit of test score is 1 point.

+ However, sometimes we may want to represent our results in standard units.

---
# Standardized units
+ Why might standard units be useful?

--

+ **If the scales of our variables are arbitrary.**
  + Example: a sum score of questionnaire items answered on a Likert scale.
  + A unit here would equal moving from a 2 to a 3 on one item.
  + This is not especially meaningful (and actually carries a lot of associated assumptions)

--

+ **If we want to compare the effects of variables on different scales**
  + If we want to say something like "the effect of `\(x_1\)` is stronger than the effect of `\(x_2\)`", we need a common scale.

---
# Standardizing the coefficients
+ After calculating `\(\hat \beta_1\)`, it can be standardized by:

`$$\hat{\beta_1^*} = \hat \beta_1 \frac{s_x}{s_y}$$`

+ where:
  + `\(\hat{\beta_1^*}\)` = standardized beta coefficient
  + `\(\hat \beta_1\)` = unstandardized beta coefficient
  + `\(s_x\)` = standard deviation of `\(x\)`
  + `\(s_y\)` = standard deviation of `\(y\)`

---
# Standardizing the variables
+ Alternatively, for continuous variables, transforming both the IV and DV to `\(z\)`-scores (mean = 0, SD = 1) prior to fitting the model yields standardized betas.

+ `\(z\)`-score for `\(x\)`:

`$$z_{x_i} = \frac{x_i - \bar{x}}{s_x}$$`

+ and the `\(z\)`-score for `\(y\)`:

`$$z_{y_i} = \frac{y_i - \bar{y}}{s_y}$$`

+ That is, we divide the individual deviations from the mean by the standard deviation.

---
# Two approaches in action

```r
summary(performance)$coefficients
```

```
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 6.8667872  0.6547299 10.487969 1.465732e-19
## hours       1.3756983  0.0798881 17.220316 4.485905e-37
## motivation  0.9163386  0.3837558  2.387817 1.821929e-02
```

```r
*round(1.3756983 * (sd(test_study2$hours)/sd(test_study2$score)), 3)
```

```
## [1] 0.819
```

---
# Two approaches in action

```r
test_study2 <- test_study2 %>%
  mutate(
    z_score = scale(score, center = T, scale = T),
    z_hours = scale(hours, center = T, scale = T),
    z_motivation = scale(motivation, center = T, scale = T)
  )

*performance_z <- lm(z_score ~ z_hours + z_motivation, data = test_study2)
round(summary(performance_z)$coefficients, 3)
```

```
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)     0.000      0.047   0.000    1.000
## z_hours         0.819      0.048  17.220    0.000
## z_motivation    0.114      0.048   2.388    0.018
```

---
# Interpreting standardized regression coefficients
+ `\(R^2\)`, `\(F\)` and `\(t\)`-tests remain the same for the standardized coefficients as for unstandardized coefficients.

+ `\(b_0\)` (intercept) = zero when all variables are standardized:

`$$\bar{y} - \hat \beta_1 \bar{x} = 0 - \hat \beta_1 \cdot 0 = 0$$`

+ The interpretation of the coefficients becomes the increase in `\(y\)` in standard deviation units for every standard deviation increase in `\(x\)`

+ So, in our example:

> **For every standard deviation increase in hours of study, test score increases by 0.82 standard deviations**
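---
# Checking the two approaches agree

+ A quick sanity check (a sketch, not from the original materials): apply the `\(\hat \beta_1 \frac{s_x}{s_y}\)` formula to both slopes at once and compare with the `\(z\)`-score model above:

```r
# unstandardized slopes (drop the intercept)
b <- coef(performance)[-1]

# SDs of the predictors and of the outcome
sx <- sapply(test_study2[, c("hours", "motivation")], sd)
sy <- sd(test_study2$score)

# standardized slopes: should match the z-score model (0.819 and 0.114)
round(b * sx / sy, 3)
```

+ Both routes give the same standardized slopes, so use whichever is more convenient.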
---
# Which should we use?
+ Unstandardized regression coefficients are often more useful when the variables are on meaningful scales.
  + E.g. X additional hours of exercise per week adds Y years of healthy life.

+ Sometimes it is useful to obtain standardized regression coefficients:
  + When the scales of variables are arbitrary.
  + When there is a desire to compare the effects of variables measured on different scales.

+ Cautions:
  + Just because you can put regression coefficients on a common metric doesn't mean they can be meaningfully compared.
  + The SD is a poor measure of spread for skewed distributions; therefore, be cautious about using standardized coefficients with skewed variables.

---
# Relationship to correlation ( `\(r\)` )
+ The standardized slope ( `\(\hat \beta_1^*\)` ) = the correlation coefficient ( `\(r\)` ) for a linear model with a single continuous predictor.

+ For example:

```r
round(lm(z_score ~ z_hours, data = test_study2)$coefficients, 2)
```

```
## (Intercept)     z_hours 
##        0.00        0.81
```

```r
round(cor(test_study2$hours, test_study2$score), 2)
```

```
## [1] 0.81
```

---
# Relationship to correlation ( `\(r\)` )
+ They are the same:
  + `\(r\)` is a standardized measure of linear association
  + `\(\hat \beta_1^*\)` is a standardized measure of the linear slope

+ Something similar is true for linear models with multiple predictors.
  + Here the standardized slopes are closely related to the *part correlation coefficient*
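---
# Part correlation: a quick look

+ For the curious, a sketch (not from the original materials) of how the part (semi-partial) correlation can be computed for comparison with the standardized slope: first remove `motivation` from `hours`, then correlate what remains with the outcome.

```r
# residualize hours on the other predictor
hours_resid <- resid(lm(hours ~ motivation, data = test_study2))

# part correlation of hours with score
round(cor(test_study2$score, hours_resid), 2)
```

+ Comparing this value with the standardized slope for `z_hours` shows how closely the two quantities track each other in a multiple-predictor model.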
---
# Writing up Results
+ What should we include:
  + A full, formatted results table (`tab_model`)
  + A comment on the overall model ( `\(F\)`-test, `\(R^2\)`, adjusted `\(R^2\)` )
  + A comment on the significance of key `\(\beta\)` coefficients
  + An interpretation of those `\(\beta\)` coefficients

+ We will also need to comment on our model checks (but we won't talk about how to do these until week 9)

???
Perhaps look at some examples of published papers

---
# `tab_model` (very basic table)

```r
library(sjPlot)
tab_model(performance)
```

Outcome: **score**

| Predictors | Estimates | CI | p |
|---|---|---|---|
| (Intercept) | 6.87 | 5.57 – 8.16 | **<0.001** |
| hours | 1.38 | 1.22 – 1.53 | **<0.001** |
| motivation | 0.92 | 0.16 – 1.67 | **0.018** |
| Observations | 150 | | |
| R<sup>2</sup> / R<sup>2</sup> adjusted | 0.670 / 0.665 | | |

---
# Pause to summarise what we know so far
+ So far we have seen how to:
  + run a linear model with a single predictor
  + extend this and add predictors
  + interpret these coefficients either in original units or standardized units
  + test the significance of `\(\beta\)` coefficients
  + test the significance of the overall model
  + estimate the amount of variance explained by our model

+ Short version: well done, you can now run and interpret linear models with continuous predictors.

+ We have a fair bit more to discuss in this course, but this is a great start!

+ Next week we will look at categorical predictor variables.

---
class: center, middle
# Thanks for listening!