class: center, middle, inverse, title-slide .title[ #
F-tests & Model Comparison
] .subtitle[ ## Data Analysis for Psychology in R 2
] .author[ ### dapR2 Team ] .institute[ ### Department of Psychology
The University of Edinburgh ] --- # Course Overview .pull-left[ <table style="border: 1px solid black;"> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1;text-align:center;vertical-align: middle"> <b>Introduction to Linear Models</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Intro to Linear Regression</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Interpreting Linear Models</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> Testing Individual Predictors</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:1"> <b>Model Testing & Comparison</b></td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Linear Model Analysis</td></tr> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Analysing Experimental Studies</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Categorical Predictors & Dummy Coding</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Effects Coding & Coding Specific Contrasts</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Assumptions & Diagnostics</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Bootstrapping</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Categorical Predictor Analysis</td></tr> </table> ] .pull-right[ <table style="border: 1px solid black;"> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Interactions</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interactions I</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interactions II</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interactions III</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Analysing Experiments</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Interaction Analysis</td></tr> <tr style="padding: 0 1em 0 1em;"> <td rowspan="5" style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4;text-align:center;vertical-align: middle"> <b>Advanced Topics</b></td> <td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Power Analysis</td> </tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Binary Logistic Regression I</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Binary Logistic Regression II</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Logistic Regression Analysis</td></tr> <tr><td style="border: 1px solid black;padding: 0 1em 0 1em;opacity:0.4"> Exam Prep and Course Q&A</td></tr> </table> ] --- # This Week's Learning Objectives 1. Understand the use of `\(F\)` and incremental `\(F\)` tests 2. Be able to run and interpret `\(F\)`-tests in R 3. Understand how to use model comparisons to test different types of questions 4. 
Understand the difference between nested and non-nested models, and the appropriate statistics to use for comparison in each case --- class: inverse, center, middle # Part 1: Recap and `\(F\)`-tests --- # Where we left off... + Last week we looked at: + The significance of individual predictors + Overall model evaluation through `\(R^2\)` and adjusted `\(R^2\)` to see how much variance in the outcome has been explained + Today we will: + Look at significance tests of the overall model + Discuss how we can use the same tools to do incremental tests (how much does my model improve when I add variables) --- # Statistical significance of the overall model + Does our combination of `\(x\)`'s significantly improve prediction of `\(y\)`, compared to not having any predictors? -- + Some indications that the model might be significant: + Slopes for individual predictors associated with significant `\(p\)`-values + High `\(R^2\)` + But these do not directly show model significance -- + To test the significance of the model as a whole, we conduct an `\(F\)`-test --- # `\(F\)`-test & `\(F\)`-ratio + An `\(F\)`-test involves testing the statistical significance of a test statistic called the `\(F\)`-ratio (also called `\(F\)`-statistic) + The `\(F\)`-ratio tests the null hypothesis that all the regression slopes in a model are zero -- + In other words, our predictors tell us nothing about our outcome + They explain no variance -- + The more variance our predictors explain, the bigger our `\(F\)`-ratio + As with `\(t\)`-values and the `\(t\)`-distribution, we compare the `\(F\)`-statistic to the `\(F\)`-distribution to obtain a `\(p\)`-value --- # Our results (significant `\(F\)`) ``` r performance <- lm(score ~ hours + motivation, data = test_study2); summary(performance) ``` ``` ## ## Call: ## lm(formula = score ~ hours + motivation, data = test_study2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -12.9548 -2.8042 -0.2847 2.9344 13.8240 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 6.86679 0.65473 10.488 <2e-16 *** ## hours 1.37570 0.07989 17.220 <2e-16 *** ## motivation 0.91634 0.38376 2.388 0.0182 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 4.386 on 147 degrees of freedom ## Multiple R-squared: 0.6696, Adjusted R-squared: 0.6651 ## F-statistic: 148.9 on 2 and 147 DF, p-value: < 2.2e-16 ``` --- # F-ratio: Some details + `\(F\)`-ratio is a ratio of the explained to unexplained variance: `$$F = \frac{\frac{SS_{model}}{df_{model}}}{\frac{SS_{residual}}{df_{residual}}} = \frac{MS_{Model}}{MS_{Residual}}$$` + Where MS = mean squares -- + **What are mean squares?** + Mean squares are sums of squares calculations divided by the associated degrees of freedom + We saw how to calculate model and residual sums of squares last week + But what are model and residual degrees of freedom? 
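---
# The `\(F\)`-ratio by hand

+ Before we unpack the degrees of freedom, a minimal sketch (assuming the `performance` model fitted above) showing that the `\(F\)`-statistic reported by `summary()` is simply `\(MS_{Model} / MS_{Residual}\)` :

``` r
# anova() gives the SS and df for each predictor, plus a Residuals row
aov_tab <- anova(performance)
ss  <- aov_tab$`Sum Sq`
dfs <- aov_tab$Df

# MS_model pools SS and df across the two predictor rows;
# the final row of the table is the residual term
ms_model <- sum(ss[1:2]) / sum(dfs[1:2])
ms_resid <- ss[3] / dfs[3]

ms_model / ms_resid  # should reproduce the F-statistic of 148.9
```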
--- # Degrees of freedom + The degrees of freedom are defined as the number of independent values associated with the different calculations + Conceptually, how many values in the calculation can vary if we keep the outcome of the calculation fixed -- + `\(df\)` are typically linked to: + the amount of data you have (sample size, `\(n\)`) + and the number of things you need to calculate/estimate based on that data (in our case the number of `\(\beta\)`s) --- # Degrees of freedom + **Model degrees of freedom = `\(k\)` ** + `\(SS_{model}\)` depends on the estimated `\(\beta\)`s, of which there are `\(k + 1\)` (the number of predictors plus the intercept) + From `\(k + 1\)`, we subtract `\(1\)` as not all the estimates can vary while holding the outcome constant + This gives us `\(k\)` for Model `\(df\)` -- + **Residual degrees of freedom = `\(n-k-1\)` ** + The `\(SS_{residual}\)` calculation is based on our individual data points and our model (in which we estimate `\(k + 1\)` `\(\beta\)` terms, i.e. the slopes and an intercept) + For each coefficient estimated, we lose a degree of freedom, as we're fitting the model to the data and reducing the flexibility in how much the residuals (errors) can vary -- + **Total degrees of freedom = `\(n-1\)` ** + The `\(SS_{total}\)` calculation is based on the observed `\(y_i\)` and `\(\bar{y}\)` + In order to estimate `\(\bar{y}\)` , all but one value of `\(y\)` are free to vary, hence `\(n-1\)` --- # Our example (note the `\(df\)` at the bottom) ``` r summary(performance) ``` ``` ## ## Call: ## lm(formula = score ~ hours + motivation, data = test_study2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -12.9548 -2.8042 -0.2847 2.9344 13.8240 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 6.86679 0.65473 10.488 <2e-16 *** ## hours 1.37570 0.07989 17.220 <2e-16 *** ## motivation 0.91634 0.38376 2.388 0.0182 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 4.386 on 147 degrees of freedom ## Multiple R-squared: 0.6696, Adjusted R-squared: 0.6651 ## F-statistic: 148.9 on 2 and 147 DF, p-value: < 2.2e-16 ``` --- # `\(F\)`-ratio + Bigger `\(F\)`-ratios indicate better-fitting models + It means the variance explained by the model is big compared to the residual variance -- + `\(H_0\)` for the model says that the best guess of any individual's `\(y\)` value is `\(\bar{y}\)` (plus error) + Or, that the `\(x\)` variables collectively carry no information about `\(y\)` + All slopes = 0 -- `$$F = \frac{MS_{Model}}{MS_{Residual}}$$` + `\(F\)`-ratio will be close to 1 when `\(H_0\)` is true + If model and residual variation are equivalent ( `\(MS_{model} = MS_{residual}\)` ), then `\(F\)` = 1 + If there is more model than residual variation, then `\(F\)` > 1 --- # Testing the significance of `\(F\)` + The `\(F\)`-ratio is our test statistic for the significance of our model + As with all statistical inferences, we first select an `\(\alpha\)` level + We then identify the proper null `\(F\)`-distribution and calculate the critical value of `\(F\)` associated with our chosen `\(\alpha\)` + We compare our `\(F\)`-statistic to this critical value 
+ If our value is more extreme than the critical value, it is considered significant --- # Sampling distribution for the null .pull-left[ + Similar to the `\(t\)`-distribution, the `\(F\)`-distribution changes shape based on `\(df\)` + With an `\(F\)`-statistic, we have to consider both the `\(df_{model}\)` and `\(df_{residual}\)` + In parentheses, `\(df_{model}\)` is shown before `\(df_{residual}\)` ] .pull-right[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-7-1.png" width="504" /> ] --- # A decision about the null + We have an `\(F\)`-statistic (from our model output summary): + `\(F = 148.9\)` -- + We consider `\(df_{model}\)` and `\(df_{residual}\)` to get our null distribution: + `\(df_{model}=k=2\)` + `\(df_{residual}=n-k-1=150-2-1=147\)` -- + We need to set our `\(\alpha\)` level + `\(\alpha = .05\)` -- + Now we can compute our critical value for `\(F\)` --- # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-8-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) ] --- count: false # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-9-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) + Our critical value (using the `qf` function) ``` r (Crit = round(qf(0.95, 2, 147), 3)) ``` ``` ## [1] 3.058 ``` ] --- count: false # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-11-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) + Our critical value (using the `qf` function) ``` r (Crit = round(qf(0.95, 2, 147), 3)) ``` ``` ## [1] 3.058 ``` + We can calculate the probability of an `\(F\)`-statistic at least as extreme as ours, given `\(H_0\)` is true (our `\(p\)`-value): ``` r (pVal = 1-pf(148.9, 2, 147)) ``` ``` ## [1] 0 ``` ] --- count: false # Visualise the test .pull-left[ <img src="dapr2_04_ftests_files/figure-html/unnamed-chunk-14-1.png" width="504" /> ] .pull-right[ + `\(F\)`-distribution with 2 `\(df_{model}\)` and 147 `\(df_{residual}\)` (our null distribution) + Our critical value (using the `qf` function) ``` r (Crit = round(qf(0.95, 2, 147), 3)) ``` ``` ## [1] 3.058 ``` + We can calculate the probability of an `\(F\)`-statistic at least as extreme as ours, given `\(H_0\)` is true (our `\(p\)`-value): ``` r (pVal = 1-pf(148.9, 2, 147)) ``` ``` ## [1] 0 ``` + Our model significantly predicted the variance in test score, `\(F(2,147) = 148.90, p < .001\)` ] --- class: center, middle # Questions? --- class: inverse, center, middle # Part 2: Model Comparison & Incremental `\(F\)`-tests --- # Model comparisons + So far, our questions have been _is our overall model better than nothing?_ ( `\(F\)`-test ) or _which variables, specifically, are good predictors of the outcome variable?_ ( `\(t\)`-tests of `\(\beta\)` estimates ) -- + But what if instead we wanted to ask: > **When I make a change to my model, does it improve or not?** + This question is the core of model comparison -- + We can adapt this to our models in a more specific way: + E.g. is a model with `\(x_1\)`, `\(x_2\)`, and `\(x_3\)` as predictors better than the model with just `\(x_1\)`? 
-- + So far: + We have tested individual predictors + and we have tested overall models + **but we have not tested the improvement when we add predictors** + We have not looked at the combined performance of a _subset_ of predictors --- # `\(F\)`-test as an incremental test + One important way we can think about the `\(F\)`-test and the `\(F\)`-ratio is as an incremental test against an "empty" or null model + A null or empty model is a linear model with only the intercept + In this model, our predicted value of the outcome for every case in our data set is the mean of the outcome ( `\(\bar{y}\)`) + That is, with no predictors, we have no information that may help us predict the outcome + So we will be "least wrong" by guessing the mean of the outcome + An empty model is the same as saying all `\(\beta\)` = 0 + And remember, this was the null hypothesis of the `\(F\)`-test -- + So in this way, the `\(F\)`-test can be seen as **comparing two models** -- + We can extend this idea and use the `\(F\)`-test to compare two models that contain fewer or more predictors + This is the **incremental `\(F\)`-test** --- # Incremental `\(F\)`-test .pull-left[ + The incremental `\(F\)`-test evaluates the statistical significance of the improvement in variance explained in an outcome with the addition of further predictor(s) + It is based on the difference in the residual sums of squares of the two models + We call the model with the additional predictor(s) the **full model** + We call the model without additional predictors the **restricted model** ] .pull-right[ `$$F_{(df_R-df_F),df_F} = \frac{(SSR_R-SSR_F)/(df_R-df_F)}{SSR_F / df_F}$$` $$ `\begin{align} & \text{Where:} \\ & SSR_R = \text{residual sums of squares for the restricted model} \\ & SSR_F = \text{residual sums of squares for the full model} \\ & df_R = \text{residual degrees of freedom from the restricted model} \\ & df_F = \text{residual degrees of freedom from the full model} \\ \end{align}` $$ ] --- # Example of model comparison + Consider this example based on data from the Midlife in the United States (MIDUS2) study: + Outcome: self-rated health + Covariates: Age, sex + Predictors: Big Five traits and Purpose in Life + Research Question: Does personality predict self-rated health over and above age and sex? --- # The data ``` r midus <- read_csv("data/MIDUS2.csv") midus2 <- midus %>% select(1:4, 31:42) %>% mutate( PIL = rowMeans(.[grep("PIL", names(.))],na.rm=T) ) %>% select(1:4, 12:17) %>% drop_na(.) slice(midus2, 1:3) ``` ``` ## # A tibble: 3 × 10 ## ID age sex health O C E A N PIL ## <dbl> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> ## 1 10002 69 MALE 8 2.14 2.8 2.6 3.4 2 5.86 ## 2 10019 51 MALE 8 3.14 3 3.4 3.6 1.5 5.71 ## 3 10023 78 FEMALE 4 3.57 3.4 3.6 4 1.75 5.14 ``` --- # The models + Does personality significantly predict self-rated health over and above the effects of age and sex? + The first step here is to run the two models: + M1: We predict from age and sex + M2: We add in the FFM (personality) traits ``` r m1 <- lm(health ~ age + sex, data = midus2) ``` ``` r m2 <- lm(health ~ age + sex + O + C + E + A + N, data = midus2) ``` --- # Model 1 output (age + sex) ``` ## ## Call: ## lm(formula = health ~ age + sex, data = midus2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -7.3057 -1.0496 0.5796 0.8533 2.9769 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 7.755974 0.183649 42.232 < 2e-16 *** ## age -0.008829 0.003117 -2.833 0.00467 ** ## sexMALE 0.035288 0.078619 0.449 0.65359 ## --- ## Signif. 
codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 1.642 on 1758 degrees of freedom ## Multiple R-squared: 0.004627, Adjusted R-squared: 0.003494 ## F-statistic: 4.086 on 2 and 1758 DF, p-value: 0.01697 ``` --- # Model 2 output (age + sex + personality) ``` ## ## Call: ## lm(formula = health ~ age + sex + O + C + E + A + N, data = midus2) ## ## Residuals: ## Min 1Q Median 3Q Max ## -6.7723 -0.7921 0.2532 1.0097 3.9550 ## ## Coefficients: ## Estimate Std. Error t value Pr(>|t|) ## (Intercept) 6.66172 0.45100 14.771 < 2e-16 *** ## age -0.01310 0.00298 -4.396 1.17e-05 *** ## sexMALE -0.09571 0.07955 -1.203 0.229 ## O 0.09308 0.08306 1.121 0.263 ## C 0.57147 0.08507 6.717 2.49e-11 *** ## E 0.56771 0.08061 7.043 2.70e-12 *** ## A -0.40380 0.09025 -4.474 8.15e-06 *** ## N -0.56493 0.06189 -9.128 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## Residual standard error: 1.521 on 1753 degrees of freedom ## Multiple R-squared: 0.1484, Adjusted R-squared: 0.145 ## F-statistic: 43.65 on 7 and 1753 DF, p-value: < 2.2e-16 ``` --- # Incremental `\(F\)`-test in R + The second step is to compare the two models with an incremental `\(F\)`-test + In order to apply the `\(F\)`-test for model comparison in R, we use the `anova()` function + `anova()` takes as its arguments the models that we wish to compare + Here we see an example with 2 models, but we could use more ``` r anova(m1, m2) ``` --- # Incremental `\(F\)`-test in R ``` r anova(m1, m2) ``` ``` ## Analysis of Variance Table ## ## Model 1: health ~ age + sex ## Model 2: health ~ age + sex + O + C + E + A + N ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 1758 4740.2 ## 2 1753 4055.4 5 684.85 59.208 < 2.2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` --- class: center, middle # Questions? --- class: inverse, center, middle # Part 3: Non-nested models and alternatives to `\(F\)`-tests --- # Nested vs non-nested models + The `\(F\)`-ratio depends on the comparison models being nested + Nested means that the predictors in one model are a subset of the predictors in the other + We also require the models to be computed on the same data + Be careful when the data contain NA's + The `lm` function excludes the whole row of data if the `\(y\)` or any of the `\(x\)`'s specified in the model is missing on that row + If the additional variables you add to the full model have NA's, the data sets used by the two models could end up different! -- > **You can only do an `\(F\)`-test if the models are nested: the variables are nested and the data are identical** --- # Nested vs non-nested models Assuming no NA's in `data`: .pull-left[ **Nested** ``` r m0 <- lm(outcome ~ x1 + x2 , data = data) m1 <- lm(outcome ~ x1 + x2 + x3, data = data) ``` + These models are nested + `x1` and `x2` appear in both models ] .pull-right[ **Non-nested** ``` r m0 <- lm(outcome ~ x1 + x2 + x4, data = data) m1 <- lm(outcome ~ x1 + x2 + x3, data = data) ``` + These models are non-nested + There are variables unique to each model + `x4` in `m0` + `x3` in `m1` ] --- # Model comparison for non-nested models + So what happens when we have non-nested models? 
+ There are two commonly used alternatives: + AIC + BIC + Unlike the incremental `\(F\)`-test, AIC and BIC do not require two models to be nested + Smaller (more negative) values indicate better-fitting models + So we compare values and choose the model with the smaller AIC or BIC value --- # AIC & BIC .pull-left[ `$$AIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + 2k$$` $$ `\begin{align} & \text{Where:} \\ & SS_{residual} = \text{sum of squares residuals} \\ & n = \text{sample size} \\ & k = \text{number of explanatory variables} \\ & \text{ln} = \text{natural log function} \end{align}` $$ ] .pull-right[ `$$BIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + k\,\text{ln}(n)$$` $$ `\begin{align} & \text{Where:} \\ & SS_{residual} = \text{sum of squares residuals} \\ & n = \text{sample size} \\ & k = \text{number of explanatory variables} \\ & \text{ln} = \text{natural log function} \end{align}` $$ ] --- # Parsimony corrections + Both AIC and BIC contain something called a parsimony correction + In essence, they penalise models for being complex + This is to help us avoid overfitting (adding predictors arbitrarily to improve fit) `$$AIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + 2k$$` `$$BIC = n\,\text{ln}\left( \frac{SS_{residual}}{n} \right) + k\,\text{ln}(n)$$` + For the sample sizes typical of linear model applications, BIC has a harsher parsimony penalty than AIC + Whenever `\(\text{ln}(n) > 2\)` , i.e. whenever `\(n \geq 8\)` , BIC penalises complexity more severely (so essentially all the time!) --- # In R + Let's use AIC and BIC on our `m1` and `m2` models from previously: .pull-left[ ``` r AIC(m1, m2) ``` ``` ## df AIC ## m1 4 6749.246 ## m2 9 6484.457 ``` ] .pull-right[ ``` r BIC(m1, m2) ``` ``` ## df BIC ## m1 4 6771.141 ## m2 9 6533.719 ``` ] --- # Let's consider a different example + Our previous models were nested + `m1` had just covariates + `m2` added personality + Using the same data, let's consider a non-nested example + Suppose we want to compare a model that: + predicts self-rated health from just the 5 personality variables (`nn1`: non-nested model 1) + to a model that predicts from age, sex, and a variable called Purpose in Life (PIL) (`nn2`) --- # Applied to non-nested models ``` r nn1 <- lm(health ~ O + C + E + A + N, data=midus2) nn2 <- lm(health ~ age + sex + PIL, data = midus2) ``` ``` r AIC(nn1, nn2) ``` ``` ## df AIC ## nn1 7 6501.524 ## nn2 5 6564.953 ``` ``` r BIC(nn1, nn2) ``` ``` ## df BIC ## nn1 7 6539.840 ## nn2 5 6592.321 ``` --- # Considerations for use of AIC and BIC + AIC and BIC can be used for both nested and non-nested models -- + The AIC and BIC for a single model are not meaningful + They only make sense for model comparisons + We evaluate these comparisons by looking at the difference, `\(\Delta\)`, between the two values -- + There are no specific thresholds for `\(\Delta AIC\)` to suggest how big a difference in two models is needed to conclude that one is substantively better than the other -- + The following `\(\Delta BIC\)` cutoffs have been suggested (Raftery, 1995): | Value | Interpretation | |-------------------|---------------------------------------------------| | `\(\Delta < 2\)` | No evidence of difference between models | | `\(2 < \Delta < 6\)` | Positive evidence of difference between models | | `\(6 < \Delta < 10\)` | Strong evidence of difference between models | | `\(\Delta > 10\)` | Very strong evidence of difference between models | --- class: center, middle # Questions? 
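---
# AIC & BIC by hand (a quick check)

+ A minimal sketch, assuming the `m1` and `m2` models fitted earlier; `aic_hand()` is a small helper defined here purely for illustration
+ Note that R's `AIC()` and `BIC()` are computed from the model log-likelihood, so their values differ from the formulas above by a constant; that constant depends only on `\(n\)`, so the `\(\Delta\)` between two models fitted to the same data is the same either way

``` r
# n * ln(SS_residual / n) + 2k, as in the AIC formula above
aic_hand <- function(model, k){
  n  <- length(residuals(model))
  ss <- sum(residuals(model)^2)
  n * log(ss / n) + 2 * k
}

# k = number of explanatory variables: 2 for m1, 7 for m2
aic_hand(m2, 7) - aic_hand(m1, 2)  # should match AIC(m2) - AIC(m1)
AIC(m2) - AIC(m1)                  # approx. -264.8, from the output above

# Delta-BIC for the Raftery (1995) table: here it is well over 10,
# i.e. very strong evidence of a difference, favouring m2 (smaller BIC)
BIC(m1) - BIC(m2)                  # approx. 237.4, from the output above
```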
--- # Pause to summarise what we know so far + So far we have seen how to: + run a linear model with a single predictor + extend this and add predictors + interpret these coefficients either in original units or standardised units + test the significance of `\(\beta\)` coefficients + test the significance of the overall model + estimate the amount of variance explained by our model + evaluate improvements to model fit when variables are added + select a better-fitting model between two nested or non-nested models + You can now run and interpret linear models with continuous predictors + Next week, we will put this into action by constructing and implementing an analysis plan for a linear model on a real example --- ## This week .pull-left[ ### Tasks <img src="figs/labs.svg" width="10%" /> **Attend your lab and work together on the exercises** <br> <img src="figs/exam.svg" width="10%" /> **Complete the weekly quiz** ] .pull-right[ ### Support <img src="figs/forum.svg" width="10%" /> **Help each other on the Piazza forum** <br> <img src="figs/oh.png" width="10%" /> **Attend office hours (see Learn page for details)** ] --- class: inverse, center, middle # Thanks for listening