Generalised Linear Model

class: center, middle, inverse, title-slide

# <b> Generalised Linear Model </b>
## Data Analysis for Psychology in R 2<br><br>
### dapR2 Team
### Department of Psychology<br>The University of Edinburgh

---

# Weeks Learning Objectives

+ Develop awareness of the glm for modelling non-continuous outcomes

+ Understand the types of missing data and common causes.

+ Develop awareness of approaches to dealing with missing data.

---
# Topics for today

+ Generalised Linear Model
  + Basic structure
  + Types of outcome and associated GLM

---
# Why do we need the GLM?

- We saw last week that when we have a binary outcome, the linear model is not appropriate.

- There are many other different types of outcome variable for which the same is true.

- The GLM provides a unified framework to understand how we can analyse outcome data of different types.
  - Binary
  - multiple categories (ordered or unordered)
  - count data
  - continuous data with non-normal distributions (response times)

---
# Structure of GLM

1. ***Random component*** that specifies the conditional distribution of `$Y_i$` given the values of the predictors (or 2).
  - Sounds scary, but remember this: `$\epsilon \sim N(0, \sigma)$` from linear model
  - This is just another way of saying that the conditional distribution of `$Y_i$` is normal or,
  - `$Y_i \sim N(\beta_0 + \beta1x_1, \sigma)$`
  
--

2. ***Linear function of predictors***. `$\eta_i = \beta_0 + \beta1x_1 ... +\beta_kx_k$`
  - This is what we have seen for a majority of this course.
  - And previously called the deterministic element of the linear model

3. A ***linearizing link function*** `$g(.)$` which links `$\eta_i$` to mean of response (or `$E(Y_i))$` ) 
  - This is a transformation of (1) to (2)
  - This can take lots of forms (we wont look at all in detail)

---
# Structure of GLM

- This leads to two ways to think about the GLM.

1. A linear function predicting a transformed response variable.

`$$g(E(Y_i)) = \eta_i$$`

2. A non-linear model for the response.

`$$E(Y_i) = g^{-1}(\eta_i)$$`

- Note this mirrors the two ways we looked at the logistic model last week.

---
# Returning to our types of data
- The conditional distribution of `$Y_i$` for different data types is not normal.
  - Hence the linear model is not appropriate.

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Data Type </th>
   <th style="text-align:left;"> Example </th>
   <th style="text-align:left;"> Distribution </th>
   <th style="text-align:left;"> Link Name </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> ~Continuous Normal </td>
   <td style="text-align:left;"> Cognitive scores </td>
   <td style="text-align:left;"> Normal </td>
   <td style="text-align:left;"> Identity </td>
  </tr>
  <tr>
   <td style="text-align:left;"> ~Continuous non-normal </td>
   <td style="text-align:left;"> Response time </td>
   <td style="text-align:left;"> Exponential or Gamma </td>
   <td style="text-align:left;"> Negative Inverse </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Count (unbounded) </td>
   <td style="text-align:left;"> Road traffic accidents </td>
   <td style="text-align:left;"> Poisson </td>
   <td style="text-align:left;"> Log </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Binary </td>
   <td style="text-align:left;"> Hiring </td>
   <td style="text-align:left;"> Bernoulli or Binomial </td>
   <td style="text-align:left;"> Logit </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 2+ categories </td>
   <td style="text-align:left;"> Occupational Choice </td>
   <td style="text-align:left;"> Multinomial </td>
   <td style="text-align:left;"> Logit </td>
  </tr>
</tbody>
</table>

---
# Linear model within GLM

- We can apply this idea to the standard linear model.

- In the previous slide we saw reference to the **identity** link.
  - This simply means that the function returns the same value as the input.
  - Why?

- Remember, the purpose of `$g(.)$` is to be a linearizing function. 
  - With a standard continuous predictor, the linear model ***is*** linear. 
  - There is no transformation needed.
  
`$$\eta_i = g(E(Y_i)) = E(Y_i)$$`

---
# Looking back to logistic

- We can also link our logistic model from last week.

- One step which hopefully connects is to emphasize:

`$$E(Y_i) = P(Y=1)$$`

- `$g(.)$` is thr logistic function we discussed last week.

`$$g(.) = ln \left (\frac{P(Y=1)}{1-P(Y=1)} \right)$$`

- So:

`$$g(E(Y_i)) = \eta_i$$`

- Is the logistic model from last week:

`$$ln \left (\frac{P(Y=1)}{1-P(Y=1)} \right) = \beta_0 + \beta_1x_1 + \beta_2x_2 ... + \beta_kx_k$$`

---
# Other forms of GLM

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Data Type </th>
   <th style="text-align:left;"> Example </th>
   <th style="text-align:left;"> Distribution </th>
   <th style="text-align:left;"> Link Name </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Continuous Normal </td>
   <td style="text-align:left;"> A </td>
   <td style="text-align:left;"> Normal </td>
   <td style="text-align:left;"> Identity </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Continuous non-normal </td>
   <td style="text-align:left;"> B </td>
   <td style="text-align:left;"> Exponential or Gamma </td>
   <td style="text-align:left;"> Negative Inverse </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Count </td>
   <td style="text-align:left;"> C </td>
   <td style="text-align:left;"> Poisson </td>
   <td style="text-align:left;"> Log </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Binary </td>
   <td style="text-align:left;"> D </td>
   <td style="text-align:left;"> Bernoulli or Binomial </td>
   <td style="text-align:left;"> Logit </td>
  </tr>
  <tr>
   <td style="text-align:left;"> 2+ categories </td>
   <td style="text-align:left;"> E </td>
   <td style="text-align:left;"> Multinomial </td>
   <td style="text-align:left;"> Logit </td>
  </tr>
</tbody>
</table>

- This table represents a small subset of the most commonly used models.

- But as a framework the GLM has been extended into a huge array of different models.

---
# Estimation and evaluation

- **Estimation**: For all cases you will likely encounter, it is maximum likelihood.
  - As noted last week, there is an excellent introduction in the Enders Missing Data book (see reading)
  - And there is a short introduction in week 8 lab.

- **Evaluation**: If we have used ML, we have the deviance.
  - And so we can use the likelihood ratio (or `$\chi^2$` difference or drop in deviance) tests
  - AIC
  - BIC
  - *z* (sometimes called Wald) tests

---
# How much of this do I need to remember?

- Remember is exists!

- Remember that you know about the tools needed to run and evaluate these models:
  - maximum likelihood
  - likelihood ratio test
  - AIC and BIC
  - *z* or Wald test
  - linear function of `$x$` and `$\beta$`
  - `glm()`

- And after this, remember that if you think this is the way you need to go for a project, you know what to ask about!