This section focuses on the distinction between binary and binomial data.
For binary regression, every value of the outcome variable must be either 0 or 1. For example, see the `correct`
variable below, which records whether each participant answered each question correctly:

participant | question | correct |
---|---|---|
1 | 1 | 1 |
1 | 2 | 0 |
1 | 3 | 1 |
... | ... | ... |

When we know the total number of questions each participant was asked, we can re-express the same information as counts per participant:

participant | questions_correct | questions_incorrect |
---|---|---|
1 | 2 | 1 |
2 | 1 | 2 |
3 | 3 | 0 |
... | ... | ... |
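
The step from the first table to the second can be sketched in R as follows. This is a minimal illustration, not the only way to do it: the data frame names (`binary_data`, `binomial_data`) are hypothetical, and the per-question answers for participants 2 and 3 are made up so that their totals match the counts in the table above.

```r
# Hypothetical long-format (binary) data: one row per participant per question.
# Participant 1 matches the first table; the individual answers for
# participants 2 and 3 are invented to agree with the totals in the second table.
binary_data <- data.frame(
  participant = rep(1:3, each = 3),
  question    = rep(1:3, times = 3),
  correct     = c(1, 0, 1,   # participant 1: 2 correct, 1 incorrect
                  0, 1, 0,   # participant 2: 1 correct, 2 incorrect
                  1, 1, 1)   # participant 3: 3 correct, 0 incorrect
)

# Count correct answers and questions asked for each participant
n_correct <- as.vector(tapply(binary_data$correct, binary_data$participant, sum))
n_asked   <- as.vector(tapply(binary_data$correct, binary_data$participant, length))

# Aggregated (binomial) data: one row per participant
binomial_data <- data.frame(
  participant         = sort(unique(binary_data$participant)),
  questions_correct   = n_correct,
  questions_incorrect = n_asked - n_correct
)

print(binomial_data)
```

Note that the aggregated form drops the `question` column, so it only carries the same information when the model does not need question-level predictors.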

To model data in this form, we can express our outcome as `cbind(questions_correct, questions_incorrect)`, which pairs each participant's count of correct answers with their count of incorrect answers.
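
As a rough sketch of how this outcome might be used, the example below fits an intercept-only logistic regression with base R's `glm()` and `family = binomial`. The choice of `glm()`, the intercept-only formula, and the data frame name `binomial_data` are assumptions made for illustration; the three rows are the ones shown in the table above.

```r
# Aggregated data, using the three participants shown in the table above
binomial_data <- data.frame(
  participant         = 1:3,
  questions_correct   = c(2, 1, 3),
  questions_incorrect = c(1, 2, 0)
)

# The outcome is a two-column matrix: successes first, failures second.
# An intercept-only model is assumed here purely to keep the example small.
binomial_model <- glm(
  cbind(questions_correct, questions_incorrect) ~ 1,
  family = binomial,
  data   = binomial_data
)

# The intercept is on the log-odds scale; plogis() converts it to a probability
estimated_p <- plogis(coef(binomial_model)[["(Intercept)"]])
print(estimated_p)
```

With no predictors, the estimated probability should simply recover the overall proportion of correct answers (6 of the 9 questions in this toy data), which is a quick way to check that the columns were passed to `cbind()` in the right order.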