Categorical predictors:
Practice analysis


Data Analysis for Psychology in R 2

Elizabeth Pankratz (elizabeth.pankratz@ed.ac.uk)


Department of Psychology
University of Edinburgh
2025–2026

Course Overview

Introduction to Linear Models:

  • Intro to Linear Regression
  • Interpreting Linear Models
  • Testing Individual Predictors
  • Model Testing & Comparison
  • Linear Model Analysis

Analysing Experimental Studies:

  • Categorical Predictors & Dummy Coding
  • Effects Coding & Coding Specific Contrasts
  • Assumptions & Diagnostics
  • Bootstrapping
  • Categorical Predictor Analysis

Interactions:

  • Interactions I
  • Interactions II
  • Interactions III
  • Analysing Experiments
  • Interaction Analysis

Advanced Topics:

  • Power Analysis
  • Binary Logistic Regression I
  • Binary Logistic Regression II
  • Logistic Regression Analysis
  • Exam Prep and Course Q&A

The game plan for this week


Lectures:

Together, we’ll practice the full analysis workflow we’ve developed this semester.

  • Today: we’ll work through one example, maybe more.

  • Tomorrow: you’ll vote later on what you’d like us to do.


Labs:

You’ll practice writing up these analyses as a report.

The study brief


Review: The analysis workflow


Research questions and data

RQ1: Do conscientiousness, frequency of accessing online materials, and year of study at university predict course attendance?

RQ2: Does attendance differ between students with early or late classes and students with midday classes?

Dataframe called data1 containing:

  • pid: Participant ID number
  • Attendance: Total attendance (in days)
  • Conscientiousness: Conscientiousness (Levels: Low, Moderate, High)
  • Time: Time of class (Levels: 9AM, 10AM, 11AM, 12PM, 1PM, 2PM, 3PM, 4PM)
  • OnlineAccess: Frequency of access to online course materials (Levels: Rarely, Sometimes, Often)
  • Year: Year of study in university (Levels: Y1, Y2, Y3, Y4, MSc, PhD)
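Before any analysis, the categorical variables listed above need their levels in a sensible order. A minimal sketch follows, assuming the columns arrive as plain text. The data here are simulated purely for illustration (random values and an assumed attendance range of 0 to 60 days), so only the factor setup carries over to the real data1.

```r
# Simulated stand-in for data1: the values are random, NOT the real study data.
set.seed(1)
n <- 200
time_levels <- c("9AM", "10AM", "11AM", "12PM", "1PM", "2PM", "3PM", "4PM")

data1 <- data.frame(
  pid               = seq_len(n),
  Attendance        = round(runif(n, min = 0, max = 60)),  # assumed range
  Conscientiousness = sample(c("Low", "Moderate", "High"), n, replace = TRUE),
  Time              = sample(time_levels, n, replace = TRUE),
  OnlineAccess      = sample(c("Rarely", "Sometimes", "Often"), n, replace = TRUE),
  Year              = sample(c("Y1", "Y2", "Y3", "Y4", "MSc", "PhD"), n, replace = TRUE)
)

# Without explicit levels, factor() sorts alphabetically ("High" before "Low").
# Stating the order also fixes which level acts as the reference level.
data1$Conscientiousness <- factor(data1$Conscientiousness,
                                  levels = c("Low", "Moderate", "High"))
data1$Time              <- factor(data1$Time, levels = time_levels)
data1$OnlineAccess      <- factor(data1$OnlineAccess,
                                  levels = c("Rarely", "Sometimes", "Often"))
data1$Year              <- factor(data1$Year,
                                  levels = c("Y1", "Y2", "Y3", "Y4", "MSc", "PhD"))

# Check the level order and look for missing values in each column.
lapply(data1[c("Conscientiousness", "Time", "OnlineAccess", "Year")], levels)
colSums(is.na(data1))
```

The first level of each factor becomes the reference level under R's default dummy coding, so the order chosen here shapes how the coefficients read later.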

RQ3: Is class attendance associated with final grades?

Dataframe called data2 containing:

  • Marks: Final grade (0–100)
  • Attendance: Total attendance (in days)
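One way to approach RQ3 is a simple linear regression of Marks on Attendance. The sketch below uses simulated data with an invented relationship (the slope of 0.4 and the noise level are assumptions made up for illustration), so the numbers it prints say nothing about the real data2.

```r
# Simulated stand-in for data2: the relationship below is invented.
set.seed(5)
n <- 100
data2 <- data.frame(Attendance = round(runif(n, min = 0, max = 60)))  # assumed range
raw_marks <- 40 + 0.4 * data2$Attendance + rnorm(n, sd = 10)
data2$Marks <- pmin(100, pmax(0, raw_marks))  # keep marks within 0 to 100

# Look at the relationship before modelling it.
plot(Marks ~ Attendance, data = data2)

# Simple linear regression: does attendance predict final marks?
m3 <- lm(Marks ~ Attendance, data = data2)
summary(m3)$coefficients  # estimate, SE, t value, p value
confint(m3)               # 95% confidence intervals
```

The Attendance coefficient is the estimated change in marks for each additional day attended.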


I will live code, but you’ll all vote on what to do.

Analysis steps so far, in three phases (1)

Not every analysis requires every step.

Think of these steps like a buffet for you to pick and choose from, depending on what your analysis needs.


(1) Before model fitting:

  • Identify the relevant variables
  • Data tidying (e.g., missingness? factor levels?)
  • Get summary statistics for the relevant variables
  • Plot each relevant variable individually
  • Plot the relevant variables together
  • Set up categorical predictors (e.g., what a priori coding scheme?)
  • Set up continuous predictors (e.g., any transformations?)
  • Think about what the model coefficients might look like
  • Formally state null and alternative hypotheses
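Several of the pre-fitting steps above might look like this in base R. This is a rough sketch on simulated data (the data frame d and its values are made up for illustration), showing one categorical predictor with the default dummy coding alongside a sum-to-zero alternative.

```r
# Simulated illustration data: NOT the real study data.
set.seed(2)
n <- 120
access_levels <- c("Rarely", "Sometimes", "Often")
d <- data.frame(
  Attendance   = round(rnorm(n, mean = 30, sd = 8)),
  OnlineAccess = factor(sample(access_levels, n, replace = TRUE),
                        levels = access_levels)
)

# Data tidying: any missing values? How many observations per level?
colSums(is.na(d))
table(d$OnlineAccess)

# Summary statistics, overall and per group.
summary(d$Attendance)
aggregate(Attendance ~ OnlineAccess, data = d, FUN = mean)

# Plot each variable individually, then together.
hist(d$Attendance, main = "Attendance", xlab = "Days attended")
boxplot(Attendance ~ OnlineAccess, data = d)

# Coding scheme: R defaults to dummy (treatment) coding, with the first
# level ("Rarely") as the reference level.
contrasts(d$OnlineAccess)

# A sum-to-zero (effects) coding on a copy of the predictor, for comparison.
d$OnlineAccess_sum <- d$OnlineAccess
contrasts(d$OnlineAccess_sum) <- contr.sum(3)
contrasts(d$OnlineAccess_sum)
```

Looking at the contrast matrix before fitting makes it easier to anticipate what each coefficient will compare.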

Analysis steps so far, in three phases (2)


(2) Model fitting:

  • Fit the linear model to the data
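For RQ1, the fitting step could be sketched as below. The data are simulated with no built-in effects, so the output only illustrates the mechanics. The model names m0 and m1 are placeholders, and the default dummy coding is assumed.

```r
# Simulated stand-in for data1: random values, NOT the real study data.
set.seed(3)
n <- 200
data1 <- data.frame(
  Attendance = round(rnorm(n, mean = 30, sd = 8)),
  Conscientiousness = factor(
    sample(c("Low", "Moderate", "High"), n, replace = TRUE),
    levels = c("Low", "Moderate", "High")),
  OnlineAccess = factor(
    sample(c("Rarely", "Sometimes", "Often"), n, replace = TRUE),
    levels = c("Rarely", "Sometimes", "Often")),
  Year = factor(
    sample(c("Y1", "Y2", "Y3", "Y4", "MSc", "PhD"), n, replace = TRUE),
    levels = c("Y1", "Y2", "Y3", "Y4", "MSc", "PhD"))
)

# Null model (intercept only) and the full model with all three predictors.
m0 <- lm(Attendance ~ 1, data = data1)
m1 <- lm(Attendance ~ Conscientiousness + OnlineAccess + Year, data = data1)

# Each coefficient compares one level with that predictor's reference level.
summary(m1)

# Joint test: do the three predictors together improve on the null model?
anova(m0, m1)
```

With 3, 3, and 6 levels, the full model estimates 1 + 2 + 2 + 5 = 10 coefficients.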

Analysis steps so far, in three phases (3)


(3) After model fitting:

  • Check model assumptions
  • Bootstrap the linear model
  • Run diagnostics for multicollinearity
  • Run diagnostics for unusual data points (we’ll skip this today for reasons of time)
  • Interpret the model estimates
  • Run sensitivity analysis (we’ll skip this today for reasons of time)
  • Get estimated marginal means
  • Plot estimated marginal means
  • Create and test manual contrasts
  • Write up your analysis
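For RQ2, a few of the post-fitting steps can be sketched in base R: a quick assumptions plot, a manual contrast, and a bootstrap of that contrast. The grouping of times is an assumption made for this illustration (early = 9AM and 10AM, late = 3PM and 4PM, midday = 11AM to 2PM), and the data are simulated. The emmeans package offers `emmeans()` and `contrast()` for the same kind of comparison; this sketch avoids extra packages so it runs on its own.

```r
# Simulated stand-in for data1: random values, NOT the real study data.
set.seed(4)
times <- c("9AM", "10AM", "11AM", "12PM", "1PM", "2PM", "3PM", "4PM")
n <- 240
data1 <- data.frame(
  Time       = factor(sample(times, n, replace = TRUE), levels = times),
  Attendance = round(rnorm(n, mean = 30, sd = 8))
)

# Cell-means model: dropping the intercept gives one mean per class time.
m_time <- lm(Attendance ~ 0 + Time, data = data1)

# Check assumptions: residuals vs fitted values, and a normal QQ plot.
par(mfrow = c(1, 2))
plot(m_time, which = 1:2)

# Manual contrast (ASSUMED grouping): mean of early/late minus mean of midday.
w <- c(1, 1, -1, -1, -1, -1, 1, 1) / 4
names(w) <- times

est   <- sum(w * coef(m_time))
se    <- sqrt(drop(t(w) %*% vcov(m_time) %*% w))
t_val <- est / se
p_val <- 2 * pt(abs(t_val), df = df.residual(m_time), lower.tail = FALSE)
c(estimate = est, se = se, t = t_val, p = p_val)

# Bootstrap the contrast: resample rows, refit, and recompute the estimate.
boot_est <- replicate(1000, {
  idx <- sample(nrow(data1), replace = TRUE)
  m_b <- lm(Attendance ~ 0 + Time, data = data1[idx, ])
  sum(w * coef(m_b))
})
quantile(boot_est, probs = c(0.025, 0.975), na.rm = TRUE)  # percentile 95% CI
```

The weights sum to zero, so the estimate is the difference between the average early/late mean and the average midday mean. A bootstrap interval that excludes zero would suggest a difference.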

Vote on the next step on Wooclap: RQ1




RQ1: wooclap.com, enter code FJAKCO

What to do tomorrow?




wooclap.com, enter code ZBTYJA

If we keep analysing data

Vote on the next step on Wooclap: RQ2




RQ2: wooclap.com, enter code PHPQUA

Vote on the next step on Wooclap: RQ3




RQ3: wooclap.com, enter code ZRXMDS

This week

Tasks:

  • Attend your lab and work together on the exercises
  • Complete the weekly quiz

Support:

  • Help each other on the Piazza forum
  • Attend office hours (see Learn page for details)