Block 1 Flash Cards

Flash Card Aims

The purpose of these flashcards is to complement your Semester 1 Weeks 1 - 5 core learning materials (i.e., your lecture and lab materials) by offering additional guidance and examples on key concepts/topics. They are designed to deepen your understanding, clarify complex concepts, and help you make connections between different areas of study. Think of them as an extra resource that supports what you’re learning in the classroom.

You may want to use the flashcards below as a supporting document whilst you work through lab exercises, and/or refer to them to aid revision.

R Packages

Within this reading, the following packages are used:

  • tidyverse
  • sjPlot
  • kableExtra
  • psych

Presenting Results

Note that you must not copy any of the write-ups included below for future reports - if you do, you will be committing plagiarism, and this type of academic misconduct is taken very seriously by the University. You can find out more here.

Back to Basics

For an overview of basic statistical tests and core concepts (e.g., \(p\)-values), please revisit the DAPR1 materials for a refresher (also accessible via the DAPR1 Learn page).

Terminology

Data Exploration

The common first port of call for almost any statistical analysis is to explore the data, and we can do this visually and/or numerically.

Marginal Distributions

Description: The distribution of each variable individually (i.e., without reference to the values of the other variables).

Visually: Plot each variable individually. You could use, for example, geom_density() for a density plot or geom_histogram() for a histogram to comment on and/or examine:
  • The shape of the distribution. Look at the shape, centre and spread of the distribution. Is it symmetric or skewed? Is it unimodal or bimodal?
  • Identify any unusual observations. Do you notice any extreme observations (i.e., outliers)?

Numerically: Compute and report summary statistics e.g., mean, standard deviation, median, min, max, etc. You could, for example, calculate summary statistics such as the mean (mean()) and standard deviation (sd()), etc. within summarize().

Bivariate Associations

Description: Describing the association between two numeric variables.

Visually: Plot associations among two variables. You could use, for example, geom_point() for a scatterplot to comment on and/or examine:
  • The direction of the association indicates whether there is a positive or negative association
  • The form of association refers to whether the relationship between the variables can be summarized well with a straight line or some more complicated pattern
  • The strength of association entails how closely the points fall to a recognizable pattern such as a line
  • Unusual observations that do not fit the pattern of the rest of the observations and which are worth examining in more detail

Numerically: Compute and report the correlation coefficient. You can use the cor() function to calculate this.
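
As a minimal sketch of the functions named above, the code below explores one variable on its own and then a pair of variables. It assumes the built-in mtcars dataset as a stand-in for your own data; the variables mpg and wt and the object names are illustrative choices, not part of the course materials.

```r
library(tidyverse)

# Marginal distribution: density plot of a single variable
p_density <- ggplot(mtcars, aes(x = mpg)) +
  geom_density() +
  labs(x = "Miles per gallon", y = "Density")

# Bivariate association: scatterplot of two numeric variables
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Numeric summaries of one variable, computed within summarize()
desc <- mtcars %>%
  summarize(
    M = mean(mpg),
    SD = sd(mpg),
    Mdn = median(mpg),
    Min = min(mpg),
    Max = max(mpg)
  )

# Correlation coefficient between the two variables
r_wt_mpg <- cor(mtcars$wt, mtcars$mpg)

p_density
p_scatter
desc
r_wt_mpg
```

Printing the plot objects draws them, which makes it easy to comment on shape, direction, form, strength and unusual observations alongside the numeric summaries.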

Numeric Exploration

Numeric exploration of data involves examining key statistics like the mean, median, and standard deviation via descriptives tables, and assessing the associations among variables through correlation coefficients. Exploring our data numerically helps us to identify patterns and associations in the data.

Descriptives

Descriptives Tables

Descriptives Tables - Examples

Correlation

Correlation Coefficient

Correlation Matrix

Correlation - Hypothesis Testing

Correlation - Hypothesis Testing in R
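
One common way to test a correlation in R is cor.test() from the stats package. The sketch below assumes the built-in mtcars dataset and a two-sided test of the null hypothesis that the population correlation is zero; the variables are illustrative only.

```r
# H0: rho = 0 versus H1: rho != 0 (two-sided Pearson correlation test)
res <- cor.test(mtcars$wt, mtcars$mpg,
                alternative = "two.sided",
                method = "pearson")

res$estimate   # sample correlation coefficient, r
res$statistic  # t-statistic
res$parameter  # degrees of freedom, n - 2
res$p.value    # p-value for the test
res$conf.int   # 95% confidence interval for the correlation
```

Extracting the pieces individually, rather than copying them from the printed output, makes it easier to report them inline in an RMarkdown document.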

Visual Exploration

Visual exploration allows us to see the distribution of each variable, and to identify potential associations between variables.

How to Visualise Data

Data Visualisation - Marginal Examples

Data Visualisation - Bivariate Examples

Functions and Mathematical Models

Basic functions and mathematical models are foundational tools used to describe and predict associations between variables.

Identification & Specification

Deterministic Models - Description & Specification

Deterministic Models - Visualisation

Deterministic Models - Predicted Values
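
A deterministic model returns exactly one output for each input, with no random error. As a hedged illustration (a hypothetical example chosen here, not necessarily the one used in the course), the perimeter of a square is always four times its side length, so predicted values follow directly from the function.

```r
# Hypothetical deterministic model: perimeter of a square = 4 * side length
perimeter <- function(side) {
  4 * side
}

# Predicted values for a few side lengths
sides <- c(1, 2.5, 10)
predicted <- perimeter(sides)
predicted  # 4 10 40
```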

Statistical Models

Statistical models are used to understand the associations among variables.

Specifying Hypotheses

Simple Linear Regression Models - Description & Specification
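
For reference, a simple linear regression model is conventionally written as below. The symbols are the usual textbook ones and are assumed here rather than taken from the course notation; in particular, \(\sigma\) is taken to denote the standard deviation of the errors.

```latex
% Model for individual i: intercept, slope, and a random error term
y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma) \text{ independently}

% Fitted line: estimated coefficients wear hats, and there is no error term
\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i
```

The fitted line has no error term because it describes the predicted average of \(y\) at a given value of \(x\), not any single observation.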

Simple Linear Regression Models - Example

Multiple Linear Regression Models - Description & Specification

Simple & Multiple Regression Models - Extracting Information
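
A minimal sketch of fitting regression models with lm() and pulling information out of them follows. It assumes the built-in mtcars dataset, and the choice of outcome and predictors is purely illustrative.

```r
# Simple linear regression: one predictor
simple_mdl <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: two predictors
multi_mdl <- lm(mpg ~ wt + hp, data = mtcars)

# Full summary: coefficients, standard errors, t-tests, R-squared, F-test
summary(multi_mdl)

# Individual pieces of information
coef(multi_mdl)                   # estimated coefficients
summary(multi_mdl)$coefficients   # estimates, SEs, t-values, p-values
confint(multi_mdl, level = 0.95)  # confidence intervals for the coefficients
sigma(multi_mdl)                  # residual standard deviation
```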

Simple Linear Regression Models - Visualisation

Multiple Linear Regression Models - Visualisation

Model Predicted Values & Residuals

Model predicted values are the estimates generated by a regression model for the dependent variable based on the independent variable(s), whilst residuals are the differences between the actual observed values and these predicted values (observed minus predicted), which in turn indicate the accuracy of the model’s predictions.

Predicted Values

Residuals

Predicted Values - Example
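
The sketch below shows one way to obtain predicted values and residuals from a fitted model, assuming the built-in mtcars dataset. The data frame new_cars is a hypothetical set of new observations made up for illustration.

```r
mdl <- lm(mpg ~ wt, data = mtcars)

# Predicted values for the observations used to fit the model
fitted_vals <- predict(mdl)   # equivalent to fitted(mdl)

# Residuals: observed value minus predicted value
resids <- resid(mdl)
manual_resids <- mtcars$mpg - fitted_vals

# Predicted values for new (hypothetical) values of the predictor
new_cars <- data.frame(wt = c(2.5, 3.5))
new_preds <- predict(mdl, newdata = new_cars)
new_preds
```

The column name in new_cars must match the predictor name used in the model formula, otherwise predict() cannot find the variable.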

Data Transformations

There are many transformations we can apply to a continuous variable, but the most common ones are centering and scaling. These transformations can aid the interpretability of our statistical models.

Centering

Scaling

Standardisation
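
As a brief sketch of these transformations, assuming the built-in mtcars dataset: centering subtracts a constant (here the mean), scaling divides by a constant (here the standard deviation), and standardisation does both to give z-scores.

```r
x <- mtcars$wt

# Centering: subtract the mean, so the new mean is 0
x_centered <- x - mean(x)

# Scaling: divide by the standard deviation, so the new SD is 1
x_scaled <- x / sd(x)

# Standardisation (z-scores): centre, then scale
x_z <- (x - mean(x)) / sd(x)

# scale() performs the same standardisation; as.numeric() drops the matrix format
x_z_alt <- as.numeric(scale(x))
```

Centering changes what the intercept refers to but leaves the slope unchanged, whereas scaling a predictor changes the units of its slope.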

Model Fit

Assessing model fit involves examining metrics like the sum of squares to measure variability explained by the model, the \(F\)-ratio to evaluate the overall significance of the model by comparing explained variance to unexplained variance, and \(R\)-squared / Adjusted \(R\)-squared to quantify the proportion of variance in the dependent variable explained by the independent variable(s).

Sums of Squares

F-ratio

R-squared and Adjusted R-squared
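
To make the links between these quantities concrete, the sketch below computes them by hand for a model fitted to the built-in mtcars dataset (an illustrative choice) and leaves the values reported by summary() for comparison. Here n is the sample size and k the number of predictors.

```r
mdl <- lm(mpg ~ wt + hp, data = mtcars)

y <- mtcars$mpg
n <- nrow(mtcars)  # sample size
k <- 2             # number of predictors

# Sums of squares
ss_total <- sum((y - mean(y))^2)  # total variability in the outcome
ss_resid <- sum(resid(mdl)^2)     # variability left unexplained
ss_model <- ss_total - ss_resid   # variability explained by the model

# R-squared and Adjusted R-squared
r2 <- ss_model / ss_total
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# F-ratio: mean square for the model over mean square for the residuals
f_ratio <- (ss_model / k) / (ss_resid / (n - k - 1))

# Values reported by R, for comparison
summary(mdl)$r.squared
summary(mdl)$adj.r.squared
summary(mdl)$fstatistic
```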

Model Comparisons

One useful thing we might want to do is compare our models with and without some predictor(s). There are numerous ways we can do this, but the method chosen depends on the models and underlying data:

Nested vs Non-Nested Models

Incremental F-test

AIC & BIC
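
A minimal sketch of both approaches follows, assuming the built-in mtcars dataset and two nested models chosen purely for illustration. The restricted model's predictors are a subset of the full model's, and both are fitted to the same data.

```r
# Nested models: the restricted model omits hp and qsec
restricted <- lm(mpg ~ wt, data = mtcars)
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Incremental F-test: do the extra predictors improve fit significantly?
f_test <- anova(restricted, full)
f_test

# AIC and BIC: smaller values indicate the better-fitting model,
# and these can also be used for non-nested models fitted to the same data
aic_tab <- AIC(restricted, full)
bic_tab <- BIC(restricted, full)
aic_tab
bic_tab
```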

General Formatting & Presenting of Results

LaTeX Symbols & Equations

By embedding LaTeX into RMarkdown, you can accurately and precisely format mathematical expressions, ensuring that they are not only technically correct but also visually appealing and easy to interpret.

LaTeX Guide
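
As a short illustration of the syntax, inline math in RMarkdown sits between single dollar signs and display equations sit between double dollar signs. The expressions below are generic examples rather than ones taken from the guide.

```latex
Inline math: the slope $\beta_1$, a predicted value $\hat{y}$, and $R^2$.

Display equation for a pair of hypotheses:

$$
H_0: \beta_1 = 0 \qquad H_1: \beta_1 \neq 0
$$
```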

APA Formatting

APA format is a writing/presentation style that is often used in psychology to ensure consistency in communication. APA formatting applies to all aspects of writing, from the formatting of papers (including tables and figures) to the citation of sources and reference lists. This means that it also applies to how you present results in your Psychology courses, including DAPR2.

APA Formatting Guides

Tables

We want to ensure that we present results in a well-formatted table. To do so, there are lots of different packages available (see Lesson 4 of the RMD bootcamp).

One of the most convenient ways to present results from regression models is to use the tab_model() function from sjPlot.

Creating tables via tab_model
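
The sketch below shows a basic call to tab_model(), assuming a model fitted to the built-in mtcars dataset. The labels and title are illustrative; pred.labels is given in the order of the model coefficients, starting with the intercept.

```r
library(sjPlot)

mdl <- lm(mpg ~ wt + hp, data = mtcars)

# Regression table with relabelled outcome and predictors
tab_model(mdl,
          dv.labels = "Miles per gallon",
          pred.labels = c("Intercept", "Weight (1000 lbs)", "Horsepower"),
          title = "Regression table for the mpg model")
```

The table is produced as HTML, so it displays in the RStudio viewer and in HTML output when the document is knitted.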

Cross Referencing

Cross-referencing is a very helpful way to direct your reader through your document, and the good news is that this can be done automatically in RMarkdown.

Cross Referencing

Footnotes

  1. Yes, the error term is gone. This is because the line of best-fit gives you the prediction of the average recall accuracy for a given age, and not the recall accuracy of an individual person, which will almost surely be different from the prediction of the line.