Block 3 Analysis & Write-Up Example

Learning Objectives

At the end of this lab, you will:

Understand how to write up and interpret a 4×2 factorial ANOVA¹

What You Need

Be up to date with lectures
Have completed Labs Semester 2 Weeks 1-4

Required R Packages

Remember to load all packages within a code chunk at the start of your RMarkdown file using library(). If you do not have a package and need to install, do so within the console using install.packages(" "). For further guidance on installing/updating packages, see Section C here.

For this lab, you will need to load the following package(s):

tidyverse
psych
kableExtra
sjPlot
interactions
emmeans

Lab Data

You can download the data required for this lab here or read it in via this link https://uoepsy.github.io/data/laptop_vs_longhand.csv

Section A: Write-Up

In this lab you will be presented with the output from a statistical analysis, and your job will be to write up and present the results. We’re going to use a simulated dataset based on a paper (the same one that you have worked on in lectures this week) concerning test outcomes and note-taking methods.

The aim in writing should be that a reader is able to more or less replicate your analyses without referring to your R code. This requires detailing all of the steps you took in conducting the analysis. The point of using RMarkdown is that you can pull your results directly from the code. If your analysis changes, so does your report!

Make sure that your final report doesn’t show any R functions or code. Remember you are interpreting and reporting your results in text, tables, or plots, targeting a generic reader who may use different software or may not know R at all. If you need a reminder on how to hide code, format tables, etc., make sure to review the rmd bootcamp.
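
One common way to keep code out of the knitted report is to set chunk options once, in a setup chunk at the top of the file. The sketch below assumes the knitr package (which RMarkdown uses when knitting); the particular options chosen are a suggestion, not a requirement of this lab.

```r
# Place this inside a setup chunk at the top of your .Rmd file.
# echo = FALSE hides the code itself; message = FALSE and warning = FALSE
# stop package start-up messages and warnings from appearing in the report.
library(knitr)
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
```

Individual chunks can still override these defaults if you ever need to show a specific piece of code.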

Important - Write-Up Examples & Plagiarism

The example write-up sections included below are not perfect - they instead should give you a good example of what information you should include within each section, and how to structure this. For example, some information is missing (e.g., description of data checks, interpretation of descriptive statistics), some information could be presented more clearly (e.g., variable names in tables, table/figure titles/captions, and rationales for choices), and writing could be more concise in places (e.g., discussion section could be more succinct and more focused on the research questions in places).

Further, you must not copy any of the write-up included below for future reports - if you do, you will be committing plagiarism, and this type of academic misconduct is taken very seriously by the University. You can find out more here.

Study Overview

Research Aim

Explore how study time and note-taking medium are associated with test scores.

Research Questions

RQ1: Do differences in test scores between study conditions differ by the note-taking medium used?

RQ2: Are there differences in test scores between participants when comparing pairs of study and note-taking conditions? If so, what are these specific differences?

Note Taking: Data Codebook

Variable	Description
test_score	Test Score (0-100)
medium	Medium of note-taking (levels = Longhand, Laptop)
study	Study time (levels = No, Minimal, Moderate, Extensive)

test_score	medium	study
47.47812	Laptop	No
50.41772	Laptop	No
49.88763	Laptop	No
48.47961	Laptop	No
48.44368	Laptop	No
48.02715	Laptop	No

Setup

Create a new RMarkdown file
Load the required package(s)
Read the laptop_vs_longhand dataset into R, assigning it to an object named notes
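
The setup steps above could look something like the following. This is a minimal sketch that assumes all six packages are already installed and that the data link given in the Lab Data section is reachable.

```r
# Load the packages required for this lab
library(tidyverse)
library(psych)
library(kableExtra)
library(sjPlot)
library(interactions)
library(emmeans)

# Read the data from the link in the Lab Data section and store it as 'notes'
notes <- read_csv("https://uoepsy.github.io/data/laptop_vs_longhand.csv")

# Quick look at the dimensions before moving on to the checks in Step 1
dim(notes)
```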

Analysis Code

Try to answer the research questions above without referring to the provided analysis code below, and then check how your script matches up - is there anything you missed or did differently? If so, discuss the differences with a tutor - there are lots of ways to code to the same solution!

Provided Analysis Code

######Step 1 is always to read in the data, then to explore, check, describe, and visualise it.

#check coding of variables - are they coded as they should be?
str(notes)

spc_tbl_ [160 × 3] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ test_score: num [1:160] 47.5 50.4 49.9 48.5 48.4 ...
 $ medium    : chr [1:160] "Laptop" "Laptop" "Laptop" "Laptop" ...
 $ study     : chr [1:160] "No" "No" "No" "No" ...
 - attr(*, "spec")=
  .. cols(
  ..   test_score = col_double(),
  ..   medium = col_character(),
  ..   study = col_character()
  .. )
 - attr(*, "problems")=<externalptr>

head(notes)

# A tibble: 6 × 3
  test_score medium study
       <dbl> <chr>  <chr>
1       47.5 Laptop No   
2       50.4 Laptop No   
3       49.9 Laptop No   
4       48.5 Laptop No   
5       48.4 Laptop No   
6       48.0 Laptop No

#check for NAs - none in dataset, so no missing values
table(is.na(notes))


FALSE 
  480

#make variables factors
notes <- notes %>%
    mutate(medium = as_factor(medium),
           study = as_factor(study))

#create descriptives table
descript <- notes %>% 
    group_by(study, medium) %>%
   summarise(
       M_Score = round(mean(test_score), 2),
       SD_Score = round(sd(test_score), 2),
       SE_Score = round(sd(test_score)/sqrt(n()), 2),
       Min_Score = round(min(test_score), 2),
       Max_Score = round(max(test_score), 2)
    )
descript

# A tibble: 8 × 7
# Groups:   study [4]
  study     medium   M_Score SD_Score SE_Score Min_Score Max_Score
  <fct>     <fct>      <dbl>    <dbl>    <dbl>     <dbl>     <dbl>
1 No        Laptop      48.1        2     0.45      44.9      51.2
2 No        Longhand    51.0        2     0.45      48.0      54.9
3 Minimal   Laptop      55.6        2     0.45      52.1      60.1
4 Minimal   Longhand    60.9        2     0.45      57.8      65.3
5 Moderate  Laptop      59.3        2     0.45      55.8      63.0
6 Moderate  Longhand    80.7        2     0.45      76.3      84.6
7 Extensive Laptop      61.2        2     0.45      56.9      64.3
8 Extensive Longhand    90.6        2     0.45      86.8      94.3

#boxplot
p1 <- ggplot(data = notes, aes(x = study, y = test_score, color = medium)) + 
  geom_boxplot() + 
    ylim(0,100) +
    labs(x = "Study Condition", y = "Test Score")
p1

#plot showing the mean score for each condition
# p2 is useful to notice that lines do not run in parallel - suggests interaction
p2 <- ggplot(descript, aes(x = study, y = M_Score, color = medium)) + 
  geom_point(size = 3) +
  geom_linerange(aes(ymin = M_Score - 2 * SE_Score, ymax = M_Score + 2 * SE_Score)) +
  geom_path(aes(x = as.numeric(study)))
p2

######Step 2 is to run your model(s) of interest to answer your research question, and make sure that the data meet the assumptions of your chosen test

#set reference levels
notes$medium <- fct_relevel(notes$medium , "Longhand")
notes$study <- fct_relevel(notes$study , "No")

#build model
notes_mdl <- lm(test_score ~ study*medium, data = notes)

#check assumptions - note should check diagnostics here too!
par(mfrow=c(2,2))
plot(notes_mdl)

par(mfrow=c(1,1))

# look at model output - summary()
summary(notes_mdl)


Call:
lm(formula = test_score ~ study * medium, data = notes)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3485 -1.4764 -0.1018  1.4039  4.5321 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                  51.0200     0.4472 114.084  < 2e-16 ***
studyMinimal                  9.8900     0.6325  15.637  < 2e-16 ***
studyModerate                29.6700     0.6325  46.912  < 2e-16 ***
studyExtensive               39.5600     0.6325  62.550  < 2e-16 ***
mediumLaptop                 -2.9000     0.6325  -4.585 9.41e-06 ***
studyMinimal:mediumLaptop    -2.4400     0.8944  -2.728  0.00712 ** 
studyModerate:mediumLaptop  -18.4900     0.8944 -20.672  < 2e-16 ***
studyExtensive:mediumLaptop -26.5200     0.8944 -29.650  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2 on 152 degrees of freedom
Multiple R-squared:  0.9803,    Adjusted R-squared:  0.9794 
F-statistic:  1081 on 7 and 152 DF,  p-value: < 2.2e-16

#table results
tab_model(notes_mdl, 
          pred.labels = c('Intercept', 'Study - Minimal', 'Study - Moderate', 'Study - Extensive', 'Medium - Laptop', 'Study - Minimal : Medium - Laptop', 'Study - Moderate : Medium - Laptop', 'Study - Extensive : Medium - Laptop'),
          title = 'RQ1 - Regression Table for Total Scores Model')

RQ1 - Regression Table for Total Scores Model
	test score
Predictors	Estimates	CI	p
Intercept	51.02	50.14 – 51.90	<0.001
Study - Minimal	9.89	8.64 – 11.14	<0.001
Study - Moderate	29.67	28.42 – 30.92	<0.001
Study - Extensive	39.56	38.31 – 40.81	<0.001
Medium - Laptop	-2.90	-4.15 – -1.65	<0.001
Study - Minimal : Medium - Laptop	-2.44	-4.21 – -0.67	0.007
Study - Moderate : Medium - Laptop	-18.49	-20.26 – -16.72	<0.001
Study - Extensive : Medium - Laptop	-26.52	-28.29 – -24.75	<0.001
Observations	160
R² / R² adjusted	0.980 / 0.979

#int model plot
plt_notes_mdl <- cat_plot(model = notes_mdl, 
                  pred = study, 
                  modx = medium, 
                  main.title = "Scores across Study and Medium",
                  x.label = "Study",
                  y.label = "Score",
                  legend.main = "Medium")
plt_notes_mdl

#####Step 3 somewhat depends on the outcomes of step 2. Here, you may need to consider conducting further analyses before writing up / describing your results in relation to the research question. 

#Perform pairwise comparisons of mean test scores across the 4×2 factorial design, making sure to adjust for multiple comparisons. 

m1_emm <- emmeans(notes_mdl, ~study*medium)

pairs_res <- pairs(m1_emm)
pairs_res

 contrast                               estimate    SE  df t.ratio p.value
 No Longhand - Minimal Longhand            -9.89 0.632 152 -15.637  <.0001
 No Longhand - Moderate Longhand          -29.67 0.632 152 -46.912  <.0001
 No Longhand - Extensive Longhand         -39.56 0.632 152 -62.550  <.0001
 No Longhand - No Laptop                    2.90 0.632 152   4.585  0.0002
 No Longhand - Minimal Laptop              -4.55 0.632 152  -7.194  <.0001
 No Longhand - Moderate Laptop             -8.28 0.632 152 -13.092  <.0001
 No Longhand - Extensive Laptop           -10.14 0.632 152 -16.033  <.0001
 Minimal Longhand - Moderate Longhand     -19.78 0.632 152 -31.275  <.0001
 Minimal Longhand - Extensive Longhand    -29.67 0.632 152 -46.912  <.0001
 Minimal Longhand - No Laptop              12.79 0.632 152  20.223  <.0001
 Minimal Longhand - Minimal Laptop          5.34 0.632 152   8.443  <.0001
 Minimal Longhand - Moderate Laptop         1.61 0.632 152   2.546  0.1847
 Minimal Longhand - Extensive Laptop       -0.25 0.632 152  -0.395  0.9999
 Moderate Longhand - Extensive Longhand    -9.89 0.632 152 -15.637  <.0001
 Moderate Longhand - No Laptop             32.57 0.632 152  51.498  <.0001
 Moderate Longhand - Minimal Laptop        25.12 0.632 152  39.718  <.0001
 Moderate Longhand - Moderate Laptop       21.39 0.632 152  33.821  <.0001
 Moderate Longhand - Extensive Laptop      19.53 0.632 152  30.880  <.0001
 Extensive Longhand - No Laptop            42.46 0.632 152  67.135  <.0001
 Extensive Longhand - Minimal Laptop       35.01 0.632 152  55.356  <.0001
 Extensive Longhand - Moderate Laptop      31.28 0.632 152  49.458  <.0001
 Extensive Longhand - Extensive Laptop     29.42 0.632 152  46.517  <.0001
 No Laptop - Minimal Laptop                -7.45 0.632 152 -11.779  <.0001
 No Laptop - Moderate Laptop              -11.18 0.632 152 -17.677  <.0001
 No Laptop - Extensive Laptop             -13.04 0.632 152 -20.618  <.0001
 Minimal Laptop - Moderate Laptop          -3.73 0.632 152  -5.898  <.0001
 Minimal Laptop - Extensive Laptop         -5.59 0.632 152  -8.839  <.0001
 Moderate Laptop - Extensive Laptop        -1.86 0.632 152  -2.941  0.0717

P value adjustment: tukey method for comparing a family of 8 estimates

#plot
plot(pairs_res)

The 3-Act Structure

We need to present our report in three clear sections - think of your sections like the 3 key parts of a play or story - we need to (1) provide some background and scene setting for the reader, (2) present our results in the context of the research question, and (3) present a resolution to our story - relate our findings back to the question we were asked and provide our answer.

Act I: Analysis Strategy

Question 1

Attempt to draft an analysis strategy section based on the above research question and analysis provided.

Analysis Strategy - What to Include*

Your analysis strategy will contain a number of different elements detailing plans and changes to your plan. Remember, your analysis strategy should not contain any results. You may wish to include the following sections:

Very brief data and design description:
- Give the reader some background on the context of your write-up. For example, you may wish to describe the data source, data collection strategy, study design, number of observational units.
- Specify the variables of interest in relation to the research question, including their unit of measurement, the allowed range (e.g., for Likert scales), and how they are scored. If you have categorical data, you will need to specify the levels and coding of your variables, and what was specified as your reference level and the justification for this choice.
Data management:
- Describe any data cleaning and/or recoding.
- Are there any observations that have been excluded based on pre-defined criteria? How/why, and how many?
- Describe any transformations performed to aid your interpretation (i.e., mean centering, standardisation, etc.)
Model specification:
- Clearly state your hypotheses and specify your chosen significance level.
- What type of statistical analysis do you plan to use to answer the research question? (e.g., simple linear regression, multiple linear regression, binary logistic regression, etc.)
- In some cases, you may wish to include some visualisations and descriptive tables to motivate your model specification.
- Specify the model(s) to be fitted to answer your given research question and analysis structure. Clearly specify the response and explanatory variables included in your model(s). This includes specifying the type of coding scheme applied if using categorical data.
- * Specify the assumption and diagnostic checks that you will conduct. Specify what plots you will use, and how you will evaluate these.

*Note, given time constraints in lab, we have not included any reference to diagnostic checks in this write-up example - you would be expected to include this in your report. You can find more information on diagnostic checks in the S1 Week 9 Lab and S1 Week 9 Lectures.

As noted and encouraged throughout the course, one of the main benefits of using RMarkdown is the ability to include inline R code in your document. Try to incorporate this in your write up so you can automatically pull the specified values from your code. If you need a reminder on how to do this, see Lesson 3 of the Rmd Bootcamp.
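
As a reminder of how inline R code works, the sketch below stores a few values as objects and then shows (in comments) how they could be referenced in the text of the report. The small `notes_demo` data frame and the object names are hypothetical and only there to keep the example self-contained; in your own report you would compute the values from `notes`.

```r
# Hypothetical stand-in data, used only so this example runs on its own
notes_demo <- data.frame(
  test_score = c(47.5, 50.4, 49.9, 61.2, 60.1, 59.8),
  medium     = c("Laptop", "Laptop", "Laptop", "Longhand", "Longhand", "Longhand")
)

# Store the values you want to report inside a (hidden) code chunk
n_total    <- nrow(notes_demo)
n_laptop   <- sum(notes_demo$medium == "Laptop")
mean_score <- round(mean(notes_demo$test_score), 2)

# Then, in the text of the .Rmd (outside any chunk), reference them inline:
#   The dataset contained `r n_total` participants, of whom `r n_laptop`
#   took notes on a laptop. The mean test score was `r mean_score`.
```

If the data or analysis changes, the numbers in the sentence update automatically the next time the document is knitted.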

The notes dataset contained information on 160 participants who took part in a study concerning the role(s) of note taking and study time on test scores. Participants took notes on a lecture via one of two mediums - either on a laptop \((n = 80)\) or long-hand using pen and paper \((n = 80)\). They were then randomly allocated to one of four study time conditions, either engaging in no \((n = 40)\), minimal \((n = 40)\), moderate \((n = 40)\), or extensive \((n = 40)\) study of the notes taken on their assigned medium. Participants then answered a series of questions based on the lecture content. The maximum score was 100, where higher scores reflected better test performance.

The aim of this report was to address the following two research questions:

Do differences in test scores between study conditions differ by the note-taking medium used?
Are there differences in test scores between participants when comparing pairs of study and note-taking conditions? If so, what are these specific differences?

All participant data was complete, and test scores within range i.e., 0-100. Categorical variables were coded as factors, and dummy coding applied where ‘No’ was designated as the reference level for study condition, and ‘Longhand’ as the reference level for medium.

To address RQ1 and investigate whether study condition (No, Minimal, Moderate, Extensive) and note-taking medium (Longhand, Laptop) interacted to influence test scores, the following 4 \(\times\) 2 model specification was used:

\[ \begin{align} \text{Test Score} &= \beta_0 \\ &+ \beta_1 \cdot \text{S}_\text{Minimal} \\ &+ \beta_2 \cdot \text{S}_\text{Moderate} \\ &+ \beta_3 \cdot \text{S}_\text{Extensive} \\ &+ \beta_4 \cdot \text{M}_\text{Laptop} \\ &+ \beta_5 \cdot (\text{S}_\text{Minimal} \cdot \text{M}_\text{Laptop}) \\ &+ \beta_6 \cdot (\text{S}_\text{Moderate} \cdot \text{M}_\text{Laptop}) \\ &+ \beta_7 \cdot (\text{S}_\text{Extensive} \cdot \text{M}_\text{Laptop}) \\ &+ \epsilon \end{align} \]

where we tested whether there was a significant interaction between study condition and note-taking medium:

\[ H_0: \text{All}~~ \beta_j = 0 ~\text{(for j = 5, 6, 7)} \]

\[ H_1: \text{At least one}~ \beta_j \neq 0 ~\text{(for j = 5, 6, 7)} \]

Effects were considered statistically significant at \(\alpha = .05\). As we used a between-subjects design, we assumed independence of our error terms. We assumed linearity as all predictor variables were categorical. Equal variances were assessed via a plot of residuals against fitted values (the spread of residuals should be constant across the range of fitted values), and normality was assessed via a QQ-plot of the residuals (points should follow the diagonal line).

To address RQ2 and determine which conditions significantly differed from each other, we conducted a series of pairwise comparisons. Since we were interested in all pairwise comparisons of means, we applied a Tukey correction.
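
The interaction hypotheses stated above concern all three interaction coefficients at once. One way such a joint test could be carried out (it is not part of the provided analysis code) is an incremental F-test comparing the additive model with the interaction model. The sketch below uses simulated stand-in data with made-up values; `sim`, `mdl_add`, and `mdl_int` are hypothetical names, and with the real data you would fit both models to `notes` instead.

```r
set.seed(1)

# Simulated 4 x 2 between-subjects design, 20 participants per cell
# (values are illustrative only and are NOT the lab data)
sim <- expand.grid(
  id     = 1:20,
  study  = c("No", "Minimal", "Moderate", "Extensive"),
  medium = c("Longhand", "Laptop")
)
step <- as.integer(sim$study) - 1   # 0, 1, 2, 3 across study levels
sim$test_score <- 50 + 2 * step + 5 * step * (sim$medium == "Longhand") +
  rnorm(nrow(sim), sd = 2)

# Additive model (no interaction) versus the full interaction model
mdl_add <- lm(test_score ~ study + medium, data = sim)
mdl_int <- lm(test_score ~ study * medium, data = sim)

# Incremental F-test: H0 is that all three interaction coefficients are zero
int_test <- anova(mdl_add, mdl_int)
int_test
```

The test has 3 numerator degrees of freedom (one per interaction coefficient), which distinguishes it from the overall model F-statistic printed at the bottom of `summary()`.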

Act II: Results

Question 2

Attempt to draft a results section based on your detailed analysis strategy and the analysis provided.

Results - What To Include*

The results section should follow from your analysis strategy. This is where you would present the evidence and results that will be used to answer the research questions and can support your conclusions. Make sure that you address all aspects of the approach you outlined in the analysis strategy (including the evaluation of assumptions and diagnostics).

In this section, it is useful to include tables and/or plots to clearly present your findings to your reader. It is important, however, to carefully select what is the key information that should be presented. You do not want to overload the reader with unnecessary or duplicate information (e.g., do not present print outs of the head of a dataset, or the same information in tables and plots, etc.), and you also want to save space in case there is a page limit. Make use of figures with multiple panels where you can. You can also make use of an Appendix to present your assumption and diagnostic* plots/tables, but remember that you must evaluate these in-text within the results section and clearly refer the reader to the relevant plots within the Appendix.

As a broad guideline, you want to start with the results of any exploratory data analysis, presenting tables of summary statistics and exploratory plots. You may also want to visualise associations between/among variables and report covariances or correlations. Then, you should move on to the results from your model.

Descriptive statistics are displayed in Table 1.

Table 1: Descriptive Statistics

study	medium	M_Score	SD_Score	SE_Score	Min_Score	Max_Score
No	Longhand	51.02	2	0.45	48.02	54.92
No	Laptop	48.12	2	0.45	44.86	51.16
Minimal	Longhand	60.91	2	0.45	57.84	65.29
Minimal	Laptop	55.57	2	0.45	52.07	60.10
Moderate	Longhand	80.69	2	0.45	76.34	84.64
Moderate	Laptop	59.30	2	0.45	55.77	62.98
Extensive	Longhand	90.58	2	0.45	86.82	94.32
Extensive	Laptop	61.16	2	0.45	56.93	64.32
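
Table 1 above uses raw variable names as column headers, which is one of the presentation issues flagged earlier. A possible way to produce a cleaner table is sketched below using `kable()` from knitr and `kable_styling()` from kableExtra. The values are typed in from Table 1 so that the example runs on its own; `descript_demo`, the column labels, and the caption are illustrative choices, and in practice you would pass your own summary object.

```r
library(knitr)
library(kableExtra)

# Values copied from Table 1 so the example is self-contained
descript_demo <- data.frame(
  study  = rep(c("No", "Minimal", "Moderate", "Extensive"), each = 2),
  medium = rep(c("Longhand", "Laptop"), times = 4),
  M      = c(51.02, 48.12, 60.91, 55.57, 80.69, 59.30, 90.58, 61.16),
  SD     = 2,
  SE     = 0.45,
  Min    = c(48.02, 44.86, 57.84, 52.07, 76.34, 55.77, 86.82, 56.93),
  Max    = c(54.92, 51.16, 65.29, 60.10, 84.64, 62.98, 94.32, 64.32)
)

# Reader-friendly column names and an informative caption
tbl <- kable_styling(
  kable(descript_demo,
        format    = "html",
        digits    = 2,
        col.names = c("Study Condition", "Medium", "M", "SD", "SE", "Min", "Max"),
        caption   = "Descriptive Statistics for Test Scores by Study Condition and Note-Taking Medium"),
  full_width = FALSE
)
tbl
```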

In the No and Minimal study conditions, differences in test scores between those using a laptop and those writing longhand were comparatively small (approximately 3 and 5 points respectively). However, those in the longhand note-taking condition scored considerably higher than those using laptops in the Moderate and Extensive study conditions (approximately 21 and 29 points respectively). This suggested that there may be an interaction (see Figure 1).

Figure 1: Association between Test Score and Medium / Study Conditions

Test scores were analysed with a 4 (study: no vs minimal vs moderate vs extensive) \(\times\) 2 (medium: longhand vs laptop) categorical interaction model.

The model met assumptions of linearity and independence (see Appendix A, top left panel of Figure 4; residuals were randomly scattered with a mean of zero and there was no clear dependence), homoscedasticity (see Appendix A, bottom left panel of Figure 4; there was a constant spread of residuals), and normality (see Appendix A, top right panel of Figure 4; the QQplot showed very little deviation from the diagonal line).

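The overall model was significant, \(F(7, 152) = 1081, p < .001\), and each of the three interaction coefficients differed significantly from zero (all \(p \leq .007\); see Table 2), consistent with a significant interaction between study condition and note-taking medium. Full regression results, including 95% Confidence Intervals, are shown in Table 2.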

Table 2: RQ1 - Regression Table for Total Scores Model

	test score
Predictors	Estimates	CI	p
Intercept	51.02	50.14 – 51.90	<0.001
Study - Minimal	9.89	8.64 – 11.14	<0.001
Study - Moderate	29.67	28.42 – 30.92	<0.001
Study - Extensive	39.56	38.31 – 40.81	<0.001
Medium - Laptop	-2.90	-4.15 – -1.65	<0.001
Study - Minimal : Medium - Laptop	-2.44	-4.21 – -0.67	0.007
Study - Moderate : Medium - Laptop	-18.49	-20.26 – -16.72	<0.001
Study - Extensive : Medium - Laptop	-26.52	-28.29 – -24.75	<0.001
Observations	160
R² / R² adjusted	0.980 / 0.979

As displayed in Figure 2, results suggested that the difference in scores between note-taking media differed significantly across study conditions, where score differences appeared to get larger as the period of study increased (i.e., there was relatively little difference between longhand and laptop note-taking conditions when participants engaged in no study, but the gap in test scores grew as the length of study time increased).
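
To see how the dummy-coded coefficients in Table 2 map onto the pattern described here, the short sketch below rebuilds the model-implied cell means by hand. The coefficient values are copied from Table 2; the object names are just for illustration.

```r
# Coefficients copied from the regression table (Table 2)
b <- c(intercept = 51.02,
       minimal = 9.89, moderate = 29.67, extensive = 39.56,
       laptop = -2.90,
       minimal_laptop = -2.44, moderate_laptop = -18.49, extensive_laptop = -26.52)

# Study effects and interaction adjustments (0 for the reference level, No)
study_eff <- c(No = 0, Minimal = b[["minimal"]],
               Moderate = b[["moderate"]], Extensive = b[["extensive"]])
int_eff   <- c(No = 0, Minimal = b[["minimal_laptop"]],
               Moderate = b[["moderate_laptop"]], Extensive = b[["extensive_laptop"]])

# Model-implied mean for every cell of the 4 x 2 design
longhand   <- b[["intercept"]] + study_eff
laptop     <- b[["intercept"]] + study_eff + b[["laptop"]] + int_eff
cell_means <- round(rbind(Longhand = longhand, Laptop = laptop), 2)
cell_means

# Longhand minus Laptop gap within each study condition
gap <- round(cell_means["Longhand", ] - cell_means["Laptop", ], 2)
gap
```

The resulting cell means agree with the descriptive statistics in Table 1, and the gap between media grows from 2.90 points (No study) to 29.42 points (Extensive study), which is the interaction pattern described in the text.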

To explore the interaction further, and address RQ2, pairwise comparisons were conducted. Tukey’s Honestly Significant Difference comparisons (see Figure 3) indicated that the vast majority of pairwise comparisons were statistically significant. Only three comparisons were not: those in the minimal longhand condition did not significantly differ from those in either the moderate laptop (95% CI [-0.33, 3.55]) or extensive laptop (95% CI [-2.19, 1.69]) conditions, and there was no significant difference between those in the laptop condition who studied for either a moderate or an extensive period of time (95% CI [-3.80, 0.08]). Overall, differences in test scores across study conditions appeared more pronounced when using the longhand note-taking medium.

Figure 3: Tukey HSD Pairwise Comparisons
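
The confidence intervals reported for the pairwise comparisons do not appear in the printed `pairs()` output, which shows only estimates, standard errors, and adjusted p-values. One way intervals of this kind could be obtained is by calling `confint()` on the pairwise contrasts from emmeans. The sketch below uses simulated stand-in data with made-up values (all `sim_*` names are hypothetical); with the real analysis you would call `confint()` on `pairs_res` instead.

```r
library(emmeans)

set.seed(2)

# Simulated stand-in for the notes data (illustrative values only)
sim <- expand.grid(
  id     = 1:20,
  study  = c("No", "Minimal", "Moderate", "Extensive"),
  medium = c("Longhand", "Laptop")
)
lvl <- as.integer(sim$study)   # 1 to 4 across study levels
sim$test_score <- 50 + 3 * lvl + 4 * lvl * (sim$medium == "Longhand") +
  rnorm(nrow(sim), sd = 2)

# Fit the interaction model and estimate the eight cell means
sim_mdl <- lm(test_score ~ study * medium, data = sim)
sim_emm <- emmeans(sim_mdl, ~ study * medium)

# All pairwise contrasts; pairs() applies a Tukey adjustment by default
sim_pairs <- pairs(sim_emm)

# 95% confidence intervals for each contrast; these should use the same
# family-wise (Tukey) adjustment as the p-values
pairs_ci <- confint(sim_pairs, level = 0.95)
pairs_ci
```

An interval that contains zero corresponds to a comparison that is not statistically significant at the adjusted level.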

Act III: Discussion

Question 3

Attempt to draft a discussion section based on your results and the analysis provided.

Discussion - What To Include

In the discussion section, you should summarise the key findings from the results section and provide the reader with a few take-home sentences drawing the analysis together and relating it back to the original question.

The discussion should be relatively brief, and should not include any statistical analysis - instead think of the discussion as a conclusion, providing an answer to the research question(s).

Assumptions & Diagnostics Appendix

Question 4

Given that the report should be kept as concise as possible, you may wish to utilize the appendix to present assumption and diagnostic plots. You must however ensure that you have:

Described what assumptions you will check in the analysis strategy, including how you will evaluate them.
Summarized the evaluations of your assumptions and diagnostic checks in the results section of the main report.
Accurately referred to the figures and tables labels presented in the appendix in the main report (if you don’t refer to them, the reader won’t know what they are relevant to!).

Section B: Block 3 (Weeks 1-5) Recap

In the second part of the lab, there is no new content - the purpose of the recap section is for you to revisit and revise the concepts you have learned over the last 4 weeks (or the full academic year if you feel that it would be beneficial to revise the materials from blocks 1 & 2 too).

We would encourage you to complete any outstanding work on these exercises (e.g., complete partial write-ups), and review solutions.

Given that we are now \(\frac{3}{4}\) of the way through the DAPR2 course, we would also strongly encourage you to start creating your revision materials in advance of the exam. You can access all the flashcards that you’ve been presented with in this block here. These will provide a good starting point for collating your notes together on the contents of blocks 1, 2, and 3. We also suggest that you review your weekly quiz feedback (as many of you have learned in Psychology 2A, it is important to provide feedback to allow learners to improve their learning and retention of information, as well as correct any misunderstandings!).

Footnotes

¹ A factorial ANOVA compares means across two or more independent variables (each with two or more levels) and their interaction.