DAPR3 Report 2023/24

Own work policy

Please note that this is an individual assignment and you are expected to work on your own with respect to both R code and report.
Similarity checks will be performed and further investigations will be carried out on assignments exceeding a certain threshold.

The use of large language models (e.g. ChatGPT) is not permitted, nor is it recommended to rely on such tools for conducting or writing up statistical analyses.

General Info

Coursework set: Thursday 19th October at Midday
Coursework due: Thursday 9th November at Midday

This report will count for 30% of the final course grade.

Questions

We are here to help and to clarify anything we can. However, we will not answer direct questions such as “Is this [part of my coursework] correct?”

What we would like you to do is think about why you are asking the question. If it is because you are unsure about a section of the material, look back over it, come and discuss the examples from class, and then apply that to the coursework.

Please ask questions on the discussion forum so that all students may benefit from the answer (please also check that your question has not already been posted!).

Instructions

Your task is to describe and analyse the data provided in order to answer a set of research question(s). Analyses will draw on the methodologies we have discussed in lectures and labs. The specific study contexts and research questions can be found below.

You are required to submit two files:

a complete report knitted to PDF (max 4 pages excluding appendix)
and the associated .Rmd file

As the compiled document will not contain visible R code, a large part of the challenge comes in clearly describing all aspects of the analysis procedure. A reader of your compiled document should be able to more or less replicate your analyses without referring to your R code.

Your compiled report (.pdf) should contain:

Clear written details of the analysis conducted in order to answer the question, including transparency with regards to decisions made about the data prior to and during analysis.
Results, in appropriate detail (for instance, a test statistic and p-value, mean and standard error, not just one of these).
Presentation of results where appropriate (in the form of tables or plots).
Interpretation (in the form of written paragraphs referencing relevant parts of your results) leading to a conclusion regarding the question.
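
As a minimal sketch of what "appropriate detail" can look like, the snippet below reports a mean together with its standard error rather than the mean alone. The vector scores is hypothetical and is not taken from the coursework data.

```r
# Hypothetical scores, for illustration only (not the coursework data)
scores <- c(52, 61, 47, 70, 58, 66)

# Mean and standard error of the mean: sd / sqrt(n)
m  <- mean(scores)
se <- sd(scores) / sqrt(length(scores))

# Report both values together, rounded for presentation
summary_text <- sprintf("M = %.2f, SE = %.2f", m, se)
print(summary_text)
```

The same idea applies to inferential results: a test statistic would be reported alongside its p-value, not instead of it.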

Your RMarkdown file (.Rmd) should also contain:

Code that will successfully conduct the analysis described in your report, and return the exact results, figures and tables that are detailed in your report.

Knitting to PDF

Please note that to knit to pdf, you should:

Make sure the tinytex package is installed.
Run tinytex::install_tinytex() in the console.
Make sure the ‘yaml’ (the bit at the very top of your document) looks something like this:

---
title: "this is my report title"
author: "B1234506"
date: "13/08/2021"
output: bookdown::pdf_document2
---

If you cannot knit to pdf, then: Knit to html file; open your html in a web-browser (e.g. Chrome, Firefox); print to pdf (Ctrl+P, then choose to save to pdf); submit the pdf you just saved.

Report Structure

Your report should include three sections:

Analysis Strategy
Results
Discussion

You do not need to include an introduction to the study unless you feel it is helpful in writing your analysis strategy.

1. Analysis Strategy

In this section you should describe how you are going to answer each of the research questions. The marking of this section will be based on the completeness of your descriptions of:

data cleaning and variable recoding
any descriptive statistics or visualizations you will use prior to running models,
a description of the models you specified (including rationale for both the fixed effects and random effect structure),
what information from the models (e.g. which parameter estimates) answer the research questions
how these estimates and models will be evaluated,
how you will check your model (assumptions and diagnostics including the criteria you have used to evaluate each), AND
rationales for all choices.

Your analysis strategy should not contain any results.

Hint:
A reader of your report should be able to more or less replicate your analyses without referring to your R code.

2. Results

The results section should follow logically from your analysis strategy and present the results of all aspects of your approach. A typical structure would begin by presenting descriptive statistics and move on to inferential tests. Things to remember:

All key model results should be presented (tables are very useful) in the main body of the report
You should provide full interpretation of key results
Model assumption and diagnostic checks should be discussed. This can be brief, and should link to both the criteria set out in your strategy and the assumptions appendix.

3. Discussion (very brief)

The Discussion section should contain very brief (1-2 sentence) summary statements linking the formal results to each of the research questions. The marking of this section will be based on the coherence and accuracy of these statements. This should not contain repetition of detailed statistical results, but should refer to those presented in the analysis.

Appendix (optional)

In addition to the above, you may include an assumption appendix. This section has no page limit. You may use this to present assumption and diagnostic plots.
Please note:

you must still describe your assumption tests in your strategy, including how you will evaluate them.
you must also still summarise the results in the results section of the main report.
you must refer accurately to the figure and table labels presented in the appendix.
The appendix is only for assumption and diagnostic figures and results. Any results from your main models included in the appendix will not be marked.

Formatting

The focus of this report is on your ability to create reproducible results, implementing analyses to answer research questions and interpreting the results. However, we do require that the reports are neatly formatted and written clearly. Below are some pointers:

Figures and tables should be numbered and captioned, and referred to in the text; important statistical outcomes should be summarised in the text.
Reporting should be clear and consistent. If in doubt, follow APA 7th Edition guidelines.
Your report should be a maximum of 4 sides (excluding the assumptions appendix) when knitted to PDF using the default formatting and font settings within RStudio.
Code chunks should be hidden in the pdf produced by your rmd file. To tell RMarkdown to not show your code when knitting, add echo=FALSE next to the r that appears after the backticks.

For a guide on writing in RMarkdown, please see the Rmd-bootcamp lessons at https://uoepsy.github.io/scs/rmd-bootcamp/.

Submission

Please submit both files on-line via the Turnitin links on the LEARN page for dapR3. There will be two links, clearly labelled, as the files need to be submitted individually. The submission links will be within the Assessments tab and will become available after you click on the “Own work declaration” link.

Please name your files using only your exam number. For instance, B123405.Rmd and B123405.pdf.

Late Penalties

Submissions are considered late until both files are submitted on Turnitin.

Grading Rubric

Compiled reports will be assessed according to the following components, with the following weightings:

Analysis Strategy = 40%
- ability of methods to answer the research questions
- clarity in description of methods and how these answer research questions (without reliance on R syntax)
- detail allowing for methods to be fully replicated
- detail of statistical assumptions underlying methods, how these will be assessed and what approaches will be undertaken in the event of assumptions being violated
Results = 40%
- presentation of key estimates/tests to answer the research questions
- completeness of relevant information for key estimates/tests
- clarity and accuracy in interpretation of key estimates/tests
- presentation of auxiliary results such as those relating to covariates
Discussion = 10%
- accuracy and clarity of link between relevant findings and research questions
Writing and formatting = 10%
- clarity and conciseness of writing
- appropriate formatting of any equations
- ability to present any figures and tables in ‘publication-ready’ style

The overall mark will be rounded to the nearest value on the extended common marking scheme for Psychology.

Code Penalties

We will apply penalties for non-reproducibility of results.
Prior to submitting, check the following:

Does the code provided lead to the exact same results that you have reported?

A good way to assess this is by checking that the file knits successfully. If there are any errors in your code, this may prevent your file from knitting (see Rmd-Bootcamp Lesson 2).

If the code doesn’t provide the reported results, your grade will be penalised by deducting 10 points from your final grade.

In other words, a 62 becomes 52, and a 42 becomes 32.
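
The deduction above can be sketched as a small function. The name apply_code_penalty is hypothetical, and the floor at 0 is an assumption that the brief does not state.

```r
# Sketch of the reproducibility penalty described above.
# Assumption (not stated in the brief): a grade cannot drop below 0.
apply_code_penalty <- function(grade, reproducible) {
  if (reproducible) grade else max(grade - 10, 0)
}

apply_code_penalty(62, reproducible = FALSE)
apply_code_penalty(42, reproducible = FALSE)
apply_code_penalty(62, reproducible = TRUE)
```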

If your submission files are not .Rmd & PDF, you might lose all points and get a grade of zero.

If you have any technical issues with the Turnitin submission, please contact the Teaching Office, and cc the PPLS.Psych.Stats@ed.ac.uk email address.

REPORT TASK

Conduct and report on an analysis that addresses the research aims of the study detailed below.

The data is available at: https://uoepsy.github.io/data/dapr3_2324_report.csv.
Please note the data will only be available from the release date onwards (see key dates).

Study Description
The present study is interested in how the ‘1/2/3/4/5 star reviews’ of Edinburgh fringe shows are associated with attendee enjoyment, and whether this differs depending upon attendees’ prior exposure to the ratings before seeing a show.

52 participants took part in the study, and at the outset completed a questionnaire that assessed participants’ preference for 5 types of show: “comedy”, “theatre”, “spoken word”, “musical” and “dance, physical theatre and circus”. 10 shows at the Edinburgh festival fringe were randomly selected, including two shows with a 1-star rating, two with a 2-star rating, two 3-star shows, two 4-star shows, and two 5-star shows. Every participant was given tickets to all 10 shows. Half of the participants were given flyers that included these ratings on them ahead of time, while the other half were not told anything about the show they were going to see. Participants all attended the same screening of these shows, and were asked afterwards to rate (from 0 to 100) how much they enjoyed the show.
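
The layout described above can be sketched as a grid, which gives a rough idea of how many observations to expect. This is only an illustration: which show IDs carry which rating, and which participant IDs received flyers, are made up here, and the real data may contain missing or unusual values.

```r
# Every participant attends every show: 52 participants x 10 shows
design <- expand.grid(ppt = 1:52, show = 1:10)

# Two shows at each star rating from 1 to 5
# (the mapping of show IDs to ratings is hypothetical)
design$stars <- rep(1:5, each = 2)[design$show]

# Half of the participants receive flyers
# (the split by participant ID is hypothetical)
design$flyer <- as.integer(design$ppt <= 26)

nrow(design)               # rows expected if no responses are missing
table(design$stars) / 52   # number of shows at each star rating
```

Comparing these expected counts with the counts in the real file is one simple way to spot missing or duplicated rows before modelling.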

Research Question: After accounting for people’s preferences for all 5 types of show, are higher star-ratings associated with more enjoyment of the shows, and does this depend on whether or not attendees know the rating beforehand?

Table 1:
Data Dictionary
variable	description
ppt	Participant ID
pref_c	Comedy Rating. Higher scores indicate greater preference for comedy shows (sum score from 5 questions each scored on a 5-point Likert scale)
pref_t	Theatre Rating. Higher scores indicate greater preference for theatre shows (sum score from 5 questions each scored on a 5-point Likert scale)
pref_sw	Spoken-Word Rating. Higher scores indicate greater preference for spoken word shows (sum score from 5 questions each scored on a 5-point Likert scale)
pref_m	Musical Rating. Higher scores indicate greater preference for musical shows (sum score from 5 questions each scored on a 5-point Likert scale)
pref_dpc	Dance/Physical-Theatre/Circus Rating. Higher scores indicate greater preference for dance/physical theatre/circus shows (sum score from 5 questions each scored on a 5-point Likert scale)
flyer	Experimental condition indicating if participants were given flyers (on which the X-star rating was included) prior to the show. 0 = No Flyer, 1 = Flyer
show	Show ID
stars	X-Star Rating of the show attended. Ranges from 1 to 5.
enj	Post-attendance enjoyment rating for a given show. Measured on a sliding scale, scores range from 0 to 100
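
The ranges in the data dictionary can be turned into simple checks, which is one possible starting point for data cleaning. The sketch below is illustrative only: check_dictionary is a hypothetical helper, the toy rows are made up, and the 5 to 25 range for the preference scores assumes each of the 5 questions is scored from 1 to 5.

```r
# Check a data frame against the ranges stated in the data dictionary.
# Returns a named logical vector: TRUE means the check passed.
check_dictionary <- function(df) {
  prefs <- c("pref_c", "pref_t", "pref_sw", "pref_m", "pref_dpc")
  expected <- c("ppt", prefs, "flyer", "show", "stars", "enj")
  if (!all(expected %in% names(df))) {
    return(c(columns_present = FALSE))
  }
  pref_values <- unlist(df[prefs])
  c(
    columns_present = TRUE,
    # Assumption: each of the 5 items is scored 1-5, so sums lie in 5-25
    prefs_in_range  = all(pref_values >= 5 & pref_values <= 25, na.rm = TRUE),
    flyer_is_0_or_1 = all(df$flyer %in% c(0, 1)),
    stars_1_to_5    = all(df$stars %in% 1:5),
    enj_0_to_100    = all(df$enj >= 0 & df$enj <= 100, na.rm = TRUE),
    no_missing      = !any(is.na(df[expected]))
  )
}

# Toy rows (made up) with one deliberately out-of-range enjoyment score
toy <- data.frame(
  ppt = c("p1", "p1"), pref_c = 12, pref_t = 20, pref_sw = 9,
  pref_m = 15, pref_dpc = 25, flyer = 1, show = c("s1", "s2"),
  stars = c(1, 5), enj = c(43, 120)
)
check_dictionary(toy)
```

On the real file, the same function could be applied to the data frame returned by read.csv() on the URL given above. Any failed check would then need a documented decision in the analysis strategy.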