USMR Group Project 2025/26
Due: Thursday 11th December at 12 Noon
Your task is to describe and analyse some data in order to provide answers to a set of research questions. Analyses will draw on the methodologies we have discussed in lectures, readings and lab exercises.
Stop! Before you do anything else, make sure that every member of the group has a record of everyone else’s exam numbers, which can be found on their matriculation cards (see here for more information). Whoever submits the coursework on behalf of the group will need to provide these numbers at submission time.
Group work policy
Please note that each group should submit a single report (two documents: see what-you-need-to-submit). All group members are expected and encouraged to contribute to the report, but you should not work with other groups.
Similarity checks will be performed, and further investigations will be carried out if assignments from different groups are similar.
What you need to submit
Each group is required to submit 2 documents.
- A final compiled report (.pdf format), detailing your analyses, results, interpretation and conclusions.
- No longer than 6 pages, excluding optional appendix (see below)
- Doesn’t contain visible R code. Sort of like the “analysis and results” section of a published paper, which describes, reports, presents and interprets (more detail in how-to-approach-the-questions).
- The .R or .Rmd document which reproduces the results you give in the report.1
Page limit
Your report should be no longer than 6 pages.
If you are knitting to html, open the html file in a browser and print to pdf in order to check how many pages your submission takes up.
An OPTIONAL Appendix of maximum four pages2 can be used to present additional tables and figures, if you wish (but there is no requirement to do this).
The appendix is a good place for supplementary materials. By this we mean figures and tables that are not strictly necessary for the reader to understand and replicate your results, but provide additional context to your report.
- good use of the appendix:
  "The model met the assumptions of linear regression with residuals showing a constant mean of approximately zero across the fitted values (see appendix Fig X), and …"
- not so good use of the appendix:
  "The model met assumptions (see appendix Fig X)."
Report Formatting
We expect most groups to do their analyses using a standalone R script, and to write the report in a separate word-processor document detailing those analyses. You should then export the word-processor document to .pdf format for submission. The R Script should remain as a text file with a .R suffix.
For some groups, it may be sensible to use a collaborative word-processor such as Google Docs. Unfortunately there are no straightforward ways to edit .R documents collaboratively; we suggest that each group ensures it knows who holds the “master copy” of the script that will be submitted.
Some groups may find it useful to use RMarkdown (.Rmd) files for initial formatting. For example, you might like to knit to a Word file from an RMarkdown document, and to subsequently make edits to your text and formatting in the Word document itself (or in a Google Doc obtained by uploading the Word document to Google Drive).3
We don’t mind which of these approaches each group takes: The important thing to remember is that the data analysis and modelling results in the report should match those produced in your R or RMarkdown file.
If you do wish to do some or all of your formatting in RMarkdown, then we suggest the following readings for help:
Feel free also to post formatting questions on the Piazza discussion forum.
A note on knitting .Rmd directly to pdf
Getting RMarkdown to knit directly to pdf can be a pain, and formatting is difficult.
We recommend:
- knit to .html, then Ctrl+P to print to .pdf
- knit to .docx, then export to .pdf
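As a sketch of the first route, an .Rmd can be rendered to .html from the R console and the result opened in a browser for printing (the filename "report.Rmd" is a placeholder for your own file):

```r
# Render the report to .html; open the resulting file in a browser
# and use Ctrl+P / print-to-pdf to produce the final .pdf.
rmarkdown::render("report.Rmd", output_format = "html_document")
```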
Template files
If you use RStudio on the PPLS Server, you will need to upload the template .R or .Rmd file to your space on the server in order to use it.
Template .R file
If you plan to use plain R, a small template for your project can be downloaded here.
Template .Rmd/.qmd file
For groups who would like to use RMarkdown (or Quarto4) you can find a template file to use if you wish:
Again, using the templates will ensure that you have the correct data. In the templates, you will also find empty sections for you to add R code to. However, you should feel free to change these if you wish: These are just templates, and are mainly designed to make your life easier. The templates set echo = FALSE for all code-chunks. This means that your R code (but not the output) is hidden in the compiled document. Both templates currently compile to .html files, but you can change this if you wish.
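If you build your own .Rmd rather than starting from a template, the same code-hiding behaviour can be switched on with one line in a setup chunk at the top of the document (a minimal sketch of the option the templates already set):

```r
# In the first chunk of the .Rmd: hide all R code in the compiled
# document, while still showing the output it produces.
knitr::opts_chunk$set(echo = FALSE)
```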
Submitting your files
Pre-submission checks
Before submitting, we strongly advise you to check that your group’s code runs. The easiest way to check this is to:
- if using an .R script: Clear your environment, restart your R session (top menu, Session > Restart R), and run your code line by line to see if any errors arise.
- if using RMarkdown: Check that your .Rmd compiles (i.e., can you knit your RMarkdown document into .html/.pdf/.docx without error?)
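One way to approximate the .R-script check in a single step, after restarting your R session (the filename "WilcoxonWoodpeckers.R" is a placeholder for your own script):

```r
# Start from a clean slate, then run the whole script, echoing each
# line so that any error is easy to locate.
rm(list = ls())
source("WilcoxonWoodpeckers.R", echo = TRUE)
```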
If you use RStudio on the PPLS Server, you may need to export the file to your computer in order to upload it to Turnitin.
Filenames
For both files which you submit, the filename should be your group name with the appropriate extension, and nothing else.
For example, the group Wilcoxon Woodpeckers would submit two files:
- one of WilcoxonWoodpeckers.R / WilcoxonWoodpeckers.Rmd
- WilcoxonWoodpeckers.pdf
For anyone who has obtained permission to complete the task individually, please name each file with your exam number (the letter “B” followed by a six digit number - which can be found on your student card: See here for more information). For example, if your exam number was B123456 you would submit:
- one of B123456.R / B123456.Rmd
- B123456.pdf
You should also write your exam number in the “Submission Title” box prior to submission.
Where to submit
ONLY ONE PERSON FROM EACH GROUP NEEDS TO SUBMIT
We suggest that you do this together/on a call, so that all group members are able to confirm that they are happy to submit.
Go to the Assessments page on Learn, and look for "Assessment Submission". There you will find an own-work declaration, which you must mark as reviewed before the two submission boxes become visible (one for each file).
For each file you should complete the “Submit File” popup by entering the exam numbers of all of your group members in the “Submission Title” box (see below).
Late penalties
Submissions are considered late until both files are submitted on Turnitin (see the PPLS policy on late penalties on the MSc Hub).
Grading
We are primarily marking each group’s report, and not your code
Grades and feedback are provided for the finished reports, with marks awarded for providing evidence of the ability to:
- understand and execute appropriate statistical methods to answer each of the questions
- provide clear explanation of the methods undertaken
- provide clear and accurate presentation and interpretation of results and conclusions.
Why we still want your code
We still require your code so that we can assess the reproducibility of your work. We also use it as a way to give you extra marks based on the elegance of your coding and/or use of RMarkdown.
- Five points will be deducted from your final grade if your marker cannot determine how the results in your report were generated in your .Rmd/.R document (for example, if the code produces errors or produces values different from those reported). This means that a 75 out of 100 becomes 70 out of 100.
- Up to ten points will be added for good use of R or RMarkdown (for example, where code is elegant, or when inline R code is used to report results in the text). For example, a 70 might be raised up to 80.
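As an illustration of the inline-code bonus, values can be pulled straight from a fitted model into the text of an .Rmd, so that the report always matches the code. A minimal sketch (the model object `mod` and the coefficient position are hypothetical):

```markdown
The estimated effect was `r round(coef(mod)[2], 2)`
(SE = `r round(summary(mod)$coefficients[2, 2], 2)`).
```

When the document is knitted, the backticked expressions are replaced by their computed values.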
Peer-adjusted marking
Once the group project has been submitted, every member of the group will complete the peer-assessment, in which you will be asked to award “mark adjustments” to yourself and to each of the other members of your group. This will be done through Learn; details will be made available over the next couple of weeks. Each submitted mark should be understood as follows: Relative to the group as a whole, how much did each member contribute? If someone made an average contribution, you should award that person the middle score. If they contributed substantially more, you may choose to give a higher mark; if they genuinely contributed less, you might choose a lower mark. Marks for each group member are scaled then averaged together, and then used as “weights” to adjust the overall project mark. You can see an example of how this logic works by visiting https://uoe-psy.shinyapps.io/peer_adj/ where there is a “live demo” of a peer-adjustment system.
Please also be aware that if you don’t fill out the form, then other members’ ratings will hold more weight in the adjustments made.
How to approach the task
For each of the questions below, the report (final .pdf) is expected to include:
- A) Clear written details of the analysis conducted and how it can provide an answer to the question, including transparency with regards to decisions made about the data prior to and during analyses.
- B) Presentation of results where appropriate (in the form of tables or plots).
- C) Statistical findings, in appropriate detail (for instance, a test statistic, standard error, and p-value, not just one of these). Remember to cite degrees of freedom where needed.
- D) Interpretation (in the form of written paragraph(s) referencing relevant parts of your results and statistics) leading to a conclusion regarding the question.
The code you write in your submitted .R/.Rmd file should successfully undertake the analysis described in A), which returns C). You should also include the code to produce B).
Important (Helpful) Tips:
The .pdf report should not contain visible R code, meaning that a large part of the challenge comes in clearly describing all aspects of the analysis procedure. A reader of your report should be able to more or less replicate your analyses without referring to your R code (or using R, if they are unenlightened).
Write as if the reader has a very basic understanding of statistics and does not necessarily use R.
You do not need to include information about the study background or the collection of the data in your report.
Any Questions?
This document contains a basic overview of the task and of how to submit it. If you have any questions concerning the coursework report, we ask that you post them on the designated section of the Piazza discussion forum on Learn. If you have a question, it is likely that your classmates have the same question. Before posting a question, please check the online board in case it has already been answered.
THE COURSEWORK
Getting the data
You can access the data by running the following lines, where you replace the words wilcoxon_woodpeckers with your group name (keep the quotes). (These lines of code are also in the templates, above.)
source("https://edin.ac/42QTz8T")
get_my_data(group_name = "wilcoxon_woodpeckers")

For anyone who has obtained permission to complete the task individually, please use your exam number (the letter "B" followed by a six-digit number - which can be found on your student card: See here for more information), and set individual = TRUE in the get_my_data() function.
source("https://edin.ac/42QTz8T")
get_my_data(group_name = "B123456",
            individual = TRUE)

You may want to put the (edited) lines above at the beginning of your analysis script to ensure that you are analysing the correct data.
After running the code above, data for all three parts of the task (below) will be found in your environment. These will be called pilotA, pilotB, pilotC for Part 1, nudges for Part 2, and followup for Part 3.
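A quick sanity check after running the lines above can confirm that everything arrived before any analysis starts (a sketch that assumes get_my_data() has just been run in the same session):

```r
# Confirm that all five data frames were created by get_my_data().
stopifnot(exists("pilotA"), exists("pilotB"), exists("pilotC"),
          exists("nudges"), exists("followup"))
str(pilotA)   # then inspect the structure of each data frame in turn
```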
Please note: The data you are analysing is not real! Although we have tried to make the values plausible, you don't need to know anything about behavioural 'nudges' or the environmental footprints of digital devices in order to complete the report, which should be written as if the data was collected as described.
Note also that the data you obtain will be unique to your group, which means that the results of any statistics you run will also be unique to the group. In some cases, you may find different significant effects to those of other groups: This is nothing to worry about, and your markers will know how to check your individual results.
Part 1
Suggestion: 2 pages
A group of researchers is interested in investigating the efficacy of small 'nudges' to help encourage people's pro-environmental behaviours.
Their first foray into this field was to complete 3 small pilot studies. Two of these investigated pre-existing nudges from the literature.
- For pilot study A, they investigated the idea that placing googly eyes on light-switches results in lights being turned off more when people leave a room. They took 120 rooms in Edinburgh University buildings, and put googly eyes on the light switches of 40 of them. In another 40 they put a little written message above each light-switch asking people to turn it off, and in the remaining 40 they left the switches as normal. A week later, they went around all the rooms at the end of the day and recorded whether the lights had been left on or not.
- For pilot study B, the researchers went to the Brewbox Coffee stand outside 7 George Square, and asked them to record the numbers of coffees sold each day over two months—one month where they offered a 50p discount for using keep-cups, and one month where they did not have a discount.
- Pilot study C investigated whether there was an association between people's level of environmental concern and their usage of digital devices. The researchers randomly sampled people walking through George Square gardens, and asked them to complete a short questionnaire (9 questions, each scored on a scale from 1–5), and also asked them for their smartphone usage (in minutes) in the previous day. They stopped collecting data once they had 30 people for whom they were able to get an accurate measure of smartphone usage.
The researchers collect all their data from these pilot studies, and then reach out to your team, and ask you to conduct the analyses. The researchers ask you:
- Were lights in the 120 Edinburgh University rooms being turned off independently of what had been stuck to the light-switches (googly eyes, a written message, or nothing)?
- Does Brewbox sell more coffees if they offer a keep-cup discount?
- Is the amount of time people spend on digital devices associated with their level of environmental concern?
Data dictionary for pilotA:

| variable | description |
|---|---|
| room | Room name |
| condition | What was presented on the lightswitch in the room (0 = nothing, 1 = a written request to turn off lights, 2 = googly eyes) |
| lights | Whether the lights were off (0) or on (1) when the researchers returned a week later |
Data dictionary for pilotB:

| variable | description |
|---|---|
| discount | Discount offered (50p keep cup discount vs no discount) |
| ncoffees | Number of coffees sold |
Data dictionary for pilotC:

| variable | description |
|---|---|
| pid | Participant ID |
| env_concern | Environmental Concern Score (sum of 9 questions each measured on a 1-5 Likert scale) |
| device_usage | Device usage (in minutes) recorded by the participant's device for the previous day |
Part 2
Suggestion: 2 pages
The researchers have also devised their own ‘nudge’. This is a smartphone app along with an internet browser add-in that provides a live-readout of the environmental footprint of device usage.
They recruited 216 people who all completed the same environmental concern questionnaire that was used in pilot study C. All participants agreed to install the app on their devices and keep it for the following month. For all participants, the app recorded the environmental footprint of their device usage. Participants were randomly allocated into one of three conditions. In the first condition, the purpose of the app was explained to the participants, and the live-readout was visible at all times on their device (meaning these participants received a "constant nudge"). In the second condition, participants were told about the purpose of the app, but rather than a constantly visible read-out, participants had to access the app to see their environmental footprint (therefore these participants received an "opt-in nudge"). In the third condition, participants were not told the purpose of the app, and they could not see or access their environmental footprint metric (therefore these participants did not receive the 'nudge').
The researchers wanted to investigate whether the different types of nudge resulted in a lower environmental footprint, and whether this depended on the levels of environmental concern expressed by device users.
Once again, your team is asked to conduct an appropriate analysis and write up a report.
Data dictionary for nudges:

| variable | description |
|---|---|
| ppt | Participant Name |
| age | Participant age (years) |
| env_concern | Environmental Concern Score (sum of 9 questions each measured on a 1-5 Likert scale) |
| op_sys | Operating system of device (Android vs Iphone) |
| nudged | The condition in which the participant was allocated, indicating whether they received a 'constant nudge' (an always-on live readout of their device footprint), an 'opt-in nudge' (a readout of their device footprint only via accessing the app), or 'no nudge' (the app recorded but did not show the user their device footprint). These are coded as 0 = 'no nudge', 1 = 'opt-in nudge', 2 = 'constant nudge'. |
| EF | Environmental Footprint score (composite measure based on device battery usage and average server usage of all sites visited) |
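Because `nudged` is stored as numeric codes, one common preliminary step is to convert it to a labelled factor before modelling. A minimal sketch using made-up codes (the real column lives in the `nudges` data frame loaded by `get_my_data()`):

```r
# Toy stand-in for the 'nudged' column; replace 'codes' with
# nudges$nudged when working with the real data.
codes <- c(0, 2, 1, 0, 2, 1)
nudged <- factor(codes, levels = c(0, 1, 2),
                 labels = c("no nudge", "opt-in nudge", "constant nudge"))
table(nudged)   # counts per condition
```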
Part 3
Suggestion: 2 pages
Three months after the study, the researchers followed up with as many of the participants as possible, and asked them to indicate simply whether they still had the app installed or not.
They come to you—their faithful team of young statisticians—once more! This time their question is more exploratory: they want to know, specifically for people who had received either of the nudges, what things independently predicted that they would uninstall the app.
Data dictionary for followup:

| variable | description |
|---|---|
| ppt | Participant Name |
| installed | Whether the participant still had the app installed when followed up 3 months after the study (1 = installed, 0 = uninstalled) |
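Since Part 3 concerns only people who received one of the two nudges, the `followup` data will need to be combined with the Part 2 variables, matching on participant. A hedged base-R sketch that assumes the data frames from `get_my_data()` are in the environment:

```r
# Join follow-up status onto the Part 2 data by participant name,
# then keep only those allocated to the opt-in or constant nudge.
dat3 <- merge(nudges, followup, by = "ppt")
dat3 <- subset(dat3, nudged %in% c(1, 2))
```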
Footnotes
.qmd is also acceptable.↩︎
meaning that your final .pdf file should be max 10 pages.↩︎

https://rstudio.ppls.ed.ac.uk supports knitting to .html, .pdf and word (.docx); choose by clicking the drop-down arrow next to "Knit".↩︎
think RMarkdown version 2, as seen occasionally in lectures!↩︎