Identifying grouping structure
In this lab, you will start to build skills that you’ll use again and again for the next five weeks (and really, for the rest of your data analysis career!).
You’ll practice identifying the grouping structure within a given dataset and thinking about what kind of variability each grouping variable contributes.
In the questions below, you’ll apply the tools you saw in the lectures to three different group-structured datasets.
(Next week, you’ll apply new tools to these same three datasets to understand their grouping structures even more deeply!)
- Create a new .Rmd file for this week’s exercises.
- Save it somewhere you can find it again.
- Give it a clear name (for example, dapr3_lab01.Rmd).
- In the first code chunk, load the packages you’ll need this week:
  - tidyverse
Clothing
Read in the dataset located at https://uoepsy.github.io/data/dapr3_mannequin.csv and name it clothing using the following line of code.
clothing <- read_csv("https://uoepsy.github.io/data/dapr3_mannequin.csv")
RQ: Are people more likely to purchase clothing when they see it displayed on a model, and is this association dependent on item price?
| variable | description |
|---|---|
| purch_rating | Purchase rating (sliding scale 0 to 100, with higher ratings indicating greater perceived likelihood of purchase) |
| price | Price presented for item (range £5 to £100) |
| ppt | Participant identifier |
| condition | Whether items are seen on a model or on a white background |
Thirty participants were presented with a set of pictures of items of clothing, and rated how likely they were to buy each item. Each participant saw 20 items, ranging in price from £5 to £100. Fifteen participants saw these items worn by a model, while the other fifteen saw the items hanging against a white background.
Based on the RQ:
- Which variable is the outcome variable, aka the dependent variable?
- Which variable or variables is/are the predictor variable(s), aka the independent variable(s)?
What grouping variable(s) does this dataset contain?
- Which grouping variable(s) contribute reproducible/manipulated/controlled variability?
- Which grouping variable(s) contribute random/non-manipulated/non-controlled variability?
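One way to explore questions like these is to count how many rows each level of a candidate grouping variable contributes, and whether a predictor changes within a group or only between groups. The sketch below uses a small toy tibble, not the real data: the column names are taken from the table above, but every value (and the name toy_clothing) is invented for illustration.

```r
library(tidyverse)

# Toy data with the same column names as the clothing dataset.
# All values are invented; this is NOT the real dataset.
toy_clothing <- tibble(
  ppt          = rep(c("p1", "p2", "p3", "p4"), each = 3),
  condition    = rep(c("model", "model", "background", "background"), each = 3),
  price        = rep(c(5, 50, 100), times = 4),
  purch_rating = c(60, 45, 20, 70, 50, 30, 55, 35, 10, 40, 30, 15)
)

# How many rows does each participant contribute?
# More than one row per participant means ratings are clustered within ppt.
obs_per_ppt <- toy_clothing %>%
  count(ppt)

# How many distinct conditions does each participant see?
# A value of 1 for everyone means condition varies between participants only.
conditions_per_ppt <- toy_clothing %>%
  group_by(ppt) %>%
  summarise(n_conditions = n_distinct(condition))

obs_per_ppt
conditions_per_ppt
```

The same two calls can be run on the clothing data frame once it has been read in; what you find there may of course differ from this toy example.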
Monkey status
Read in the dataset located at https://uoepsy.github.io/data/msmr_monkeystatus.csv and name it monkey.
RQ: How is the social status of monkeys associated with their ability to solve problems, while controlling for the difficulty of the problem?
| variable | description |
|---|---|
| status | Social status of monkey (adolescent, subordinate adult, or dominant adult) |
| difficulty | Problem difficulty ('easy' vs 'difficult') |
| monkeyID | Monkey name |
| solved | Whether or not the problem was successfully solved by the monkey |
Researchers have given a sample of Rhesus Macaques various problems to solve in order to receive treats. Troops of Macaques have a complex social structure, but adult monkeys can be loosely categorised as having either a “dominant” or “subordinate” status. The monkeys in our sample are either adolescent monkeys, subordinate adults, or dominant adults. Each monkey attempted various problems before they got bored/distracted/full of treats. Each problem was classed as either “easy” or “difficult”, and the researchers recorded whether or not the monkey solved each problem.
Based on the RQ:
- Which variable is the outcome variable, aka the dependent variable?
- Which variable or variables is/are the predictor variable(s), aka the independent variable(s)?
What grouping variable(s) does this dataset contain?
- Which grouping variable(s) contribute reproducible/manipulated/controlled variability?
- Which grouping variable(s) contribute random/non-manipulated/non-controlled variability?
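Because each monkey attempted a different number of problems, a per-group summary can be a helpful first look at a design like this. The following is a minimal sketch on invented toy data: the column names come from the table above, while the monkey names, the values, and the 0/1 coding of solved are all assumptions made for illustration (the real file may code solved differently).

```r
library(tidyverse)

# Toy data with the same column names as the monkey dataset.
# Names and values are invented; solved is ASSUMED to be coded 0/1 here.
toy_monkey <- tibble(
  monkeyID   = c("Ada", "Ada", "Ada", "Bo", "Bo", "Cy", "Cy", "Cy", "Cy"),
  status     = c(rep("dominant adult", 3),
                 rep("adolescent", 2),
                 rep("subordinate adult", 4)),
  difficulty = c("easy", "difficult", "easy",
                 "easy", "difficult",
                 "difficult", "easy", "difficult", "easy"),
  solved     = c(1, 0, 1, 1, 1, 0, 1, 0, 0)
)

# One row per monkey: number of attempts, number of distinct statuses,
# and the proportion of attempted problems that were solved.
per_monkey <- toy_monkey %>%
  group_by(monkeyID) %>%
  summarise(
    n_attempts  = n(),
    n_status    = n_distinct(status),
    prop_solved = mean(solved)
  )

# Does each monkey attempt problems at both difficulty levels?
by_difficulty <- toy_monkey %>%
  count(monkeyID, difficulty)

per_monkey
by_difficulty
```

In this toy example the groups are unbalanced (2 to 4 attempts per monkey), status is constant within a monkey, and difficulty changes within a monkey. Whether the real data follow the same pattern is for you to check.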
Laughs
Read in the dataset located at https://uoepsy.github.io/data/lmm_laughs.csv and name it laughs.
RQ: How is the delivery format of jokes (audio-only vs. audio AND video) associated with differences in humour ratings?
| variable | description |
|---|---|
| ppt | Participant identification number |
| joke_label | Joke presented |
| joke_id | Joke identification number |
| delivery | Experimental manipulation: whether joke was presented in audio-only ('audio') or in audiovideo ('video') |
| rating | Humour rating chosen on a slider from 0 to 100 |
These data are simulated to imitate an experiment that investigates the effect of visual non-verbal communication (i.e., gestures, facial expressions) on joke appreciation. Ninety participants took part in the experiment, in which they each rated how funny they found a set of 30 jokes. The order of these 30 jokes was randomised separately for each participant. For each participant, the set of jokes was also randomly split into two halves, with the first half being presented as audio only, and the second half being presented with both audio and video. This meant that each participant saw 15 jokes with video and 15 without, and each joke would be presented with video roughly half of the time.
Based on the RQ:
- Which variable is the outcome variable, aka the dependent variable?
- Which variable or variables is/are the predictor variable(s), aka the independent variable(s)?
What grouping variable(s) does this dataset contain?
- Which grouping variable(s) contribute reproducible/manipulated/controlled variability?
- Which grouping variable(s) contribute random/non-manipulated/non-controlled variability?
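When a dataset has more than one candidate grouping variable, it can help to check how the groups combine with each other and with the predictor. The sketch below does this on a tiny invented design with 4 participants and 4 jokes. The column names ppt, joke_id and delivery come from the table above; the values and the alternating delivery assignment are made up for illustration, and the rating column is left out because it is not needed for these checks.

```r
library(tidyverse)

# Toy design: every toy participant is paired with every toy joke.
# The alternating audio/video assignment is invented for illustration only.
toy_laughs <- expand_grid(ppt = 1:4, joke_id = 1:4) %>%
  mutate(delivery = if_else((ppt + joke_id) %% 2 == 0, "audio", "video"))

# One row per ppt-by-joke combination, each with n = 1, means every
# participant rated every joke exactly once in this toy design.
pairings <- toy_laughs %>%
  count(ppt, joke_id)

# How many distinct delivery formats occur within each participant?
deliveries_per_ppt <- toy_laughs %>%
  group_by(ppt) %>%
  summarise(n_delivery = n_distinct(delivery))

# How many distinct delivery formats occur within each joke?
deliveries_per_joke <- toy_laughs %>%
  group_by(joke_id) %>%
  summarise(n_delivery = n_distinct(delivery))

pairings
deliveries_per_ppt
deliveries_per_joke
```

Here delivery takes both values within every toy participant and within every toy joke. Running the same checks on the laughs data frame shows whether the real experiment has that property too.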
The Piazza forum
Finally: we want you to get familiar with the course’s Piazza page. [TODO ADD LINK]
Piazza is a discussion forum where you can anonymously post questions that your coursemates, tutors, and instructors can see and respond to. Throughout this course, please use Piazza to ask us your stats questions. Asking on Piazza is better than asking by email because then everybody else can benefit from your questions.
Your final task this week is to get to know Piazza by anonymously posting about something you like. (A cute photo of your pet, a nice thing someone said to you, the best food you ate recently… whatever makes you happy!)