Overview

general things to remember

every group gets their own data (it’s all from the same data generating process, so will all be similar to one another)

what to do

c. 30 mins per report

leave in-text comments (rough guide: aim for >3 a page)
leave summary box notes
- ultimately this will typically end up being 2 or 3 paragraphs giving an overview of the whole thing
- if you are struggling for time, please just leave this as bullet points of the main “good things/needs improvement things” and i will go through them all and write them up.
decide on a grade for each section and enter it in the spreadsheet
- see rubric for each Q
- if someone misses out some stuff but does other cool stuff, then their mark might go up again…
- In general, we are looking for students to be aware of choices they might have and to write about those choice points – you might not agree with their choices, but if they’re explicitly made and somewhat defensible, try and find marks for them

Reasons to look at the code:

1. a quick glance to see if it’s neat, commented, etc. (could find marks for very clear code);
2. a more detailed look if you can’t work out what is being reported;
3. as a check that numbers / graphs come from the code if the report seems weird to you.

We’d expect you to do 1. if poss, but only resort to 2. and 3. if there are issues you want to explore. If something looks really weird, flag it and we’ll take a look

commenting style

think of comments as something the students can learn from
avoid negativity, be constructive
generally try to avoid categorical statements of “wrong/right”, “correct/incorrect”, “should/shouldn’t”
- imagine that a student complains, and comes to you to ask for an explanation of your comment. can you defend it?
- e.g., instead of “this is wrong, you should have done X”, go for “this sounds like you have done Y, which doesn’t quite do what you want because [insert explanation]. An alternative would be to do X which would [insert explanation]”

things to remember

we don’t specifically care about APA formatting etc.; it’s about “clarity and consistency”
- do cast an eye over their reporting of stats though, remember that things like t would ideally be accompanied by df etc.

Cleaning

there are some typos in the instruments, but the main thing they will need to do is categorise instruments into families/categories (woodwind, brass, percussion, strings)
there is a Theremin. they would ideally comment on that as it doesn’t fall into any family. if they add it to a family [with an explanation] that’s fine.
there are some ages at -99
there is an “i don’t know” in the musician variable
there are some participants that will end up being influential in the analyses later on. some groups may remove these at the start. it might be a bit messy for them to explain this.

They don’t have to use code like this. Any code that works is fine!

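library(tidyverse)
# helper script for the course (used here for get_my_data())
source("https://edin.ac/4eYWn7P")
get_my_data(group_name = "asymptotic_arias")
orchestra <- 
  orchestra |> mutate(
    # map instruments (including the misspellings in the data) onto families
    family = case_when(
      instrument %in%  c("Flute","Oboe","Clarinet","Bassoon","Piccolo","Piccollo") ~ "woodwind",
      instrument %in% c("Timpani","Snare Drum","Bass Drum","Cymbals") ~ "percussion",
      instrument %in% c("Trumpet","French Horn","French Hron","Trombone","Tuba","Euphonium") ~ "brass",
      instrument %in% c("Violin","Viola","Cello","Double Bass") ~ "string",
      # anything not matched above is set to missing
      TRUE ~ NA_character_
    ),
    # negative ages (the -99s) are treated as missing
    age = ifelse(age<0,NA,age),
    # any value outside these two levels (e.g. "i don't know") becomes NA
    musician = factor(musician, levels=c("non-musician","musician")),
    # flag the participants who end up being influential later on
    isinf = pptname %in% c("Sigmund Freud","Beatrix Potter","Stephen Jay Gould")
  ) |> na.omit() |> filter(!isinf)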
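
A rough sketch, not part of the model answer, of how a marker might check that a cleaning step has caught everything. It uses base R and a small made-up data frame (`toy`, `family_lookup`, and every value in them are hypothetical, and the spelling of instruments in a group's real data may differ). Instruments are mapped with a lookup vector, and whatever is left unmapped is then listed:

```r
# Toy stand-in for one group's data; all values are invented for illustration
toy <- data.frame(
  instrument = c("Flute", "Piccollo", "French Hron", "Theremin", "Violin", "Timpani"),
  age        = c(34, -99, 51, 42, 47, 29),
  musician   = c("musician", "non-musician", "i don't know",
                 "musician", "non-musician", "musician"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: instrument spelling -> family
family_lookup <- c(
  "Flute"       = "woodwind",
  "Piccollo"    = "woodwind",
  "French Hron" = "brass",
  "Violin"      = "string",
  "Timpani"     = "percussion"
)

# Instruments absent from the lookup come back as NA
toy$family <- unname(family_lookup[toy$instrument])

# Recode the missing-value code for age, and restrict musician to two levels
toy$age[toy$age < 0] <- NA
toy$musician <- factor(toy$musician, levels = c("non-musician", "musician"))

# Which instruments were not assigned a family?
unmapped <- unique(toy$instrument[is.na(toy$family)])
print(unmapped)

# How many rows would na.omit() drop?
n_dropped <- nrow(toy) - nrow(na.omit(toy))
print(n_dropped)
```

With a group's real data, the same two checks (which instruments have no family, and how many rows `na.omit()` removes) give a quick sense of whether their reported N is plausible.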

Question 1

Questions

1. What do we know about the sample? (Describe and Explore)

Suggestion: 2 pages

Prior to conducting the main analyses, the researchers would like some descriptive statistics on the participants in the study. In addition, they would like you to test that giving the participants the choice of instrument to listen to has not led to different sorts of participants listening to different types of solos. They would like you to test and report on:

Whether there’s any difference in tempos of pieces listened to between those who are musicians vs those who aren’t.
Whether there is an association between age and the tempo of piece played to them, and, if so, in what direction (e.g., the orchestra might have unintentionally played slower pieces for older people!).
Whether there is a balance such that musicians and non-musicians were equally likely to choose to listen to instruments from the different orchestral groups.

Provide a suitable brief description of the dataset, and then answer each of the questions above using an appropriate statistical test.

Hint:
When providing a description of your sample, think about the tradeoff between space used (by, for example, a figure or table) and detail (in writing). There’s no right way to describe the sample, but readers will want to understand the basic “shape” of the data.

Expected analysis

descriptives

mean and sd age
count (and %) musicians vs non-musicians
instrument to family mapping?
counts (and %) family chosen
mean bpm (possibly split by family)
count and % enjoyed

summary(orchestra)

##    pptname               age                musician    instrument       
##  Length:394         Min.   :24.00   non-musician:212   Length:394        
##  Class :character   1st Qu.:40.00   musician    :182   Class :character  
##  Mode  :character   Median :44.00                      Mode  :character  
##                     Mean   :44.36                                        
##                     3rd Qu.:50.00                                        
##                     Max.   :75.00                                        
##       bpm             ERS            enjoyed             family         
##  Min.   : 20.0   Min.   :-10.250   Length:394         Length:394        
##  1st Qu.: 85.0   1st Qu.:  3.692   Class :character   Class :character  
##  Median :110.0   Median :  5.920   Mode  :character   Mode  :character  
##  Mean   :107.8   Mean   :  5.299                                        
##  3rd Qu.:130.0   3rd Qu.:  7.615                                        
##  Max.   :180.0   Max.   : 13.290                                        
##    isinf        
##  Mode :logical  
##  FALSE:394      
##                 
##                 
##                 
##

table(orchestra$musician) |> print() |> prop.table()

## 
## non-musician     musician 
##          212          182

## 
## non-musician     musician 
##    0.5380711    0.4619289

1a

t.test.
if they just use Welch (the default) then fine. If they assume equal variances, it would be nice to also be shown a var.test.
i’m pretty sure almost all groups will have this as non-significant

with(orchestra, t.test(bpm ~ musician))

## 
##  Welch Two Sample t-test
## 
## data:  bpm by musician
## t = -0.42985, df = 391.41, p-value = 0.6675
## alternative hypothesis: true difference in means between group non-musician and group musician is not equal to 0
## 95 percent confidence interval:
##  -7.199780  4.616326
## sample estimates:
## mean in group non-musician     mean in group musician 
##                   107.1698                   108.4615

with(orchestra, boxplot(bpm ~ musician))
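
For reference when marking 1a, here is a minimal sketch of what the equal-variances route might look like, plus one way of building a report string that includes the df. The data are simulated (`toy` and its values are made up, not the real `orchestra` data), and the report format is only one reasonable style, not a required one:

```r
set.seed(1)

# Simulated stand-in data: two groups of 50 with similar tempos
toy <- data.frame(
  musician = factor(rep(c("non-musician", "musician"), each = 50),
                    levels = c("non-musician", "musician")),
  bpm = c(rnorm(50, mean = 107, sd = 30), rnorm(50, mean = 108, sd = 30))
)

# F test comparing the two group variances
vt <- var.test(bpm ~ musician, data = toy)
print(vt)

# Student's t-test, which assumes equal variances (df = n1 + n2 - 2)
tt <- t.test(bpm ~ musician, data = toy, var.equal = TRUE)

# A report string that carries the df alongside t and p
report <- sprintf("t(%g) = %.2f, p = %.3f",
                  unname(tt$parameter), unname(tt$statistic), tt$p.value)
print(report)
```

Applied to the Welch output above, the same style would read t(391.41) = -0.43, p = .668.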

1b

a correlation, but ideally with a plot as it will highlight some weird people that don’t fit.
if they frame as lm() then they might justify removing these people based on influence, which is fine. could also just do it from plot though.
some might have already discussed removing these at the outset
if they don’t remove them it will probably be non-signif
if they do remove them, it will probably be signif

with(orchestra, cor.test(age, bpm))

## 
##  Pearson's product-moment correlation
## 
## data:  age and bpm
## t = -5.2226, df = 392, p-value = 2.87e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.3451564 -0.1603007
## sample estimates:
##        cor 
## -0.2550577

with(orchestra, plot(age, bpm))
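
If a group frames 1b as `lm()` and justifies exclusions by influence, the logic might look something like the sketch below. Everything here is an assumption for illustration: the data are simulated with three deliberately extreme made-up participants, and the `4 / n` cutoff for Cook's distance is just one common rule of thumb (groups may reasonably use another, or judge from a plot).

```r
set.seed(2)

# Simulated stand-in data: tempo decreases with age, plus noise
n   <- 100
age <- round(rnorm(n, mean = 45, sd = 8))
bpm <- 150 - 1 * age + rnorm(n, mean = 0, sd = 20)

# Three invented extreme participants (very old, very fast pieces)
toy <- data.frame(
  age = c(age, 90, 92, 95),
  bpm = c(bpm, 200, 210, 205)
)

# Fit the simple regression and compute Cook's distance for each row
m  <- lm(bpm ~ age, data = toy)
cd <- cooks.distance(m)

# Flag rows above the rule-of-thumb cutoff
flagged <- which(cd > 4 / nrow(toy))
print(flagged)

# Correlation with and without the flagged rows
cor_all  <- cor.test(~ age + bpm, data = toy)
cor_trim <- cor.test(~ age + bpm, data = toy[-flagged, ])
print(c(all = unname(cor_all$estimate), trimmed = unname(cor_trim$estimate)))
```

What matters for marks is less the specific cutoff and more that the group says which participants were removed and why.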

1c

chi-square test of independence
probably signif for every group
ideally a table too, or discussion of relative counts.
- something that highlights that it’s percussion that has biggest impact here (musicians much less likely to choose to listen to percussion solos)

with(orchestra, table(family,musician)) |> 
  print() |>
  chisq.test()

##             musician
## family       non-musician musician
##   brass                53       64
##   percussion           31        2
##   string               65       44
##   woodwind             63       72

## 
##  Pearson's Chi-squared test
## 
## data:  print(with(orchestra, table(family, musician)))
## X-squared = 29.049, df = 3, p-value = 2.187e-06
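
One way a group might show that percussion drives the result is to look at expected counts and standardised residuals. The sketch below re-enters the counts from the table above by hand (so it runs without the data); the object names `tab` and `res` are just placeholders.

```r
# Observed counts copied from the family-by-musician table above
tab <- matrix(
  c(53, 64,
    31,  2,
    65, 44,
    63, 72),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    family   = c("brass", "percussion", "string", "woodwind"),
    musician = c("non-musician", "musician")
  )
)

res <- chisq.test(tab)

# Expected counts under independence
print(round(res$expected, 1))

# Standardised residuals: large absolute values mark the cells that
# depart most from independence
print(round(res$stdres, 2))

# Column proportions: share of each family within musicians / non-musicians
print(round(prop.table(tab, margin = 2), 2))
```

The percussion row has by far the largest standardised residuals, which is the pattern we would hope a good answer picks up on, whether via residuals, percentages, or a plain description of the counts.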

Rubric

descriptor	Marks	descriptives	methods	results writeup
inadequate (bad fail)	0, 15
inadequate (clear fail)	25		basic descriptives only (e.g. just two means)	minimal. Basically just restating what R spits out
inadequate (marginal fail)	32, 38	very minimal (i.e. N ppts, what variables are present). some inappropriate (e.g. means of categories)	descriptives only, but appreciation of variability (e.g. talks means and sds)	pretty unclear explanation. Missing key parts (e.g., missing a p-value, or not providing a conclusion). plots not clearly relevant to question, but contain some relevant variables
adequate	42, 45, 48	mean age, counts of musicians/non-musicians. Not much else	appropriate family of test (e.g. t, chisq etc)	written okay, but some larger errors (e.g., a conclusion mismatches with reported p-val)
good	52, 55, 58	list-like descriptive stats for every variable, rather than a “description of the sample”; table of everything, not much in text	clearly stated specific test (e.g., Welch t, chi-square test of independence)	clearly explained but some errors in reporting stats (e.g., missing df). Messy or basic plots/tables
very good	62, 65, 68	clear description of avgs and variability in sample characteristics, plus discussion of missing data	clearly stated specific test, acknowledgement of assumptions	clearly explained and reported. Could improve in contextualising results (i.e. discussion of direction of effects, relative deviations for chisq)
excellent	72, 75, 78	excellent description, discussing missingness and reasons for exclusion etc		clearly explained and reported and placed in context, with nice plots and tables where relevant.
excellent	85
excellent	92, 100

Question 2

Questions

2. Sound & Sentiment

Suggestion: 2 pages

The first major research aim is to investigate the question about emotional response. Recall that the researchers are interested in whether the tempo of a musical piece is associated with eliciting more or less emotional response, and whether this might differ between the broad categories of instruments (strings/woodwind/brass/percussion).
Conduct and write up appropriate analysis/analyses to address this question.

Hint: Neither this analysis nor the one below need be very complex. Think about the background-&-study-aims, and what the researchers already “know” to be true; what they suspect might affect things (but aren’t necessarily interested in); and what the focus of their research is.

Expected analysis

linear model with an interaction term between tempo & instrument-category
probably control for age (older people seem to have been presented with slower music) and musician (musicians tended to less frequently choose to listen to percussion). both of these may well also influence ERS, and so could be confounders
it would be nice to get an omnibus test of the interaction, so compare the full model to an additive model
sensible plot would probably be 4 non-parallel lines

mod.f = lm(ERS ~ age + musician + family * bpm, 
           data = orchestra)
summary(mod.f)

## 
## Call:
## lm(formula = ERS ~ age + musician + family * bpm, data = orchestra)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.0879 -1.2478  0.0949  1.3450  4.6750 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           5.6139716  0.9838929   5.706 2.32e-08 ***
## age                   0.0155966  0.0136406   1.143  0.25359    
## musicianmusician      2.0270042  0.2137002   9.485  < 2e-16 ***
## familypercussion      1.0882874  1.5448904   0.704  0.48158    
## familystring         -1.9615077  1.0786459  -1.818  0.06977 .  
## familywoodwind        2.7685141  0.9716320   2.849  0.00462 ** 
## bpm                  -0.0284936  0.0065871  -4.326 1.94e-05 ***
## familypercussion:bpm -0.0678006  0.0129785  -5.224 2.88e-07 ***
## familystring:bpm      0.0437517  0.0096602   4.529 7.92e-06 ***
## familywoodwind:bpm    0.0002777  0.0086594   0.032  0.97443    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.026 on 384 degrees of freedom
## Multiple R-squared:  0.7327, Adjusted R-squared:  0.7264 
## F-statistic: 116.9 on 9 and 384 DF,  p-value: < 2.2e-16

mod.res = lm(ERS ~ age + musician + family + bpm, 
           data = orchestra)
anova(mod.res,mod.f)

## Analysis of Variance Table
## 
## Model 1: ERS ~ age + musician + family + bpm
## Model 2: ERS ~ age + musician + family * bpm
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    387 1889.7                                  
## 2    384 1576.8  3    312.88 25.399 5.207e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

expand_grid(
  age = mean(orchestra$age),
  musician = unique(orchestra$musician),
  family = unique(orchestra$family),
  bpm = seq(min(orchestra$bpm),max(orchestra$bpm),length.out=30)
) |>
  broom::augment(mod.f, newdata = _, interval = "confidence") |>
  ggplot(aes(x=bpm,col=family,fill=family))+
  geom_point(data = orchestra, aes(y = ERS), alpha=.4) +
  geom_line(aes(y=.fitted)) +
  geom_ribbon(aes(y=.fitted,ymin=.lower,ymax=.upper), alpha=.3)+
  facet_wrap(~musician)
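
When checking a group's interpretation of the interaction, it can help to have the family-specific slopes of bpm to hand. The sketch below works them out from the coefficients in the `summary(mod.f)` output above (typed in by hand, with brass as the reference level). These are point estimates only and say nothing about whether each slope differs from zero; the names `b` and `slopes` are placeholders.

```r
# Coefficients copied from the summary(mod.f) output above
b <- c(
  "bpm"                  = -0.0284936,
  "familypercussion:bpm" = -0.0678006,
  "familystring:bpm"     =  0.0437517,
  "familywoodwind:bpm"   =  0.0002777
)

# The bpm coefficient is the slope for the reference family (brass);
# each interaction term is the difference from that slope
slopes <- c(
  brass      = b[["bpm"]],
  percussion = b[["bpm"]] + b[["familypercussion:bpm"]],
  string     = b[["bpm"]] + b[["familystring:bpm"]],
  woodwind   = b[["bpm"]] + b[["familywoodwind:bpm"]]
)

# Estimated change in ERS per 10 bpm, by family
print(round(slopes * 10, 3))
```

Per 10 bpm this gives roughly -0.285 (brass), -0.963 (percussion), 0.153 (string) and -0.282 (woodwind), i.e. the four non-parallel lines in the plot. A group that reads the `bpm` coefficient as "the effect of tempo" overall is interpreting a simple effect as a main effect.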

Rubric

descriptor	Marks	modelling strategy	model specification	results writeup	interpretation
inadequate (bad fail)	0, 15
inadequate (clear fail)	25
inadequate (marginal fail)	32, 38	model selection without really considering RQ; no bpm*category interaction included, without any reason given	doesn’t include covariates age & musician, with no mention of why not; misses key interaction	presents model table but doesn’t tell reader what to look at	interprets simple effects as main effects, or interprets the interaction as a simple effect
adequate	42, 45, 48	confused: considers an appropriate model for the RQ, but then also gets distracted by model comparisons with no clear point to them	key interaction included, no mention of contrasts/scaling etc.; unclear explanation of model spec but (benefit of doubt) looks ok	list-like, going through each coefficient; more results than needed (e.g., loads of model comparisons that aren’t really relevant)	some minor inaccuracies/issues
good	52, 55, 58	straight to an appropriate model for RQ	appropriate model; uses default contrasts etc., but these are clearly stated	pulls out key params/relevant info. Minimal, but concise and sufficient; plots that show model estimates but no underlying variability (i.e. lines but no datapoints)	generally all correctly interpreted
very good	62, 65, 68	appropriate model for RQ, and compares to restricted model for a useful test (e.g., additive vs interaction)	very clear explanation of model including clear contrasts used and clear explanation of scale of continuous predictors	pulls out relevant info and provides a bit extra that is useful, such as a discussion of assumptions/influence; nicely made plots with sensible labels etc., ideally with data as well as model estimates	correctly interpreted, nicely placed in context (i.e. direction of effects is made clear)
excellent	72, 75, 78		does something “clever” (e.g., sum contrasts)	very good results section that tells a clear story, good discussion of assumptions and some clearly well-thought-out approaches (e.g., sensitivity analyses); ‘publication ready’ plots	all good interpretation, directions clear, well written
excellent	85		clever stuff with good justification
excellent	92, 100

Question 3

Questions

3. Enjoyment in every note

Suggestion: 2 pages

The second research aim is to explore what factors led to being more likely to report enjoyment of the solo piece.
from background: “but the experimenters were also interested in what types of music people find enjoyable.”

Note: The researchers are pretty sure that nobody enjoys percussion solos, so for this question specifically they are happy if you want to exclude all percussion solos in order to make the analysis more straightforward. (This doesn’t mean you have to, but it may make things easier).

Expected analysis

logistic regression
probably “enjoyed” as ‘success’, as opposed to ‘not enjoyed’
question here of whether we want to include ERS as a predictor.
- could defend either option, but will make major change to interpretation
plots may/may not be present, but ideally would be predicted probabilities on the y
Q3 is not a well defined research question, which means we’re probably going to have lots of much more exploratory approaches here, and they will often end up doing endless model comparisons
- there’s not much guidance here other than the “types of music” is the focus. but this could include: faster/slower (bpm), instrument family, more emotional (ERS) etc.

orchestra3 <- 
  orchestra |> mutate(
    enj = enjoyed=="enjoyed"
  ) |> filter(family != "percussion")


mod3 <- glm(enj ~ age+musician+family+bpm,
            data = orchestra3,
            family=binomial)

mod3a <- glm(enj ~ ERS+age+musician+family+bpm,
            data = orchestra3,
            family=binomial)
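
Since most interpretation marks in Q3 hinge on odds ratios and on which outcome is modelled as "success", here is a small sketch of the checks involved. The data are simulated (`toy`, `p_true`, and the coefficients used to generate them are invented), and the model is deliberately simpler than `mod3`:

```r
set.seed(3)

# Simulated stand-in data
n <- 300
toy <- data.frame(
  bpm      = round(runif(n, min = 60, max = 160)),
  musician = factor(sample(c("non-musician", "musician"), n, replace = TRUE),
                    levels = c("non-musician", "musician"))
)
p_true <- plogis(-2 + 0.02 * toy$bpm + 0.5 * (toy$musician == "musician"))
toy$enjoyed <- ifelse(rbinom(n, size = 1, prob = p_true) == 1,
                      "enjoyed", "not enjoyed")

# Logical outcome: TRUE is modelled as the success.
# (With a factor outcome, the first level is treated as failure.)
toy$enj <- toy$enjoyed == "enjoyed"

m <- glm(enj ~ bpm + musician, data = toy, family = binomial)

# Odds ratios: multiplicative change in the odds of enjoying,
# so "no effect" corresponds to 1 rather than 0
or <- exp(coef(m))
print(round(or, 3))

# Predicted probabilities for a small grid of predictor values
newdat <- expand.grid(bpm = c(80, 120, 160),
                      musician = levels(toy$musician))
newdat$p_hat <- predict(m, newdata = newdat, type = "response")
print(newdat)
```

Plots with predicted probabilities on the y-axis are essentially a denser version of the `newdat` grid here.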

Rubric

descriptor	Marks	modelling strategy	model specification	results writeup	interpretation
inadequate (bad fail)	0, 15
inadequate (clear fail)	25		linear not logistic
inadequate (marginal fail)	32, 38	model selection without really considering RQ (i.e. focus should be “type of music”, so would ideally consider bpm, category, ERS maybe)	doesn’t include covariates age & musician, with no mention of why not	presents model table but doesn’t tell reader what to look at	major issues: e.g., interprets OR centered on 0 rather than 1 and so gets direction wrong
adequate	42, 45, 48	confused: considers an appropriate model for the RQ, but then also gets distracted by model comparisons with no clear point to them	no mention of contrasts/scaling etc.; unclear explanation of model spec but (benefit of doubt) looks ok	list-like, going through each coefficient; more results than needed (e.g., loads of model comparisons that aren’t really relevant)	more minor issues: e.g., misses multiplicative aspect of OR, or says they are change in probability (not odds); interprets back to front because never checked which of enjoyed vs not-enjoyed was modelled as success
good	52, 55, 58	goes straight to one model that contains most of the relevant things (bpm, category, maybe ERS)	appropriate model, uses default contrasts etc but are clearly stated.	pulls out key params/relevant info. Minimal, but concise and sufficient	generally all correctly interpreted
very good	62, 65, 68	model that seems reasonable, with some attempted explanation of why things are in there; does model selection in a principled way to answer RQ (e.g., compares models and uses the fact that bpm was excluded from the final model to tell us something)	very clear explanation of model including clear contrasts used and clear explanation of scale of continuous predictors	pulls out relevant info and provides a bit extra that is useful, such as a discussion of assumptions/influence; nicely made plots, which should ideally show predicted probabilities and have sensible labels etc.	correctly interpreted, nicely placed in context (i.e. direction of effects is made clear)
excellent	72, 75, 78	nicely justified and reasonable model	does something “clever” (e.g., sum contrasts)	very good results section that gives a clear picture of “enjoyment ~ types of music” (however they interpret ‘types of music’); discussion of limitations/assumptions/something exhibiting thoughtful work; ‘publication ready’ plots	all good interpretation, directions clear, well written
excellent	85		clever stuff with good justification
excellent	92, 100

USMR 2425 marking