```r
cpdata <- read.csv("https://uoepsy.github.io/data/conduct_probs_scale.csv")
# discard the first column
cpdata <- cpdata[,-1]
```
W9 Exercises: EFA
Conduct Problems
Data: Conduct Problems
A researcher is developing a new brief measure of Conduct Problems. She has collected data from n=450 adolescents on 10 items, which cover the following behaviours:
- Breaking curfew
- Vandalism
- Skipping school
- Bullying
- Spreading malicious rumours
- Fighting
- Lying
- Using a weapon
- Stealing
- Threatening others
Our task is to use the dimension reduction techniques we learned about in the lecture to help inform how to organise the items she has developed into subscales.
The data can be found at https://uoepsy.github.io/data/conduct_probs_scale.csv
Read in the dataset.
Create a correlation matrix for the items, and inspect it to check the items' suitability for exploratory factor analysis.
Take a look at Reading 9 # Initial Checks.
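For example, a minimal sketch of these checks, assuming the data are stored in `cpdata` as read in above, and using the `KMO()` and `cortest.bartlett()` functions from the **psych** package:

```r
library(psych)

# correlation matrix of the 10 items
round(cor(cpdata), 2)

# Kaiser-Meyer-Olkin factor adequacy (values closer to 1 suggest the items are suitable)
KMO(cpdata)

# Bartlett's test that the correlation matrix is not an identity matrix
cortest.bartlett(cor(cpdata), n = nrow(cpdata))
```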
How many dimensions should be retained?
This question can be answered in the same way as we did for PCA - use a scree plot, parallel analysis, and MAP test to guide you.
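A possible sketch, again assuming the data are in `cpdata`; `scree()`, `fa.parallel()` and `VSS()` (whose output includes the MAP criterion) are all from the **psych** package:

```r
# scree plot of eigenvalues
scree(cpdata)

# parallel analysis: compares observed eigenvalues to those from random data
fa.parallel(cpdata)

# the VSS output includes Velicer's MAP test
VSS(cpdata, plot = FALSE)
```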
Use the function `fa()` from the **psych** package to conduct an EFA to extract 2 factors (this is what we suggest based on the various tests above, but you might feel differently - the ideal number of factors is subjective!). Use a suitable rotation (`rotate = ?`) and extraction method (`fm = ?`).
Would you expect factors to be correlated? If so, you’ll want an oblique rotation.
See R9#doing-an-efa.
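As a sketch, the call might look like the code below. The object name `conduct_efa` and the choice of `rotate = "oblimin"` and `fm = "ml"` are just one reasonable option, not the only correct one:

```r
# oblique rotations such as "oblimin" require the GPArotation package to be installed
conduct_efa <- fa(cpdata, nfactors = 2, rotate = "oblimin", fm = "ml")
conduct_efa
```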
Inspect your solution. Make sure to look at and think about the loadings, the variance accounted for, and the factor correlations (if estimated).
Just printing an `fa` object:

```r
myfa <- fa(data, ..... )
myfa
```

will give you lots and lots of information.
You can extract individual parts using:
- `myfa$loadings` for the loadings
- `myfa$Vaccounted` for the variance accounted for by each factor
- `myfa$Phi` for the factor correlation matrix
You can find a quick guide to reading the fa
output here: efa_output.pdf.
Look back to the description of the items, and suggest a name for your factors based on the patterns of loadings.
To sort the loadings, you can use `print(myfa$loadings, sort = TRUE)`.
Compare three different solutions:
- your current solution from the previous questions
- one where you fit 1 more factor
- one where you fit 1 fewer factor
Which one looks best?
We’re looking here to assess:
- how much variance is accounted for by each solution
- do all factors load on 3+ items at a salient level?
- do all items have at least one loading at a salient level?
- are there any “Heywood cases” (communalities or standardised loadings that are >1)?
- should we perhaps remove some of the more complex items?
- is the factor structure (items that load on to each factor) coherent, and does it make theoretical sense?
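A sketch of fitting and comparing the three solutions, assuming the same rotation and extraction method as before (the object names `efa1`, `efa2`, `efa3` are just for illustration):

```r
efa1 <- fa(cpdata, nfactors = 1, rotate = "oblimin", fm = "ml")
efa2 <- fa(cpdata, nfactors = 2, rotate = "oblimin", fm = "ml")
efa3 <- fa(cpdata, nfactors = 3, rotate = "oblimin", fm = "ml")

# variance accounted for by each solution
efa1$Vaccounted
efa2$Vaccounted
efa3$Vaccounted

# patterns of loadings, sorted
print(efa1$loadings, sort = TRUE)
print(efa2$loadings, sort = TRUE)
print(efa3$loadings, sort = TRUE)

# communalities (values > 1 would indicate a Heywood case)
efa3$communality
```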
Write a brief paragraph or two that summarises your method and the results from your chosen optimal factor structure for the 10 conduct problems.
Write about the process that led you to the number of factors. Discuss the patterns of loadings and provide definitions of the factors.
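For the write-up, one way to produce a sorted loadings table with small loadings suppressed (assuming your chosen solution is stored in `myfa`, as in the hints above):

```r
# sorted loadings, omitting those < |0.3| (mention this in a table note)
print(myfa$loadings, sort = TRUE, cutoff = 0.3)

# factor correlations, if you used an oblique rotation
myfa$Phi
```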
Dimensions of Apathy
Dataset: radakovic_das.csv
Apathy is a lack of motivation towards goal-directed behaviours. It is pervasive in the majority of psychiatric and neurological diseases, and impacts everyday life. Traditionally, apathy has been measured as a one-dimensional construct, but it may be that multiple different types of demotivation provide a better explanation.
We have data on 250 people who have responded to 24 questions about apathy, which can be accessed at https://uoepsy.github.io/data/radakovic_das.csv. Information on the items can be seen in the table below.
All items are measured on a 6-point Likert scale of Always (0), Almost Always (1), Often (2), Occasionally (3), Hardly Ever (4), and Never (5). Certain items (indicated in the table below with a `-` direction) are reverse scored to ensure that higher scores indicate greater levels of apathy.
item | direction | question |
---|---|---|
1 | + | I need a bit of encouragement to get things started |
2 | - | I contact my friends |
3 | - | I express my emotions |
4 | - | I think of new things to do during the day |
5 | - | I am concerned about how my family feel |
6 | + | I find myself staring in to space |
7 | - | Before I do something I think about how others would feel about it |
8 | - | I plan my days activities in advance |
9 | - | When I receive bad news I feel bad about it |
10 | - | I am unable to focus on a task until it is finished |
11 | + | I lack motivation |
12 | + | I struggle to empathise with other people |
13 | - | I set goals for myself |
14 | - | I try new things |
15 | + | I am unconcerned about how others feel about my behaviour |
16 | - | I act on things I have thought about during the day |
17 | + | When doing a demanding task, I have difficulty working out what I have to do |
18 | - | I keep myself busy |
19 | + | I get easily confused when doing several things at once |
20 | - | I become emotional easily when watching something happy or sad on TV |
21 | + | I find it difficult to keep my mind on things |
22 | - | I am spontaneous |
23 | + | I am easily distracted |
24 | + | I feel indifferent to what is going on around me |
Here is some code that does the following:
- reads in the data
- renames the variables as “q1”, “q2”, “q3”, … and so on
- recodes the variables so that instead of words, the responses are coded as numbers
```r
library(tidyverse)

# read in the data
rdas <- read_csv("https://uoepsy.github.io/data/radakovic_das.csv")

# rename the variables to q1, q2, ..., q24
names(rdas) <- paste0("q", 1:24)

# recode the word responses as numbers
rdas <- rdas |>
  mutate(across(q1:q24, ~case_match(.,
    "Always" ~ 0,
    "Almost Always" ~ 1,
    "Often" ~ 2,
    "Occasionally" ~ 3,
    "Hardly Ever" ~ 4,
    "Never" ~ 5
  )))
```
What number of underlying dimensions best explain the variability in the questionnaire?
Check the suitability of the items before conducting exploratory factor analysis to address this question. Decide on an optimal factor solution and provide a theoretical name for each factor. We’re not doing scale development here, so ideally we don’t want to get rid of items.
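The workflow here is the same as for the conduct problems data. A sketch, assuming `rdas` has been prepared with the code above; the object name `apathy_efa` and the choice of `nfactors = 3`, `rotate = "oblimin"` and `fm = "ml"` are placeholders to replace with whatever your own checks and judgement suggest:

```r
library(psych)

# suitability checks
KMO(rdas)
cortest.bartlett(cor(rdas), n = nrow(rdas))

# how many factors?
fa.parallel(rdas)
VSS(rdas, plot = FALSE)

# e.g. a 3-factor oblique solution
apathy_efa <- fa(rdas, nfactors = 3, rotate = "oblimin", fm = "ml")
print(apathy_efa$loadings, sort = TRUE, cutoff = 0.3)
apathy_efa$Phi
```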
Once you’ve tried it, have a look at this paper by Radakovic & Abrahams, which reports essentially the same analysis as the one you’ve just done (the data aren’t the same - ours are fake!).
Footnotes
You should provide the table of factor loadings. It is conventional to omit factor loadings \(<|0.3|\); however, be sure to mention this in a table note.