Why are your variables correlated?
What are your goals?
One of many features that distinguish factor analysis from principal components analysis
A key concept of psychometrics (of which factor analysis is a part)
A theorized common cause (e.g., cognitive ability) of responses to a set of variables
round(cor(agg.items),2)
##        item1 item2 item3 item4 item5 item6 item7 item8 item9 item10
## item1   1.00  0.51  0.48  0.39  0.55  0.00  0.09  0.05  0.09   0.05
## item2   0.51  1.00  0.55  0.45  0.61  0.04  0.11  0.08  0.09   0.04
## item3   0.48  0.55  1.00  0.44  0.58  0.04  0.11  0.07  0.09   0.04
## item4   0.39  0.45  0.44  1.00  0.48  0.03  0.11  0.04  0.10   0.01
## item5   0.55  0.61  0.58  0.48  1.00  0.01  0.09  0.02  0.09   0.01
## item6   0.00  0.04  0.04  0.03  0.01  1.00  0.52  0.53  0.42   0.43
## item7   0.09  0.11  0.11  0.11  0.09  0.52  1.00  0.74  0.56   0.57
## item8   0.05  0.08  0.07  0.04  0.02  0.53  0.74  1.00  0.54   0.57
## item9   0.09  0.09  0.09  0.10  0.09  0.42  0.56  0.54  1.00   0.42
## item10  0.05  0.04  0.04  0.01  0.01  0.43  0.57  0.57  0.42   1.00
library(psych)
agg_res <- fa(agg.items, nfactors = 2, fm = "ml", rotate = "oblimin")
agg_res
## Factor Analysis using method = ml
## Call: fa(r = agg.items, nfactors = 2, rotate = "oblimin", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
##          ML1   ML2   h2   u2 com
## item1   0.00  0.67 0.45 0.55   1
## item2   0.02  0.75 0.57 0.43   1
## item3   0.02  0.72 0.51 0.49   1
## item4   0.01  0.60 0.36 0.64   1
## item5  -0.03  0.81 0.66 0.34   1
## item6   0.63 -0.04 0.39 0.61   1
## item7   0.85  0.04 0.74 0.26   1
## item8   0.86 -0.03 0.73 0.27   1
## item9   0.63  0.05 0.41 0.59   1
## item10  0.67 -0.04 0.44 0.56   1
##
##                       ML1  ML2
## SS loadings           2.71 2.56
## Proportion Var        0.27 0.26
## Cumulative Var        0.27 0.53
## Proportion Explained  0.51 0.49
## Cumulative Proportion 0.51 1.00
##
## With factor correlations of
##      ML1  ML2
## ML1 1.00 0.11
## ML2 0.11 1.00
##
## Mean item complexity = 1
## Test of the hypothesis that 2 factors are sufficient.
##
## df null model = 45 with the objective function = 3.91 with Chi Square = 3894
## df of the model are 26 and the objective function was 0.02
##
## The root mean square of the residuals (RMSR) is 0.01
## The df corrected root mean square of the residuals is 0.01
##
## The harmonic n.obs is 1000 with the empirical chi square 8.33 with prob < 1
## The total n.obs was 1000 with Likelihood Chi Square = 16.15 with prob < 0.93
##
## Tucker Lewis Index of factoring reliability = 1.004
## RMSEA index = 0 and the 90 % confidence intervals are 0 0.007
## BIC = -163.4
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
##                                                  ML1  ML2
## Correlation of (regression) scores with factors 0.94 0.92
## Multiple R square of scores with factors        0.88 0.85
## Minimum correlation of possible factor scores   0.77 0.70
Factor loadings, like PCA loadings, show the relationship of each measured variable to each factor.
We interpret our factor models by the pattern and size of these loadings.
The square of a factor loading tells us how much item variance is explained (h2), and how much is not (u2).
Factor correlations: when estimated, these tell us how closely the factors relate (see rotation).
SS loadings and the proportion of variance information are interpreted as we discussed for PCA.
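All of these quantities can be pulled straight out of the fitted object; a minimal sketch, assuming the agg_res object fitted above:

agg_res$loadings      # pattern matrix of factor loadings
agg_res$communality   # h2: item variance explained by the factors
agg_res$uniquenesses  # u2: item variance left unexplained
agg_res$Phi           # factor correlations (oblique solutions only)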
[Diagrams comparing PCA and EFA]
What does it mean to model the data?
EFA tries to explain these patterns of correlations
That is, conditional on the common factor ξ, the items should be uncorrelated (their residuals are independent):

ρ(y1, y2 ⋅ ξ) = corr(e1, e2) = 0
ρ(y1, y3 ⋅ ξ) = corr(e1, e3) = 0
ρ(y2, y3 ⋅ ξ) = corr(e2, e3) = 0
var(total) = var(common) + var(specific) + var(error)
Σ = ΛΦΛ′ + Ψ
Σ: A p×p observed covariance matrix (from data)
Λ: A p×m matrix of factor loadings (relates the m factors to the p items)
Φ: An m×m matrix of correlations between factors ("goes away" with orthogonal factors)
Ψ: A diagonal matrix with p elements indicating unique (error) variance for each item
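To see the decomposition in action, we can reconstruct the model-implied correlation matrix from the two-factor solution above and check the residuals; a minimal sketch using the agg_res object and agg.items data:

Lambda <- agg_res$loadings            # p x m pattern matrix
Phi    <- agg_res$Phi                 # m x m factor correlations
Psi    <- diag(agg_res$uniquenesses)  # p x p diagonal matrix of unique variances
Sigma_hat <- Lambda %*% Phi %*% t(Lambda) + Psi
round(cor(agg.items) - Sigma_hat, 2)  # residuals should be near zero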
As EFA is a model, just like linear models and other statistical tools, it has some assumptions:
This boils down to whether the data are correlated.
We can take this a step further and calculate the squared multiple correlation (SMC).
There are also some statistical tests (e.g., Bartlett's test, which tests whether the correlation matrix differs from an identity matrix).
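Both checks are available in psych; a minimal sketch on the agg.items data (n = 1000, per the output above):

R <- cor(agg.items)
smc(R)                         # squared multiple correlation for each item
cortest.bartlett(R, n = 1000)  # tests R against an identity matrix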
For PCA, we discussed the use of the eigen-decomposition.
As we have a model for the data in factor analysis, we need to estimate the model parameters
The most efficient way to factor analyze data is to start by estimating communalities
If we consider that EFA is trying to explain true common variance, then communality estimates are more useful to us than total variance.
Estimating communalities is difficult because population communalities are unknown
1. Compute initial communalities from the SMCs.
2. Once we have these reasonable lower bounds, substitute the 1s on the diagonal of the correlation matrix with the SMCs derived in step 1.
3. Obtain the factor loading matrix using the eigenvalues and eigenvectors of the matrix obtained in step 2.
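A by-hand sketch of these three steps, assuming two factors as in the example above:

R <- cor(agg.items)
h2 <- 1 - 1 / diag(solve(R))   # step 1: SMCs as initial communalities
R_reduced <- R
diag(R_reduced) <- h2          # step 2: SMCs replace the 1s on the diagonal
eig <- eigen(R_reduced)        # step 3: eigen-decomposition of the reduced matrix
m <- 2
round(eig$vectors[, 1:m] %*% diag(sqrt(eig$values[1:m])), 2)  # loading matrix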
The fa procedure starts with some other solution, e.g., PCA or principal axes, extracting a set number of factors.
It then adjusts the loadings of all factors on each variable so as to minimize the residual correlations for that variable.
MINRES doesn't "try" to estimate communalities
If you apply principal axis factoring to the original correlation matrix with a diagonal of communalities derived from step 2, you get the same factors as in the method of minimum residuals
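In psych, fm = "minres" is the default for fa(), so a minimal call is simply:

agg_res_mr <- fa(agg.items, nfactors = 2)  # fm = "minres" by default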
Maximum likelihood (ML) estimation uses a general iterative procedure for estimating parameters that we have previously discussed.
This method offers the advantage of providing numerous "fit" statistics that you can use to evaluate how good your model is compared to alternative models.
It requires assuming a distribution for your data, e.g., a normal distribution.
The issue is that, for large analyses, it is sometimes not possible to find values for the factor loadings that maximize the likelihood (the solution fails to converge).
ML may also produce solutions with impossible values, e.g., factor loadings greater than 1 (Heywood cases).
Sometimes the construct we are interested in is not continuous, e.g., the number of crimes committed.
Sometimes we assume the construct is continuous, but we measure it with a discrete scale.
Most constructs we seek to measure by questionnaire fall into the latter category.
If this is a concern and the construct is normally distributed, we can conduct our analysis on a matrix of polychoric correlations.
If the construct is not normally distributed, you can conduct a factor analysis that allows for these kinds of variables
The best option, as with many statistical models, is ML.
If ML solutions fail to converge, principal axis is a simple approach which typically yields reliable results.
If there are concerns over the distribution of the variables, use principal axis factoring (PAF) on the polychoric correlations (see the sketch below).
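A sketch of the polychoric route, assuming the items are ordered-categorical (e.g., Likert responses):

poly_r <- polychoric(agg.items)$rho                # polychoric correlation matrix
fa(poly_r, nfactors = 2, fm = "pa", n.obs = 1000)  # PAF on the polychoric matrix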
We have discussed the methods for deciding on the number of factors in the context of PCA.
Recall we have 4 tools from that discussion.
For FA, we generally want a slightly more nuanced approach than pure variance:
We will go through this process in a later video.
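As one example of these tools, parallel analysis can be run directly on the items; a minimal sketch:

fa.parallel(agg.items, fm = "ml", fa = "fa")  # compares observed eigenvalues with random data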
Factor solutions can sometimes be complex to interpret.
Why is this the case?
In other words, there is no unique solution to the factor problem: infinitely many rotations of the loading matrix reproduce the observed correlations equally well.
And this is also in part why the theoretical coherence of the models plays a much bigger role in FA than PCA.
Factor rotation is an approach to clarifying the relationships between items and factors.
Thus, although we cannot tell rotated solutions apart numerically, we can select the one that is most theoretically coherent.
There are many different ways this can be achieved.
Each variable (row) should have at least one zero loading
Each factor (column) should have at least as many zero loadings as there are factors
Every pair of factors (columns) should have several variables which load on one factor, but not the other
Whenever more than four factors are extracted, each pair of factors (columns) should have a large proportion of variables which do not load on either factor
Every pair of factors should have few variables which load on both factors
All factor rotation methods seek to optimize one or more aspects of simple structure.
Orthogonal: the factors are constrained to be uncorrelated.
Oblique: the factors are allowed to correlate.
[Figures: the original item correlations, followed by EFA solutions with 5 factors under no rotation, orthogonal rotation, and oblique rotation]
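Comparisons like these come from changing only the rotate argument; a sketch on the aggression items (two factors):

f_none <- fa(agg.items, nfactors = 2, fm = "ml", rotate = "none")
f_orth <- fa(agg.items, nfactors = 2, fm = "ml", rotate = "varimax")  # orthogonal
f_obli <- fa(agg.items, nfactors = 2, fm = "ml", rotate = "oblimin")  # oblique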
Easy: my recommendation is always to choose oblique.
Why?
However, there is a catch...
When we have an obliquely rotated solution, we need to draw a distinction between the pattern and structure matrix.
When we use orthogonal rotation, the pattern and structure matrix are the same.
When we use oblique rotation, the structure matrix is the pattern matrix multiplied by the factor correlations.
In most practical situations, this does not impact what we do, but it is important to highlight the distinction.
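We can check this relationship directly; a sketch using the agg_res object from above (psych stores the structure matrix as $Structure):

round(agg_res$loadings %*% agg_res$Phi, 2)  # pattern matrix times factor correlations
round(agg_res$Structure, 2)                 # should match the stored structure matrix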