A correlation matrix is a square matrix (i.e., the same number of columns as rows), and it is symmetric about the diagonal. Furthermore, the diagonal values themselves are all 1 (the correlation of a variable with itself is 1).
So we might see a correlation matrix (let’s call it \(\mathbf{R}\)) that looks like this:
One way in which we can create a symmetric square matrix is to take the product of some vector \(\mathbf{f}\) with \(\mathbf{f'}\) (\(\mathbf{f}\) “transposed”). This is just how you might “square” a vector: in this outer product you multiply each entry of the first vector with each entry of the second, so entry \((i, j)\) of the result is \(f_i \times f_j\).
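For instance, a minimal sketch in R, using the same values of \(\mathbf{f}\) that the full example further down uses:

# a vector f and its outer product f f'
f <- seq(.9, .4, -.1)
# entry [i, j] of the result is f[i] * f[j]
f %*% t(f)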
Note that I’ve chosen values for \(\mathbf{f}\) that get pretty close to re-creating \(\mathbf{R}\) (clever me!). I can get even closer if I take this and add to it the product of another vector with its own transpose:
I haven’t just chosen a random additional vector here; I’ve chosen one that is “orthogonal” to the first vector (this gets a little complicated and trigonometry-related, but click here for an explanation of why the dot product of orthogonal vectors is equal to zero).
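As a quick sanity check of that property, with two small illustrative vectors of my own (not the ones used above):

a <- c(1, 2)
b <- c(2, -1)
# the dot product of two orthogonal vectors is zero
sum(a * b)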
In essence, eigen decomposition is about finding a set of vectors that can, together, express our correlation matrix \(\mathbf{R}\).
For an \(n \times n\) correlation matrix \(\mathbf{R}\), there is a set of \(n\) eigenvectors \(\mathbf{x_i}\) that solve the equation \[
\mathbf{R x_i} = \lambda_i \mathbf{x_i}
\] where the correlation matrix multiplied by each eigenvector is equal to some eigenvalue \(\lambda_i\) multiplied by that same eigenvector.
We can write this without subscript \(i\) as: \[
\begin{align}
& \mathbf{R X} = \mathbf{X \lambda} \\
& \text{where:} \\
& \mathbf{R} = \text{correlation matrix} \\
& \mathbf{X} = \text{matrix of eigenvectors} \\
& \mathbf{\lambda} = \text{diagonal matrix of eigenvalues}
\end{align}
\] These vectors that make up \(\mathbf{X}\) are orthogonal and of unit length (\(\mathbf{XX' = I}\)), which means that \(\mathbf{R = X \lambda X'}\).
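Spelling out that last step (right-multiply both sides of \(\mathbf{RX = X\lambda}\) by \(\mathbf{X'}\) and use \(\mathbf{XX' = I}\)): \[
\begin{align}
\mathbf{R X X'} &= \mathbf{X \lambda X'} \\
\mathbf{R I} &= \mathbf{X \lambda X'} \\
\mathbf{R} &= \mathbf{X \lambda X'}
\end{align}
\]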
For a demonstration in R, let’s create a correlation matrix:
# let's create a correlation matrix, as f f'
f <- seq(.9, .4, -.1)
R <- f %*% t(f)
# give rownames and colnames
rownames(R) <- colnames(R) <- paste0("V", 1:6)
# constrain diagonals to equal 1
diag(R) <- 1
# here's our correlation matrix!
R
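Base R's eigen() gives us the eigenvectors and eigenvalues, and as a quick sketch we can confirm the relations above (I'm storing the result as e, which the code further down assumes):

# eigen decomposition of R
e <- eigen(R)
X <- e$vectors          # matrix of eigenvectors
lam <- diag(e$values)   # diagonal matrix of eigenvalues
# the eigenvectors are orthonormal: X X' = I
round(X %*% t(X), 10)
# and R is recovered as X lambda X'
all.equal(R, X %*% lam %*% t(X), check.attributes = FALSE)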
The eigenvectors give the directions, and the eigenvalues give the magnitudes (i.e., how much of the variance in \(\mathbf{R}\) each one captures).
Our Principal Components \(\mathbf{P}\) are the eigenvectors scaled by the square root of the eigenvalues (\(\mathbf{P = X \lambda^{1/2}}\)):
# eigenvectors (e$vectors)
# scaled by sqrt of eigenvalues (diag(sqrt(e$values)))
P <- e$vectors %*% diag(sqrt(e$values))
P
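The output below comes from the principal() function in the psych package. The exact call is echoed in its Call: line, so reproducing it would look something like this (p1 is just my name for the stored result):

library(psych)
# one-component PCA of R, no rotation
p1 <- principal(R, nfactors = 1, rotate = "none")
p1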
Principal Components Analysis
Call: principal(r = R, nfactors = 1, rotate = "none", covar = FALSE)
Standardized loadings (pattern matrix) based upon correlation matrix
PC1 h2 u2 com
V1 0.88 0.78 0.22 1
V2 0.83 0.69 0.31 1
V3 0.77 0.59 0.41 1
V4 0.69 0.48 0.52 1
V5 0.60 0.37 0.63 1
V6 0.50 0.25 0.75 1
PC1
SS loadings 3.16
Proportion Var 0.53
Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.
The root mean square of the residuals (RMSR) is 0.09
Fit based upon off diagonal values = 0.95
Look familiar? This is the first component we computed manually: the first column of \(\mathbf{P}\). The direction is flipped, but that is somewhat irrelevant because it’s the same dimension (i.e., South is just “negative North”!)
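To see that side by side, using the p1 object from the sketch above:

# manual loadings vs. psych's PC1 (same values, up to a possible sign flip)
cbind(manual = P[, 1], psych = p1$loadings[, 1])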
These h2 values aren’t 1, like they are on the diagonal of \(\mathbf{R}\).
But they are proportions: each one is the amount of variance in that observed variable that is explained by this first component.
The principal() function calls these ‘communalities’, which is really more of a term we see in factor analysis.
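Since each h2 value here is just the squared PC1 loading, we can recover that column (and the SS loadings and Proportion Var lines) from our manual \(\mathbf{P}\), at least up to rounding:

# squared loadings: proportion of each variable's variance explained by PC1
round(P[, 1]^2, 2)
# their sum is the first eigenvalue (the "SS loadings" of 3.16)
sum(P[, 1]^2)
# divided by the number of variables gives "Proportion Var" (0.53)
sum(P[, 1]^2) / ncol(R)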