Correspondence Analysis - Computational Details
The notation used in this section follows closely that used by Greenacre (1984). Also, refer to Greenacre (1984) for a detailed discussion of the computations involved.
Option | Description |
---|---|
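Notation | The computations are based on the following matrices: the correspondence matrix P (the table of observed frequencies divided by its grand total); the vectors r and c of row and column totals of P (the row and column masses); and the diagonal matrices Dr and Dc with the elements of r and c, respectively, on the diagonal. |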
Singular value decomposition | The computation of the row and column coordinates is based on the generalized singular value decomposition of P: P = ADuB', such that A' inverse(Dr)A = B' inverse(Dc)B = I, where A is the matrix of the left-side generalized singular vectors, B is the matrix of the right-side generalized singular vectors, Du is a diagonal matrix with the diagonal elements equal to the generalized singular values, and I stands for the identity matrix (a diagonal matrix with 1's in the diagonal). |
Coordinates for row and column points | The computation of the coordinates for row and column points depends on the option button you select in the Standardization of Coordinates group box on the Options tab of the Correspondence Analysis Results dialog box. |
Row & column profiles | The row coordinates are computed based on the row profile matrix R = inverse(Dr)P, and the column coordinates are computed based on the column profile matrix computed analogously. Specifically, the row coordinates are computed as F = inverse(Dr)ADu, and the column coordinates as G = inverse(Dc)BDu. This option is appropriate when you are interested in interpreting both the distances between row points, and the distances between column points (the distances in both coordinate systems for row points and column points are Chi-square distances). However, note that, as discussed in the Introductory Overview, distances between column and row points are not meaningful. |
Canonical standardization | The row coordinates are computed as F = inverse(Dr)A(Du)^(1/2), and the column coordinates as G = inverse(Dc)B(Du)^(1/2). |
Row profiles (interpret row dist.) | The row coordinates are computed based on the row profile matrix R = inverse(Dr)P. Specifically, the (principal) row coordinates are computed as F = inverse(Dr)ADu, and the standard column coordinates as G = inverse(Dc)B. This option is appropriate when you are interested in interpreting the distances between row points; the column coordinates should not be interpreted. |
Column profiles (interpret col. dist.) | When reviewing the results for column points in multiple correspondence analysis, the column coordinates are computed based on the column profile matrix. Specifically, the (principal) column coordinates are computed as G = inverse(Dc)BDu, and the standard row coordinates as F = inverse(Dr)A. This option is appropriate when you are interested in interpreting the distances between column points; the row coordinates should not be interpreted. |
Model equation | When using the default method of standardization, the following model for P in k dimensions shows how the relative frequencies are approximated: P ≈ rc' + DrF inverse(Du)G'Dc. In this formula, F and G stand for the row and column coordinates, respectively. |
Computation of quality and inertia | Note that the choice of the standardization method does not affect the computation of the quality and inertia values reported in the spreadsheet that is displayed when you click the Row and column coordinates button on the Advanced tab of the Correspondence Analysis Results dialog box. Those values are always computed based on the Row and column profiles standardization.
Specifically, define the diag(x) operator as setting the elements of vector x into the diagonal of a diagonal matrix; define the square(X) operator as squaring each element in matrix or vector X; then the partial contributions of the row and column points to the total inertia are computed as inverse(Dr)square(A) and inverse(Dc)square(B), respectively. The quality (Cosine²) for the individual dimensions is computed as diag(inverse(square(ADu)1)) square(ADu) and diag(inverse(square(BDu)1)) square(BDu) for the row and column points, respectively, where 1 stands for a column vector with all elements equal to 1. The inertia for the row and column points is computed as (1/t) inverse(Dr) square(ADu)1 and (1/t) inverse(Dc) square(BDu)1, respectively, where t stands for the total inertia. |
Supplementary points, simple correspondence analysis | The computation of the coordinates for supplementary row and column points depends on the option button you select in the Standardization of Coordinates group box on the Options tab of the Correspondence Analysis Results dialog box. Let Rs and Cs be the matrices of relative row or column frequencies for the supplementary rows and columns, respectively. The supplementary row and column coordinates are then computed as described in the following options. |
Row & column profiles | The supplementary row and column coordinates are computed as Rs inverse(Dc)B and Cs inverse(Dr) A, respectively. |
Canonical standardization | The supplementary row and column coordinates are computed as Rs inverse(Dc)B(Du)^(-1/2) and Cs inverse(Dr) A(Du)^(-1/2), respectively. |
Row profiles (interpret row dist.) | The supplementary row and column coordinates are computed as Rs inverse(Dc)B and Cs inverse(Dr) A inverse(Du), respectively. |
Column profiles (interpret col. dist.) | The supplementary row and column coordinates are computed as Rs inverse(Dc) B inverse(Du) and Cs inverse(Dr) A, respectively. |
Supplementary points, multiple correspondence analysis | In multiple correspondence analysis, supplementary column coordinates are computed as Cs inverse(Dr) A inverse(Du). |
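
The computations for the Row & column profiles standardization can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the program's actual implementation. It assumes that P is the frequency table divided by its grand total, that r and c are its row and column totals, and that the decomposition is applied to P - rc' (so the trivial dimension with singular value 1 is excluded), which is consistent with the model equation. The function name `ca_sketch` and the example table are hypothetical.

```python
import numpy as np

def ca_sketch(N):
    """Simple correspondence analysis with principal (row & column profile) coordinates."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                    # relative frequencies
    r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses

    # An ordinary SVD of the standardized residuals yields the generalized
    # SVD of P - rc' in the metrics inverse(Dr) and inverse(Dc).
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, d, Vt = np.linalg.svd(S, full_matrices=False)
    k = min(N.shape) - 1               # at most min(rows, cols) - 1 dimensions
    U, d, V = U[:, :k], d[:k], Vt[:k].T

    A = np.sqrt(r)[:, None] * U        # A' inverse(Dr) A = I
    B = np.sqrt(c)[:, None] * V        # B' inverse(Dc) B = I
    F = (A / r[:, None]) * d           # F = inverse(Dr) A Du
    G = (B / c[:, None]) * d           # G = inverse(Dc) B Du
    return P, r, c, A, B, d, F, G

# Hypothetical 4 x 3 frequency table, used only for illustration.
table = [[20, 10, 5], [10, 25, 10], [5, 10, 30], [8, 8, 8]]
P, r, c, A, B, d, F, G = ca_sketch(table)

# Model equation: P is approximated by rc' + Dr F inverse(Du) G' Dc
# (exact here because all dimensions are retained).
P_hat = np.outer(r, c) + (r[:, None] * F / d) @ (G * c[:, None]).T

# Quality and inertia for the row points.
contrib = A**2 / r[:, None]                      # inverse(Dr) square(A)
sq = (A * d) ** 2                                # square(A Du)
quality = sq / sq.sum(axis=1, keepdims=True)     # Cosine^2 per dimension
t = (d**2).sum()                                 # total inertia
rel_inertia = sq.sum(axis=1) / r / t             # (1/t) inverse(Dr) square(A Du) 1

# Supplementary row point: Rs inverse(Dc) B. Projecting the profile of an
# active row should reproduce that row's principal coordinates.
Rs = P[0] / r[0]
supp_row = Rs @ (B / c[:, None])
```

The SVD of the standardized residuals is used here because NumPy offers no generalized SVD routine; rescaling the ordinary singular vectors by the square roots of the masses gives matrices A and B that satisfy the stated normalization.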