Correspondence Analysis - Computational Details

The notation used in this section closely follows that of Greenacre (1984); refer to Greenacre (1984) for a detailed discussion of the computations involved.

Option	Description
Notation	The computations are based on the following matrices and vectors: P is the matrix of relative frequencies, that is, each element of P is the respective frequency from the input table divided by the grand total of all values. r is the vector of row totals of P. c is the vector of column totals of P. D_r is a diagonal matrix whose diagonal elements are equal to the row totals of P. D_c is a diagonal matrix whose diagonal elements are equal to the column totals of P.
Singular value decomposition	The computation of the row and column coordinates is based on the generalized singular value decomposition of P: P = AD_uB', where A' inverse(D_r)A = B' inverse(D_c)B = I. Here A is the matrix of the left-side generalized singular vectors, B is the matrix of the right-side generalized singular vectors, D_u is a diagonal matrix with the diagonal elements equal to the generalized singular values, and I stands for the identity matrix (a diagonal matrix with 1's in the diagonal).
Coordinates for row and column points	The computation of the coordinates for row and column points depends on the option button you select in the Standardization of Coordinates group box on the Options tab of the Correspondence Analysis Results dialog box.
Row & column profiles	The row coordinates are computed based on the row profile matrix R = inverse(D_r)P, and the column coordinates are computed based on the column profile matrix computed analogously. Specifically, the row coordinates are computed as F = inverse(D_r)AD_u, and the column coordinates as G = inverse(D_c)BD_u. This option is appropriate when you are interested in interpreting both the distances between row points, and the distances between column points (the distances in both coordinate systems for row points and column points are Chi-square distances). However, note that, as discussed in the Introductory Overview, distances between column and row points are not meaningful.
Canonical standardization	The row coordinates are computed as F = inverse(D_r)A(D_u)^½, and the column coordinates as G = inverse(D_c)B(D_u)^½.
Row profiles (interpret row dist.)	The row coordinates are computed based on the row profile matrix R = inverse(D_r)P. Specifically, the (principal) row coordinates are computed as F = inverse(D_r)AD_u, and the standard column coordinates as G = inverse(D_c)B. This option is appropriate when you are interested in interpreting the distances between row points; the column coordinates should not be interpreted.
Column profiles (interpret col. dist.)	When reviewing the results for column points in multiple correspondence analysis, the column coordinates are computed based on the column profile matrix. Specifically, the (principal) column coordinates are computed as G = inverse(D_c)BD_u, and the standard row coordinates as F = inverse(D_r)A. This option is appropriate when you are interested in interpreting the distances between column points; the row coordinates should not be interpreted.
Model equation	When using the default method of standardization, the following model for P in k dimensions shows how the relative frequencies are approximated: P ≈ rc' + D_rF inverse(D_u)G'D_c. In this formula, F and G stand for the row and column coordinates, respectively.
Computation of quality and inertia	Note that the choice of the standardization method does not affect the computation of the quality and inertia values reported in the spreadsheet that is displayed when you click the Row and column coordinates button on the Advanced tab of the Correspondence Analysis Results dialog box. Those values are always computed based on the Row and column profiles standardization. Specifically, define the diag(x) operator as setting the elements of vector x into the diagonal of a diagonal matrix; define the square(X) operator as squaring each element in matrix or vector X; then the partial contributions of the row and column points to the total inertia are computed as inverse(D_r)square(A) and inverse(D_c)square(B), respectively. The quality (Cosine²) for the individual dimensions is computed as diag(inverse(square(AD_u)1)) square(AD_u) and diag(inverse(square(BD_u)1)) square(BD_u) for the row and column points, respectively, where 1 stands for a column vector with all elements equal to 1. The inertia for the row and column points is computed as (1/t) inverse(D_r) square(AD_u)1 and (1/t) inverse(D_c) square(BD_u)1, respectively, where t stands for the total inertia.
Supplementary points, simple correspondence analysis	The computation of the coordinates for supplementary row and column points depends on the option button you select in the Standardization of Coordinates group box on the Options tab of the Correspondence Analysis Results dialog box. Let R_s and C_s be the matrices of relative row or column frequencies for the supplementary rows and columns, respectively. The supplementary row and column coordinates are then computed as described in the following options.
Row & column profiles	The supplementary row and column coordinates are computed as R_s inverse(D_c)B and C_s inverse(D_r) A, respectively.
Canonical standardization	The supplementary row and column coordinates are computed as R_s inverse(D_c)B(D_u)^½ and C_s inverse(D_r) A(D_u)^½, respectively.
Row profiles (interpret row dist.)	The supplementary row and column coordinates are computed as R_s inverse(D_c)B and C_s inverse(D_r) A inverse(D_u), respectively.
Column profiles (interpret col. dist.)	The supplementary row and column coordinates are computed as R_s inverse(D_c) B inverse(D_u) and C_s inverse(D_r) A, respectively.
Supplementary points, multiple correspondence analysis	In multiple correspondence analysis, supplementary column coordinates are computed as C_s inverse(D_r) A inverse(D_u).
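
As a rough illustration of the computations in the table above, the following NumPy sketch works through the Row & column profiles standardization for a small table. The function name gsvd_of_p, the variable names, and the table of counts are assumptions made up for this example; they are not part of the product. The generalized singular value decomposition is obtained here through an ordinary SVD of a rescaled matrix, which is one common approach and not necessarily how the software computes it. The trivial first dimension (singular value 1, singular vectors r and c) is dropped, which accounts for the rc' term in the model equation.

```python
import numpy as np

def gsvd_of_p(N):
    """Return P, r, c and the nontrivial generalized singular triplets (A, u, B)."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                        # matrix of relative frequencies
    r, c = P.sum(axis=1), P.sum(axis=0)    # row and column totals of P
    # Ordinary SVD of the rescaled matrix, then rescale the singular vectors
    # so that A' inverse(D_r) A = B' inverse(D_c) B = I.
    U, u, Vt = np.linalg.svd(P / np.sqrt(np.outer(r, c)), full_matrices=False)
    A = np.sqrt(r)[:, None] * U
    B = np.sqrt(c)[:, None] * Vt.T
    # The first triplet is trivial (singular value 1); drop it.
    return P, r, c, A[:, 1:], u[1:], B[:, 1:]

# Hypothetical 4 x 3 table of counts, made up for illustration only.
N = [[20, 10, 5], [10, 15, 10], [5, 10, 20], [8, 6, 12]]
P, r, c, A, u, B = gsvd_of_p(N)

# Row & column profiles: F = inverse(D_r) A D_u, G = inverse(D_c) B D_u.
F = A * u / r[:, None]
G = B * u / c[:, None]

# Model equation using all nontrivial dimensions:
# P = rc' + D_r F inverse(D_u) G' D_c
P_hat = np.outer(r, c) + (r[:, None] * F / u) @ (G * c[:, None]).T

# Quality (cosine squared) and relative inertia of the row points.
sq = (A * u) ** 2                          # square(A D_u)
quality = sq / sq.sum(axis=1, keepdims=True)
t = (u ** 2).sum()                         # total inertia
rel_inertia = sq.sum(axis=1) / r / t       # (1/t) inverse(D_r) square(A D_u) 1

# Supplementary row under Row & column profiles: R_s inverse(D_c) B.
# Here the profile of the first active row is reused as a sanity check.
R_s = P[:1] / r[0]
F_s = (R_s / c) @ B
```

With all nontrivial dimensions retained, P_hat reproduces P up to rounding, and passing an active row's profile as a supplementary row returns that row's own coordinates. Keeping only the first k columns of F, G, and u gives the k-dimensional approximation.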

Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.