Estimation of Variance Components - Estimating the Variation of Random Factors
The ANOVA method provides an integrative approach to estimating variance components, because ANOVA techniques can be used to estimate the variance of random factors, to estimate the components of variance in the dependent variable attributable to the random factors, and to test whether the variance components differ significantly from zero. The ANOVA method for estimating the variance of the random factors begins by constructing the Sums of squares and cross products (SSCP) matrix for the independent variables. The sums of squares and cross products for the random effects are then residualized on the fixed effects, leaving the random effects independent of the fixed effects, as required in the mixed model (see, for example, Searle, Casella, & McCulloch, 1992). The residualized Sums of squares and cross products for each random factor are then divided by their degrees of freedom to produce the coefficients in the Expected mean squares matrix. Nonzero off-diagonal coefficients for the random effects in this matrix indicate confounding, which must be taken into account when estimating the population variance for each factor. For the Wheat.sta data, treating both Variety and Plot as random effects, the coefficients in the Expected mean squares matrix show that the two factors are at least somewhat confounded. The Expected mean squares spreadsheet is shown below.
Expected Mean Squares (wheat.sta)
Mean Squares Type: 1
Source | Effect (F/R) | VARIETY | PLOT | Error |
{1}VARIETY | Random | 3.179487 | 1.000000 | 1.000000 |
{2}PLOT | Random | | 1.000000 | 1.000000 |
Error | | | | 1.000000 |
The coefficients in the Expected mean squares matrix are used to estimate the population variances of the random effects by equating each observed Mean square to its expectation and solving for the variance components. For example, using Type I Sums of squares, the expected Mean square for Variety is 3.179487 times the population variance for Variety, plus 1 times the population variance for Plot, plus 1 times the Error variance.
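The solving step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients come from the Expected mean squares spreadsheet above, with blank cells read as zeros and the Plot row read as applying to the Plot and Error columns; the mean squares are hypothetical placeholders (they are not results from Wheat.sta); and the helper name `solve_upper_triangular` is made up for this example.

```python
# Expected mean squares coefficients (Type I), rows and columns ordered
# Variety, Plot, Error. ASSUMPTION: blank spreadsheet cells are zeros.
EMS = [
    [3.179487, 1.0, 1.0],  # E[MS Variety]
    [0.0,      1.0, 1.0],  # E[MS Plot]
    [0.0,      0.0, 1.0],  # E[MS Error]
]


def solve_upper_triangular(coeffs, mean_squares):
    """Equate mean squares to their expectations and back-substitute."""
    n = len(mean_squares)
    components = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Contribution of the components that were already solved for.
        known = sum(coeffs[i][j] * components[j] for j in range(i + 1, n))
        components[i] = (mean_squares[i] - known) / coeffs[i][i]
    return components


# HYPOTHETICAL mean squares for Variety, Plot, Error (illustration only).
mean_squares = [0.50, 0.20, 0.10]

variety, plot, error = solve_upper_triangular(EMS, mean_squares)
print("Variety:", variety)
print("Plot:   ", plot)
print("Error:  ", error)
```

Because each component is obtained by subtraction, nothing in this procedure prevents an estimate from coming out negative when a mean square is smaller than the terms subtracted from it.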
The ANOVA method provides an integrative approach to estimating variance components, but it is not without problems: ANOVA estimates of variance components are generally biased, and can be negative, even though variances, by definition, must be either zero or positive. An alternative to ANOVA estimation is provided by maximum likelihood estimation. Maximum likelihood methods for estimating variance components are based on quadratic forms, and typically, but not always, require iteration to find a solution. Perhaps the simplest form of maximum likelihood estimation is MIVQUE(0) estimation. MIVQUE(0) produces Minimum Variance Quadratic Unbiased Estimators (MIVQUE). In MIVQUE(0) estimation, there is no weighting of the random effects (thus the 0 [zero] after MIVQUE), so an iterative solution for estimating variance components is not required.
MIVQUE(0) estimation begins by constructing the Quadratic sums of squares (SSQ) matrix. The elements for the random effects in the SSQ matrix can most simply be described as the sums of squares of the elements of the sums of squares and cross products matrix for each random effect in the model (after residualization on the fixed effects). The elements of the SSQ matrix provide coefficients, similar to the elements of the Expected mean squares matrix, which are used to estimate the covariances among the random factors and the dependent variable. The SSQ matrix for the Wheat.sta data is shown below. Note that the nonzero off-diagonal element for Variety and Plot again shows that the two random factors are at least somewhat confounded.
MIVQUE(0) Variance Component Estimation (wheat.sta)
SSQ Matrix
Source | VARIETY | PLOT | Error | DAMAGE |
{1}VARIETY | 31.90533 | 9.53846 | 9.53846 | 2.418964 |
{2}PLOT | 9.53846 | 12.00000 | 12.00000 | 1.318077 |
Error | 9.53846 | 12.00000 | 12.00000 | 1.318077 |
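As a rough illustration of how such a matrix can be turned into variance component estimates, the sketch below treats the VARIETY, PLOT, and Error columns as the coefficient matrix of a linear system and the DAMAGE column as its right-hand side, which is the usual form of the MIVQUE(0) equations. That reading of the spreadsheet is an assumption, as is the way the redundancy is handled: because the PLOT and Error rows are identical here, the full 3x3 system is singular, so the sketch solves a reduced 2x2 system for Variety and for the combined Plot-plus-Error component. It is not presented as the program's own algorithm.

```python
# SSQ matrix from the spreadsheet above.
# Column order in each row: VARIETY, PLOT, Error, DAMAGE.
SSQ = {
    "VARIETY": [31.90533, 9.53846, 9.53846, 2.418964],
    "PLOT":    [9.53846, 12.00000, 12.00000, 1.318077],
    "Error":   [9.53846, 12.00000, 12.00000, 1.318077],
}

# Identical PLOT and Error rows mean the two components cannot be
# separated from this matrix alone (the 3x3 system is singular).
plot_error_confounded = SSQ["PLOT"] == SSQ["Error"]

# ASSUMPTION: drop the redundant Error row and column and solve for
# Variety and the combined (Plot + Error) component.
a11, a12, b1 = SSQ["VARIETY"][0], SSQ["VARIETY"][1], SSQ["VARIETY"][3]
a21, a22, b2 = SSQ["PLOT"][0], SSQ["PLOT"][1], SSQ["PLOT"][3]

# Cramer's rule for the 2x2 system.
det = a11 * a22 - a12 * a21
variety_component = (b1 * a22 - a12 * b2) / det
plot_plus_error_component = (a11 * b2 - a21 * b1) / det

print("Plot and Error confounded:", plot_error_confounded)
print("Variety component:        ", round(variety_component, 6))
print("Plot + Error component:   ", round(plot_plus_error_component, 6))
```

How the combined component is then split between Plot and Error is a separate decision that this sketch does not make.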
Restricted Maximum Likelihood (REML) and Maximum Likelihood (ML) variance component estimation methods are closely related to MIVQUE(0). In fact, in the program, REML and ML use MIVQUE(0) estimates as start values for an iterative solution for the variance components, so the elements of the SSQ matrix serve as initial estimates of the covariances among the random factors and the dependent variable for both REML and ML.