Variance Components - Computational Details

Most of the methods used in the computation of mixed model ANOVA and the estimation of variance components are described in the Introductory Overview, in the Examples listed in the Variance Components and Mixed Model ANOVA/ANCOVA Index, and in the sections on the relevant options. These methods are summarized and further details are given below.

Mixed Model ANOVA

Mixed Model ANOVA techniques are used when random effects are included in the model. The method for estimating the variance of random factors begins by constructing the Sums of squares and cross products (SSCP) matrix for the independent variables. The sums of squares and cross products for the random effects are then residualized on the fixed effects, leaving the random effects independent of the fixed effects, as required in the mixed model. Using the method of synthesis (Hartley, 1967), the residualized Sums of squares and cross products for each random factor are then divided by their degrees of freedom to produce the coefficients in the Expected mean squares matrix. The Expected mean squares matrix is then used to estimate the variation of the random effects in the model. The Expected mean squares matrix can be produced using Type I, Type II, or Type III Sums of squares (Milliken & Johnson, 1992).
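
To make the residualization step concrete, the following is a minimal numpy sketch under assumed dummy codings (X spans a hypothetical fixed effect, Z a hypothetical four-level random factor). The final division of the residualized sum of squares by its degrees of freedom to form an expected-mean-squares coefficient follows the description above; it is schematic only and is not a full implementation of the method of synthesis.

    import numpy as np

    # Hypothetical dummy codings: X spans the fixed effect, Z a 4-level random factor.
    n = 24
    X = np.column_stack([np.ones(n), np.repeat([0.0, 1.0], n // 2)])
    Z = np.kron(np.ones((2, 1)), np.kron(np.eye(4), np.ones((3, 1))))

    # Residualize the random-effect columns on the fixed effects.
    P = X @ np.linalg.pinv(X.T @ X) @ X.T      # projection onto the fixed effects
    R = Z - P @ Z                              # residualized random-effect columns
    print(np.allclose(X.T @ R, 0.0))           # True: independent of the fixed effects

    # Divide the residualized sum of squares by the degrees of freedom to obtain
    # a coefficient for the Expected mean squares matrix (schematic).
    df = np.linalg.matrix_rank(R)              # 3 degrees of freedom here
    coefficient = np.sum(R**2) / df
    print(coefficient)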

To test the significance of effects in mixed or random models, error terms must be constructed that contain all the same sources of random variation except for the variation of the respective effect of interest. This is done using Satterthwaite's method of denominator synthesis (Satterthwaite, 1946), which finds the linear combinations of sources of random variation that serve as appropriate error terms for testing the significance of the respective effect of interest. To perform the tests of significance of effects, ratios of appropriate Mean squares are then formed to compute F statistics and p-values for each effect. Denominator degrees of freedom for corresponding synthesized error terms are computed using Satterthwaite's method. The resulting F tests generally are approximate, rather than exact, and can be based on fractional degrees of freedom, reflecting fractional sources of random variation from which the error terms are synthesized. These approximate F tests become undefined, and thus are not displayed, when the denominator degrees of freedom approach or become zero.
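
As a hedged illustration of the denominator synthesis and of Satterthwaite's degrees-of-freedom formula, the sketch below uses hypothetical mean squares, weights, and degrees of freedom; in practice the weights would come from the synthesis described above.

    import numpy as np
    from scipy import stats

    # Hypothetical mean squares, their degrees of freedom, and the weights
    # obtained from the denominator synthesis (illustrative values only).
    ms = np.array([12.4, 3.1, 1.7])
    df = np.array([4.0, 12.0, 36.0])
    c  = np.array([1.0, 0.5, -0.5])

    ms_error = np.sum(c * ms)                      # synthesized error mean square

    # Satterthwaite (1946): df = (sum c_i*MS_i)^2 / sum (c_i*MS_i)^2 / df_i,
    # which is generally fractional.
    df_error = ms_error**2 / np.sum((c * ms)**2 / df)

    ms_effect, df_effect = 20.8, 2.0               # effect being tested (hypothetical)
    F = ms_effect / ms_error                       # approximate F statistic
    p = stats.f.sf(F, df_effect, df_error)         # p-value with fractional denominator df
    print(F, df_error, p)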

Overparameterized and sigma-restricted models
In one line of literature, the analysis of multi-factor ANOVA designs is generally discussed in terms of the so-called Sigma-restricted model. In short, the ANOVA parameters are constrained to sum to zero; in this manner, given k levels of a factor, the k-1 parameters (corresponding to the k-1 degrees of freedom) can readily be estimated (e.g., Lindeman, 1974; Snedecor and Cochran, 1989, p. 322). Another tradition discusses ANOVA in the context of the unconstrained (and thus over-parameterized) general linear model (e.g., Kirk, 1968). For mixed (random and fixed effect) models, the two approaches can produce different results. For this discussion, suppose you had a two-way mixed model design: Subject (random) by Treatment (fixed). It turns out that if you start with the sigma-restricted model, the expected mean square for the random effect (i.e., Subject) does not contain the two-way interaction; however, without the sigma restriction, it does (compare tables 4.6 and 4.7 in Searle, Casella, & McCulloch, 1992; the derivations of the expected mean squares in the two cases are also discussed in that reference).
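
As a small, hypothetical illustration of the two parameterizations, the sketch below builds both codings for a single three-level factor: the sigma-restricted (effect) coding uses k-1 columns whose parameters sum to zero, while the overparameterized coding carries one indicator column per level.

    import numpy as np

    # Hypothetical factor with k = 3 levels, two observations per level.
    levels = np.array([0, 0, 1, 1, 2, 2])

    # Sigma-restricted (effect) coding: k - 1 columns; the last level is coded -1,
    # so the implied level effects sum to zero.
    effect = np.zeros((len(levels), 2))
    for j in range(2):
        effect[levels == j, j] = 1.0
    effect[levels == 2, :] = -1.0

    # Overparameterized coding: one 0/1 indicator column per level (the columns
    # sum to the intercept column, so the model matrix is rank deficient once an
    # intercept is added).
    overparam = np.zeros((len(levels), 3))
    overparam[np.arange(len(levels)), levels] = 1.0

    print(effect)
    print(overparam)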

The next question, of course, is which expected mean square is right; the answer is "it depends." Searle, Casella, and McCulloch (1992) give a detailed discussion of the advantages and disadvantages of the two approaches (they conclude that the question "has no definitive, universally acceptable answer," p. 126). However, Searle et al. also point out that when the parameters of the fixed effects "are being taken as realized values of random variables, it is not realistic to have them summing to zero" (Searle et al., p. 123). Moreover, for the case of unbalanced data, this restriction is rarely even considered. Therefore, most general linear model routines (including the Variance Components and Mixed Model ANOVA/ANCOVA method of analysis) that estimate expected mean squares for mixed models will usually use the solution for the over-parameterized model. However, note that STATISTICA ANOVA/MANOVA uses (by default) the means model approach, and will construct F-tests for mixed models that are consistent with the sigma-restricted model (which is the ANOVA "tradition" most commonly discussed in statistics textbooks in the biological and social sciences).

Maximum Likelihood Methods of Variance Component Estimation

MIVQUE(0) variance component estimation. MIVQUE(0) is the simplest of the estimation methods in this group. Maximum likelihood methods for estimating variance components are based on quadratic forms, and typically, but not always, require iteration to find a solution. MIVQUE(0) estimation begins by constructing the Quadratic sums of squares (SSQ) matrix. Each element of the SSQ matrix is the trace of the product matrix for each combination of partitions of the design matrix involving the random effects and the dependent variable, after residualizing on the fixed effects. Perhaps more simply, the elements of the SSQ matrix can be described as the sums of squares of the sums of squares and cross products for each random effect in the model (after residualization on the fixed effects).

MIVQUE(0) variance components are estimated by inverting the partition of the SSQ matrix that does not include the dependent variable (or finding the generalized inverse, for singular matrices), and postmultiplying the inverse by the dependent variable column vector. This amounts to solving the system of equations that relates the dependent variable to the random independent variables, taking into account the covariation among the independent variables. In MIVQUE(0) estimation, unlike REML or ML estimation, there is no weighting of the random effects (thus the 0 [zero] after MIVQUE), so an iterative solution for estimating variance components is not required.
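
Taken together, the two steps above can be sketched as follows in numpy, under assumed dummy codings for the fixed and random effects (X, Z1, Z2, and y are all hypothetical, and the error component is omitted for brevity). This is a schematic illustration of the computations described above, not a full implementation.

    import numpy as np

    # Hypothetical design: X spans the fixed effects; Z1 and Z2 are dummy codings
    # for two random effects; y is the dependent variable.
    rng = np.random.default_rng(0)
    n = 24
    X  = np.column_stack([np.ones(n), np.repeat([0.0, 1.0], n // 2)])
    Z1 = np.kron(np.ones((2, 1)), np.kron(np.eye(4), np.ones((3, 1))))   # 4 levels
    Z2 = np.kron(np.ones((4, 1)), np.kron(np.eye(3), np.ones((2, 1))))   # 3 levels
    y  = rng.normal(size=n)

    M = np.eye(n) - X @ np.linalg.pinv(X.T @ X) @ X.T   # residualizes on the fixed effects

    def ssq(A, B):
        # Sum of squared elements of the residualized cross products A' M B, i.e.,
        # the "sums of squares of the sums of squares and cross products".
        return np.sum((A.T @ M @ B) ** 2)

    parts = [Z1, Z2]
    SSQ   = np.array([[ssq(a, b) for b in parts] for a in parts])   # random-effects partition
    ssq_y = np.array([ssq(a, y[:, None]) for a in parts])           # dependent-variable column

    # Invert (or pseudo-invert) the random-effects partition and postmultiply by
    # the dependent-variable column; no iteration is needed because the random
    # effects are unweighted in MIVQUE(0).
    estimates = np.linalg.pinv(SSQ) @ ssq_y
    print(estimates)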

REML and ML variance component estimation
The basic idea behind both REML and ML estimation is to find the set of weights for the random effects in the model that minimizes the negative of the natural logarithm of the likelihood of the data (the likelihood of the data can vary from zero to one, so minimizing the negative natural logarithm of the likelihood amounts to maximizing the probability, or the likelihood, of the data). The method of maximum likelihood (the term first used by Fisher, 1922a) is a general method of estimating parameters of a population by values that maximize the likelihood (L) of a sample. If x1, x2, ..., xn are discrete random variables, then the likelihood L of a sample of n observations x1, x2, ..., xn is the joint probability function p(x1, x2, ..., xn). If x1, x2, ..., xn are continuous random variables, then the likelihood L of a sample of n observations x1, x2, ..., xn is the joint density function f(x1, x2, ..., xn).

Let L be the likelihood of a sample, where L is a function of the parameters θ1, θ2, ..., θk. Then the maximum likelihood estimators of θ1, θ2, ..., θk are the values of θ1, θ2, ..., θk that maximize L.

Let θ be an element of Ω. If Ω is an open interval, and if L(θ) is differentiable and assumes a maximum on Ω, then the MLE will be a solution of the equation dL(θ)/dθ = 0. For detailed information on maximum likelihood estimation, see Mendenhall and Sincich (1984), Bain and Engelhardt (1989), and Neter, Wasserman, and Kutner (1989). See also Nonlinear Estimation.
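
For a concrete (hypothetical) example, the sketch below estimates the rate parameter of an exponential sample by numerically minimizing the negative log-likelihood; in this case the solution of dL(θ)/dθ = 0 is 1 divided by the sample mean, and the two estimates agree.

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical exponential sample with true rate 0.5.
    rng = np.random.default_rng(1)
    x = rng.exponential(scale=2.0, size=200)

    def neg_log_likelihood(theta):
        # Minimizing -ln L(theta) is equivalent to maximizing L(theta).
        return -(len(x) * np.log(theta) - theta * np.sum(x))

    fit = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10.0), method="bounded")
    print(fit.x, 1.0 / np.mean(x))    # numerical MLE vs. closed-form solution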

REML and ML use MIVQUE(0) estimates as start values for an iterative maximum likelihood solution for the variance components, so the elements of the SSQ matrix serve as initial estimates of the covariances among the random factors and the dependent variable. REML and ML variance components are estimated by iteratively optimizing the parameter estimates for the effects in the model. REML differs from ML in that the likelihood of the data is maximized only for the random effects; thus, REML is a restricted solution.
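
As a hedged sketch of this idea, the code below fits a hypothetical one-way random model y = Xb + Zu + e by numerically minimizing the (restricted) negative log-likelihood in the two variance components. The start values simply stand in for the MIVQUE(0) estimates described above; this illustrates the general approach rather than the algorithm used in the Variance Components and Mixed Model ANOVA/ANCOVA method of analysis.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical one-way random design: 8 groups, 5 observations per group.
    rng = np.random.default_rng(2)
    groups, per = 8, 5
    n = groups * per
    Z = np.kron(np.eye(groups), np.ones((per, 1)))
    X = np.ones((n, 1))
    y = 10.0 + Z @ rng.normal(0.0, 2.0, groups) + rng.normal(0.0, 1.0, n)

    def neg_log_lik(params, reml=True):
        s2_u, s2_e = np.exp(params)                    # keep both components positive
        V = s2_u * (Z @ Z.T) + s2_e * np.eye(n)
        Vi = np.linalg.inv(V)
        XtViX = X.T @ Vi @ X
        beta = np.linalg.solve(XtViX, X.T @ Vi @ y)    # GLS estimate of the fixed effects
        r = y - X @ beta
        nll = 0.5 * (np.linalg.slogdet(V)[1] + r @ Vi @ r)
        if reml:                                       # restricted likelihood adds this term
            nll += 0.5 * np.linalg.slogdet(XtViX)[1]
        return nll

    start = np.log([1.0, 1.0])                         # stand-in for MIVQUE(0) start values
    fit = minimize(neg_log_lik, start, args=(True,), method="Nelder-Mead")
    print(np.exp(fit.x))                               # REML estimates of s2_u and s2_e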

The statistical theory underlying maximum likelihood variance component estimation techniques is an advanced topic (Searle, Casella, & McCulloch, 1992, is recommended as an authoritative and comprehensive source). Implementation of maximum likelihood estimation algorithms for variance components, furthermore, is difficult (see, for example, Hemmerle & Hartley, 1973, and Jennrich & Sampson, 1976, for descriptions of these algorithms), and faulty implementation can lead to variance component estimates that lie outside the parameter space, converge prematurely to nonoptimal solutions, or give nonsensical results. Milliken and Johnson (1992) noted all of these problems with the commercial statistical software packages they used to estimate variance components. In the Variance Components and Mixed Model ANOVA/ANCOVA method of analysis, care has been taken to avoid these problems as much as possible. Note, for example, that for the analysis reported in Example 2: Variance Component Estimation for a Four-Way Mixed Factorial Design, most statistical packages do not give reasonable results, but the estimates computed in the Variance Components and Mixed Model ANOVA/ANCOVA method of analysis are quite reasonable.

By the definition of variance, the parameter space for variance component estimates is from zero (inclusive) to infinity. One of the most formidable problems in the implementation of maximum likelihood variance component estimation algorithms involves dealing with negative variance component estimates. It is disconcerting but true that any method of variance component estimation can give negative variance component estimates for unbalanced designs (with unequal N's at different levels of the factors).
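
A simple (hypothetical) illustration: in a one-way random design, the classical ANOVA estimator of the between-groups component, (MS between - MS within) divided by an effective group size, is negative whenever the between-groups mean square falls below the within-groups mean square, as with the unbalanced data below.

    import numpy as np

    # Hypothetical unbalanced data: three groups with 4, 2, and 3 observations.
    y = {
        "g1": np.array([4.9, 5.1, 5.0, 5.2]),
        "g2": np.array([5.0, 5.1]),
        "g3": np.array([4.8, 5.2, 5.0]),
    }
    n_i = np.array([len(v) for v in y.values()])
    N, k = n_i.sum(), len(y)
    grand = np.concatenate(list(y.values())).mean()

    ss_between = sum(n * (v.mean() - grand) ** 2 for n, v in zip(n_i, y.values()))
    ss_within  = sum(((v - v.mean()) ** 2).sum() for v in y.values())
    ms_between = ss_between / (k - 1)
    ms_within  = ss_within / (N - k)

    n0 = (N - (n_i**2).sum() / N) / (k - 1)         # effective group size (unbalanced)
    sigma2_u = (ms_between - ms_within) / n0        # negative for these data
    print(ms_between, ms_within, sigma2_u)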

The algorithms used in the Variance Components and Mixed Model ANOVA/ANCOVA method of analysis take several notable steps when negative variance component estimates are encountered. First, negative estimates for the error component are not allowed. Adjustments to the estimates of other components are made to force the estimate for the error component to be non-negative. Second, any redundancy in the effects in the model at the current iteration is resolved by dropping one or more of the effects from the model for the current iteration and setting their variance component estimates to zero. This step alone prevents many negative variance component estimates. However, dropped effects are allowed to reenter the model in subsequent iterations if a non-negative variance component is estimated for them. Third, the variance component estimate for a random effect (other than the error component) is always set to zero when a negative variance component estimate is encountered, and the resulting model is evaluated accordingly. Some of these steps may seem extreme, but they have been used successfully in other implementations of maximum likelihood variance component estimation algorithms, and they are certainly preferable to interpreting a model with negative variance component estimates.
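
The constraint handling described above can be summarized schematically as follows; the function name, the dictionary of component estimates, and the set of dropped effects are hypothetical conveniences for illustration, not the actual data structures used by the software.

    # Hypothetical sketch of one iteration's constraint handling for variance
    # component estimates (all names are illustrative).
    def constrain(estimates, dropped):
        for name, value in estimates.items():
            if name == "Error":
                continue                      # the error component is adjusted separately
            if value < 0.0:
                estimates[name] = 0.0         # set the negative component to zero ...
                dropped.add(name)             # ... and drop the effect for this iteration
            elif name in dropped:
                dropped.discard(name)         # allow the effect to reenter later
        return estimates, dropped

    est, dropped = constrain({"A": -0.3, "B": 1.2, "Error": 0.8}, set())
    print(est, dropped)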

"Singular Hessian matrix at point of conversion in maximum likelihood estimation." In some cases you may see this message at the top of the spreadsheet reporting the estimates of the variance components; usually, you will also see at least one component with a reported asymptotic standard error of 0 (zero). This message indicates that during the iterative computations, at least two components were found to be entirely redundant, and that no unique estimates can be derived for those components. This may, for example, happen when there are 0 (zero) degrees of freedom for the error term, and hence, when the error variance is not defined. The computational algorithm used in STATISTICA always attempts to resolve any redundancies so as to retain a positive (>0) estimate for the Error variance component (otherwise, if during the estimation an intermediate solution makes the Error component equal to 0 [zero], the algorithm may prematurely terminate with a non-maximum likelihood solution). However, when this message appears in the header of a results spreadsheet in your analysis, you should carefully review the current model (e.g., click the SS and MS for effects button on the Variance Components and Mixed Model ANOVA/ANCOVA Results dialog - Estimation tab), and interpret the estimates of the variance components with caution (as they are not unique!).

Note: there are several modules in STATISTICA that will perform Analysis of Variance for factorial or specialized designs. For a discussion of these modules and the types of designs for which they are best suited, refer to Methods for Analysis of Variance. Note also that the Visual General Linear Model (GLM) module can analyze designs with any number and type of between effects and compute ANOVA-based variance component estimates for any effect in a mixed-model analysis using any of the six types of sums of squares.