Computational Details
Most of the methods used in the computation of mixed model ANOVA and the estimation of variance components are described in the Introductory Overview, in the Examples listed in the Variance Estimation and Precision Index, and in the sections on the relevant options. These methods are summarized and further details are given below. Note that in the Variance Estimation and Precision module, the general linear model approach is used only in computing analysis of variance results for the ANOVA estimation method. Residuals and least squares means are computed using the mixed model approach (see below for more information on these two types of models).
General Linear Models vs. Mixed Models
When analyzing data that includes both fixed and random effects, two different models can be used: a general linear model or a mixed linear model.
- General linear model approach
- The first approach is to treat all effects as if they were fixed, using the traditional general linear model
y = Xb+e
where
y is a vector of response values
X is a design matrix that accounts for all effects (fixed and random)
b is the unknown vector of parameter estimates and
e is the vector of unknown random error.
When this general linear model approach is used with mixed-model (random effects) designs, between effects are tested using relevant error terms based on the covariation of random sources of variation in the design. Specifically, this is done using Satterthwaite's method of denominator synthesis (Satterthwaite, 1946), which finds the linear combinations of sources of random variation that serve as appropriate error terms for testing the significance of the respective effect of interest. Variance components are estimated using the expected mean squares. In the Variance Estimation and Precision module, to fit a general linear model for the purpose of viewing ANOVA results with denominator synthesis results for random effects and expected mean squares, select the ANOVA estimation method. See Mixed Model ANOVA (below) for more discussion of how this is implemented in Variance Estimation and Precision. Note that the General Linear Model (GLM) module provides a comprehensive implementation of this approach.
- Mixed linear model approach
- A more "modern" approach to analyzing mixed models (or random effect models) is to fit a mixed model (i.e., a model that provides for both fixed and random effects, separately)
y = Xb+Zg+e
where
y is a vector of response values (observed data)
X is a design matrix that accounts for all fixed effects
b is the unknown vector of parameter estimates for fixed effects,
Z is a design matrix that accounts for all random effects
g is the unknown vector of parameter estimates for random effects and
e is the vector of unknown random error.
Note: with this model, the elements of e are no longer required to be independent or homogeneous. Additional theory (with examples) is provided in Littell et al. (1996) and Verbeke and Molenberghs (1997). When a mixed linear model is fit, variance components are typically estimated using a maximum likelihood method. Because the standard maximum likelihood technique (ML) produces biased estimates of the variance components, a restricted maximum likelihood technique (REML) is commonly used. In Variance Estimation and Precision, a mixed linear model approach is used in computing analysis of variance results for the REML estimation method as well as for computing residuals and least squares means for both estimation methods. See Maximum Likelihood Methods of Variance Component Estimation (below) for more details on how REML is implemented in the Variance Estimation and Precision module.
In STATISTICA, when a mixed model is fit, the random effects are always appended to the fixed effects model. This means that the fixed effects enter the model before the random effects.
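To make the two formulations concrete, the following minimal sketch (in Python with NumPy, which is not part of STATISTICA and is used here only for illustration) builds hypothetical design matrices X and Z for a small layout with one fixed treatment factor and one random batch factor, and simulates data from y = Xb + Zg + e. The factor layout, variance values, and variable names are all assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: 2 fixed treatment levels, 4 random batches, 3 replicates per batch
n_per_batch, n_batches = 3, 4
treatment = np.repeat([0, 1], (n_batches // 2) * n_per_batch)   # fixed factor codes
batch = np.repeat(np.arange(n_batches), n_per_batch)            # random factor codes
n = batch.size

# X: intercept plus treatment contrast (fixed effects)
X = np.column_stack([np.ones(n), treatment.astype(float)])
# Z: incidence matrix for the random batches (random effects)
Z = np.zeros((n, n_batches))
Z[np.arange(n), batch] = 1.0

# Simulate y = Xb + Zg + e with assumed variance components
b = np.array([10.0, 2.0])                       # fixed-effect parameters
g = rng.normal(0.0, np.sqrt(4.0), n_batches)    # random batch effects, variance 4
e = rng.normal(0.0, 1.0, n)                     # residual error, variance 1
y = X @ b + Z @ g + e

# Fixed-effect columns come first when the two matrices are combined
design = np.hstack([X, Z])

Under the general linear model approach, the batch columns would simply be treated as additional fixed-effect columns of X rather than as a separate Z.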
Mixed Model ANOVA (ANOVA Method)
Mixed Model ANOVA techniques are used when random effects are included in the model. The method for estimating the variance of random factors begins by constructing the Sums of squares and cross products (SSCP) matrix for the independent variables. The sums of squares and cross products for the random effects are then residualized on the fixed effects, leaving the random effects independent of the fixed effects, as required in the mixed model. Using the method of synthesis (Hartley, 1967), the residualized Sums of squares and cross products for each random factor are then divided by their degrees of freedom to produce the coefficients in the Expected mean squares matrix. The Expected mean squares are then used to estimate the variation of the random effects in the model. The Expected mean squares matrix can be produced using Type I, Type II, or Type III Sums of squares (Milliken & Johnson, 1992).
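The residualization step can be illustrated schematically with the hypothetical X, Z, and n defined in the sketch in the previous section: project out the space spanned by the fixed effects, form the residualized sums of squares and cross products for the random factor, and divide by the effect's degrees of freedom. This is only a sketch of the synthesis idea, under the assumption that the effect's degrees of freedom are taken as the rank of the residualized columns; it is not the full Type I/II/III computation.

# M = I - X (X'X)^- X' projects out the space spanned by the fixed effects
M = np.eye(n) - X @ np.linalg.pinv(X.T @ X) @ X.T
Z_resid = M @ Z

# Residualized sums of squares and cross products for the random batch factor
sscp_resid = Z_resid.T @ Z_resid

# Dividing the residualized sum of squares by the effect's degrees of freedom
# gives an expected-mean-square coefficient for the batch variance component
df_batch = np.linalg.matrix_rank(Z_resid)
ems_coefficient = np.trace(sscp_resid) / df_batch
# For this balanced hypothetical layout the coefficient equals 3, the number of
# replicates per batch, so E[MS(batch)] = error variance + 3 * batch variance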
To test the significance of effects in mixed or random models, error terms must be constructed that contain all the same sources of random variation except for the variation of the respective effect of interest. This is done using Satterthwaite's method of denominator synthesis (Satterthwaite, 1946), which finds the linear combinations of sources of random variation that serve as appropriate error terms for testing the significance of the respective effect of interest. To perform the tests of significance of effects, ratios of appropriate Mean squares are then formed to compute F statistics and p-values for each effect. Denominator degrees of freedom for corresponding synthesized error terms are computed using Satterthwaite's method. The resulting F tests generally are approximate, rather than exact, and can be based on fractional degrees of freedom, reflecting fractional sources of random variation from which the error terms are synthesized. These approximate F tests become undefined, and thus are not displayed, when the denominator degrees of freedom approach or become zero.
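As a numeric illustration of the Satterthwaite approximation, the short sketch below computes the denominator degrees of freedom for an error term synthesized as a linear combination of mean squares. The mean squares, their degrees of freedom, and the synthesis coefficients are made-up values; the formula df = (Σ c_i·MS_i)² / Σ (c_i·MS_i)²/df_i is the standard Satterthwaite (1946) result, and the fractional answer mirrors the fractional denominator degrees of freedom described above.

def satterthwaite_df(coefs, mean_squares, dfs):
    # Approximate df for an error term synthesized as sum(c_i * MS_i)
    terms = [c * ms for c, ms in zip(coefs, mean_squares)]
    synthesized_ms = sum(terms)
    return synthesized_ms ** 2 / sum(t ** 2 / df for t, df in zip(terms, dfs))

# Hypothetical mean squares, their degrees of freedom, and synthesis coefficients
ms  = [12.0, 8.0, 5.0]
dfs = [6.0, 4.0, 12.0]
c   = [1.0, 1.0, -1.0]
print(satterthwaite_df(c, ms, dfs))   # about 5.35, a fractional denominator df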
Maximum Likelihood Methods of Variance Component Estimation
- REML variance component estimation
- The basic idea behind REML estimation is to find the set of weights for the random effects in the model that minimizes the negative of the natural logarithm of the likelihood of the data (the likelihood of the data can vary from zero to one, so minimizing the negative of the natural logarithm of the likelihood amounts to maximizing the probability, or the likelihood, of the data). The method of maximum likelihood (the term was first used by Fisher, 1922a) is a general method of estimating parameters of a population by values that maximize the likelihood (L) of a sample. The likelihood L of a sample of n observations x1, x2, ..., xn is the joint probability function p(x1, x2, ..., xn) when x1, x2, ..., xn are discrete random variables. If x1, x2, ..., xn are continuous random variables, then the likelihood L of a sample of n observations x1, x2, ..., xn is the joint density function f(x1, x2, ..., xn).
Let L be the likelihood of a sample, where L is a function of the parameters θ1, θ2, ... θk. Then the maximum likelihood estimators of θ1, θ2, ... θk are the values of θ1, θ2, ... θk that maximize L.
Let θ be an element of Ω. If Ω is an open interval, and if L(θ) is differentiable and assumes a maximum on Ω, then the MLE will be a solution of the following equation: (dL(θ))/dθ = 0. For detailed information on maximum likelihood estimation, see Mendenhall and Sincich (1984), Bain and Engelhardt (1989), and Neter, Wasserman, and Kutner (1989). See also, Nonlinear Estimation.
REML uses MIVQUE(0) estimates (see Variance Components & Mixed Model ANCOVA) as start values for an iterative maximum likelihood solution for the variance components, so the elements of the SSQ matrix serve as initial estimates of the covariances among the random factors and the dependent variable. REML variance components are estimated by iteratively optimizing the parameter estimates for the effects in the model. In REML, the likelihood of the data is maximized only for the random effects, thus REML is a restricted solution.
The statistical theory underlying maximum likelihood variance component estimation techniques is an advanced topic (Searle, Casella, & McCulloch, 1992, is recommended as an authoritative and comprehensive source). Implementation of maximum likelihood estimation algorithms for variance components, furthermore, is difficult (see, for example, Hemmerle & Hartley, 1973, and Jennrich & Sampson, 1976, for descriptions of these algorithms), and faulty implementation can lead to variance component estimates that lie outside the parameter space, converge prematurely to nonoptimal solutions, or give nonsensical results. Milliken and Johnson (1992) noted all of these problems with the commercial statistical software packages they used to estimate variance components. In the Variance Components and Precision method of analysis, care has been taken to avoid these problems as much as possible. Note, for example, that for the analysis reported in Example 2: Variance Component Estimation for a Four-Way Mixed Factorial Design, most statistical packages do not give reasonable results, but the estimates computed in the Variance Components and Precision method of analysis are quite reasonable.
By the definition of variance, the parameter space for variance component estimates is from zero (inclusive) to infinity. One of the most formidable problems in the implementation of maximum likelihood variance component estimation algorithms involves dealing with negative variance component estimates. It is disconcerting but true that any method of variance component estimation can give negative variance component estimates for unbalanced designs (with unequal N's at different levels of the factors).
The algorithms used in the Variance Components and Precision method of analysis produce several notable effects when negative variance component estimates are encountered. First, negative estimates for the error component are not allowed. Adjustments to the estimates of other components are made to force the estimate for the error component to be non-negative. Second, any redundancy in the effects in the model at the current iteration is resolved by dropping one or more of the effects from the model for the current iteration, and setting their variance component estimates to zero. This step alone prevents many negative variance component estimates. However, dropped effects are allowed to reenter the model in subsequent iterations if a non-negative variance component is estimated for them. Third, the variance component estimate for a random effect (other than for the error component) is always set to zero when a negative variance component estimate is encountered, and the resulting model is evaluated accordingly. Some of these steps may seem extreme, but they have been used successfully in other implementations of maximum likelihood variance component estimation algorithms, and are certainly preferable to interpreting a model with negative variance components estimates.
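To make the iterative estimation idea concrete, the sketch below minimizes the negative restricted log-likelihood of a simple one-way random-effects model (one random factor plus error), reusing the hypothetical y, X, and Z simulated earlier and constraining the variance components so that they stay inside the parameter space. The starting values and the use of SciPy's general-purpose optimizer are assumptions made for illustration; this is not the MIVQUE(0)-seeded algorithm used in STATISTICA.

import numpy as np
from scipy.optimize import minimize

def neg_restricted_loglik(var_components, y, X, Z):
    # -2 * restricted log-likelihood (up to a constant) for V = s2_g*ZZ' + s2_e*I
    s2_g, s2_e = var_components
    V = s2_g * (Z @ Z.T) + s2_e * np.eye(y.size)
    V_inv = np.linalg.inv(V)
    XtVinvX = X.T @ V_inv @ X
    beta = np.linalg.solve(XtVinvX, X.T @ V_inv @ y)   # GLS fixed-effect estimates
    resid = y - X @ beta
    _, logdet_V = np.linalg.slogdet(V)
    _, logdet_XtVinvX = np.linalg.slogdet(XtVinvX)
    return logdet_V + logdet_XtVinvX + resid @ V_inv @ resid

# Assumed start values; the bounds keep the batch component non-negative and the
# error component strictly positive, echoing the safeguards described above
start = [1.0, 1.0]
fit = minimize(neg_restricted_loglik, start, args=(y, X, Z),
               method="L-BFGS-B", bounds=[(0.0, None), (1e-8, None)])
s2_batch, s2_error = fit.x   # REML estimates of the batch and error variance components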
- Degrees of freedom
- When the REML estimation method is used, the variance/covariance matrix for the variance components is the inverse of the Hessian matrix (H; the matrix of second-order partial derivatives). Degrees of freedom for the variance components are a function of the Wald statistic (Z), which is calculated as the variance component estimate divided by its standard error (the square root of the corresponding diagonal element of the H⁻¹ matrix). For REML variance component estimates,
df = 2Z².
When REML variance component estimates are combined, the degrees of freedom are computed from the standard error for the combined variance estimate. Specifically, the variance Var[VA + VB] of the combined variance components VA + VB is computed as Var[VA + VB] = Var[VA] + Var[VB] + 2*Covar[VA, VB]. Thus, the new Z (Wald statistic) for this combined variance estimate would be:
Z = (VA + VB) / (Var[VA] + Var[VB] + 2*Covar[VA, VB])^(1/2)
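The short sketch below applies these formulas, computing the Wald statistic and df = 2Z² for a single component and then the Z and df for a combined component. All of the numeric values (estimates, variances, and covariance) are made up for illustration and would normally come from the diagonal and off-diagonal elements of H⁻¹.

import math

# Hypothetical REML estimates and elements of the asymptotic covariance matrix
v_a, v_b = 4.2, 1.7           # variance component estimates
var_va, var_vb = 3.1, 0.9     # their variances (diagonal of H^-1)
cov_vavb = -0.4               # their covariance (off-diagonal of H^-1)

# Single component: standard error, Wald Z, and degrees of freedom
se_a = math.sqrt(var_va)
z_a = v_a / se_a
df_a = 2 * z_a ** 2

# Combined component VA + VB
var_sum = var_va + var_vb + 2 * cov_vavb
z_sum = (v_a + v_b) / math.sqrt(var_sum)
df_sum = 2 * z_sum ** 2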
- Hessian was singular at point of convergence
- In some cases you may see this message at the top of the spreadsheet reporting the estimates of the variance components; usually, you will also see at least one component with a reported asymptotic standard error of 0 (zero). This message indicates that during the iterative computations, at least two components were found to be entirely redundant, and that no unique estimates can be derived for those components. This may, for example, happen when there are 0 (zero) degrees of freedom for the error term, and hence, the error variance is not defined.
The computational algorithm used in STATISTICA always attempts to resolve any redundancies so as to retain a positive (>0) estimate for the Error variance component (otherwise, if during the estimation an intermediate solution makes the Error component equal to 0 [zero], the algorithm may prematurely terminate with a non-maximum likelihood solution). However, when this message appears in the header of a results spreadsheet in your analysis, you should carefully review the current model (e.g., click the ANOVA table button on the Variance Estimation and Precision Results dialog - Summary tab), and interpret the estimates of the variance components with caution (as they are not unique).
Note: there are several modules in STATISTICA that will perform Analysis of Variance for factorial or specialized designs. For a discussion of these modules and the types of designs for which they are best suited, refer to Methods for Analysis of Variance. Note also that the General Linear Model (GLM) module can analyze designs with any number and type of between effects and compute ANOVA-based variance component estimates for any effect in a mixed-model analysis using any of the six types of sums of squares.
Standard Errors and Confidence Bounds for REML Variance Estimates
The Variance Estimates spreadsheet (available by clicking Variance estimates on the Variance Evaluation tab of the Variance Estimation and Precision Results dialog) reports the variance estimate for each random effect in the model as well as a variety of other statistics. When REML analysis is used, this spreadsheet includes the standard errors and confidence bounds for those variance estimates. Information on the computation of variance estimates and their degrees of freedom is given above. Recall that degrees of freedom for REML are calculated as 2Z², where Z is the Wald statistic (i.e., the variance component divided by its standard error). Other results are calculated as shown below. Note that these values are not reported when ANOVA analysis is used.
- Standard error
- For REML analysis, the standard error is calculated as the square root of the corresponding diagonal element of the H⁻¹ matrix (as described above).
- Z
- For REML analysis, Z is calculated as a simple Wald statistic (i.e., the variance estimate divided by its standard error).
- p of Z
- The p of Z value is the one-tailed area of the standard normal distribution beyond the Z score, that is, the probability of observing a larger Z value by chance.
- Alpha
- This column reports the alpha used in calculating the confidence bounds.
- Lower CI/Upper CI
- When REML analysis is selected, confidence bounds are reported for the variance estimates. The bounds are given by
Lower CI = ν·V / χ²(1 − α/2, ν)
Upper CI = ν·V / χ²(α/2, ν)
where
V is the variance component estimate,
ν are the degrees of freedom (2Z², as described above),
and the denominators are quantiles of the χ² distribution with ν degrees of freedom.
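A minimal numeric sketch of these spreadsheet columns, assuming the chi-square-based bounds shown above and reusing the hypothetical estimate v_a, Wald statistic z_a, and degrees of freedom df_a from the earlier sketch (SciPy's distribution functions stand in for whatever routines STATISTICA uses internally):

from scipy.stats import norm, chi2

alpha = 0.05

# p of Z: one-tailed area of the normal distribution beyond the Wald statistic
p_of_z = norm.sf(z_a)

# Confidence bounds for the variance estimate from the chi-square quantiles
lower_ci = df_a * v_a / chi2.ppf(1 - alpha / 2, df_a)
upper_ci = df_a * v_a / chi2.ppf(alpha / 2, df_a)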