Statistical Estimation - General Properties of Discrepancy Functions

Define θ as the current vector of free parameter values. Let Σ(θ) represent a function that models the population covariance matrix Σ as a function of the t free parameter values in θ. The traditional approach to statistical estimation states the model as the null hypothesis

H0: Σ = Σ(θ)    (47)

Σ(θ) is assumed in our general discussion to be any twice differentiable function of θ. In practice, it is usually restricted to the particular form of the general model supported by the covariance structure software used for fitting the model to data. For example, when fitting covariance matrices, SEPATH is restricted to the model of Equation 41. In this case, assuming B, Γ, and Ξ have elements which are either fixed numerical values or elements of θ, one may write

Σ(θ) = G(B - I)⁻¹ΓΞΓ′(B′ - I)⁻¹G′    (48)
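As a rough illustration of Equation 48, the following Python sketch computes a model-implied covariance matrix from numerical matrices. It is not SEPATH code. The function name `implied_covariance` and the example values are hypothetical, and the sketch assumes that the outer matrix G is distinct from the inner matrix Γ, which is our reading of the equation rather than something stated in this section.

```python
import numpy as np

def implied_covariance(B, Gamma, Xi, G):
    """Evaluate G (B - I)^-1 Gamma Xi Gamma' (B' - I)^-1 G'."""
    n = B.shape[0]
    A = np.linalg.inv(B - np.eye(n))  # (B - I)^-1; its transpose is (B' - I)^-1
    return G @ A @ Gamma @ Xi @ Gamma.T @ A.T @ G.T

# Hypothetical example: two variables loading 0.8 and 0.6 on one common
# factor, each with its own residual term; no paths among the variables.
B = np.zeros((2, 2))
Gamma = np.array([[0.8, 1.0, 0.0],
                  [0.6, 0.0, 1.0]])
Xi = np.diag([1.0, 0.36, 0.64])  # factor variance and two residual variances
G = np.eye(2)                    # assumed here to keep both variables

Sigma_example = implied_covariance(B, Gamma, Xi, G)
print(Sigma_example)
```

With these values, both variances equal 1 and the covariance equals 0.8 × 0.6 = 0.48. In an actual model, some entries of B, Γ, and Ξ would be free parameters, so the result would change as the parameter vector changes.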

The discrepancy function F(S, Σ(θ)) is a scalar measure of how far the sample covariance matrix S is from the model-implied matrix Σ(θ). In general, if a model is identified (see Model Identification later in this section), minimization of a discrepancy function satisfying the following three restrictions will lead to consistent estimates for the elements of θ:

F(S, Σ(θ)) ≥ 0

F(S, Σ(θ)) = 0 if and only if S = Σ(θ).

F(S, Σ(θ)) is continuous in S and Σ(θ).
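To make the three restrictions concrete, the sketch below codes two widely used discrepancy functions in Python: the maximum likelihood form ln|Σ| - ln|S| + tr(SΣ⁻¹) - p and the unweighted least squares form ½ tr[(S - Σ)²], where p is the number of variables. These are the standard textbook forms and are assumed here for illustration; they are not necessarily the exact definitions implemented in any particular program. The function names `f_ml` and `f_ols` and the example matrices are hypothetical.

```python
import numpy as np

def f_ml(S, Sigma):
    """Maximum likelihood discrepancy between S and a model-implied Sigma."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p

def f_ols(S, Sigma):
    """Unweighted least squares discrepancy: half the sum of squared residuals."""
    R = S - Sigma
    return 0.5 * np.trace(R @ R)

S_demo = np.array([[1.0, 0.5],
                   [0.5, 1.0]])
Sigma_demo = np.eye(2)  # a model-implied matrix that does not reproduce S_demo

print(f_ml(S_demo, S_demo), f_ols(S_demo, S_demo))          # both zero when the fit is perfect
print(f_ml(S_demo, Sigma_demo), f_ols(S_demo, Sigma_demo))  # both positive otherwise
```

Both functions return zero when the two matrices are equal and a positive value when they differ, and both are continuous in their arguments wherever the matrices are positive definite.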

The above notation, which is employed in many books and papers on structural equation modeling, can be quite confusing in practice, because θ may stand for different quantities in different situations. For example, when we refer above to the discrepancy function F(S, Σ(θ)), θ stands for any permissible set of numbers employed as parameters in a model. In other contexts, the values in θ may acquire a more specific meaning. For example, when we are referring to the outcome of a maximum likelihood estimation process in which the maximum likelihood discrepancy function has been minimized as a function of θ, the elements of θ are now "maximum likelihood estimates."

Besides the sample discrepancy function F(S, Σ(θ)), we may also discuss the population discrepancy function F(Σ, Σ(θ)), which we would obtain if we somehow knew Σ, the population covariance matrix, and used our estimation algorithm to fit the structural model to Σ rather than to the sample covariance matrix S. We may write

(49)

Thus, the null hypothesis in Equation 47 may be expressed in several equivalent forms. For example,

H0: F(Σ, Σ(θ)) = 0    (50)

or

H0: (51)

As a simple consequence of the preceding definitions, when θ is identified (see Model Identification) and the null hypothesis is true, θ is uniquely defined for any discrepancy function. However, suppose the null hypothesis is not true, which under most conditions is the reasonable assumption. In this case, we might define the "population parameters" as those we would obtain if we somehow knew Σ and fit a model to Σ by minimizing a discrepancy function. The parameters in θ would then be defined as those that "fit best in the population." The subtle problem here is that different discrepancy functions will usually produce different θ values. Hence, although the point is hardly ever discussed in the literature, θ is, in practice, hardly ever uniquely defined unless you choose a particular discrepancy function (say, maximum likelihood) as your criterion for choosing θ "in the population." The difficulty is that discrepancy functions have been chosen primarily on the basis of their optimality properties for fitting a model to the sample matrix S, not for fitting a model to the population matrix Σ. The reader should keep that subtle point in mind when reading the following discussion of discrepancy functions.
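The dependence of the "population parameters" on the discrepancy function can be seen in a small numerical sketch. The Python example below is purely illustrative. It assumes a hypothetical one-parameter model in which both variances are fixed at 1 and the covariance is the free parameter, a hypothetical population covariance matrix that violates the model (one variance is 2), and the standard textbook forms of the maximum likelihood and unweighted least squares discrepancy functions. A simple grid search stands in for a proper minimization algorithm.

```python
import numpy as np

def f_ml(S, Sigma):
    """Maximum likelihood discrepancy (standard textbook form, assumed here)."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p

def f_ols(S, Sigma):
    """Unweighted least squares discrepancy (standard textbook form, assumed here)."""
    R = S - Sigma
    return 0.5 * np.trace(R @ R)

def model(theta):
    """Hypothetical model: unit variances, covariance equal to theta."""
    return np.array([[1.0, theta],
                     [theta, 1.0]])

# Hypothetical population matrix; the model is false because one variance is 2.
Sigma_pop = np.array([[2.0, 0.6],
                      [0.6, 1.0]])

# Grid search over admissible theta values in steps of 0.001.
grid = np.linspace(-0.99, 0.99, 1981)
theta_ml = grid[np.argmin([f_ml(Sigma_pop, model(t)) for t in grid])]
theta_ols = grid[np.argmin([f_ols(Sigma_pop, model(t)) for t in grid])]

print(theta_ml, theta_ols)
```

In this example the least squares criterion selects a value of about 0.60 (the population covariance itself), whereas the maximum likelihood criterion selects a value of about 0.31. Neither is "the" population parameter until a discrepancy function has been chosen. If the population matrix is replaced by one that satisfies the model, both criteria select the same value.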