Unconstrained Minimization Techniques

SEPATH produces its estimates for the elements of θ by minimizing a discrepancy function with respect to θ. Finding the θ that minimizes the discrepancy function is a genuinely difficult problem; it is non-trivial even when θ has only one element.

"Unconstrained minimization," i.e., the minimization of a function of q, is a major area in the field of numerical analysis. If interested, you are urged to read an especially clear treatment of this area given by Dennis and Schnabel (1983).

The discussion begins with a nontechnical overview. The discrepancy function is minimized by an iterative process. The iteration starts with initial estimates (often referred to as "starting values") for the elements of θ. On each iteration, the current value of the function is calculated, and the program estimates, using derivatives, which direction of change in θ will produce a further decrease in the discrepancy function. θ is then changed in that direction by an initial amount (called the "step length"), and the function is recalculated. It may be that, according to certain criteria, the initial step went either too far or not far enough; in that case, the step length may go through several adjustments during an iteration. Once the "best" step length is estimated, the program moves on to the next iteration. When the discrepancy function stops improving, the algorithm terminates.
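As a concrete (if simplified) illustration, the Python sketch below has the same shape as the loop just described: a derivative-based direction, a trial step, step-length adjustment within the iteration, and termination when the discrepancy function stops improving. All names are hypothetical, and the steepest-descent direction with simple halving stands in for SEPATH's actual rules.

```python
import numpy as np

def minimize_sketch(F, grad, theta0, max_iter=100, tol=1e-8):
    """Generic shape of the loop described above (hypothetical names,
    not SEPATH's code).  Each iteration: evaluate F, pick a downhill
    direction from the derivatives, try a step, shrink the step length
    until F decreases, and stop once F stops improving."""
    theta = np.asarray(theta0, dtype=float)
    f_old = F(theta)
    for _ in range(max_iter):
        direction = -grad(theta)       # derivatives point the way downhill
        step = 1.0                     # initial step length
        theta_new = theta + step * direction
        while F(theta_new) >= f_old and step > 1e-12:
            step *= 0.5                # initial step went too far: adjust
            theta_new = theta + step * direction
        f_new = F(theta_new)
        if f_old - f_new < tol:        # no further improvement: terminate
            return theta_new
        theta, f_old = theta_new, f_new
    return theta
```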

Now for some technical details. SEPATH uses a minor variation of the Gauss-Newton type algorithm discussed by Mels (1989) and Browne (1982). Let Δ(θ) be the matrix of first partial derivatives of the nonduplicated elements of Σ(θ) with respect to the elements of θ. That is,

$$\Delta(\theta) = \frac{\partial\,\sigma(\theta)}{\partial\,\theta'} \tag{82}$$

where σ(θ) is the vector containing the nonduplicated elements of Σ(θ).

In the Gauss-Newton approach, H_k is an approximate Hessian, formed from the Jacobian and a weight matrix V whose form depends on the discrepancy function being minimized:

$$H_k = \Delta'(\theta_k)\, V\, \Delta(\theta_k) \tag{83}$$

and g_k is the negative gradient of the discrepancy function F with respect to θ, evaluated at θ_k, i.e.,

$$g_k = -\left.\frac{\partial F(\theta)}{\partial\,\theta}\right|_{\theta = \theta_k} \tag{84}$$

In the Gauss-Newton algorithm, the estimate on the kth iteration is related to the estimate on the next iteration by the formula:

$$\theta_{k+1} = \theta_k + \lambda_k\, d_k, \qquad d_k = H_k^{-1}\, g_k \tag{85}$$

The vector d_k establishes the direction of change for the parameters on the kth iteration, while the scalar parameter λ_k determines, jointly with d_k, the length of the step vector.
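To make equations 82 through 85 concrete, here is a minimal Python sketch of one Gauss-Newton update, assuming a quadratic-form discrepancy F(θ) = (s − σ(θ))′V(s − σ(θ)), where sigma(θ) returns the nonduplicated elements of Σ(θ) as a vector and s is the corresponding sample vector. The finite-difference Jacobian and all names are illustrative assumptions, not SEPATH's implementation, which would use analytic derivatives.

```python
import numpy as np

def jacobian_fd(sigma, theta, eps=1e-6):
    """Forward-difference approximation to the Jacobian of equation 82.
    (Illustrative only; a production implementation would use analytic
    derivatives.)"""
    base = sigma(theta)
    J = np.empty((base.size, theta.size))
    for j in range(theta.size):
        t = theta.copy()
        t[j] += eps
        J[:, j] = (sigma(t) - base) / eps
    return J

def gauss_newton_step(sigma, s, V, theta, lam=1.0):
    """One update theta_{k+1} = theta_k + lam * d_k with d_k = H_k^{-1} g_k
    (equation 85).  The true gradient and Gauss-Newton Hessian of this F
    both carry a factor of 2, which cancels when H_k^{-1} g_k is formed,
    so it is dropped here."""
    J = jacobian_fd(sigma, theta)
    r = s - sigma(theta)            # residual vector
    H = J.T @ V @ J                 # approximate Hessian, as in equation 83
    g = J.T @ V @ r                 # negative gradient, as in equation 84
    d = np.linalg.solve(H, g)       # search direction d_k
    return theta + lam * d
```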

Especially during the early phases of iteration, the program may attempt to take extremely large steps. Often, this causes no problem, and in fact hastens the progress toward a correct solution. However, sometimes large steps result in a set of parameter estimates that cause iteration to "blow up," either because the estimates yield a singular estimated covariance matrix, or because the estimates end up in a region from which recovery is impossible.
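One common way such a blow-up manifests is a trial Σ(θ) that is singular or not positive definite. The sketch below shows one plausible safeguard, offered as an illustration rather than SEPATH's documented behavior: test a trial implied covariance matrix before accepting a step.

```python
import numpy as np

def implied_covariance_ok(sigma_matrix):
    """Return False when trial parameter estimates yield a singular or
    non-positive-definite implied covariance matrix, one of the ways a
    too-long step can make iteration "blow up".  A Cholesky factorization
    succeeds only for symmetric positive-definite matrices."""
    try:
        np.linalg.cholesky(sigma_matrix)
        return True
    except np.linalg.LinAlgError:
        return False
```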

With SEPATH, you can control the length of the steps the program takes. The Maximum Step Length parameter b is set in the Global Iteration Parameters box of the Analysis Parameters dialog. If the vector d_k has length greater than b, it is rescaled by a positive constant so that its length becomes exactly b.
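This rescaling rule is simple enough to render in a few lines (names assumed for illustration):

```python
import numpy as np

def cap_step(d, b):
    """If the step vector d_k is longer than the Maximum Step Length b,
    rescale it by a positive constant so its length becomes exactly b;
    otherwise leave it unchanged."""
    length = np.linalg.norm(d)
    return d * (b / length) if length > b else d
```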

When d_k is first calculated, its length is compared to b, and d_k is rescaled if necessary. Then λ_k is set to 1, θ_{k+1} is calculated via Equation 85, and the new function value F(θ_{k+1}) is calculated.

If the new function value F(θ_{k+1}) is less than the value on the preceding iteration by a "reasonable" amount, the algorithm proceeds to the next iteration. What counts as "reasonable" is controlled by the line search parameter α. When α is small (for example, near the default value of .0001), virtually any reduction in the discrepancy function is acceptable. When α is larger, a greater decrease is required. (For a full technical discussion of the line search parameter α and how it is employed, consult Dennis and Schnabel, 1983, especially their equation 6.3.3a.)
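Dennis and Schnabel's criterion (their eq. 6.3.3a) is a sufficient-decrease (Armijo-style) condition. The sketch below is one plausible rendering of that test, with assumed names; note how a tiny α accepts almost any genuine reduction.

```python
def sufficient_decrease(F_new, F_old, gradient, d, lam, alpha=1e-4):
    """Sufficient-decrease test in the spirit of Dennis and Schnabel
    (1983, eq. 6.3.3a): accept the step when the achieved decrease is at
    least alpha times the decrease predicted by the gradient.  'gradient'
    is the gradient of F at the current estimate, d the search direction,
    and lam the step length; with the default alpha = .0001, virtually
    any real reduction passes."""
    predicted = alpha * lam * (gradient @ d)  # negative for a descent direction
    return F_new <= F_old + predicted
```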

Usually, an acceptable change occurs immediately. However, on some occasions F(θ_{k+1}) may actually exceed F(θ_k), or be much closer to it than expected. In such a case, λ_k is adjusted by a "line search" algorithm until an acceptable value of the discrepancy function is found. Note that, in effect, the minimization in terms of the t unknown elements of θ is temporarily reduced to a minimization problem in one unknown, namely λ_k. Usually, a good value of the discrepancy function can be found for 0 < λ_k <= 1. Some algorithms assume, in effect, that this will always occur, and constrain λ_k to within those limits. However, in some circumstances this will not occur: either a λ_k greater than 1 is required to achieve improvement in the discrepancy function, or the step direction is wrong (in effect, a negative λ_k is necessary to reduce the discrepancy function). In such cases, algorithms requiring 0 < λ_k <= 1 (such as the simple "step-halving" approach advocated by Mels, 1989, and Bentler, 1989) may exhibit erratic behavior and/or fail to converge.
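The following toy line search illustrates the point: unlike strict step-halving, its trial values include λ_k greater than 1 and negative λ_k. The trial grid and names are illustrative assumptions, not SEPATH's actual search rule.

```python
def line_search(F, theta, d, F_old,
                trial_lams=(1.0, 0.5, 0.25, 2.0, 4.0, -0.25, -1.0)):
    """Minimize over the single unknown lambda_k by trying a fixed grid
    of step lengths.  Because the grid contains values above 1 and below
    0, this search can still make progress in the cases where requiring
    0 < lambda_k <= 1 fails.  Returns (None, F_old) if no trial value
    improves on F_old."""
    best_lam, best_F = None, F_old
    for lam in trial_lams:
        F_try = F(theta + lam * d)
        if F_try < best_F:
            best_lam, best_F = lam, F_try
    return best_lam, best_F
```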