Rules for SEPATH Path Diagrams
In this section, we establish rules for path diagrams that guarantee a diagram accurately represents any model that fully accounts for all variances and covariances of all variables, both manifest and latent. Our rules are based on the following considerations.
Path diagrams consist of variables connected by wires and arrows, representing, respectively, undirected and directed relationships. Each variable is either endogenous or exogenous, and either manifest or latent. Hence any variable falls into one of four categories: (a) manifest endogenous, (b) manifest exogenous, (c) latent endogenous, and (d) latent exogenous.
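The four-way classification can be sketched in a few lines of Python. This is only an illustration, not part of SEPATH: the function name `classify`, the tuple representation of arrows, and the variable names are hypothetical, and the sketch assumes that a variable counts as endogenous when at least one arrow points to it.

```python
def classify(variable, manifest, arrows):
    """Return one of the four categories for a variable.

    Assumption: a variable is endogenous if some arrow points to it,
    and latent if it is not listed among the manifest variables.
    """
    is_endogenous = any(target == variable for _source, target in arrows)
    kind = "manifest" if variable in manifest else "latent"
    role = "endogenous" if is_endogenous else "exogenous"
    return f"{kind} {role}"


# Hypothetical diagram: X -> F, F -> Y, E -> Y, with X and Y manifest.
manifest = {"X", "Y"}
arrows = [("X", "F"), ("F", "Y"), ("E", "Y")]

categories = {v: classify(v, manifest, arrows) for v in ("X", "Y", "F", "E")}
```

In this example X is manifest exogenous, Y is manifest endogenous, F is latent endogenous, and E is latent exogenous.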
If random variables are related by linear equations, then endogenous variables have variances and covariances that are determinate functions of the variables on which they regress. For example, if X and Y are orthogonal and
$W = aX + bY$, then
$\sigma_W^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2$.
Hence, one way of guaranteeing that a diagram can account for variances and covariances among all its variables is to require: (1) representation of all variances and covariances among exogenous variables, (2) no variances or covariances to be directly represented in the diagram for endogenous variables, and (3) all variables in the diagram be involved in at least one relationship.
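The variance formula above can be checked numerically. The following sketch assumes that "orthogonal" means zero covariance between X and Y; the function name and the numeric values are chosen purely for illustration. It computes the variance of a linear combination from the covariance matrix of the variables it regresses on, first with zero covariance and then with a nonzero covariance to show why the exogenous covariances must be represented somewhere.

```python
def variance_of_linear_combination(weights, cov):
    """Variance of sum_i weights[i] * V_i, given the covariance matrix of the V_i."""
    n = len(weights)
    return sum(
        weights[i] * weights[j] * cov[i][j]
        for i in range(n)
        for j in range(n)
    )


# Illustrative values (not from the manual).
a, b = 2.0, 3.0
var_x, var_y = 1.5, 0.5

# Orthogonal case: the covariance between X and Y is zero.
cov_orthogonal = [[var_x, 0.0], [0.0, var_y]]
var_w = variance_of_linear_combination([a, b], cov_orthogonal)
expected = a**2 * var_x + b**2 * var_y  # the formula from the text

# Correlated case: an extra 2ab * cov(X, Y) term appears.
cov_correlated = [[var_x, 0.25], [0.25, var_y]]
var_w_correlated = variance_of_linear_combination([a, b], cov_correlated)
```

With these values, `var_w` agrees with `expected`, while `var_w_correlated` is larger by 2ab times the covariance. The variance of W is fully determined once the variances and covariances of X and Y are known, which is why requirement (1) is enough and requirement (2) avoids redundancy.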
There is a significant practical problem with many path diagrams: lack of space. In many cases, there are so many exogenous variables that there is simply not enough room to represent, adequately, the variances and covariances among them. Diagrams that try often end up looking like piles of spaghetti. For a beautiful example of a spaghetti diagram, see page 147 of James, Mulaik, and Brett (1982).
One way of compensating for this problem is to include rules for default variances and covariances which allow a considerable number of them to be represented implicitly in the diagram.
These considerations lead to the following rules:
1. Manifest variables are always represented in boxes (squares or rectangles), while latent variables are always in ovals or circles.
2. Directed relationships are always represented explicitly with arrows between two variables.
3. Undirected relationships need not be represented explicitly. (See rule 9 below regarding implicit representation of undirected relationships.)
4. Undirected relationships, when represented explicitly, are shown by a wire from a variable to itself, or from one variable to another.
5. Endogenous variables may never have wires connected to them.
6. Free parameter numbers for a wire or arrow are always represented with integers placed on or slightly above the middle of the wire or arrow line.
7. Fixed values for a wire or arrow are always represented with a floating point number containing a decimal point. The number is generally placed on or slightly above the middle of the wire or arrow line.
8. Different statistical populations are represented by a line of demarcation and the words Group 1 (for the first population or group), Group 2, etc., in each diagram section.
9. All exogenous variables must have their variances represented either explicitly or implicitly. If variances and covariances are not represented explicitly, then the following rules hold:
a. For latent variables, variances not explicitly represented in the diagram are assumed to be 1.0, and covariances not explicitly represented are assumed to be 0.
b. For manifest variables, variances and covariances not explicitly represented are assumed to be free parameters each having a different parameter number. These numbers are not equal to any number appearing explicitly in the diagram.
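The rule on wires and endogenous variables and the default rules (a) and (b) lend themselves to a small mechanical check. The sketch below is a hypothetical illustration and not the SEPATH implementation: the data layout (arrows as `(source, target, label)` tuples, wires as a dictionary keyed by variable pairs), the function names, and the choice to number new free parameters from one past the largest number already used are all assumptions. Integer labels stand for free parameter numbers and floating point labels for fixed values, following the labeling rules above. Covariances between a latent and a manifest exogenous variable are left untouched, because the rules above do not state a default for that case.

```python
from itertools import combinations


def endogenous_variables(arrows):
    """Assumption: a variable is endogenous if at least one arrow points to it."""
    return {target for _source, target, _label in arrows}


def wires_on_endogenous(arrows, wires):
    """Return the wires that touch an endogenous variable (not permitted)."""
    endo = endogenous_variables(arrows)
    return sorted(pair for pair in wires if set(pair) & endo)


def apply_default_wires(manifest, latent, arrows, wires):
    """Fill in implicit variances and covariances among exogenous variables."""
    endo = endogenous_variables(arrows)
    filled = {tuple(sorted(pair)): label for pair, label in wires.items()}

    # Integer labels already in the diagram are free parameter numbers.
    used = {lab for lab in filled.values() if isinstance(lab, int)}
    used |= {lab for _s, _t, lab in arrows if isinstance(lab, int)}
    next_free = max(used, default=0) + 1  # numbering scheme is an assumption

    exo_latent = sorted(v for v in latent if v not in endo)
    exo_manifest = sorted(v for v in manifest if v not in endo)

    # (a) Latent: implicit variances are 1.0, implicit covariances are 0.
    for v in exo_latent:
        filled.setdefault((v, v), 1.0)
    for pair in combinations(exo_latent, 2):
        filled.setdefault(pair, 0.0)

    # (b) Manifest: each implicit variance or covariance is a new free parameter.
    manifest_pairs = [(v, v) for v in exo_manifest]
    manifest_pairs += list(combinations(exo_manifest, 2))
    for pair in manifest_pairs:
        if pair not in filled:
            filled[pair] = next_free
            next_free += 1
    return filled


# Hypothetical diagram: X1 -> Y and X2 -> Y (free), E -> Y (fixed at 1.0),
# with an explicit free variance on the latent residual E.
manifest = {"X1", "X2", "Y"}
latent = {"E"}
arrows = [("X1", "Y", 1), ("X2", "Y", 2), ("E", "Y", 1.0)]
wires = {("E", "E"): 3}

violations = wires_on_endogenous(arrows, wires)
completed = apply_default_wires(manifest, latent, arrows, wires)
bad_wires = wires_on_endogenous(arrows, {("Y", "Y"): 4})
```

Here the explicit variance of E is kept, while the variances of X1 and X2 and their covariance receive new free parameter numbers that do not collide with 1, 2, or 3. A wire from Y to itself is reported as a violation because Y is endogenous.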
By adopting a consistent standard for path diagrams, we can facilitate clear communication of path models, regardless of what system is used to analyze them. Besides standing on their own as a coherent standard for path diagrams, the above rules for SEPATH path diagrams are designed to match the PATH1 language, and allow quick translation from the diagram to the language, and vice versa.
Within this electronic manual we will adhere to these simplifying conventions. However, the typical SEPATH user will attempt to use the program to reproduce results from published papers employing a wide variety of standards for their path diagrams. In some cases this will create no problems, and the user will be able to translate directly between the published path diagram and a PATH1 representation of the model. However, experience indicates that it is often useful to translate a published diagram into a SEPATH diagram, i.e., one which obeys rules 1-9 above, before coding the diagram in the PATH1 language. Frequently the translation process will draw attention to errors or ambiguities in the published diagram. (See Resolving Ambiguities in Path Diagrams for a discussion of this problem.)