Workspace Node: GLZ Custom Design - Results - Summary Tab

In the GLZ Custom Design node dialog box, under the Results heading, select the Summary tab to access the following options.

Element Name Description
Summary of all effects Select this check box to produce a spreadsheet with the Wald statistic and respective p-values for all effects in the model. Significant effects in this spreadsheet will be highlighted. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures specified on the Advanced tab.
Note: models that are not full-rank (e.g., overparameterized models). When redundant columns are detected during the evaluation of the design matrix, some difficulties arise when computing the Wald statistic for the overall model, and when attempting to compute Type 3 LR tests of effects (see below). Specifically, because some of the parameters in the model are redundant, independent tests of effects, controlling for all other parameters in the model (not belonging to the effect under consideration), cannot be computed. Therefore, the Summary of all effects and Type 3 LR test options will not be available in that case.
Type 1 LR test Select this check box to produce a spreadsheet with the results for the Type 1 sequential tests for the effects in the model. Refer to Six types of sums of squares for details on how effects can be tested in unbalanced factorial ANOVA designs (in particular, see Type I sums of squares). In short, the spreadsheet will report the log-likelihoods for the model that includes a particular effect (shown in the respective row of the spreadsheet) and all effects that precede it (shown in the previous rows of the spreadsheet); the incremental Chi-square statistic then provides a test of the increment in the log-likelihood, attributable to the respective (current) effect. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures on the Advanced tab.
Type 3 LR test Select this check box to produce a spreadsheet with the results for the Type 3 tests for the effects in the model. Refer to Six types of sums of squares for details on how effects can be tested in unbalanced factorial ANOVA designs (in particular, see Type III sums of squares). In short, the spreadsheet will report the log-likelihoods for the models that include all effects except for the current effect (shown in the respective row of the spreadsheet); the incremental Chi-square statistic for that model, and the full model (that includes all effects) then provides a test of the increment in the log-likelihood, attributable to the respective (current) effect, while controlling for all other effects. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures on the Advanced tab. See the Models that are not full-rank (e.g., overparameterized models) note above for additional information on this option.
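The arithmetic behind the Type 1 and Type 3 LR test spreadsheets can be sketched as follows. This is only an illustration of the incremental Chi-square computation described above, not Statistica's implementation; the effect names and log-likelihood values are made up, and the p-value helper is an assumption that covers only effects with 1 degree of freedom.

```python
import math

def lr_chi2(ll_larger_model, ll_smaller_model):
    """Incremental LR Chi-square between two nested models."""
    return 2.0 * (ll_larger_model - ll_smaller_model)

def p_value_df1(chi2):
    """Upper-tail Chi-square probability, valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical log-likelihoods (illustrative numbers, not real output).
# Type 1: each entry is the model with that effect and all preceding effects.
ll_sequential = [("Intercept", -120.0), ("A", -112.5), ("B", -110.0)]
# Type 3: the full model, and the models that omit one effect at a time.
ll_full = -110.0
ll_without = {"A": -118.0, "B": -112.5}

type1 = {}
for (_, ll_prev), (name, ll_curr) in zip(ll_sequential, ll_sequential[1:]):
    type1[name] = lr_chi2(ll_curr, ll_prev)

type3 = {name: lr_chi2(ll_full, ll) for name, ll in ll_without.items()}

for name in type1:
    print(name, type1[name], type3[name], round(p_value_df1(type3[name]), 4))
```

In this made-up example the last effect (B) has the same Type 1 and Type 3 statistic, because in both cases it is tested against the model containing all other effects.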
Cell statistics Select this check box to produce a spreadsheet of the descriptive statistics for each cell in the design; specifically, descriptive statistics are computed for the dependent (response) variable, as well as any continuous predictors (covariates) in the design, for each column of the overparameterized design matrix for categorical effects. Thus, marginal means and standard deviations are available for each categorical effect in the design. Note that for lower-order effects (e.g., main effects in designs that also contain interactions involving the main effects), the reported means are weighted marginal means, and as such are estimates of the weighted population marginal means (for details, see, for example, Milliken and Johnson, 1984, page 132).
Design term Select this check box to produce a spreadsheet of all the labels for each column in the design matrix.
V-C matrix Select this check box to produce an (asymptotic) variance-covariance matrix for the parameter estimates. Specifically, the values shown in this spreadsheet are the expected parameter variances and covariances computed via the Fisher Scoring method. See the Reading the spreadsheet note (above) for additional information on this option.
Corr. matrix Select this check box to produce an (asymptotic) correlation matrix for the parameter estimates. Specifically, the values shown in this spreadsheet are the expected parameter correlations computed via the Fisher Scoring method.
Note: reading the spreadsheet. Note that each row of the spreadsheet corresponds to a column in the design matrix. When the design includes categorical predictor variables, the parameter estimates pertain to the coded effects in the design matrix. The GLM topic The Sigma-Restricted and Overparameterized Model discusses in detail how this coding is accomplished (and how, consequently, the parameter estimates can be interpreted). You can also refer to the Design terms and Coefficient options in Summary results for between effects in GLM for details on the labeling of the columns of the design matrix in results spreadsheets.
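The relation between the V-C matrix and Corr. matrix spreadsheets can be illustrated with a short sketch: each correlation is the covariance divided by the product of the two parameter standard errors. The matrix values below are hypothetical, and the function name is chosen for this illustration only.

```python
import math

def cov_to_corr(cov):
    """Convert a variance-covariance matrix (list of lists) to a correlation matrix."""
    size = len(cov)
    std_err = [math.sqrt(cov[i][i]) for i in range(size)]
    return [[cov[i][j] / (std_err[i] * std_err[j]) for j in range(size)]
            for i in range(size)]

# Hypothetical asymptotic variance-covariance matrix for two parameters.
vc_matrix = [[0.04, 0.01],
             [0.01, 0.25]]

corr_matrix = cov_to_corr(vc_matrix)
print(corr_matrix)
```

The square roots of the diagonal elements of the V-C matrix are the standard errors of the parameter estimates.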
Model building Select this check box to produce a spreadsheet with the summary for the model building procedure. For details about the available model building techniques, refer to the Introductory Overview. This option is not available if All effects was specified on the Advanced tab.
Note: results for stepwise or best-subset regression. Unlike in the stepwise or best-subset results in General Regression Models (GRM), the results that can be reviewed from the GLZ always pertain to the full model, regardless of which effects were selected for inclusion during the model building procedure. The reason for this is that, unlike in GRM, the relationships between predictors and their interactive effects (e.g., two predictors masking the effects of a third) are often much more complex. Also, unlike in GRM, because of the manner in which the p1, enter and p2, remove probabilities are determined [in forward stepwise selection, the score statistic is used to select new (significant) effects, while the Wald statistic is used during backward steps], the Forward stepwise and Backward stepwise methods may result in the repetitive selection and removal of one or more predictors. Therefore, the stepwise results can be reviewed separately via these options.
Note: stepwise methods. When either forward stepwise, backward stepwise, forward entry, or backward removal are selected on the Advanced tab, the spreadsheet will show for each step which effects were in the model at that step, which ones were not in the model, and which one was selected for entry or removal. For each effect in the model, the spreadsheet will show the Wald statistic and respective p-value; for each effect not in the model, the spreadsheet will show the score statistic and respective p-value.
Note: selection of variables for inclusion in or removal from the model. Forward selection will cause variables to be moved into the model; backward selection will start with a model that contains all predictor variables and effects, which are then removed. The Forward entry and Backward removal options will only allow for variables or effects to be entered or to be removed, respectively, depending on the chosen method (forward or backward). The Forward stepwise and Backward stepwise options will at each step cause Statistica to consider simultaneously the addition or removal of a variable or effect, based on the current specifications of p1, enter or p2, remove. See the p1, enter, p2, remove, and Max steps option descriptions in the Advanced tab topic for additional details.

For example, if Forward stepwise is selected, Statistica will at each step consider both a step "forward", i.e., entry of another variable or effect into the model (based on p1, enter), and a step "backward", i.e., removal of a previously entered variable or effect from the model (based on p2, remove). The reason the Forward stepwise method usually adds rather than removes variables or effects (i.e., the reason why it is a forward selection method) is the required setting of the p1, enter and p2, remove values, which have to be specified so that p1, enter is smaller than p2, remove, thus guaranteeing that significant predictor variables or effects are entered into the model, and not removed. Most of the widely used algorithms for stepwise selection use the Forward stepwise and Backward stepwise methods.
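The decision made at a single stepwise step can be sketched as follows. This is a simplified illustration under stated assumptions, not Statistica's algorithm: the function name, the order in which removal and entry are checked, and all p-values are hypothetical. Effects in the model carry Wald p-values, and effects not in the model carry score p-values, as described in the notes above.

```python
def stepwise_step(wald_p_in_model, score_p_not_in_model, p1_enter, p2_remove):
    """Return ("remove", effect), ("enter", effect), or ("stop", None)."""
    if p1_enter >= p2_remove:
        raise ValueError("p1, enter must be smaller than p2, remove")
    # Backward part: drop the least significant effect if it exceeds p2, remove.
    if wald_p_in_model:
        worst = max(wald_p_in_model, key=wald_p_in_model.get)
        if wald_p_in_model[worst] > p2_remove:
            return ("remove", worst)
    # Forward part: enter the most significant candidate if it is below p1, enter.
    if score_p_not_in_model:
        best = min(score_p_not_in_model, key=score_p_not_in_model.get)
        if score_p_not_in_model[best] < p1_enter:
            return ("enter", best)
    return ("stop", None)

# Hypothetical p-values for one step.
step_a = stepwise_step({"A": 0.01}, {"B": 0.03, "C": 0.40}, 0.05, 0.10)
step_b = stepwise_step({"A": 0.20, "B": 0.02}, {"C": 0.40}, 0.05, 0.10)
step_c = stepwise_step({"A": 0.01, "B": 0.02}, {"C": 0.40}, 0.05, 0.10)
print(step_a, step_b, step_c)
```

Because p1, enter is smaller than p2, remove, an effect that has just been entered is normally not removed again at once. Its score p-value (used for entry) and its Wald p-value (used for removal) are different statistics, however, which is why repeated entry and removal can still occur.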

Note: best subset regression. When the current analysis used the best subset method for selecting effects for the model, the spreadsheet will show the best-fitting subsets that were found, based on the criterion chosen on the Advanced tab (Likelihood score, Likelihood, or Akaike information criterion (AIC)); the respective statistics are also reported in the spreadsheet. Note that the number of best subsets shown in the spreadsheet (i.e., retained from the search computations) can be set with the Max. subset option on the Advanced tab.
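How a list of best subsets might be retained can be sketched as follows. The subsets, their AIC values, and the variable names are hypothetical; the sketch assumes the AIC criterion, for which smaller values indicate a better trade-off between fit and number of parameters.

```python
# Hypothetical candidate subsets with made-up AIC values.
candidates = [
    (("A",), 234.0),
    (("A", "B"), 226.0),
    (("A", "B", "C"), 227.6),
    (("B", "C"), 240.1),
]

max_subsets = 2  # plays the role of the Max. subset option

# Rank by AIC (ascending) and keep only the requested number of subsets.
retained = sorted(candidates, key=lambda item: item[1])[:max_subsets]

for effects, aic_value in retained:
    print(effects, aic_value)
```

For a likelihood-based criterion, where larger values indicate better fit, the sort order would be reversed.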
Estimates Select this check box to produce a spreadsheet with the parameter estimates, their standard errors, and statistical significance. See the Reading the spreadsheet note (above) for information on the proper way to read this spreadsheet. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures on the Advanced tab.
Note: reference level for categorical dependent (response) variable. The last category (level) that is specified for a categorical dependent (response) variable will be the reference category for the comparisons with the other categories. So, for example, if a multinomial dependent (response) variable with k = 3 levels is analyzed, the k-1 = 2 parameters for each predictor (effect column) pertain to the comparison of 1) the first level with the last level, and 2) the second level with the last level of the dependent (response) variable.
Conf. intervals. Select this check box to produce a spreadsheet with the confidence intervals for the parameter estimates (see also Estimates above); the confidence level (p value) that is to be used for the interval can be specified in the Conf. limit field (see below). See the Reading the spreadsheet note (above) for information on the proper way to read this spreadsheet. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures on the Advanced tab.
Iter. results. Select this check box to produce a spreadsheet that shows the parameter estimates and the model log-likelihood at each iteration. Specifically, each column of the spreadsheet represents one iteration, and the rows show the respective parameter estimates and model log-likelihood at that iteration. See the Reading the spreadsheet note (above) for information on the proper way to read this spreadsheet. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures on the Advanced tab.
Sign. lev. Type in the value to be used for all spreadsheets and graphs where statistically significant results are to be highlighted (e.g., in the Summary of all effects spreadsheet); by default all results significant at the p < .05 level will be highlighted.
Conf. limit. Type in the value to be used for constructing confidence limits in the respective results spreadsheets or graphs (e.g., in the Confidence intervals of estimates spreadsheet); by default 95% confidence limits will be constructed.
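As an illustration of how the Conf. limit value enters the computation, the sketch below builds a normal-theory (Wald-type) interval from a parameter estimate and its standard error. That Statistica constructs its intervals exactly this way is an assumption; the function name and the numbers are hypothetical.

```python
from statistics import NormalDist

def wald_interval(estimate, std_error, conf_level=0.95):
    """Two-sided normal-theory confidence interval for one parameter."""
    z = NormalDist().inv_cdf(0.5 + conf_level / 2.0)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical parameter estimate and standard error.
lower, upper = wald_interval(0.80, 0.25, conf_level=0.95)
print(round(lower, 3), round(upper, 3))
```

A larger confidence level gives a wider interval for the same estimate and standard error.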
Sample
Analysis Select this option button to produce spreadsheets for all observations that were used to compute the current results (the Analysis sample).
Cross-validation Select this option button to produce spreadsheets for all observations that were not used to compute the current results, but have valid data for all predictor and dependent variables (the Cross-validation sample).
Both Select this option button to display spreadsheets for all observations in both the Analysis sample and the Cross-validation sample.
Note: if the above three option buttons are dimmed, no cross-validation sample was specified on the Advanced tab.
Goodness of fit Select this check box to produce a spreadsheet showing the Pearson Chi-square statistic, deviance statistic, scaled Pearson Chi-square statistic, scaled deviance statistic, log-likelihood value, AIC, and BIC for the current (overall) model (see also the Introductory Overview for details). All of these statistics, except for the log-likelihood, AIC, and BIC, are asymptotically Chi-square distributed, so large values of the respective statistics (relative to the degrees of freedom; the ratios of the respective statistics over the degrees of freedom are displayed in the last column of the spreadsheet) imply that the model does not fit the data well. For models where the distribution is binomial, the Cox-Snell R², Nagelkerke R², and Hosmer-Lemeshow test are also computed. Note that these results always pertain to the overall model with all effects, regardless of which effects were selected by any Model building procedures on the Advanced tab.
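Several of the statistics listed above follow directly from the model log-likelihood. The sketch below uses the common textbook definitions of AIC, BIC, Cox-Snell R², and Nagelkerke R²; whether Statistica uses exactly these conventions is an assumption, and all input values are hypothetical.

```python
import math

def fit_indices(ll_model, ll_null, n_params, n_obs, deviance, df):
    """Textbook fit indices from log-likelihoods (sketch, not Statistica code)."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n_obs)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n_obs)
    return {
        "aic": -2.0 * ll_model + 2.0 * n_params,
        "bic": -2.0 * ll_model + n_params * math.log(n_obs),
        "cox_snell_r2": cox_snell,
        "nagelkerke_r2": cox_snell / max_cox_snell,
        # Values much larger than 1 suggest lack of fit or overdispersion.
        "deviance_per_df": deviance / df,
    }

# Hypothetical binomial model: 3 parameters, 200 observations.
indices = fit_indices(ll_model=-110.0, ll_null=-130.0, n_params=3,
                      n_obs=200, deviance=220.0, df=197)
for name, value in indices.items():
    print(name, round(value, 4))
```

Nagelkerke R² rescales Cox-Snell R² by its maximum attainable value, so it is never smaller than Cox-Snell R² for the same model.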

Global null hypothesis tests are used to test whether the parameter estimates are significantly different from zero. Under the null hypothesis, each of the following statistics is assumed to have an asymptotic Chi-square distribution with p degrees of freedom.

Likelihood ratio test = 2*[log-likelihood for estimated model – log-likelihood for null model]

Score test = transposed gradient for null model * variance-covariance matrix for null model * gradient for null model

Wald test = transposed parameter estimates vector for full model * Hessian matrix for full model * parameter estimates vector for full model
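The three global statistics above share a simple structure: the likelihood ratio test is a difference of log-likelihoods, and the score and Wald tests are quadratic forms. A minimal sketch with hypothetical vectors and matrices follows; it assumes the Hessian is supplied in the positive definite (information-type) form, so that the Wald statistic is non-negative.

```python
def quadratic_form(vector, matrix):
    """Compute transposed vector * matrix * vector for plain Python lists."""
    size = len(vector)
    return sum(vector[i] * matrix[i][j] * vector[j]
               for i in range(size) for j in range(size))

# Hypothetical inputs (illustrative numbers only).
ll_estimated, ll_null = -110.0, -130.0
gradient_null = [2.0, -1.0]
vc_null = [[0.5, 0.1],
           [0.1, 0.4]]
estimates_full = [0.8, -0.3]
hessian_full = [[10.0, 2.0],
                [2.0, 20.0]]

lr_test = 2.0 * (ll_estimated - ll_null)
score_test = quadratic_form(gradient_null, vc_null)
wald_test = quadratic_form(estimates_full, hessian_full)

print(lr_test, round(score_test, 4), round(wald_test, 4))
```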

HL Groups Specify the number of groups, g, used in the computation of the Hosmer-Lemeshow goodness of fit test. Statistica will sort the predicted probabilities and try to create g groups of equal size.
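The grouping step described for HL Groups can be sketched as follows: sort the cases by predicted probability, split them into g groups of (nearly) equal size, and compare observed and expected event counts per group. This uses the standard form of the Hosmer-Lemeshow statistic with g - 2 degrees of freedom; how Statistica handles ties and uneven group sizes is not covered here, and the data are hypothetical.

```python
def hosmer_lemeshow(observed, predicted, groups):
    """Return (statistic, degrees of freedom) for binary outcomes in {0, 1}."""
    pairs = sorted(zip(predicted, observed))  # sort cases by predicted probability
    n = len(pairs)
    statistic = 0.0
    for k in range(groups):
        group = pairs[k * n // groups:(k + 1) * n // groups]
        if not group:
            continue
        size = len(group)
        events_observed = sum(y for _, y in group)
        events_expected = sum(p for p, _ in group)
        p_mean = events_expected / size
        statistic += (events_observed - events_expected) ** 2 / (
            size * p_mean * (1.0 - p_mean))
    return statistic, groups - 2

# Hypothetical predicted probabilities and observed responses.
p_hat = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y_obs = [0, 0, 0, 0, 1, 1, 1, 1]

hl_stat, hl_df = hosmer_lemeshow(y_obs, p_hat, groups=4)
print(round(hl_stat, 4), hl_df)
```

A group whose mean predicted probability is exactly 0 or 1 would cause a division by zero in this sketch, so real data may need additional guarding.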
Aggregation Select this check box to compute the predicted values (and related statistics, e.g., residuals) in terms of predicted frequencies. In models with categorical response variables, predicted values (and related statistics, e.g., residuals) can be computed in terms of the raw data or for aggregated frequency counts. For example, in the Binomial case, and for raw data, you can think of the response variable as having two possible values: 0 (zero) or 1. Accordingly, predicted values should be computed that fall in the range from 0 (zero) to 1 (e.g., classification probabilities). If the Aggregation check box is selected, Statistica will consider the aggregated (tabulated) data set. In that case, you can think of the response variable as a frequency count, reflecting the number of observations that fall into the respective categories. This is easiest imagined in the case where the predictors are also categorical in nature: The resulting aggregated data file would simply be a multi-way frequency table.
Aggreg. data Select this check box to review the aggregated data in a spreadsheet. In models with categorical response variables, predicted values (and related statistics, e.g., residuals) can be computed in terms of the raw data or for aggregated frequency counts. For example, in the Binomial case, and for raw data, you can think of the response variable as having two possible values: 0 (zero) or 1. Accordingly, predicted values should be computed that fall in the range from 0 (zero) to 1 (e.g., classification probabilities).

For example, suppose your data contain a binary dependent (response) variable; the raw (non-aggregated) data and their aggregated representation are shown in the example at the end of this topic.

Raw data Select this check box to produce a spreadsheet with the design matrix, values of the dependent (response) variable, case weights, values of the count variable (if one was selected, and the current distribution is Binomial, Multinomial, or Ordinal multinomial; see Specification dialogs and syntax and the Introductory Overview), and the values of the offset variable (if one was selected).
Overdispersion In models with categorical response variables, the chosen distribution will be Poisson, Binomial, Multinomial, or Ordinal multinomial. In that case, the default dispersion parameter (1.0) for the generalized linear/nonlinear model (i.e., for the exponential family of distributions) may not be adequate. You can select the Overdispersion check box and then select either the Pearson Chi2 or Deviance option button as the estimate of the dispersion parameter.
Pearson Chi2 If you specify Pearson Chi2, the dispersion parameter is estimated by Pearson's Chi-square statistic divided by its degrees of freedom. The adjustment is reflected in the scale parameter, which is proportional to the dispersion parameter.
Deviance If you specify Deviance, the dispersion parameter is estimated by the deviance divided by its degrees of freedom.

Changing the overdispersion parameter will affect the computation (values) of the parameter variances and covariances and the model likelihood, and all related statistics (e.g., standard errors, prediction errors, etc.). For details, refer to McCullagh and Nelder, 1989.
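The Pearson Chi2 estimate of the dispersion parameter can be illustrated for a Poisson response, where the variance function equals the mean. The observed counts, fitted means, and number of parameters below are hypothetical (they do not come from an actual fit), and the standard error scaling shown is the usual quasi-likelihood adjustment, assumed here for illustration.

```python
import math

def pearson_dispersion(observed, fitted, n_params):
    """Pearson Chi-square divided by its degrees of freedom (Poisson variance)."""
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    df = len(observed) - n_params
    return chi_square / df

# Hypothetical counts and fitted means from a model with 2 parameters.
counts = [2, 0, 5, 9, 1, 7]
fitted_means = [1.0, 1.0, 4.0, 6.0, 2.0, 4.0]

dispersion = pearson_dispersion(counts, fitted_means, n_params=2)

# Standard errors are inflated by the square root of the dispersion estimate.
std_error_unadjusted = 0.20
std_error_adjusted = std_error_unadjusted * math.sqrt(dispersion)

print(dispersion, round(std_error_adjusted, 4))
```

A dispersion estimate clearly above 1.0, as in this made-up example, indicates more variability than the Poisson distribution implies. The Deviance option works the same way, with the deviance in place of the Pearson Chi-square statistic.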

Options / C / W. See Common Options.

OK Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project.
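Note: example of raw and aggregated data. To illustrate the Aggregation and Aggreg. data options described above, suppose the raw (non-aggregated) data contain two predictors, x1 and x2, and a binary dependent (response) variable y:

x1   x2  y
1.5  2   1
2.5  1   0
2.5  1   1
1.5  2   0
2.5  1   1
1.5  2   1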

After aggregation, these data can be represented as follows:

x1   x2  y1  y0  total
1.5  2   2   1   3
2.5  1   2   1   3

where the values in the column labeled y1 are counts of the number of observations where y = 1, and y0 are counts of the number of observations where y = 0.

When you select the Aggregation check box, Statistica will convert the raw data into the aggregated representation, and by selecting the Aggreg. data check box, you can review the aggregated data in a spreadsheet. Remember that selecting the Aggregation check box will also affect the computation (and display) of predicted and residual values; see the description of the Aggregation check box (above) for details.
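The conversion from the raw to the aggregated representation shown above can be reproduced with a few lines of code. This is a minimal sketch of the tabulation only, not of how Statistica performs it; the function and variable names are chosen for this illustration.

```python
# Raw data from the example above: (x1, x2, y) for each observation.
raw_data = [
    (1.5, 2, 1),
    (2.5, 1, 0),
    (2.5, 1, 1),
    (1.5, 2, 0),
    (2.5, 1, 1),
    (1.5, 2, 1),
]

def aggregate(rows):
    """Tabulate y = 1 and y = 0 counts for each distinct (x1, x2) pattern."""
    counts = {}
    for x1, x2, y in rows:
        cell = counts.setdefault((x1, x2), [0, 0])  # [count of y=1, count of y=0]
        cell[0 if y == 1 else 1] += 1
    return [(x1, x2, y1, y0, y1 + y0) for (x1, x2), (y1, y0) in counts.items()]

aggregated = aggregate(raw_data)
for row in aggregated:
    print(row)
```

Each output row holds x1, x2, y1, y0, and total, matching the aggregated table above.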