Generalized Linear Models

Builds a generalized linear model to predict a continuous or categorical dependent variable. Best-subset and stepwise selection of continuous and categorical (ANOVA-like) predictor effects are also supported. The parameters in Statistica allow full access to the GLZ syntax for specifying models. Default results include the parameter estimates, overall fit indices, and results for best subset or stepwise model building; set the Level of detail parameter to All results to request additional results.

Element Name Description
General
Detail of computed results reported Specifies the level of computed results reported. If Minimal results is requested, only the final parameter estimates, fit indices, and stepwise or best subset summary results are reported. If All results is requested, various descriptive statistics, iteration history, and parameter correlations are also reported. Residual and predicted statistics (for observations) can be requested as options.
Analysis syntax Analysis syntax string for generalized linear models. You can specify the complete syntax here, for example by copying it from a Statistica analysis. Leave this string empty, or set it to just GLZ;, to have the syntax created from the options specified below.
Distribution Specifies a dependent variable distribution for the model.
Link function Specifies a link function for the model.
Power parameter For the Power link function, specify a value for the power parameter. Refer to Generalized Linear Models in the Electronic Manual for additional information.
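As a rough illustration of what the Link function and Power parameter settings control, the Python sketch below implements a few common link functions and their inverses. The function names are illustrative and are not Statistica identifiers. The power link is assumed to have the usual form g(mu) = mu^a; the Electronic Manual remains the reference for the exact definition Statistica uses.

```python
import math

def log_link(mu):
    """Log link: eta = ln(mu)."""
    return math.log(mu)

def logit_link(mu):
    """Logit link for a probability mu in (0, 1): eta = ln(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    """Inverse of the logit link: maps eta back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def power_link(mu, a):
    """Power link, assumed form eta = mu ** a (a is the power parameter)."""
    return mu ** a

def inverse_power_link(eta, a):
    """Inverse of the assumed power link: mu = eta ** (1 / a)."""
    return eta ** (1.0 / a)
```

With a = 0.5 the assumed power link is a square-root link, so power_link(4.0, 0.5) gives 2.0 and inverse_power_link(2.0, 0.5) recovers 4.0.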
Design Specifies the (ANCOVA-like) between-group design for the categorical and continuous predictors; the default is main effects only.

 Use the syntax:
 DESIGN = Design specifications

 Example 1.
 DESIGN = GROUP | GENDER | TIME | PAID; {makes a full factorial design}

 Example 2.
 DESIGN = SEQUENCE + PERSON(SEQUENCE) + TREATMNT + SEQUENCE*TREATMNT;

 Example 3.
 DESIGN = MULLET | SHEEPSHD | CROAKER @2; {Makes factorial design to degree 2}

 Example 4.
 DESIGN = TEMPERAT | MULLET | SHEEPSHD | CROAKER - TEMPERAT; {Removes main effect for TEMPERAT from factorial design}

 Example 5.
 DESIGN = BLOCK + DEGREES + DEGREES*DEGREES + TIME + TIME*TIME + TIME*DEGREES;
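
The expansion described by the comments in Examples 1, 3, and 4 can be sketched in Python as follows. This is only an illustration of the bar (|), @degree, and removal (-) notation as those comments describe it. The function name is hypothetical, and the order in which Statistica actually lists the effects may differ.

```python
from itertools import combinations

def expand_factorial(factors, max_degree=None):
    """List the effects of a factorial design over `factors`.

    max_degree mirrors the '@n' suffix: interactions above that
    degree are left out. None means the full factorial.
    """
    if max_degree is None:
        max_degree = len(factors)
    effects = []
    for degree in range(1, max_degree + 1):
        for combo in combinations(factors, degree):
            effects.append("*".join(combo))
    return effects

# Example 1: full factorial of four factors
full = expand_factorial(["GROUP", "GENDER", "TIME", "PAID"])
print(len(full))  # 15

# Example 3: factorial to degree 2
degree2 = expand_factorial(["MULLET", "SHEEPSHD", "CROAKER"], max_degree=2)
print(degree2)

# Example 4: full factorial with the TEMPERAT main effect removed
example4 = [
    effect
    for effect in expand_factorial(["TEMPERAT", "MULLET", "SHEEPSHD", "CROAKER"])
    if effect != "TEMPERAT"
]
print(len(example4))  # 14
```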
Intercept Specifies whether the intercept (constant) is to be included in the model (i.e., a parameter is to be estimated for the intercept); the default is INTERCEPT=INCLUDE.
Model building method Specifies a model building method.
Generates data source, if N for input less than Generates a data source for further analyses with other Data Miner nodes if the input data source has fewer than k observations, where k is the value entered in this field. Note that k is compared against the total number of observations in the input data source, not the number of valid or selected observations.
Estimation
Sweep delta 1.E- Specifies the negative exponent for a base-10 constant Delta (delta = 10^-sdelta); the default value is 7. Delta is used in sweeping, to detect redundant columns in the design matrix.
Convergence 1.E- Specifies the negative exponent for a base-10 constant Delta (delta = 10^-idelta); the default value is 12. Delta is used to check for convergence.
Maximum number of iterations Specifies the maximum number of iterations for estimating the parameters of the model.
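To show how a convergence exponent and an iteration limit interact, here is a minimal Python sketch of Fisher scoring for an intercept-only Poisson model with a log link. It is not Statistica's estimation routine. The function names are hypothetical, and the stopping rule (absolute change in the estimate smaller than delta) is an assumption chosen for illustration.

```python
import math

def tolerance_from_exponent(exponent):
    """Turn a '1.E-' field value into the constant delta = 10 ** -exponent."""
    return 10.0 ** (-exponent)

def fit_poisson_intercept(y, convergence_exponent=12, max_iterations=50):
    """Estimate the intercept b of a Poisson model with mu = exp(b).

    Returns (estimate, iterations_used, converged).
    """
    delta = tolerance_from_exponent(convergence_exponent)
    n = len(y)
    total = sum(y)
    b = 0.0  # starting value for the intercept
    for iteration in range(1, max_iterations + 1):
        mu = math.exp(b)
        # Score divided by Fisher information for this one-parameter model
        step = (total - n * mu) / (n * mu)
        b += step
        if abs(step) < delta:  # assumed convergence criterion
            return b, iteration, True
    return b, max_iterations, False
```

For counts such as [2, 3, 4, 3] the estimate approaches ln(3), the log of the sample mean. Setting max_iterations too low returns the last estimate with converged set to False.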
Stepwise Selection
p to enter Specifies p-to-enter for stepwise selection of predictors.
p to remove Specifies p-to-remove for stepwise selection of predictors.
Max No Steps Specifies maximum number of steps for stepwise selection of variables.
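The roles of p to enter and p to remove can be illustrated with the following Python sketch of a single stepwise decision. The function name, the default thresholds of 0.05, and the choice to check removal before entry are all assumptions made for illustration; they are not taken from Statistica.

```python
def stepwise_step(p_in_model, p_candidates, p_enter=0.05, p_remove=0.05):
    """Decide one stepwise action from the current p-values.

    p_in_model:   dict mapping effects already in the model to p-values
    p_candidates: dict mapping effects not yet in the model to p-values
    Returns ("remove", effect), ("enter", effect), or ("stop", None).
    """
    # Removal: drop the least significant effect if it exceeds p_remove
    if p_in_model:
        worst = max(p_in_model, key=p_in_model.get)
        if p_in_model[worst] > p_remove:
            return ("remove", worst)
    # Entry: add the most significant candidate if it is below p_enter
    if p_candidates:
        best = min(p_candidates, key=p_candidates.get)
        if p_candidates[best] < p_enter:
            return ("enter", best)
    return ("stop", None)
```

In a full procedure the model would be refitted after every action and the decision repeated until it returns "stop" or the maximum number of steps is reached.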
Best-Subset Selection
Best subsets measure Selects the measure for best-subset selection. The best-subset search can be based on one of three test statistics: select the Likelihood score option to use the score statistic, the Likelihood option to use the overall model likelihood, or the Akaike IC option to use the Akaike information criterion (AIC).

 Because the evaluation of the score statistic does not require iterative computations, best subset selection based on the score statistic is computationally faster, while the selection based on the other two statistics usually provides more accurate results.
Number of subsets to display Specifies the number of subsets to display in the results; Statistica keeps a log of the best k predictor models of any given size, using k as specified by this parameter.
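The idea of keeping the best k models of each size can be sketched in Python as below, using AIC = 2p - 2 ln L as the ranking measure. The function names are hypothetical, and counting one parameter per predictor plus an intercept is a simplifying assumption (categorical effects generally contribute more than one parameter).

```python
from collections import defaultdict

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2p - 2 ln L (smaller is better)."""
    return 2 * n_parameters - 2 * log_likelihood

def best_subsets(fitted_models, k):
    """Keep the k lowest-AIC models for each subset size.

    fitted_models: iterable of (predictors_tuple, log_likelihood)
    Returns a dict mapping subset size to a sorted list of (AIC, predictors).
    """
    by_size = defaultdict(list)
    for predictors, log_likelihood in fitted_models:
        # Assumption: one parameter per predictor, plus the intercept
        n_parameters = len(predictors) + 1
        by_size[len(predictors)].append(
            (aic(log_likelihood, n_parameters), predictors)
        )
    return {size: sorted(models)[:k] for size, models in by_size.items()}

# Made-up log-likelihoods, for illustration only
models = [
    (("A",), -50.0),
    (("B",), -48.0),
    (("C",), -55.0),
    (("A", "B"), -45.0),
]
print(best_subsets(models, k=2))
```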
Selected Results
Least square means Creates the expected marginal means, given the current model; either all marginal means tables can be computed, or only the means for the highest-order effect of the factorial design.
Residual analysis Creates predicted and residual values for all cases (observations).
Normal probability plot Creates a normal probability plot of residuals.
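A normal probability plot pairs each sorted residual with the standard normal quantile expected for its rank. The Python sketch below computes those pairs. The function name is hypothetical, and the plotting position (i - 0.5) / n is an assumption; Statistica may use a different formula.

```python
from statistics import NormalDist

def normal_plot_points(residuals):
    """Return (expected normal quantile, residual) pairs, sorted by residual."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        # Assumed plotting position: (i - 0.5) / n for rank i = 1..n
        (std_normal.inv_cdf((i - 0.5) / n), residual)
        for i, residual in enumerate(ordered, start=1)
    ]
```

If the residuals are approximately normal, the resulting points fall close to a straight line.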
Deployment Deployment is available if the Statistica installation is licensed for this feature.
Generates C/C++ code Generates C/C++ code for deployment of predictive model.
Generates SVB code Generates Statistica Visual Basic code for deployment of predictive model.
Generates PMML code Generates PMML (Predictive Models Markup Language) code for deployment of predictive model. This code can be used via the Rapid Deployment options to efficiently score (compute predictions for) large data sets.
Saves C/C++ code Saves C/C++ code for deployment of predictive model.
File name for C/C++ code Specifies the name and location of the file in which to save the (C/C++) deployment code information.
Saves SVB code Saves Statistica Visual Basic code for deployment of predictive model.
File name for SVB code Specifies the name and location of the file in which to save the (SVB/VB) deployment code information.
Saves PMML code Saves PMML (Predictive Models Markup Language) code for deployment of predictive model. This code can be used via the Rapid Deployment options to efficiently score (compute predictions for) large data sets.
File name for PMML (XML) code Specifies the name and location of the file in which to save the (PMML/XML) deployment code information.