Generalized Linear Models

Builds a generalized linear model to predict a continuous or categorical dependent variable. Best-subset and stepwise selection of continuous and categorical (ANOVA-like) predictor effects are also supported. The parameters in Statistica allow full access to the GLZ syntax for specifying models. Default results include the parameter estimates, overall fit indices, and results for best subset or stepwise model building; set the Level of detail parameter to All results to request additional results.

General

Element Name	Description
Detail of computed results reported	Specifies the level of computed results reported. If Minimal results is requested, only the final parameter estimates, fit indices, and stepwise or best subset summary results are reported. If All results is requested, various descriptive statistics, iteration history, and parameter correlations are also reported. Residual and predicted statistics (for observations) can be requested as options.
Analysis syntax	Analysis syntax string for generalized linear models. You can specify the complete syntax here, for example, as copied from a Statistica analysis. Leave this string empty, or set it to just GLZ;, to build the syntax from the options specified below.
Distribution	Specifies a dependent variable distribution for the model.
Link function	Specifies a link function for the model.
Power parameter	For the Power link function, specify a value for the power parameter. Refer to Generalized Linear Models in the Electronic Manual for additional information.
Design	Specifies the (ANCOVA-like) between-group design (categorical and continuous predictors); the default is main effects only.   Use the syntax:  DESIGN = Design specifications   Example 1.  DESIGN = GROUP | GENDER | TIME | PAID; {Makes a full factorial design}   Example 2.  DESIGN = SEQUENCE + PERSON(SEQUENCE) + TREATMNT + SEQUENCE*TREATMNT;   Example 3.  DESIGN = MULLET | SHEEPSHD | CROAKER @2; {Makes factorial design to degree 2}   Example 4.  DESIGN = TEMPERAT | MULLET | SHEEPSHD | CROAKER - TEMPERAT; {Removes main effect for TEMPERAT from factorial design}   Example 5.  DESIGN = BLOCK + DEGREES + DEGREES*DEGREES + TIME + TIME*TIME + TIME*DEGREES;
Intercept	Specifies whether the intercept (constant) is to be included in the model (i.e., a parameter is to be estimated for the intercept); the default is INTERCEPT=INCLUDE.
Model building method	Specifies a model building method.
Generates data source, if N for input less than	Generates a data source for further analyses with other Data Miner nodes if the input data source has fewer than k observations, as specified in this edit field; note that parameter k (number of observations) will be evaluated against the number of observations in the input data source, not the number of valid or selected observations.
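
As a rough illustration of how the bar operator in the Design examples expands, the following Python sketch lists the effects implied by a specification such as MULLET | SHEEPSHD | CROAKER @2. The function name expand_factorial is hypothetical and this is not Statistica's parser; the sketch only assumes that the bar operator requests all main effects and interactions among the listed predictors, optionally limited to a maximum degree.

```python
from itertools import combinations

def expand_factorial(predictors, max_degree=None):
    """List the effects of a factorial design, up to max_degree (hypothetical helper)."""
    if max_degree is None:
        max_degree = len(predictors)  # full factorial design
    effects = []
    for degree in range(1, max_degree + 1):
        # Every combination of `degree` predictors forms one effect.
        for combo in combinations(predictors, degree):
            effects.append("*".join(combo))
    return effects

# Corresponds to Example 3: factorial design to degree 2.
effects = expand_factorial(["MULLET", "SHEEPSHD", "CROAKER"], max_degree=2)
print(" + ".join(effects))
```

With max_degree=2 the three-way interaction is left out, which is the effect of the @2 suffix in Example 3.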

Estimation

Element Name	Description
Sweep delta 1.E-	Specifies the negative exponent for a base-10 constant Delta (delta = 10^-sdelta); the default value is 7. Delta is used in sweeping, to detect redundant columns in the design matrix.
Convergence 1.E-	Specifies the negative exponent for a base-10 constant Delta (delta = 10^-idelta); the default value is 12. Delta is used to check for convergence.
Maximum number of iterations	Specifies the maximum number of iterations for estimating the parameters of the model.
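
To make the roles of the convergence exponent and the iteration limit concrete, the Python sketch below fits an intercept-only Poisson model with a log link by Newton-Raphson. It is only an illustration under stated assumptions: the function names are hypothetical, the data are made up, and the stopping rule (absolute change in the estimate smaller than delta) is an assumption, not a description of the rule Statistica applies.

```python
import math

def tolerance(negative_exponent):
    """Convert a negative base-10 exponent into delta = 10^-exponent."""
    return 10.0 ** (-negative_exponent)

def fit_poisson_intercept(counts, convergence_exponent=12, max_iterations=50):
    """Newton-Raphson for an intercept-only Poisson model with a log link."""
    delta = tolerance(convergence_exponent)
    n = len(counts)
    total = sum(counts)
    b = 0.0  # starting value for the intercept on the log scale
    for iteration in range(1, max_iterations + 1):
        mu = math.exp(b)
        step = (total - n * mu) / (n * mu)  # score divided by information
        b += step
        if abs(step) < delta:  # assumed convergence rule
            return b, iteration, True
    return b, max_iterations, False  # iteration limit reached first

estimate, iterations, converged = fit_poisson_intercept([2, 3, 5, 4, 6])
print(estimate, iterations, converged)
```

A smaller exponent gives a looser tolerance and fewer iterations; if the iteration limit is too low, the loop stops before the tolerance is met.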

Stepwise Selection

Element Name	Description
p to enter	Specifies p-to-enter for stepwise selection of predictors.
p to remove	Specifies p-to-remove for stepwise selection of predictors.
Max No Steps	Specifies maximum number of steps for stepwise selection of variables.
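
The interplay of p to enter, p to remove, and the maximum number of steps can be sketched in Python as follows. This is a generic forward stepwise loop, not Statistica's implementation: the p_value callback, the predictor names, the toy p-values, and the thresholds 0.05 and 0.10 are all hypothetical, and counting one entry attempt per step is an assumption.

```python
def stepwise_select(candidates, p_value, p_enter=0.05, p_remove=0.10, max_steps=20):
    """Forward stepwise selection with a removal check after each entry.

    p_value(effect, model) is a hypothetical callback that returns the
    p-value of `effect` given the other effects listed in `model`.
    """
    model = []
    for _ in range(max_steps):
        remaining = [c for c in candidates if c not in model]
        if not remaining:
            break
        # Entry: pick the candidate with the smallest p-value.
        best = min(remaining, key=lambda c: p_value(c, model))
        if p_value(best, model) >= p_enter:
            break  # no candidate qualifies for entry
        model.append(best)
        # Removal: drop effects whose p-value now exceeds p-to-remove.
        for effect in list(model):
            others = [m for m in model if m != effect]
            if p_value(effect, others) > p_remove:
                model.remove(effect)
    return model

# Toy p-values that ignore the current model (illustration only).
toy_p = {"AGE": 0.001, "GENDER": 0.03, "INCOME": 0.40}
selected = stepwise_select(list(toy_p), lambda effect, model: toy_p[effect])
print(selected)
```

In a real analysis the p-values would be recomputed from a refitted model at every step, which is why an effect that entered early can later become a candidate for removal.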

Best-Subset Selection

Element Name	Description
Best subsets measure	Select a measure for best subset selection; the best subset search method can be based on three different test statistics: Select the Likelihood score option to use the score statistic; select the Likelihood option to use the overall model likelihood; select the Akaike IC option to use the Akaike information criterion (AIC).   Because the evaluation of the score statistic does not require iterative computations, best subset selection based on the score statistic is computationally faster, while the selection based on the other two statistics usually provides more accurate results.
Number of subsets to display	Specifies the number of subsets to display in the results; Statistica keeps a log of the best k predictor models of any given size, using k as specified by this parameter.
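
The idea of keeping the best k models of each size can be illustrated with the Python sketch below, which ranks subsets by the Akaike information criterion, AIC = 2k - 2 ln L. It is a brute-force illustration under stated assumptions: the log_likelihood callback, the predictor names, and the toy log-likelihood values are hypothetical, and counting the intercept as one extra parameter is an assumption.

```python
from itertools import combinations
import heapq

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_parameters - 2 * log_likelihood

def best_subsets(predictors, log_likelihood, k=3):
    """Keep the k lowest-AIC subsets for every subset size.

    log_likelihood(subset) is a hypothetical callback returning the
    maximized log-likelihood of the model containing exactly `subset`.
    """
    log = {}
    for size in range(1, len(predictors) + 1):
        scored = [
            (aic(log_likelihood(subset), size + 1), subset)  # +1 for the intercept
            for subset in combinations(predictors, size)
        ]
        log[size] = heapq.nsmallest(k, scored)
    return log

# Toy log-likelihood: each predictor adds a fixed amount (illustration only).
def toy_ll(subset):
    return -100 + 10 * ("A" in subset) + 4 * ("B" in subset) + 0.5 * ("C" in subset)

results = best_subsets(["A", "B", "C"], toy_ll, k=2)
for size, models in results.items():
    print(size, models)
```

Because every subset is refitted, this exhaustive search grows quickly with the number of predictors, which is one reason a measure that avoids iterative fitting, such as the score statistic, is faster.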

Selected Results

Element Name	Description
Least square means	Creates the expected marginal means, given the current model; either all marginal means tables can be computed, or only the means for the highest-order effect of the factorial design.
Residual analysis	Creates predicted and residual values for all cases (observations).
Normal probability plot	Creates a normal probability plot of residuals.
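
For readers who want to see what a normal probability plot of residuals contains, the Python sketch below computes its coordinates: each sorted residual is paired with the standard normal quantile expected at its rank. The residual values are made up, the function name is hypothetical, and the plotting position (i - 0.5) / n is an assumption; Statistica may use a different convention.

```python
from statistics import NormalDist

def normal_plot_points(residuals):
    """Pair each sorted residual with its expected standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting position (i - 0.5) / n is an assumption; other conventions exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

# Made-up residuals for illustration.
points = normal_plot_points([0.3, -1.2, 0.1, 0.9, -0.4])
for quantile, residual in points:
    print(f"{quantile:8.3f} {residual:8.3f}")
```

If the residuals are approximately normal, these points fall close to a straight line.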

Deployment

Deployment is available if the Statistica installation is licensed for this feature.

Element Name	Description
Generates C/C++ code	Generates C/C++ code for deployment of predictive model.
Generates SVB code	Generates Statistica Visual Basic code for deployment of predictive model.
Generates PMML code	Generates PMML (Predictive Model Markup Language) code for deployment of predictive model. This code can be used via the Rapid Deployment options to efficiently compute predictions for (that is, score) large data sets.
Saves C/C++ code	Saves C/C++ code for deployment of predictive model.
File name for C/C++ code	Specify the name and location of the file in which to save the (C/C++) deployment code information.
Saves SVB code	Saves Statistica Visual Basic code for deployment of predictive model.
File name for SVB code	Specify the name and location of the file in which to save the (SVB/VB) deployment code information.
Saves PMML code	Saves PMML (Predictive Model Markup Language) code for deployment of predictive model. This code can be used via the Rapid Deployment options to efficiently compute predictions for (that is, score) large data sets.
File name for PMML (XML) code	Specify the name and location of the file in which to save the (PMML/XML) deployment code information.
