User Specified Regression and Loss Function

Fit arbitrary regression models using custom-defined loss functions; you can specify a regression equation using standard notation (e.g., Var3=a+log(b*Var4)). Logical operators are also supported.

STATISTICA will estimate the parameters of the regression equation by minimizing a custom loss function of the form Loss=Function (e.g., Loss=W*Abs(Obs-Pred)). Use the Least squares regression options (and the very efficient Levenberg-Marquardt algorithm) to estimate the parameters for arbitrary linear and nonlinear regression problems for large data sets (using the least-squares criterion); this is the recommended method for fitting nonlinear models.

 Note: If no model (syntax) is specified, STATISTICA will fit a simple linear model, using the least squares criterion.
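As an illustration of the default behavior described in the note, the following minimal Python sketch fits a simple linear model by the least squares criterion using the closed-form solution. The function name fit_simple_linear and the sample data are hypothetical and are not part of STATISTICA.

```python
def fit_simple_linear(x, y):
    """Return (intercept, slope) that minimize sum((obs - pred)**2)
    for the simple linear model pred = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and cross-products of x and y (about the means)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data that lie exactly on the line y = 1 + 2*x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_linear(x, y)
print(intercept, slope)
```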

Element Name Description
General
Detail of computed results reported Specifies the level of detail of the reported results. If Minimal detail is requested, only the parameter estimates are reported; if Comprehensive results is requested, the covariances and correlations of the parameter estimates are also reported; if All results is requested, a plot of the fitted 2D or 3D function (if applicable) is also reported, along with various descriptive graphs. Predicted and residual values can be requested as an option; if All results is requested, various plots of residuals are also computed.
User-defined function Specifies the regression equation. Specify the desired regression model in the general form for models:

 Dep.Var = Predictor Model

 On the left side of the equation, specify the dependent variable; on the right side, specify the expression including independent variables and the parameters to be estimated.

 Refer to variables either by number (e.g., = v1 - v2) or by name (e.g., = Retail - Cost).

 All names that are not recognized by STATISTICA as variable names or valid reserved keywords are interpreted to be parameters.

 Equations can contain logical expressions that involve constants, variables, parameters, or any mixture of the three.

 Example: v5=a+b*v5+log(c*v6)
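To make the roles of variables, parameters, and logical expressions concrete, here is a rough Python analogue of two regression equations. Both equations, and the function names predict_log_model and predict_piecewise, are hypothetical illustrations; the exact STATISTICA syntax for logical expressions is described in the Electronic Manual. The sketch assumes that a logical expression contributes 1 when it is true and 0 when it is false, which is how Python treats booleans in arithmetic.

```python
import math


def predict_log_model(a, b, c, v4, v6):
    # Analogue of a hypothetical equation of the form  v5 = a + b*v4 + log(c*v6):
    # a, b, c are parameters to be estimated; v4 and v6 are variables.
    return a + b * v4 + math.log(c * v6)


def predict_piecewise(a, b, c, v4):
    # Analogue of a hypothetical equation with a logical expression:
    # (v4 > 10) is True or False, which Python treats as 1 or 0,
    # so the parameter c shifts the prediction only for cases with v4 > 10.
    return a + b * v4 + c * (v4 > 10)


print(predict_log_model(1.0, 2.0, 1.0, v4=3.0, v6=1.0))
print(predict_piecewise(1.0, 0.5, 4.0, v4=8.0))
print(predict_piecewise(1.0, 0.5, 4.0, v4=12.0))
```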
Loss function Specifies the loss function (default is (OBS-PRED)**2, i.e., least squares); in general, all rules apply as outlined for the specification of the regression equation for the model (see also the Electronic Manual for details). In addition, the two keywords PRED and OBS are available to allow you to refer to the predicted and observed values, respectively, for the dependent variable. For example, the default least squares loss function can be specified as:
 L = (Obs - Pred)^2
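The choice of loss function changes the resulting estimates. The Python sketch below illustrates this for the simplest possible model, Pred = a (a single constant), comparing the default least squares loss with an absolute-deviation loss. It is not STATISTICA's estimation algorithm: the helper minimize_1d is a hypothetical golden-section search chosen only because it needs no derivatives, and the data are made up.

```python
def least_squares_loss(obs, pred):
    return (obs - pred) ** 2


def absolute_loss(obs, pred):
    return abs(obs - pred)


def total_loss(loss, observed, a):
    # Model: Pred = a for every case; the loss is summed over all cases.
    return sum(loss(obs, a) for obs in observed)


def minimize_1d(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    ratio = (5 ** 0.5 - 1) / 2
    while hi - lo > tol:
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if f(x1) < f(x2):
            hi = x2  # the minimum lies in [lo, x2]
        else:
            lo = x1  # the minimum lies in [x1, hi]
    return (lo + hi) / 2


# Hypothetical observations with one extreme value
observed = [1.0, 2.0, 3.0, 4.0, 100.0]

a_ls = minimize_1d(lambda a: total_loss(least_squares_loss, observed, a), 0.0, 100.0)
a_abs = minimize_1d(lambda a: total_loss(absolute_loss, observed, a), 0.0, 100.0)
print(a_ls, a_abs)
```

With these data the least squares estimate lands near the mean (22), whereas the absolute-deviation estimate lands near the median (3), so the latter is far less affected by the extreme observation.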
Missing data deletion Missing data can be casewise deleted or substituted by the respective variable means.
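The two treatments of missing data can be sketched in Python as follows. This is a minimal illustration that represents missing values as None; the function names casewise_delete and mean_substitute and the sample rows are hypothetical.

```python
def casewise_delete(rows):
    # Drop every case (row) that has at least one missing value.
    return [row for row in rows if None not in row]


def mean_substitute(rows):
    # Replace each missing value with the mean of the valid values
    # of the respective variable (column).
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        valid = [row[j] for row in rows if row[j] is not None]
        means.append(sum(valid) / len(valid))
    return [[means[j] if value is None else value for j, value in enumerate(row)]
            for row in rows]


# Hypothetical data: two variables, four cases, two missing values
rows = [[1.0, 10.0], [2.0, None], [None, 30.0], [4.0, 40.0]]
deleted = casewise_delete(rows)
substituted = mean_substitute(rows)
print(deleted)
print(substituted)
```

Casewise deletion keeps only the two complete cases, while mean substitution keeps all four cases and fills each gap with the respective column mean.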
p, for highlighting Specifies the p value used to highlight significant results (parameter estimates) in results spreadsheets.
Residual analysis Creates predicted and residual values; if the All results Level of detail is selected, then probability plots, surface plots, etc. are also reported.
Generates data source, if N for input less than Generates a data source for further analyses with other Data Miner nodes if the input data source has fewer than k observations, where k is the value specified in this edit field. Note that k is compared against the total number of observations in the input data source, not the number of valid or selected observations.
Estimation
Estimation Method Select the parameter estimation (function optimization) method; refer to the Electronic Manual for details regarding the different algorithms.
Asymptotic standard errors Select this option to compute asymptotic standard errors for the parameter estimates. Note that STATISTICA uses derivative-free estimation methods, and the estimation of asymptotic standard errors may fail in difficult analysis problems (e.g., when predictors are highly redundant). You can also find alternative (efficient) methods in the Generalized Linear Models facilities.
User Eta for differencing Select this option to use a user-defined value of Eta for the finite difference computations. The standard errors for the parameter estimates are computed via finite differencing; specifically, the matrix of second-order partial derivatives is approximated. To obtain accurate estimates for the derivatives, some a priori knowledge of the reliability of the loss function is necessary.
Eta value; 1E- Specifies the negative exponent for the base-10 constant Eta, which is used in the finite difference computations that estimate the parameter standard errors. Specifically, the matrix of second-order partial derivatives is approximated via finite differencing. To obtain accurate estimates for the derivatives, some a priori knowledge of the reliability of the loss function is necessary.
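The idea of approximating the matrix of second-order partial derivatives by finite differencing can be sketched in Python as follows. This is a generic central-difference scheme, not necessarily the one STATISTICA uses; in particular, treating 10**(-eta_exponent) directly as the step size is an assumption, and the function name finite_difference_hessian and the example loss are hypothetical.

```python
def finite_difference_hessian(f, params, eta_exponent=4):
    """Approximate the matrix of second-order partial derivatives of f
    at params using central differences with step h = 10**(-eta_exponent).
    Assumption: Eta is used directly as the step size."""
    h = 10.0 ** (-eta_exponent)
    n = len(params)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with parameter i moved by di*h and parameter j by dj*h
                p = list(params)
                p[i] += di * h
                p[j] += dj * h
                return f(p)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return hess


def example_loss(p):
    # Hypothetical loss surface with known second derivatives [[6, 2], [2, 2]]
    return 3 * p[0] ** 2 + 2 * p[0] * p[1] + p[1] ** 2


hess = finite_difference_hessian(example_loss, [1.0, 2.0], eta_exponent=4)
print(hess)
```

A step that is too large adds truncation error for non-quadratic loss functions, while a step that is too small amplifies rounding error in the evaluated loss, which is why some knowledge of the reliability of the loss function helps when choosing Eta.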
Number of iterations Specifies the maximum number of iterations to be performed during the parameter estimation.
Convergence criterion Set the convergence criterion value (by default, 0.0001); refer to the Electronic Manual for details.
Scale MS error to 1 Select the Scale MS-error to 1 option to rescale the mean square error to 1, which is recommended for maximum likelihood estimates. The resulting standard deviations for the parameter estimates are then the usual information theory standard errors.