Workspace Node: Lasso Regression - Specifications - Validation Tab

Statistica performs v‐fold cross‐validation to identify the optimal lambda value, that is, the value with the minimum average error. An initial run on the entire data set produces the lambda sequence. The cross‐validation procedure then splits the input data into v folds and runs the algorithm v times, each time with one fold omitted. For each lambda value in the sequence, the errors are averaged over the folds and their standard deviations are computed.
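The procedure described above can be sketched in a few lines of Python. This is only an illustration, not Statistica's implementation: the function names (`fit_lasso`, `lambda_sequence`, `cross_validate`), the coordinate‐descent solver, the log‐spaced lambda grid, and the synthetic data are all assumptions made for the example.

```python
import random
from statistics import mean, stdev

def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the lasso proximal step)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def predict(row, b0, beta):
    return b0 + sum(x * b for x, b in zip(row, beta))

def fit_lasso(X, y, lam, sweeps=100):
    """Coordinate descent on (1/2n)*sum(residual^2) + lam*sum(|beta_j|)."""
    n, p = len(X), len(X[0])
    b0, beta = mean(y), [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlate feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (y[i] - predict(X[i], b0, beta) + X[i][j] * beta[j])
                      for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
        b0 = mean(y[i] - predict(X[i], 0.0, beta) for i in range(n))
    return b0, beta

def lambda_sequence(X, y, n_lambda=10, ratio=0.01):
    """Assumed grid: log-spaced from the lambda that zeroes all coefficients."""
    n, p = len(X), len(X[0])
    ybar = mean(y)
    lam_max = max(abs(sum(X[i][j] * (y[i] - ybar) for i in range(n))) / n
                  for j in range(p))
    return [lam_max * ratio ** (k / (n_lambda - 1)) for k in range(n_lambda)]

def cross_validate(X, y, lambdas, v=5, seed=1):
    """Return (best lambda, mean error per lambda, std. dev. per lambda)."""
    n = len(y)
    order = list(range(n))
    random.Random(seed).shuffle(order)          # seeded random fold assignment
    fold_of = {case: pos % v for pos, case in enumerate(order)}
    fold_errors = {lam: [] for lam in lambdas}
    for k in range(v):                          # omit fold k, fit on the rest
        train = [i for i in range(n) if fold_of[i] != k]
        test = [i for i in range(n) if fold_of[i] == k]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for lam in lambdas:
            b0, beta = fit_lasso(X_tr, y_tr, lam)
            sq = [(y[i] - predict(X[i], b0, beta)) ** 2 for i in test]
            fold_errors[lam].append(mean(sq))
    cv_mean = {lam: mean(e) for lam, e in fold_errors.items()}
    cv_sd = {lam: stdev(e) for lam, e in fold_errors.items()}
    return min(lambdas, key=cv_mean.get), cv_mean, cv_sd

# Synthetic example: y depends on the first predictor only.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
y = [3.0 * a + rng.gauss(0, 0.5) for a, _ in X]
lambdas = lambda_sequence(X, y)                 # initial run on the entire data
best, cv_mean, cv_sd = cross_validate(X, y, lambdas, v=5, seed=1)
print(best, cv_mean[best], cv_sd[best])
```

The lambda sequence is computed once from the full data and reused in every fold, so the per‐fold errors are comparable across lambda values.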

In the Lasso Regression workspace node dialog box, select the Validation tab within the Specifications group to access the following options:

Option Description
Variables Click the Variables button to display a standard variable selection dialog box. Select one dependent variable and two or more independent variables. The independent variables can be categorical or continuous, or a combination of both.
V‐fold cross‐validation Use this check box to enable or disable v‐fold cross‐validation. V‐fold cross‐validation is particularly useful when no test sample is available and the learning sample is too small to have the test sample taken from it.
Seed random number The positive integer value entered in this box is used as the seed for a random number generator that produces v‐fold random subsamples from the input data.
Number of folds, v The value entered in this box determines the number of cross‐validation samples that will be generated from the input data.
Loss function Specifies how the errors are calculated. Two options are available:
  • Mean squared error: Select this option button to use squared residuals to calculate the mean cross‐validated error.
  • Mean absolute error: Select this option button to use the absolute value of the residuals to calculate the mean cross‐validated error.
Grouped errors Select this check box to compute separate statistics for each fold and use their mean and standard error to obtain the error curve. If this check box is not selected, an error matrix is built up and then summarized to obtain the error curve.
Options / C / W For more information, refer to "Common Options" in the Statistica Electronic Manual.
OK Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results are placed in the Reporting Documents node after running or updating the project.
Cancel Click the Cancel button to close the Lasso Regression dialog box without making any changes to the current specifications.
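As a rough illustration of the Loss function and Grouped errors options, the Python sketch below computes one point of the error curve (one lambda value) from cross‐validated residuals. The function names and the way the ungrouped standard error is computed (over individual cases) are assumptions for the example, not a description of Statistica's internal code.

```python
from math import sqrt
from statistics import mean, stdev

def loss(residual, kind="mse"):
    """Squared residual for mean squared error, absolute value for mean absolute error."""
    return residual ** 2 if kind == "mse" else abs(residual)

def error_curve_point(residuals, folds, kind="mse", grouped=True):
    """Return (mean error, standard error) for one lambda value.

    residuals: cross-validated residual for each case
    folds:     fold label of each case
    """
    errors = [loss(r, kind) for r in residuals]
    if grouped:
        # Separate statistic per fold, then mean and standard error across folds.
        per_fold = {}
        for e, k in zip(errors, folds):
            per_fold.setdefault(k, []).append(e)
        fold_means = [mean(v) for v in per_fold.values()]
        return mean(fold_means), stdev(fold_means) / sqrt(len(fold_means))
    # Ungrouped: pool all case-level errors and summarize them together
    # (assumed here to mean a standard error taken over cases).
    return mean(errors), stdev(errors) / sqrt(len(errors))

residuals = [1.0, -1.0, 2.0, -2.0]
folds = [0, 0, 1, 1]
print(error_curve_point(residuals, folds, kind="mse", grouped=True))
print(error_curve_point(residuals, folds, kind="mae", grouped=True))
print(error_curve_point(residuals, folds, kind="mse", grouped=False))
```

With equally sized folds, both settings give the same mean error; they differ in the standard error, because the grouped version measures variation between folds and the ungrouped version measures variation between cases.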
Note: Statistica ignores all cases that have missing data for any of the variables selected in the list.
Note: All cases with weight less than or equal to zero will be treated as missing data.
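The effect of the two notes above on the input data can be pictured with the following Python sketch. It is a hypothetical illustration only: the `usable_cases` helper, the dictionary‐per‐case layout, and the use of `None` or NaN to encode a missing value are assumptions, not part of Statistica.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing (an assumed encoding)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def usable_cases(rows, selected, weights=None):
    """Return the indices of cases that would enter the analysis."""
    keep = []
    for i, row in enumerate(rows):
        # Drop the case if any selected variable is missing.
        if any(is_missing(row[name]) for name in selected):
            continue
        # Drop the case if its weight is missing, zero, or negative.
        if weights is not None:
            w = weights[i]
            if is_missing(w) or w <= 0:
                continue
        keep.append(i)
    return keep

rows = [
    {"y": 1.2, "x1": 0.5, "x2": 3.0},
    {"y": None, "x1": 0.1, "x2": 2.0},          # missing dependent variable
    {"y": 0.7, "x1": float("nan"), "x2": 1.0},  # missing predictor
    {"y": 2.4, "x1": 0.9, "x2": 4.0},
]
weights = [1.0, 1.0, 1.0, 0.0]                  # last case has zero weight
print(usable_cases(rows, ["y", "x1", "x2"], weights))  # [0]
```

Only variables in the selected list are checked, so a missing value in an unselected variable does not remove the case.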