Lasso Regression - Validation Tab

Statistica performs v-fold cross-validation to identify the optimal lambda value, that is, the value with the minimum average cross-validated error. An initial run on the entire data set produces the lambda sequence. The cross-validation procedure then splits the input data into v folds and runs the algorithm v times, each time with one fold omitted. For each lambda value, the errors are averaged over the folds and their standard deviation is computed.
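The procedure described above can be sketched in a few lines of Python. This is a minimal illustration only, not Statistica's implementation: it assumes a single predictor with no intercept (so the lasso solution has a closed form), a geometric lambda grid, and mean squared error as the loss. All function names (soft_threshold, fit_lasso_1d, lambda_sequence, cross_validate) and the synthetic data are hypothetical.

```python
import random
import statistics


def soft_threshold(z, lam):
    """Shrink z toward zero by lam (the lasso soft-thresholding operator)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso_1d(x, y, lam):
    """One predictor, no intercept: minimize (1/(2n))*sum((y - b*x)^2) + lam*|b|."""
    n = len(x)
    xy = sum(a * b for a, b in zip(x, y)) / n
    xx = sum(a * a for a in x) / n
    return soft_threshold(xy, lam) / xx


def lambda_sequence(x, y, n_lambda=10, ratio=0.01):
    """Assumed grid: geometric, from the lambda that zeroes the coefficient down to ratio times it."""
    lam_max = abs(sum(a * b for a, b in zip(x, y))) / len(x)
    return [lam_max * ratio ** (k / (n_lambda - 1)) for k in range(n_lambda)]


def cross_validate(x, y, v=5, seed=1):
    lambdas = lambda_sequence(x, y)            # initial run on the entire data
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)           # seeded random assignment to folds
    folds = [idx[k::v] for k in range(v)]
    fold_errors = [[] for _ in lambdas]        # fold_errors[j] holds one error per fold
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        x_tr = [x[i] for i in train]
        y_tr = [y[i] for i in train]
        for j, lam in enumerate(lambdas):
            b = fit_lasso_1d(x_tr, y_tr, lam)  # fit with this fold omitted
            mse = sum((y[i] - b * x[i]) ** 2 for i in held_out) / len(held_out)
            fold_errors[j].append(mse)
    mean_err = [statistics.mean(e) for e in fold_errors]
    sd_err = [statistics.stdev(e) for e in fold_errors]
    best = min(range(len(lambdas)), key=lambda j: mean_err[j])
    return lambdas, mean_err, sd_err, lambdas[best]


# Synthetic data for illustration only: y is roughly 2*x plus noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(60)]
y = [2.0 * xi + rng.gauss(0, 0.5) for xi in x]
lambdas, mean_err, sd_err, best_lambda = cross_validate(x, y, v=5, seed=1)
```

Here best_lambda is the grid value with the smallest average held-out error, and sd_err gives the spread of the per-fold errors at each lambda.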

In the Lasso Regression dialog box, select the Validation tab to access the following options:

V-fold cross-validation

Select this check box to enable v-fold cross-validation; clear it to disable cross-validation. V-fold cross-validation is particularly useful when no test sample is available and the learning sample is too small to have a test sample taken from it.

Seed random number

Select this check box to enter a positive integer in the list box. This value is used as the seed for the random number generator that draws the v random subsamples (folds) from the input data.

Number of folds, v

The value entered in this list box determines the number of cross-validation samples (folds) that will be generated from the input data.
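As an illustration of how the seed and the number of folds interact, the following Python sketch assigns each observation to one of v folds. It is an assumption-laden example rather than Statistica's method: the function name assign_folds is hypothetical, Python's random module stands in for whatever generator Statistica uses, and the requirement that v be at least 2 is assumed here.

```python
import random
from collections import Counter


def assign_folds(n_obs, v, seed):
    """Return a fold label (0..v-1) for each observation, balanced and randomly ordered."""
    if not (isinstance(seed, int) and seed > 0):
        raise ValueError("seed must be a positive integer")
    if not 2 <= v <= n_obs:
        # Assumed constraint: at least 2 folds and no more folds than observations.
        raise ValueError("v must be between 2 and the number of observations")
    labels = [i % v for i in range(n_obs)]   # fold sizes differ by at most one
    random.Random(seed).shuffle(labels)      # the seed fixes the random assignment
    return labels


folds_a = assign_folds(23, 5, seed=42)
folds_b = assign_folds(23, 5, seed=42)       # same seed, same assignment
sizes = Counter(folds_a)                     # number of observations per fold
```

In this sketch, repeating the call with the same seed reproduces the same folds, while a different seed will generally give a different split.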

Loss function

Specifies the way the errors are calculated. Available options:

Element Name         Description
Mean squared error   Select this option button to use squared residuals to calculate the mean cross-validated error.
Mean absolute error  Select this option button to use the absolute value of the residuals to calculate the mean cross-validated error.
Grouped errors       Select this check box to compute separate statistics for each fold and use their mean and standard error to obtain the error curve. If this check box is not selected, an error matrix is built up and then summarized to obtain the error curve.
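The options above can be illustrated with a short Python sketch that computes one point of the error curve (a single lambda value) from held-out residuals. This is one plausible reading of the descriptions, not Statistica's documented computation: the function names pointwise_loss and cv_error are hypothetical, the grouped branch takes an unweighted mean of the per-fold errors, and the ungrouped branch treats the "error matrix" as the pooled per-observation losses.

```python
import statistics


def pointwise_loss(residuals, loss="mse"):
    """Per-observation loss: squared residual ("mse") or absolute residual ("mae")."""
    if loss == "mse":
        return [r * r for r in residuals]
    if loss == "mae":
        return [abs(r) for r in residuals]
    raise ValueError("loss must be 'mse' or 'mae'")


def cv_error(residuals, fold_ids, loss="mse", grouped=True):
    """Return (mean error, standard error) for one lambda value."""
    losses = pointwise_loss(residuals, loss)
    if grouped:
        # Separate statistic for each fold, then mean and standard error over folds.
        by_fold = {}
        for value, fold in zip(losses, fold_ids):
            by_fold.setdefault(fold, []).append(value)
        fold_means = [statistics.mean(vals) for vals in by_fold.values()]
        se = statistics.stdev(fold_means) / len(fold_means) ** 0.5
        return statistics.mean(fold_means), se
    # Ungrouped (assumed): pool all per-observation losses and summarize them directly.
    se = statistics.stdev(losses) / len(losses) ** 0.5
    return statistics.mean(losses), se


# Hypothetical held-out residuals for six observations in three folds.
residuals = [1.0, -1.0, 2.0, -2.0, 3.0, -3.0]
fold_ids = [0, 0, 1, 1, 2, 2]
grouped_mae = cv_error(residuals, fold_ids, loss="mae", grouped=True)
pooled_mae = cv_error(residuals, fold_ids, loss="mae", grouped=False)
```

With equal fold sizes, as in this example, the two modes give the same mean error and differ only in the standard error attached to it.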

Options / C / W / By Group. See Common Options.