Workspace Node: Support Vector Machines - Specifications - Cross-Validation Tab

In the Support Vector Machines node dialog box, under the Specifications heading, select the Cross-validation tab to access the options for applying the cross-validation algorithm. Cross-validation provides estimates of the training parameters that are displayed on the SVM tab. Although you can specify these training parameters directly on the SVM tab, often little is known about their best values, and cross-validation can estimate them for you.

Element Name Description
V-fold cross-validation In this group box, select the Apply v-fold cross-validation check box to use a v-fold cross-validation algorithm to determine the best values of the training parameters specified in the Grid search group box. The v-fold cross-validation algorithm is described in some detail in the documentation for the Classification Trees, Classification and Regression Trees (C&RT), and General CHAID modules.

The general idea of this method is to divide the overall sample into v folds (randomly drawn, disjoint sub-samples). The same type of SVM analysis is then successively applied to the observations belonging to v-1 of the folds (which constitute the cross-validation training sample), and the results of the analysis are applied to the remaining fold (the fold that was not used to fit the SVM model, i.e., the testing sample) to compute the error, usually defined as the sum of squared errors. This error quantifies how well the observations in the held-out fold can be predicted by the SVM model. The results for the v replications are averaged to yield a single measure of model error, which indicates the stability of the respective model, i.e., its validity for predicting unseen data.
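The procedure described above can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm as implemented in the node: the function names (v_fold_indices, v_fold_error) are hypothetical, the round-robin fold assignment is an assumption, and a trivial "predict the training mean" model stands in for the SVM so that the example stays self-contained.

```python
import random


def v_fold_indices(n_cases, v, seed):
    """Randomly partition case indices 0..n_cases-1 into v disjoint folds."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    return [indices[i::v] for i in range(v)]


def v_fold_error(xs, ys, v, seed, fit, predict):
    """Average the held-out sum of squared errors over the v folds."""
    folds = v_fold_indices(len(xs), v, seed)
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        # The other v-1 folds form the cross-validation training sample.
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        # Apply the fitted model to the fold that was not used for fitting.
        sse = sum((ys[i] - predict(model, xs[i])) ** 2 for i in test_idx)
        fold_errors.append(sse)
    return sum(fold_errors) / v


# Stand-in model (an assumption, NOT an SVM): always predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
cv_error = v_fold_error(xs, ys, v=10, seed=1, fit=fit_mean, predict=predict_mean)
print(cv_error)
```

Reusing the same seed reproduces the same fold assignment, so two runs with identical settings give the same averaged error.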

Apply v-fold cross-validation Select this check box to apply v-fold cross-validation. The training parameters on the SVM tab will not be available if this option is selected since their values will be determined by the cross-validation algorithm.
V value In this field, specify the number of folds used to perform the cross-validation. The default value is 10; the minimum is 2. The larger the number of folds, the fewer data cases will be available in each fold. This may lead to noisy cross-validation results (across the v folds). Thus, care should be taken in determining the V value.
Seed Specify the random number generator seed to be used in the process of (randomly) grouping the data into v folds.
Grid search The options in this group box define the grid search (the Minimum, Maximum, and Increment) for the Capacity, Epsilon, and Nu training parameters.
Minimum Specify the minimum value of the training parameters to start with.
Maximum Specify the maximum value of the training parameters at which the search ends.
Increment Specify the step by which the value of the training parameters is increased during the search.
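
As an illustration of how the Minimum, Maximum, and Increment settings might translate into a grid of candidate values, consider the following Python sketch. It rests on assumptions: the function names (grid_values, grid_search) are hypothetical, the Maximum is treated as inclusive, and a toy error function replaces the cross-validation error that the node would actually compute for each parameter combination.

```python
import itertools
import math


def grid_values(minimum, maximum, increment):
    """Candidate values from minimum up to maximum (assumed inclusive)."""
    if increment <= 0 or maximum < minimum:
        raise ValueError("need increment > 0 and maximum >= minimum")
    # Count whole steps; the small tolerance guards against float rounding.
    steps = math.floor((maximum - minimum) / increment + 1e-9)
    return [minimum + i * increment for i in range(steps + 1)]


def grid_search(param_grids, cv_error):
    """Return the parameter combination with the smallest error."""
    names = list(param_grids)
    best_params, best_error = None, math.inf
    for combo in itertools.product(*(param_grids[n] for n in names)):
        params = dict(zip(names, combo))
        error = cv_error(params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Hypothetical grid settings for two of the training parameters.
grids = {
    "capacity": grid_values(1.0, 10.0, 1.0),
    "epsilon": grid_values(0.1, 0.5, 0.1),
}


# Toy stand-in for the cross-validation error of one parameter combination.
def toy_cv_error(params):
    return (params["capacity"] - 4.0) ** 2 + (params["epsilon"] - 0.3) ** 2


best_params, best_error = grid_search(grids, toy_cv_error)
print(best_params, best_error)
```

The number of combinations grows multiplicatively with the number of values per parameter, and each combination requires v model fits, so a small Increment over a wide range can make the search slow.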

Options / C / W. See Common Options.

OK Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project.