Workspace Node: Support Vector Machines - Specifications - Training Tab

In the Support Vector Machines node dialog box, under the Specifications heading, select the Training tab to access options that control the training process and therefore the final SVM model. Together, the Maximum number of iterations and Stop at error (accuracy) options let you train an SVM model until either the iteration limit or the target error is reached, whichever comes first. If the error on the training sample drops below the given value, the SVM model is considered sufficiently trained, and training is terminated.
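The interaction of the two stopping options can be illustrated with a minimal Python sketch. This is not STATISTICA code: the names train_until, step, and make_halving_step are hypothetical, the toy error that halves at every iteration merely stands in for a real SVM optimizer, and the use of "less than or equal to" for the error test is an assumption.

```python
def train_until(step, max_iterations, stop_at_error):
    """Run `step` until the error target or the iteration cap is hit.

    `step` is a hypothetical callable that performs one training
    iteration and returns the current training error.
    Returns (iterations_run, last_error, reason).
    """
    error = float("inf")
    for iteration in range(1, max_iterations + 1):
        error = step()
        # Assumption: training stops once the error is at or below the target.
        if error <= stop_at_error:
            return iteration, error, "error target reached"
    return max_iterations, error, "iteration limit reached"


def make_halving_step(start=1.0):
    """Toy stand-in for an optimizer: the error halves on every call."""
    state = {"error": start}

    def step():
        state["error"] /= 2.0
        return state["error"]

    return step


# A generous iteration limit: the error target ends training first.
print(train_until(make_halving_step(), max_iterations=100, stop_at_error=0.01))

# A tight iteration limit: the cap ends training before the target is met.
print(train_until(make_halving_step(), max_iterations=3, stop_at_error=0.01))
```

In the first call the error target stops training well before the limit of 100 iterations; in the second, the limit of 3 iterations is reached while the error is still above the target.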

Element Name Description
Maximum number of iterations Training an SVM model is an iterative procedure that minimizes an error function. Generally, the more iterations are run, the more the SVM is trained and the better it predicts the training sample. Use this option to set the maximum number of iterations for which the SVM model is trained. This value is an upper bound on the number of iterations: if the training error (see Stop at error, below) reaches its designated value first, training is terminated whether or not the maximum number of iterations has been reached.
Stop at error In this field, specify the target error value. If the error on the training set reaches the given value, the SVM is considered sufficiently trained, and training is terminated even if the maximum number of iterations has not been reached (see Maximum number of iterations, above).
Cache size, in MB Size of the kernel cache. Set this option as high as your system's resources allow; a larger cache reduces computation time by storing recently calculated quantities that the optimization reuses.
Scale continuous independent variable(s) Select this check box to scale the continuous input variables to the range [0, 1]. This option is particularly useful when the range of the continuous inputs is large, which can generate a large number of support vectors and may result in over-fitting.
Scale continuous dependent variable Select this check box to scale the continuous dependent variable. Applicable only to regression problems.
Shrink data Select this check box to reduce optimization time. The option is based on the heuristic of confining the free vectors into a possible set.
Use penalty for unbalanced class labels This option is applicable only to classification problems. Sometimes the classes of a categorical dependent variable have unbalanced frequencies. Example: a categorical variable with 30 cases and three classes A, B, and C, with frequencies 10, 19, and 1. In cases such as this, the SVM model will tend to misclassify C as A or B. To counteract this imbalance, STATISTICA SVM allows you to associate a penalty value with each class category. The larger the penalty for a class category, the more the SVM is discouraged from associating a new data case with that class.

When this check box is selected and you click the OK button, the Penalty selection dialog box is displayed, which is used to assign custom penalties to the class categories.

Set class penalty Click this button to display a general user entry spreadsheet. The spreadsheet shows the frequencies of the class labels (in the training sample) and their corresponding class penalties (the default is 1). Use this spreadsheet to specify the individual class penalties. The frequencies cannot be edited because they are properties of the data; they are displayed for convenience, since their values can indicate imbalance in the class categories. After entering the penalties, click either OK to confirm the changes (make them permanent) or Cancel to discard them.
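The layout of the class penalty spreadsheet can be mimicked with a minimal Python sketch. This is only an illustration: the function penalty_table is hypothetical, the rule that penalties must be positive is an assumption, and the custom penalty of 2.0 is an arbitrary value chosen to show an override, not a recommendation.

```python
from collections import Counter


def penalty_table(labels, custom_penalties=None):
    """Return {class: (frequency, penalty)} for the training labels.

    Frequencies are derived from the data and are not editable;
    penalties default to 1 and can be overridden per class.
    """
    frequencies = Counter(labels)
    penalties = {cls: 1.0 for cls in frequencies}
    for cls, value in (custom_penalties or {}).items():
        if cls not in frequencies:
            raise KeyError(f"unknown class label: {cls!r}")
        # Assumption: only positive penalties are meaningful.
        if value <= 0:
            raise ValueError("penalty must be positive")
        penalties[cls] = float(value)
    return {cls: (frequencies[cls], penalties[cls]) for cls in sorted(frequencies)}


# The example from the text: 30 cases with frequencies 10, 19, and 1.
labels = ["A"] * 10 + ["B"] * 19 + ["C"] * 1
table = penalty_table(labels, custom_penalties={"B": 2.0})
for cls, (frequency, penalty) in table.items():
    print(cls, frequency, penalty)
```

Classes without a custom entry keep the default penalty of 1, and an entry for a label that does not occur in the training sample is rejected.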

Options / C / W. See Common Options.

OK Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project.
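As an illustration of the Scale continuous independent variable(s) option described above, the following minimal Python sketch rescales one variable to the range [0, 1]. It assumes ordinary min-max scaling based on the observed minimum and maximum; the exact formula used by the software is not stated here, and the function name and sample values are hypothetical.

```python
def scale_to_unit_range(values):
    """Min-max scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant variable is mapped to 0 for every case.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Hypothetical input variable with a large range.
incomes = [20000.0, 35000.0, 50000.0, 120000.0]
scaled = scale_to_unit_range(incomes)
print(scaled)
```

After scaling, the smallest value maps to 0 and the largest to 1, so variables measured on very different scales become comparable.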