Setting the Model and Training Options
You can manually set the model creation and training options for your project in Learn UI.
To set the model creation and training options in Learn UI:
| 1. | On the top menu, select Options > Model. |
Figure 39: Model Options
| 2. | On the Model Creation tab, adjust the following options that are used to create the model: |
Model Creation Options
| Field | Description |
| Model Type | Large - Large models take subsets into account; they might predict better in some situations. Small - Small models are much smaller and can support more features. |
| Initial Learning Rate | Defines how quickly the model state changes during training. It is recommended to set the value between 1.1 and 2.0. Default value: |
| Annealing Rate | Defines how quickly the learning rate decreases with each iteration. The default value of 0.05 makes the learning rate two times smaller after about 15 iterations. |
| Precision | Precision of internal model weights. A higher precision might help distinguish similar examples with different labels. It is recommended to set the value between 6 and 10 (default). The amount of memory used by the model is proportional to 2 to the power of the selected precision value. |
| Submodel Training | Dynamic - (Default) This mode uses augmentation of training data to better train submodels. It is recommended for most projects, especially when training with relatively few record pairs. None - This option trains with actual training examples only. It can be used if datasets contain abundant examples from all subsets that can be encountered. Training is much faster; therefore, this mode is also applicable when the number of model features is very large. |
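The Annealing Rate and Precision descriptions in the table above can be checked with a little arithmetic. The following Python sketch is illustrative only and is not part of Learn UI. The function names are hypothetical, and the exponential decay schedule is an assumption: this guide does not state the formula that the product uses, only that the default rate of 0.05 roughly halves the learning rate after about 15 iterations.

```python
import math


def relative_memory(precision: int, baseline: int = 10) -> float:
    """Memory use relative to a baseline precision.

    Based on the statement that memory is proportional to 2**precision.
    """
    return 2.0 ** (precision - baseline)


def annealed_rate(initial_rate: float, annealing_rate: float, iteration: float) -> float:
    """Learning rate after a number of iterations.

    ASSUMPTION: exponential decay. The actual schedule used by the
    product is not documented in this guide.
    """
    return initial_rate * math.exp(-annealing_rate * iteration)


def halving_iterations(annealing_rate: float) -> float:
    """Iterations needed to halve the learning rate under the assumed schedule."""
    return math.log(2) / annealing_rate


if __name__ == "__main__":
    # Precision 6 versus the default of 10
    print(relative_memory(6))
    # Number of iterations to halve the rate at the default annealing rate
    print(halving_iterations(0.05))
    # Learning rate after 14 iterations, starting from 1.5
    print(annealed_rate(1.5, 0.05, 14))
```

Under this assumed schedule, the default annealing rate halves the learning rate in roughly 14 iterations, which is in line with the "about 15 iterations" stated in the table. Lowering Precision from 10 to 6 reduces the memory estimate by a factor of 16.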
| 3. | Optionally, click Set Defaults to apply the default model creation options, which are based on the current number of model features. |
| 4. | To set the desired model training options, click the Training Method tab. |
Figure 40: Training Method Options
| 5. | On the Training Method tab, select Minimize validation error to use the recommended method that avoids overfitting. The training stops at the iteration where the validation error is the lowest (other criteria for tie breaking are also used). Use this option for the final model that is going to be used in production. |
| 6. | Select Minimize training error to stop the training at the iteration where the training error rate is the lowest. You can use this method to identify mislabeled pairs in the training dataset. The errors that remain in the training dataset are the minimal set of pairs that the model was not able to predict correctly, so these pairs are likely to have incorrect or contradictory labels. |
Do not use this option for the final model or for evaluating the performance of the model, because this kind of training can cause significant overfitting.
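The label review described for Minimize training error can be sketched as follows. This is a hypothetical helper, not a Learn UI function. It assumes that you have exported the training labels and the model predictions as two parallel lists, and the label values "match" and "nonmatch" are placeholders.

```python
from typing import List, Sequence


def suspect_pairs(labels: Sequence[str], predictions: Sequence[str]) -> List[int]:
    """Return the positions of training pairs that the model still predicts incorrectly.

    After training with Minimize training error, these pairs are the
    candidates to review for incorrect or contradictory labels.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    return [i for i, (label, pred) in enumerate(zip(labels, predictions)) if label != pred]


if __name__ == "__main__":
    # Placeholder data: two of the four pairs disagree with their labels
    labels = ["match", "nonmatch", "match", "match"]
    predictions = ["match", "match", "match", "nonmatch"]
    print(suspect_pairs(labels, predictions))
```

The returned positions identify pairs to inspect manually. A disagreement does not prove that the label is wrong.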
| 7. | Adjust the following fields: |
Figure 41: Training options
| Field | Description |
| Iterations to explore after finding best iteration | Perform this number of iterations after the current best result is found to search for a better result. The default is 35 iterations. The value can be increased to explore more iterations, especially if the learning rate is small. Using a much smaller value is not recommended. |
| Minimum number of iterations | The minimum number of iterations that are always performed when training the model. This can be used to extend training until the learning rate becomes reasonably small. The default is 0 iterations. Changing this parameter is rarely needed because typically the training is made long enough by using the recommended value of the Good fit distance between error rates parameter. |
| Good fit distance between error rates | Training does not stop at iterations where training error rate is greater than the validation error rate by more than the specified distance. The default value is 1%. Essentially, if 0% is used, the training continues until the model is able to predict the training dataset as well as, or better than, the validation dataset. This parameter should be 0% or slightly above 0% to prevent underfitting. Using 100% ignores underfitting (not recommended). |
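The way the three training options might interact in the Minimize validation error mode can be sketched in Python. This is a simplified illustration under stated assumptions, not the product's algorithm: the function and parameter names are hypothetical, error rates are given as fractions (0.01 for 1%), and the additional tie-breaking criteria that Learn UI applies are not modeled.

```python
from typing import Optional, Sequence


def pick_stopping_iteration(
    train_err: Sequence[float],
    valid_err: Sequence[float],
    explore_after_best: int = 35,
    min_iterations: int = 0,
    good_fit_distance: float = 0.01,
) -> Optional[int]:
    """Return the 0-based index of the chosen iteration, or None if none qualifies."""
    best_idx: Optional[int] = None
    for i, (tr, va) in enumerate(zip(train_err, valid_err)):
        # An iteration is eligible only if the training error does not exceed
        # the validation error by more than the good fit distance.
        eligible = (tr - va) <= good_fit_distance
        if eligible and (best_idx is None or va < valid_err[best_idx]):
            best_idx = i
        # Stop once the minimum number of iterations has been performed and
        # no better result was found within the exploration window.
        done_min = (i + 1) >= min_iterations
        if best_idx is not None and done_min and (i - best_idx) >= explore_after_best:
            break
    return best_idx


if __name__ == "__main__":
    # Placeholder error curves: the first iteration has the lowest
    # validation error but is clearly underfit (training error is much higher).
    train = [0.30, 0.20, 0.10, 0.05, 0.04, 0.03]
    valid = [0.09, 0.15, 0.12, 0.10, 0.11, 0.12]
    print(pick_stopping_iteration(train, valid, explore_after_best=2))
    print(pick_stopping_iteration(train, valid, explore_after_best=2, good_fit_distance=1.0))
```

With the placeholder data, the default 1% distance skips the early underfit iterations, while a distance of 100% accepts the very first iteration. This mirrors why the table advises against using 100%.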
| 8. | Optionally, click Set Defaults to apply the default training method options. |
| 9. | Click OK. If you have not used manually set model options for this project before, a confirmation dialog is displayed, and the manually set options are used from that point on in this project. In this case, model creation options are no longer automatically adjusted based on the number of features, even if the number of features changes later. If you intend to keep using the automatic adjustment of model creation options and opened the dialog only to view the default options for the current number of features, click Cancel instead. |
Figure 42: Confirmation
All model settings and training method options are saved when the project is saved.