Training algorithm
|
Use the options in this group box to select a training algorithm and specify options that are related to the selected algorithm. The Learning rate and Momentum options are enabled only when the algorithm is Gradient descent.
|
Algorithm
|
From this drop-down list, select the training algorithm to use. The available algorithms are as follows.
- Gradient descent: Gradient descent is a first-order optimization algorithm that attempts to move incrementally to successively lower points in the search space in order to locate a minimum (a minimal update sketch follows this list).
- BFGS: BFGS (Broyden-Fletcher-Goldfarb-Shanno, or Quasi-Newton) is a powerful second-order training algorithm with very fast convergence but high memory requirements, because it stores the Hessian matrix.
- Conjugate gradient: Conjugate gradient is a fast training algorithm for multilayer perceptrons that proceeds by a series of line searches through error space. Succeeding search directions are selected to be conjugate (non-interfering). It is a good generic algorithm with generally fast convergence.
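For illustration, here is a minimal sketch of the gradient-descent update on a hypothetical quadratic error surface (the function and variable names are illustrative, not SANN internals):

```python
import numpy as np

def gradient_descent_step(weights, gradient, learning_rate=0.1):
    # First-order update: step against the gradient, toward lower error.
    return weights - learning_rate * gradient

# Hypothetical error surface E(w) = w1^2 + w2^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
for _ in range(100):
    w = gradient_descent_step(w, 2 * w)
print(w)  # approaches the minimum at [0, 0]
```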
|
Cycles
|
Specifies the number of training cycles for the network. In each training cycle, the entire training set is passed through the network and the network error is calculated. This information is then used to adjust the weights so that the error is further reduced.
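As a rough sketch of what one training cycle involves, assuming a simple linear stand-in for the network (the names and the mean-squared-error choice are illustrative):

```python
import numpy as np

def train(weights, inputs, targets, cycles, learning_rate=0.1):
    for _ in range(cycles):
        # One cycle: pass the entire training set through the network
        # (a linear model here, for brevity) and measure the error.
        outputs = inputs @ weights
        errors = outputs - targets
        # Use the error gradient to adjust the weights so that the
        # error is further reduced on the next cycle.
        gradient = 2 * inputs.T @ errors / len(inputs)
        weights = weights - learning_rate * gradient
    return weights
```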
|
Learning rate
|
Specifies the learning rate used to adjust the weights. A higher learning rate may converge more quickly, but may also exhibit greater instability. Values of 0.1 or lower are reasonably conservative. Higher learning rates may cause divergence of the weights. You can specify a learning rate only when the Gradient descent algorithm has been selected.
|
Momentum
|
Specifies the momentum. Momentum compensates for slow convergence when weight adjustments are consistently in one direction: the adjustment picks up speed. Momentum usually increases the speed of convergence of Gradient descent considerably, and a higher momentum value can allow you to decrease the learning rate, increasing stability without sacrificing much convergence speed. You can specify the Momentum only when the Gradient descent algorithm is selected.
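A sketch of a classical momentum update (variable names and default values are illustrative):

```python
import numpy as np

def momentum_step(weights, gradient, velocity, learning_rate=0.05, momentum=0.9):
    # Adjustments that point consistently in one direction accumulate
    # in `velocity`, so the update picks up speed along that direction.
    velocity = momentum * velocity - learning_rate * gradient
    return weights + velocity, velocity

# Same hypothetical error surface as above: E(w) = w1^2 + w2^2.
w, v = np.array([1.0, -2.0]), np.zeros(2)
for _ in range(100):
    w, v = momentum_step(w, 2 * w, v)
```

Note how the accumulated velocity lets a smaller learning rate cover ground quickly while staying stable.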
|
Network randomization
|
Use the options in this group box to specify how the weights should be initialized at the beginning of training. You can select Normal randomization or Uniform randomization. In addition to selecting a distribution, you must also specify the mean/min and variance/max to use. You may change the default mean/min and variance/max settings if you want, but it is generally recommended that you set the mean/min to zero and the variance/max to no more than 0.1. Keeping the initial weights small helps the network grow gradually from its linear state (small weight values) to a nonlinear mode (large weight values) as needed during training to model the input-target relationship.
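As an illustration of the two schemes, here is a NumPy sketch (the function name and defaults are illustrative; the parameter pairing mirrors the Mean/Min and Variance/Max fields described below):

```python
import numpy as np

rng = np.random.default_rng()

def init_weights(shape, method="normal", mean_or_min=0.0, var_or_max=0.1):
    if method == "normal":
        # Mean/Min is the mean; Variance/Max is the variance.
        return rng.normal(loc=mean_or_min, scale=np.sqrt(var_or_max), size=shape)
    # Mean/Min is the minimum; Variance/Max is the maximum.
    return rng.uniform(low=mean_or_min, high=var_or_max, size=shape)

# Recommended defaults: small initial weights keep the network near
# its linear regime at the start of training.
w = init_weights((10, 5), method="normal", mean_or_min=0.0, var_or_max=0.1)
```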
|
Normal randomization
|
Uses a normal randomization of weights for the neural network model. A normal distribution (with the mean and variance specified) is used to draw the initial weight values.
|
Uniform randomization
|
Uses uniform randomization of weights for the neural network model. A uniform distribution (with the minimum and maximum specified) is used to draw the initial weight values.
|
Mean/Min
|
Specifies either the mean (for the normal distribution) or the minimum value (for the uniform distribution) to use for drawing the initial (that is, before training starts) weight sample.
|
Variance/Max
|
Specifies either the variance (for the normal distribution) or the maximum value (for the uniform distribution) to use for drawing the initial (that is, before training starts) weight sample.
|
Stopping conditions
|
Use the options in this group box to specify the conditions under which network training stops early.
Note: Although SANN defaults to using the training set for early stopping when no test sample is selected, it is still possible to use the training set for that purpose while also having a test sample. To do so, select a validation set (instead of a test set). The only difference between test and validation sets is that the former is used for early stopping while the latter is never presented to the network while training is in progress. So, by having a validation set, you effectively have a test set while using the training sample for early stopping.
|
Enable stopping conditions
|
Select the check box to apply early stopping to the training of the neural network. Early stopping occurs when the conditions defined below are met by the training algorithm.
|
Change in error
|
When stopping conditions are applied, network training ends if the average improvement in network error over a specified number of training cycles (the Window) is less than the Change in error value specified here.
|
Window
|
Specifies the number of training cycles over which the average improvement in network error must be at least as large as the specified Change in error for training to continue.
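A sketch of how such a test might be computed (the error-history bookkeeping is illustrative, not SANN's implementation):

```python
def should_stop(error_history, window, change_in_error):
    # error_history holds the network error after each training cycle.
    if len(error_history) <= window:
        return False  # not enough cycles yet to evaluate the window
    recent = error_history[-(window + 1):]
    # Average per-cycle improvement over the last `window` cycles.
    avg_improvement = (recent[0] - recent[-1]) / window
    return avg_improvement < change_in_error
```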
|