SANN - Custom Neural Network/Subsampling - MLP Tab

You can select the MLP tab of the SANN - Custom Neural Network dialog box or the SANN - Subsampling dialog box to access the options described here. For information on the options that are common to all tabs (located at the top and on the lower-right side of the dialog box), see SANN - Custom Neural Network or SANN - Subsampling. Note that the MLP tab is only available when the Multilayer perceptron (MLP) option button is selected on the Quick tab.

Option Description
Training Algorithm Use the options in this group box to select a training algorithm and specify certain options that are related to the selected algorithm. The Learning rate and Momentum options are only enabled when the algorithm is Gradient descent.
Algorithm From this drop-down list, you can select the training algorithm to use. The available algorithms are listed below with a brief description of each; an illustrative sketch follows the list.
  • Gradient descent: Gradient descent is a first-order optimization algorithm that moves incrementally to successively lower points in search space in order to locate a minimum.
  • BFGS: Broyden-Fletcher-Goldfarb-Shanno (BFGS), also known as Quasi-Newton, is a powerful second-order training algorithm with very fast convergence but high memory requirements, because it stores an approximation of the Hessian matrix.
  • Conjugate gradient: Conjugate gradient is a fast training algorithm for multilayer perceptrons that proceeds by a series of line searches through error space. Successive search directions are chosen to be conjugate (non-interfering). It is a good generic algorithm with generally fast convergence.
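These optimizers are standard numerical methods rather than anything specific to SANN. As a rough, hypothetical illustration of how second-order and conjugate-gradient training differ only in the optimizer driving the weight updates, the sketch below trains the same tiny 2-5-1 MLP (the network, data, and all names are assumptions, not SANN code) with SciPy's generic BFGS and CG routines by minimizing the sum-of-squares error over a single flattened weight vector:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                  # toy inputs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]           # toy targets

n_in, n_hid = 2, 5                            # 2-5-1 MLP
n_w = n_in * n_hid + n_hid + n_hid + 1        # total number of weights

def unpack(w):
    """Split the flat weight vector into layer weights and biases."""
    i = 0
    W1 = w[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = w[i:i + n_hid]; i += n_hid
    W2 = w[i:i + n_hid]; i += n_hid
    b2 = w[i]
    return W1, b1, W2, b2

def sse(w):
    """Sum-of-squares network error as a function of the weights."""
    W1, b1, W2, b2 = unpack(w)
    h = np.tanh(X @ W1 + b1)                  # hidden layer
    out = h @ W2 + b2                         # linear output
    return np.sum((out - y) ** 2)

w0 = rng.normal(0.0, 0.1, size=n_w)           # small initial weights
for method in ("BFGS", "CG"):                 # quasi-Newton vs conjugate gradient
    res = minimize(sse, w0, method=method)
    print(method, "final error:", res.fun)
```

Both methods need only an error function of the weights; BFGS additionally builds up its Hessian approximation internally, which is where its extra memory cost comes from.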
Cycles Specifies the number of training cycles for the network. In each training cycle the entire training set is passed through the network and the network error is calculated. This information is then used to adjust the weights so that the error is further reduced.
Learning rate Specifies the learning rate used to adjust the weights. A higher learning rate can speed up convergence but can also make training less stable; too high a value can cause the weights to diverge. Values of 0.1 or lower are reasonably conservative. You can only specify a learning rate when the Gradient descent algorithm has been selected.
Momentum Specifies the momentum. Momentum compensates for slow convergence when weight adjustments are consistently in one direction: the adjustment picks up speed. Momentum usually increases the speed of convergence of Gradient descent considerably, and a higher momentum value can enable you to decrease the learning rate to increase stability without sacrificing much in the way of convergence speed. You can only specify the momentum when the Gradient descent algorithm has been selected.
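To make the interaction of Cycles, Learning rate, and Momentum concrete, here is a minimal sketch of the textbook gradient-descent-with-momentum update, in which a velocity term accumulates adjustments that point consistently in one direction. The names and the exact form of the update rule are illustrative assumptions, not SANN's implementation:

```python
import numpy as np

def gradient_descent(grad, w, cycles=200, learning_rate=0.1, momentum=0.9):
    """Textbook gradient descent with momentum (illustrative only)."""
    velocity = np.zeros_like(w)
    for _ in range(cycles):                       # one step per training cycle
        velocity = momentum * velocity - learning_rate * grad(w)
        w = w + velocity                          # the step picks up speed when
    return w                                      # gradients keep pointing one way

# Example: minimize f(w) = ||w||^2, whose gradient is 2w.
w = gradient_descent(lambda w: 2.0 * w, np.array([3.0, -4.0]))
print(w)   # close to [0, 0]
```

With momentum near 0.9, the steady-state step along a persistent gradient direction is roughly learning_rate / (1 - momentum) times the raw gradient step, which is why the learning rate can be lowered for stability without slowing convergence much.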
Network randomization Use the options in this group box to specify how the weights are initialized at the beginning of training. You can select Normal randomization or Uniform randomization. In addition to selecting a distribution, you must also specify the mean/min and variance/max to use. You can change the default mean/min and variance/max settings, but it is generally recommended that you set the mean/min to zero and the variance/max to no more than 0.1. Small initial weights help the network grow gradually from its linear state (small weight values) to the nonlinear mode (large weight values) needed to model the input-target relationship during training. A brief sketch of these settings follows the Mean/Min and Variance/Max descriptions below.
Normal randomization Uses a normal randomization of weights for the neural network model. A normal distribution (with the mean and variance specified below) is used to draw the initial weight values.
Uniform randomization Uses a uniform randomization of weights for the neural network model. A uniform distribution (with the minimum and maximum specified below) is used to draw the initial weight values.
  • Mean/Min: Specifies either the mean (for the normal distribution) or the minimum value (for the uniform distribution) to use for drawing the initial (that is, before training starts) weight sample.
  • Variance/Max: Specifies either the variance (for the normal distribution) or the maximum value (for the uniform distribution) to use for drawing the initial (that is, before training starts) weight sample.
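The settings above can be mimicked in a few lines of NumPy. This is a minimal sketch under a straightforward reading of the options; the function and parameter names are hypothetical, not SANN's:

```python
import numpy as np

rng = np.random.default_rng()

def init_weights(shape, dist="normal", mean_or_min=0.0, var_or_max=0.1):
    """Draw initial weights; small values keep the network near-linear at first."""
    if dist == "normal":
        # mean_or_min is the mean; var_or_max is the variance (std = sqrt(variance))
        return rng.normal(mean_or_min, np.sqrt(var_or_max), size=shape)
    # uniform: mean_or_min is the minimum; var_or_max is the maximum
    return rng.uniform(mean_or_min, var_or_max, size=shape)

W1 = init_weights((2, 5))                            # normal, mean 0, variance 0.1
W2 = init_weights((5, 1), dist="uniform")            # uniform on [0, 0.1]
```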
Stopping conditions Use the options in this group box to specify the stopping conditions used for early stopping of network training.
Note: Although SANN defaults to using the training set for early stopping when no test sample is selected, it is still possible to use the training set for that purpose while also holding out a sample. To do so, select a validation set instead of a test set. The only difference between the test and validation sets is that the former is used for early stopping, while the latter is never presented to the network while training is in progress. So, by selecting a validation set, you effectively have a test set while still using the training sample for early stopping.
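A minimal sketch of the case split that the note describes, with no test set selected (the sample sizes and names are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
idx = rng.permutation(1000)           # 1000 hypothetical cases

train = idx[:800]       # drives weight updates and (with no test set) early stopping
validation = idx[800:]  # never seen during training: an untouched hold-out sample
```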
Enable stopping conditions Select the check box to apply early stopping to the training of the neural network. Early stopping occurs when the conditions defined below are met by the training algorithm.
Change in error When stopping conditions are applied, network training ends if the average network error improvement over a specified number of training cycles is less than the Change in error value given here.
Window Enter the number of training cycles over which the average improvement in network error must be at least as large as the specified Change in error for training to continue.
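The sketch below shows one plausible reading of how Change in error and Window interact: training stops once the average per-cycle error improvement over the last Window cycles drops below the Change in error threshold. All names here are assumptions, and SANN's exact rule may differ:

```python
def train_with_early_stopping(step, max_cycles=1000,
                              change_in_error=1e-4, window=20):
    """step() runs one training cycle and returns the current network error."""
    errors = []
    for cycle in range(max_cycles):
        errors.append(step())
        if len(errors) > window:
            # average improvement per cycle over the last `window` cycles
            avg_improvement = (errors[-window - 1] - errors[-1]) / window
            if avg_improvement < change_in_error:
                return cycle + 1, errors[-1]     # stop early
    return max_cycles, errors[-1]

# Toy example: an error curve that decays quickly, then plateaus.
err = (1.0 / (t + 1) for t in range(10_000))
cycles_run, final_err = train_with_early_stopping(lambda: next(err))
print(cycles_run, final_err)
```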