SANN - Data Selection
You can click the OK button in the SANN - Analysis/Deployment Startup Panel to display the SANN - Data Selection dialog box.
This dialog box contains up to four tabs, depending on the type of analysis selected: Quick, Sampling, Subsampling, and Time series. The Time series tab is available for time series analysis. The options described here are available regardless of which tab is selected.
Option | Description |
---|---|
OK | Displays the dialog box for the strategy selected on the Quick tab (either the SANN - Automated Network Search (ANS) dialog box or the SANN - Custom Neural Network dialog box). |
Cancel | Closes the dialog box and returns to the SANN - New Analysis/Deployment Startup Panel. |
Options | See Options Menu for descriptions of the commands on this menu. |
MD handling (inputs) | This group box specifies how to treat cases with missing values in the input variables of the selected models. It is always disabled for time series analysis. There are two options: |
Case selection | Displays the Analysis/Graph Case Selection Conditions dialog box, which is used to create conditions that determine which cases are included in or excluded from the current analysis. More information is available in the case selection conditions overview, syntax summary, and dialog box description. |
Case weights | Displays the Analysis/Graph Case Weights dialog box, which is used to adjust the contribution of individual cases to the outcome of the current analysis by weighting those cases in proportion to the values of a selected variable. In Statistica SANN, case weights are used to encourage a network to emphasize or de-emphasize learning specific cases, or even whole regions of the data set. By default, all data cases have a case weight equal to 1. If a data case is assigned a case weight less than 1, for example 0.5, then the error due to misfitting that case is halved. The network therefore places less emphasis on learning that case, since the penalty for prediction errors is smaller. Similarly, a neural network will fit a data case with a weight of, say, 2 more closely, since the error due to mispredicting that case is doubled. |
Note: The mean substitution option always computes the simple arithmetic mean to replace missing data, even when weights are in effect. Weights in SANN are interpreted as measures of case importance, which means that they affect the estimation of the neural network parameters themselves. If the intention of the weights is to compute a weighted mean (for example, a population average computed using weights) to replace missing data in the input file, use the Data - Data Filtering/Recoding - Replace Missing Data option to replace missing data values with weighted means.
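The difference between the two kinds of mean can be illustrated with a minimal Python sketch. This is not Statistica code: the function names `simple_mean` and `weighted_mean` and the sample data are hypothetical, and the sketch only contrasts an arithmetic mean that ignores weights with a weighted mean.

```python
import math


def simple_mean(values):
    """Arithmetic mean of the non-missing values; case weights are ignored."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present)


def weighted_mean(values, weights):
    """Weighted mean of the non-missing values (hypothetical helper)."""
    pairs = [(v, w) for v, w in zip(values, weights) if not math.isnan(v)]
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


# Hypothetical input variable with one missing value (NaN) and case weights.
values = [2.0, 4.0, float("nan"), 6.0]
weights = [1.0, 1.0, 1.0, 2.0]

simple = simple_mean(values)               # (2 + 4 + 6) / 3
weighted = weighted_mean(values, weights)  # (2*1 + 4*1 + 6*2) / (1 + 1 + 2)

print(simple, weighted)
```

With these values the simple mean is 4.0 and the weighted mean is 4.5, so the value substituted for the missing entry differs depending on which mean is used.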
Note: Weights in SANN are used and interpreted as measures of case importance, which means they affect the estimation of the neural network parameters themselves, but nothing else. For example, case weights are not used in mean substitution of missing data or in calculations of data statistics such as the mean and standard deviation of the variables. If you assign weights to cases in the data set, the neural network algorithm will try to predict cases with higher weights more accurately. This is useful in a number of situations, such as imbalanced data or data sets in which some cases are more important to predict accurately. Data cases with zero weights are excluded from the train, test, and validation samples (that is, they are left out of the analysis). Case weights can be integers or fractional numbers.
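As a rough illustration of how case weights scale the error, consider the following Python sketch. It assumes a squared-error function purely for demonstration; the error function that SANN actually uses is not specified here, and the function name `weighted_squared_errors` and the data are hypothetical.

```python
def weighted_squared_errors(targets, predictions, weights):
    """Return each case's contribution to a weighted squared error.

    Assumption for illustration: the per-case error is
    weight * (target - prediction) ** 2, and cases with zero weight
    are dropped entirely.
    """
    contributions = []
    for target, prediction, weight in zip(targets, predictions, weights):
        if weight == 0:
            continue  # zero-weight cases are excluded from the analysis
        contributions.append(weight * (target - prediction) ** 2)
    return contributions


# Four cases with the same misfit (target 1, prediction 0) but different weights.
targets = [1.0, 1.0, 1.0, 1.0]
predictions = [0.0, 0.0, 0.0, 0.0]
weights = [1.0, 0.5, 2.0, 0.0]

contributions = weighted_squared_errors(targets, predictions, weights)
print(contributions)
```

Under this assumption, the case with weight 0.5 contributes half the error of the default-weight case, the case with weight 2 contributes twice as much, and the zero-weight case contributes nothing because it is skipped.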
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.