WoE Settings
- Click the Settings button in the Weight of Evidence (WoE) Startup Panel to display the Settings dialog box. Its options set the parameters that control the algorithms used to identify the best grouping for the selected predictor candidates. The Introductory Overview also describes the approaches used in the automated WoE coding module.
- To summarize, by default, Statistica will submit each continuous predictor to the Classification and Regression Trees (C&RT) algorithm to determine an initial best partitioning (grouping) of the values into homogeneous subgroups (terminal tree nodes or bins). The minimum number of bins that will be created can be controlled with the Minimum number of C&RT bins parameter.
- Given the initial grouping, the variables may be submitted to a CHAID algorithm, which is modified to use the Maximum and Minimum WoE values, as well as Minimum N and Bad N criteria to identify groups that can be combined.
- On the other hand, if the Use exhaustive search for continuous variables check box is selected and the number of default bins (from the C&RT analysis) is 20 or fewer, an exhaustive search is performed on all partitions or adjacent partitions for continuous predictors.
- In all cases, the final merging of default groups derives groupings consistent with the different constraints for continuous predictors (see also the WoE Introductory Overview).
Use C&RT preprocessor
Select this check box to use the Classification and Regression Trees (C&RT) algorithm as a preprocessor to determine the best initial grouping of value ranges (default partitioning). If this check box is not selected, the program will instead create 20 bins with approximately equal numbers of observations as the default partitioning.
NOTE: The C&RT algorithm implemented in Statistica is multithreaded and very efficient. We recommend that this default not be reset.
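The fallback binning (20 bins with approximately equal numbers of observations) can be illustrated with a minimal sketch. The function name `equal_frequency_bins` is hypothetical, and the sketch ignores tied values, so it is not a description of how Statistica itself builds these bins.

```python
def equal_frequency_bins(values, n_bins=20):
    """Split values into n_bins ordered groups of approximately equal size.

    Illustrative only: tied values may straddle a bin boundary here,
    which a production implementation would normally avoid.
    """
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    for i in range(n_bins):
        start = i * n // n_bins
        end = (i + 1) * n // n_bins
        if end > start:  # skip empty bins when there are fewer values than bins
            bins.append(ordered[start:end])
    return bins


bins = equal_frequency_bins(range(1000), n_bins=20)
sizes = [len(b) for b in bins]  # every bin holds 50 observations
```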
Reset selections
Click this button to reset the selections on the Settings dialog box back to the original defaults.
Set as default
Click this button to set the options on the Settings dialog box as defaults for future analyses.
CHAID parameters
The CHAID algorithm will:
- Combine or merge (default) partitions or groups if two groups show a difference in WoE values less than the Maximum WoE merge value.
- Split groups if two or more groups can be created with a difference in WoE values greater than the Minimum WoE split value, provided each resulting group has at least WoE minimum N cases and at least WoE minimum Bad N Bad cases.
The CHAID algorithm will not be used if the Use exhaustive search for continuous variables check box (see description below) is selected, and the default number of partitions (groups) is less than or equal to 20.
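The merge rule can be sketched as follows. This is a simplified illustration, not Statistica's modified CHAID algorithm. It assumes the common definition WoE = ln(share of Goods / share of Bads), which this page does not state, and the function names `woe` and `merge_adjacent` are hypothetical. Groups are given as (Good, Bad) counts in predictor order, and only adjacent groups are merged.

```python
import math


def woe(good, bad, total_good, total_bad):
    """WoE under the assumed definition ln(share of Goods / share of Bads).

    Groups with zero Goods or zero Bads are not handled in this sketch.
    """
    return math.log((good / total_good) / (bad / total_bad))


def merge_adjacent(groups, max_woe_merge):
    """Repeatedly merge the adjacent pair with the smallest WoE difference
    while that difference is below max_woe_merge."""
    groups = [tuple(g) for g in groups]
    total_good = sum(g for g, _ in groups)
    total_bad = sum(b for _, b in groups)
    while len(groups) > 1:
        woes = [woe(g, b, total_good, total_bad) for g, b in groups]
        diffs = [abs(woes[i + 1] - woes[i]) for i in range(len(woes) - 1)]
        i = min(range(len(diffs)), key=diffs.__getitem__)
        if diffs[i] >= max_woe_merge:
            break  # no remaining pair is similar enough to merge
        merged = (groups[i][0] + groups[i + 1][0],
                  groups[i][1] + groups[i + 1][1])
        groups[i:i + 2] = [merged]
    return groups


result = merge_adjacent([(100, 10), (95, 11), (60, 30), (20, 40)], 0.3)
# The first two groups have similar WoE values and are combined:
# result == [(195, 21), (60, 30), (20, 40)]
```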
Maximum WoE merge value
Set the maximum difference in WoE values between two groups below which the groups will be merged.
Minimum WoE split value
Set the minimum difference in WoE values that must exist between the two or more groups created by splitting a previously merged group.
WoE minimum Bad N
Set the minimum number of Bad cases (observations) in a group resulting from splitting; if a split would leave any resulting group with fewer Bad cases than this number, the split is not applied.
WoE minimum N
Set the minimum number of cases (observations) in a group resulting from splitting; if a split would leave any resulting group with fewer cases than this number, the split is not applied.
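The two count constraints on a candidate split can be expressed as a simple check. This is a minimal sketch; `split_allowed` and its argument names are hypothetical, and each resulting group is given as a (Good, Bad) count pair.

```python
def split_allowed(children, min_n, min_bad_n):
    """Return True only if every group created by the split has at least
    min_n cases in total and at least min_bad_n Bad cases."""
    return all(good + bad >= min_n and bad >= min_bad_n
               for good, bad in children)


candidate = [(90, 10), (40, 5)]          # (Good, Bad) counts per new group
ok = split_allowed(candidate, min_n=30, min_bad_n=5)       # True
too_few_bads = split_allowed(candidate, min_n=30, min_bad_n=6)  # False
```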
Use exhaustive search for continuous variables
Select this check box to perform an exhaustive search over all default partitions in order to identify partitions that are consistent with the constraints (see also the Introductory Overview). This option will be ignored if the number of default partitions exceeds 20.
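To give a sense of the size of such a search, the sketch below enumerates every way of combining ordered default bins into groups of adjacent bins. It is an illustration under the assumption that only adjacent bins are combined, not Statistica's actual search procedure, and `contiguous_partitions` is a hypothetical name. For k default bins there are 2^(k-1) such groupings, so 20 bins already give 524,288 candidates.

```python
from itertools import combinations


def contiguous_partitions(n_bins):
    """Yield every grouping of n_bins ordered bins into runs of adjacent bins.

    Each grouping is a list of tuples of bin indices.
    """
    for k in range(n_bins):  # k = number of cut points kept
        for cuts in combinations(range(1, n_bins), k):
            edges = (0, *cuts, n_bins)
            yield [tuple(range(a, b)) for a, b in zip(edges, edges[1:])]


parts = list(contiguous_partitions(4))   # 2 ** 3 = 8 groupings
n_candidates_for_20_bins = 2 ** (20 - 1)  # 524288
```

In an actual search, each candidate grouping would then be checked against the constraints described above.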
Log odds plot
Use the options within the Log odds plot group box to specify how to generate log-odds plots for a given predictor variable.
Logodds plot of raw variable
Select this check box to generate the log-odds plot for continuous predictor variables only. The mean of the predictor variable is plotted against the natural log of the ratio of the number of Bads to the number of Goods across all bins.
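The plotted quantities can be sketched as follows, assuming one point per bin: the bin mean of the predictor on one axis and ln(Bads / Goods) for that bin on the other. The function name `log_odds_points` and the input layout are hypothetical, and bins with no Goods or no Bads are not handled.

```python
import math


def log_odds_points(bins):
    """bins: list of bins, each a list of (x, is_bad) pairs.

    Returns one (mean of x, ln(number of Bads / number of Goods)) pair per bin.
    """
    points = []
    for rows in bins:
        n_bad = sum(1 for _, is_bad in rows if is_bad)
        n_good = len(rows) - n_bad
        mean_x = sum(x for x, _ in rows) / len(rows)
        points.append((mean_x, math.log(n_bad / n_good)))
    return points


example_bins = [
    [(1.0, True), (3.0, False)],                   # 1 Bad, 1 Good
    [(10.0, True), (10.0, True), (13.0, False)],   # 2 Bads, 1 Good
]
points = log_odds_points(example_bins)
```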
Construct bins based on above predictor settings
Select this option to create the log-odds plot based on the groups generated by the analysis with the predictor settings above.
Use custom C&RT preprocessor to construct bins
Select this option to generate the bins for the log-odds plot based on the C&RT algorithm, using the custom minimum and maximum number of C&RT bins as specified below. The log-odds plot is based on the bin configuration that generates the largest information value and also satisfies the user-specified minimum and maximum bin constraints.
Minimum number of C&RT bins: Specify the minimum number of C&RT bins for the custom C&RT preprocessor.
Maximum number of C&RT bins: Specify the maximum number of C&RT bins for the custom C&RT preprocessor.
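The selection rule can be illustrated with a minimal sketch. It assumes the common definition of information value, IV = sum over bins of (share of Goods - share of Bads) * ln(share of Goods / share of Bads), which this page does not spell out. The names `information_value` and `best_configuration` are hypothetical, and each bin is a (Good, Bad) count pair.

```python
import math


def information_value(groups):
    """Information value of a bin configuration under the assumed definition.

    Bins with zero Goods or zero Bads are not handled in this sketch.
    """
    total_good = sum(g for g, _ in groups)
    total_bad = sum(b for _, b in groups)
    iv = 0.0
    for g, b in groups:
        pg, pb = g / total_good, b / total_bad
        iv += (pg - pb) * math.log(pg / pb)
    return iv


def best_configuration(candidates, min_bins, max_bins):
    """Pick the candidate with the largest IV whose bin count lies within
    [min_bins, max_bins]; return None if no candidate qualifies."""
    eligible = [c for c in candidates if min_bins <= len(c) <= max_bins]
    return max(eligible, key=information_value) if eligible else None


candidates = [
    [(195, 21), (80, 70)],
    [(100, 10), (95, 11), (60, 30), (20, 40)],
    [(195, 21), (60, 30), (20, 40)],
]
chosen = best_configuration(candidates, min_bins=2, max_bins=3)
```

Because splitting a bin never lowers the information value under this definition, the maximum bin constraint is what keeps the finest configuration from always being chosen.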
Minimum Bad N per level
Specify the minimum number of Bads that each bin must contain for the log-odds plot.