Options for C&RT
Number of surrogates. By choosing "similar" predictors (surrogates) with valid data, cases (observations) with missing data can be classified and therefore included in the analysis. Cases with missing values in the response are treated as "prediction samples," and cases with missing values in a predictor are treated as "surrogate samples." The entry in the Number of surrogates box sets the maximum number of surrogates that the analysis can choose during the tree-building process. By default, the number of surrogates is 0 (zero), and cases with missing data values are excluded from the analysis.
In general, at every step of the tree-building process, STATISTICA identifies the variable for the next split that most improves the accuracy of prediction. If the value of the chosen variable is missing for a particular observation (case), the program looks to the next-best split variable to act as a "surrogate" for the best variable. If the value of that variable is missing as well, the program looks to the third-best split variable, and so on. The Number of surrogates option determines how far down the list of predictors (sorted by the improvement in prediction accuracy provided by each respective split candidate) the program will go when attempting to find a surrogate for a variable that has missing data for a particular case.
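The fallback logic described above can be illustrated with a minimal Python sketch. This is not STATISTICA's implementation. The function name `route_case`, the `(predictor, threshold)` representation of a split, and the example predictors are all hypothetical, and the sketch assumes simple "value <= threshold goes left" splits that are already ranked from best to worst.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical split candidate: (predictor name, threshold).
# A case goes "left" when its value is <= threshold, otherwise "right".
Split = Tuple[str, float]


def route_case(case: dict, ranked_splits: Sequence[Split],
               n_surrogates: int) -> Optional[str]:
    """Route one case at a node, falling back to surrogate splits.

    ranked_splits is assumed to be sorted from best to worst split.
    Returns "left" or "right", or None when the primary split and all
    permitted surrogates have missing values for this case.
    """
    # Consider the primary split plus at most n_surrogates fallbacks.
    candidates = ranked_splits[: 1 + n_surrogates]
    for name, threshold in candidates:
        value = case.get(name)
        if value is None:
            # Missing value: move on to the next-best split variable.
            continue
        return "left" if value <= threshold else "right"
    return None


# Illustrative data only: the two best split variables are missing.
ranked = [("age", 40.0), ("income", 50000.0), ("tenure", 5.0)]
case = {"age": None, "income": None, "tenure": 3.0}

print(route_case(case, ranked, 0))  # None
print(route_case(case, ranked, 1))  # None
print(route_case(case, ranked, 2))  # left
```

With 0 surrogates (the default described above), the case cannot be routed and would be excluded. Only when the limit reaches 2 does the search extend far enough down the ranked list to find a predictor with a valid value.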
Collect sensitivity analysis data. When this check box is selected, the Pred. stats & details spreadsheet can be created from the ITrees C&RT Results dialog box - Manager tab. If continuous variables are selected, the Sensitivity graph and the Sensitivity by rank graph can also be created from the same tab. If this check box is cleared, the Pred. stats & details, Sensitivity, and Sensitivity by rank buttons are dimmed.