C&RT Quick Specs - Advanced Tab
Select the Advanced tab of the C&RT Quick Specs dialog box to access the advanced estimation options: Number of surrogates and Sigma-restricted parameterization.
At each step of the tree-building process, Statistica identifies the variable whose split most improves the accuracy of prediction. If, for a particular observation (case), the value of the chosen variable is missing, the program looks to the next-best split variable to act as a "surrogate" for the best variable. If the value of that variable is missing as well, the program looks to the third-best split variable, and so on. The Number of surrogates option determines how far down the list of predictors (sorted by the improvement in prediction accuracy that each split candidate provides) the program will go when attempting to find a surrogate for a variable that has missing data for a particular case.
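The fallback logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Statistica's implementation: the function name route_case, the (variable, threshold) representation of a split, and the rule that values at or below the threshold go left are all assumptions made for the example.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical representation: a split is (variable name, threshold),
# and a case goes "left" when its value is <= threshold.
Split = Tuple[str, float]


def route_case(
    case: Dict[str, Optional[float]],
    ranked_splits: Sequence[Split],
    n_surrogates: int,
) -> Optional[Tuple[str, str]]:
    """Return (variable used, side), or None if no usable split was found.

    ranked_splits is assumed to be sorted best-first by improvement in
    prediction accuracy; missing values are represented by None.
    """
    # Consider the best split plus at most n_surrogates fallback splits.
    candidates = ranked_splits[: 1 + n_surrogates]
    for variable, threshold in candidates:
        value = case.get(variable)
        if value is None:
            continue  # value is missing, so try the next-best variable
        side = "left" if value <= threshold else "right"
        return variable, side
    return None  # ran out of allowed surrogates


ranked = [("age", 40.0), ("income", 50000.0), ("tenure", 5.0)]
case = {"age": None, "income": None, "tenure": 3.0}

print(route_case(case, ranked, n_surrogates=1))
print(route_case(case, ranked, n_surrogates=2))
```

With one surrogate allowed, the case cannot be routed because both age and income are missing; with two, the third-ranked variable (tenure) is reached and the case is routed by it.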
Note: Missing data (and surrogate splits). Missing data in predictor variables and surrogate split variables are handled differently in the General Classification and Regression Trees (GC&RT) module than in the Interactive Trees module. Because the Interactive Trees module does not support ANCOVA-like design matrices, it is more flexible in the handling of missing data. Specifically, in GC&RT, observations classified or predicted via surrogate split variables are not included in subsequent tree building, because it would be ambiguous how to construct a unique ANCOVA-like design matrix that includes surrogate split variables. To consider variables (and the missing data for those variables) one by one, and to include observations classified or predicted via surrogate splits in the tree-building process itself, use the Interactive Trees module instead. Refer also to Missing Data in GC&RT, GCHAID, and Interactive Trees for additional details.
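The difference between the two modules can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not either module's actual code: the RoutedCase record, the function cases_for_further_splitting, and the include_surrogate_routed flag are hypothetical names introduced here.

```python
from typing import Dict, List, NamedTuple


class RoutedCase(NamedTuple):
    case_id: int
    side: str            # "left" or "right" child node
    via_surrogate: bool  # True if a surrogate split variable was used


def cases_for_further_splitting(
    routed: List[RoutedCase],
    include_surrogate_routed: bool,
) -> Dict[str, List[int]]:
    """Collect the case ids each child node would use for further splits."""
    children: Dict[str, List[int]] = {"left": [], "right": []}
    for r in routed:
        # GC&RT-like behavior (flag False): cases routed via a surrogate
        # are left out of subsequent tree building.
        if r.via_surrogate and not include_surrogate_routed:
            continue
        children[r.side].append(r.case_id)
    return children


routed = [
    RoutedCase(1, "left", False),
    RoutedCase(2, "left", True),
    RoutedCase(3, "right", False),
]

print(cases_for_further_splitting(routed, include_surrogate_routed=False))
print(cases_for_further_splitting(routed, include_surrogate_routed=True))
```

With the flag off, case 2 is still assigned to a node but does not take part in building the subtree below it; with the flag on, as described for the Interactive Trees module, it does.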