CHAID, C & RT, and QUEST

For classification-type problems (categorical dependent variable), all three algorithms (available in Statistica) can be used to build a tree for prediction. QUEST is generally faster than the other two algorithms, however, for very large data sets, the memory requirements are usually larger, so using the QUEST algorithms for classification with very large input data sets may be impractical.

For regression-type problems (continuous dependent variable), the QUEST algorithm is not applicable, so only CHAID and C & RT can be used. CHAID builds non-binary trees that tend to be wider. This has made the CHAID method particularly popular in market research applications: CHAID often yields many terminal nodes connected to a single branch, which can be conveniently summarized in a simple two-way table with multiple categories for each variable or dimension of the table. This type of display matches the requirements for research on market segmentation well, for example, it may yield a split on a variable Income, dividing that variable into 4 categories and groups of individuals belonging to those categories that are different with respect to some important consumer-behavior related variable ( types of cars most likely to be purchased). C & RT will always yield binary trees, which can sometimes not be summarized as efficiently for interpretation and/or presentation.

As far as predictive accuracy is concerned, it is difficult to derive general recommendations, and this issue is still the subject of active research. As a practical matter, since even for very large data sets Statistica will compute results very quickly, it is best to apply different algorithms, perhaps compare them with user-defined interactively derived trees, and decide on the most reasonably and best performing model based on the prediction errors. Of course, Statistica Data Miner includes facilities to combine the prediction from different models - even models that are completely different in nature (e.g., tree classifiers, discriminant function analysis, and Neural Networks) - and experience has shown that those types of predictions are often more accurate than predictions made from any one model. For a discussion of various schemes for combining predictions from different models, see, for example, Witten and Frank, 2000.