Program Overview
The Interactive Trees (C&RT, CHAID) module contains complete implementations of the classification and regression trees (C&RT) algorithms popularized by Breiman, Friedman, Olshen, and Stone (1984; see also Ripley, 1996), as well as the CHAID and Exhaustive CHAID methods (Chi-square Automatic Interaction Detector; see Kass, 1980). The general computational methods are largely identical to those implemented in the General Classification and Regression Trees (GC&RT) and General CHAID (GCHAID) Models modules of STATISTICA, where they are described in greater detail.
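To give a rough idea of the kind of split search a C&RT-style classification tree performs, the following minimal sketch evaluates every candidate threshold on one continuous predictor and keeps the one with the lowest weighted Gini impurity. It is an illustration only, not the module's implementation: the function names gini and best_split are hypothetical, the Gini measure is just one common goodness-of-fit criterion, and CHAID selects splits with chi-square tests instead.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best split 'x <= threshold'.

    Returns (None, impurity_of_unsplit_node) when no split improves on the
    unsplit node. Hypothetical helper for illustration only.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2.0  # midpoint between adjacent distinct values
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

A tree-building procedure would repeat such a search over all candidate predictors at each node and then recurse into the resulting child nodes.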
The Interactive Trees (C&RT, CHAID) module provides a large number of options that let you interactively control all aspects of the tree-building process. You can select the variable to use for each split (branch) from a list of suggested variables, determine how and where to split a variable, grow the tree interactively branch by branch or level by level, grow the entire tree automatically, delete ("prune back") individual branches, and more. All of these options are available in an efficient graphical user interface in which you can "brush" the current tree, i.e., select a specific node in order to grow a branch from it, delete the branch below it, etc.
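The grow and prune operations described above can be pictured with a very small data structure. The sketch below is an assumption made for illustration, not the module's internal representation: the class Node and its methods grow, prune, and count_leaves are hypothetical names, and only binary splits on a numeric threshold are modeled.

```python
class Node:
    """One node of an interactively built tree (hypothetical sketch)."""

    def __init__(self, rows):
        self.rows = rows      # indices of the observations that fall in this node
        self.split = None     # (variable, threshold) once the node has been split
        self.children = []    # an empty list marks a terminal node

    def grow(self, data, variable, threshold):
        """Split this node on a user-chosen variable and threshold."""
        left = [i for i in self.rows if data[i][variable] <= threshold]
        right = [i for i in self.rows if data[i][variable] > threshold]
        self.split = (variable, threshold)
        self.children = [Node(left), Node(right)]
        return self.children

    def prune(self):
        """Delete the branch below this node, making it terminal again."""
        self.split = None
        self.children = []

    def count_leaves(self):
        """Number of terminal nodes at or below this node."""
        if not self.children:
            return 1
        return sum(child.count_leaves() for child in self.children)
```

Growing "branch by branch" then corresponds to calling grow on one selected node at a time, and pruning back a branch corresponds to calling prune on the node at its root.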
Numerous results options, similar to those available in the General Classification and Regression Trees (GC&RT) and General CHAID (GCHAID) Models modules, are provided for displaying and reviewing the tree. Trees can be reviewed in tree diagrams (graphs) or in the unique Tree Browser user interface (see General Computation Issues and Unique Solutions of STATISTICA GCHAID), which resembles the browser (hierarchical folder structure) available in STATISTICA Workbooks. In addition, various auxiliary results tables and graphs allow you to examine all details of the results.
As in all modules for predictive data mining, the decision rules contained in the final tree built for regression or classification prediction can optionally be saved in a variety of formats for deployment in data mining projects, including C/C++ code, STATISTICA Visual Basic code, and PMML. Hence, final trees computed via this module can quickly and efficiently be turned into solutions for predicting or classifying new observations.
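To illustrate what "decision rules" extracted from a final tree amount to, the sketch below flattens a small tree into IF/THEN rules and applies the same tree to a new observation. Everything in it is assumed for the sake of the example: the nested-dictionary tree layout, the function names to_rules and predict, the variables age and income, and the rule text format are hypothetical and do not reproduce the C/C++, STATISTICA Visual Basic, or PMML output that the module generates.

```python
def to_rules(tree, conditions=()):
    """Flatten a nested tree into a list of 'IF ... THEN ...' rule strings."""
    if "predict" in tree:  # terminal node: emit one rule for the path so far
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {tree['predict']}"]
    var, thr = tree["variable"], tree["threshold"]
    return (to_rules(tree["left"], conditions + (f"{var} <= {thr}",))
            + to_rules(tree["right"], conditions + (f"{var} > {thr}",)))


def predict(tree, case):
    """Follow the splits for one observation until a terminal node is reached."""
    while "predict" not in tree:
        branch = "left" if case[tree["variable"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["predict"]


# Hypothetical final tree with two splits and three terminal nodes.
example_tree = {
    "variable": "age", "threshold": 40,
    "left": {"predict": "low"},
    "right": {
        "variable": "income", "threshold": 60,
        "left": {"predict": "medium"},
        "right": {"predict": "high"},
    },
}
```

Each terminal node yields exactly one rule, so the number of rules equals the number of terminal nodes in the tree.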