Workspace Node: C&RT Regression - Results - Quick Tab
In the C&RT Regression node dialog box, under the Results heading, select the Quick tab to access the following options.
Element Name | Description |
---|---|
Manager | Specify the method to grow the tree. |
Grow tree | Select this option button to automatically grow the current tree using all current settings. After growing the tree, use any of the Tree View options described below. |
Grow tree & prune | Select this option button to automatically grow the tree and then prune it back to a "right-sized" tree, using the pruning criteria specified on the Stopping tab. In classification and regression trees, an automatically grown tree usually needs to be pruned back to a smaller size to avoid overfitting and to obtain good predictive validity (accuracy for predicting new observations). This issue is discussed in greater detail in the General Classification and Regression Trees (GC&RT) Overviews, in particular the Computational Details topic. |
Grow tree 1 level | Select this option button to grow the tree one level down from each of the current terminal nodes. Note that branches will only be grown if this is consistent with the current parameter settings for the current method for growing the tree (e.g., stopping rules as specified on the Stopping tab). |
Tree View | Use the options in this group box to review the current tree. In order to review all details of large trees with many terminal nodes, use the Tree browser option, where each node is represented by a single graph, and where the nodes can be navigated using tree-browser facilities. |
Tree browser | Select this check box to produce a complete representation of the results tree inside a Statistica Workbook-like browser, where every node is represented by a graph containing the respective split rule (unless the respective node is a terminal node) and various summary statistics. Intermediate and terminal nodes are shown in the browser with different symbols. This browser provides a complete summary of the results and enables you to efficiently review even the most complex trees (see also Reviewing Large Trees: Unique Analysis Management Tools). Clicking a node in the tree browser produces a graph that displays the mean and variance of the dependent variable for the selected node, together with a plot of the normal density with these parameters. |
Tree layout | Select this check box to produce a graph showing the structure of the current tree. Each node will be presented as a rectangular box; terminal nodes are highlighted in red and non-terminal nodes are highlighted in blue. |
Tree graph | Select this check box to produce a Tree graph for the current tree. In this graph, each node will be presented as a rectangular box; the terminal nodes are highlighted in red, and the intermediate nodes are highlighted in blue (by default). The following information is usually summarized in this graph: Node ID, the node size, the selected category of the response and the histogram (for classification-type problems) or the mean and variance at the node (for regression-type problems). The graph also contains splitting information for the intermediate nodes - the splitting criterion that created its child nodes and the name of the predictor that was used in the splitting criterion. Note that all labels and legends for the graph are produced as custom text and can be edited, moved, or deleted via the Graph Options dialog box. |
Scrollable tree | Select this check box to display the same Tree graph described above, but in a scrollable window. This option displays a very large graph that can be reviewed (scrolled) "behind" a resizable window. Note that all standard graph tools (editing, zooming, etc.) for customizing the graph and reviewing further details remain available with this method of display. |
Advanced scrollable tree | Select this check box to create a scrollable tree with advanced features: 1) The graph is generated with Min, Max, Mean, and Std. Dev. statistics for each node. 2) Right-clicking a node displays a shortcut menu containing options to produce results or perform operations for the selected node. These options are available only while the analysis is active (that is, not closed). 3) Surrogate information is added to the PMML output. |
Tree structure | Select this check box to produce the Tree Structure spreadsheet, which contains summary information for all splits and the terminal nodes for the current tree. The information available in the tree structure will include for each node:
In addition to the information described above, the tree structure will include information about the mean and variance of the dependent variable for the cases or objects belonging to the node. |
Terminal nodes | Select this check box to display the spreadsheet containing summary information for the terminal nodes only.
For regression problems (continuous dependent variable), the spreadsheet shows the number of cases or objects in each observed class that are sent to the node, and the respective node mean and variance. |
Importance | Select the Importance check box with the spreadsheet icon to produce a spreadsheet that contains the importance ranking on a 0-100 scale for each predictor variable in the analysis. Computational details regarding this measure can be found in Breiman et al. (1984, p. 147). With the results presented in this spreadsheet, you can judge the relative importance of each predictor variable for producing the final tree. See also Predictor Importance in Statistica GC&RT, Interactive Trees, and Boosted Trees. |
Importance | Select the Importance check box with the plot icon to produce a bar graph that shows the importance ranking on a 0-100 scale for each predictor variable considered in the analysis. Use this plot to inspect the relative importance of the predictor variables visually and to identify the most important predictor. See also Predictor Importance in Statistica GC&RT, Interactive Trees, and Boosted Trees. |
Risk estimates | Select this check box to produce a spreadsheet with risk estimates for the analysis sample, the test sample (if one is specified on the Validation tab), and the v-fold cross-validation risk (if v-fold cross-validation is requested on the Validation tab). For regression-type problems with a continuous dependent variable, risk is calculated as the within-node variance. The standard error for the risk estimate is also reported. |
Tree sequence | This option is only available if v-fold cross-validation is specified and the Cross-validate tree sequence check box is selected on the Validation tab. In this case, you can review the cross-validation cost of the entire tree sequence, i.e., for each level of complexity of the tree. |
Predictor details | Select this check box to produce a spreadsheet for each terminal node containing one row for each of the K predictors. Each row of the spreadsheet contains the node ID; the name of the predictor; the splitting condition (i.e., less than cut-off point, etc.) and, in the case of a categorical predictor, the set of its levels leading to the left son; the node ID of the successive nodes (sons) when the splitting condition is/is not satisfied (in the case of non-terminal nodes) or the string "LEAVE" (in the case of a terminal node); the impurity measure for the proposed cut-off (improvement statistics); the number of observations in the node; and the number of observations in the node with missing predictor values. |
Options / C / W | See Common Options. |
OK | Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project. |
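
The Terminal nodes, Risk estimates, and Importance options above report node means and variances, a risk based on within-node variance, and importance scores on a 0-100 scale. The following Python sketch illustrates those quantities on a small made-up data set. It is not Statistica's implementation: the function names, the sample data, the divide-by-n variance, and the rescaling that sets the largest importance to 100 are all assumptions chosen for illustration.

```python
from collections import defaultdict


def node_summaries(node_ids, y):
    """Return {node_id: (n, mean, variance)} for cases grouped by terminal node."""
    groups = defaultdict(list)
    for node, value in zip(node_ids, y):
        groups[node].append(value)
    summaries = {}
    for node, values in groups.items():
        n = len(values)
        mean = sum(values) / n
        # Assumption: population (divide-by-n) variance; the divisor
        # actually used by the software is not stated in this topic.
        variance = sum((v - mean) ** 2 for v in values) / n
        summaries[node] = (n, mean, variance)
    return summaries


def within_node_risk(summaries):
    """Case-weighted average of the within-node variances.

    This equals the mean squared error obtained when every case is
    predicted by the mean of its terminal node.
    """
    total = sum(n for n, _, _ in summaries.values())
    return sum(n * var for n, _, var in summaries.values()) / total


def scale_importance(raw):
    """Rescale raw importance scores so that the largest becomes 100."""
    top = max(raw.values())
    if top == 0:
        return {name: 0.0 for name in raw}
    return {name: 100.0 * score / top for name, score in raw.items()}


# Hypothetical data: terminal node ID and observed response for six cases.
node_ids = [2, 2, 2, 3, 3, 3]
y = [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]

summaries = node_summaries(node_ids, y)
risk = within_node_risk(summaries)
# Hypothetical raw importance scores for two predictors.
scaled = scale_importance({"x1": 4.0, "x2": 1.0})

for node, (n, mean, variance) in sorted(summaries.items()):
    print(f"node {node}: n={n}, mean={mean:.3f}, variance={variance:.3f}")
print(f"risk={risk:.4f}")
print(scaled)
```

In this sketch a tree with purer terminal nodes (smaller within-node variances) yields a lower risk, which is the quantity compared across the analysis, test, and cross-validation samples in the Risk estimates spreadsheet.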