Workspace Node: C&RT Classification - Results - Quick Tab

In the C&RT Classification workspace node dialog box, under the Results heading, select the Quick tab to access the following options.

Element Name	Description
Grow tree	Select this option button to automatically grow the current tree, using all current settings. After growing the tree, use any of the Review tree options described below to further review the tree. You can also prune the tree using the various methods and options available in this dialog box.
Grow tree & prune	Select this option button to automatically grow the tree and apply pruning to select a "right-sized" tree given the criteria for pruning as specified on the Stopping tab. In general, in classification and regression trees, after automatically growing the tree, it usually needs to be pruned back to a smaller size to avoid overfitting and to derive a tree with good predictive validity (accuracy for predicting new observations). This issue is discussed in greater detail in the General Classification and Regression Trees (GC&RT) Overviews, in particular the Computational Details topic.
Grow tree 1 level	Select this option button to grow the tree one level down from each of the current terminal nodes. Note that branches will only be grown if this is consistent with the current parameter settings for the current method for growing the tree (e.g., stopping rules as specified on the Stopping tab).
Tree View	Use the options in this group box to review the current tree. To review all details of large trees with many terminal nodes, use the Tree browser option, in which each node is represented by a single graph; after the project has been run, the nodes can be navigated using the tree-browser facilities in the Reporting Documents workspace node.
Tree browser	Select this check box to produce a complete representation of the results tree inside a Statistica workbook-like browser, where every node is represented by a graph containing the respective split rule (unless the respective node is a terminal node) and various summary statistics. Intermediate (split) nodes and terminal nodes are shown in the browser with different symbols. You can efficiently review even the most complex trees (see also Reviewing Large Trees: Unique Analysis Management Tools). Clicking a node in the tree browser produces a graph displaying the number of cases in each category of the variable as well as the histogram of statistics for the selected node.
Tree graph	Select this check box to produce the Tree graph for the current tree. In this graph, each node will be presented as a rectangular box; the terminal nodes are highlighted in red, and the intermediate nodes are highlighted in blue (by default). The following information is usually summarized in this graph: Node ID, the node size, the selected category of the response and the histogram. The graph also contains splitting information for the intermediate nodes - the splitting criterion that created its child nodes and the name of the predictor that was used in the splitting criterion. Note that all labels and legends for the graph are produced as custom text and can be edited, moved, or deleted via the Graph Options dialog box.
Tree layout	Select this check box to produce a graph showing the structure of the current tree. Each node will be presented as a rectangular box; terminal nodes are highlighted in red and non-terminal nodes are highlighted in blue.
Scrollable tree	Select this check box to produce the same tree graph as described above, but in a scrollable window. This option will display a very large graph that can be reviewed (scrolled) "behind" a (resizable) window. Note that all standard graphics editing, zooming, etc., tools for customization and reviewing further details of the graph are still available for this method of display.
Advanced scrollable tree	Select this check box to create a scrollable tree with advanced features: 1) The graph tabulates each category with its count and percentage; if the number of categories exceeds 5, "..." is displayed at the end to indicate that there are more categories. 2) The predicted category is marked with an asterisk ( * ) in the table. 3) A ToolTip is displayed when the mouse pointer hovers over a node. 4) Right-clicking a node displays a shortcut menu containing options to produce results or perform operations for the selected node. These options are available only while the analysis is active and has not been closed.

Tree structure	Select this check box to produce the Tree Structure spreadsheet, which contains summary information for all splits and the terminal nodes for the current tree. For each node, the tree structure includes: (1) the node IDs of the child nodes to which cases or objects are sent, depending on whether they satisfy (left branch) or do not satisfy (right branch) the split condition at a split node; (2) the number of cases or objects belonging to the node; and (3) information detailing the split condition for a split node. Note that no child nodes or split conditions are displayed for the terminal nodes of the tree. In addition, the tree structure includes the number of cases or objects in each observed class that are sent to the node. Alternatively, in the case of a continuous response variable (regression), the tree structure contains the mean and variance of the dependent variable for the cases or objects belonging to the node.
Importance	Select this check box (with the spreadsheet icon) to create a spreadsheet that contains the importance ranking on a 0-100 scale for each predictor variable in the analysis. Computational details regarding this measure can be found in Breiman (1984; p. 147). In general, with the results presented in this spreadsheet, you can judge the relative importance of each predictor variable for producing the final tree. Refer to the discussion in Breiman (1984) for details.
Risk estimates	Select this check box to produce a spreadsheet with risk estimates for the analysis sample, the test sample (if one is specified on the Validation tab), and the v-fold cross-validation risk (if v-fold cross-validation is requested on the Validation tab). For classification-type problems with a categorical dependent variable and equal misclassification costs, risk is calculated as the proportion of cases incorrectly classified by the tree (in the respective type of sample); if unequal misclassification costs are specified, the risk is adjusted accordingly, i.e., expressed relative to the overall cost.
Predictor details	Select this check box to produce a spreadsheet for each terminal node containing one row for each of the K predictors. Each row of the spreadsheet contains the node ID; the name of the predictor; the splitting condition (i.e., less than cut-off point, etc.) and, in the case of a categorical predictor, the set of its levels leading to the left son; the node ID of the successive nodes (sons) when the splitting condition is/is not satisfied (in the case of non-terminal nodes) or the string "LEAVE" (in the case of a terminal node); the impurity measure for the proposed cut-off (improvement statistics); the number of observations in the node; and the number of observations in the node with missing predictor values.
Terminal nodes	Select this check box to produce a spreadsheet containing summary information for the terminal nodes only. The spreadsheet shows the number of cases or objects in each observed class that are sent to the node.
Importance	Select this check box (with the plot icon) to produce a bar graph that shows the importance ranking on a 0-100 scale for each predictor variable considered in the analysis. Use this plot to visually inspect the relative importance of the predictor variables and thus to identify the most important predictor. See also Predictor Importance in Statistica GC&RT, Interactive Trees, and Boosted Trees.
Tree sequence	This option is only available if v-fold cross-validation is specified and the Cross-validate tree sequence check box is selected on the Validation tab. In this case, you can review the cross-validation cost of the entire tree sequence, i.e., for each level of complexity of the tree. Note that these results are only available until the first time you change the tree by manually removing branches or adding splits: At that point, the tree sequence is no longer valid, and this option will no longer be available until you grow and cross-validate the tree sequence again.
Options / C / W	See Common Options.
OK	Click this button to accept all the specifications made in the dialog box and to close it. The analysis results are placed in the Reporting Documents workspace node after running (updating) the project.
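
To make the Tree structure description above more concrete, the following Python sketch models one spreadsheet row per node (node ID, child IDs, number of cases, split condition, and per-class counts) and sends a case down the left branch when it satisfies the split condition and down the right branch otherwise. This is only an illustration, not Statistica's internal format or API: the names `Node` and `route`, the predictor `PETALLEN`, the "less than or equal to" form of the condition, and all counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """One row of a (hypothetical) tree structure table."""
    node_id: int
    n_cases: int
    class_counts: Dict[str, int]          # cases per observed class in this node
    split_var: Optional[str] = None       # None marks a terminal node
    split_value: Optional[float] = None
    left_id: Optional[int] = None         # child when the condition is satisfied
    right_id: Optional[int] = None        # child when it is not satisfied

    @property
    def is_terminal(self) -> bool:
        return self.split_var is None

    def predicted_class(self) -> str:
        # Assumption: the predicted class is the one with the largest count.
        return max(self.class_counts, key=self.class_counts.get)


def route(tree: Dict[int, Node], case: Dict[str, float], root_id: int = 1) -> Node:
    """Follow split conditions from the root until a terminal node is reached."""
    node = tree[root_id]
    while not node.is_terminal:
        satisfied = case[node.split_var] <= node.split_value
        node = tree[node.left_id if satisfied else node.right_id]
    return node


# Hypothetical three-node tree: one split node and two terminal nodes.
tree = {
    1: Node(1, 100, {"A": 50, "B": 50}, "PETALLEN", 2.5, left_id=2, right_id=3),
    2: Node(2, 48, {"A": 46, "B": 2}),
    3: Node(3, 52, {"A": 4, "B": 48}),
}

leaf = route(tree, {"PETALLEN": 1.9})
print(leaf.node_id, leaf.predicted_class())
```

Terminal nodes carry no child IDs or split condition, which mirrors the note that these fields are not displayed for terminal nodes.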

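The Risk estimates entry above can be illustrated with a short Python sketch. With equal misclassification costs, the risk is the proportion of cases the tree classifies incorrectly. For unequal costs, the sketch assumes the risk is the average misclassification cost per case; the exact normalization Statistica applies is not reproduced here. The function name `risk`, the cost table, and the data are hypothetical.

```python
def risk(observed, predicted, costs=None):
    """Average misclassification cost per case.

    With costs=None every error costs 1, so the result is the
    proportion of misclassified cases.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    total = 0.0
    for obs, pred in zip(observed, predicted):
        if obs != pred:
            # costs maps (observed class, predicted class) to a cost
            total += 1.0 if costs is None else costs[(obs, pred)]
    return total / len(observed)


# Hypothetical sample: 8 cases, 3 of them misclassified.
observed = ["A", "A", "A", "B", "B", "B", "B", "B"]
predicted = ["A", "A", "B", "B", "B", "B", "A", "A"]

equal_cost_risk = risk(observed, predicted)

# Assumed costs: calling an "A" case "B" is five times as costly as the reverse.
costs = {("A", "B"): 5.0, ("B", "A"): 1.0}
weighted_risk = risk(observed, predicted, costs)

print(equal_cost_risk, weighted_risk)
```

The same function can be applied separately to the analysis sample and to a test sample to compare the two estimates.
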
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.