Workspace Node: C&RT Classification - Specifications - Stopping Tab
In the C&RT Classification node dialog box, under the Specifications heading, select the Stopping tab to access the following options.
The size of the tree is an important issue in computing classification trees. A tree that grows too large can make the results difficult to interpret. Use the options on the Stopping tab to keep the size of the tree in check. This tab contains two group boxes, Stopping rule and Stopping parameters, whose options determine the criterion for selecting the right-sized tree.
Element Name | Description |
---|---|
Stopping rule | If the dependent variable for the current analysis is categorical in nature, this box will contain three stopping rules: Prune on misclassification error, Prune on deviance, and FACT-style direct stopping. If the dependent variable is continuous, two stopping rules are available: Prune on variance and FACT-style direct stopping. Refer to Computational Details for details concerning these stopping rules; see also Ripley (1996) for detailed discussions of these measures. |
Prune on misclassification error | Select the Prune on misclassification error option button to prune the tree on the basis of misclassification error. When priors are estimated and misclassification costs are equal, the costs used by this option equal the misclassification rate. |
Prune on deviance | Deviance is a measure of fit that is based on the likelihood principle. This option will use the difference between the log-likelihood of the best model and the current model as a basis for pruning when the dependent variable is categorical (see Ripley, 1996). Select the Prune on deviance option button to prune the trees on the basis of deviance. |
FACT-style direct stopping | Select this option to directly stop the growth of the tree based upon a fraction of cases (regression) or a fraction of cases within a specific category of the response (classification). This is in contrast to the other stopping rules, which all involve pruning a tree, that is, growing the tree too large and then pruning the tree until only the root node remains. This process of pruning creates a sequence of trees that vary in size from the largest tree grown to the root node. By using v-fold cross validation coupled with the standard error rule, STATISTICA will select the optimal tree in this sequence. With the FACT-style direct stopping method, the process of pruning is completely omitted and the growth of the tree is based solely on the fraction of objects option (see below). |
Stopping parameters | Use these options to control when split selection stops and, if a pruning method is selected as the Stopping rule, when pruning begins and which pruned tree is selected as the right-sized tree. |
Minimum n (%) of cases | If a pruning method is selected in the Stopping rule group box, i.e., Prune on misclassification error or Prune on deviance, enter a value in the Minimum n of cases box. If a node contains fewer observations than this value, the node will not be considered for splitting. |
Maximum n of levels | Specify the maximum number (n) of levels in the tree. |
Fraction of objects | If FACT-style direct stopping is selected as the Stopping rule (see above), the value in the Fraction of objects box is used to stop the growth of the tree. For classification problems, a node will not be split if any of the relative frequencies of the levels of the categorical response fall at or below the value in the Fraction of objects box. |
Minimum n in child nodes | If a pruning method is selected in the Stopping rule group box, i.e., Prune on misclassification error or Prune on deviance, use this option to set the smallest permissible number of cases in a child node for a split to be applied. While the Minimum n of cases parameter determines whether an additional split is considered at a particular node, the Minimum n in child nodes parameter determines whether a split is applied: the split is not applied if either of the two resulting child nodes would contain fewer cases than the n specified via this option. |
Maximum n of nodes | The value supplied in this box will be used for stopping on the basis of the number of nodes in the classification tree. Each time a parent node is split, the total number of nodes in the tree is examined, and the splitting is stopped if this number exceeds the number specified in the Maximum n of nodes box. |
Options / C / W | See Common Options. |
OK | Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project. |
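
The way the count-based stopping parameters in the table above interact can be illustrated with a short sketch. This is not STATISTICA code; the class name, function names, and argument names are hypothetical, and the sketch assumes that the root node is level 1 and that a node at the maximum level is not split further. It only mirrors the rules as they are described on this tab.

```python
from dataclasses import dataclass


@dataclass
class StoppingParams:
    """Hypothetical container mirroring the Stopping parameters group box."""
    min_n: int        # Minimum n of cases
    min_n_child: int  # Minimum n in child nodes
    max_levels: int   # Maximum n of levels
    max_nodes: int    # Maximum n of nodes


def may_consider_split(node_n, node_level, total_nodes, params):
    """Return True if a node is still eligible to be examined for a split.

    node_n      -- number of cases in the node
    node_level  -- depth of the node (assumption: the root is level 1)
    total_nodes -- current number of nodes in the whole tree
    """
    # Minimum n of cases: nodes with fewer cases are not considered.
    if node_n < params.min_n:
        return False
    # Maximum n of levels: splitting here would add a level beyond the limit.
    if node_level >= params.max_levels:
        return False
    # Maximum n of nodes: stop once the node count exceeds the limit.
    if total_nodes > params.max_nodes:
        return False
    return True


def may_apply_split(left_n, right_n, params):
    """Return True if neither child node would be smaller than the minimum."""
    return min(left_n, right_n) >= params.min_n_child


# Example with made-up settings.
params = StoppingParams(min_n=10, min_n_child=5, max_levels=3, max_nodes=7)
eligible = may_consider_split(node_n=50, node_level=1, total_nodes=1, params=params)
applied = may_apply_split(left_n=4, right_n=46, params=params)
```

In this example the root node is eligible for splitting, but a candidate split that leaves only 4 cases in one child is rejected because it falls below the Minimum n in child nodes value of 5. The two checks are kept separate on purpose: the first decides whether a split is searched for at all, the second decides whether a particular split is applied.
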
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.