Workspace Node: C&RT Regression - Specifications - Stopping Tab

In the C&RT Regression node dialog box, under the Specifications heading, select the Stopping tab to access the following options.

Tree size is an important consideration when computing Classification and Regression Trees: a tree that grows too large makes the results difficult to interpret. Use the options on the Stopping tab to control the size of the tree. The tab contains two group boxes, Stopping rule and Stopping parameters, whose options set the criterion for selecting the right-sized tree.

Element Name Description
Stopping rule If the dependent variable for the current analysis is categorical in nature, and the objective of the analysis is to classify cases (observations) into the categories defined in the dependent variable, this box will contain three stopping rules: Prune on misclassification error, Prune on deviance, and FACT-style direct stopping. If the dependent variable is continuous, two stopping rules are available: Prune on variance and FACT-style direct stopping. Refer to Computational Details for information concerning these stopping rules; see also Ripley (1996) for detailed discussions of these measures.
Prune on variance One way to control the size of the tree is pruning, i.e., removing parts of the tree with the aim of arriving at the right-sized tree. If the dependent variable is continuous (regression), the measure used for pruning is the variance of the cases in a node. Select the Prune on variance option button to prune on the basis of variance.
Prune on misclassification error This option uses costs that equal the misclassification rate when priors are estimated and misclassification costs are equal. Select the Prune on misclassification error option button to prune on the basis of misclassification error.
Prune on deviance Deviance is a measure of fit that is based on the likelihood principle. This option will use the difference between the log-likelihood of the best model and the current model as a basis for pruning when the dependent variable is categorical (see Ripley, 1996). Select the Prune on deviance option button to prune the trees on the basis of deviance.
FACT-style direct stopping Select this option to directly stop the growth of the tree based upon a fraction of cases (regression) or a fraction of cases within a specific category of the response (classification). This is in contrast to the other stopping rules, which all involve pruning, that is, growing the tree too large and then pruning it back step by step until only the root node remains. Pruning creates a sequence of trees that vary in size from the largest tree grown down to the root node. By using v-fold crossvalidation coupled with the standard error rule, Statistica selects the optimal tree in this sequence. With the FACT-style direct stopping method, pruning is omitted entirely and the growth of the tree is controlled solely by the Fraction of objects option (see below).
Stopping parameters Use these options to control when split selection stops and, if a pruning method is selected as the Stopping rule, when pruning begins and which pruned tree is selected as the right-sized tree.
Minimum n (%) of cases If a pruning method is selected in the Stopping rule group box (Prune on variance, Prune on misclassification error, or Prune on deviance), enter a value for the Minimum n of cases. If the number of observations in a node is less than this value, the node will not be considered for splitting.
Maximum n of levels Use this option to specify the maximum number (n) of levels in the tree.
Fraction of objects If FACT-style direct stopping is selected as the Stopping rule (see above), the value in the Fraction of objects box is used to stop the growth of the tree. For regression problems, a node will not be split if the relative frequency within the node falls at or below the value in the Fraction of objects box.
Minimum n in child nodes If a pruning method is selected in the Stopping rule group box (Prune on variance, Prune on misclassification error, or Prune on deviance), use this option to set the smallest permissible number of cases in a child node for a split to be applied. While the Minimum n of cases parameter determines whether an additional split is considered at a particular node, the Minimum n in child nodes parameter determines whether a split is actually applied: the split is rejected if either of the two resulting child nodes would have fewer cases than the n specified via this option.
Maximum n of nodes The value supplied in this box is used to stop splitting on the basis of the number of nodes in the tree. Each time a parent node is split, the total number of nodes in the tree is examined, and splitting stops if this number exceeds the value specified in the Maximum n of nodes box.
Options / C / W See Common Options.
OK Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project.
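The stopping parameters described in the table above can be read as a series of checks made before a node is split, followed by a selection step when pruning is used. The Python sketch below illustrates that logic only; it is not Statistica's implementation. The names (StoppingParams, may_consider_split, and so on) and the default values are hypothetical. It further assumes that levels are counted from 0 at the root, that the "relative frequency" used by FACT-style direct stopping is the node's share of all cases, and that the standard error rule is the usual one from the C&RT literature: pick the smallest tree whose crossvalidation cost is within a multiple of the standard error of the minimum cost.

```python
from dataclasses import dataclass


@dataclass
class StoppingParams:
    """Hypothetical container mirroring the Stopping parameters options.

    Default values are placeholders, not Statistica's defaults.
    """
    min_n_cases: int = 10    # Minimum n of cases
    min_n_child: int = 5     # Minimum n in child nodes
    max_levels: int = 10     # Maximum n of levels
    max_nodes: int = 1000    # Maximum n of nodes


def may_consider_split(n_node, level, n_nodes_in_tree, p):
    """Return True if a node is still eligible to be considered for splitting."""
    if n_node < p.min_n_cases:
        return False  # too few cases in the node
    if level + 1 > p.max_levels:
        return False  # children would exceed the level limit (root = level 0, assumed)
    if n_nodes_in_tree > p.max_nodes:
        return False  # node count already exceeds the limit
    return True


def may_apply_split(n_left, n_right, p):
    """Return True if neither child node would have fewer cases than the minimum."""
    return min(n_left, n_right) >= p.min_n_child


def fact_allows_split(n_node, n_total, fraction):
    """FACT-style direct stopping for regression (assumed reading).

    The node is not split once its share of all cases is at or below
    the Fraction of objects value.
    """
    return n_node / n_total > fraction


def select_tree_by_se_rule(cv_costs, cv_ses, sizes, se_multiplier=1.0):
    """Return the index of the smallest tree within the standard error band.

    cv_costs, cv_ses, and sizes describe the pruned sequence, one entry per tree.
    """
    best = min(range(len(cv_costs)), key=lambda i: cv_costs[i])
    threshold = cv_costs[best] + se_multiplier * cv_ses[best]
    candidates = [i for i in range(len(cv_costs)) if cv_costs[i] <= threshold]
    return min(candidates, key=lambda i: sizes[i])
```

For example, with StoppingParams(min_n_cases=10, min_n_child=5), a node with 20 cases can be considered for splitting, but a candidate split into 4 and 16 cases would be rejected because one child falls below the child-node minimum.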