Workspace Node: Boosted Classification Trees - Results - Quick Tab
In the Boosted Classification Trees workspace node dialog box, under the Results heading, select the Quick tab to access the following options.
Element Name | Description |
---|---|
Summary | Select this check box to produce a graph of the average squared prediction error over the successive boosting steps; separate lines are displayed in this graph for the training data and the testing data. The graph also indicates the number of trees (boosting steps) that resulted in the lowest average squared error. The solution at that number of trees is likely close to the prediction model with the best predictive validity. |
Risk estimates | Select this check box to produce a spreadsheet with risk estimates for the analysis sample and the test sample. For classification-type problems with a categorical dependent variable and equal misclassification costs, risk is calculated as the proportion of cases incorrectly classified by the trees (in the respective type of sample). If unequal misclassification costs are specified, the risk is adjusted accordingly, i.e., expressed relative to the overall cost. The standard error for the risk estimate is also reported. See also Breiman et al. (1984). |
Bargraph of predictor importance | Select this check box to produce a plot of the importance for each predictor variable in the analysis. The predictor importance is computed as follows: During the building of each tree, for each split, predictor statistics (i.e., sums of squares regression, since simple regression trees are built in all cases) are computed for each predictor variable; the best predictor variable (yielding the best split at the respective node) is then chosen for the actual split. The program also computes the average of the predictor statistic for all variables over all splits and over all trees in the boosting sequence. The final predictor importance values are computed by normalizing those averages so that the highest average is assigned the value of 1, and the importance of every other predictor is expressed as the ratio of its average predictor statistic to that of the most important predictor. |
Predictor importance | Select this check box to produce a spreadsheet that contains the importance values and importance ranking on a 0-100 scale for each predictor variable in the analysis. See the description of the Bargraph of predictor importance option above for additional details. |
Final solution (set of consecutive trees) | Use the options in this group box to review details of the final solution (boosting sequence of simple trees). |
Start of tree graphs/End of tree graphs | The method of stochastic gradient boosting trees (see the Introductory Overview) will generate a sequence of simple trees (the complexity of each tree can be specified on the Specifications - Advanced tab). If you want to review the actual individual trees (as Tree graphs or the Tree structure), specify here the specific numbers of trees you want to review. |
Tree graphs | Select this check box to produce the individual trees from the total sequence of boosting trees, as requested via the Start of tree graphs and End of tree graphs options. |
Tree structures | Select this check box to produce in results spreadsheets the structures of the individual trees from the total sequence of boosting trees, as requested via the Start of tree graphs and End of tree graphs options. |
Options / C / W | See Common Options. |
OK | Click this button to accept all the specifications made in the dialog box and to close it. The analysis results are placed in the Reporting Documents workspace node after running (updating) the project. |
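
The way the Summary graph marks the best number of trees can be illustrated with a minimal sketch. The error values below are hypothetical, the function name `best_number_of_trees` is not part of the program, and the sketch assumes that the minimum is taken over the testing-data curve, which the description above does not state explicitly.

```python
def best_number_of_trees(errors):
    """Return (number_of_trees, error) for the lowest error in the sequence.

    errors[i] is the average squared error after i + 1 boosting steps.
    """
    if not errors:
        raise ValueError("errors must not be empty")
    best_index = min(range(len(errors)), key=errors.__getitem__)
    return best_index + 1, errors[best_index]


# Hypothetical error curves, one value per boosting step.
train_error = [0.25, 0.20, 0.16, 0.13, 0.11, 0.09]
test_error = [0.26, 0.22, 0.19, 0.18, 0.20, 0.23]

# Assumption: the testing-data curve is the one used to pick the solution.
print(best_number_of_trees(test_error))  # (4, 0.18)
```

In this made-up example the training error keeps falling while the testing error starts to rise after four trees, which is the typical pattern that makes the testing-data minimum a sensible stopping point.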
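
For the Risk estimates option with equal misclassification costs, the calculation can be sketched as follows. The function name `risk_estimate` and the data are hypothetical, and the standard error uses the binomial formula sqrt(R(1 - R) / N), which is an assumption here; the program's exact formula is not given above.

```python
import math


def risk_estimate(observed, predicted):
    """Return (risk, standard_error) under equal misclassification costs.

    risk is the proportion of cases whose predicted class differs from the
    observed class. The standard error is an assumed binomial estimate.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    errors = sum(1 for o, p in zip(observed, predicted) if o != p)
    risk = errors / n
    standard_error = math.sqrt(risk * (1.0 - risk) / n)
    return risk, standard_error


# Hypothetical test sample of 10 cases, 2 of them misclassified.
observed = ["yes", "no", "yes", "yes", "no", "no", "yes", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "no"]

risk, se = risk_estimate(observed, predicted)
print(risk)  # 0.2
```

Unequal misclassification costs are left out of this sketch because the adjustment is only described in general terms above.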
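
The normalization described for the Bargraph of predictor importance option can be sketched as below. The input layout (one dictionary of predictor statistics per split, pooled over all trees), the function name `predictor_importance`, and the numbers are all hypothetical; only the averaging and the scaling so that the top predictor equals 1 come from the description above.

```python
from collections import defaultdict


def predictor_importance(split_stats):
    """Average each predictor's statistic over all splits, then scale so the
    largest average equals 1."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stats in split_stats:
        for name, value in stats.items():
            totals[name] += value
            counts[name] += 1
    averages = {name: totals[name] / counts[name] for name in totals}
    top = max(averages.values())
    if top == 0:
        # No predictor contributed anything; report zero importance for all.
        return {name: 0.0 for name in averages}
    return {name: avg / top for name, avg in averages.items()}


# Hypothetical predictor statistics for three splits across the boosting sequence.
splits = [
    {"age": 8.0, "income": 4.0, "region": 1.0},
    {"age": 6.0, "income": 2.0, "region": 1.0},
    {"age": 10.0, "income": 6.0, "region": 1.0},
]

importance = predictor_importance(splits)
# Ranking from most to least important predictor.
ranking = sorted(importance, key=importance.get, reverse=True)
print(importance)  # {'age': 1.0, 'income': 0.5, 'region': 0.125}
print(ranking)     # ['age', 'income', 'region']
```

Because every predictor is scored at every split, not only the one chosen for the split, a variable can receive a high importance value even if it rarely or never appears in the final trees.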