Boosted Trees Results - Quick Tab
Select the Quick tab of the Boosted Trees Results dialog box to access options to review the most important results statistics and graphs for the current analysis. The options available on this tab depend on the Type of analysis selected on the Boosted Trees Startup Panel - Quick tab, i.e., whether the current analysis is a Classification Analysis or Regression Analysis.
- Summary
- Click the Summary button to display a graph of the average squared prediction error over the successive boosting steps; separate lines will be displayed in this graph for the training data and the testing data.
The graph also indicates the number of trees (boosting steps) that produced the lowest average squared error. That solution is likely close to the model with the best predictive validity. By default, this number of trees is entered in the Number of trees field on the Boosted Trees Results dialog box.
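The selection of the default number of trees can be sketched as follows. This is a minimal illustration, not the program's actual code: it assumes the minimum is taken over the testing-data curve (the text above does not say which curve is used), and the function name and error values are hypothetical.

```python
def best_number_of_trees(test_errors):
    """Return (number of trees, error) for the lowest error in the curve.

    test_errors[i] is the average squared error after i + 1 boosting steps.
    Ties resolve to the smallest number of trees.
    """
    if not test_errors:
        raise ValueError("test_errors must not be empty")
    best_index = min(range(len(test_errors)), key=test_errors.__getitem__)
    return best_index + 1, test_errors[best_index]


# Hypothetical error curves, one value per boosting step.
train_errors = [0.90, 0.62, 0.45, 0.36, 0.30, 0.26, 0.23]
test_errors = [0.95, 0.70, 0.55, 0.49, 0.47, 0.50, 0.54]

n_trees, error = best_number_of_trees(test_errors)
print(n_trees, error)
```

In this made-up example the training error keeps falling while the testing error starts to rise after the fifth step, which is the typical sign of overfitting that the graph is meant to reveal.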
- Risk estimates
- Click this button to display a spreadsheet with risk estimates for the analysis sample and the test sample. For classification-type problems (see the description of the Boosted Trees Startup Panel - Quick tab) with a categorical dependent variable and equal misclassification costs (see the description of the Boosted Trees Specifications dialog box - Classification tab), risk is calculated as the proportion of cases incorrectly classified by the trees in the respective sample. If unequal misclassification costs are specified, the risk is adjusted accordingly, i.e., expressed relative to the overall cost. For regression-type problems with a continuous dependent variable, risk is calculated as the residual variance. The standard error for the risk estimate is also reported. See also Breiman et al. (1984) and General Classification and Regression Trees (GC&RT).
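The two basic risk measures described above can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the function names are hypothetical, only the equal-cost classification case is shown, the standard error uses the usual binomial formula for a proportion (an assumption, since the text does not give the formula), and the residual variance is taken as the mean squared residual.

```python
import math


def classification_risk(observed, predicted):
    """Proportion of misclassified cases and its (assumed binomial) standard error."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    errors = sum(1 for o, p in zip(observed, predicted) if o != p)
    risk = errors / n
    # Assumption: standard error of a proportion, sqrt(p * (1 - p) / n).
    std_error = math.sqrt(risk * (1.0 - risk) / n)
    return risk, std_error


def regression_risk(observed, predicted):
    """Residual variance, taken here as the mean squared residual."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n


# Hypothetical data: one of four cases is misclassified.
risk, std_error = classification_risk(["a", "b", "a", "b"], ["a", "b", "b", "b"])
print(risk, std_error)

# Hypothetical continuous responses and predictions.
print(regression_risk([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))
```

Computing each measure separately for the analysis sample and the test sample gives the two rows of values the spreadsheet reports.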
- Bargraph of predictor importance
- Click this button to display a plot of the importance for each predictor variable in the analysis. The predictor importance is computed as follows:
During the building of each tree, for each split, predictor statistics (i.e., sums of squares regression, since simple regression trees are built in all cases) are computed for each predictor variable; the best predictor variable (yielding the best split at the respective node) is then chosen for the actual split. The program also computes the average of the predictor statistic for all variables over all splits and over all trees in the boosting sequence. The final predictor importance values are computed by normalizing those averages so that the highest average is assigned the value of 1, and the importance of every other predictor is expressed as the ratio of its average to that of the most important predictor.
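The averaging and normalization steps can be sketched as follows. This is a minimal illustration assuming the per-split predictor statistics have already been collected; the data layout (a list of trees, each a list of splits mapping predictor name to statistic), the function name, and the numbers are all hypothetical.

```python
from collections import defaultdict


def predictor_importance(trees):
    """Average each predictor's statistic over all splits in all trees,
    then scale so that the largest average equals 1."""
    totals = defaultdict(float)
    n_splits = 0
    for tree in trees:
        for split in tree:
            n_splits += 1
            for name, statistic in split.items():
                totals[name] += statistic
    if n_splits == 0:
        raise ValueError("no splits found")
    averages = {name: total / n_splits for name, total in totals.items()}
    top = max(averages.values())
    if top <= 0:
        raise ValueError("the largest average statistic must be positive")
    return {name: average / top for name, average in averages.items()}


# Hypothetical statistics: two trees with three splits in total.
trees = [
    [{"age": 4.0, "income": 2.0}, {"age": 2.0, "income": 1.0}],
    [{"age": 6.0, "income": 3.0}],
]
importance = predictor_importance(trees)
print(importance)
```

Note that every predictor contributes a statistic at every split, not only the one chosen for the split, so a variable can score as important even if it is never actually used to split a node.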
- Predictor importance
- Click the Predictor importance button to display a spreadsheet that contains the importance values and importance ranking on a 0-100 scale for each predictor variable in the analysis. See the description of the Bargraph of predictor importance option above for additional details. See also, Predictor Importance in Statistica GC&RT, Interactive Trees, and Boosted Trees.
- Final solution (set of consecutive trees)
- Use the options in this group box to review details of the final solution (boosting sequence of simple trees).
- Category of response
- This option is available only if the current analysis is a Classification Analysis (as specified on the Boosted Trees Startup Panel - Quick tab). In this case, each class is predicted by a separate sequence of trees (see also the Introductory Overview). Select the category or class for which you want to review the results trees.
- Start of tree graphs/End of tree graphs
- The method of stochastic gradient boosting trees (see the Introductory Overview) generates a sequence of simple trees (the complexity of each tree can be specified on the Boosted Trees Specifications dialog box - Advanced tab). If you want to review individual trees (as Tree graphs or Tree structures), specify here the first and last tree numbers of the range you want to review.
- Tree graphs
- Click the Tree graphs button to display the individual trees from the total sequence of boosting trees, as requested via the Start of tree graphs and End of tree graphs options.
- Tree structures
- Click the Tree structures button to display, in results spreadsheets, the structures of the individual trees from the total sequence of boosting trees, as requested via the Start of tree graphs and End of tree graphs options.