Workspace Node: Random Forest Regression - Results - Quick Tab
In the Random Forest Regression node dialog box, under the Results heading, select the Quick tab to access the following options.
Element Name | Description |
---|---|
Summary | Select this check box to produce a graph of the average squared prediction error over the successive tree addition steps; separate lines will be displayed in this graph for the training data and the testing data. The graph will also indicate the particular number of trees that resulted in the lowest average squared error. That solution is likely near the prediction model with the best predictive validity. |
Risk estimates | Select this check box to produce a spreadsheet with risk estimates for the analysis sample and the test sample. For regression-type problems with a continuous dependent variable, risk is calculated as the residual variance. The standard error for the risk estimate is also reported. See also Breiman et al. (1984). |
Bargraph of predictor importance | Select this check box to produce a plot of the importance for each predictor variable in the analysis. The predictor importance is computed as follows: During the building of each tree, for each split, predictor statistics (i.e., sums of squares regression, since simple regression trees are built in all cases) are computed for each predictor variable; the best predictor variable (yielding the best split at the respective node) is then chosen for the actual split. The program also computes the average of the predictor statistics for all variables over all splits and over all trees. The final predictor importance values are computed by normalizing those averages so that the highest average is assigned the value of 1, and the importance of every other predictor is expressed as the magnitude of its average predictor statistic relative to that of the most important predictor. |
Predictor importance | Select this check box to produce a spreadsheet that contains the importance values and the importance ranking on a 0-100 scale for each predictor variable in the analysis. See the description of the Bargraph of predictor importance option above for additional details. See also Predictor Importance in Statistica GC&RT, Interactive Trees, and Boosted Trees. |
Final solution (set of consecutive trees) | Use the options in this group box to review details of the final solution for all categories of the dependent variable. |
Start of tree graphs/End of tree graphs | The Random Forest method (see the Introductory Overview and Technical Notes) generates a sequence of simple trees (the complexity of each tree can be specified on the Specifications - Advanced tab). If you want to review the actual individual trees (as Tree graphs or Tree structures, see below), specify here the numbers of the first and last trees you want to review. |
Tree graphs | Select this check box to produce graphs of the individual trees from the Random Forest, as specified via the Start of tree graphs and End of tree graphs options (see above). |
Tree structures | Select this check box to produce in results spreadsheets the structures of the individual trees from the Random Forest, as specified via the Start of tree graphs and End of tree graphs options. |
Options / C | See Common Options. |
OK | Click the OK button to accept all the specifications made in the dialog box and to close it. The analysis results will be placed in the Reporting Documents node after running (updating) the project. |
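
The normalization described for the Bargraph of predictor importance option, and the lowest-error tree count indicated in the Summary graph, can be illustrated with a minimal Python sketch. This is not Statistica code: the function names, the predictor names, and all numeric values are hypothetical, and the sketch assumes the averaged predictor statistics and the per-step test errors are already available.

```python
from typing import Dict, List


def normalize_importance(avg_stats: Dict[str, float]) -> Dict[str, float]:
    """Scale averaged predictor statistics so the largest one becomes 1.0."""
    top = max(avg_stats.values())
    if top <= 0:
        raise ValueError("at least one average statistic must be positive")
    return {name: value / top for name, value in avg_stats.items()}


def best_tree_count(avg_squared_errors: List[float]) -> int:
    """Return the 1-based number of trees with the lowest average squared error."""
    best_index = min(range(len(avg_squared_errors)),
                     key=avg_squared_errors.__getitem__)
    return best_index + 1


# Hypothetical averages of the split statistics (sums of squares regression),
# taken over all splits and all trees.
avg_stats = {"x1": 40.0, "x2": 10.0, "x3": 25.0}
importance = normalize_importance(avg_stats)

# Predictors ordered from most to least important.
ranking = sorted(importance, key=importance.get, reverse=True)

# Hypothetical average squared test error after 1, 2, ..., 5 trees.
test_errors = [9.0, 6.5, 5.2, 5.6, 5.9]
best = best_tree_count(test_errors)

print(importance)
print(ranking)
print(best)
```

In this sketch the most important predictor receives the value 1 and the others are expressed relative to it; multiplying by 100 would give values on a 0-100 scale, although whether the Predictor importance spreadsheet is scaled exactly this way is an assumption.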