Boosted Trees Overview
The Statistica Boosted Trees module is a complete implementation of the method usually referred to as stochastic gradient boosting trees [Friedman, 1999a, b; Hastie, Tibshirani, & Friedman, 2001; also known as TreeNet (™ Salford Systems, Inc.) and MART (™ Jerill, Inc.)]. In Statistica, these techniques can be used for regression-type problems (to predict a continuous dependent variable) as well as classification problems (to predict a categorical dependent variable).
Program Overview
Estimation
You have full control over all key aspects of the estimation procedure, including the complexity of the trees fitted to the data, the maximum number of boosting steps, the subsampling rate for the training sample at each boosting step, and the learning (shrinkage) rate.
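To make the role of these settings concrete, the following is a minimal sketch of stochastic gradient boosting for a regression problem with a single predictor. It is not Statistica's implementation. The function names (fit_stump, boost, predict) and the parameter names (n_steps, learning_rate, subsample) are assumptions chosen for illustration, and tree complexity is fixed at the simplest case, a single split per tree.

```python
import random

def fit_stump(xs, residuals):
    """Fit a one-split tree: pick the threshold that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # exclude the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]  # (threshold, left_value, right_value)

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def boost(xs, ys, n_steps=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Illustrative stochastic gradient boosting with squared-error loss."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)          # initial prediction: the mean response
    preds = [base] * len(ys)
    stumps = []
    n_sub = min(len(xs), max(1, int(subsample * len(xs))))
    for _ in range(n_steps):
        # Draw a random subsample of the training cases for this step.
        idx = rng.sample(range(len(xs)), n_sub)
        # For squared error, the negative gradient is the plain residual.
        residuals = [ys[i] - preds[i] for i in idx]
        stump = fit_stump([xs[i] for i in idx], residuals)
        stumps.append(stump)
        # Shrink each tree's contribution by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}

def predict(model, x):
    return model["base"] + model["learning_rate"] * sum(
        predict_stump(s, x) for s in model["stumps"])

# Toy usage with made-up data: a step function in x.
toy_x = [i / 10 for i in range(50)]
toy_y = [1.0 if x > 2.5 else 0.0 for x in toy_x]
toy_model = boost(toy_x, toy_y, n_steps=100, learning_rate=0.1, subsample=0.5)
```

In this sketch, a smaller learning_rate generally calls for more boosting steps, and subsample below 1.0 is what makes the procedure "stochastic": each tree sees only a random part of the training sample.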
You can also specify an independent testing sample; the predictive validity of each solution in the sequence of boosting steps is then evaluated in that sample. If no specific testing sample is selected, Statistica randomly selects such a sample at each boosting step, and then determines the best solution (the best number of additive expansions, or simple trees) based on how well the respective models predict the cases in those testing samples.
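The idea of choosing the best solution from the sequence of boosting steps can be sketched as follows. This is an illustration only, assuming a regression problem scored by mean squared error; the function names and the input layout (one list of testing-sample predictions per boosting step) are hypothetical and do not describe how Statistica stores or scores its models.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def best_number_of_steps(staged_test_predictions, y_test):
    """Return the number of boosting steps with the lowest testing-sample error.

    staged_test_predictions[k] holds the predictions for the testing sample
    after k + 1 boosting steps (that is, after k + 1 simple trees).
    """
    errors = [mean_squared_error(y_test, preds)
              for preds in staged_test_predictions]
    best_index = min(range(len(errors)), key=errors.__getitem__)
    return best_index + 1, errors

# Toy usage with made-up numbers: the error falls, then rises again
# as additional trees begin to overfit the training sample.
y_test = [1.0, 2.0, 3.0]
staged = [
    [2.0, 2.0, 2.0],   # after 1 tree
    [1.5, 2.0, 2.5],   # after 2 trees
    [1.1, 2.0, 2.9],   # after 3 trees
    [0.8, 2.3, 3.3],   # after 4 trees
]
n_trees, test_errors = best_number_of_steps(staged, y_test)
```

Training error normally keeps decreasing as trees are added, so it cannot be used for this choice; the error in cases that were not used to fit the trees is what reveals the point at which further steps stop helping.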