arbor.control
Control for Arbor Models

Description

Sets the various parameters that control aspects of the arbor fit.

Usage

arbor.control(minsplit = 20, minbucket = max(1,round(minsplit/3)), 
    cp = 0.01, mindev = NULL, maxcompete = 4, maxsurrogate = 5, 
    usesurrogate = 2, xval = 10, surrogatestyle = 0, maxdepth = 30)

Arguments

minsplit the minimum number of observations that must exist in a node for a split to be attempted.
minbucket the minimum number of observations in any terminal (leaf) node. If only one of minbucket or minsplit is specified, the code either sets minsplit to minbucket*3 or minbucket to minsplit/3, as appropriate.
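The relationship between minsplit and minbucket can be sketched in base R as follows. This is only an illustration: resolve_min_sizes is a hypothetical helper, not a function provided by arbor, and it assumes the rounding shown in the Usage section, max(1, round(minsplit/3)).

```r
# Hypothetical helper (not part of arbor): fills in whichever of
# minsplit / minbucket was left unspecified, following the rule above.
resolve_min_sizes <- function(minsplit = NULL, minbucket = NULL) {
  if (is.null(minsplit) && is.null(minbucket)) {
    minsplit <- 20                       # documented default
  }
  if (is.null(minbucket)) {
    # Rounding taken from the Usage section; never smaller than 1.
    minbucket <- max(1, round(minsplit / 3))
  } else if (is.null(minsplit)) {
    minsplit <- minbucket * 3
  }
  list(minsplit = minsplit, minbucket = minbucket)
}

resolve_min_sizes(minsplit = 20)   # minbucket is derived as round(20 / 3) = 7
resolve_min_sizes(minbucket = 5)   # minsplit is derived as 5 * 3 = 15
resolve_min_sizes(minsplit = 2)    # round(2 / 3) = 1, so minbucket is 1
```

When both values are supplied, the sketch leaves them untouched.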
cp the complexity parameter. Any split that does not decrease the overall lack of fit by a factor of cp is not attempted. For instance, with anova splitting, this means that the overall R-squared must increase by cp at each step. The main role of this parameter is to save computing time by pruning off splits that are obviously not worthwhile. Essentially, the user informs the program that any split that does not improve the fit by cp would likely be pruned off by cross-validation, and hence the program need not pursue it. If cp is given a positive value, mindev is set to -1.0.
mindev the split is limited on risk instead of complexity. A node with risk less than mindev is not split. Only one of mindev and cp is used. If mindev is given a positive value, cp is set to -1.0.
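The mutual exclusion between cp and mindev might be resolved roughly as sketched below. resolve_stop_rule is a hypothetical helper, not part of arbor. The sketch also makes one assumption the text does not state: when both values are positive, mindev takes precedence, because cp has a non-NULL default and would otherwise always win.

```r
# Hypothetical helper (not part of arbor): keeps exactly one of the two
# stopping rules active and sets the other to -1.0.
# Assumption: a positive mindev takes precedence over cp.
resolve_stop_rule <- function(cp = 0.01, mindev = NULL) {
  if (!is.null(mindev) && mindev > 0) {
    cp <- -1.0        # complexity limit switched off
  } else {
    mindev <- -1.0    # risk limit switched off
  }
  list(cp = cp, mindev = mindev)
}

resolve_stop_rule()              # defaults: cp stays 0.01, mindev becomes -1
resolve_stop_rule(mindev = 0.5)  # mindev stays 0.5, cp becomes -1
```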
maxcompete the number of competitor splits retained in the output. It is useful to know not just which split was chosen, but also which variable came in second, third, and so on.
maxsurrogate the number of surrogate splits retained in the output. If this value is set to zero the computational time is shortened, because approximately half of the computational time (other than setup) is used in the search for surrogate splits.
usesurrogate specifies how to use surrogates in the splitting process.
  • 0 specifies display only. An observation with a missing value for the primary split rule is not sent further down the tree. A value of 0 corresponds to the action of tree.
  • 1 specifies to use surrogates, in order, to split subjects missing the primary variable. If all surrogates are missing, the observation is not split.
  • 2 specifies, if all surrogates are missing, to send the observation in the majority direction. A value of 2 is the recommendation of Breiman et al.
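The three usesurrogate settings can be illustrated with a small routing function. This is a sketch of the behavior described above, not arbor's internal code. route_observation and its arguments are hypothetical: directions are the strings "left" and "right", NA marks a missing variable, and an NA result means the observation stays at the current node.

```r
# Hypothetical sketch (not arbor internals).
# primary:    direction given by the primary split, or NA if that variable is missing
# surrogates: directions given by the surrogate splits, in order; NA where missing
# majority:   direction taken by the majority of observations at this node
route_observation <- function(primary, surrogates, usesurrogate, majority) {
  if (!is.na(primary)) return(primary)           # primary variable observed
  if (usesurrogate == 0) return(NA_character_)   # display only: stop here
  observed <- surrogates[!is.na(surrogates)]
  if (length(observed) > 0) return(observed[1])  # first usable surrogate, in order
  if (usesurrogate == 2) return(majority)        # all missing: majority direction
  NA_character_                                  # usesurrogate = 1: stop here
}

# Primary variable missing, first surrogate missing, second surrogate says "right".
route_observation(NA_character_, c(NA, "right"), usesurrogate = 1, majority = "left")

# Primary and all surrogates missing: only usesurrogate = 2 sends the observation on.
route_observation(NA_character_, c(NA_character_, NA_character_),
                  usesurrogate = 2, majority = "left")
```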
xval an integer number representing the size of the cross-validation groups or a vector of numbers to indicate in which group each observation belongs.
surrogatestyle controls the selection of a best surrogate.
  • If set to 0 (the default), the program uses the total number of correct classifications for a potential surrogate variable.
  • If set to 1, it uses the percent correct, calculated over the non-missing values of the surrogate.
The first option more severely penalizes covariates with a large number of missing values.
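The difference between the two styles can be seen in a small made-up example. surrogate_score is a hypothetical helper, not part of arbor, and the data are invented for illustration. It assumes the primary split direction is known for every row and that each surrogate is observed on at least one row.

```r
# Hypothetical sketch (not arbor internals): agreement between a surrogate's
# direction and the primary split's direction; NA marks a missing surrogate value.
surrogate_score <- function(primary_dir, surrogate_dir, surrogatestyle = 0) {
  observed <- !is.na(surrogate_dir)
  n_correct <- sum(primary_dir[observed] == surrogate_dir[observed])
  if (surrogatestyle == 0) n_correct else n_correct / sum(observed)
}

primary <- c("L", "L", "L", "L", "L", "R", "R", "R", "R", "R")
# Surrogate a: never missing, agrees with the primary split on 7 of 10 rows.
a <- c("L", "L", "L", "L", "R", "R", "R", "R", "L", "L")
# Surrogate b: missing on 5 rows, agrees on all 5 rows where it is observed.
b <- c("L", "L", NA, NA, NA, "R", "R", "R", NA, NA)

surrogate_score(primary, a, 0)  # 7 correct
surrogate_score(primary, b, 0)  # 5 correct, so style 0 prefers a
surrogate_score(primary, a, 1)  # 0.7
surrogate_score(primary, b, 1)  # 1.0, so style 1 prefers b
```

Under style 0 the five missing values of b count against it, which is the penalty the text describes.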
maxdepth the maximum depth of any node of the final tree, with the root node counted as depth 0. If set to a value greater than 30, arbor returns nonsense results.

Value

returns a list containing the options.

See Also

arbor

Package arbor version 6.1.1-7