Differences in Decision Tree Algorithms

Decision tree algorithms such as CART, C4.5, and ID3 all build a tree by recursively splitting the training set, but they differ in the following details:

  • How the purity/impurity measure is calculated, and at what level of purity the splitting procedure stops. Continuing to split the training set until every leaf node is entirely pure usually over-fits the model. The Decision Tree operator measures the increase in purity with information gain; the CART operators use the improvement in the Gini index (both measures are sketched in the first example after this list).
  • Whether the tree is strictly binary, as in CART, or a node can have more than two children, as in ID3 and C4.5.
  • How non-categorical (numerical) attributes are treated. ID3 expects categorical attributes, while C4.5 and CART typically split a numerical attribute at a threshold value, as in the second sketch after this list.
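
To make the first difference concrete, here is a minimal Python sketch of the two impurity measures mentioned above: entropy, which underlies information gain, and the Gini index used by CART. The function names and the toy split are illustrative only and are not taken from any particular operator's implementation.

    from collections import Counter
    import math

    def entropy(labels):
        """Entropy of a list of class labels; the basis of information gain."""
        total = len(labels)
        return -sum((n / total) * math.log2(n / total)
                    for n in Counter(labels).values())

    def gini(labels):
        """Gini index (impurity) of a list of class labels, as used by CART."""
        total = len(labels)
        return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

    def split_score(parent, children, impurity):
        """Drop in impurity achieved by splitting `parent` into `children`."""
        weighted = sum(len(c) / len(parent) * impurity(c) for c in children)
        return impurity(parent) - weighted

    # A toy binary split of ten labelled examples.
    parent = ["yes"] * 6 + ["no"] * 4
    left = ["yes"] * 5 + ["no"] * 1
    right = ["yes"] * 1 + ["no"] * 3

    print(split_score(parent, [left, right], entropy))  # information gain
    print(split_score(parent, [left, right], gini))     # Gini improvement

A split that sends mostly "yes" examples one way and mostly "no" examples the other scores well under either measure; the two measures mainly differ in how strongly they reward very pure partitions.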
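
The handling of non-categorical attributes can be illustrated the same way. The sketch below, under the common threshold-based approach, scans the midpoints between consecutive sorted values of a numerical attribute and returns the threshold with the largest drop in Gini impurity. The attribute values, labels, and the helper name best_threshold are hypothetical.

    from collections import Counter

    def gini(labels):
        """Gini index (impurity) of a list of class labels."""
        total = len(labels)
        return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

    def best_threshold(values, labels):
        """Return the numeric threshold giving the largest drop in weighted
        Gini impurity, together with that drop."""
        pairs = sorted(zip(values, labels))
        best_t, best_drop = None, float("-inf")
        for i in range(1, len(pairs)):
            if pairs[i - 1][0] == pairs[i][0]:
                continue  # no usable threshold between identical values
            t = (pairs[i - 1][0] + pairs[i][0]) / 2
            left = [lab for v, lab in pairs if v <= t]
            right = [lab for v, lab in pairs if v > t]
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
            drop = gini(labels) - weighted
            if drop > best_drop:
                best_t, best_drop = t, drop
        return best_t, best_drop

    # Hypothetical numerical attribute and class labels.
    temperatures = [64, 65, 68, 69, 70, 71, 72, 75]
    play = ["yes", "no", "yes", "yes", "yes", "no", "no", "yes"]
    print(best_threshold(temperatures, play))

Reducing a numerical attribute to a binary "value <= threshold" test is what lets algorithms such as C4.5 and CART treat continuous data within the same splitting framework they use for categorical attributes.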