Pruning or Pre-Pruning
A key concept related to decision trees is pruning (or pre-pruning), whereby branches that do not add enough predictive value to the model are eliminated from the tree.
Pruning and pre-pruning help avoid over-fitting of the decision tree and make the tree more compact and easier to read. For a typical dataset, both should be used, unless the algorithm takes too long to run.
The process of pruning involves visiting each non-leaf node and deciding, based on a confidence value, whether to turn that node into a leaf; in other words, it decides whether the sub-tree rooted at that node adds enough extra value to the model. For pruning, the whole tree is built out first, and sub-branches are then cut out if they are found to be poor predictors.
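As a rough illustration, the sketch below uses scikit-learn's cost-complexity pruning, a related post-pruning technique, rather than the confidence-based method described above; the same grow-then-cut workflow applies. The dataset and the choice of pruning strength (ccp_alpha) are illustrative assumptions, not values from the text.

# Post-pruning sketch: grow the full tree, then collapse weak sub-branches.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Build the whole tree first, then compute candidate pruning strengths.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

# Refit with a nonzero alpha: sub-branches that add too little value
# are turned back into leaves (alpha chosen here purely for illustration).
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=path.ccp_alphas[-2])
pruned.fit(X_train, y_train)

print("full tree leaves:  ", full_tree.get_n_leaves())
print("pruned tree leaves:", pruned.get_n_leaves())
print("pruned test accuracy:", pruned.score(X_test, y_test))

Running this shows the pruned tree ending up with far fewer leaves than the fully grown one, usually with little or no loss in test accuracy.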
The process of pre-pruning limits the decision tree as it is being built, based on increases in purity (that is, increase in information gain for the Decision Tree operator and reduction in the Gini index for the CART operator). This is a faster process than post-pruning, but it can sometimes result in a decision tree that is too small.
Pre-pruning is the cheapest way to keep the model small (and to help prevent over-fitting).
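A minimal sketch of pre-pruning, assuming scikit-learn's DecisionTreeClassifier: its min_impurity_decrease parameter plays the role described above, refusing any split that does not improve purity (entropy for information gain, or Gini) by at least a threshold. The threshold and depth cap below are illustrative values only.

# Pre-pruning sketch: constrain the tree while it is being built.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

pre_pruned = DecisionTreeClassifier(
    criterion="entropy",         # split on information gain
    min_impurity_decrease=0.01,  # require a minimum purity improvement per split
    max_depth=4,                 # cap the depth as a second pre-pruning guard
    random_state=0,
).fit(X, y)

print("leaves in pre-pruned tree:", pre_pruned.get_n_leaves())

Because these checks happen during construction, no time is spent growing branches that will be discarded, which is why pre-pruning is cheaper than post-pruning.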