Decision Tree Concept of Purity

In decision tree construction, the concept of purity is based on the fraction of the data elements in a group that belong to the same class.

A decision tree is constructed by a split that divides the rows into child nodes. If a tree is "binary," each of its nodes can have at most two children. The same procedure is then applied to split each child group; this process is called "recursive partitioning." Each split is selected so that the resulting tree can be used to predict the value of the target variable. The primary algorithm for deriving a decision tree from a training set employs a greedy approach: at each branch node it strives for the "purest" possible subsets, that is, the clearest division of the data.
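The greedy split selection described above can be sketched in Python. This is a minimal illustration, not a production algorithm: the function names (`purity`, `best_split`) and the toy data are assumptions for the example, and it considers only threshold splits on a single numeric feature, scoring each candidate by the size-weighted purity of its two children.

```python
from collections import Counter

def purity(labels):
    # Purity = frequency of the most common class in the group.
    return Counter(labels).most_common(1)[0][1] / len(labels)

def best_split(rows):
    # rows: list of (feature_value, label) pairs.
    # Greedily pick the threshold whose two children are purest,
    # weighting each child's purity by its size.
    best = None
    for threshold in sorted({v for v, _ in rows}):
        left = [lbl for v, lbl in rows if v <= threshold]
        right = [lbl for v, lbl in rows if v > threshold]
        if not left or not right:
            continue  # skip splits that leave one child empty
        score = (len(left) * purity(left) + len(right) * purity(right)) / len(rows)
        if best is None or score > best[0]:
            best = (score, threshold)
    return best  # (weighted purity, threshold), or None

rows = [(1, "A"), (2, "A"), (3, "B"), (4, "B")]
print(best_split(rows))  # the split at threshold 2 separates the classes perfectly
```

Recursive partitioning would then apply `best_split` again to each child group until some stopping criterion (e.g., perfect purity or a minimum group size) is met.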

One way to define the purity of a set is as the frequency of its most common class. For example, if a set consists of 60% items of class A, 30% of class B, and 10% of class C, then its purity is 60%.
Note: There are other acceptable ways of defining purity, all of which reach their maximum when every element of a set belongs to the same class.
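The "frequency of the most common class" definition is straightforward to compute; the following sketch reproduces the 60%/30%/10% example from the text (the function name `purity` is chosen for illustration).

```python
from collections import Counter

def purity(labels):
    # Fraction of elements belonging to the most common class.
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
print(purity(labels))  # 0.6, matching the 60% example
```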

Information gain is considered one of the better quantitative measures of the increase in purity produced by a split.
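Information gain is usually computed as the entropy of the parent group minus the size-weighted average entropy of the child groups; a sketch of that standard formulation follows (the helper names `entropy` and `information_gain` are assumptions for the example).

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of the class distribution, in bits.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    # Gain = parent entropy minus the size-weighted average
    # entropy of the child groups produced by a split.
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

parent = ["A", "A", "B", "B"]
# A split that perfectly separates the classes removes all uncertainty:
print(information_gain(parent, [["A", "A"], ["B", "B"]]))  # 1.0
```

A split that fails to improve purity (e.g., children with the same class mix as the parent) would yield a gain of zero, so the greedy algorithm prefers splits with the highest gain.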