Information Value and Weight of Evidence Analysis
Information Value analysis is a data exploration technique that helps determine which columns in a data set have predictive power or influence on the value of a specified dependent variable.
See Information Value operator for more information.
Information Value analysis is popular in banking, for example, where it can identify a set of variables that help determine which credit card customers are most likely to default. The Information Value operator defines Information Value (IV) and Weight of Evidence (WOE) as follows:
- IV - A numerical value that quantifies the predictive power of an independent continuous variable x in capturing the binary dependent variable y. IV is helpful for reducing the number of variables as an initial step in preparing for Logistic Regression, especially when there is a large number of potential variables. IV is based on an analysis of each individual independent variable in turn, without considering other predictor variables.
- WOE - Closely related to IV, WOE measures the strength of each grouped attribute in predicting the desired value of the dependent variable.
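To make the two definitions concrete, here is a minimal Python sketch. It does not reproduce the Information Value operator's implementation, whose exact formula is not given on this page. It assumes the common textbook convention: for each group, WOE = ln(share of non-events / share of events), and IV is the sum over groups of (share of non-events - share of events) × WOE. The function name `woe_iv`, the group labels, and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def woe_iv(groups, outcomes):
    """Return ({group: WOE}, IV) for one grouped variable.

    groups   -- group (bin) label of each row
    outcomes -- binary dependent variable of each row (1 = event, 0 = non-event)

    Assumed convention (not taken from the operator):
    WOE = ln(share of non-events / share of events).
    """
    events = defaultdict(int)
    non_events = defaultdict(int)
    for group, y in zip(groups, outcomes):
        if y == 1:
            events[group] += 1
        else:
            non_events[group] += 1

    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    if total_events == 0 or total_non_events == 0:
        raise ValueError("both outcome classes must be present")

    woe = {}
    iv = 0.0
    for group in sorted(set(events) | set(non_events)):
        if events[group] == 0 or non_events[group] == 0:
            # ln(0) is undefined; this sketch asks the caller to regroup
            # instead of applying a smoothing rule.
            raise ValueError(f"group {group!r} has an empty class; regroup it")
        share_events = events[group] / total_events
        share_non_events = non_events[group] / total_non_events
        woe[group] = math.log(share_non_events / share_events)
        iv += (share_non_events - share_events) * woe[group]
    return woe, iv


# Hypothetical data: two groups with four rows each.
sample_groups = ["low"] * 4 + ["high"] * 4
sample_outcomes = [1, 0, 0, 0, 1, 1, 1, 0]
sample_woe, sample_iv = woe_iv(sample_groups, sample_outcomes)
print(sample_woe, sample_iv)
```

Under this convention the sign of WOE depends on which class is treated as the event, but IV does not, because every term of the sum is non-negative. A continuous variable must be grouped into bins before it is passed to a function like this one.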
The following table provides a standard rule of thumb for using the Information Value to understand the predictive power of each variable.
Information Value | Predictive Power |
---|---|
< 0.02 | Useless |
0.02 - 0.1 | Weak |
0.1 - 0.3 | Medium |
0.3 - 0.5 | Strong |
> 0.5 | Suspiciously good; too good to be true |
Typically, variables with medium and strong predictive power are selected for model development. However, some schools of thought advocate using only the variables with medium IVs for broad-based model development.
In the example above, the number of times 90 days late, the number of times 30 days late over 2 years, age, number of dependents, and education level of each person were assessed to see how well each variable predicts paying late on a loan. According to the rule-of-thumb table above, the number of times 30 days late over 2 years is a medium predictor (IV = 0.13231212) of whether the person will pay late on a loan in the future.
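The rule-of-thumb table can also be written as a small lookup, sketched below in Python. The function names `predictive_power` and `select_variables` are hypothetical and are not part of the operator. The table does not say which band a boundary value belongs to, so this sketch assumes that 0.02, 0.1, and 0.3 fall into the higher band and that exactly 0.5 still counts as Strong.

```python
def predictive_power(iv):
    """Map an Information Value to its rule-of-thumb label.

    Boundary handling is an assumption: 0.02, 0.1, and 0.3 go to the
    higher band, and exactly 0.5 is still "Strong".
    """
    if iv < 0.02:
        return "Useless"
    if iv < 0.1:
        return "Weak"
    if iv < 0.3:
        return "Medium"
    if iv <= 0.5:
        return "Strong"
    return "Suspiciously good; too good to be true"


def select_variables(iv_by_variable, keep=("Medium", "Strong")):
    """Return the variable names whose IV label is in `keep`."""
    return [
        name
        for name, iv in iv_by_variable.items()
        if predictive_power(iv) in keep
    ]


# The only IV quoted in the text: times 30 days late over 2 years.
print(predictive_power(0.13231212))
```

Passing `keep=("Medium",)` to `select_variables` corresponds to the alternative of keeping only the medium-IV variables.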