Response Optimization Overview

The Response Optimization for Data Mining Models module (Response Optimization for short) is part of the STATISTICA suite and is designed for optimizing and exploring deployed predictive data mining models.

You can think of a predictive model as a black box representation of the relationship between a set of independent variables (also known as measurement variables) and one or more dependent variables, known as responses. Both the independent and the response variables can be either continuous or categorical. A continuous response variable implies a regression task, while a categorical response implies a classification task. Thus, when presented with a set of independent values, the task of a predictive model is to produce a response. This is known as prediction making: a set of values for the independent variables is fed into the model, and an estimate of the response is received in return.

There are situations, however, where the desired value of the response variable is known, and the aim is to find a set of values for the independent variables for which the predictive model yields that response. This reverse engineering technique is known as Response Optimization. Such situations occur frequently in industrial production and product development, where the response variable is a product quality variable and the measurement (independent) variables are the conditions under which the product is developed. STATISTICA Response Optimization achieves this task by conducting a discrete (guided or unguided) search in the space of the independent variables. For each selected set of independent values, the model prediction is evaluated and compared with the desired response. This process is repeated until a set of independent values is found for which the model prediction equals, or comes as close as possible to, the desired response value.
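
The evaluate-and-compare loop described above can be illustrated with a minimal Python sketch. It is not the STATISTICA implementation: the function model, its two inputs, the search ranges, and the tolerance are all hypothetical stand-ins chosen for illustration, and the two candidate generators (a regular grid and uniform random sampling) are just simple examples of an unguided search.

```python
import itertools
import random

def model(temperature, pressure):
    """Hypothetical stand-in for a deployed predictive model (assumption)."""
    return 0.5 * temperature + 10.0 * pressure

def search(candidates, desired, tolerance=1e-6):
    """Evaluate each candidate and keep the one whose prediction is closest to `desired`."""
    best_inputs, best_gap = None, float("inf")
    for inputs in candidates:
        gap = abs(model(*inputs) - desired)  # compare prediction with desired response
        if gap < best_gap:
            best_inputs, best_gap = inputs, gap
        if best_gap <= tolerance:            # stop early once the match is good enough
            break
    return best_inputs, best_gap

def grid_candidates(ranges, steps):
    """Yield every point of a regular grid over the given (low, high) ranges."""
    axes = [[lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
            for lo, hi in ranges]
    return itertools.product(*axes)

def random_candidates(ranges, n, seed=0):
    """Yield n points drawn uniformly at random from the given ranges."""
    rng = random.Random(seed)
    for _ in range(n):
        yield tuple(rng.uniform(lo, hi) for lo, hi in ranges)

ranges = [(100.0, 200.0), (1.0, 5.0)]  # hypothetical bounds for the two inputs
grid_best, grid_gap = search(grid_candidates(ranges, steps=21), desired=100.0)
rand_best, rand_gap = search(random_candidates(ranges, n=2000), desired=100.0)
print("grid:  ", grid_best, grid_gap)
print("random:", rand_best, rand_gap)
```

Both runs return the inputs whose prediction lies closest to the desired response, together with the remaining gap. A finer grid or more random samples narrows the gap at the cost of more model evaluations.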

The STATISTICA Response Optimization module employs several techniques for performing the above optimization task (for one or more predictive models). These include the Simplex, Grid, and Random search algorithms. The Simplex method is a guided optimization algorithm that finds a set of independent values yielding the desired response in a finite number of steps. The Grid and Random algorithms are unguided search techniques that rely on brute computing power.
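
To contrast a guided search with the unguided ones, the following Python sketch implements a simplified downhill simplex search. It assumes that "Simplex" refers to a Nelder-Mead style method, which the text above does not confirm, and it is not the STATISTICA implementation. The model function, the desired response of 12, the starting point, and all parameter values are hypothetical.

```python
def model(x, y):
    """Hypothetical stand-in for a deployed predictive model (assumption)."""
    return 2.0 * x + 3.0 * y

def loss(point, desired=12.0):
    """Squared gap between the model prediction and the desired response."""
    return (model(*point) - desired) ** 2

def nelder_mead(f, start, step=1.0, max_iter=200, tol=1e-8):
    """Simplified Nelder-Mead: reflect, expand, contract inside, or shrink."""
    n = len(start)
    # Initial simplex: the start point plus one vertex offset along each axis.
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            # The reflection is the new best point, so try going further.
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every other vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)

solution = nelder_mead(loss, [0.0, 0.0])
print(solution, model(*solution))
```

Unlike a grid or random search, each step here uses the predictions already obtained to decide where to evaluate the model next, so far fewer model evaluations are usually needed.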

STATISTICA Response Optimization can use these techniques to optimize the response of one or more predictive models in a single analysis. The models can either be optimized on a standalone basis, as separate models, or be combined to form an ensemble. Ensembles of predictive models are known to have better generalization ability than their standalone members.
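
The difference between optimizing standalone models and optimizing an ensemble can be sketched in Python as follows. The three member models are hypothetical, and combining them by a simple average is an assumption made for illustration only, since the text does not specify how STATISTICA forms the ensemble response. A one-dimensional grid search stands in for whichever search technique is used.

```python
def model_a(x):
    """Hypothetical member model (assumption)."""
    return 2.0 * x + 1.0

def model_b(x):
    """Hypothetical member model (assumption)."""
    return 1.5 * x + 4.0

def model_c(x):
    """Hypothetical member model (assumption)."""
    return 2.5 * x - 2.0

MEMBERS = (model_a, model_b, model_c)

def ensemble(x):
    """Ensemble response, assumed here to be the plain average of the members."""
    predictions = [member(x) for member in MEMBERS]
    return sum(predictions) / len(predictions)

def closest_input(predict, desired, lo=0.0, hi=10.0, steps=1001):
    """Grid search for the input whose prediction is closest to `desired`."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda x: abs(predict(x) - desired))

desired = 11.0
# Optimize each model on a standalone basis...
standalone = {m.__name__: closest_input(m, desired) for m in MEMBERS}
# ...and optimize the combined (ensemble) response.
combined = closest_input(ensemble, desired)
print(standalone)
print("ensemble:", combined)
```

In this sketch each standalone model suggests a somewhat different input for the same desired response, while the ensemble yields a single input based on the averaged prediction.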

The STATISTICA Response Optimization module also provides a model exploration feature, with which the user can examine, explore, and compare responses from predictive models as standalone models, as ensembles, or both.