Model Comparison Node
Overview
The Model Comparison node compares models on their quality statistics, selects the best one, and generates PMML for it.
The node then automatically checks in the Champion Challenger model to Statistica or Statistica Enterprise.
What you can compare:
- Models from any nodes
- New models vs older models
- Models with new data
- Models stored in Statistica or Statistica Enterprise
- Multiple Enterprise models
What it does:
- Compares different models
- Generates a Summary of Quality matrix
- Selects a champion model
- Compares a previous model with the same model run on new data
- Compares the statistics recorded in the PMML. You can then use Rapid Deployment to review the statistics from new data.
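
To illustrate what "statistics recorded in the PMML" can look like, here is a minimal Python sketch that reads quality statistics from a PMML document. It assumes the statistics are stored in standard PMML `PredictiveModelQuality` elements; whether Statistica writes them in exactly this form is not confirmed by this page. The sample document, its values, and the helper names `local_name` and `recorded_statistics` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML fragment. The attribute names follow the PMML
# PredictiveModelQuality element; the values are made up for illustration.
SAMPLE_PMML = """<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
  <RegressionModel functionName="regression">
    <ModelExplanation>
      <PredictiveModelQuality targetField="y" dataUsage="training"
          meanSquaredError="4.0" r-squared="0.80"/>
      <PredictiveModelQuality targetField="y" dataUsage="test"
          meanSquaredError="5.0" r-squared="0.75"/>
    </ModelExplanation>
  </RegressionModel>
</PMML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def recorded_statistics(pmml_text):
    """Return {data_usage: {statistic: value}} for each quality element."""
    root = ET.fromstring(pmml_text)
    stats = {}
    for element in root.iter():
        if local_name(element.tag) != "PredictiveModelQuality":
            continue
        # In PMML, dataUsage defaults to "training" when it is omitted.
        usage = element.get("dataUsage", "training")
        numeric = {}
        for name, value in element.attrib.items():
            try:
                numeric[name] = float(value)
            except ValueError:
                pass  # skip non-numeric attributes such as targetField
        stats[usage] = numeric
    return stats


if __name__ == "__main__":
    for usage, values in recorded_statistics(SAMPLE_PMML).items():
        print(usage, values)
```

Keying the result by data usage keeps training and test statistics apart, so a later comparison can choose which of the two to use.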

The downstream PMML node must be linked, and the link option must be selected.
PMML is generated when the comparison is run.
The downstream node is the challenged (Champion) model, which is the model that is updated.
The upstream node can be any node.
Input
- Connect the Model Comparison node to Models and Data Input. The input data used for comparison should specify Observed and Predicted variables, which you can set using the Select Variables node.
For classification
Select the observed variable in the dependent categorical field and the predicted variable in the predictor categorical field to generate model quality statistics based on the observed and predicted columns.
For regression
Select the observed variable in the dependent continuous field and the predicted variable in the predictor continuous field to generate model quality statistics based on the observed and predicted columns.
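
As a rough illustration of how quality statistics can be derived from an observed column and a predicted column, the Python sketch below computes a few common ones. The statistics shown (accuracy, misclassification rate, mean squared error, R-square) are assumptions chosen for the example, since this page does not list which statistics the node computes. The function names are hypothetical.

```python
def classification_statistics(observed, predicted):
    """Accuracy and misclassification rate from two categorical columns."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("columns must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    accuracy = correct / len(observed)
    return {"accuracy": accuracy, "misclassification_rate": 1 - accuracy}


def regression_statistics(observed, predicted):
    """Mean squared error and R-square from two continuous columns."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(observed)
    mean_observed = sum(observed) / n
    # Residual sum of squares: observed minus predicted.
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: observed minus its own mean.
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    r_squared = 1 - ss_residual / ss_total if ss_total else float("nan")
    return {"mean_squared_error": ss_residual / n, "r_squared": r_squared}


if __name__ == "__main__":
    print(classification_statistics(["yes", "no", "yes", "no"],
                                    ["yes", "no", "no", "no"]))
    print(regression_statistics([1.0, 2.0, 3.0, 4.0],
                                [1.0, 2.0, 3.0, 5.0]))
```

Both functions need only the two columns, which is why the input data must identify the observed and predicted variables.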

Enterprise
- Specify an Enterprise model using the PMML Model node from the Data Mining menu.
- You can compare multiple Enterprise models.
Comparing to Enterprise is the default. The node gets the existing model from Enterprise and checks whether the selected model is different. If it is, the model in Enterprise is updated.
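
The update-only-if-different behavior can be sketched as follows. This is an illustration only: this page does not say how the node decides that two models differ, so the sketch assumes a comparison of the PMML text with per-line whitespace ignored. A plain dictionary stands in for Enterprise, and the names `fingerprint` and `update_if_different` are hypothetical.

```python
import hashlib


def fingerprint(pmml_text):
    """Hash the PMML text, ignoring blank lines and per-line indentation."""
    lines = (line.strip() for line in pmml_text.splitlines())
    normalized = "\n".join(line for line in lines if line)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def update_if_different(enterprise_store, model_name, selected_pmml):
    """Store the selected model only when it differs from the stored one.

    enterprise_store is a dict standing in for Enterprise (an assumption).
    Returns True when the stored model was added or replaced.
    """
    current = enterprise_store.get(model_name)
    if current is not None and fingerprint(current) == fingerprint(selected_pmml):
        return False  # same model, nothing to update
    enterprise_store[model_name] = selected_pmml
    return True


if __name__ == "__main__":
    store = {"churn": "<PMML>\n  <A/>\n</PMML>"}
    print(update_if_different(store, "churn", "<PMML>\n<A/>\n</PMML>"))
    print(update_if_different(store, "churn", "<PMML><B/></PMML>"))
```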
Output
The output from running this node is an XML spreadsheet.
The output shows model statistics that come from a PMML node and are based on the input data.
To have Enterprise nodes updated automatically, select Link to Enterprise on the PMML tab of the PMML Model node (from the Data Mining menu), then click the Deploy to Enterprise button.
When these options are selected, a new downstream node is created for the best selected model, and that model is updated in Enterprise.
The output differs depending on the scenario.
Options:
| Option | Description |
|---|---|
| Specifications/Quick tab | |
| Model Type | Auto: if you select Auto, the node tries to detect the model type automatically from the connected nodes and compares the models using the defined statistics. You can also select one of the following: |
| Regression comparison statistics | Select the statistic on which to compare regression models. The model with the lower error or the higher R-square statistic is selected as the best model. These are the choices: |
| Classification comparison statistics | Select the statistic on which to compare classification models: |
| Use test sample statistics only for selecting model | Select this check box to use only test sample statistics when selecting a model. When selected, only models that have test statistics and models that use data files are compared. Training statistics are ignored and are not used for comparison. For instance, the Advanced Classification Trees (C&RT) model does not have a test Data usage tag, so it is not used in the comparison. |
| Results/Quick tab | |
| Model statistics | |
| Results/Enterprise tab | |
| Update selected model to Enterprise | Select this check box to automatically update the model in Enterprise. The downstream model node should be linked to Enterprise with the link option selected. The model is updated only if the selected model is different from the model in Enterprise. Selecting this option uses the data supplied by the connected Select Variables dialog box (Data tab, Variables icon/select variables). The input data used for comparison should specify Observed and Predicted variables. |
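
The selection rules described in the options table (lower error or higher R-square wins, with an option to consider test sample statistics only) can be sketched in Python as below. The statistic names, the input layout, and the function `select_champion` are hypothetical. The fallback to training statistics when the test-only option is off is also an assumption, since this page describes only the behavior when the option is selected.

```python
# Hypothetical statistic names mapped to whether a lower value is better.
LOWER_IS_BETTER = {"mean_squared_error": True, "r_squared": False}


def select_champion(models, statistic, test_only=False):
    """Pick the best model name by one comparison statistic.

    models maps a model name to {"training": {...}, "test": {...}};
    either key may be missing. With test_only=True, models without
    test statistics are left out and training statistics are ignored.
    """
    candidates = {}
    for name, by_usage in models.items():
        if test_only:
            stats = by_usage.get("test")
        else:
            # Assumption: prefer test statistics, else fall back to training.
            stats = by_usage.get("test") or by_usage.get("training")
        if stats and statistic in stats:
            candidates[name] = stats[statistic]
    if not candidates:
        raise ValueError("no model has the requested statistic")
    chooser = min if LOWER_IS_BETTER[statistic] else max
    return chooser(candidates, key=candidates.get)


if __name__ == "__main__":
    example = {
        "C&RT": {"training": {"mean_squared_error": 1.0}},
        "Boosted Trees": {"test": {"mean_squared_error": 3.0}},
        "Neural Network": {"test": {"mean_squared_error": 2.5}},
    }
    print(select_champion(example, "mean_squared_error", test_only=True))
```

In the example data, the C&RT entry has training statistics only, so it drops out of the comparison when `test_only=True`, mirroring the C&RT case mentioned in the table.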
