Steps Tab

The Steps tab of the Data Miner Recipes dialog box is used to create new projects or edit existing ones. The upper-left pane, the Step-nodes panel, contains options and user configurations for creating data mining models, and the right pane contains tabs specific to the step node in use. An information panel in the lower-left corner of the dialog box displays instructions for the current step and for proceeding to the next step.

The information in this topic pertains to the Step-nodes panel and buttons (Save recipe, Report, Next step) that are always available on the Steps tab regardless of which tab is selected. For topics on tab-related options, see Data Preparation - Data Preparation tab, Data Preparation - Advanced tab and Data Miner - Annotations tab.

Step-nodes

The Step-nodes panel is located in the upper-left area of the Steps tab. It can have up to four major nodes:
  • Data preparation
  • Data for analysis
  • Data redundancy
  • Target variable
The Target variable node has a branching structure: the parent node connects to four child nodes (Important variables, Model building, Evaluation, and Deployment).
Each node (or step) can exist in at most three states, depending on whether its completion is optional. Each state is represented by an icon: red (wait), yellow (ready), or green (completed).

Icon    State      Description
Red     Wait       The step cannot be started because it depends on a previous step that is not completed.
Yellow  Ready      The step can be started because the previous steps are completed.
Green   Completed  The step is completed.

Note: You must click the Next step button to change a step from the ready state to the completed state. The change is made only if the step is complete.

The Data Miner Recipes steps are arranged in a logical, sequential order. This ensures that all the information required to complete any given step is in place when the step is started. For example, in any model building task, data are used as examples from which the model learns the underlying process relating the input and target variables. Therefore, you can start the Target variable step only after you have successfully completed the Data preparation and Data for analysis steps. Below is a summary of the Data Miner Recipes steps and the states in which they can exist. Note that not all steps have three states.

Where applicable, the initial state of a step is wait (with the exception of the Data preparation step, which starts as ready). You can change the state from ready to completed by clicking the Next step button. If the step is complete, its status changes from ready to completed, and the following step changes from wait to ready. If you click the Next step button when the step is not complete, a message is displayed prompting you to complete the step.

Step (Parent node)  Step (Child node)    Wait  Ready  Completed
Data preparation    None                 No    Yes    Yes
Data for analysis   None                 Yes   Yes    Yes
Data redundancy     None                 Yes   Yes    Yes
Target variable     Important variables  No    Yes    Yes
                    Model building       Yes   Yes    Yes
                    Evaluation           Yes   Yes    Yes
                    Deployment           Yes   Yes    Yes

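
The state transitions described above can be sketched as a small model. This is only an illustration of the documented behavior, not part of the product: the names State, STEPS, initial_states, and next_step are hypothetical, and the sketch assumes a simplified linear sequence of the four parent nodes (the Target variable child nodes are ignored).

```python
from enum import Enum


class State(Enum):
    WAIT = "wait"            # blocked by an earlier step that is not completed
    READY = "ready"          # can be started
    COMPLETED = "completed"  # validated with Next step


# Assumption: only the four parent nodes, in the documented order.
STEPS = [
    "Data preparation",
    "Data for analysis",
    "Data redundancy",
    "Target variable",
]


def initial_states():
    """Every step starts in wait except Data preparation, which starts ready."""
    states = {name: State.WAIT for name in STEPS}
    states[STEPS[0]] = State.READY
    return states


def next_step(states, step, is_complete):
    """Hypothetical model of clicking Next step on `step`.

    Mutates `states` only when the step validates; returns a message.
    """
    if states[step] is State.WAIT:
        return f"Cannot proceed: {step} is waiting on a previous step."
    if not is_complete:
        return f"Please complete the {step} step."
    states[step] = State.COMPLETED
    following = STEPS.index(step) + 1
    # The following step moves from wait to ready, if there is one.
    if following < len(STEPS) and states[STEPS[following]] is State.WAIT:
        states[STEPS[following]] = State.READY
    return f"{step} completed."


states = initial_states()
print(next_step(states, "Data for analysis", True))  # still waiting
print(next_step(states, "Data preparation", True))   # validates the first step
print({name: state.value for name, state in states.items()})
```

The sketch keeps validation and state change in one function to mirror the single Next step button: an incomplete step produces a message and leaves every state unchanged.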
Element Name          Description
Save recipe           Saves the current project to a Data Miner project file (file extension *.dmrproj).
                      Note: When the project is saved, Data Miner stores a variety of options (for the completed steps) so that they can be loaded when the project is reopened.
Report                Displays a menu of commands for generating various spreadsheets and graphs for viewing data and results. It also creates a report for the current project and its configurations.
  View data file      Generates the data spreadsheet for the analysis variables (the variables selected in the Data preparation step).
  Results, all steps  Displays the results for all completed steps.
                      Results for each step are described as follows:
                        • Data preparation: Displays a spreadsheet of the variables selected for analysis. For each selected variable, the variable number, name, long name, and type (continuous or categorical) are reported.
                        • Data for analysis: Displays a spreadsheet of basic statistics (for example, mean, standard deviation, skewness, kurtosis, min, max) for all continuous inputs and targets. It also displays a spreadsheet with information about the inputs and targets specified for the analysis.
                        • Data redundancy: Displays a correlation matrix for all inputs, and a spreadsheet that contains the names of the redundant variables, their roles, the redundancy criterion, and other related information.
                        • Target variable: Displays a summary spreadsheet of best predictors, an MD values graph and a spreadsheet with MD values for selected predictors, a matrix scatterplot, a spreadsheet of eliminated predictors, and a spreadsheet of predictors remaining at this step.
                        • Summary report: Displays sensitivity spreadsheets for each of the models, a summary spreadsheet for neural network models, a summary report for the Model building steps, and a list of models spreadsheet (with statistics for the validation sample).
  Summary report      Generates a summary report for each completed step.
Undo                  Returns the project to its state at the last run and validate (before the Next step button was clicked).
Redo                  Returns the project to the next state recorded at the last run and validate.
Clear step            Permanently erases all actions taken in the current step.
                      Note: To clear a step, first select it by clicking the step name in the Step-nodes panel, and then click the Clear step button. Clearing a step invalidates (or clears) all subsequent steps.
Next step             Validates the current step.
                      If the current step is complete, you can move to the next step, and the yellow icon changes to a green icon. If the current step is not complete, a message is displayed prompting you to complete it.
                      Note: You cannot proceed from one step to another when the step-node icon is red. Click the down arrow button adjacent to the Next step button to display a menu of run and validate options. Select Run & validate to run and validate the current step. Select Run to completion to run the entire Data Miner project.
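
The effect of Clear step on later steps, as described above, can be illustrated with a short sketch. It is a hypothetical model, not the product's implementation: the function name clear_step and the string state labels are invented for illustration, the step list is simplified to the four parent nodes, and the sketch assumes that a cleared step returns to ready while every subsequent step returns to wait.

```python
# Assumption: a simplified linear sequence of the four parent nodes.
STEPS = [
    "Data preparation",
    "Data for analysis",
    "Data redundancy",
    "Target variable",
]


def clear_step(states, step):
    """Return a copy of `states` with `step` cleared.

    Assumed behavior: the cleared step becomes ready again (unless it was
    still waiting), and all subsequent steps are invalidated back to wait.
    """
    cleared = dict(states)
    if cleared[step] != "wait":
        cleared[step] = "ready"
    for later in STEPS[STEPS.index(step) + 1:]:
        cleared[later] = "wait"
    return cleared


# Example: every step is completed, then Data for analysis is cleared.
all_done = {name: "completed" for name in STEPS}
print(clear_step(all_done, "Data for analysis"))
```

Returning a copy rather than modifying the input keeps the earlier state available, which is convenient if the sketch is extended to model Undo and Redo.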