When creating a Data Flow, you can easily run predictive analytics on your data sets using Machine Learning functions, without prior knowledge of advanced statistics.
Build, train, and run multiple iterations of predictive models in parallel, evaluate and compare models actively, and select which model you want to save. Then you can re-run your model against new data sets.
Note: ML functions are available if you are using the WebFOCUS Integrated Install. Otherwise, you must install and configure Python 3.6.x. For configuration information, open the Get Data dialog box, right-click the Python adapter icon, and select Prerequisites.
After you create a Data Flow, you can select from various regression model algorithms to run against your data set.
The Data Flow opens, as shown in the following image.
The Predict Values panel opens, as shown in the following image.
The following regression model algorithms display within the Regression module:
You can click the Target drop-down menu to select a different target. All numeric Field measures are selected by default as Predictors. You can add or remove Predictors by selecting or clearing the check boxes.
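The default selection described above can be sketched in a few lines of Python. This is only an illustration of the rule "every numeric field except the target is a predictor", not the product's implementation. The function name `default_predictors` and the sample fields are hypothetical.

```python
from numbers import Number


def default_predictors(rows, target):
    """Return the names of all numeric fields except the target.

    Hypothetical helper: `rows` is a list of dicts that share the same keys.
    """
    if not rows:
        return []
    predictors = []
    for name in rows[0]:
        if name == target:
            continue
        values = [row[name] for row in rows]
        # A field counts as numeric only if every value is a number (bool excluded).
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            predictors.append(name)
    return predictors


# Made-up sample data: one text field and three numeric fields.
rows = [
    {"region": "East", "units": 10, "discount": 0.1, "revenue": 95.0},
    {"region": "West", "units": 12, "discount": 0.0, "revenue": 120.0},
]
selected = default_predictors(rows, target="revenue")
print(selected)
```

Choosing a different target changes the default predictor list, which mirrors what happens when you pick another field from the Target drop-down menu.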
Your selected model type appears on the Data Flow canvas, as shown in the following image.
To edit your model target and predictors, right-click the canvas Regression node, select Edit Settings, and then select Target and Predictors.
To edit your model hyperparameters, right-click the canvas Regression node, select Edit Settings, select Hyperparameters, and then select a model algorithm. Hyperparameters have default values that are unique to each model.
The Compare Model Evaluation Results dialog box opens, as shown in the following image.
The regression model algorithms run in parallel. This allows you to easily compare results and determine which model is best to save.
The best model has the lowest Root Mean Square Error (RMSE) value and a scatter plot whose dots lie closest to the red line. In this example, the Polynomial model has the best results.
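To make the comparison concrete, the following minimal sketch fits two simple models concurrently and picks the one with the lowest Root Mean Square Error. It assumes made-up data and two stand-in models (a mean baseline and a one-variable least-squares line). The names `fit_mean`, `fit_linear`, and `evaluate` are hypothetical and do not reflect how the product trains or scores its algorithms.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared residual."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_mean(xs, ys):
    """Baseline model that always predicts the mean of the target."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least-squares line for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def evaluate(name, fit, xs, ys):
    """Train one model and return its name with its RMSE.

    For brevity the model is scored on the same data it was trained on.
    """
    model = fit(xs, ys)
    return name, rmse(ys, [model(x) for x in xs])


# Made-up data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
candidates = {"Mean baseline": fit_mean, "Linear": fit_linear}

# Submit every candidate at once, then collect the results.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(evaluate, name, fit, xs, ys) for name, fit in candidates.items()]
    results = dict(f.result() for f in futures)

best = min(results, key=results.get)  # lowest RMSE wins
print(best, results)
```

A lower RMSE means the predictions sit closer to the actual values on average, which is the same property the scatter plot shows visually.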
You can filter which model comparisons you want to see by selecting or deselecting the model check boxes.
You can save a model by clicking its Save icon.
Close the Compare Model Evaluation Results dialog box to return to the canvas, as shown in the following image.
To reopen the Compare Model Evaluation Results dialog box, click the Compare icon on the canvas toolbar.
You can select different model algorithms from the Regression drop-down menu. The best model displays by default. Models display in the following tabs.
Target and predicted columns are highlighted yellow, as shown in the following image.
Note: Feature Importances is available for the Random Forest model only.
The Save dialog box opens, as shown in the following image.
You can change the model algorithm, name, or location, and add a description.
Your model is saved to your selected folder location, as shown in the following image.
Trained models are saved with their evaluation results, logs, and associated files, so that you can run the model again at a later time.
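The idea of storing a trained model next to its evaluation results can be sketched as follows. This is an assumption-laden illustration, not the product's storage format. The file layout (a pickle file plus a JSON file), the helper names `save_model` and `load_model`, and the sample coefficients are all hypothetical.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_model(folder, name, model, evaluation, description=""):
    """Write the model and its evaluation metadata side by side (hypothetical layout)."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    model_path = folder / f"{name}.pkl"
    meta_path = folder / f"{name}.json"
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)
    meta = {"name": name, "description": description, "evaluation": evaluation}
    meta_path.write_text(json.dumps(meta, indent=2))
    return model_path, meta_path


def load_model(folder, name):
    """Read back the model and its metadata so it can be run later."""
    folder = Path(folder)
    with (folder / f"{name}.pkl").open("rb") as fh:
        model = pickle.load(fh)
    meta = json.loads((folder / f"{name}.json").read_text())
    return model, meta


with tempfile.TemporaryDirectory() as tmp:
    # Made-up coefficients standing in for a trained linear model.
    coefficients = {"intercept": 0.05, "slope": 1.99}
    save_model(tmp, "linear_v1", coefficients, {"rmse": 0.14}, "demo model")
    model, meta = load_model(tmp, "linear_v1")
    # Run the reloaded model against a new input value.
    prediction = model["intercept"] + model["slope"] * 6.0

print(meta["evaluation"], prediction)
```

Keeping the evaluation results with the model lets you check how well it scored before you run it against new data.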