Predictor (DB)
Applies an input regression, classification, or clustering model to an input data set to predict a value (or, for classification models, the value with the highest probability).
Information at a Glance
Parameter | Description |
---|---|
Category | Predict |
Data source type | DB |
Send output to other operators | Yes |
Data processing tool | MapReduce |
The input column names must match the column names in the data set selected for model training, except for the dependent columns.
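The column-matching requirement can be checked before the operator runs. The following is a minimal sketch in Python, not part of the product: the function name `missing_columns` and the sample column names are hypothetical, and the comparison is assumed to be exact and case-sensitive, which may differ from how a given database treats identifiers.

```python
def missing_columns(training_columns, dependent_columns, input_columns):
    """Return the training columns (dependent columns excluded) that the
    input data set does not provide. An empty list means the names match."""
    dependent = set(dependent_columns)
    present = set(input_columns)
    required = [c for c in training_columns if c not in dependent]
    return [c for c in required if c not in present]


# Hypothetical example: "churn" is the dependent column, so it may be absent.
training = ["age", "income", "tenure", "churn"]
print(missing_columns(training, ["churn"], ["age", "income", "tenure"]))  # []
print(missing_columns(training, ["churn"], ["age", "tenure"]))  # ['income']
```

Extra columns in the input data set are ignored by this check; only missing names are reported.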
The operator writes its prediction columns, together with the columns of the input data set, to a user-specified prediction table.
The operator adds the following prediction columns to that output table:
- PRED_<model_abbreviation> - the predicted value or value with highest probability
- CONF_<model_abbreviation> - the confidence in the predicted value
- INFO_<model_abbreviation> - a dictionary of information about the results
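The naming pattern above can be illustrated with a short Python sketch. The abbreviation `XX` is a hypothetical placeholder, not a real model abbreviation, and placing the input columns before the prediction columns is an assumption; the actual column order in the output table is not stated here.

```python
def prediction_columns(model_abbreviation):
    """Build the three prediction column names for one model abbreviation."""
    return [f"{prefix}_{model_abbreviation}" for prefix in ("PRED", "CONF", "INFO")]


def output_columns(input_columns, model_abbreviation):
    """Input columns followed by the prediction columns (order is assumed)."""
    return list(input_columns) + prediction_columns(model_abbreviation)


# "XX" is a placeholder abbreviation used only for illustration.
print(output_columns(["age", "income"], "XX"))
# ['age', 'income', 'PRED_XX', 'CONF_XX', 'INFO_XX']
```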
Model Type | Model | Column Abbreviation |
---|---|---|
Classification | | |
Regression | | |
Clustering | K-Means | KM |
K-means output columns look a bit different. The columns are:
Algorithm
The Predictor operator is used to predict the value of the dependent variable based on the model(s) generated from the input model operator(s).
Input Model | What Predictor Calculates |
---|---|
Classification algorithms | Value with the highest probability |
Numeric regression algorithms | Predicted value |
Clustering algorithms | Predicted cluster |
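The three cases in the table can be sketched in Python as follows. This is an illustration only, not the operator's implementation: the function names are hypothetical, the regression case assumes a linear model, and the clustering case assumes Euclidean distance to the cluster centroids.

```python
import math


def predict_class(probabilities):
    """Classification: return the label with the highest probability,
    together with that probability."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


def predict_value(coefficients, intercept, row):
    """Numeric regression (linear model assumed): intercept + sum(coef * x)."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)


def predict_cluster(centroids, point):
    """Clustering: index of the centroid nearest to the point (Euclidean)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], point))


print(predict_class({"yes": 0.7, "no": 0.3}))  # ('yes', 0.7)
print(predict_value({"x": 2.0}, 1.0, {"x": 3.0}))  # 7.0
print(predict_cluster([(0.0, 0.0), (10.0, 10.0)], (8.0, 9.0)))  # 1
```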
Input
An input regression, classification, or clustering model, and an input data set against which the model is applied.
Configuration
Parameter | Description |
---|---|
Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
Output Schema | The schema for the output table or view. |
Output Table | Specifies the path and name of the table to which the results are written. By default, this is a unique table name based on your user ID, workflow ID, and operator. |
Storage Parameters | Advanced database settings for the operator output. Available only for TABLE output. See the Storage Parameters dialog for more information. |
Drop If Exists | Specifies whether to overwrite an existing table. |
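The effect of Drop If Exists on the output table can be sketched with Python's built-in `sqlite3` module standing in for the target database. Everything here is an assumption for illustration: the helper `write_output`, the table and column names, and in particular the behavior when the option is off (this sketch lets the database raise an error if the table already exists, which may not be what the operator does).

```python
import sqlite3


def write_output(conn, table, select_sql, drop_if_exists):
    """Create `table` from `select_sql`, optionally dropping an existing table.
    The table name is interpolated directly, so it must come from a trusted source."""
    if drop_if_exists:
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE "{table}" AS {select_sql}')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE input_data (x REAL)")
conn.execute("INSERT INTO input_data VALUES (1.0), (2.0)")

# First run creates the table; the second run overwrites it.
write_output(conn, "pred_out", "SELECT x, x * 2 AS PRED_XX FROM input_data", True)
write_output(conn, "pred_out", "SELECT x, x * 3 AS PRED_XX FROM input_data", True)

rows = conn.execute("SELECT x, PRED_XX FROM pred_out ORDER BY x").fetchall()
print(rows)  # [(1.0, 3.0), (2.0, 6.0)]
```

With `drop_if_exists` set to `False`, a second call against the same table name fails in SQLite with `sqlite3.OperationalError`, since the table already exists.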