Classifier (DB)

Uses any input classification model to apply a classification prediction to the input data set.

Information at a Glance

Category	Predict
Data source type	DB
Sends output to other operators	Yes
Data processing tool	n/a

Note: The Classifier (DB) operator is for database data only. For Hadoop data, use the Classifier (HD) operator.

Algorithm

The Team Studio Classifier operator predicts the probability that an event occurs, based on a model trained by one of the following operators: Alpine Forest, Decision Tree, K-Means (Hadoop), Logistic Regression, Naive Bayes, Neural Network, or SVM Classification.

Input

The input data set must contain columns with the same names as the columns in the data set used for model training, with the exception of the dependent column. The Classifier operator must have both of the following.

An input Classification model.
An input data set against which the model is applied.
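
The column-name requirement above can be checked before a workflow is run. The following is a minimal sketch in plain Python, not part of Team Studio; the function name and all column names other than srsdlqncy are hypothetical and used only for illustration.

```python
def missing_input_columns(training_columns, dependent_column, input_columns):
    """Return the training columns (minus the dependent column) absent from the input."""
    required = [c for c in training_columns if c != dependent_column]
    available = set(input_columns)
    return [c for c in required if c not in available]


# Hypothetical column names, for illustration only.
training_columns = ["age", "income", "balance", "srsdlqncy"]
input_columns = ["age", "balance", "customer_id"]

missing = missing_input_columns(training_columns, "srsdlqncy", input_columns)
print(missing)
```

An empty list means every column the model was trained on is present in the input data set. Extra input columns, such as customer_id here, do not count as missing.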

The model preceding the Classifier operator can be any of the following. The Classifier operator can take multiple models from the preceding operators, not just one.

Alpine Forest
Decision Tree
K-Means
Logistic Regression
Naive Bayes
SVM Classification

Configuration

Parameter	Description
Notes	Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator.
Output Schema	The schema for the output table or view.
Output Table	The table path and name where the results are output. By default, this is a unique table name based on your user ID, workflow ID, and operator.
Storage Parameters	Advanced database settings for the operator output. Available only for TABLE output. See Storage Parameters Dialog Box for more information.
Drop If Exists	Specifies whether to overwrite an existing table. Yes - If a table with the name exists, it is dropped before storing the results. No - If a table with the name exists, the results window shows an error message.
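
The Drop If Exists behavior can be pictured with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in database; the function name, table name, and two-column layout are assumptions made for illustration and do not reflect how Team Studio stores results internally.

```python
import sqlite3


def store_results(conn, table, rows, drop_if_exists):
    """Write (id, prediction) rows to `table`, mimicking the Drop If Exists choice."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone() is not None
    if exists:
        if not drop_if_exists:
            # "No": report an error instead of overwriting the table.
            raise RuntimeError(f"table {table!r} already exists")
        # "Yes": drop the old table before storing the new results.
        conn.execute(f'DROP TABLE "{table}"')
    conn.execute(f'CREATE TABLE "{table}" (id INTEGER, prediction TEXT)')
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
store_results(conn, "classifier_out", [(1, "yes")], drop_if_exists=True)
store_results(conn, "classifier_out", [(2, "no")], drop_if_exists=True)  # overwrites
print(conn.execute("SELECT * FROM classifier_out").fetchall())

try:
    store_results(conn, "classifier_out", [(3, "yes")], drop_if_exists=False)
except RuntimeError as err:
    print("error:", err)
```

With Yes, the second run replaces the first table. With No, the existing table is left untouched and an error is reported.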

Output

Visual Output

The Classifier writes its prediction columns, together with the columns of the input data set, to the prediction table location specified by the user.

The data rows of the output table or view are displayed (up to 2,000 rows).

For example, the output for a dependent column, srsdlqncy, might look like the following.

Data Output

The Classifier operator outputs the following three standardized prediction columns:

P_dependent_column_name: The predicted value, which is one of the possible values of the dependent column.
C_dependent_column: The confidence that the result is the P_dependent_column_name predicted value.
C_dependent_column_details: The confidence values associated with the dependent column's possible values.

Note: If the Classifier operator has more than one input model, the resulting output has three prediction columns per input model, and the column names are prefixed with the input model operator's name.
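
How the three prediction columns relate to one another, and how they multiply with several input models, can be sketched in plain Python. This is an illustration under stated assumptions, not the operator's implementation: it assumes the predicted value is the class with the highest confidence, that the model name is joined to the column name with an underscore, and that the details column holds a mapping of class to confidence. The model names and confidence values are hypothetical.

```python
def prediction_columns(dependent, confidences, model_name=None):
    """Build the three prediction columns from per-class confidences."""
    # Assumption: the predicted value is the class with the highest confidence.
    predicted = max(confidences, key=confidences.get)
    # Assumption: the model operator's name is joined with an underscore.
    prefix = f"{model_name}_" if model_name else ""
    return {
        f"{prefix}P_{dependent}": predicted,
        f"{prefix}C_{dependent}": confidences[predicted],
        f"{prefix}C_{dependent}_details": dict(confidences),
    }


# Hypothetical confidences for one input row, from two hypothetical models.
row = {}
row.update(prediction_columns("srsdlqncy", {"0": 0.8, "1": 0.2}, "logistic_regression"))
row.update(prediction_columns("srsdlqncy", {"0": 0.4, "1": 0.6}, "naive_bayes"))

for name, value in row.items():
    print(name, value)
```

Two input models yield six prediction columns for the row, three per model. Calling the function without a model name gives the unprefixed single-model column names.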
