Confusion Matrix
This operator displays information about the actual versus predicted counts of a classification model and helps assess the model's accuracy for each of the possible class values.
Information at a Glance
Parameter | Description |
---|---|
Category | Model Validation |
Data source type | TIBCO® Data Virtualization |
Send output to other operators | No |
Data processing tool | TIBCO® DV, Apache Spark 3.2 or later |
Algorithm
The Confusion Matrix operator is a model validation operator used to evaluate the accuracy of the predicted classifications of any classification modeling algorithm. For more information, see the Confusion Matrix.
This operator takes one or more classification model objects and an input data set from upstream. It applies each model object to the input data and computes a confusion matrix for that model. Model performance is evaluated from the counts of true positives, true negatives, false positives, and false negatives arranged in the matrix.
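The counting step described above can be sketched in plain Python. This is a minimal illustration and not the operator's implementation: the function names `confusion_matrix` and `accuracy` and the `yes`/`no` labels are assumptions made for the example, and the sample values are invented rather than taken from any real data set.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (labels, matrix) where matrix[i][j] counts rows whose
    actual class is labels[i] and whose predicted class is labels[j]."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Use every class seen in either column so the matrix is square.
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def accuracy(matrix):
    """Fraction of rows on the diagonal (actual == predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical values, for illustration only.
actual = ["yes", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "yes", "yes"]

labels, matrix = confusion_matrix(actual, predicted)
print(labels)            # ['no', 'yes']
print(matrix)            # [[1, 1], [1, 2]]
print(accuracy(matrix))  # 0.6
```

With `yes` treated as the positive class, the cell in row `yes`, column `yes` holds the true positives, row `no`, column `no` the true negatives, row `no`, column `yes` the false positives, and row `yes`, column `no` the false negatives.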
Input
The input is a single tabular data set and one or more TIBCO Data Virtualization model operators.
- The operator accepts only classification models.
- The operator does not accept more than one data set.
- Null values are not allowed and result in an error.
- Two types of input (tabular data and at least one model object) must be connected to this operator to prevent errors.
- The dependent variable should be in the input data set, or else the operator produces an error.
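The data-related rules above (no null values, dependent variable present) can be checked before a data set is connected. The following is a hedged sketch only: `validate_input` is a hypothetical helper, rows are assumed to be Python dictionaries, and the column names are placeholders. It does not reproduce the operator's own checks or error messages.

```python
def validate_input(rows, dependent_column):
    """Raise ValueError if a row lacks the dependent column
    or contains a null (None) value."""
    for index, row in enumerate(rows):
        if dependent_column not in row:
            raise ValueError(
                f"row {index}: dependent variable '{dependent_column}' is missing"
            )
        for name, value in row.items():
            if value is None:
                raise ValueError(f"row {index}: column '{name}' is null")


# Hypothetical rows, for illustration only.
rows = [
    {"feature_a": "x", "label": "yes"},
    {"feature_a": "y", "label": "no"},
]
validate_input(rows, "label")  # passes silently

try:
    validate_input([{"feature_a": None, "label": "yes"}], "label")
except ValueError as err:
    print(err)  # row 0: column 'feature_a' is null
```

Running a check like this upstream makes it easier to locate the offending row than waiting for the operator to report an error.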
Configuration
The following table provides the configuration details for the Confusion Matrix operator.
Parameter | Description |
---|---|
Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
Output Schema | Specify the schema for the output table or view. |
Output Table | Specify the table path and name where the output of the results is generated. By default, this is a unique table name based on your user ID, workflow ID, and operator. |
Store Results | When set to Yes, the operator saves the results. If set to No, the operator does not save the results. |
Output
Visual output: A table that displays the results for each upstream model operator.
Output to successive operators: None. This is a terminal operator.
Example
The following example uses the golf train data set to build a Naive Bayes model and then evaluates the model on the golf test data set with the Confusion Matrix operator.
Data
golf train: This data set contains the following information:
- Multiple columns, namely outlook, temperature, wind, humidity, and play.
- Multiple rows (14 rows).
golf test: This data set contains the following information:
- Multiple columns, namely outlook, temperature, wind, humidity, and play.
- Multiple rows (14 rows).
Parameter Setting
The parameter settings for the given data set are as follows:
- Store Results: Yes
The following figure displays the results for the parameter settings for the given data set.