Goodness of Fit
This operator computes the goodness-of-fit metrics for each class in a dependent column.
Information at a Glance
Parameter | Description |
---|---|
Category | Model Validation |
Data source type | TIBCO® Data Virtualization |
Send output to other operators | Yes |
Data processing tool | TIBCO® DV, Apache Spark 3.2 or later |
Algorithm
This operator takes one or more classification model objects and one input data set from upstream. Then it applies each of the model objects to the input data and computes the Goodness of Fit metrics for each class, including Accuracy, Error, Recall, Precision, and FMeasure. This operator applies only to classification models.
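The per-class metrics named above can be sketched in plain Python. This is only an illustration and not the operator's implementation: it assumes each class is scored one-vs-rest against all other classes, that Error is 1 minus Accuracy, and that FMeasure is the harmonic mean of Precision and Recall. The function name `goodness_of_fit` and the metric keys are hypothetical.

```python
from collections import Counter


def goodness_of_fit(actual, predicted):
    """Return one-vs-rest metrics for every class label (assumed definitions)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    total = len(actual)
    # Count each (actual label, predicted label) pair once.
    pairs = Counter(zip(actual, predicted))
    classes = sorted(set(actual) | set(predicted))

    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(n for (a, p), n in pairs.items() if p == c and a != c)
        fn = sum(n for (a, p), n in pairs.items() if a == c and p != c)
        tn = total - tp - fp - fn

        accuracy = (tp + tn) / total
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0

        metrics[c] = {
            "accuracy": accuracy,
            "error": 1.0 - accuracy,
            "recall": recall,
            "precision": precision,
            "f_measure": f_measure,
        }
    return metrics
```

Calling `goodness_of_fit(actual_labels, predicted_labels)` once per model would give one row of metrics per class, which mirrors how the operator applies each connected model to the same input data.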
Input
The operator requires a single tabular data set and one or more model objects from TIBCO Data Virtualization model operators.
- Null values are not allowed and result in an error.
- A tabular data set and at least one model object must be connected to this operator; otherwise, the operator returns an error.
- The operator does not accept more than one data set.
- The dependent column must be available in the input data set; otherwise, the operator returns an error.
- The operator accepts only classification models.
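The input rules above can be read as a series of pre-flight checks. The following sketch is an assumption-laden illustration, not the operator's code: it models the data set as a list of row dictionaries and each model as a dictionary with a hypothetical `"type"` key, and the function name `validate_inputs` is invented for this example.

```python
def validate_inputs(datasets, models, dependent_column):
    """Raise ValueError when the inputs break one of the listed rules."""
    # Exactly one tabular data set must be connected.
    if len(datasets) != 1:
        raise ValueError("exactly one tabular data set must be connected")
    # At least one model object must be connected.
    if not models:
        raise ValueError("at least one model object must be connected")
    # Only classification models are accepted ("type" is a hypothetical key).
    if any(m.get("type") != "classification" for m in models):
        raise ValueError("only classification models are accepted")

    rows = datasets[0]
    for i, row in enumerate(rows):
        # The dependent column must be present in the input data.
        if dependent_column not in row:
            raise ValueError(f"row {i}: missing dependent column {dependent_column!r}")
        # Null values are not allowed.
        if any(value is None for value in row.values()):
            raise ValueError(f"row {i}: null values are not allowed")
    return rows
```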
Configuration
The following table provides the configuration details for the Goodness of Fit operator.
Parameter | Description |
---|---|
Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
Output Schema | Specify the schema for the output table or view. |
Output Table | Specify the path and name of the table where the results are written. By default, this is a unique table name based on your user ID, workflow ID, and operator. |
Store Results | When set to Yes, the operator saves the results. If set to No, the operator does not save the results. |
Output
A table that displays the output of a data set.
A database table output that can be used by the downstream operator.
Example
The following example builds a Naive Bayes model, and then evaluates the model with the Goodness of Fit operator.
golf: This data set contains the following information:
- Multiple columns, namely outlook, temperature, wind, humidity, and play.
- 14 rows.
Parameter Setting
The parameter settings for the golf data set are as follows:
- Store Results: Yes
Output
The following figure displays the output for the parameter settings for the golf data set.