Goodness of Fit

This operator computes the goodness-of-fit metrics for each class in a dependent column.

Information at a Glance

Note: This operator can only be used with TIBCO® Data Virtualization and Apache Spark 3.2 or later.

Parameter Description
Category Model Validation
Data source type TIBCO® Data Virtualization
Send output to other operators Yes
Data processing tool TIBCO® DV, Apache Spark 3.2 or later

Algorithm

This operator takes one or more classification model objects and one input data set from upstream. Then it applies each of the model objects to the input data and computes the Goodness of Fit metrics for each class, including Accuracy, Error, Recall, Precision, and FMeasure. This operator applies only to classification models.
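The metrics named above can be illustrated with a minimal sketch. It assumes the standard one-vs-rest definitions (true/false positives and negatives counted per class); the operator's exact formulas are not stated in this section, so treat this as an illustration rather than the product's implementation. The function name `per_class_metrics` and the sample labels are hypothetical.

```python
def per_class_metrics(actual, predicted):
    """Compute one-vs-rest metrics for every class label.

    Assumption: standard textbook definitions; the operator itself
    may handle edge cases (such as empty classes) differently.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = len(actual)
    if total == 0:
        raise ValueError("no rows to evaluate")

    pairs = list(zip(actual, predicted))
    metrics = {}
    for label in sorted(set(actual) | set(predicted)):
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        tn = total - tp - fp - fn

        accuracy = (tp + tn) / total
        # Ratios with an empty denominator are reported as 0.0 in this sketch.
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0

        metrics[label] = {
            "Accuracy": accuracy,
            "Error": 1 - accuracy,
            "Recall": recall,
            "Precision": precision,
            "FMeasure": f_measure,
        }
    return metrics


# Hypothetical labels, not taken from the golf data set.
actual = ["yes", "yes", "no", "no"]
predicted = ["yes", "no", "no", "no"]
result = per_class_metrics(actual, predicted)
```

In this sketch each class is scored against all other classes combined, which is why Accuracy and Error are reported per class rather than once for the whole model.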

Input

The input is a single tabular data set and one or more TIBCO Data Virtualization model operators.

Bad or Missing Values
  • Null values are not allowed and result in an error.
  • A tabular data set and at least one model object must be connected to this operator; otherwise, the operator returns an error.
  • The operator does not accept more than one data set.
  • The dependent column must be available in the input data set; otherwise, the operator returns an error.
  • The operator accepts only classification models.
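The rules in this list can be pictured as pre-flight checks. The following is a hypothetical sketch only: the function `validate_inputs`, its parameters, and the representation of rows as dictionaries are assumptions made for illustration and do not reflect the operator's internal code or error messages.

```python
def validate_inputs(rows, dependent_column, model_count):
    """Raise ValueError if the inputs break one of the listed rules.

    Assumptions: `rows` is a list of dicts (one per table row) and
    `model_count` is the number of connected classification models.
    """
    if model_count < 1:
        raise ValueError("at least one model object must be connected")
    if not rows:
        raise ValueError("a tabular data set must be connected")
    for index, row in enumerate(rows):
        if dependent_column not in row:
            raise ValueError(
                f"dependent column {dependent_column!r} is missing in row {index}"
            )
        for column, value in row.items():
            if value is None:
                raise ValueError(f"null value in column {column!r}, row {index}")
    return True


def raises_value_error(rows, dependent_column, model_count):
    """Helper for the usage example: report whether validation fails."""
    try:
        validate_inputs(rows, dependent_column, model_count)
    except ValueError:
        return True
    return False


# Hypothetical rows; the column names follow the golf example in this page.
good_rows = [{"outlook": "sunny", "play": "no"}]
null_rows = [{"outlook": None, "play": "no"}]
```

The check that a model is a classification model is omitted here because it depends on model metadata that this section does not describe.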

Configuration

The following table provides the configuration details for the Goodness of Fit operator.

Parameter Description
Notes Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator.
Output Schema Specify the schema for the output table or view.
Output Table Specify the path and name of the table where the results are written. By default, this is a unique table name based on your user ID, workflow ID, and operator.
Store Results When set to Yes, the operator saves the results. If set to No, the operator does not save the results.

Output

Visual Output

A table that displays the operator's output data set.

Output to Successive Operators

A database table that downstream operators can use.

Example

The following example builds a Naive Bayes model, and then evaluates the model with the Goodness of Fit operator.

Goodness of Fit operator workflow

Data

golf: This data set contains the following information:

  • Five columns: outlook, temperature, wind, humidity, and play.
  • 14 rows.

Parameter Setting

The parameter settings for the golf data set are as follows:

  • Store Results: Yes

Output

The following figure displays the output for the parameter settings for the golf data set.

Goodness of Fit operator Output