Row Cleanser
This operator removes records that do not meet the specified row-completeness criteria.
Information at a Glance
| Parameter | Description |
|---|---|
| Category | Transform |
| Data source type | TIBCO® Data Virtualization |
| Send output to other operators | Yes |
| Data processing tool | TIBCO® DV, Apache Spark 3.2 or later |
Algorithm
This operator applies a set of rules to remove incomplete rows. You select the columns to check and then set a filtering condition. Rows that meet this condition are removed.
For each row, the operator counts the null values in the selected columns. It then applies the configured rules so that every remaining row stays within the specified limit of null columns.
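The counting and filtering steps described above can be sketched in plain Python. This is only an illustration, not the operator's implementation: the names `count_nulls`, `cleanse`, and `remove_at` are hypothetical, the sample rows are invented, and the sketch assumes a row is removed once its null count reaches the limit.

```python
from typing import Any, Dict, List, Optional, Sequence

Row = Dict[str, Optional[Any]]


def count_nulls(row: Row, columns: Sequence[str]) -> int:
    """Count how many of the selected columns are null (None) in one row."""
    return sum(1 for col in columns if row.get(col) is None)


def cleanse(rows: List[Row], columns: Sequence[str], remove_at: int) -> List[Row]:
    """Keep rows whose null count in `columns` is below `remove_at`.

    Assumption: a row is removed when its null count reaches `remove_at`.
    """
    return [row for row in rows if count_nulls(row, columns) < remove_at]


# Invented sample rows, for illustration only.
rows = [
    {"a": 1, "b": "x", "c": 2.5},           # 0 nulls
    {"a": None, "b": "y", "c": None},       # 2 nulls
    {"a": None, "b": None, "c": None},      # 3 nulls
]
kept = cleanse(rows, ["a", "b", "c"], remove_at=2)
print(len(kept))  # 1
```

Only the columns passed in `columns` are inspected, so nulls in unselected columns never cause a row to be dropped.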
Input
The input is a single tabular data set.
Configuration
The following table provides the configuration details for the Row Cleanser operator.
| Parameter | Description |
|---|---|
| Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
| Columns to Use | Specify the columns to check for null values. Click Select Columns to select the required columns. |
| How many selected columns should be null before removing rows | Specify the filtering limit to apply. The following values are available: All, Any, A percentage of columns, A number of columns. Default: All |
| Percentage(%) / Number of Columns | Specify the percentage or number of columns to use as the limit. If the previous parameter is set to A percentage of columns, specify the desired percentage. If it is set to A number of columns, specify the desired number. If it is set to All or Any, this parameter is ignored. Default: 80 |
| Output Schema | Specify the schema for the output table or view. |
| Output Table | Specify the table path and name where the output of the results is generated. By default, this is a unique table name based on your user ID, workflow ID, and operator. |
| Store Results | When set to Yes, the operator saves the results. If set to No, the operator does not save the results. |
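To make the interaction between the two limit parameters concrete, the following sketch maps a mode and value to the smallest null count at which a row would be removed. It is a hedged illustration rather than the operator's actual logic: the function name `nulls_needed_to_remove` is hypothetical, and the choice to round a fractional percentage limit up is an assumption.

```python
import math


def nulls_needed_to_remove(mode: str, n_selected: int, value: float = 80) -> int:
    """Return the smallest null count at which a row is removed.

    `mode` is the value of "How many selected columns should be null before
    removing rows"; `value` is "Percentage(%) / Number of Columns".
    Assumption: a row is removed when its null count reaches the limit.
    """
    if n_selected <= 0:
        raise ValueError("at least one column must be selected")
    if mode == "All":
        return n_selected  # value is ignored
    if mode == "Any":
        return 1  # value is ignored
    if mode == "A percentage of columns":
        # Assumption: a fractional column count is rounded up.
        return math.ceil(n_selected * value / 100)
    if mode == "A number of columns":
        return int(value)
    raise ValueError(f"unknown mode: {mode}")


print(nulls_needed_to_remove("All", 3))                         # 3
print(nulls_needed_to_remove("Any", 3))                         # 1
print(nulls_needed_to_remove("A percentage of columns", 3, 80)) # 3
print(nulls_needed_to_remove("A number of columns", 5, 2))      # 2
```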
Output
Example
The following example displays the cleansed data for the given data set using the Row Cleanser operator.
The data set contains:
- Multiple columns, namely outlook, temperature, wind, humidity, and play.
- Multiple rows (14 rows).

The parameter settings for the Row Cleanser operator are:
- Columns to Use: outlook, temperature, humidity
- How many selected columns should be null before removing rows: A percentage of columns
- Percentage(%) / Number of Columns: 80
- Store Results: Yes
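As a quick check of what these settings imply, the short sketch below lists which null counts would reach the 80 percent limit across the three selected columns. It assumes the percentage is measured against the selected columns only and that a row is removed when the limit is reached. The helper name `null_counts_removed` is hypothetical.

```python
selected = ["outlook", "temperature", "humidity"]
percentage = 80


def null_counts_removed(n_columns: int, percentage: float) -> list:
    """List the per-row null counts whose share of columns reaches the limit."""
    return [k for k in range(n_columns + 1) if 100 * k / n_columns >= percentage]


removed_at = null_counts_removed(len(selected), percentage)
print(removed_at)  # [3]
```

Under these assumptions, two nulls out of three columns is about 67 percent, which is below the limit, so only rows in which outlook, temperature, and humidity are all null would be removed.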