Returns only the distinct combinations of values from the specified columns of a database source. Rows are not returned in any particular order, but the combination of values in each row is distinct from that in every other row.
Information at a Glance
Category | Transform
Data source type | DB
Sends output to other operators | Yes
Data processing tool | SQL
Note: The Distinct (DB) operator is for database data only. For Hadoop data, use the Distinct (HD) operator.
Input
A database source. Users choose the columns from which they want distinct combinations, and the operator performs the calculation.
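As an illustration of the calculation, the following minimal sketch runs a `SELECT DISTINCT` over two chosen columns. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a real database source. The `orders` table, its columns, and the assumption that the operator's result matches a plain `SELECT DISTINCT` are illustrative and are not taken from the product.

```python
import sqlite3

# Hypothetical sample data; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, product TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("East", "Widget", 5),
        ("East", "Widget", 9),   # same (region, product) as the row above
        ("East", "Gadget", 2),
        ("West", "Widget", 7),
        ("West", "Widget", 7),   # exact duplicate row
    ],
)

# Assumed equivalent of choosing region and product as the distinct columns.
rows = conn.execute("SELECT DISTINCT region, product FROM orders").fetchall()

# Row order is not guaranteed, so sort before displaying or comparing.
distinct_rows = sorted(rows)
print(distinct_rows)
```

Five input rows reduce to three rows here, because `quantity` is not among the selected columns and therefore plays no part in deciding which rows are distinct.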
- Bad or Missing Values
- Missing values are taken into account when distinct values are determined. A missing value in a column is treated as distinct from any non-missing value.
- This operator handles null values by eliminating them from the input calculation. To prevent this behavior, use the Null Value Replacement operator on the initial training data to replace bad or missing values.
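Null handling is worth checking against the actual database, because it affects which rows count as distinct. The sketch below shows how SQLite (used here only as a stand-in, through Python's `sqlite3` module) treats nulls in `SELECT DISTINCT`, and how replacing nulls beforehand changes the result. The `readings` table, the `'unknown'` placeholder, and the use of `COALESCE` in place of an upstream Null Value Replacement step are assumptions made for illustration; the SQL that the operator generates may behave differently.

```python
import sqlite3

# Hypothetical sample data with missing values in the status column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("A", "ok"),
        ("A", None),   # missing status
        ("A", None),   # second missing status for the same sensor
        ("B", "ok"),
    ],
)

# Plain DISTINCT: SQLite treats NULLs as equal to each other when detecting
# duplicates, so the two ("A", NULL) rows collapse into a single row that is
# still distinct from ("A", "ok").
with_nulls = conn.execute(
    "SELECT DISTINCT sensor, status FROM readings"
).fetchall()

# Replacing NULL first (COALESCE stands in for an upstream replacement step)
# turns the missing value into an ordinary value before DISTINCT is applied.
replaced = conn.execute(
    "SELECT DISTINCT sensor, COALESCE(status, 'unknown') FROM readings"
).fetchall()

print(set(with_nulls))
print(set(replaced))
```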
Configuration
Parameter | Description
Notes | Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator.
Distinct Columns *required | Select one or more columns from the data source by which to generate rows of data, where each row has a distinct combination of column values.
Output Type | TABLE outputs a database table; specifying TABLE enables Storage Parameters. VIEW outputs a database view.
Output Schema | The schema for the output table or view.
Output Table | The table path and name where the results are output. By default, this is a unique table name based on your user ID, workflow ID, and operator.
Drop If Exists | Specifies whether to overwrite an existing table. Yes: if a table with the name exists, it is dropped before storing the results. No: if a table with the name exists, the results window shows an error message.
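The interaction between Output Type and Drop If Exists can be sketched with ordinary SQL statements. The helper below, `store_distinct`, is hypothetical and is not the operator's implementation: it assumes that TABLE and VIEW correspond to `CREATE TABLE ... AS` and `CREATE VIEW ... AS`, and that Drop If Exists corresponds to a preceding `DROP ... IF EXISTS`. SQLite, through Python's `sqlite3` module, again stands in for the real database, and all table names are made up.

```python
import sqlite3


def store_distinct(conn, source, columns, output, output_type="TABLE",
                   drop_if_exists=True):
    """Hypothetical helper: store distinct rows as a table or a view."""
    kind = output_type.upper()
    if kind not in ("TABLE", "VIEW"):
        raise ValueError("output_type must be TABLE or VIEW")
    # Identifiers are interpolated directly; acceptable only for trusted names.
    if drop_if_exists:
        conn.execute(f"DROP {kind} IF EXISTS {output}")
    column_list = ", ".join(columns)
    # Without the drop, this statement fails if the output name already exists.
    conn.execute(
        f"CREATE {kind} {output} AS SELECT DISTINCT {column_list} FROM {source}"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", "Widget"), ("East", "Widget"), ("West", "Widget")],
)

# Drop If Exists = Yes: running twice succeeds because the table is replaced.
store_distinct(conn, "orders", ["region", "product"], "distinct_out")
store_distinct(conn, "orders", ["region", "product"], "distinct_out")

# Drop If Exists = No: the existing table makes the second creation fail.
try:
    store_distinct(conn, "orders", ["region", "product"], "distinct_out",
                   drop_if_exists=False)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(error)
```

A view stores only the query, so it reflects later changes to the source table, whereas a table holds a snapshot of the distinct rows at the time it was created.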
Output
- Data Output
- A subset of the data that contains only the selected columns, in which each row holds a distinct combination of values in those columns.
Copyright © Cloud Software Group, Inc. All rights reserved.