Aggregation (DB)

Performs aggregate calculations on a data set by specifying a group-by configuration and an aggregate expression.

Information at a Glance

Category Transform
Data source type DB
Sends output to other operators Yes
Data processing tool n/a
Note: the Aggregation (DB) operator is for database data only. for Hadoop data, use the Aggregation (HD) operator.

The Aggregation operator also supports the ability to define window fields.

Input

A data set from the preceding operator.

Configuration

Parameter Description
Notes Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator.
Aggregates Specify the Group By and Aggregators fields.

To display the Define Column Aggregations dialog box, click Define Aggregations.

Window Field List Specifies window columns, associated window function expressions, and the window OVER specifications for the aggregate data set.

A window column is calculated from aggregate data by assessing the values for all other data source columns first, then walking through the results and, based on the groups of records, calculating a new value for each group. This is helpful, for example, when sorting data by percent of the total or ranking data groups.

To display the Window Column Configuration dialog box, click Window Field List. For more information, see Window Column Configuration Dialog Box.

Output Type
  • TABLE outputs a database table. Specifying TABLE enables Storage Parameters.
  • VIEW outputs a database view.
Output Schema The schema for the output table or view.
Output Table The table path and name where the results are output. By default, this is a unique table name based on your user ID, workflow ID, and operator.
Storage Parameters Advanced database settings for the operator output. Available only for TABLE output.

See Storage Parameters Dialog Box for more information.

Drop If Exists Specifies whether to overwrite an existing table.
  • Yes - If a table with the name exists, it is dropped before storing the results.
  • No - If a table with the name exists, the results window shows an error message.

Output

Visual Output
The data rows of the output table/view displayed (up to 2000 rows of the data).
Data Output
A new data set.