Sort By Multiple Columns
Sorts a data set by up to three columns that you choose and returns the sorted data set. Optionally, the operator adds a column called row_index that enables you to filter the output based on the sorting results.
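
The behavior described above can be illustrated outside the operator with a minimal Python sketch. It is not the operator's implementation: the function name `sort_by_columns`, the list-of-dictionaries data layout, and the choice to start `row_index` at 1 are all assumptions made for illustration.

```python
def sort_by_columns(rows, sort_spec, add_row_index=False):
    """Sort rows (a list of dicts) by one to three columns.

    sort_spec is a list of (column_name, ascending) pairs, primary column first.
    Hypothetical helper for illustration; not part of the product.
    """
    if not 1 <= len(sort_spec) <= 3:
        raise ValueError("choose between one and three sort columns")

    # Copy each row so the caller's data is left unchanged.
    result = [dict(row) for row in rows]

    # Python's sort is stable, so sorting by the least significant column
    # first and the primary column last yields a multi-column ordering.
    for column, ascending in reversed(sort_spec):
        result.sort(key=lambda row: row[column], reverse=not ascending)

    if add_row_index:
        # Assumption: the index starts at 1; the operator's numbering may differ.
        for position, row in enumerate(result, start=1):
            row["row_index"] = position
    return result


rows = [
    {"dept": "b", "age": 30},
    {"dept": "a", "age": 25},
    {"dept": "a", "age": 40},
]
# Primary: dept ascending. Secondary: age descending.
ranked = sort_by_columns(rows, [("dept", True), ("age", False)], add_row_index=True)

# Filtering on row_index keeps the first rows of the sorted result.
top_two = [row for row in ranked if row["row_index"] <= 2]
```

Filtering on `row_index` in this way is the kind of downstream use the extra column is meant to support, for example keeping only the first N rows of the sorted output.
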
Configuration
Parameter | Description |
---|---|
Notes | Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator. |
Primary Sort Column | First column to sort by. While Secondary Sort Column and Tertiary Sort Column can be left blank, this column is required. |
Primary Column Sort Order | Order by which to sort the first column: Ascending (the default) or Descending. |
Secondary Sort Column | Second column to sort by. To sort by one column only, leave this column and the Tertiary Sort Column blank. |
Secondary Column Sort Order | Order by which to sort the second column: Ascending (the default) or Descending. |
Tertiary Sort Column | Third column to sort by. To sort by two columns only, leave this one blank. |
Tertiary Column Sort Order | Order by which to sort the third column: Ascending (the default) or Descending. |
Create 'row_index' Column | Specifies whether to add an extra column, row_index, that shows the sort index of each row. Default value: No. |
Write Rows Removed Due to Null Data to File | Rows with null values in any of the columns selected to sort by are removed from the analysis. This parameter specifies whether those removed rows are written to a file. The file is written to the same directory as the rest of the output, and the filename is suffixed with _baddata. |
Storage Format | Select the format in which to store the results. The storage format is determined by your type of operator. Typical formats are Avro, CSV, TSV, or Parquet. |
Compression | Select the type of compression for the output. Available Avro compression options. |
Output Directory | The location to store the output files. |
Output Name | The name of the output that contains the results. |
Overwrite Output | Specifies whether to delete existing data at that path. |
Advanced Spark Settings Automatic Optimization |
|
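
The null-handling rule in the table above (rows are removed only when a selected sort column is null) can be sketched in Python as follows. This is an illustration under stated assumptions, not the operator's code: `split_null_rows` and `bad_data_name` are hypothetical helpers, and the exact naming of the _baddata file beyond its suffix is assumed.

```python
def split_null_rows(rows, sort_columns):
    """Separate rows that have a null (None) in any selected sort column.

    Nulls in columns that are not used for sorting do not remove a row.
    Hypothetical helper for illustration; not part of the product.
    """
    kept, removed = [], []
    for row in rows:
        if any(row.get(column) is None for column in sort_columns):
            removed.append(row)
        else:
            kept.append(row)
    return kept, removed


def bad_data_name(output_name):
    # Assumption: the suffix is appended directly to the output name.
    return f"{output_name}_baddata"


rows = [
    {"a": 1, "b": None},     # null only in a non-sort column: kept
    {"a": None, "b": 2},     # null in the sort column: removed
    {"a": 3, "b": 4},
]
kept, removed = split_null_rows(rows, ["a"])
```

Here `kept` would go on to be sorted, while `removed` is what would be written to the file named by `bad_data_name` when the parameter is enabled.
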
Copyright © Cloud Software Group, Inc. All rights reserved.