Normalization (HD)
Performs normalization on the selected columns of the input data set. Normalization means adjusting values measured on different scales to a notionally common scale.
Information at a Glance
Note: The Normalization (HD) operator is for Hadoop data only. For database data, use the Normalization (DB) operator.
Algorithm
You can normalize data in several ways. The Method parameter offers four types of normalization:
- Z-Transformation.
- Proportion Transformation.
- Range Transformation.
- Divide-By-Average Transformation.
See Method under Configuration for a definition of each type.
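The four types can be illustrated with a minimal sketch in plain Python. It assumes the conventional definition of each transformation and is not taken from the operator's implementation: the function names, the use of the population standard deviation, and the default output range of 0 to 1 are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Subtract the mean, then divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population (not sample) standard deviation
    return [(v - mu) / sigma for v in values]


def proportion_transform(values):
    """Divide each value by the column total, so the results sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def range_transform(values, range_min=0.0, range_max=1.0):
    """Rescale linearly so the smallest value maps to range_min and the largest to range_max."""
    lo, hi = min(values), max(values)
    scale = (range_max - range_min) / (hi - lo)
    return [range_min + (v - lo) * scale for v in values]


def divide_by_average(values):
    """Divide each value by the column mean, so the results average 1."""
    mu = mean(values)
    return [v / mu for v in values]


if __name__ == "__main__":
    column = [2.0, 4.0, 6.0, 8.0]  # hypothetical numeric column
    for fn in (z_transform, proportion_transform, range_transform, divide_by_average):
        print(fn.__name__, fn(column))
```

In this sketch, `range_min` and `range_max` play the role of the Range Minimum and Range Maximum parameters. A constant column (or a column that sums or averages to zero) would cause a division by zero here; how the operator handles such columns is not covered by this page.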
Configuration
Parameter | Description |
---|---|
Notes | Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator. |
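Method | Normalization method to use: Z-Transformation, Proportion Transformation, Range Transformation, or Divide-By-Average Transformation. |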
Range Minimum | The minimum value to use for the Range Transformation method. |
Range Maximum | The maximum value to use for the Range Transformation method. |
Columns | Click Select Columns to open the dialog box and choose which of the available numeric columns to normalize. |
Store Results? | Specifies whether to store the results. |
Results Location | The HDFS directory where the results of the operator are stored. This is the main directory, the subdirectory of which is specified in Results Name. Click Choose File to open the Hadoop File Explorer Dialog Box and browse to the storage location. Do not edit the text directly. |
Results Name | The name of the file in which to store the results. |
Overwrite | Specifies whether to delete existing data at that path and file name. |
Storage Format | Select the format in which to store the results. The storage format is determined by your type of operator. Typical formats are Avro, CSV, TSV, or Parquet. |
Compression | Select the type of compression for the output. Available Avro compression options. |
Use Spark | If Yes (the default), uses Spark to optimize calculation time. |
Advanced Spark Settings Automatic Optimization | |
Copyright © Cloud Software Group, Inc. All rights reserved.