Convert
Converts a Hadoop CSV file to either Avro or Parquet format.
Information at a Glance
| Parameter | Description |
|---|---|
| Category | Tools |
| Data source type | HD |
| Send output to other operators | Yes |
| Data processing tool | n/a |
Input
One CSV data set from the preceding operator.
Configuration
| Parameter | Description |
|---|---|
| Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
| Storage Format | Select the format in which to store the results. The storage format is determined by your type of operator. Typical formats are Avro, CSV, TSV, or Parquet. |
| Compression | Select the type of compression for the output. The available options depend on the storage format (Parquet or Avro). |
| Results Location | The HDFS directory where the results of the operator are stored. This is the main directory, the sub-directory of which is specified in Results Name. Click Choose File to open the Hadoop File Explorer dialog and browse to the storage location. Do not edit the text directly. |
| Results Name | The name of the file in which to store the results. |
| Overwrite | Specifies whether to delete existing data at that path and file name. |
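As a rough illustration of how Results Location, Results Name, and Overwrite relate, the following Python sketch joins the two path settings and clears any existing data at the target when overwriting is enabled. It is not the operator's implementation: the function names `resolve_results_path` and `prepare_output` are hypothetical, a local file system stands in for HDFS, and raising an error when Overwrite is off is an assumption, since this page does not describe that case.

```python
import posixpath
import shutil
from pathlib import Path


def resolve_results_path(results_location: str, results_name: str) -> str:
    """Join the Results Location directory and the Results Name (hypothetical helper)."""
    # HDFS paths use forward slashes, so posixpath is used on every platform.
    return posixpath.join(results_location, results_name)


def prepare_output(path: Path, overwrite: bool) -> None:
    """Mimic the Overwrite setting on a local path (stand-in for HDFS)."""
    if path.exists():
        if not overwrite:
            # Assumption: refusing to write is one plausible behavior when
            # Overwrite is off; the operator's actual behavior is not documented here.
            raise FileExistsError(f"{path} already exists and Overwrite is off")
        # Delete existing data at that path, whether it is a directory or a file.
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    # Make sure the main results directory exists before anything is written.
    path.parent.mkdir(parents=True, exist_ok=True)
```

For example, `resolve_results_path("/tmp/out", "converted")` yields `/tmp/out/converted`, and calling `prepare_output` on that path with `overwrite=True` removes whatever is already stored there.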
Output
Visual Output
A preview of the data.
Data Output
A data set of the chosen format and compression.