Hadoop file parameter options conref

 

Use this conref for Hadoop operator topics that include the file output options. The conrefs are the tgroup elements, not the entire table. Typically, you would use all three of these tgroups.

Storage conref
Storage Format Select the format in which to store the results. The storage format is determined by your type of operator.

Typical formats are Avro, CSV, TSV, or Parquet.

Compression Select the type of compression for the output.

Available Parquet compression options:

  • GZIP
  • Deflate
  • Snappy
  • no compression

Available Avro compression options:

  • Deflate
  • Snappy
  • no compression
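
The format-specific lists above can be read as a lookup from storage format to permitted codecs. The following is a minimal sketch of that lookup, not the product's own validation logic; the names `SUPPORTED_COMPRESSION` and `check_compression` are hypothetical, and the lowercase keys are an assumption made for illustration.

```python
# Codecs per storage format, copied from the lists above.
# The dictionary name and lowercase spelling are illustrative assumptions.
SUPPORTED_COMPRESSION = {
    "parquet": {"gzip", "deflate", "snappy", "no compression"},
    "avro": {"deflate", "snappy", "no compression"},
}


def check_compression(storage_format: str, compression: str) -> bool:
    """Return True if the codec is listed for the given storage format."""
    allowed = SUPPORTED_COMPRESSION.get(storage_format.lower())
    if allowed is None:
        # Formats such as CSV or TSV have no compression list in this topic.
        raise ValueError(f"no compression list documented for {storage_format!r}")
    return compression.lower() in allowed
```

For example, `check_compression("Avro", "GZIP")` returns `False` under this sketch, because GZIP appears only in the Parquet list.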
Compression Only conref
Compression Select the type of compression for the output.

Available Parquet compression options are the following.

  • GZIP
  • Deflate
  • Snappy
  • no compression

Available Avro compression options are the following.

  • Deflate
  • Snappy
  • no compression
Output info conref
Output Directory The location to store the output files.
Output Name The name under which to store the results.
Overwrite Output Specifies whether to delete existing data at that path.
  • Yes - if the path exists, delete that file and save the results.
  • No - fail if the path already exists.
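
The Yes/No behavior of Overwrite Output can be sketched as follows. This is an illustration only: it uses the local file system through `pathlib` and `shutil` as a stand-in for HDFS, and the function name `prepare_output` is hypothetical rather than part of the product.

```python
import shutil
from pathlib import Path


def prepare_output(output_dir: str, output_name: str, overwrite: bool) -> Path:
    """Apply the Overwrite Output rule and return the path to write to."""
    target = Path(output_dir) / output_name
    if target.exists():
        if not overwrite:
            # "No": fail if the path already exists.
            raise FileExistsError(f"{target} already exists and Overwrite Output is No")
        # "Yes": delete the existing data before saving the new results.
        if target.is_dir():
            shutil.rmtree(target)
        else:
            target.unlink()
    return target
```

With `overwrite=False`, an existing path is left untouched and the call raises an error; with `overwrite=True`, the existing data is removed and the returned path is free to be written.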
Not all Spark-enabled operators have this option; use as necessary.
Use Spark If Yes (the default), uses Spark to optimize calculation time.
Advanced Spark Settings information conref
Advanced Spark Settings Automatic Optimization
  • Yes uses the default Spark optimization settings.
  • No lets you provide customized Spark optimization settings. Click Edit Settings to customize Spark optimization. See Advanced Settings dialog for more information.
Store results options for Hadoop
Store Results? Specifies whether to store the results.
  • true - results are stored.
  • false - the data set is passed to the next operator without storing.
Results Location The HDFS directory where the results of the operator are stored. This is the main directory, the subdirectory of which is specified in Results Name. Click Choose File to open the Hadoop File Explorer dialog and browse to the storage location. Do not edit the text directly.
Results Name The name of the file in which to store the results.
Overwrite Specifies whether to delete existing data at that path and file name.
  • Yes - if the path exists, delete that file and save the results.
  • No - fail if the path already exists.
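
How Store Results?, Results Location, and Results Name combine can be sketched as below. This is a hedged illustration, assuming the results land at Results Location joined with Results Name using HDFS-style forward slashes; the function name `resolve_results_path` and the example paths are hypothetical.

```python
import posixpath
from typing import Optional


def resolve_results_path(store_results: bool,
                         results_location: str,
                         results_name: str) -> Optional[str]:
    """Return the full results path, or None when results are not stored."""
    if not store_results:
        # "false": the data set is passed on without being stored.
        return None
    # "true": Results Name is resolved under the main Results Location directory.
    return posixpath.join(results_location, results_name)


# Hypothetical values, for illustration only.
print(resolve_results_path(True, "/user/alice/results", "scored_output"))
# prints /user/alice/results/scored_output
```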