Pig Execute

Executes a user-defined Pig script (for parsing and sorting Hadoop data sources). The Pig Execute operator can also reference Pig UDFs (user-defined functions) that are supplied to the Team Studio server.

Information at a Glance

Category: Tools
Data source type: HD
Sends output to other operators: Yes
Data processing tool: Pig

The Pig script run by this operator receives the results of its preceding operators and is expected to pass its own results to the succeeding operator.

In version 5.7 and later, the file structure of the output is detected automatically. In earlier versions, the user must define the output structure.
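
As an illustration of the kind of parse-and-sort script this operator might run, the following is a minimal Pig Latin sketch. The aliases (raw, valid, sorted), the paths, and the schema are hypothetical and chosen only for this example. How the operator binds the output of preceding operators to the script, and how it applies the Results Location and Results Name settings, is not shown here, so the explicit LOAD and STORE statements are an assumption made to keep the example self-contained.

```pig
-- Hypothetical input path and schema. Inside the operator, the input would
-- come from a preceding operator rather than a hard-coded path.
raw = LOAD '/tmp/example/input.csv' USING PigStorage(',')
      AS (id:int, name:chararray, amount:double);

-- Parse: keep only rows that have a positive amount.
valid = FILTER raw BY amount IS NOT NULL AND amount > 0.0;

-- Sort: order the remaining rows by amount, largest first.
sorted = ORDER valid BY amount DESC;

-- Hypothetical output path. In the operator, the output location would be
-- governed by the Results Location and Results Name parameters.
STORE sorted INTO '/tmp/example/output' USING PigStorage(',');
```

The last relation written out (sorted in this sketch) is what a succeeding operator would be expected to consume.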

Input

The Pig Execute operator can accept one or more inputs; however, input is not required.

Configuration

Parameter Description
Notes: Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator.
Pig Script: The Pig script to execute. Click Edit Pig Script to open the Pig Script editor dialog box. For more information, see Define Pig Script Dialog Box.
Pass Output File: Specifies whether to pass the output to the next operator.
Results Location: The HDFS directory where the results of the operator are stored. This is the main directory; its subdirectory is specified in Results Name. Click Choose File to open the Hadoop File Explorer Dialog Box and browse to the storage location. Do not edit the text directly.
Results Name: The name of the file in which to store the results.
Overwrite: Specifies whether to delete existing data at that path and file name.
  • Yes - If the path exists, delete that file and save the results.
  • No - Fail if the path already exists.

Output

Visual Output: A preview of the output of the Pig script.
Data Output: The data created in the operator.