Overriding Hadoop Data Source Parameters Using Workflow Variables
You can override the default Hadoop data source parameters by using workflow variables.
An override fine-tunes the Hadoop data source settings for the specified workflow only.
- To view a workflow's variables, select Workflow Variables from the Actions drop-down list box.
- To create a new variable for overriding a default Hadoop setting (such as the amount of time before a task times out), click create.
- The default task timeout is 600,000 ms (10 minutes).
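- To override a Hadoop variable for a specific workflow, click create and make a new variable called, for example, @alpine.mapred.mapred.task.timeout, where @alpine.mapred. indicates it is the Team Studio override for a Hadoop variable and mapred.task.timeout is the official Hadoop variable name.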
- To override a Hadoop variable only for a specific operator task within a workflow, create a new variable called, for example, @alpine.mapred.join.Hadoop_Join.mapred.task.timeout, where
- @alpine.mapred. indicates it is the Team Studio override for a Hadoop variable,
- join indicates it is for the Join operator,
- Hadoop_Join indicates the particular operator job that is being overridden, and
- mapred.task.timeout is the official Hadoop variable name.
- Set the override value (for this workflow's Join operators only) to, for example, 200,000 ms.
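The naming pattern described above can be sketched in a few lines of Python. This is an illustration only, not part of Team Studio: the function names are hypothetical, and the lookup order in effective_value (operator-level override first, then workflow-level override, then the data source default) is an assumption made for the example rather than documented behavior.

```python
from typing import Mapping, Optional

# Prefix that marks a workflow variable as a Team Studio override
# for a Hadoop variable (taken from the examples above).
PREFIX = "@alpine.mapred."


def workflow_override_name(hadoop_param: str) -> str:
    """Build the variable name that overrides a parameter for a whole workflow."""
    return f"{PREFIX}{hadoop_param}"


def operator_override_name(operator_type: str, operator_name: str,
                           hadoop_param: str) -> str:
    """Build the variable name that overrides a parameter for one operator."""
    return f"{PREFIX}{operator_type}.{operator_name}.{hadoop_param}"


def effective_value(variables: Mapping[str, str],
                    data_source_defaults: Mapping[str, str],
                    hadoop_param: str,
                    operator_type: Optional[str] = None,
                    operator_name: Optional[str] = None) -> Optional[str]:
    """Pick the value that would apply under the ASSUMED precedence:
    operator-level override, then workflow-level override, then the
    data source default. Hypothetical helper, for illustration only."""
    if operator_type and operator_name:
        key = operator_override_name(operator_type, operator_name, hadoop_param)
        if key in variables:
            return variables[key]
    key = workflow_override_name(hadoop_param)
    if key in variables:
        return variables[key]
    return data_source_defaults.get(hadoop_param)


# Example: the Join operator override from the steps above.
workflow_vars = {
    operator_override_name("join", "Hadoop_Join", "mapred.task.timeout"): "200000",
}
defaults = {"mapred.task.timeout": "600000"}

join_timeout = effective_value(workflow_vars, defaults, "mapred.task.timeout",
                               "join", "Hadoop_Join")
other_timeout = effective_value(workflow_vars, defaults, "mapred.task.timeout")
```

In this sketch, join_timeout resolves to the 200,000 ms override, while other_timeout falls back to the 600,000 ms data source default because no workflow-level variable is defined.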
Note: Any Hadoop configuration parameter can be set either here at the workflow level or at the Hadoop data source level. Here is a list of Hadoop configuration parameters:
Copyright © Cloud Software Group, Inc. All rights reserved.