Store Intermediate Results

Many operators can be configured to store intermediate results in a database table or view, or in a Hadoop file.

Database Operators

For database operators, the user can select whether intermediate results are created as a table or as a view, and where they are created. The default values for the results locations use workflow variables, so the user can change a location for the entire workflow without editing each individual operator.

Note: Using views avoids creating large quantities of extra data, but the composition of many views in sequence might make it hard for the database to optimize the resulting query.
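The trade-off in the note above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database, and the table and view names (source_data, step1_table, step1_view, step2_view) are hypothetical. It is not the SQL that Team Studio generates, only an illustration of how a stored table differs from a stored view.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_data (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO source_data VALUES (?, ?)",
    [(1, 10.0), (2, 25.0), (3, 40.0)],
)

# Intermediate result stored as a table: the matching rows are copied once.
conn.execute(
    "CREATE TABLE step1_table AS "
    "SELECT id, amount FROM source_data WHERE amount > 20"
)

# Intermediate result stored as a view: only the query is stored, no rows.
conn.execute(
    "CREATE VIEW step1_view AS "
    "SELECT id, amount FROM source_data WHERE amount > 20"
)

# A second view built on the first: the database must expand both
# definitions every time step2_view is queried.
conn.execute(
    "CREATE VIEW step2_view AS "
    "SELECT id, amount * 2 AS doubled FROM step1_view"
)

# New source data arrives after the intermediate results were created.
conn.execute("INSERT INTO source_data VALUES (4, 99.0)")

table_rows = conn.execute("SELECT COUNT(*) FROM step1_table").fetchone()[0]
view_rows = conn.execute("SELECT COUNT(*) FROM step1_view").fetchone()[0]
chained_rows = conn.execute("SELECT COUNT(*) FROM step2_view").fetchone()[0]

print(table_rows, view_rows, chained_rows)
```

The table keeps its own copy of the rows and does not see the later insert, while both views are re-evaluated against the source on every query. Each additional view in the chain adds another layer for the database to expand and optimize.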



Hadoop Operators

For Hadoop operators, the user can often choose whether to store intermediate results. The default values for the results location use workflow variables, so the user can change the location for the entire workflow without editing each individual operator. The directory is based on the value of the @default_tempdir workflow variable.
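The benefit of basing results locations on a workflow variable can be sketched as follows. This is an illustration only: the resolve_location helper, the operator names, and the directory layout are hypothetical and do not describe how Team Studio resolves variables internally. Only the variable name @default_tempdir comes from the text above.

```python
def resolve_location(template, variables):
    """Replace each workflow-variable token in `template` with its value.

    Hypothetical helper for illustration; not a Team Studio API.
    """
    # Substitute longer names first so that one variable name that is a
    # prefix of another does not clobber it.
    for name in sorted(variables, key=len, reverse=True):
        template = template.replace(name, variables[name])
    return template


# Hypothetical default results locations for two operators.
operator_defaults = {
    "row_filter": "@default_tempdir/row_filter",
    "aggregation": "@default_tempdir/aggregation",
}

# Example variable value (an assumption, not a documented default).
variables = {"@default_tempdir": "/tmp/workflow_results"}
resolved = {
    op: resolve_location(template, variables)
    for op, template in operator_defaults.items()
}

# Changing the single variable moves every operator's results location.
variables["@default_tempdir"] = "/data/scratch"
re_resolved = {
    op: resolve_location(template, variables)
    for op, template in operator_defaults.items()
}

print(resolved)
print(re_resolved)
```

Because every operator's default refers to the same variable, one change to the variable relocates all results, and no operator needs to be edited individually.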



In certain cases, an operator can be connected to another operator only if its Store Results? parameter is set to True. This is because certain operators (for example, most modeling operators) must work on files rather than on an intermediate result set. Team Studio automatically sets Store Results? to True when this is required.



Downloading Stored Results

When Store Results? is set to True for an operator, the user can select that operator and download its results at any time, provided the workflow has already been run.

To download, right-click the operator and select Download Results.



Note: When downloading, you can choose whether to include the header row in the download file.