R Execute (HD)
To configure R Execute, connect a valid data source to the R Execute operator. An intermediate operator also constitutes a data source for R Execute.
Information at a Glance
R Execute (HD) is for Hadoop data only. For database data, use the R Execute (DB) operator.
For information about configuring and using this operator, see R Execute.
Input
To use the input dataset, refer to the R object alpine_input in the script. This object is a data frame.
If the script does not refer to alpine_input, the input dataset is not used and the data is not read into R.
- If the input is a preceding operator, the preceding operator runs, but the data is not transferred to R if you do not use alpine_input in the script.
- If the preceding operator is a data source (a Hadoop file or a database table), the Hadoop data transfer or the database query does not run, which saves execution time.
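As a minimal sketch of a script that does use the input, the following R code refers to alpine_input and builds a result from it. The column names and values are hypothetical, and the first statement only creates a stand-in data frame so the sketch can run outside Team Studio; inside the operator, alpine_input would be supplied for you.

```r
# Stand-in for the data frame the operator would supply.
# The columns id and amount are assumptions made for illustration only.
alpine_input <- data.frame(id = 1:4, amount = c(10.5, 20.0, 7.25, 12.0))

# Referring to alpine_input is what signals that the input dataset is needed.
high_value <- alpine_input[alpine_input$amount > 10, ]

# Assigning a data frame to alpine_output makes a result available
# as the operator's data output.
alpine_output <- high_value
```

A script that never mentions alpine_input would skip the read step entirely, as described in the list above.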
Restrictions
See the topic "R Execute Prerequisites" in TIBCO® Data Science Team Studio Installation and Administration for information about package, system, and server requirements.
Configuration
Notes | Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator. |
R Script | The R script to execute. Select Define Clause to specify the R-Script. |
Results Location | Specifies the HDFS directory where the results of the R Execute operator are stored. This is the main directory, the subdirectory of which is specified in the Results Name option (see below). |
Results Name | Select the name of the Hadoop file where the results of the R Execute operator are stored. |
Overwrite | Determines whether the operator should overwrite an existing file if a file with the same name exists. Default value: Yes. |
Pass Output File | Specifies whether to pass the R Execute output to the next operator. Default value: No. |
Results File Structure | Specifies the file structure of the operator's output to pass to the next operator (if Pass Output File is set to Yes). For more information, see Results File Structure Dialog Box. |
Output
- Visual Output
- The table is stored regardless of whether Pass Output File is set to Yes and whether Results File Structure is provided. The Pass Output File and Results File Structure combination is used only to check the integrity of the flow when the R Execute operator is followed by another operator.
If the alpine_output object does not exist in your R code, then the output is not generated. If you set Pass Output File to Yes, and if the R script does not contain a reference to the alpine_output object, then the flow fails at runtime, and an error message is displayed for this inconsistency.
- Data Output
- If you create an alpine_output object in the R code, then the R Execute operator output displays the output data frame (persisted in the HDFS/MapR file structure) in the results console in the Data tab.
- R-Console Output
- If your R code includes functions that print output to the R console, then the output is displayed in the results console in the R-Console Output tab.
Note: The R Execute operator's console printing behavior differs slightly from the behavior of the R console or RStudio. Normally, if the R code contains a statement such as summary(alpine_input), its result is shown in the R console or RStudio console. However, because R Execute captures the console output into an object (using R's capture.output function), you must wrap such calls in the print() function. For example, instead of summary(alpine_input), specify print(summary(alpine_input)). This is a limitation of how R's capture.output function works.
- R-Script
- Your R code is shown in the results console in the R-Script tab.
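The print() wrapping described in the R-Console Output note can be illustrated outside Team Studio with a minimal sketch. The helper run_captured below is hypothetical: it imitates one plausible way a script might be evaluated as a block under capture.output, and it is not the operator's actual implementation. The data frame is a stand-in for alpine_input.

```r
# Stand-in for the operator-supplied data frame (hypothetical values).
alpine_input <- data.frame(x = c(1, 2, 3, 4))

# Hypothetical helper: evaluates a script as one block while capturing
# console output, without auto-printing the value of the last statement.
run_captured <- function(script_text, input) {
  env <- new.env()
  assign("alpine_input", input, envir = env)
  capture.output(invisible(eval(parse(text = script_text), envir = env)))
}

# Without print(), the summary is computed but never written to the console.
without_print <- run_captured("summary(alpine_input)", alpine_input)

# With print(), the summary is written explicitly and is therefore captured.
with_print <- run_captured("print(summary(alpine_input))", alpine_input)

length(without_print)   # 0 lines captured
length(with_print) > 0  # TRUE
```

Under these assumptions, only the explicitly printed summary appears in the captured text, which matches the guidance to write print(summary(alpine_input)) in the script.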