Creating a Runtime Class
Now that you have created a signature and a GUI node, it is time to create the runtime class. This is where the bulk of the operator's work occurs; here you define the Spark job that performs the data transformation.
While you could extend the base class OperatorRuntime and define the output steps manually, we use SparkDataFrameRuntime instead. It lets us write a Spark job that performs the data transformation, while a set of predefined methods on the back end packages the results and returns them to the Team Studio application.
We call this class MyColumnFilterRuntime and extend SparkDataFrameRuntime, passing our Spark job, MyColumnFilterJob, as a type parameter. To use the default implementation, which launches the Spark job according to your default Spark settings, you do not need to add any code to the body of the MyColumnFilterRuntime class.
Procedure
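A minimal runtime class can be sketched as follows. The import path shown is an assumption based on a typical SDK layout; check the exact package name in your version of the custom operator framework.

```scala
// Sketch only: the package path is assumed, not confirmed by this section.
import com.alpine.plugin.core.spark.templates.SparkDataFrameRuntime

// The default implementation is inherited, so the class body stays empty:
// SparkDataFrameRuntime launches MyColumnFilterJob using your default
// Spark settings and packages the results for the Team Studio application.
class MyColumnFilterRuntime extends SparkDataFrameRuntime[MyColumnFilterJob] {}
```

Because all the launching and result-handling logic lives in the parent class, the runtime class is usually this small; you only add code here if you need to customize how the Spark job is configured or submitted.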
When Your Operator Runs
When the operator is executed, the Spark job returns a data frame containing the selected columns. After the job finishes, the SparkDataFrameJob class saves the results to a file using the storage parameters defined in the onPlacement() method. If you did not specify storage parameters, the results are stored on HDFS by default. This information is then passed to the Team Studio plugin engine so that your results can be visualized.
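For context, the Spark job referenced above could look like the following sketch. The transform signature, the import paths, and the parameter key "columnsToKeep" are assumptions for illustration; verify them against your SDK version and the parameter key you defined in your GUI node.

```scala
// Hypothetical sketch of the Spark job; method signatures and package
// paths are assumed and may differ in your SDK version.
import com.alpine.plugin.core.OperatorParameters
import com.alpine.plugin.core.OperatorListener
import com.alpine.plugin.core.spark.templates.SparkDataFrameJob
import com.alpine.plugin.core.spark.utils.SparkRuntimeUtils
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

class MyColumnFilterJob extends SparkDataFrameJob {
  // Return a data frame containing only the columns the user selected.
  override def transform(parameters: OperatorParameters,
                         dataFrame: DataFrame,
                         sparkUtils: SparkRuntimeUtils,
                         listener: OperatorListener): DataFrame = {
    // "columnsToKeep" is an assumed parameter key defined in the signature.
    val (_, selected) = parameters.getTabularDatasetSelectedColumns("columnsToKeep")
    dataFrame.select(selected.map(col): _*)
  }
}
```

The returned data frame is what SparkDataFrameJob then saves using your storage parameters, so the transform method itself never needs to deal with file output.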