Using the External Process Operator

This topic explains how to use the External Process operator and describes the configuration settings you can make in the operator's Properties view.

Introduction

The External Process operator is a Java operator that lets StreamBase applications run arbitrary operating system commands as if they were typed at the shell command prompt of the current operating system. This feature is especially useful in StreamBase high availability design patterns, where an application in one container might need to send an sbadmin command to an application in another container or on another StreamBase Server.

The External Process operator is a member of the Java Operator group in the Palette view in StreamBase Studio. Select the External Process operator from the Insert an Operator or Adapter dialog, which you invoke with one of the following methods:

  • Drag the Adapters, Java Operators token from the Operators and Adapters drawer of the Palette view to the canvas.

  • Click in the canvas where you want to place the operator, and invoke the keyboard shortcut O V.

  • From the top-level menu, invoke Insert > Operator > Java.

From the Insert an Operator or Adapter dialog that opens, select External Process and double-click or press OK.

Running External Commands

The External Process operator incurs operating system overhead and should not be considered for any portion of a StreamBase application that requires fast performance. The operator should never be called repeatedly to perform a task. This operator is designed to be called on rare occasions, such as in a failover event in a highly available StreamBase Server cluster.

You can use this operator to run any command on the PATH of the StreamBase Server instance that is running the operator. You can specify command switches, command arguments, and environment variables for the command you want to run, and you can modify the switches, arguments, and environment variables at runtime while calling the specified command. You can specify variables as part of the command to be run, and have those variables resolved at runtime.

Caution

This operator has the same permissions as the containing StreamBase Server to run any operating system command in the PATH and environment of that server, and by default, the working directory of the operator is the same as the working directory of the containing server. Use caution in specifying the command for this operator to run.

The specified operating system command must be locatable on the PATH of the containing StreamBase Server, or it can be specified by absolute path or a path relative to the containing StreamBase Server's current working directory.
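
As a rough conceptual analogy only (this is not the operator's implementation), running an external command and collecting its results can be sketched in Python as follows. The function name run_external is hypothetical, and sys.executable stands in for a command such as sbadmin so that the sketch runs anywhere. The three returned values mirror the stdout, stderr, and exitcode fields of the operator's output tuple.

```python
import subprocess
import sys


def run_external(argv, cwd=None):
    """Run argv as a child process and return (stdout, stderr, exitcode).

    Hypothetical helper for illustration; not part of StreamBase.
    """
    # The first element of argv is looked up on the PATH unless it is given
    # as an absolute or relative path. cwd=None makes the child inherit the
    # caller's working directory, analogous to the operator defaulting to
    # the working directory of the containing server.
    completed = subprocess.run(argv, capture_output=True, text=True, cwd=cwd)
    return completed.stdout, completed.stderr, completed.returncode


if __name__ == "__main__":
    # sys.executable is a stand-in for a real command such as sbadmin.
    out, err, code = run_external(
        [sys.executable, "-c", "import sys; print('ok'); sys.exit(3)"]
    )
    print(repr(out), repr(err), code)
```

Each call starts a new operating system process, which is the overhead that makes this approach unsuitable for performance-sensitive paths.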

Properties View Settings

This section describes the properties you can set for an External Process operator, using the various tabs of the Properties view in StreamBase Studio.

General Tab

This section describes the properties on the General tab in the Properties view for the External Process operator.

Name: Use this field to specify or change the component's name, which must be unique in the application. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.

Operator: A read-only field that shows the formal name of the operator.

Class: A field that shows the fully qualified class name that implements the functionality of this operator. Use this class name when loading the operator in StreamSQL programs with the APPLY JAVA statement. You can right-click this field and select Copy from the context menu to place the full class name in the system clipboard.

Start with application: If this field is set to Yes or to a module parameter that evaluates to true, an instance of this operator starts as part of the containing StreamBase Server. If this field is set to No or to a module parameter that evaluates to false, the operator is loaded with the server, but does not start until you send an sbadmin resume command, or until you start the component with StreamBase Manager. With this option set to No or false, the operator does not start even if the application as a whole is suspended and later resumed. The recommended setting is selected by default.

Enable Error Output Port: Select this check box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports and Error Streams to learn about Error Ports.

Description: Optionally enter text to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Command Arguments Tab

This section describes the properties on the Command Arguments tab in the Properties view for the External Process operator.

Operator Type

Select either CommandLine or Arguments to specify how you will enter the command to be run by this operator. Select CommandLine to enter the command as a StreamBase expression. Select Arguments to enter the command in table form, with one row for the command and one row for each switch and argument.

Command line as expression

If you selected CommandLine as the Operator Type, use this field to specify the full command to be run, expressed in the form of a StreamBase expression. You must surround the entire command line with single quotes to ensure that the expression is evaluated as an expression, even if your expression contains only literal strings.

The command expression can contain the name of one or more fields in the incoming command tuple on the input port. In this case, the value of each specified field is substituted when the command is run.

Using an expression to specify the command to be run allows you to evaluate variables at runtime, instead of passing the variables to the operator in the input command tuple. For example, an External Process operator in the HA2 sample specifies the following command. This is a single command on one line, broken into two lines for clarity:

'sbadmin -u\"' + getServerURI() 
    + '\" setLeadershipStatus LEADER'

This command fills in the argument for the -u option with an expression that evaluates to the StreamBase URI for the currently running StreamBase Server, or to the string URI-not-specified if the expression fails to evaluate.

When the command line you want to run needs quoted arguments, write the quotes as backslash-escaped double quotes (\"), as in the example above, so that they are not interpreted as the end of the single-quoted expression.
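
To make the string concatenation in the HA2 example easier to follow, here is a minimal Python analogy. It is not StreamBase expression syntax. The function build_command_line is hypothetical, and the URI value is borrowed from the Command Arguments example in this topic. The source does not state how the operator splits the resulting command line into arguments, so shlex.split is used here only to show how a POSIX shell would tokenize it.

```python
import shlex


def build_command_line(server_uri):
    """Concatenate literal pieces and a runtime value into one command line.

    Mirrors the shape of the HA2 expression; server_uri plays the role of
    the value returned by getServerURI().
    """
    return 'sbadmin -u"' + server_uri + '" setLeadershipStatus LEADER'


cmdline = build_command_line("sb://remotesbhost:10001/appcontainer")

# Assumption: POSIX shell quoting rules. The double quotes group the URI
# with the -u switch and are removed during tokenization.
argv = shlex.split(cmdline)

print(cmdline)
print(argv)
```

The embedded double quotes keep the URI attached to the -u switch as a single argument even if the URI were to contain whitespace.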

Include the command line in the tuple output

If you select this check box, the schema sent to the output port is extended with a field containing the exact command line run, with all variables resolved. Use this check box for debugging while developing your operator.

Command output type

Specify either blob or string (the default). This setting specifies the data type of three fields of the output tuple: stdout, stderr, and (if enabled) cmdline.

Command Arguments

If you selected Arguments as the Operator Type, use the Command Arguments table to specify the command to be run and its arguments. Specify the command name in row 1, using a relative or absolute path to the command. In subsequent rows, you can optionally add the exact switches and arguments with which to run the command, split into separate rows on whitespace boundaries. That is, if you want to run the command sbadmin setLeadershipStatus NON_LEADER, specify the three parts of the command on three rows. To run the same command except specifying a StreamBase URI, use five rows as shown in this example:

sbadmin
-u
sb://remotesbhost:10001/appcontainer
setLeadershipStatus
LEADER

Any row of the Command Arguments section can contain the name of a field in the incoming input tuple. In this case, the contents of that field are substituted when the command is run. For example, your input command tuple might contain a field named serverURI that contains the programmatically determined URI of one member of a StreamBase Server cluster. In this case, specify five rows in the Arguments section, containing:

sbadmin
-u
serverURI
setLeadershipStatus
NON_LEADER

If you have a name collision with a field in the input tuple, you can use quotes to escape the interpretation as a field name. For example, if you wanted the operator to run the command cat file1, but there happens to be a field in the input tuple also named file1, enter the following in the command arguments table:

cat
"file1"

Arguments field prefix

If you selected Arguments as the Operator Type, and if you enable the Include arguments and environment variables in output option on the Advanced tab, then the string you specify in this field is prepended to the contents of any command argument fields sent to the output port. The default value is arg_.
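
The row resolution rules for the Command Arguments table (a row that names an input field is replaced by that field's value, and a row in double quotes is taken literally) can be sketched in Python as follows. This is an illustrative model under those stated rules, not the operator's actual code. The function resolve_rows is hypothetical, and a dictionary stands in for the input tuple.

```python
def resolve_rows(rows, input_tuple):
    """Turn Command Arguments table rows into an argument list.

    rows        -- list of strings, one per table row
    input_tuple -- dict standing in for the incoming command tuple
    """
    argv = []
    for row in rows:
        if len(row) >= 2 and row.startswith('"') and row.endswith('"'):
            # Quoted row: strip the quotes and use the text literally,
            # even if an input field has the same name.
            argv.append(row[1:-1])
        elif row in input_tuple:
            # Row matches an input field name: substitute the field value.
            argv.append(str(input_tuple[row]))
        else:
            # Anything else is passed through unchanged.
            argv.append(row)
    return argv


rows = ["sbadmin", "-u", "serverURI", "setLeadershipStatus", "NON_LEADER"]
tuple_in = {"serverURI": "sb://remotesbhost:10001/appcontainer"}
print(resolve_rows(rows, tuple_in))

# Name collision: the quoted row stays literal despite the file1 field.
print(resolve_rows(["cat", '"file1"'], {"file1": "other.txt"}))
```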

Environment Tab

This section describes the properties on the Environment tab in the Properties view for the External Process operator.

Environment

Use the Environment table to specify any environment variables needed by the command you want to run that are not already in the environment of the containing StreamBase Server, or whose values you want to change for the purpose of running the specified command. For each row, specify the variable name, an equals sign, and the value, in the form NAME=value.
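
The effect of the Environment table (add variables that are missing, override those whose values should differ) can be modeled with a short Python sketch. The function build_environment and the variable names used here are hypothetical, and the handling of malformed rows is an assumption, since the source does not describe it.

```python
import os


def build_environment(rows, base=None):
    """Merge NAME=value rows into a copy of the inherited environment.

    rows -- list of strings in the form NAME=value
    base -- environment to start from; defaults to this process's own
    """
    env = dict(os.environ if base is None else base)
    for row in rows:
        # Split on the first equals sign only, so values may contain '='.
        name, sep, value = row.partition("=")
        if not sep or not name:
            # Assumption: rows without a name or an equals sign are errors.
            raise ValueError("expected NAME=value, got: " + repr(row))
        env[name] = value  # adds a new variable or overrides an existing one
    return env


inherited = {"APP_MODE": "primary", "LOG_DIR": "/tmp/logs"}
merged = build_environment(["APP_MODE=secondary", "EXTRA_OPTS=a=b"], inherited)
print(merged)
```

The inherited environment is copied rather than modified, so the overrides apply only to the command being run.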

Environment variables field prefix

If you specify any environment variables, and if you enable the Include arguments and environment variables in output option on the Advanced tab, then the string you specify in this field is prepended to the contents of any environment variable fields sent to the output port. The default value is env_.

Advanced Tab

This section describes the properties on the Advanced tab in the Properties view for the External Process operator.

Include input tuple in output

If you select this check box, the schema sent to the output port is extended with the fields of the command tuple received at the input port.

Input tuple field prefix

If you use the check box above, the string you specify in this field is prepended to the contents of each input tuple field sent to the output port. The default value is input_.

Include arguments and environment variables in output

If you select this check box, the schema sent to the output port is extended with fields containing the path of the command run, the switches and arguments run, and any environment variables specified in the operator.

Working Directory

Use this field to specify a working directory for the command to be run. The default working directory is that of the StreamBase Server that runs this operator. Specify as a StreamBase expression the full absolute path, or a path relative to the working directory of the StreamBase Server that runs this operator. You must surround the entire expression with single quotes to ensure that the expression is evaluated as an expression, even if your expression contains only literal strings.

Run this command asynchronously

Select this check box to run the specified command without holding any locks. This allows you to call sbadmin shutdown and other such commands that attempt to place locks, without causing a deadlock situation. With this option enabled, this operator's output tuple occurs asynchronously, similar to marking the operator to run in parallel threads in the Concurrency tab. The difference is that with the parallel threads setting, the operator's command can still place a lock.

Caution

Use this feature with care and consideration of its consequences. If you are unsure, do NOT select this check box.

Number of async worker threads

This option is enabled when the previous option is selected. The default value of 1 means that the operator's commands are performed serially. This is ideal behavior when using the operator to perform commands that must be run in sequence, such as Add Container, then Modify Container. A positive value specifies a limit on the number of threads to spawn to perform the operator's requested operations. A value of 0 or a negative value means that the operator spawns an unlimited number of threads.
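
The worker thread semantics described above can be illustrated with a small Python model. It is a sketch of the described behavior only, not the operator's implementation. The class AsyncRunner is hypothetical: a positive count becomes a bounded pool, and 0 or a negative count starts one thread per task with no limit.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncRunner:
    """Run submitted tasks on worker threads (illustrative model)."""

    def __init__(self, worker_threads=1):
        # Positive: bounded pool. With one worker, tasks run one at a
        # time in the order they were submitted.
        self._pool = (
            ThreadPoolExecutor(max_workers=worker_threads)
            if worker_threads > 0
            else None
        )
        self._threads = []

    def submit(self, task):
        if self._pool is not None:
            self._pool.submit(task)
        else:
            # Zero or negative: no limit, one new thread per task.
            thread = threading.Thread(target=task)
            thread.start()
            self._threads.append(thread)

    def close(self):
        """Wait for all submitted tasks to finish."""
        if self._pool is not None:
            self._pool.shutdown(wait=True)
        for thread in self._threads:
            thread.join()


order = []
runner = AsyncRunner(worker_threads=1)
for step in ["Add Container", "Modify Container"]:
    runner.submit(lambda s=step: order.append(s))
runner.close()
print(order)
```

With a single worker the two steps complete in submission order. With more workers, or with no limit, completion order is not guaranteed.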

Concurrency Tab

Use the Concurrency tab to specify parallel regions for this instance of this component, or multiplicity options, or both. The Concurrency tab settings are described in Concurrency Options, and dispatch styles are described in Dispatch Styles.

Caution

Concurrency settings are not suitable for every application, and using these settings requires a thorough analysis of your application. For details, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.

Ports

By default, the External Process operator has one input port and one output port:

  • Input Port. The operator takes commands from the input tuple. The schema of the input command tuple is application-specific, but typically includes string fields containing switches, arguments, or parameters for the command to be run. Each operator runs its specified command on receipt of a tuple on the input port, optionally using fields in the incoming tuple to specify switches or parameters for the command.

  • Output Port. The operator sends a tuple on the output port whenever it runs the specified command. By default, the schema of the output tuple is the following:

    Field Name   Data Type   Description
    stdout       blob        Contains the standard output of the command that was run.
    stderr       blob        Contains the standard error of the command that was run.
    exitcode     int         Contains the exit code value returned from the command that was run.

    The schema of the output tuple can be optionally extended by using one or more of the following options, which are described in the indicated sections.

    Option                                                  Described In
    Include the command line in the tuple output            Command Arguments Tab
    Include arguments and environment variables in output   Advanced Tab and Environment Tab
    Include input tuple in output                           Advanced Tab

You can also add an optional Error Output port, which outputs a StreamBase error tuple for any error thrown by the operator, as described in General Tab.
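
As an informal model of how the output tuple is assembled, the following Python sketch builds a dictionary with the three base fields and two of the optional extensions. The function build_output_tuple is hypothetical. Two assumptions are made that the source does not confirm: a blob is modeled as UTF-8 encoded bytes, and the input tuple prefix is applied to field names (for example, input_serverURI).

```python
def build_output_tuple(stdout, stderr, exitcode, output_type="string",
                       cmdline=None, input_tuple=None, input_prefix="input_"):
    """Assemble a dict standing in for the operator's output tuple."""

    def convert(text):
        # Assumption: a blob is represented here as UTF-8 encoded bytes.
        return text.encode("utf-8") if output_type == "blob" else text

    out = {
        "stdout": convert(stdout),
        "stderr": convert(stderr),
        "exitcode": exitcode,
    }
    if cmdline is not None:
        # Present only when the command line is included in the output.
        out["cmdline"] = convert(cmdline)
    if input_tuple is not None:
        # Assumption: the prefix is prepended to each input field name.
        for name, value in input_tuple.items():
            out[input_prefix + name] = value
    return out


result = build_output_tuple(
    "done\n", "", 0,
    cmdline="sbadmin setLeadershipStatus LEADER",
    input_tuple={"serverURI": "sb://remotesbhost:10001/appcontainer"},
)
print(result)
```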

Examples

The HA samples shipped with StreamBase include several uses of the External Process operator. See High Availability Sample and High Availability Shared Disk Access Sample for instructions on locating and running the samples.