The Python operator allows the application designer to execute arbitrary Python code within StreamBase applications. The purpose of these operators is to enable Python-centric teams to reuse their code without requiring major rewrites to execute event processing. This includes execution of models produced with SciPy and TensorFlow.
Stateful Python sessions run as child processes attached to StreamBase applications. Python operators interact with these sessions by setting input variables, executing the script, and reading output variables. The operator guarantees that these three operations are executed sequentially, even if multiple operator instances touch the same session or the operator is running in asynchronous mode.
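The set, execute, read cycle described above can be pictured with a small stand-alone sketch. It is not the operator's implementation: the class name `SessionSketch` and its `call` method are hypothetical, and the session is modeled as a plain dictionary guarded by a lock rather than as a child process.

```python
import threading


class SessionSketch:
    """Toy stand-in for a stateful Python session (not the real integration layer)."""

    def __init__(self):
        self._namespace = {}           # variables that persist between calls
        self._lock = threading.Lock()  # serializes callers that share this session

    def call(self, input_vars, script, output_names):
        # Hold the lock across all three steps so that another caller cannot
        # interleave its own set/execute/read sequence with this one.
        with self._lock:
            self._namespace.update(input_vars)   # 1. set input variables
            exec(script, self._namespace)        # 2. execute the script
            # 3. read the requested output variables
            return {name: self._namespace[name] for name in output_names}


session = SessionSketch()
session.call({}, "counter = 0", [])  # initialization call, no outputs read
result = session.call({"step": 5}, "counter += step", ["counter"])
print(result)  # {'counter': 5}
```

Because the namespace survives between calls, one call can initialize state that a later call uses, which mirrors the separation of initialization from execution mentioned for shared instances.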
Python operators support any Python runtime compliant with Python 2.7 or 3.x. The script code must be compatible with the selected runtime; that is, it can use only the libraries and language constructs available in that runtime. The operator treats the script code as opaque and does not attempt to parse or compile it before sending it to the runtime. At the same time, the full power of the selected runtime (libraries, Java classes in Jython, .NET access in IronPython) is accessible from the script.
The operator integration layer uses a minimal set of features common to Python 2.7 and Python 3.x. It requires the pickle library and TCP/IP networking.
Runtime | Version | Notes |
---|---|---|
Python 2 | 2.7.x | |
Python 3 | 3.x.x | Tested 3.4.x on CentOS 7 and 3.6.x on Windows 10. |
PyPy | 5.x | Tested 5.0.1 on CentOS 7 and 5.9.0 on Windows 10. |
Jython | 2.7.0 | |
IronPython | 2.7.7 | Requires setting the useTempFile property in the configuration file to true. |
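Before pointing the operator at an unfamiliar runtime, it can be useful to confirm that the two stated requirements, pickle and TCP/IP sockets, are available. The check below is only a sketch to run directly in the candidate interpreter. The choice of pickle protocol 2 is an assumption made here because it is readable by both Python 2.7 and 3.x; the protocol the integration layer actually uses is not documented in this section.

```python
import pickle
import socket
import sys

sample = {"name": "tick", "values": [1, 2.5, True]}

# Assumption: protocol 2 is used here because Python 2.7 and 3.x both support it.
encoded = pickle.dumps(sample, protocol=2)
decoded = pickle.loads(encoded)

# Creating and immediately closing a TCP socket confirms the socket module works.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.close()

runtime_ok = decoded == sample
print(sys.version_info[:2], "pickle and TCP sockets available:", runtime_ok)
```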
The datatype of each variable passed in the inputVars field is inferred from the field type. When you define datatypes for the outputVars tuple fields, the operator runtime makes a best-effort attempt to cast the Python objects to StreamBase types. The following table summarizes the conversions.
StreamBase type | to Python | from Python |
---|---|---|
boolean | truth | truth, int, float |
int | int | truth, int, float |
double | float | truth, int, float |
string | unicode (Python2), str (Python3) | str, bytes, bytearray, unicode (Python2) |
timestamp | datetime.datetime (absolute), datetime.timedelta (interval) | datetime.datetime, datetime.date, datetime.time (absolute), datetime.timedelta (interval) |
blob | bytes | bytes, bytearray |
list | list | list, tuple, array.array, materialized generator (list) |
tuple | dict | dict |
capture | unsupported | unsupported |
function | unsupported | unsupported |
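As an illustration of the "from Python" column, the following script fragment assigns one variable for each supported StreamBase type. The variable names are hypothetical; in a real application they would have to match the field names declared in the output variables schema.

```python
import array
import datetime

flag = True                                  # boolean
count = 42                                   # int
ratio = 0.5                                  # double
label = "abc"                                # string
when = datetime.datetime(2020, 1, 1, 12, 0)  # timestamp (absolute)
elapsed = datetime.timedelta(seconds=30)     # timestamp (interval)
payload = bytes(bytearray([0, 1, 2]))        # blob
values = [1, 2, 3]                           # list
samples = array.array("d", [0.1, 0.2])       # also accepted for a list field
squares = list(x * x for x in range(4))      # generator materialized into a list
record = {"symbol": "ABC", "price": 10.5}    # tuple (dict keyed by field name)
```

Values of type capture or function have no Python counterpart, so fields of those types cannot be declared as input or output variables.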
Define Python instances in the adapter-configurations.xml configuration file or as local module instances. The latter approach allows you to define Python instances that are private to concurrent regions (for parallelism), but still shared by multiple operators (for example, to separate initialization from execution calls).
For a reference of the interpreter launch parameters, consult the Python documentation.
For configuration-defined Python instances, use the adapter-configuration element.
If a value is not present, the default is used. Those values listed without a default are required.
Property | Type | Default | Description |
---|---|---|---|
instance | string | | The name that links the operators together. It is displayed in the drop-down list on each operator's property configuration when using the global instance type. |
executable | string | python | Path to the Python executable. When absent, the instance is launched with the command python. |
workingDir | string | . | Working directory for the launched process. When absent, the process is started in the same directory as the parent StreamBase process. |
useTempFile | boolean | false | Indicates that the integration layer should create a temporary file containing the Python code that wraps the interactions with StreamBase, instead of pushing that code through stdin. The stdin method (the default) works for most Python runtimes. Use this flag when launching IronPython. |
captureOutput | boolean | false | Modifies the stdout and stderr behavior. By default, both are chained to the parent process's stdout and stderr. For tests that include output, capturing it is recommended. |
envVariables | section | | Environment variables to be passed or overridden when launching the Python interpreter. Use the name attribute to provide the variable name and the val attribute to provide its value. |
arguments | section | | Arguments to the Python interpreter (not to the script). Can be defined multiple times. A commonly used argument is -u, which forces Python to use unbuffered stdout and stderr streams (and, on Python 2, unbuffered stdin as well). Use the val attribute to provide a value. |
    <adapter-configurations>
      <adapter-configuration name="python">
        <section name="python">
          <setting name="instance" val="python"/>
          <setting name="executable" val="C:/Python/python.exe"/>
          <setting name="workingDir" val="."/>
          <setting name="useTempFile" val="false"/>
          <setting name="captureOutput" val="false"/>
          <section name="envVariables">
            <setting name="LD_LIBRARY_PATH" val="/opt/3rdparty/lib"/>
          </section>
          <section name="arguments">
            <setting val="-u"/>
          </section>
        </section>
      </adapter-configuration>
    </adapter-configurations>
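To make the meaning of the launch properties more concrete, the sketch below shows roughly how executable, arguments, workingDir, and envVariables could map onto an ordinary process launch with wrapper code pushed through stdin. This is an approximation under stated assumptions, not the operator's actual launch logic: the environment variable `EXAMPLE_VAR`, the one-line stand-in for the wrapper code, and the use of `sys.executable` in place of a configured path are all hypothetical.

```python
import os
import subprocess
import sys

executable = sys.executable   # stands in for the "executable" setting
arguments = ["-u"]            # stands in for the "arguments" section
working_dir = "."             # stands in for the "workingDir" setting

# The "envVariables" section adds to or overrides the inherited environment.
env = dict(os.environ, EXAMPLE_VAR="example")  # EXAMPLE_VAR is a hypothetical name

# With useTempFile=false the wrapper code is pushed through stdin;
# the trailing "-" tells the interpreter to read its program from stdin.
wrapper_code = "import os; print(os.environ['EXAMPLE_VAR'])\n"

completed = subprocess.run(
    [executable] + arguments + ["-"],
    input=wrapper_code,
    cwd=working_dir,
    env=env,
    capture_output=True,  # loosely comparable to captureOutput=true
    text=True,
)
output = completed.stdout.strip()
print(output)  # example
```

With useTempFile set to true, the wrapper code would instead be written to a temporary file and that file's path passed to the interpreter, which is the mode documented as required for IronPython.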
For Python instances defined in EventFlow, use the Python Instance operator. It uses the same parameters as the configuration file. Python operators within the same EventFlow can refer to this instance by setting the Instance Type property to Local and supplying the Python Instance operator's name in the Local Instance ID property.
This section describes the properties you can set for the Python operator, using the various tabs of the Properties view in StreamBase Studio.
Name: Use this field to specify or change the component's name, which must be unique in the application. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.
Operator: A read-only field that shows the formal name of the operator. If this operator is a global Java operator or your own custom operator, then this field also shows the fully qualified class name that implements the functionality of this operator. If you need to reference this class name elsewhere in your application, you can right-click this field and select Copy from the context menu to place the full class name in the system clipboard.
Start with application: If this field is set to Yes (default) or to a module parameter that evaluates to true, this instance of this operator starts as part of the JVM engine that runs this EventFlow fragment. If this field is set to No or to a module parameter that evaluates to false, the operator instance is loaded with the engine, but does not start until you send an epadmin container resume command (or its sbadmin equivalent), or until you start the component with StreamBase Manager.
Enable Error Output Port: Select this check box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports to learn about Error Ports.
Description: Optionally enter text to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.
Property | Type | Description |
---|---|---|
Instance Type | radio button | When Local is selected, the operator uses the instance defined in the EventFlow with a Python Instance operator. When Global is selected, the instance configuration defined in the adapter-configurations.xml file is used. |
Local Instance ID | text | When Instance Type is set to Local, this provides the name of the local Python Instance operator to use. |
Global Instance ID | text | When Instance Type is set to Global, this provides the name of the Python instance configured in the adapter-configurations.xml file. |
Asynchronous | check box | When checked, the operator executes the script using a non-blocking call. This way, long operations can be executed without suspending the processing in the module. Make sure that module invariants are preserved around the call. Note that, contrary to the concurrent parallel execution in StreamBase, this operator does not allocate additional threads and uses lightweight job scheduling. |
Log Level | Drop-down list | Controls the level of verbosity the adapter uses to issue informational traces to the console. This setting is independent of the containing application's overall log level. Available values, in increasing order of verbosity, are: OFF, ERROR, WARN, INFO, DEBUG, TRACE. |
Property | Type | Description |
---|---|---|
Script | multiline text | Python code to be executed for each incoming tuple. |
Property | Type | Description |
---|---|---|
Output variables | schema definition | Definition of the expected output variables. Each field defined in the schema corresponds to a Python session variable that is expected to have been set by this operator's script or by any previous call. Each output variable must be of a type that can be cast to the corresponding StreamBase field type. See the type conversion table for the available types. |
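As a sketch of how the Script and Output variables properties fit together, consider a script that computes two values from variables set through inputVars. The field names `prices`, `window`, `mean_price`, and `is_rising` are hypothetical. The first two assignments only stand in for values the operator would set before running the script, so that the fragment can be tried in a plain interpreter.

```python
# Stand-ins for variables the operator would set from inputVars
# (hypothetical fields: prices as list of double, window as int).
prices = [10.0, 12.5, 11.0]
window = 2

# --- body as it might be entered in the Script property ---
recent = prices[-window:]
mean_price = sum(recent) / len(recent)  # read back as a double field
is_rising = recent[-1] > recent[0]      # read back as a boolean field
```

For these values to reach the output tuple, the output variables schema would declare a double field named mean_price and a boolean field named is_rising. The intermediate variable recent stays in the session but is not read because it is not part of the schema.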
The input port accepts any incoming tuple transparently. The reserved fields are inputVars and outputVars.
- inputVars: optional tuple containing variables to be set in the Python session.
- outputVars: tuple with the structure defined in Output Variables, containing variables read from the Python session.
- *: arbitrary pass-through fields.

Unrecognized fields are passed through transparently. The inputVars field is not propagated to the output; the outputVars field is not allowed on the input port.
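The field handling rules above can be summarized with a small model in which tuples are represented as dictionaries. This is only an illustration of the documented behavior; the function `process_tuple` and the callback `run_script` are hypothetical names, not part of any StreamBase API.

```python
def process_tuple(in_tuple, run_script):
    """Model of the port rules: drop inputVars, add outputVars, pass the rest."""
    if "outputVars" in in_tuple:
        raise ValueError("outputVars is not allowed on the input port")

    # inputVars is optional; treat a missing or null value as "no variables".
    input_vars = in_tuple.get("inputVars") or {}

    # Every field except inputVars is passed through unchanged.
    out_tuple = {name: value for name, value in in_tuple.items() if name != "inputVars"}
    out_tuple["outputVars"] = run_script(input_vars)
    return out_tuple


def double_x(input_vars):
    # Stand-in for executing the script and reading the output variables.
    return {"y": input_vars["x"] * 2}


incoming = {"id": 7, "symbol": "ABC", "inputVars": {"x": 21}}
outgoing = process_tuple(incoming, double_x)
print(outgoing)  # {'id': 7, 'symbol': 'ABC', 'outputVars': {'y': 42}}
```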