Execute External Workspace Node
The Statistica Workspace is a graphical data preparation and analysis environment, which allows you to create, view and edit a symbolic representation of a flow from the input data to the final results and models. Such flows sometimes become very complex and hard to read when deployed to production or business processes with complex data structures. It is also difficult to collaborate with others when the whole workflow is presented on a single workspace.
Execute External Workspace nodes allow you to create references to other workspaces, pass data to them, execute them, and collect the results of the execution. They make the main workspace easier to read and enable collaboration with team members who may be working on separate building blocks of a large workflow. The Execute External Workspace node accepts multiple inputs but produces a single output and, optionally, collections of reporting documents. It can reference other workspaces deployed to the Statistica Enterprise repository. The user of the main workspace is assumed to have at least read permissions for the external workspace deployed to Statistica Enterprise.
Example
In this example of a credit scoring application, the data comes from three different sources:
Applicant Info: ID, Balance of Current Account, Payment of Previous Credits, Value of Savings, Employed by Current Employer for, Installment in % of Available Income, Marital Status, Gender, Living in Current Household for, Most Valuable Assets, Age, Further running credits, Type of Apartment, Number of previous credits at this bank, Occupation
Credit Info: ID, Duration of Credit, Purpose of Credit, Amount of Credit
Credit Rating: ID, Credit Rating
To build a credit scoring model, all three sources should be merged. The complete workflow may look like the example below:
In practical use cases, the data preparation alone can involve dozens of nodes, which might represent:
Mappings to dictionaries
Various data quality and cleaning procedures
Business rules and transformations
In this example, you can move the data preparation operations (represented here by just two Merge Variables nodes) to a separate external workspace.
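As a rough illustration of what the two merge steps accomplish, the sketch below joins the three sources on their shared ID column in plain Python. It is not Statistica code: the helper name merge_on_id, the inner-join behavior, and the sample rows are all assumptions made for illustration, and the actual Merge Variables node may handle unmatched IDs differently.

```python
def merge_on_id(left, right):
    """Inner-join two lists of row dicts on the shared "ID" column (assumed key)."""
    right_by_id = {row["ID"]: row for row in right}
    return [
        {**row, **right_by_id[row["ID"]]}
        for row in left
        if row["ID"] in right_by_id
    ]

# Made-up sample rows; only the column names come from the example above.
applicant_info = [{"ID": 1, "Age": 34}, {"ID": 2, "Age": 51}]
credit_info = [{"ID": 1, "Amount of Credit": 5000}, {"ID": 2, "Amount of Credit": 1200}]
credit_rating = [{"ID": 1, "Credit Rating": "good"}, {"ID": 2, "Credit Rating": "bad"}]

# Two successive merges, mirroring the two merge nodes in the workflow.
merged = merge_on_id(merge_on_id(applicant_info, credit_info), credit_rating)
```

Each row of merged then carries the columns of all three sources for one ID, which is the single table the modeling steps need.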
Create a new workspace (Execute External Workspace - Credit rating (Merge Data)), and put templates of the input nodes into it, as illustrated below:
Even though this step is not required, we recommend that you include a small subset or a single row of data in the input nodes for testing and troubleshooting purposes. During the execution of the calling workspace, the data in the inputs of the external workspace will be substituted with the respective data in the calling workspace (for nodes mentioned in the input assignment).
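The substitution described above can be pictured with the following minimal sketch. It is a plain Python model rather than Statistica's implementation: the function substitute_inputs, the dictionary layout, and the error raised for an unknown node name are all hypothetical. Nodes not mentioned in the assignment keep their stored test data here, which is how the parenthetical above reads.

```python
def substitute_inputs(external_inputs, calling_data, assignment):
    """Return the data each external input node would hold at run time.

    external_inputs: node name -> test rows stored in the external workspace
    calling_data:    node name -> rows available in the calling workspace
    assignment:      list of (calling node name, external node name) pairs
    """
    resolved = dict(external_inputs)  # start from the stored test data
    for calling_name, external_name in assignment:
        if external_name not in resolved:
            # Assumed behavior: an unknown target node is treated as an error.
            raise KeyError(f"no input node named {external_name!r}")
        resolved[external_name] = calling_data[calling_name]
    return resolved

# Single-row test data kept in the external workspace (made-up values).
external_inputs = {"Credit Info": [{"ID": 0}], "Credit Rating": [{"ID": 0}]}
# Data held by the calling workspace (made-up values).
calling_data = {"Credit Info": [{"ID": 1}, {"ID": 2}]}

resolved = substitute_inputs(
    external_inputs, calling_data, [("Credit Info", "Credit Info")]
)
```

In this run, Credit Info receives the two rows from the calling workspace, while Credit Rating, which is not mapped, still holds its single test row.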
Once the workspace is created, mark the name of the node that produces the spreadsheet to be returned to the calling workspace. In this example, it is Merge Variables2.
Finally, deploy the workspace to Statistica Enterprise, following a set of standard steps.
At this point, you can create the main modeling workflow as follows. The data preparation is now replaced by a single Execute External Workspace node.
Workspace Path
Select the complete path to the external workspace deployed to Statistica Enterprise. You can set this parameter either by entering the path as a string or by using the Select button to browse the Statistica Enterprise repository.
Version Information
Select the version information from this dropdown list. If SDMS integration is enabled in Statistica Enterprise, either the latest or the latest approved version of the external workspace can be used in the analysis.
Use Defined Input/Output Nodes
Check this option to ignore the Inputs and Output Assignment parameters, and use the nodes marked as Input/Output in the external workspace.
Inputs Assignment
In this textbox, map your inputs. This is the key parameter of this node, as it maps the inputs of the Execute External Workspace node on the main workspace to the inputs of the referenced workspace.
Put the names of the respective nodes in quotes.
Use the sign -> for mapping (node name on the main workspace -> node name on the external workspace).
Use a semicolon as a delimiter between multiple mappings.
Example:
"Applicant Info"->"Applicant Info";"Credit Info"->"Credit Info";"Credit Rating"->"Credit Rating"
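To make the syntax rules above concrete, here is a small Python sketch that splits such a string into its individual mappings. The function parse_inputs_assignment is a hypothetical helper, not part of Statistica. It assumes that node names contain neither double quotes nor semicolons, and it tolerates surrounding whitespace and a trailing semicolon, which the node itself may or may not accept.

```python
import re

# One mapping: "main workspace node" -> "external workspace node"
PAIR = re.compile(r'\s*"([^"]+)"\s*->\s*"([^"]+)"\s*')

def parse_inputs_assignment(text):
    """Split an Inputs Assignment string into (main node, external node) pairs."""
    pairs = []
    for chunk in text.split(";"):
        if not chunk.strip():
            continue  # skip empty pieces, e.g. after a trailing semicolon
        match = PAIR.fullmatch(chunk)
        if match is None:
            raise ValueError(f"malformed mapping: {chunk!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs

mapping = parse_inputs_assignment(
    '"Applicant Info"->"Applicant Info";'
    '"Credit Info"->"Credit Info";'
    '"Credit Rating"->"Credit Rating"'
)
```

For the example string, mapping holds three pairs, one per input. A name left unquoted raises ValueError in this sketch, which can help catch typing mistakes before the string is pasted into the node.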
Output Assignment
This parameter defines the name of the node on the external workspace from which the data will be extracted and placed in the spreadsheet node downstream of the Execute External Workspace node. The name should not be enclosed in quotes and should exactly match the name of the node on the external workspace.