General Scripting Nodes - General Code Nodes

Introduction:

The General Scripting Host WPF Node facilitates the creation of WPF-style (newer style) script nodes in Statistica.

The node supports these functions:

  1. Creation of various kinds of parameters
  2. Creation/selection of code
  3. Execution of code in the context of the node in the workspace
  4. Gathering of any results produced
  5. Exporting a node to a file to support distribution/sharing.

    An instance of this node can be created using any of the four templates (one each for SVB, R, Python and C#) already included in the Node Browser, or by importing a new node using the GenericCodeNode.dmi file located in the DataMiner>NewNodes folder in Statistica’s installation directory.

1. Parameters Page:

This node allows the user to create parameters without having to create a DMI file to describe the parameters and their metadata. The Parameters page on the node dialog box displays all the parameters currently associated with an instance of the node.

  • You can add, edit, and delete parameters on this page, using the buttons provided.
  • You can modify the values of parameters on this page.
  • You can create new parameters by using the buttons in the top-left corner of the property grid.
  • You can specify certain constraints/requirements on the values of the parameters when the parameters are defined. Any current value of a parameter that does not satisfy the specified requirements is highlighted in red.

1.1 Adding a new parameter:

    1. Click the Add Property button to display the Property Definition Editor.
    2. Use this dialog box to specify the metadata for a new parameter.

      The following types of parameters can be specified using this dialog:

  • String
  • Long Text
  • Integer
  • Floating Point
  • Boolean
  • Enum
  • Variable Selection

    Metadata for any given parameter is partitioned into two groups:

  • Basic
  • Extended

    Basic properties are the same for every type of parameter. Extended properties vary with the type of the parameter.

1.1.1 Basic Properties:

These basic properties are available on all types of parameters:

Property | Requirement | Description
Type | Required | The type of the parameter
Name | Required | The name/key under which the parameter becomes available in the scripting environment. The name must start with a letter (A-Z or a-z) or an underscore (_); the remaining characters can be letters (A-Z or a-z), digits (0-9), or underscores. The name must be unique: it must not match the Name of any existing parameter
Display Name | Required | The name used to display this parameter in the property grid on the Parameters page. The display name must be unique: it must not match the Display Name of any existing parameter
Description | Optional | The description shown in the footer of the property grid when the property is selected
Category | Optional | The token/phrase used to group parameters in the property grid on the Parameters page
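
The Name rule above can be expressed as a short check. The following Python sketch is only an illustration of the rule as stated, not Statistica’s own validation code. The function name check_parameter_name is hypothetical, and the case-sensitive comparison used for uniqueness is an assumption.

```python
import re

# Pattern derived from the Name rule: the first character is a letter or an
# underscore; the remaining characters are letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_parameter_name(name, existing_names):
    """Return a list of problems with a proposed Name (empty list means OK).

    Hypothetical helper. Uniqueness is compared case-sensitively, which is
    an assumption rather than documented behavior.
    """
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("invalid characters")
    if name in existing_names:
        problems.append("duplicate name")
    return problems


# Example: check two candidate names against one existing parameter.
existing = {"beta"}
print(check_parameter_name("alpha_1", existing))
print(check_parameter_name("1alpha", existing))
```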

1.1.2 Extended Properties:

These properties vary depending upon the type of the parameter.

1.1.2.1 String Type:

The following extended properties are supported on a string type parameter.  

Property | Description
Required | Specifies whether a null value is accepted
Default Value | Default value to be used for the parameter
Minimum Length | Minimum length for the value to be acceptable
Maximum Length | Maximum length for the value to be acceptable
Regex Match | A regular expression pattern that the value of the parameter must satisfy to be acceptable
Field Order | The ordinal position of the parameter in the property grid on the Parameters page
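
To make the interplay of these constraints concrete, here is a minimal Python sketch of how a string value could be checked against them. It is an illustration under stated assumptions, not the node’s actual implementation. The function name and argument names are hypothetical, Required is read as "a null value is rejected", and the pattern is applied with re.search, so a pattern must be anchored with ^ and $ to constrain the whole value.

```python
import re


def string_value_problems(value, required=False, min_length=None,
                          max_length=None, regex_match=None):
    """Return the constraints a string value violates (empty list means OK).

    Hypothetical helper mirroring the extended properties of a String
    parameter; None stands in for a null value.
    """
    problems = []
    if value is None:
        if required:
            problems.append("value is required")
        return problems  # length and pattern checks do not apply to null
    if min_length is not None and len(value) < min_length:
        problems.append("shorter than Minimum Length")
    if max_length is not None and len(value) > max_length:
        problems.append("longer than Maximum Length")
    # Assumption: the pattern may match anywhere unless it is anchored.
    if regex_match is not None and re.search(regex_match, value) is None:
        problems.append("does not satisfy Regex Match")
    return problems


# Example: a lowercase-only value between 2 and 5 characters long.
print(string_value_problems("abc", min_length=2, max_length=5,
                            regex_match=r"^[a-z]+$"))
```

A non-empty result corresponds to the situation in which the property grid would highlight the value in red.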

1.1.2.2 Long Text Type:

The extended properties of the Long Text type are very similar to those of the String type, except that the Long Text type does not support the Regex Match field.

1.1.2.3 Integer Type:

The following extended properties are supported on Integer type parameters:  

Property | Description
Required | Specifies whether a null value is accepted
Default Value | Default value to be used for the parameter
Minimum Length | Minimum length for the value to be acceptable
Maximum Length | Maximum length for the value to be acceptable
Field Order | The ordinal position of the parameter in the property grid on the Parameters page

1.1.2.4 Floating Point Type:

The extended properties of the Floating Point type are similar to those of the Integer type.

1.1.2.5 Boolean Type:

The following extended properties are supported on Boolean type parameters:  

Property | Description
Default Value | Default value to be used for the parameter
Field Order | The ordinal position of the parameter in the property grid on the Parameters page

1.1.2.6 Enum Type:  

The following extended properties are supported on Enum type parameters.

Property | Description
Enum Options | The (enum) options made available for the parameter
Field Order | The ordinal position of the parameter in the property grid on the Parameters page

    1. Click the Enum Options field to open a dialog box in which you define the options available for the parameter.
    2. In this dialog box, define pairs of enum names and values. Enum names and values must be unique.

1.1.2.7 Variable Selection Type:

The following extended properties are supported on Variable Selection type parameters:

Property | Description
Variable Type | Specifies which kinds of variables are offered for selection: All, Categorical, or Continuous
Min Vars | Minimum number of variables that must be selected for the selection to be valid
Max Vars | Maximum number of variables that can be selected for the selection to be valid
Var.Sel.Title | The title to be displayed on the variable selection dialog
Var.Sel.Caption | A caption/prompt to be displayed on the variable selection dialog
Field Order | The ordinal position of the parameter in the property grid on the Parameters page

When the variable selection is attempted on the parameters page:

  • A message displays if no input datasets are connected to the node.
  • A dataset selection displays if multiple datasets are connected to the node.

    The following table maps the type of a parameter as defined in the node to the type of that parameter in the scripting environment.

Parameter type in node | Parameter type in script environment
String | String
Long Text | String
Integer | Integer
Floating Point | Double
Boolean | Boolean
Enum | Integer
Variable Selection | String
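
Because an Enum parameter reaches the script as an Integer, a script may want to turn that number back into a readable option. The Python sketch below shows one way to do so with the standard enum module. The option names and values are hypothetical stand-ins for whatever was entered in the Enum Options dialog, and it is an assumption that the integer received is the value defined for the selected option.

```python
from enum import IntEnum, unique


@unique  # rejects duplicate values, in line with the uniqueness rule for enum options
class Smoothing(IntEnum):
    # Hypothetical enum names and values, as they might be defined for a
    # parameter in the Enum Options dialog.
    NONE = 0
    MOVING_AVERAGE = 1
    EXPONENTIAL = 2


def decode_smoothing(raw_value):
    """Convert the integer received by the script into a named option.

    Raises ValueError if the integer is not one of the defined values.
    """
    return Smoothing(int(raw_value))


# Example: the script received 1 for this parameter.
choice = decode_smoothing(1)
print(choice.name)
```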

1.2 Editing an existing parameter:

    1. Select the parameter to edit. A parameter must be selected before it can be edited.
    2. Click the Edit Property button (the button with a pencil icon) to display the Property Definition dialog box. You can also open this dialog box by double-clicking the parameter.
    3. Use the dialog box to modify the metadata of the parameter.

1.3 Deleting a parameter:

To delete a parameter, select it and click the Delete Property button. A parameter must be selected before it can be deleted.

2. Code Page:  

This node allows the user to create/specify code to be executed when the node is run.

Currently, SVB, R, IronPython, and Python (external interpreters) are supported. Select the language of choice from the Language dropdown list. When you first create the node, a starter script snippet is displayed for each language selection.

You can specify code in two ways:

  • In the script window (a multiline text box)
  • By specifying a script file. Select the Use Script File check box to use a file; while the check box is selected, the script window is disabled.

To specify a file:

  • You can type the path to a file into the adjacent text box.
  • You can click the adjacent buttons to open the local file selection dialog or the enterprise browser dialog. The script file can be located either on the local file system or in the Statistica Enterprise system.
  • When using files from Statistica Enterprise (such as enterprise:/Folder1/Folder2/File1.svb or enterprise:/ABC/xyz.py), version information can be appended to the file path after a | character (for example, enterprise:/Folder1/Folder2/File1.svb|LatestApproved or enterprise:/abc/def/xyz.svb|2).
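
The path format in the examples above can be taken apart with a few lines of Python. This is a sketch based only on those examples: the function names are hypothetical, and the parsing rule (everything after the first | is the version) is an assumption inferred from the examples, not a documented grammar.

```python
def is_enterprise_path(path):
    """Return True if the path uses the enterprise:/ prefix shown in the examples."""
    return path.startswith("enterprise:/")


def split_enterprise_path(path):
    """Split a script path into (location, version).

    The version is the text after the first '|', or None when the path
    carries no version suffix. Hypothetical helper for illustration.
    """
    location, separator, version = path.partition("|")
    return location, (version if separator else None)


# Example: a path pinned to the latest approved version.
print(split_enterprise_path("enterprise:/Folder1/Folder2/File1.svb|LatestApproved"))
```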

    When you specify a script file using the file selection dialog or the enterprise browser dialog, a message asks whether the script window should be updated.

  • If you select the Yes option, a copy of the source file will be reproduced in the script window and that local copy will be used for further runs.
  • If you select the No option, the source file will be used every time the node is run.

3. Execution of code in the node-context:

When the node is run:

  • Parameters defined on the node are injected into the script execution environment.
  • Mechanisms are put in place to give the script access to any datasets connected to the node.
  • Mechanisms are put in place to route spreadsheets and reports into the node’s output folder.
  • The code is executed by the engine that corresponds to the selected language.
  • Any errors encountered during the execution of the script are reported back to the user.

3.1 R-specific information:

When the R language is selected on the Code Page, the R integration framework (assuming R Integration is enabled) is invoked to execute the code specified in the node. Parameters defined in the node become available in the R environment as variables with the names specified by the Name field and of the appropriate types. The following extensions/mechanisms are in effect:

ActiveDataSet

  • NULL if no input spreadsheets are attached to the node
  • Data frame if one spreadsheet is attached to the node
  • A list of data frames if more than one spreadsheet is attached to the node

Spreadsheet

A function that can be used to open a Statistica spreadsheet and make the data therein available as a data frame.

RouteOutput

  • A function that sends data back to Statistica as spreadsheets, either to the node’s output document collection or as a downstream document.
  • The AsDownstream parameter (false by default) can be set to true to mark a document as the downstream document for the node.
  • If multiple documents are routed downstream, the last one routed is used as the downstream document.
  • Any console output is captured automatically and sent to the node’s reporting documents folder as a report.
  • Any plots produced are captured automatically and sent to the node’s reporting documents folder as graphs.

3.2 SVB-specific information:

When the SVB language is selected on the Code Page, Statistica’s SVB engine is invoked to execute the code specified in the node.

Parameters defined in the node become available in the SVB environment through the Dictionary extension with a key as specified by the Name field and of appropriate types.

This node supports two signatures for SVB:

Signature One

Sub Main

End Sub

When using the first signature, no ready mechanisms are available to access the input datasets or node parameters, or to send documents to the node's reporting documents collection.

Signature Two

Sub AnalysisNode( _
    dataIn() As InputDescriptor, _
    ByVal reportDocs As StaDocCollection, _
    dataOut() As InputDescriptor)

End Sub

When using the second signature, all these mechanisms are available, just as in SVB nodes. Any documents added to the reportDocs collection are pushed to the node’s reporting documents collection folder. The last document (if any) in dataOut is used as the downstream document for the node.

3.3 Python and IronPython-Specific Information

See: Overview and Example: Python and IronPython Scripted Nodes

IronPython-specific information:

When the IronPython language is selected on the Code Page, Statistica’s IronPython engine is invoked to execute the code specified in the node.

The following extensions/mechanisms become available:

  • COM-based access to Statistica becomes available through the object named Application.
  • Parameters defined in the node become available in the IronPython environment through the NodeParameters extension, with the key specified by the Name field and of the appropriate types.
  • Any spreadsheets connected to the node become available through the InputContainer object.
  • Any spreadsheet assigned to DownstreamDocument is used as the downstream document.
  • Any documents added to the OutputContainer object are pushed to the node’s reporting document collection.

3.4 Python-specific information:

When an external Python interpreter is selected on the Code Page, Statistica’s Python integration engine is invoked to execute the code specified in the node.

The following mechanisms/extensions become available:

  • COM-based access to Statistica becomes available through the object named Application.
  • Parameters defined in the node become available in the Python environment through the NodeParameters dictionary, with the key specified by the Name field and of the appropriate types.
  • Any spreadsheets connected to the node become available through the InputContainer object.
  • Any spreadsheet assigned to DownstreamDocument.DataSource is used as the downstream document.
  • Any documents added to the OutputContainer object are pushed to the node’s reporting document collection.

    Functions:

    • An object named ActiveDataSet provides a wrapper that converts an input spreadsheet to a pandas (>= 0.18.1) data frame and returns it.
      • The object supports numerical indexing (starting at 0) and string indexing (by the name of the input spreadsheet).
      • As a result, an input spreadsheet can be accessed as a data frame with, for example, ActiveDataSet[1] or ActiveDataSet["AdStudy"]. If the conversion fails, or if the index is out of range, None is returned.
      • This extension requires the pandas (>= 0.18.1) library to be installed.
      • Conversion occurs on demand.
    • A function named Spreadsheet provides a wrapper that opens a Statistica spreadsheet from the local disk system and converts it into a pandas (>= 0.18.1) data frame.
      • If loading or conversion fails, None is returned.
      • This extension requires the pandas (>= 0.18.1) library to be installed.
    • A function named RouteOutput allows pandas (>= 0.18.1) data frames to be routed back to Statistica as spreadsheets.
      • If the AsDownstream parameter is set to True (False is the default), the spreadsheet becomes the downstream document for the node.
    • A function named RouteReport allows strings or text files to be routed back to Statistica as reports.
      • If the content parameter is set to a string, that string is sent to a report.
      • If content is None and file is set to a valid text file, the contents of the text file are sent to the report.
      • The title parameter can be used to set a title for the report.
    • A function named RouteImage allows images to be routed to Statistica as graphs.
      • The title parameter can be used to set a title for the graph.
    • A function named RoutePlotsToStatistica installs a plotting hook for matplotlib.pyplot.
      • If this function is called after matplotlib.pyplot is imported and before the show function is invoked on the plot object, plots are routed to Statistica as graphs.
      • Only one call to this function is needed per script.
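
The extensions listed above might be combined in a node script along the following lines. This is a hedged sketch, not an official template. The parameter Name min_rows is hypothetical, passing the data frame as the first positional argument of RouteOutput is an assumption, and RouteReport is called with the content and title parameters named above. The stand-in objects in the else branch exist only so the sketch can also be run outside Statistica; they are not Statistica’s implementation.

```python
def run_node(parameters, datasets, route_output, route_report):
    """Route the first input downstream if it has enough rows.

    Inside a node, the arguments would be NodeParameters, ActiveDataSet,
    RouteOutput and RouteReport. Returns True if data was routed.
    """
    min_rows = parameters["min_rows"]  # hypothetical parameter Name
    frame = datasets[0]                # first input; None if unavailable
    if frame is None or len(frame) < min_rows:
        route_report(content="No usable input spreadsheet.", title="Node status")
        return False
    # Assumption: the data frame is the first positional argument.
    route_output(frame, AsDownstream=True)
    route_report(content="Routed %d rows downstream." % len(frame),
                 title="Node status")
    return True


if "NodeParameters" in globals():
    # Running inside a Statistica Python node: use the host-provided names.
    run_node(NodeParameters, ActiveDataSet, RouteOutput, RouteReport)
else:
    # Standalone run: minimal stand-ins that only record what was routed.
    recorded = []

    class NoInput:
        def __getitem__(self, key):
            return None  # mimics "None is returned" when no data is available

    def fake_route_output(frame, AsDownstream=False):
        recorded.append(("output", AsDownstream))

    def fake_route_report(content=None, file=None, title=None):
        recorded.append(("report", title, content))

    run_node({"min_rows": 1}, NoInput(), fake_route_output, fake_route_report)
    print(recorded)
```

Keeping the node logic in a function that receives the host objects as arguments makes it possible to exercise the script with stand-ins before it is pasted into the script window.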

4. Gathering of any results produced:

Already addressed in Section 3.

5. Exporting a node:

After a node is created:

    1. Export it to a DMI file by clicking the Export Node button on the node.
    2. In the dialog box that is displayed, enter information about the node.
    3. Publish the node.

      You can import the published DMI file into the Node Browser and use it in a workspace.