Runtime Node Configuration

Overview

This article provides a reference for writing a StreamBase Runtime node deployment configuration file where the HOCON type is com.tibco.ep.dtm.configuration.node.

The node deploy configuration file defines deployment-time values and can also override, and possibly augment, the default application configuration. Deployment-time configuration can be specified for multiple nodes in a single node deploy configuration file, and the nodes can be in the same or different clusters. When a node is installed, the node configuration to use is determined by matching the name of the node being installed with a node name in the node deploy configuration file.

Node deployment configuration files can be specified using the epadmin install node command, or they can be packaged in an application archive. Packaging a node deployment configuration in an application archive provides an easy mechanism to ship application-specific default node deploy configuration values.

In addition to the configuration data defined in a node deploy configuration file, the node deploy configuration can also contain arbitrary configuration. There are two sections in the node deploy configuration file for this configuration:

  • global — applies to all nodes defined in the node deployment configuration file.

  • per-node — applies to a specific node in the node deployment configuration file.

Configuration files can also be contained in the fragments in an application archive and in the application itself.
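
As a minimal sketch of where the two sections sit, the fragment below assumes that the global section corresponds to the globalConfiguration array and the per-node section to the configuration array inside a node entry, both described later on this page. The node name A1.nodedeploy is only an example, and both arrays are left empty.

```hocon
configuration = {
  NodeDeploy = {
    // global: applies to all nodes defined in this file
    globalConfiguration = []
    nodes = {
      "A1.nodedeploy" = {
        // per-node: applies only to A1.nodedeploy
        configuration = []
      }
    }
  }
}
```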

Audits

The node deploy configuration is audited whenever it changes state (for example, load-to-active state). There are detailed audits for each of the configuration objects in the node deploy configuration file. Also, the following audits are enforced on node deploy configuration during application installation:

  • Node deploy configuration can only be contained in an application archive. Node deploy configurations cannot be contained in fragment archives. Node deploy configurations found in a fragment archive will cause an audit failure during application installation.

  • There can only be a single node deploy configuration in an application archive. Multiple node deploy configurations in an application archive will fail the audit during application installation.

  • All node deploy configurations loaded when a node is installed must have the same configuration name. Node deploy configurations with different configuration names will cause an audit failure during application installation.

Required Header Lines

Each configuration file must contain the following header lines, typically found at the beginning of each file:

name

Specifies an arbitrary, case-sensitive string to name this configuration, which must be unique among other files with the same type, if any. Configuration files can refer to each other by this name. Select a name that reminds you of this configuration's type and purpose. For example:

name = "NodeDeployment"
version

Specifies an arbitrary version number that you can use to keep track of file versions for this configuration type in your development project. The maintenance of version numbers is under user control; StreamBase does not compare versions when loading configuration files during the fragment launch process. The version number is a string value, and can contain any combination of characters and numbers. For example:

version = "1.0.0"
type

This essential setting specifies the unique HOCON configuration type described on this page.

type = "com.tibco.ep.dtm.configuration.node"

The three header lines taken together constitute a unique signature for each HOCON file in a project's configurations folder. Each project's configurations folder can contain only one file with a given signature.

The top-level configuration object defines the configuration envelope the same way for all HOCON file types.

configuration

On a line below the header lines, enter the word configuration followed by an open brace. The configuration object is a sibling of the name, version, and type identifiers, and serves to define the configuration envelope around this type's objects as described on this page. The file must end with the matching close brace.

configuration = {
...
...
}
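
Putting the header lines and the envelope together, a minimal file might look like the following sketch. The name and version values are arbitrary, the node name A1.nodedeploy is illustrative, and the empty node entry simply accepts all defaults.

```hocon
name = "NodeDeployment"
version = "1.0.0"
type = "com.tibco.ep.dtm.configuration.node"

configuration = {
  NodeDeploy = {
    nodes = {
      // Node entry with no overrides; default values apply.
      "A1.nodedeploy" = {
      }
    }
  }
}
```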

HOCON Properties Explained

The following sections describe the configuration's HOCON properties and their usage, with syntax examples where applicable.

NodeDeploy

Node deployment configuration

globalConfiguration

String. An array of global late-bound application configuration objects. The configurations are loaded onto all nodes in the cluster. Format is identical to the configuration service HOCON configuration format, but in string form.

This array is optional and has no default value; it is used only during the boot process.
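
Because each array element is a complete configuration in string form, one way to write it is as a HOCON triple-quoted string, which avoids escaping the inner double quotes. The sketch below is illustrative only: the embedded name, the type com.example.configuration.placeholder, and exampleProperty are hypothetical placeholders, not real configuration types or properties.

```hocon
globalConfiguration = [
  // Hypothetical embedded configuration, shown only to illustrate the string form.
  """
  name = "my-global-configuration"
  version = "1.0.0"
  type = "com.example.configuration.placeholder"
  configuration = {
    exampleProperty = "value"
  }
  """
]
```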

nodes

Associative node instances, keyed by node name.

A1.nodedeploy

String. Example of a node name containing the following objects:

description

Human-readable description.

For example:

"First node in the nodedeploy cluster for testing."
nodeType

The node's type. This is a reference to a type defined in the application definition configuration file. If no matching type is found, the configuration will fail. This property is optional; its default value references the built-in node type, default.

For example:

"nodetype1"
engines

Associative objects describing engines that run the fragments used by this node's nodeType. Each engine name is mapped to a descriptor that contains the ID of the fragment the engine is running, plus any engine-specific configurations. The identifier must be present in a single fragment archive's manifest TIBCO-EP-Fragment-Identifier property.

This object is optional and is used only during the boot process. If absent, the node will run a single engine for each fragment supported by its node type.

engine1

String. An example of an engine name, containing the following properties and arrays:

fragmentIdentifier

String. The binding's fragment identifier.

For example:

fragmentIdentifier = "fragment1"
configuration

String. An array of engine-specific initial late-bound application configuration objects. Configurations that inherit from EngineAffinityConfiguration are given a default engine affinity value of this engine. Format is identical to the configuration service HOCON configuration format, but in string form. This array is optional and has no default value.

For example:

configuration = []
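
As a sketch of how engine names map to descriptors, the fragment below declares two engines for one node. The engine names and the fragment identifiers fragment1 and fragment2 are illustrative; each identifier would have to match the TIBCO-EP-Fragment-Identifier property in the manifest of a fragment archive in the application.

```hocon
engines = {
  engine1 = {
    fragmentIdentifier = "fragment1"
    configuration = []
  }
  engine2 = {
    fragmentIdentifier = "fragment2"
    // configuration is optional and is omitted here
  }
}
```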
communication

Communication ports. Optional. If unset, a set of default property values applies; the defaults are set in the contained types.

numberSearchPorts

Long. The number of ports to search before reporting a distribution listener start failure. The search is started at the configured network listener port number for each distribution listener interface, and is then incremented by one on each failure up to this value. A value of 0 disables port search. The minimum value that can be specified is the total number of fragment engines plus one. Optional. Default value is 20.

In effect, this value specifies the size of the port range reserved for a node's auto-generated distribution listeners. If you specify a dataTransportPort value of n, the ports potentially used range from n through n + numberSearchPorts. Do not assign a dataTransportPort value for one node that falls within the port range of another node; that is, stagger dataTransportPort assignments at least numberSearchPorts apart.

For example:

numberSearchPorts = 10
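
The staggering rule can be sketched as follows, assuming two nodes that share a host. The port numbers 5000 and 5100 are arbitrary illustrations; the only point is that the second value lies outside the first node's search range.

```hocon
nodes = {
  "A1.nodedeploy" = {
    communication = {
      numberSearchPorts = 20
      distributionListenerInterfaces = [ {
        dataTransportPort = 5000   // search starts at 5000 and moves upward
      } ]
    }
  }
  "A2.nodedeploy" = {
    communication = {
      numberSearchPorts = 20
      distributionListenerInterfaces = [ {
        dataTransportPort = 5100   // well clear of A1's search range
      } ]
    }
  }
}
```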
discoveryPort

Long. Broadcast discovery port. This property is optional and its value defaults to 54321.

For example:

discoveryPort = 2222
discoveryRequestAddresses

String. An array of broadcast discovery client request addresses, each either an IPv4 address or a DNS host name. The discovery requests are broadcast to the discovery port on the network interface associated with each network address. The discovery request recipients always listen on all interfaces; that is not configurable. This array is optional and its default value is a single address, which is the system host name.

For example:

discoveryRequestAddresses = [ "localhost" ]
administration

Communication settings for the administration transport. This object is optional. If unset, a set of default values applies. These defaults are set in the contained types.

address

String. Administration interface address. You can specify this optional property as an IPv4 or DNS address; this is used only during the boot process. The default of "" means listen on all addresses.

For example:

address = "localhost"
transportPort

Long. Administration transport port number. This property is used only during the boot process, is optional, and its value defaults to 0, indicating that the node should auto-generate the port number.

For example:

transportPort = 1000
webEnable

This optional property enables the administration web server, which is enabled by default. Setting this property to false disables all queries of the Runtime REST API, whether browser-based or not (for example, the web server cannot then be enabled from the command line using epadmin start web).

For example:

webEnable = true
webPort

Administration web server port number. A value of 0 causes the node to auto-generate the port number starting from 8008. Optional. Default value is 0.

For example:

webPort = 0
webUIEnable

This optional property enables the administration web server's endpoint help UI, which is enabled by default. When true, the web interface is enabled and browser-based REST API queries are possible. When false, browser-based REST API queries are not enabled, though queries are still possible through other methods.

For example:

webUIEnable = true
webMaximumFileSizeMegabytes

Maximum file size in megabytes for downloaded files if webFileSizeHandling is set to LIMITED_FILE_SIZE. Value must be > 0. Optional. Default value is 5.

webFileSizeHandling

Controls the maximum size of downloaded files. FILE_UPLOAD_NOT_ALLOWED disables file downloads. LIMITED_FILE_SIZE restricts file size to the value specified in webMaximumFileSizeMegabytes. UNLIMITED_FILE_SIZE sets no limit on downloaded file size. Optional. Default value is LIMITED_FILE_SIZE.

webMaximumFileCacheTimeMinutes

Time in minutes to cache downloaded files if webFileCacheTimeHandling is set to TIMED_CACHE. Value must be > 0. Optional. Default value is 15.

webFileCacheTimeHandling

Controls the caching of downloaded files. CACHE_FOREVER stores the files forever; files are never deleted. DO_NOT_CACHE never stores downloaded files on the node; files are immediately deleted after use. TIMED_CACHE caches the files for the amount of time specified in webMaximumFileCacheTimeMinutes. Optional. Default value is TIMED_CACHE.
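
The four file-handling properties work in pairs: each handling property selects a mode, and the matching maximum applies only in the limited or timed mode. The sketch below assumes these properties are set in the administration object alongside webEnable, and that enumeration values are written unquoted, as the other enumerations in this page's sample are. The values 10 and 30 are arbitrary.

```hocon
administration = {
  webEnable = true
  // Restrict file size to 10 megabytes.
  webFileSizeHandling = LIMITED_FILE_SIZE
  webMaximumFileSizeMegabytes = 10
  // Keep cached files for 30 minutes.
  webFileCacheTimeHandling = TIMED_CACHE
  webMaximumFileCacheTimeMinutes = 30
}
```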

webServiceBindings

A binding between a web service name and authentication realm name. This provides a mechanism to override the default node administration authentication realm for a specific web service. Available web services are: {admin | healthcheck}. This object is optional and has no default value.

authenticationRealmName

Authentication realm name for the specified web service binding. Required.

For example:

webServiceBindings = {
  "admin" = {
    authenticationRealmName = "my-local-auth-realm"
  }
  "healthcheck" = {
    authenticationRealmName = "my-other-local-auth-realm"
  }
}
distributionListenerInterfaces

Distribution transport. If this optional array is unset, a set of default values applies. These defaults are set in the contained types.

address

String. Listen interface address. This address can be specified as an IPv4, IPv6, or DNS name. A special prefix of IPoSDP: indicates the use of Infiniband sockets direct protocol (Linux only).

If this optional property is unset, the default is "" indicating listening on all interfaces.

For example, address = "localhost".
dataTransportPort

Long. Distribution listener port number. This property is optional and its value defaults to 0, indicating that the node should auto-generate the port number. Do not assign another dataTransportPort value for one node within the port range of another node, as described for the numberSearchPorts property above.

For example:

dataTransportPort = 1001
secure

Deprecated as of 10.5.0. Use secureCommunicationProfileName instead.

Bool. A secure-transport indicator. If true, TLS is used to secure communication to the host; if false, it is not. This property is optional and its default value is false.

For example:

secure = false
secureCommunicationProfileName

Name of a secure communication server profile to use to configure secure communications for the node's administration, distribution, and REST API listeners. The secure communication server profile specified here must include a truststore. This profile is also used for node-to-node connections and the truststore is required to validate the certificates sent by the other nodes in the cluster. This property is optional and has no default value.
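
A possible migration from the deprecated secure flag is sketched below. It assumes that secureCommunicationProfileName is set directly in the communication object, since it applies to the administration, distribution, and REST API listeners together; verify the placement against your release. The profile name is hypothetical and must refer to a secure communication server profile defined elsewhere.

```hocon
communication = {
  distributionListenerInterfaces = [ {
    address = "localhost"
    dataTransportPort = 1001
    // secure = false   (deprecated as of 10.5.0, so no longer set)
  } ]
  // Hypothetical profile name; the referenced profile must include a truststore.
  secureCommunicationProfileName = "my-secure-server-profile"
}
```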

proxyDiscovery

In the case where node discovery cannot be achieved at runtime (for example, UDP broadcast is not permitted), you can configure node discovery here as a proxy discovery.

remoteNodes

String. List of remote nodes to which to provide proxy discovery services.

For example:

remoteNodes = [ "A2.nodedeploy" ]
configuration

An array of node-specific initial late-bound application configuration objects.

availabilityZoneMemberships

An array of availability zones that this node is part of.

staticzone

The memberships are an associative array keyed by availability zone. In this example there is a downstream availability zone called staticzone. This node is declaring membership in this zone and binding that membership to a set of partitions.

staticPartitionBindings

Static partition binding object.

For example:

staticPartitionBindings = {
  P1 = {
    type = ACTIVE
    replication = SYNCHRONOUS
  }
}
Partition1

String. Example partition name. A partition entry, here named Partition1, contains the following properties:

type

Valid types are ACTIVE and REPLICA. This property is required.

restoreFrom

DEPRECATED as of 10.4.0.

Specify that the partition should be restored from this node. When this property is set to true, the partition defined on this node is loaded to the local node. This should be done when restoring a node from a split-brain situation, where this node is the node in the cluster where all objects should be preserved, and the local node is the node being restored. Any conflicts during restore will preserve the objects on this node, and remove the conflicting objects on the local node.

A restore is needed when multiple nodes are currently the active node for a partition in a cluster due to a split-brain scenario. In this case, the application needs to decide which active node will be the node where the objects are preserved during a restore. Note that this node does not necessarily have to be the node that becomes the partition's active node after the restore completes.

The actual restore of the partition is done in the enable() or enablePartitions() method when the JOIN_CLUSTER_RESTORE enableAction is used. If any other enableAction is used, object data is not preserved, and no restoration of partition objects is done.

If restoreFrom is not set after a split-brain scenario, the runtime performs a cluster-wide broadcast to find the current active node, and uses that node to restore instances in the partition. If multiple active nodes are found, the first responder is chosen.

This optional property's default value is false.

replication

Replication type. Valid values are SYNCHRONOUS and ASYNCHRONOUS. This property is optional and its default value is SYNCHRONOUS.
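
As an illustration of the two binding types, the sketch below binds partition P1 as ACTIVE on one node and as REPLICA on another within the same zone. The node names follow the examples on this page; the replica binding omits replication and therefore takes the SYNCHRONOUS default.

```hocon
nodes = {
  "A1.nodedeploy" = {
    availabilityZoneMemberships = {
      staticzone = {
        staticPartitionBindings = {
          P1 = {
            type = ACTIVE
            replication = SYNCHRONOUS
          }
        }
      }
    }
  }
  "A2.nodedeploy" = {
    availabilityZoneMemberships = {
      staticzone = {
        staticPartitionBindings = {
          P1 = {
            type = REPLICA   // replication defaults to SYNCHRONOUS
          }
        }
      }
    }
  }
}
```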

dynamiczone

The memberships are an associative array keyed by availability zone. In this case there is a downstream availability zone called dynamiczone. This node is declaring membership in this zone and binding that membership to a set of partitions.

votes

Long. Number of quorum votes for this node. Optional. If not set, a default of 1 is used.

For example, votes = 1
dynamicPartitionBinding

This object defines a dynamic partition binding for a node. Dynamic partitions can also be bound to a node using the primaryMemberPattern and the backupMemberPattern properties in the DynamicPartitionPolicy configuration object.

type

Member type. For dynamic groups, valid types are PRIMARY and BACKUP.

For example, type = PRIMARY
availabilityZones

An associative array of availability zones, keyed by zone name. Each node can be part of zero or more zones.

staticzone

The memberships are an associative array keyed by availability zone. In this case there is a downstream availability zone called staticzone. This node is declaring membership in this zone and binding that membership to a set of partitions.

dataDistributionPolicy

Data distribution policy that applies to this availability zone. This is a reference to a policy defined in the application definition file. If no policy is found, the configuration will fail.

For example:

dataDistributionPolicy = "static"
staticPartitionPolicy

Static partition distribution policy. This optional object has no default value.

disableOrder

Order when disabling partitions in static partition groups.

Value is one of REVERSE_CONFIGURATION or REVERSE_LOCAL_THEN_REMOTE.

REVERSE_CONFIGURATION disables the partitions in reverse order based on partition rank.

REVERSE_LOCAL_THEN_REMOTE disables all partitions that have the local node as the active node, then disables all partitions where a remote node is the active node.

This property is optional and its default value is REVERSE_CONFIGURATION.

enableOrder

Order when enabling partitions in static partition groups.

Value is one of CONFIGURATION or REMOTE_THEN_LOCAL.

CONFIGURATION enables the partitions based on partition rank.

REMOTE_THEN_LOCAL enables all partitions that have a remote node as the active node, then enables the partitions where the active node is the local node.

This property is optional and its default value is CONFIGURATION.
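
The non-default ordering values can be combined as in the sketch below, which is only an illustration of the syntax; whether this ordering suits an application depends on its failover requirements.

```hocon
staticPartitionPolicy = {
  // Enable partitions whose active node is remote first, then the local ones.
  enableOrder = REMOTE_THEN_LOCAL
  // Disable partitions whose active node is local first, then the remote ones.
  disableOrder = REVERSE_LOCAL_THEN_REMOTE
}
```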

loadOnNodesPattern

String. A regular expression describing which nodes know about the partitions in this group. Note this set of nodes could be different from the nodes that actually participate in the partitions. Load-on interest can be expressed in one or both of two ways: via this regular expression or explicitly by each node.

If the local node matches loadOnNodesPattern, then the partition is added to the local node even if it is not the active or backup node. This allows creation of foreign partitions. Note that the configuration is additive; you can define a partition in the node's availability zone membership, the availability zone loadOnNodesPattern, or both.

This property is optional and its value defaults to all nodes that are members of this availability zone.

For example:

loadOnNodesPattern = ".*"
staticPartitions

An optional, associative array of partitions, keyed by partition name.

P1

String. Example partition name. A partition entry, here named P1, contains the following property:

rank

Long. The partition's ranking number for enable and disable order.

When the partition staticPartitionEnableOrder or staticPartitionDisableOrder is set to CONFIGURATION, this rank specifies the order. Partitions with a lower ranking number are enabled before those with a higher ranking number.

If multiple partitions share the same ranking number, their enable order is indeterminate.

The highAvailability, staticPartitionEnableOrder, and staticPartitionDisableOrder properties control whether ranking order is strictly observed, or whether local partitions are always enabled ahead of remote with ranking observed within those classifications.

Disable order always reverses the rankings, with higher numbers disabled ahead of lower numbers.

This property is optional and its default value is 1.

For example:

rank = 1
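
To make the ranking rule concrete, the sketch below ranks three partitions. The names P2 and P3 are hypothetical additions to the P1 example. With rank-based ordering, P1 is enabled first and P3 last, and disable order is the reverse.

```hocon
staticPartitionPolicy = {
  enableOrder = CONFIGURATION
  disableOrder = REVERSE_CONFIGURATION
  staticPartitions = {
    P1 = { rank = 1 }   // enabled first, disabled last
    P2 = { rank = 2 }
    P3 = { rank = 3 }   // enabled last, disabled first
  }
}
```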
dynamiczone

The memberships are an associative array keyed by availability zone. In this case there is a downstream availability zone called dynamiczone. This node is declaring membership in that zone and binding that membership to a set of partitions.

percentageOfVotes

Long. Minimum percentage of votes needed to form a quorum. This object is optional and has no default value. This object is mutually exclusive with minimumNumberOfVotes. If neither is set then quorums are not enabled for this availability zone.

For example:

percentageOfVotes = 51
dataDistributionPolicy

Data distribution policy that applies to this availability zone. This is a reference to a policy defined in the application definition file. If no policy is found, the configuration will fail.

For example:

dataDistributionPolicy = "dynamic"
dynamicPartitionPolicy

Dynamic partition distribution policy. You must specify a dynamic data distribution policy in the dataDistributionPolicy property if this property is set. This object is optional and has no default value.

primaryMemberPattern

String. A regular expression describing the primary node membership for this dynamic partition policy. Membership can be expressed in one, or both, of two ways: via this regular expression or explicitly for each node. The bindings are additive.

Optional. The default value is all nodes in the cluster, unless backupMemberPattern is specified, in which case there is no default value. To disable this binding when backupMemberPattern is not specified, specify a regular expression that does not match any node in the cluster.

For example:

primaryMemberPattern = ".*"
backupMemberPattern

A regular expression describing the backup node membership for this dynamic partition policy. Membership can be expressed in one, or both, of two ways: via this regular expression or explicitly for each node. The bindings are additive. If backup nodes are defined, all primary nodes must be explicitly defined (that is, setting this property disables the primaryMemberPattern default value). Optional. No default.

For example:

backupMemberPattern = ".*"
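
Because setting backupMemberPattern removes the primaryMemberPattern default, both patterns are usually given together. The sketch below assumes a naming scheme in which primary nodes start with A and backup nodes start with B, and that the patterns are matched against the full node name. The doubled backslash is HOCON string escaping for a literal backslash, so the regular expression sees an escaped dot.

```hocon
dynamicPartitionPolicy = {
  // Intended to match A1.nodedeploy, A2.nodedeploy, and so on.
  primaryMemberPattern = "A[0-9]+\\.nodedeploy"
  // Intended to match B1.nodedeploy, B2.nodedeploy, and so on.
  backupMemberPattern = "B[0-9]+\\.nodedeploy"
}
```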
minimumNumberOfVotes

Long. Minimum number of votes needed to form a quorum. This property is optional, has no default value, and is mutually exclusive with percentageOfVotes. If neither is set, quorums are not enabled for this availability zone.

For example:

minimumNumberOfVotes = 5
quorumMemberPattern

String. Quorum membership pattern. This is a Java regular expression. Membership can be expressed in one or both of two ways: via this regular expression or explicitly by each node.

All nodes matching this regular expression are part of the quorum. Such members get a single quorum vote. This property has no default value.

For example:

quorumMemberPattern = ".*"
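
A quorum definition that respects the mutual exclusion between the two vote thresholds might look like the sketch below. It assumes the quorum properties are set at the availability zone level, as in the sample later on this page, and that a policy named dynamic exists in the application definition. In a three-node cluster where each matching node has one vote, at least two nodes must be present to form a quorum.

```hocon
availabilityZones = {
  dynamiczone = {
    dataDistributionPolicy = "dynamic"
    // Every node in the cluster is a quorum member with a single vote.
    quorumMemberPattern = ".*"
    // At least two votes are needed to form a quorum.
    minimumNumberOfVotes = 2
    // percentageOfVotes is mutually exclusive with minimumNumberOfVotes,
    // so it is not set here.
  }
}
```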

Default Availability Zone and Data Distribution Policy

Availability zones, and their associated data distribution policies, are generally created using node deploy configuration. For convenience, a default availability zone and data distribution policy are also defined automatically, and you can use them if the default characteristics are adequate. They are:

  • default-cluster-wide-availability-zone

  • default-dynamic-data-distribution-policy

The default availability zone:

  • includes all nodes in the cluster

  • has elastic node membership as nodes are added to and removed from the cluster

  • disables quorum management

  • uses the default data distribution policy

The default data distribution policy:

  • is a dynamic data distribution policy

  • uses distributed consistent hashing for data partitioning

  • uses synchronous replication

  • has a replication count of two

The default availability zone and data distribution policy are built in, meaning that no configuration file is required to specify them (though you can still override the defaults by specifying your own settings in a configuration file). These defaults are available as follows:

Configuration File Sample

The following is an example of the node file type.

name = "NodeDeployment"
version = "1.0"
type = "com.tibco.ep.dtm.configuration.node"

configuration = {
  NodeDeploy = {
    globalConfiguration = [
    ]
    nodes = {
      "A1.nodedeploy" = {
        description = "my node"
        nodeType = "nodetype1"
        engines = {
          engine1 = {
            fragmentIdentifier = "fragment1"
            configuration = []
          }
        }
        communication = {
          numberSearchPorts = 10
          discoveryPort = 2222
          discoveryRequestAddresses = [ 
            ${HOSTNAME} 
          ]
          administration = {
            address = ${HOSTNAME}
            transportPort = ${A1_ADMINPORT}
            webEnable = true
            webPort = 0 
          }
          distributionListenerInterfaces = [ 
            {
              address = ${HOSTNAME}
              dataTransportPort = ${A1_DATATRANSPORTPORT}
              secure = false
            } 
          ]
          proxyDiscovery = {
            remoteNodes = [
              "A2.nodedeploy"
            ]
          }
        }
        configuration = [
        ]
        availabilityZoneMemberships = {
          staticzone = {
            staticPartitionBindings = {
              P1 = {
                type = ACTIVE
                replication = SYNCHRONOUS
              }
            }
          }
          dynamiczone = {
            votes = 1
            dynamicPartitionBinding = {
              type = PRIMARY
            }
          }
        }
        routerBindings = [
          {
            routerName = "myHashRouter"
            availabilityZone = "staticzone"
          }
        ]
      }
      "A2.nodedeploy" = {
        nodeType = "nodetype1"
        communication = {
          distributionListenerInterfaces = [ {
            address = ${HOSTNAME}
            dataTransportPort = ${A2_DATATRANSPORTPORT}
            secure = false
          } ]
        }
      }
      "B1.nodedeploy" = {
        nodeType = "nodetype1"
        communication = {
          distributionListenerInterfaces = [ {
            address = ${HOSTNAME}
            dataTransportPort = ${B1_DATATRANSPORTPORT}
            secure = false
          } ]
        }
      }
    }
    availabilityZones = {
      staticzone = {
        dataDistributionPolicy = "static"
        staticPartitionPolicy = {
          disableOrder = REVERSE_CONFIGURATION
          enableOrder = CONFIGURATION
          staticPartitions = {
            P1 = {
              rank = 1
            }
          }
          loadOnNodesPattern = ".*"
        }
      }
      dynamiczone = {
        percentageOfVotes = 51
        dataDistributionPolicy = "dynamic"
        dynamicPartitionPolicy = {
          primaryMemberPattern = ".*"
          backupMemberPattern = ".*"
        }
        // minimumNumberOfVotes is mutually exclusive with percentageOfVotes;
        // set only one of the two.
        // minimumNumberOfVotes = 5
        quorumMemberPattern = ".*"
      }
    }
  }
}

Node Availability Zone Configuration Examples

If a node is explicitly bound to a dynamic availability zone using the availabilityZoneMemberships object, that zone is used. However, if a node is not explicitly bound to a zone, an implicit binding can occur if primaryMemberPattern or backupMemberPattern matches the node name.

The following examples describe explicit and implicit zone bindings.

Explicit binding example:

name = "HAConfig"
version = "1.0"
type = "com.tibco.ep.dtm.configuration.node"
configuration = {
  NodeDeploy = {
    nodes = {
      "A1.nodedeploy" = {
        availabilityZoneMemberships = {
          dynamiczone = {                  // use this zone
            dynamicPartitionBinding = {
              type = PRIMARY
            }
          }
        }
      }
    }
    availabilityZones = {
      dynamiczone = {
        dataDistributionPolicy = "dynamic"
      }
    }
  }
}

Implicit binding example:

name = "HAConfig"
version = "1.0"
type = "com.tibco.ep.dtm.configuration.node"
configuration = {
  NodeDeploy = {
    nodes = {
      "A1.nodedeploy" = {
      }
    }
    availabilityZones = {
      dynamiczone = {
        dataDistributionPolicy = "dynamic"
        dynamicPartitionPolicy = {
          primaryMemberPattern = "A[14].nodedeploy"
          // nodes matching this pattern are included in the zone
        }
      }
    }
  }
}

Since the primaryMemberPattern property defaults to .*, the implicit example can be simplified to:

name = "HAConfig"
version = "1.0"
type = "com.tibco.ep.dtm.configuration.node"
configuration = {
  NodeDeploy = {
    nodes = {
      "A1.nodedeploy" = {
      }
    }
    availabilityZones = {
      dynamiczone = {
        dataDistributionPolicy = "dynamic"
        // primaryMemberPattern defaults to .*, which matches A1.nodedeploy
      }
    }
  }
}