Cluster Monitor Application

About This Application

The Cluster Monitor application is a ready-to-run, configurable StreamBase application that monitors the performance of clusters and nodes in a StreamBase Runtime fabric. The application consists of a StreamBase monitoring EventFlow fragment and a LiveView server fragment. The application:

  • Dynamically discovers cluster elements (nodes, JVMs).

  • Dynamically discovers changes to the cluster population.

  • Provides StreamBase performance data for EventFlow and LiveView JVMs.

  • Provides StreamBase Runtime performance data for nodes.

  • Publishes results into a set of LiveView tables.

  • Provides a set of LiveView Web cards that present a customizable view of the data.

Using the default configuration, the LiveView server automatically enables LiveView Web as the client to display cluster monitoring data.

Installing and Running the Cluster Monitor Application

Note

The Cluster Monitor application must be run in a separate cluster from the one containing the applications being monitored.

  1. Run the epadmin install node command, specifying the name of the node where you are installing the Cluster Monitor application. For example, monitor.monitorCluster.

    The epadmin install node command examples below wrap to multiple lines for clarity but must each be entered as a single command. Adjust the path to match your platform's installation location.

    Windows

    epadmin install node --nodename=monitor.monitorCluster 
      --substitutions="NODE_NAME=monitor.monitorCluster" 
      --application=C:/TIBCO/sb-cep/n.m/distrib/tibco/sb/applications/cluster-monitor.zip

    Linux

    epadmin install node --nodename=monitor.monitorCluster 
      --substitutions="NODE_NAME=monitor.monitorCluster" 
      --application=/opt/tibco/sb-cep/n.m/distrib/tibco/sb/applications/cluster-monitor.zip

    macOS

    epadmin install node --nodename=monitor.monitorCluster 
      --substitutions="NODE_NAME=monitor.monitorCluster" 
      --application=$HOME/Applications/TIBCO\ Streaming\ n.m.x/distrib/tibco/sb/applications/cluster-monitor.zip
  2. Verify your Cluster Monitor node is installed. Look for output similar to the following (Windows example shown):

    C:\Users\sbuser\Documents\StreamBaseWorkspace\example1>epadmin install node 
    --nodename=monitor.monitorCluster --substitutions="NODE_NAME=monitor.monitorCluster" 
    --application=C:/TIBCO/sb-cep/10.4/distrib/tibco/sb/applications/cluster-monitor.zip
    [monitor.monitorCluster]        Installing node
    [monitor.monitorCluster]                PRODUCTION executables
    [monitor.monitorCluster]                File shared memory
    [monitor.monitorCluster]                7 concurrent allocation segments
    [monitor.monitorCluster]                Host name sbuser
    [monitor.monitorCluster]                Container tibco/sb
    [monitor.monitorCluster]                Starting container services
    [monitor.monitorCluster]                Loading node configuration
    [monitor.monitorCluster]                Auditing node security
    [monitor.monitorCluster]                Deploying application
    [monitor.monitorCluster]                        Engine cluster-monitor
    [monitor.monitorCluster]                        Engine liveview-server
    [monitor.monitorCluster]                Application deployed
    [monitor.monitorCluster]                Administration port is 23528
    [monitor.monitorCluster]                Discovery Service running on port 54321
    [monitor.monitorCluster]                Service name is monitor.monitorCluster
    [monitor.monitorCluster]        Node installed
  3. Once the Cluster Monitor node is installed, start it by running the following command:

    epadmin --servicename=monitor.monitorCluster start node
  4. Verify that the Cluster Monitor node started, either from output similar to the following (Windows example shown) or from Studio's Cluster view:

    C:\Users\sbuser\Documents\StreamBaseStudioWorkspace\example1>epadmin 
    --servicename=monitor.monitorCluster start node
    [monitor.monitorCluster]        Starting node
    [monitor.monitorCluster]                Engine application::liveview-server started
    [monitor.monitorCluster]                Engine application::cluster-monitor started
    [monitor.monitorCluster]                Loading node configuration
    [monitor.monitorCluster]                Auditing node security
    [monitor.monitorCluster]                Host name sbuser
    [monitor.monitorCluster]                Administration port is 23528
    [monitor.monitorCluster]                Discovery Service running on port 54321
    [monitor.monitorCluster]                Service name is monitor.monitorCluster
    [monitor.monitorCluster]        Node started
  5. Verify the LiveView server is now active. Depending on your system performance, this may take several minutes. Look for the following message in the monitor.monitorCluster/logs/liveview-server.log file to confirm the LiveView server is active:

    *** All tables have been loaded. LiveView is ready to accept client connections.

    Optionally, use the epadmin tail logging command to watch the log file:

    epadmin --servicename=monitor.monitorCluster tail logging --enginename=liveview-server

    See Tail Logging for details about the epadmin logging target.

Installation Notes

The value of the install node --nodename=N option must match the value of the --substitutions="NODE_NAME=N" option.

You can use the install node --discoveryport=N option to set the discovery port to the appropriate value for the cluster to be monitored.
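For example, the following command installs the monitor node with matching --nodename and NODE_NAME values and an explicit discovery port. This is a sketch: the port value 54321 is only an illustration and must match the discovery port used by the cluster to be monitored (Linux path shown; wrapped for clarity, enter as one command).

epadmin install node --nodename=monitor.monitorCluster 
  --substitutions="NODE_NAME=monitor.monitorCluster" 
  --discoveryport=54321 
  --application=/opt/tibco/sb-cep/n.m/distrib/tibco/sb/applications/cluster-monitor.zip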

Viewing Cluster Statistics in LiveView Web

Once the Cluster Monitor node is started and the Cluster Monitor application is running, open a browser to view cluster data in LiveView Web for your running application nodes.

Enter: http://localhost:11080/lvweb. The Cluster Monitor application uses 11080 as its default LiveView Web port, to avoid conflicts with the standard LiveView Web port, 10080.

LiveView Web displays cluster statistics using the following set of LiveView Web cards:

Services

Discovered Services.

Percentage of Shared Memory in Use

Amount of shared memory in use, per node.

Host Machine CPU Utilization

CPU utilization, per node host machine. Note that multiple nodes running on a single host will report the same information.

Node Transaction Rate and Average Latency

The transaction rate and average latency, per node.

EventFlow/LiveView Amount of Heap Memory In Use

The amount of Java heap memory in use, per EventFlow/LiveView engine.

EventFlow/LiveView Total Queue Depth

The total queued tuple depth per EventFlow/LiveView JVM.

Configuration Options

The following monitoring behavior parameters are configurable:

  • Service discovery

  • Credentials for statistics collection

  • Administration commands

  • Table naming

  • Table aging

  • LiveView listener port

You can change the Cluster Monitor application's configuration:

  • At node/application installation time, by replacing the default node deploy configuration with your own file, using the nodedeploy parameter of the epadmin install node command. The Cluster Monitor application's default node.conf file is included in the cluster-monitor.zip file; see Default node.conf File for Cluster Monitoring Application below to learn more about its properties.

  • While the Cluster Monitor application is running, by loading and activating a Service Discovery adapter configuration, a Cluster Monitor configuration, or both; an example follows.
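
    A modified Cluster Monitor configuration saved to a file could be loaded and activated with epadmin along the lines of the following sketch; the file name my-clustermonitor.conf is hypothetical, and the type, name, and version passed to activate configuration must match the values inside the configuration file (wrapped for clarity, enter each as one command):

    epadmin --servicename=monitor.monitorCluster load configuration --source=my-clustermonitor.conf
    epadmin --servicename=monitor.monitorCluster activate configuration 
      --type=com.tibco.ep.streambase.configuration.clustermonitor --name=clustermonitor --version=1.0.0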

After activating a new configuration, restart the Cluster Monitor application with the following command:

epadmin --servicename=MONITOR_NODE_NAME restart container --engine=cluster-monitor

Note

Replace MONITOR_NODE_NAME with the name of the node where the Cluster Monitor application is installed.
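
For example, for the monitor.monitorCluster node installed in the steps above:

epadmin --servicename=monitor.monitorCluster restart container --engine=cluster-monitor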

Tables

The Cluster Monitor configures the following statistics tables by default:

Services Table: The monitor always contains a Services table showing services that have been discovered and their current state.

Field Name Type Notes
serviceState string  
serviceName string Primary key
serviceAddress string Address of the discovered service
serviceChangeTime timestamp  

StreamBaseInfo Table: EventFlow/LiveView server Java heap memory use, and aggregate total number of queued tuples.

Field Name Type Notes
service string  
id long Primary key. Per row unique identifier.
time timestamp  
usedMemory long Bytes
totalQueueDepth long  

NodeInfo Table: Per-node shared memory usage, transaction rate, and deadlock counter, plus CPU usage for the machine where the node is running.

Field Name Type Notes
service string  
id long Primary key. Per row unique identifier.
time timestamp  
c_Version long History command output version
c_Shared_Memory_Total long Bytes
c_Shared_Memory_Used long Percentage
c_CPU_Percent_User long Percentage
c_CPU_Percent_System long Percentage
c_CPU_Percent_Idle long Percentage
c_Transactions_during_the_last_sample long Number of transactions on the node during the last sample
c_Average_Transaction_Latency_usecs long Average transaction latency, in microseconds
c_Transaction_Deadlocks long  

epadmin Command Tables

Any epadmin command and target combination that generates output can be used. For example, the NodeInfo table is equivalent to the output of the following epadmin command:

epadmin --servicename=monitor.monitorCluster display history --seconds=1

By default, the generated table name for epadmin commands is t_COMMAND_TARGET. For the command shown above, this becomes t_display_history. This may be changed via configuration.
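
For example, a hypothetical additional entry in the ClusterMonitor configuration's commands list (a sketch that follows the format of the default configuration shown later on this page; the command, interval, and table name are illustrative only) publishes epadmin display node output under a custom table name:

commands =
[
  {
    // Administration command and target to run against each
    // discovered node service.
    commandName = "display"
    targetName = "node"

    // Override the generated table name (t_display_node by default).
    tableName = "NodeDetails"

    // Collect every 5 seconds instead of the default 1.
    collectionIntervalSeconds = 5
  }
]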

The first three columns are common to both epadmin command tables and the StreamBaseInfo table.

The remaining columns consist of the epadmin command output. The output column names are discovered and converted to meet LiveView column name requirements:

  • A leading c_ prefix is inserted.

  • Non-alphanumeric characters are converted to underscores.

  • Multiple underscore sequences are converted to single underscores.

For example, the column name Shared Memory Size (bytes) is converted to c_Shared_Memory_Size_bytes_.

Table Aging

Two configurable parameters control the automatic removal of rows from the monitor's tables.

  • rowAgeLimitSeconds: Rows older than this limit will be periodically removed.

  • rowCheckIntervalSeconds: How often tables are checked for row removal.
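
For example, to keep rows for five minutes and check for expired rows once per minute, a ClusterMonitor configuration could include settings like the following sketch (the values shown are illustrative only):

ClusterMonitor =
{
  associatedWithEngines = [ "cluster-monitor" ]

  // Remove rows older than 300 seconds (default 60).
  rowAgeLimitSeconds = 300

  // Check tables for expired rows every 60 seconds (default 30).
  rowCheckIntervalSeconds = 60
}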

Statistics Collection Authentication

The Cluster Monitor application attempts to connect to each discovered service, authenticating with the configured credentials. The epadmin commands use the administrationAuthentication section of the ClusterMonitor configuration file, as shown in StreamBase Cluster Monitor Configuration. By default, no credentials are configured; in that case, the application can only monitor services running on the local node that were started by the same user who installed the Cluster Monitor.

The configuration supports a single set of credentials for epadmin commands, and a single set of credentials for EventFlow and LiveView services. For simplicity, TIBCO recommends configuring a common login credential throughout the target cluster.

To configure credentials for EventFlow and LiveView services, use the eventFlowAuthentication section of the ClusterMonitor configuration.
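
For example, credentials could be supplied along the lines of the following sketch, which is based on the commented-out sections of the default node.conf shown below (the user names and passwords are placeholders):

ClusterMonitor =
{
  associatedWithEngines = [ "cluster-monitor" ]

  // Credentials for running epadmin administration commands
  // against discovered nodes.
  administrationAuthentication =
  {
    userName = "administrator"
    password = "This is a plain text password"
  }

  // Credentials for collecting statistics from EventFlow and LiveView
  // StreamBase listeners. A password beginning with '#!' is enciphered
  // with the sbcipher tool.
  eventFlowAuthentication =
  {
    userName = "sbAdministrator"
    password = "#!xyzzy"
  }
}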

Default node.conf File for Cluster Monitoring Application

//
// Default configuration for the StreamBase Cluster Monitor.
//
// To change this configuration, make a copy of this entire file
// and use the nodedeploy option to epadmin install node.
//
// HOCON substitution variables used in this configuration:
//
//      NODE_NAME       Required. The name of the node that the monitor application is
//                      installed to (the nodename parameter to epadmin install node).
//
//      LIVEVIEW_PORT   Defaults to 11080
//
// These variables may be overridden via the substitutions or substitutionfile
// parameters to epadmin install node.
//
name = "ClusterMonitor"
version = "1.0"
type = "com.tibco.ep.dtm.configuration.node"
configuration =
{
  NodeDeploy =
  {
    nodes  =
    {
      // A default node name such as "cluster" shown here is required for this
      // configuration to validate in the Configuration Editor in Studio.
      "${NODE_NAME:-cluster}" =
      {
        description = "Node for running the ClusterMonitor"

        engines =
        {
          "cluster-monitor" =
          {
            fragmentIdentifier = "com.tibco.ep.cluster.eventflow"

            configuration =
            [
              """
              name = "cluster-monitor-service-discovery"
              version = "1.0.0"
              type = "com.tibco.ep.streambase.configuration.servicediscoveryadapter"
              configuration =
              {
                ServiceDiscovery =
                {
                  associatedWithEngines = [ "cluster-monitor" ]

                  // The port to use for discovery requests.
                  // Optional. Defaults to the port being used by the node where
                  // this configuration is loaded.
                  // discoveryPort = 54321

                  // A list of host names that specify which network interfaces
                  // to use when sending discovery requests.
                  // Optional. Defaults to the system's host name.
                    discoveryHosts = [ ]

                  // Service names that will be discovered by the cluster monitor.
                  // Optional. Defaults to all service names found.
                  // Uncomment to restrict.
                  // serviceNames = [ "cluster1", "cluster2", "nodeA.cluster3" ]

                  // Service types that will be discovered by cluster monitor.
                  // Optional. Empty defaults to all service types, but the cluster
                  // monitor only knows how to monitor node, eventflow and
                  // liveview services.
                  serviceTypes = [ "node", "eventflow", "liveview" ]

                  // The number of seconds between discovery requests.
                  // Optional. Defaults to 1.
                  // discoveryBrowseIntervalSeconds = 1

                  // This causes the cluster-monitor event flow application
                  // to wait for the LiveView server to become ready before
                  // starting.  Do not change.
                  autostart = false

                  // Whether or not services running within the monitor's node
                  // are discovered and monitored. Defaults to false.
                  includeLocalServices = ${INCLUDE_LOCAL_SERVICES:-false}
                } // end of ServiceDiscovery object
              }
              """,

              """
              name = "clustermonitor"
              version = "1.0.0"
              type = "com.tibco.ep.streambase.configuration.clustermonitor"
              configuration =
              {
                ClusterMonitor =
                {
                  associatedWithEngines = [ "cluster-monitor" ]

                  // A list of administration commands to run against discovered
                  // node services, every collection interval, and then publish
                  // to LiveView server tables.
                  commands =
                  [
                    {
                      // Administration command name. Required.
                      commandName = "display"

                      // Administration target name. Required.
                      targetName = "history"

                      // A list of command parameters. Optional.
                      parameters =
                      {
                        "seconds" = "1"
                      }

                      // The name of the LiveView table for the command results.
                      // This name must be a legal LiveView table name, starting
                      // with a letter, followed by a combination of letters, digits,
                      // and underscores.  Whitespace is not allowed.
                      // Optional. Defaults to "t_commandName_targetName"
                      tableName = "NodeInfo"

                      // Number of seconds between command invocations.
                      // Optional. Defaults to 1.
                      collectionIntervalSeconds = 1
                    },

                    // Additional commands can be added here.
                  ]

                  // Age limit for data rows in the monitor tables.
                  // Optional.  Defaults to 60.
                  // rowAgeLimitSeconds = 60

                  // The interval at which tables are checked for removing old rows.
                  // Optional.  Defaults to 30.
                  // rowCheckIntervalSeconds = 30

                  // Optional credentials for running the administration commands.
                  // administrationAuthentication =
                  // {
                      //    // The user name to use when connecting to the node.
                      //    //
                      //    userName = "administrator"
                      //    //
                      //    // The password to use when connecting to the node.
                      //    //
                      //    password = "This is a plain text password"
                  // }

                  // Optional credentials for collecting StreamBase statistics.
                      //
                      // eventFlowAuthentication =
                      // {
                      //    //
                      //    // The user name to use when connecting to the
                      //    // EventFlow or LiveView StreamBase listener.
                      //    //
                      //    userName = "sbAministrator"
                      //    //
                      //    // The password to use when connecting to the
                      //    // EventFlow or LiveView StreamBase listener.
                      //    // A value beginning with '#!' is enciphered
                      //    // with the sbcipher tool.
                      //    //
                      //    password = "#!xyzzy"
                      // }
                        
                   } // end of ClusterMonitor
                 } // end of configuration for c.t.e.streambase.configuration.clustermonitor
               """,

               """
               name = "cluster-monitor-operator-parameters"
               version = "1.0.0"
               type = "com.tibco.ep.streambase.configuration.sbengine"
               configuration =
               {
                 StreamBaseEngine =
                 {
                   streamBase =
                   {
                     operatorParameters =
                     {
                        LIVEVIEW_PORT = ""${LIVEVIEW_PORT:-11080}
                     }
                   }
                 }
               }
               """
             ]
           } // end of cluster-monitor engine configuration set

           "liveview-server" =
           {
             fragmentIdentifier = "com.tibco.ep.cluster.liveview-server"

             configuration =
             [
               """
               name = "liveview-server-listener"
               type = "com.tibco.ep.ldm.configuration.ldmclientapilistener"
               version = "1.0.0"
               configuration =
               {
                 ClientAPIListener =
                   {
                     associatedWithEngines = [ "liveview-server" ]
                     portNumber = ${LIVEVIEW_PORT:-11080}
                   }
                 }
               """,

               """
               name = "liveview-sb-listener"
               type = "com.tibco.ep.streambase.configuration.sbclientapilistener"
               version = "1.0.0"
               configuration =
               {
                 ClientAPIListener =
                 {
                   associatedWithEngines = [ "liveview-server" ]
                   apiListenerAddress =
                   {
                     portNumber = 0
                   }
                 }
               } // end of c.t.e.streambase.configuration.sbclientapilistener
               """,

               """
               name = "liveview-server-engine"
               version = "1.0.0"
               type = "com.tibco.ep.ldm.configuration.ldmengine"
               configuration = 
               {
                 LDMEngine = {
                    // Recommended JVM 1.8 flags for LiveView
                   jvmArgs =
                   [
                     "-Xmx4g"
                     "-Xms1g"
                     "-XX:+UseG1GC"
                     "-XX:MaxGCPauseMillis=500"
                     "-XX:ConcGCThreads=1"
                   ]
                   ldm = { }
                 }
               } // end of com.tibco.ep.ldm.configuration.ldmengine
               """
             ] 
          } // end of liveview-server engine configuration set
        } // end of engines
      } // end of NODE_NAME settings
    } // end of nodes
  } // end of NodeDeploy
} // end of top-level configuration