Sizing Metrics

This chapter describes the metrics that must be measured or estimated to determine the system requirements for a TIBCO Streaming application.

Memory

Shared Memory

Shared memory usage has these parts:

  • a base system usage by the TIBCO Streaming Runtime

  • usage by Managed objects

  • in-flight transaction logging

  • usage by replica objects

  • usage by cached objects

The base system usage for a TIBCO Streaming node can be determined with epadmin commands like the following, directed at an unloaded node (that is, with no application running, no application usage of memory):

epadmin [--ad=adminport|--servicename=cluster.node] enable statistics --stat=memoryusage
epadmin [--ad=adminport|--servicename=cluster.node] display statistics --stat=memoryusage

In the resulting memoryusage statistics, a TIBCO Streaming node's base system usage is approximately 21% of the default 512 megabytes of shared memory, or roughly 107 megabytes. This base footprint will increase slightly as managed object classes are loaded.

Managed objects persist their state in shared memory until they are explicitly deleted by the application. The shared memory usage of a managed object can be determined programmatically by creating an instance of the managed object, populating it with typical data values, and passing it to the com.kabira.platform.osstats.Type.memoryUsage() method.

Warning

The call to the Type.memoryUsage() method must be done in a separate transaction from the object creation to get accurate results.

This is shown in Example 1, “Object size snippet”:

Example 1. Object size snippet

package com.tibco.ep.dtm.snippets.sizing;

import com.kabira.platform.ManagedObject;
import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Key;
import com.kabira.platform.annotation.Managed;
import com.kabira.platform.osstats.Type;

/**
 *  Display Managed object memory usage.
 */
public class ObjectSize
{
    /**
     * Sample application object
     */
    @Managed
    @Key(name = "ByName", fields =
    {
        "name"
    }, unique = true, ordered = false)
    private static class MyObject
    {
        static final int NUMBER_OF_ELEMENTS = 100;
        String[] stringArray;
        @SuppressWarnings("unused")
        final String name;

        /**
         * When created, populate this instance with some data
         * @param name Name value
         */
        public MyObject(String name)
        {
            this.name = name;

            stringArray = new String[NUMBER_OF_ELEMENTS];

            for (int i = 0; i < NUMBER_OF_ELEMENTS; i++)
            {
                stringArray[i] = Integer.toString(i);
            }
        }
    }

    /**
     * Main entry point
     * @param args  Not used
     */
    public static void main(String[] args)
    {
        //
        //  Create the object
        //
        new Transaction("Create Object")
        {
            @Override
            protected void run() throws Rollback
            {
                new MyObject("Sample");
            }
        }.execute();

        //
        //  Report object size - this must be done in a separate 
        //  transaction.  It only works for committed objects.
        //
        new Transaction("Report Object Size")
        {
            @Override
            protected void run() throws Transaction.Rollback
            {
                for (MyObject myObject : ManagedObject.extent(MyObject.class))
                {
                    System.out.println(new Type().memoryUsage(myObject));
                    ManagedObject.delete(myObject);
                }
            }
        }.execute();
    }
}

This program's output is similar to the following:

Allocation type: # of bytes, allocator bucket size, notes
=====================================================================
metadata: 480, 592, spaces: [allocation=64] [system=24] [lock=112]
key com.tibco.ep.dtm.snippets.sizing.ObjectSize$MyObject::ByName: 128, 208
array com.tibco.ep.dtm.snippets.sizing.ObjectSize$MyObject::stringArray: 1006 (aligned 1008), 1168
optimal allocationSpaceBytes = 1032
event queue: 0, 0
Total: 1614 1968

The output has three columns:

  1. Allocation type - what part of the object this allocation deals with.

    • metadata - Object data. This allocation is broken into two or three parts. The system space includes the storage for all fixed-length object fields, pointers to non-fixed-length fields, and the Runtime overhead. The allocation space is the size of the allocation done to store the data for variable-length object fields.

      For types where the dynamicLockMemory element of the @Managed annotation has been left at (or explicitly set to) its default value of false, there is also a lock portion of the metadata, which shows the size of the space used for the transaction lock.

    • transaction lock memory - For types where the dynamicLockMemory element has been set to true, there is an additional line of report output, which shows the space used by the transaction lock that is dynamically allocated and de-allocated each time the object is locked within a transaction:

      transaction lock memory: 112, 208, dynamic

      This setting reduces the memory used by object instances when they are not locked within a transaction, at the cost of increased CPU path length and reduced scalability when the object is accessed (a minimal annotation sketch appears below).

    • key - One entry for the value of each defined key. This allocation is used to populate the indexes and is a separate allocation from the storage for the object fields covered by the key.

    • array - An array field, including storage for the array elements.

    • string - A string field, including storage for the data.

  2. # of bytes - the number of bytes requested for this part of the allocation.

  3. allocator bucket size - the number of bytes actually allocated. Allocations are fitted to the nearest fixed shared memory allocator bucket size.

The final line of the report shows the total memory requested and the actual memory allocated.
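
For the dynamically allocated transaction lock described above, a minimal annotation sketch, based on the sample class from Example 1, “Object size snippet”, is:

      @Managed(dynamicLockMemory = true)
      @Key(name = "ByName", fields =
      {
         "name"
      }, unique = true, ordered = false)
      private static class MyObject

With this setting the lock space is allocated only while the object is locked within a transaction, and the report includes the transaction lock memory line shown above instead of a lock portion in the metadata line.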

An additional line of the report shows the optimal size for the allocation space. At object creation, by default, the Runtime chooses a small value for the allocation space. User data for strings and arrays that does not fit within this space causes additional memory allocations. If the optimal size differs from the allocation space shown in the metadata line of the report, space and performance savings may be gained by explicitly setting it via the allocationSpaceBytes element of the @Managed annotation. From the example above, change the setting to 1032 bytes:

      @Managed(allocationSpaceBytes = 1032)
      @Key(name = "ByName", fields =
      {
         "name"
      }, unique = true, ordered = false)
      private static class MyObject      

Re-running the modified class results in:

      Allocation type: # of bytes, allocator bucket size, notes
      =====================================================================
      metadata: 1336, 1424, spaces: [allocation=1032] [system=24]
      key com.tibco.ep.dtm.snippets.sizing.ObjectSize$MyObject::ByName: 136, 240
      optimal allocationSpaceBytes = 1032
      event queue: 0, 0
      Total: 1472 1664

Transaction allocation pages are created during in-flight transactions to manage write and read images (snapshots) of the object. The memory used for in-flight transaction pages is equivalent to the size of the object fields. The transaction page is released when the transaction commits or aborts. To estimate the total impact on shared memory size, multiply the memory consumed per transaction page by the number of concurrent transactions. For example, if a system has 1,000 concurrent in-flight transactions, each creating a transaction page for a 50-byte object, the in-flight transaction memory usage is 1000 * 50, or 50,000 bytes.
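
As a back-of-the-envelope aid for this calculation, the following hypothetical helper (illustrative only, not part of the product API) multiplies the per-object field size by the number of concurrent in-flight transactions:

package com.tibco.ep.dtm.snippets.sizing;

/**
 *  Back-of-the-envelope estimate of in-flight transaction page memory.
 *  Hypothetical sizing helper only; not part of the product API.
 */
public class TransactionPageEstimate
{
    /**
     * @param objectFieldBytes        Approximate size of the modified object's fields
     * @param concurrentTransactions  Number of concurrently in-flight transactions
     * @return Estimated shared memory consumed by transaction pages, in bytes
     */
    static long estimateBytes(long objectFieldBytes, long concurrentTransactions)
    {
        //
        //  Each in-flight transaction holds a page roughly the size of the
        //  object fields; the page is released at commit or abort.
        //
        return objectFieldBytes * concurrentTransactions;
    }

    /**
     * Main entry point
     * @param args  Not used
     */
    public static void main(String[] args)
    {
        //
        //  The example from the text: 1,000 concurrent transactions, each
        //  modifying a 50 byte object.
        //
        System.out.println(estimateBytes(50, 1000) + " bytes");
    }
}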

Finally, any replica objects on a node consume shared memory. The amount of shared memory consumed by a replica object is the same as the shared memory consumed by the object on its active node, which can be calculated using the com.kabira.platform.osstats.Type.memoryUsage() method as discussed above.

Caching

Distributed managed objects support caching a subset of the objects in shared memory. Cached objects consume the same amount of shared memory as non-cached managed objects. When cached objects are flushed, their index data is also removed from shared memory.

The total shared memory consumed by cached objects can be explicitly controlled. The allocated cache size includes both object and index data in shared memory. The size of the shared memory cache can be specified as an absolute value or as a percentage of the shared memory available to the node. The cache size is set per node using configuration values.

Heap Memory

JVM heap memory usage in TIBCO Streaming follows normal JVM heap usage, with the following differences:

  • Array fields in Managed objects only consume the size of an object reference (8 bytes).

  • Managed objects have an additional, internal 24 byte field used as a shared memory identifier.

  • POJO fields, for POJOs with the Transactional annotation, temporarily consume heap memory to log their before state when they are transactionally locked and modified. The memory used is equivalent to the size of the fields before they are modified and is released when the transaction commits or aborts. The number of concurrent transactions should also be taken into account (see the sketch following this list).
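
A minimal sketch of the last point, assuming the Transactional annotation lives in the same com.kabira.platform.annotation package as Managed (the class and field names are illustrative):

package com.tibco.ep.dtm.snippets.sizing;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Transactional;

/**
 *  Illustrate heap usage by a transactional POJO's before image.
 */
public class HeapUsage
{
    /**
     * Sample transactional POJO
     */
    @Transactional
    private static class Counter
    {
        long value;
        long[] history = new long[100];
    }

    /**
     * Main entry point
     * @param args  Not used
     */
    public static void main(String[] args)
    {
        final Counter counter = new Counter();

        new Transaction("Modify Counter")
        {
            @Override
            protected void run() throws Rollback
            {
                //
                //  When the fields are transactionally locked and modified,
                //  their before state is temporarily logged on the heap and
                //  released when this transaction commits or aborts.
                //  Concurrent transactions multiply this usage.
                //
                counter.value += 1;
                counter.history[0] = counter.value;
            }
        }.execute();
    }
}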

Process Memory

A TIBCO Streaming node consists of a small number of processes communicating through shared memory to provide the TIBCO Streaming Runtime services. The total size of the code that may be executed by these processes is approximately 100 megabytes. This is a system wide (per-server) cost, and not a per-process cost, because the code is contained in shared object files.

Swap Space

A typical UNIX installation requires adding at least as much swap space as there is physical memory. However, it is highly recommended that a machine running a TIBCO Streaming system have enough physical memory so that it never needs to swap. Swapping runs at disk access speeds, while TIBCO Streaming is designed to run from memory at memory access speeds.

Processing Units

  • Clock speed

    Differences in processor speed have a direct linear effect upon the performance of application code. Faster processors will result in faster application execution.

  • Multi-processor and multi-core

    Both the TIBCO Streaming Runtime and the JVM are designed to take advantage of multi-threading capabilities in the underlying operating system. A TIBCO Streaming application designed for parallelism will take advantage of multiple processing units, increasing overall throughput.

Disk

  • Size

    The TIBCO Streaming product installation will make use of approximately 1 gigabyte of disk space.

    Deployed, each TIBCO Streaming node's disk space is determined mostly by the size of the shared memory. By default this size is 512 megabytes and the shared memory is an ordinary file system file, which is memory-mapped by the TIBCO Streaming Runtime. There is also an option to use System V Shared Memory instead of a file. In this case, the shared memory does not use any disk space.

    Warning

    Deploying a shared memory file on a networked file system (e.g., NFS), or in a virtual hardware environment, is not supported for production deployments. The disk I/O subsystem performance is not sufficient to support the required disk throughput in these environments. Use System V Shared Memory instead.

    After system startup, TIBCO Streaming, by default, will generate very little disk I/O, most of it involved in logging the invocation of administrative commands.

    Application-specific logging or generation of disk data also needs to be taken into account when choosing disk size.

  • Number

    By default, a single disk is capable of supporting a TIBCO Streaming node.

    Disk I/O speeds need to be considered when either change logging or application-specific disk I/O will occur. If a single disk does not have sufficient space or performance characteristics, the I/O may be spread across multiple disks, either through configuration of the file locations or by using a volume manager to present multiple disks as a single logical disk to TIBCO Streaming.

  • Partitioning

    TIBCO Streaming does not have specific partitioning requirements.

    Note

    Using multiple partitions does not improve the performance characteristics of a single disk.

Network

Network speed affects both the throughput and latency when using highly available or distributed objects.

When a highly available object is modified, all of the object's data is sent to the remote node(s) when the transaction commits. The TIBCO Streaming Runtime attempts to minimize the number of separate packets by aggregating the data for multiple highly available objects that have been modified in a single transaction.

Distributed objects, depending upon configuration and where they are being accessed, may generate synchronous network I/O for each access.

For highly available and distributed objects, additional network I/O is done between all involved nodes as part of transaction commit and abort processing. The size of this I/O is typically small, but it is a separate packet at commit/abort time.

A distributed application can saturate a Fast Ethernet (100 Mbits/second) network. It is recommended that Gigabit Ethernet be used.

Highly available and distributed objects, and the underlying TIBCO Streaming support, are designed to be used on a local area network (LAN) with low latency and high throughput. They are also optimized to work over a wide area network (WAN) to support geographic redundancy.

High Availability

  • Number of machines

    Highly available objects exist in partitions that are shared between multiple nodes. It is recommended that these nodes be located on different machines. Multiple partitions may be hosted on a node, and nodes may act in both the active and replica roles for each other in an active/active configuration.

    The number of partitions and the number of machines are chosen for both administrative and performance reasons.

  • Network interfaces

    By default, TIBCO Streaming uses a single network interface per node, but it may be configured to use multiple network interfaces.