This chapter describes the metrics that must be measured or estimated to determine the system requirements for a Spotfire Streaming application.
Shared memory usage has these parts:
-
a base system usage by the Spotfire Streaming Runtime
-
usage by Managed objects
-
in-flight transaction logging
-
usage by replica objects
-
usage by cached objects
The base system usage for a Spotfire Streaming node can be determined with epadmin commands like the following, directed at an unloaded node (that is, with no application running, no application usage of memory):
epadmin [--ad=adminport|--servicename=cluster.node] enable statistics --stat=memoryusage
epadmin [--ad=adminport|--servicename=cluster.node] display statistics --stat=memoryusage
In this example, the base system usage of a Spotfire Streaming node is 21% of 512 megabytes, or roughly 100 megabytes. This base footprint increases slightly as managed object classes are loaded.
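The conversion from a reported utilization percentage to a size can be sketched as follows. This is only an illustration of the arithmetic: the 21% and 512 megabyte figures come from the text, while the class and method names are hypothetical and are not part of any Spotfire Streaming API. Worked exactly, 21% of 512 megabytes is a little over 107 megabytes, which the text rounds to 100.

```java
/** Sketch: turn a reported shared memory utilization percentage into a size. */
class BaseUsageEstimate {

    /**
     * Kilobytes in use, given the shared memory size in megabytes and the
     * utilization percentage reported for an unloaded node.
     */
    static long usedKilobytes(long sharedMemoryMegabytes, long percentUsed) {
        return sharedMemoryMegabytes * 1024 * percentUsed / 100;
    }

    public static void main(String[] args) {
        // 512 MB and 21% are the figures quoted in the text.
        long usedKb = usedKilobytes(512, 21);
        System.out.printf("Base usage: %d KB (about %d MB)%n", usedKb, usedKb / 1024);
    }
}
```

Subtracting this base figure from the configured shared memory size gives the space left for application objects, transaction pages, replicas, and cached objects.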
Managed objects persist their state in shared memory until they are explicitly deleted by the application. The shared memory usage of a managed object can be determined programmatically by creating an instance of the managed object, populating it with typical data values, and passing it to the com.kabira.platform.osstats.Type.memoryUsage() method.
Warning
The call to the Type.memoryUsage() method must be made in a separate transaction from the object creation to get accurate results.
This is shown in Example 1, “Object size snippet”:
Example 1. Object size snippet
package com.tibco.ep.dtm.snippets.sizing;

import com.kabira.platform.ManagedObject;
import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Key;
import com.kabira.platform.annotation.Managed;
import com.kabira.platform.osstats.Type;

/**
 * Display Managed object memory usage.
 */
public class ObjectSize
{
    /**
     * Sample application object
     */
    @Managed
    @Key(name = "ByName", fields = { "name" }, unique = true, ordered = false)
    private static class MyObject
    {
        static final int NUMBER_OF_ELEMENTS = 100;
        String[] stringArray;
        @SuppressWarnings("unused")
        final String name;

        /**
         * When created, populate this instance with some data
         * @param name Name value
         */
        public MyObject(String name)
        {
            this.name = name;
            stringArray = new String[NUMBER_OF_ELEMENTS];
            for (int i = 0; i < NUMBER_OF_ELEMENTS; i++)
            {
                stringArray[i] = Integer.toString(i);
            }
        }
    }

    /**
     * Main entry point
     * @param args Not used
     */
    public static void main(String[] args)
    {
        //
        // Create the object
        //
        new Transaction("Create Object")
        {
            @Override
            protected void run() throws Rollback
            {
                new MyObject("Sample");
            }
        }.execute();

        //
        // Report object size - this must be done in a separate
        // transaction. It only works for committed objects.
        //
        new Transaction("Report Object Size")
        {
            @Override
            protected void run() throws Transaction.Rollback
            {
                for (MyObject myObject : ManagedObject.extent(MyObject.class))
                {
                    System.out.println(new Type().memoryUsage(myObject));
                    ManagedObject.delete(myObject);
                }
            }
        }.execute();
    }
}
This program's output is similar to the following:
Allocation type: # of bytes, allocator bucket size, notes
=====================================================================
metadata: 480, 592, spaces: [allocation=64] [system=24] [lock=112]
key com.kabira.snippets.sizing.ObjectSize$MyObject::ByName: 128, 208
array com.kabira.snippets.sizing.ObjectSize$MyObject::stringArray: 1006 (aligned 1008), 1168
optimal allocationSpaceBytes = 1032
event queue: 0, 0
Total: 1614 1968
The output has three columns:
-
Allocation type - what part of the object this allocation deals with. The allocation types are:
  -
  metadata - Object data. This allocation is broken into two or three parts. The system space includes the storage for all fixed-length object fields, pointers to non-fixed-length fields, and the Runtime overhead. The allocation space is the size of the allocation done to store the data for variable-length object fields. For types where the dynamicLockMemory element of the @Managed annotation has been left at (or set to) its default setting of false, there is also a lock portion of the metadata, which shows the size of the space used for the transaction lock.
  -
  transaction lock memory - For types where dynamicLockMemory has been set to true, there is an additional line of report output, which shows the space used by the transaction lock that is dynamically allocated and de-allocated each time the object is locked within a transaction:
  transaction lock memory: 112, 208, dynamic
  This setting reduces the memory used by object instances when they are not locked within a transaction, at the cost of increased CPU path length and reduced scalability when the object is accessed.
  -
  key - One entry for the value of each defined key. This allocation is used to populate the indexes and is a separate allocation from the storage for the object fields covered by the key.
  -
  array - An array field, including storage for the array elements.
  -
  string - A string field, including storage for the data.
-
# of bytes - the number of bytes requested for this part of the allocation.
-
allocator bucket size - the number of bytes actually allocated. Allocations are fitted to the nearest fixed shared memory allocator bucket size.
The final line of the report shows the total memory requested and the actual memory allocated.
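To turn the per-object totals into a sizing figure, the allocated total can be multiplied by the expected number of live objects. The following is a minimal sketch of that arithmetic: the 1614 and 1968 byte figures are taken from the report above, while the object count and the class and method names are assumptions made for illustration only.

```java
/** Sketch: scale the per-object "Total" line of the report to a population of objects. */
class ManagedObjectFootprint {

    /** Shared memory for objectCount objects that each occupy bytesPerObject. */
    static long totalBytes(long objectCount, long bytesPerObject) {
        return Math.multiplyExact(objectCount, bytesPerObject);
    }

    public static void main(String[] args) {
        long requestedPerObject = 1614;  // first "Total" value in the report above
        long allocatedPerObject = 1968;  // second "Total" value: what is actually allocated
        long objectCount = 1_000_000;    // assumption: expected number of live objects

        long allocated = totalBytes(objectCount, allocatedPerObject);
        long roundingOverhead =
            totalBytes(objectCount, allocatedPerObject - requestedPerObject);

        System.out.println("Shared memory for objects: " + allocated + " bytes");
        System.out.println("Of which bucket rounding:  " + roundingOverhead + " bytes");
    }
}
```

Sizing from the allocated column rather than the requested column is the safer choice, because the bucket rounding is memory the node really consumes.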
An additional line of the report shows the optimal size for the allocation space. At object creation, by default, the Runtime chooses a small value for the allocation space. User data for strings and arrays that does not fit within this space causes additional memory allocations. If the optimal size differs from the allocation space shown in the metadata line of the report, space and performance savings may be gained by explicitly setting it via the allocationSpaceBytes element of the @Managed annotation. For the example above, change the setting to 1032 bytes:
@Managed(allocationSpaceBytes = 1032)
@Key(name = "ByName", fields = { "name" }, unique = true, ordered = false)
private static class MyObject
Re-running the modified class results in:
Allocation type: # of bytes, allocator bucket size, notes
=====================================================================
metadata: 1336, 1424, spaces: [allocation=1032] [system=24]
key com.kabira.snippets.sizing.ObjectSize$MyObject::ByName: 136, 240
optimal allocationSpaceBytes = 1032
event queue: 0, 0
Total: 1472 1664
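The effect of the change can be quantified by comparing the allocated totals of the two reports. This sketch uses the 1968 and 1664 byte totals from the two outputs above; the class and method names are hypothetical.

```java
/** Sketch: compare allocated bytes per object before and after tuning allocationSpaceBytes. */
class AllocationSpaceSavings {

    /** Bytes saved per object instance. */
    static long savedBytesPerObject(long beforeBytes, long afterBytes) {
        return beforeBytes - afterBytes;
    }

    /** Saving expressed as a percentage of the original allocation. */
    static double savedPercent(long beforeBytes, long afterBytes) {
        return 100.0 * (beforeBytes - afterBytes) / beforeBytes;
    }

    public static void main(String[] args) {
        long before = 1968;  // allocated total with the default allocation space
        long after = 1664;   // allocated total with allocationSpaceBytes = 1032

        System.out.printf("Saved %d bytes per object (%.1f%%)%n",
            savedBytesPerObject(before, after), savedPercent(before, after));
    }
}
```

In this example the array data fits inside the enlarged allocation space, so the separate array allocation disappears from the report and each object needs fewer allocations.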
Transaction allocation pages are created during in-flight transactions to manage write and read images (snapshots) of the object. The memory used for in-flight transaction pages is equivalent to the size of the object fields. The transaction page is released when the transaction commits or aborts. The memory consumed by one transaction's pages should be multiplied by the number of concurrent transactions to get the total impact on shared memory size. For example, if a system is running at 1000 transactions per second and, as a worst case, all of them are in flight at the same time, and each transaction creates a transaction page for a 50-byte object, the in-flight transaction log size is 1000 * 50, or 50,000 bytes.
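The calculation above can be sketched in code. Note one assumption that is not stated in the text: the number of concurrent transactions is derived here from the transaction rate and an average transaction duration (rate multiplied by duration), which is a common rule of thumb rather than anything specific to Spotfire Streaming. The class and method names are hypothetical.

```java
/** Sketch: estimate shared memory held by in-flight transaction pages. */
class InFlightTransactionLog {

    /**
     * Estimated bytes of transaction pages held at any moment.
     *
     * @param transactionsPerSecond sustained transaction rate
     * @param averageDurationMillis assumed average time a transaction stays open
     * @param pageBytesPerTransaction size of the object fields logged per transaction
     */
    static long inFlightBytes(long transactionsPerSecond,
                              long averageDurationMillis,
                              long pageBytesPerTransaction) {
        // Concurrent transactions = rate * duration, rounded up to a whole transaction.
        long concurrent = (transactionsPerSecond * averageDurationMillis + 999) / 1000;
        return concurrent * pageBytesPerTransaction;
    }

    public static void main(String[] args) {
        // Worst case from the text: 1000 tx/s, each open for a full second, 50-byte pages.
        System.out.println(inFlightBytes(1000, 1000, 50) + " bytes (worst case)");
        // Same rate with an assumed 10 ms transaction duration.
        System.out.println(inFlightBytes(1000, 10, 50) + " bytes (10 ms transactions)");
    }
}
```

Short transactions keep the in-flight footprint far below the worst case, which is why the concurrency figure matters more than the raw transaction rate.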
Replica objects on a node also consume shared memory. The amount of shared memory consumed by a replica object is the same as the shared memory consumed by the object on its active node, which can be calculated using the com.kabira.platform.osstats.Type.memoryUsage() method as discussed above.
Distributed managed objects support caching a subset of the objects in shared memory. Cached objects consume the same amount of shared memory as non-cached managed objects. When cached objects are flushed, their index data is also removed from shared memory.
The total shared memory consumed by cached objects can be explicitly controlled. The allocated cache size includes both object and index data in shared memory. The size of the shared memory cache can be specified as an absolute value or as a percentage of the shared memory available to the node. The cache size is set per node using configuration values.
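As a rough planning aid, a cache size given as a percentage can be converted to bytes and then to an approximate object count. This is a sketch under stated assumptions: the 20% cache setting is invented for illustration, the 1664-byte per-object figure is the allocated total (object plus index data) from the tuned report earlier in this chapter, and the class and method names are hypothetical. The actual configuration properties are not shown here.

```java
/** Sketch: how many cached objects fit in a cache sized as a share of shared memory. */
class CacheCapacity {

    /** Cache size in bytes when configured as a percentage of shared memory. */
    static long cacheBytesFromPercent(long sharedMemoryBytes, long percent) {
        return Math.multiplyExact(sharedMemoryBytes, percent) / 100;
    }

    /** Whole objects that fit, counting object and index data per object. */
    static long objectsThatFit(long cacheBytes, long bytesPerObject) {
        return cacheBytes / bytesPerObject;
    }

    public static void main(String[] args) {
        long sharedMemoryBytes = 512L * 1024 * 1024;  // default node size from the text
        long cachePercent = 20;                       // assumption for illustration
        long bytesPerObject = 1664;                   // allocated total from the tuned report

        long cacheBytes = cacheBytesFromPercent(sharedMemoryBytes, cachePercent);
        System.out.println("Cache size: " + cacheBytes + " bytes");
        System.out.println("Objects that fit: " + objectsThatFit(cacheBytes, bytesPerObject));
    }
}
```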
JVM heap memory usage in Spotfire Streaming follows normal JVM heap usage, with the following differences:
-
Array fields in Managed objects only consume the size of an object reference (8 bytes).
-
Managed objects have an additional, internal 24 byte field used as a shared memory identifier.
-
POJO fields, for POJOs with the Transactional annotation, when transactionally locked and modified, will temporarily consume heap memory to log their before state. The memory used will be equivalent to the size of the fields before they are modified and is released when the transaction commits or aborts. The number of concurrent transactions should also be taken into account.
A Spotfire Streaming node consists of a small number of processes communicating through shared memory to provide the Spotfire Streaming Runtime services. The total size of the code that may be executed by these processes is approximately 100 megabytes. This is a system wide (per-server) cost, and not a per-process cost, because the code is contained in shared object files.
A typical UNIX installation requires adding at least as much swap space as there is physical memory. However, it is highly recommended that a machine running a Spotfire Streaming system have enough physical memory that it never needs to swap. Swapping runs at disk access speeds, while Spotfire Streaming is designed to run from memory at memory access speeds.
-
Clock speed
Differences in processor speed have a direct linear effect upon the performance of application code. Faster processors will result in faster application execution.
-
Multi-processor and multi-core
Both the Spotfire Streaming Runtime and the JVM are designed to take advantage of multi-threading capabilities in the underlying operating system. A Spotfire Streaming application designed for parallelism will take advantage of multiple processing units, increasing overall throughput.
-
Size
The Spotfire Streaming product installation will make use of approximately 1 gigabyte of disk space.
Deployed, each Spotfire Streaming node's disk space is determined mostly by the size of the shared memory. By default this size is 512 megabytes and the shared memory is an ordinary file system file, which is memory-mapped by the Spotfire Streaming Runtime. There is also an option to use System V Shared Memory instead of a file. In this case, the shared memory does not use any disk space.
Warning
Deploying a shared memory file on a networked file system (e.g. NFS), or in a virtual hardware environment, is not supported for production deployments. The disk I/O subsystem performance is not sufficient to support the required disk throughput in these environments. Use System V Shared Memory instead.
After system startup, Spotfire Streaming, by default, will generate very little disk I/O, most of it involved in logging the invocation of administrative commands.
Application-specific logging or generation of disk data also needs to be taken into account when choosing disk size.
-
Number
By default, a single disk is capable of supporting a Spotfire Streaming node.
Disk I/O speeds need to be considered when either change logging or application-specific disk I/O will occur. If a single disk does not have sufficient space or performance, the I/O may be spread across multiple disks, either through configuration of the file locations or by using a volume manager to present multiple disks as a single logical disk to Spotfire Streaming.
-
Partitioning
Spotfire Streaming does not have specific partitioning requirements.
Note
Using multiple partitions does not improve the performance characteristics of a single disk.
Network speed affects both the throughput and latency when using highly available or distributed objects.
When a highly available object is modified, all of the object's data is sent to the remote node(s) when the transaction commits. The Spotfire Streaming Runtime attempts to minimize the number of separate packets by aggregating the data for multiple highly available objects that have been modified in a single transaction.
Distributed objects, depending upon their configuration and where they are accessed, may generate synchronous network I/O for each access.
For highly available and distributed objects, additional network I/O is done between all involved nodes as part of transaction commit and abort processing. The size of the I/O is typically small, but it is a separate packet at commit/abort time.
A distributed application can saturate a Fast Ethernet (100 Mbits/second) network. It is recommended that Gigabit Ethernet be used.
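A back-of-the-envelope check shows how quickly replication traffic can exceed a Fast Ethernet link. The sketch below is only an estimate: the transaction rate and bytes per transaction are assumed values, the calculation counts payload only (no protocol overhead or commit/abort packets), and the class and method names are hypothetical.

```java
/** Sketch: compare replication payload bandwidth against common Ethernet link speeds. */
class NetworkLoadEstimate {

    static final long FAST_ETHERNET_BITS_PER_SECOND = 100_000_000L;
    static final long GIGABIT_ETHERNET_BITS_PER_SECOND = 1_000_000_000L;

    /** Payload bits per second sent to a remote node. */
    static long payloadBitsPerSecond(long transactionsPerSecond, long bytesPerTransaction) {
        return Math.multiplyExact(
            Math.multiplyExact(transactionsPerSecond, bytesPerTransaction), 8L);
    }

    /** True when the payload alone is more than the link can carry. */
    static boolean exceedsLink(long payloadBitsPerSecond, long linkBitsPerSecond) {
        return payloadBitsPerSecond > linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long transactionsPerSecond = 10_000;  // assumption for illustration
        long bytesPerTransaction = 2_000;     // assumption: object data sent per commit

        long bits = payloadBitsPerSecond(transactionsPerSecond, bytesPerTransaction);
        System.out.println("Payload: " + bits + " bits/second");
        System.out.println("Exceeds Fast Ethernet: "
            + exceedsLink(bits, FAST_ETHERNET_BITS_PER_SECOND));
        System.out.println("Exceeds Gigabit Ethernet: "
            + exceedsLink(bits, GIGABIT_ETHERNET_BITS_PER_SECOND));
    }
}
```

With these assumed figures the payload alone is 160 Mbits/second, which is more than Fast Ethernet can carry but well within Gigabit Ethernet.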
Highly available and distributed objects, and the underlying Spotfire Streaming support, are designed to be used on a local area network (LAN) with low latency and high throughput. They are also optimized to work over a wide area network (WAN) to support geographic redundancy.
-
Number of machines
Highly available objects exist in partitions that are shared between multiple nodes. It is recommended that these nodes be located on different machines. Multiple partitions may be hosted on a node, and nodes may act as both the active and replica roles for each other in an active/active configuration.
The number of partitions and the number of machines are chosen for both administrative and performance reasons.
-
Network interfaces
By default, Spotfire Streaming uses a single network interface per node, but it may be configured to use multiple network interfaces.