This chapter describes how to tune Spotfire Streaming applications. Application and system parameters are described.
The Spotfire Streaming runtime supports multiple processes communicating through shared memory, or a memory mapped file. When a JVM is started using the deployment tool, all runtime resources required by the JVM are available in the same process space. There are cases where multiple JVMs on a single node may be appropriate for an application (see Multiple JVMs), but there is a performance impact for dispatching between JVMs.
By default, the Spotfire Streaming runtime does not modify the JVM heap (-Xms<size> and -Xmx<size>) or stack (-Xss<size>) memory options. If, during testing, the JVM is found to run short of, or out of, memory, these options can be modified by setting them as arguments to the deployment tool.
Both JConsole and VisualVM can be used for looking at heap memory utilization.
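Heap utilization can also be read from inside the JVM. The following is a minimal sketch using the standard java.lang.management API; the class name HeapReport is hypothetical and is not part of Spotfire Streaming.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the Spotfire Streaming API.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    // Formats a heap usage snapshot as a single line.
    static String summarize(MemoryUsage heap) {
        // getMax() returns -1 when the maximum is undefined.
        String max = heap.getMax() < 0 ? "undefined" : (heap.getMax() / MB) + " MB";
        return "heap used=" + (heap.getUsed() / MB) + " MB, committed="
                + (heap.getCommitted() / MB) + " MB, max=" + max;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(summarize(memory.getHeapMemoryUsage()));
    }
}
```

Logging this line periodically gives a rough trend of heap growth that can be compared against the -Xmx<size> setting.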
By default, the Spotfire Streaming runtime does not modify any of the JVM garbage collection parameters.
For production systems deploying using the Oracle JVM, we recommend that you enable garbage collection logging using the following deployment options:
-
-XX:+PrintGCDateStamps
-
-XX:+PrintGCDetails
-
-Xloggc:gc.log
Note: replace gc.log with a name unique to your deployed JVM to prevent multiple JVMs from colliding on the same log file.
This provides relatively low-overhead logging that can be used to look for memory issues. Using the timestamps, the log entries can be correlated with other application logging (for example, request/response latency).
Another useful set of Oracle JVM options controls GC log file rotation. See (Java HotSpot VM Options).
-
-XX:+UseGCLogFileRotation
-
-XX:NumberOfGCLogFiles=<number>
-
-XX:GCLogFileSize=<size>
Garbage collection tuning is a complex subject with dependencies on the application, the target load, and the desired balance of application throughput, latency, and footprint. Because there is no best one-size-fits-all answer, most JVMs offer a variety of options for modifying the behavior of the garbage collector. An Internet search shows a large selection of writings on the subject. One book with good coverage on the implementation and tuning of garbage collection in Oracle JVMs is Java Performance by Charlie Hunt and Binu John.
When deploying using the Oracle JVM, we recommend setting the following JVM deploy option, which causes a JVM heap dump to be logged on an out of memory error within the JVM:
-XX:+HeapDumpOnOutOfMemoryError
Typically, a Spotfire Streaming deployment consists of a single JVM per node. However, there may be cases where multiple JVMs per node are required (for example, exceeding a per-process limit on the number of file descriptors).
The Spotfire Streaming runtime supports multiple JVMs deployed within a single node. These JVMs may all access the same managed objects.
-
Size
Shared memory needs to be large enough to contain the application's managed objects, the runtime state, and any in-flight transactions. See the System Sizing Guide for information on how to determine the correct size.
When caching managed objects, shared memory only needs to be large enough to store the subset of cached managed objects.
-
mmap
By default the Spotfire Streaming runtime uses a normal file in the file system. The mmap(2) system call is used to map it into the address space of the Spotfire Streaming processes.
In a development environment, this is very convenient. Many developers may share a machine, and the operating system only allocates memory as it is actually utilized in the shared memory files. Cleanup of stranded deployments (where the processes are gone but the shared memory file remains) may be as simple as removing file system directories.
A performance disadvantage when using mmapped files for shared memory is that the operating system spends cycles writing the memory image of the file to disk. As the size of the shared memory file and the amount of shared memory accessed by the application increase, the operating system spends more and more time writing the contents to disk.
Warning
Deploying a shared memory file on a networked file system (such as NFS) is not supported for production deployments. The I/O performance is not sufficient to support the required throughput. Use System V Shared Memory instead.
-
System V Shared Memory
Spotfire Streaming also supports using System V Shared Memory for its shared memory.
Note
To reclaim System V Shared Memory the Spotfire Streaming node must be stopped and removed using the epadmin remove node command. The shared memory is not released by removing the node deployment directory.
An advantage of using System V Shared Memory is that the operating system does not spend any cycles attempting to write the memory to disk.
The operating system allocates the memory all at once, and it cannot be swapped, which is another advantage. In some cases, this also allows the operating system to allocate the physical memory contiguously and use the CPU's TLB (translation lookaside buffer) more efficiently. See Linux Huge Page TLB support for Linux tuning information.
See Linux System V Shared Memory Kernel Tuning for details on tuning Linux System V Shared Memory kernel parameters and macOS System V Shared Memory Kernel Tuning for details on tuning macOS System V Shared Memory kernel parameters.
Managed objects support caching of a subset of the object data in shared memory. The cache size should be set so that it is large enough to hold a working set of objects in shared memory. This avoids constantly refreshing object data from a remote node or an external data store, which negatively impacts performance. Spotfire Streaming uses an LRU (least recently used) algorithm to evict objects from shared memory, so the objects that are accessed most often remain cached in shared memory.
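To illustrate LRU eviction in general terms, the following is a minimal sketch built on java.util.LinkedHashMap in access order. It is only an illustration of the eviction policy; it is not how the Spotfire Streaming runtime implements its cache, and the class name LruCache is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of LRU eviction; not the runtime's implementation.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the capacity is exceeded.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", 3);   // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

If the capacity is smaller than the working set, entries are evicted and reloaded repeatedly, which is the refresh cost described above.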
The machine where a Spotfire Streaming node runs should always have enough available physical memory so that no swapping occurs on the system. Spotfire Streaming gains much of its performance by caching as much as possible in memory. If this memory becomes swapped, or simply paged out, the cost to access it increases by many orders of magnitude.
On Linux one can see if swapping has occurred using the following command:
$ /usr/bin/free
             total       used       free     shared    buffers     cached
Mem:       3354568    3102912     251656          0     140068    1343832
-/+ buffers/cache:    1619012    1735556
Swap:      6385796          0    6385796
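In the output above, the used column of the Swap line is 0, meaning no swap space is currently in use. As a rough sketch, the check could be automated by parsing that line; the class and method names below are hypothetical, and the parser assumes the column layout shown above (total, used, free).

```java
// Hypothetical helper that reads the "used" column of the Swap: line
// from the text printed by free. Assumes the layout: Swap: total used free.
final class SwapCheck {
    static long swapUsedKb(String freeOutput) {
        for (String line : freeOutput.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Swap:")) {
                // columns[0]="Swap:", [1]=total, [2]=used, [3]=free
                String[] columns = trimmed.split("\\s+");
                return Long.parseLong(columns[2]);
            }
        }
        throw new IllegalArgumentException("no Swap: line found");
    }

    public static void main(String[] args) {
        String sample =
                "             total       used       free\n"
              + "Mem:       3354568    3102912     251656\n"
              + "Swap:      6385796          0    6385796\n";
        long used = swapUsedKb(sample);
        System.out.println(used == 0 ? "no swap in use" : "swap in use: " + used + " kB");
    }
}
```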
The BIOS for many hardware platforms includes power savings and performance settings. Significant performance differences may be seen based on the settings. For the best Spotfire Streaming performance, we recommend setting them to their maximum performance and lowest latency values.
Operating system kernels typically enforce configurable limits on System V Shared Memory usage. On Linux, these limits can be seen by running the following command:
$ ipcs -lm

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 67108864
min seg size (bytes) = 1
The tunable values that affect shared memory are:
-
SHMMAX
- This parameter defines the maximum size, in bytes, of a single shared memory segment. It should be set to at least the largest desired memory size for nodes using System V Shared Memory.
-
SHMALL
- This parameter sets the total amount of shared memory pages that can be used system wide. It should be set to at least SHMMAX/page size. To see the page size for a particular system run the following command:
$ getconf PAGE_SIZE
4096
-
SHMMNI
- This parameter sets the system wide maximum number of shared memory segments. It should be set to at least the number of nodes that are to be run on the system using System V Shared Memory.
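The relationship between SHMMAX and the minimum SHMALL value is simple arithmetic. The sketch below works it through for a 17179869184-byte segment and a 4096-byte page; the class and method names are hypothetical.

```java
// Hypothetical helper: computes the smallest SHMALL (in pages) that can
// hold one segment of SHMMAX bytes, rounding up to whole pages.
final class ShmSizing {
    static long minShmallPages(long shmmaxBytes, long pageSizeBytes) {
        if (shmmaxBytes < 0 || pageSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (shmmaxBytes + pageSizeBytes - 1) / pageSizeBytes;
    }

    public static void main(String[] args) {
        long shmmax = 17179869184L; // bytes
        long pageSize = 4096L;      // from: getconf PAGE_SIZE
        System.out.println(minShmallPages(shmmax, pageSize)); // prints 4194304
    }
}
```

When several nodes using System V Shared Memory run on one machine, SHMALL must cover the sum of their segments, so this value is a lower bound rather than a recommendation.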
These values may be changed either at runtime (in several different ways) or system boot time.
Change SHMMAX to 16 gigabytes (17179869184 bytes), at runtime, as root, by setting the value directly in /proc:
# echo 17179869184 > /proc/sys/kernel/shmmax
Change SHMALL to 4 million pages, at runtime, as root, via the sysctl program:
# sysctl -w kernel.shmall=4194304
Change SHMMNI to 4096 automatically at boot time:
# echo "kernel.shmmni=4096" >> /etc/sysctl.conf
On Linux, the runtime attempts to use the huge page TLB support when allocating System V Shared Memory for sizes that are even multiples of 256 megabytes. If the support is not present, or not sufficiently configured, the runtime automatically falls back to normal System V Shared Memory allocation.
-
The kernel must have the hugepagetlb support enabled. This is present in 2.6 kernels and later. See (http://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt).
-
The system must have huge pages available. They can be reserved:
At boot time via /etc/sysctl.conf:
vm.nr_hugepages = 512
Or at runtime:
echo 512 > /proc/sys/vm/nr_hugepages
Or the kernel can attempt to allocate them from the normal memory pools as needed:
At boot time via /etc/sysctl.conf:
vm.nr_overcommit_hugepages = 512
Or at runtime:
echo 512 > /proc/sys/vm/nr_overcommit_hugepages
-
Non-root users require group permission. This can be granted:
At boot time via /etc/sysctl.conf:
vm.hugetlb_shm_group = 1000
Or at runtime by:
echo 1000 > /proc/sys/vm/hugetlb_shm_group
where 1000 is the desired group id.
-
On earlier kernels in the 2.6 series, the user ulimit on maximum locked memory (memlock) must also be raised to a level equal to or greater than the System V Shared Memory size. On RedHat systems, this involves changing /etc/security/limits.conf, and then enabling the PAM support for limits on whatever login mechanism is being used. See the operating system vendor documentation for details.
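Because huge page support is only attempted for sizes that are multiples of 256 megabytes, it can help to check a planned shared memory size up front. The following sketch assumes that a megabyte here means 1024 * 1024 bytes; the class and method names are hypothetical.

```java
// Hypothetical helper for checking a planned shared memory size against
// the 256 megabyte multiple. Assumes 1 megabyte = 1024 * 1024 bytes.
final class HugePageSizing {
    static final long UNIT_BYTES = 256L * 1024L * 1024L;

    // True when the size is a positive exact multiple of 256 megabytes.
    static boolean isMultipleOfUnit(long sizeBytes) {
        return sizeBytes > 0 && sizeBytes % UNIT_BYTES == 0;
    }

    // Rounds a size up to the next multiple of 256 megabytes.
    static long roundUpToUnit(long sizeBytes) {
        return ((sizeBytes + UNIT_BYTES - 1) / UNIT_BYTES) * UNIT_BYTES;
    }

    public static void main(String[] args) {
        long planned = 300L * 1024L * 1024L; // 300 megabytes
        System.out.println(isMultipleOfUnit(planned));                 // prints false
        System.out.println(roundUpToUnit(planned) / (1024L * 1024L)); // prints 512
    }
}
```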
A system imposed user limit on the maximum number of processes may impact the ability to deploy multiple JVMs concurrently to the same machine, or even a single JVM if it uses a large number of threads. The limit for the current user may be seen by running:
$ ulimit -u
16384
Many RedHat systems include a limit of 1024:
$ cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          -    nproc     1024
This limit should be raised if you see errors like the following:
EAGAIN The system lacked the necessary resources to create another thread, or the system-imposed limit on the total number of threads in a process {PTHREAD_THREADS_MAX} would be exceeded.
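To judge how close a JVM is to such a limit, its thread count can be read with the standard java.lang.management API. This is a minimal sketch; the class name and the limit value are hypothetical, and on Linux the user limit generally applies to all of the user's processes and threads together, so one JVM's count is only part of the total.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of the Spotfire Streaming API.
final class ThreadUsage {
    // Remaining threads before the limit is reached (negative when over).
    static long headroom(long limit, long liveThreads) {
        return limit - liveThreads;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int live = threads.getThreadCount();
        int peak = threads.getPeakThreadCount();
        long limit = 1024L; // hypothetical; substitute the output of "ulimit -u"
        System.out.println("live=" + live + " peak=" + peak
                + " headroom=" + headroom(limit, live));
    }
}
```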
Operating system kernels typically enforce configurable limits on System V Shared Memory usage. On macOS, these limits can be seen by running the following command:
ipcs -M
IPC status from <running system> as of Sun Apr 29 05:38:52 PDT 2018
shminfo:
    shmmax: 1073741824 (max shared memory segment size)
    shmmin: 1 (min shared memory segment size)
    shmmni: 32 (max number of shared memory identifiers)
    shmseg: 8 (max shared memory segments per process)
    shmall: 2097152 (max amount of shared memory in pages)
The tunable variables that affect shared memory are:
-
kern.sysv.shmmax
- This variable defines the maximum size, in bytes, of a single shared memory segment. It should be set to at least the largest desired memory size for nodes using System V Shared Memory.
-
kern.sysv.shmall
- This variable sets the total number of shared memory pages that can be used system wide. It should be set to at least kern.sysv.shmmax/pagesize. The current page size can be seen using the pagesize command:
pagesize
4096
-
kern.sysv.shmmni
- This variable sets the system wide maximum number of shared memory segments. It should be set to at least the number of nodes that are to be run on the system using System V Shared Memory.
These variables can be changed at runtime using sysctl, or at system boot time using /etc/sysctl.conf.
These sysctl commands change the kernel to support two million System V shared memory pages with a maximum shared memory segment size of 8 gigabytes. These changes take effect immediately, but are not maintained across system reboots.
sudo sysctl kern.sysv.shmall=2097152
sudo sysctl kern.sysv.shmmax=8589934592
The /etc/sysctl.conf file must be updated for the changed variables to be maintained across system reboots. Changes to /etc/sysctl.conf require a system reboot to take effect.
#
# Maximum shared memory segment size of 8 GB
# Maximum of 2 million shared memory pages
#
kern.sysv.shmmax=8589934592
kern.sysv.shmall=2097152
A system imposed user limit on the maximum number of processes and threads may impact the ability to deploy multiple JVMs concurrently to the same machine, or even a single JVM if it uses a large number of threads. The current process limit is displayed using:
ulimit -u
2837
There are two tunable variables that control this value:
-
kern.maxproc
- This variable sets the maximum number of processes system wide.
-
kern.maxprocperuid
- This variable sets the maximum number of processes per user.
These variables can be changed at runtime using sysctl, or at system boot time using /etc/sysctl.conf.
These sysctl commands change the kernel to support a total of 4096 processes system wide and 2048 processes per user. These changes take effect immediately, but are not maintained across system reboots.
sudo sysctl kern.maxproc=4096
sudo sysctl kern.maxprocperuid=2048
The /etc/sysctl.conf file must be updated for the changed variables to be maintained across system reboots. Changes to /etc/sysctl.conf require a system reboot to take effect.
#
# Support 4096 total processes and 2048 per user
#
kern.maxproc=4096
kern.maxprocperuid=2048
A Spotfire Streaming application can be, and often is, run on a single node. With High-availability and Distribution features, Spotfire Streaming can run distributed applications across multiple nodes. From an operational point of view, there are very few benefits from running multiple nodes on a single machine. This document recommends and assumes that each node is run on its own machine.
When an application reaches its throughput limit on a single node, additional performance can be gained by adding multiple nodes. This is called horizontal scaling. For an application that is not designed for distribution, this often poses a problem. Sometimes this can be addressed by adding a routing device outside of the nodes. But sometimes this can only be addressed by rewriting the application.
A distributed Spotfire Streaming application can be spread across an arbitrary number of nodes at the High-availability data partition boundary. If the active node for a set of partitions has reached throughput saturation, one or more of the partitions may be migrated to other nodes.
When Spotfire Streaming detects a deadlock, a detailed trace is sent to the node's deadlock.log file. The deadlock trace shows information about the transaction that deadlocked, which resource deadlocked, transaction stacks, thread stack traces, and other transactions involved in the deadlock.
A lock order deadlock can occur when two or more transactions lock the same two or more objects in different orders. An illustration of this can be found in the Deadlock Detection section of the Architects Guide.
The program below generates a single transaction lock ordering deadlock between two threads, running in a single JVM, in a single node.
package com.tibco.ep.dtm.snippets.tuning;

import java.util.concurrent.TimeUnit;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;

/**
 * Deadlock Example from the Tuning Guide.
 *
 */
public class Deadlock {
    private static MyManagedObject object1;
    private static MyManagedObject object2;

    /**
     * Main entry point
     *
     * @param args
     *            Not used
     * @throws InterruptedException
     *             Execution interrupted
     */
    public static void main(String[] args) throws InterruptedException {
        //
        // Create a pair of Managed objects.
        //
        new Transaction("Create Objects") {
            @Override
            public void run() {
                object1 = new MyManagedObject();
                object2 = new MyManagedObject();
            }
        }.execute();

        //
        // Create a pair of transaction classes to lock them.
        // Giving the object parameters in reverse order will
        // cause two different locking orders, resulting in a deadlock.
        //
        Deadlocker deadlocker1 = new Deadlocker(object1, object2);
        Deadlocker deadlocker2 = new Deadlocker(object2, object1);

        //
        // Run them in separate threads until a deadlock is seen.
        //
        while ((deadlocker1.getNumberDeadlocks() == 0)
                && (deadlocker2.getNumberDeadlocks() == 0)) {
            MyThread thread1 = new MyThread(deadlocker1);
            MyThread thread2 = new MyThread(deadlocker2);

            thread1.start();
            thread2.start();

            thread1.join();
            thread2.join();
        }
    }

    @Managed
    private static class MyManagedObject {
        int value;
    }

    private static class MyThread extends Thread {
        private final Deadlocker m_deadlocker;

        MyThread(Deadlocker deadlocker) {
            m_deadlocker = deadlocker;
        }

        @Override
        public void run() {
            m_deadlocker.execute();
        }
    }

    private static class Deadlocker extends Transaction {
        private final MyManagedObject m_object1;
        private final MyManagedObject m_object2;

        Deadlocker(MyManagedObject object1, MyManagedObject object2) {
            m_object1 = object1;
            m_object2 = object2;
        }

        @Override
        public void run() {
            //
            // This will take a transaction read lock on the first object.
            //
            @SuppressWarnings("unused")
            int value = m_object1.value;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();

            //
            // This will take a transaction write lock on the second object.
            //
            m_object2.value = 42;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();
        }

        private void blockForAMoment() {
            try {
                TimeUnit.MILLISECONDS.sleep(500);
            } catch (InterruptedException ex) {
            }
        }
    }
}
The program generates a deadlock trace into the deadlock.log file, similar to the annotated trace shown below.
A deadlock trace begins with a separator:
============================================================
Followed by a timestamp and a short description of the deadlock.
2023-12-02 07:29:11.417506 Deadlock in transaction 64:1 running on engine application::com_tibco_ep_dtm_snippets_tuning_Deadlock0
Next there is another separator, which is between each transaction in the report
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Followed by more detailed information about the deadlock transaction.
Node = A.deadlock
Transaction Id = 64:1
Transaction Name = com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker
Isolation Level = serializable
Begin Time = 2023-12-02 07:29:10.913048
State = deadlocked
Followed by the Lock Type, the Lock Target that caused the deadlock, and which other transactions have the Lock Target locked. This example shows that the deadlock occurred in transaction 64:1 attempting to take a write lock on an object ...MyManagedObject:3, which is read locked in transaction 65:1.
Lock Type = write lock
Lock Target = com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:3 (3184101770:178056336:1701530943848471000:3)
Write Lock Waiters = 0
Locked By:
    read lock held by transaction 65:1
Next is a list of locks held by the deadlock transaction. Note that this example shows the deadlock transaction holding a read lock on ...MyManagedObject:6.
Transaction 64:1 holds locks:
    com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:6 (3184101770:178056336:1701530943848471000:6) read lock
The next section shows a list of engines installed in the node and their IDs. This maps to the Engine column in the transaction and thread sections.
Engines on node A.deadlock:
    Engine        Name
    720944343     System::swcoordadmin
    1568875692    System::administration
    3848410375    application::com_tibco_ep_dtm_snippets_tuning_Deadlock0
The next section shows a list of thread identifiers to thread names involved in the transaction. This maps to the ThreadId column in the transaction callstack and thread stack sections.
Threads on node A.deadlock:
    ThreadId    Name
    9479        Thread-2
The next section shows a transaction callstack for the deadlock transaction. A transaction callstack contains transaction life cycle entries, and entries showing the transaction's thread and engine usage. A transaction callstack is read from bottom to top and always starts with a begin transaction entry. This example shows a transaction that deadlocked while using a single thread (ThreadId 9479, Engine 3848410375).
From the tables above, thread id 9479 is named Thread-2 and engine 3848410375 is named application::com_tibco_ep_dtm_snippets_tuning_Deadlock0.
Transaction 64:1 call stack:
    TranId    Engine        ThreadId    Method
    64:1      3848410375    9479        deadlock on com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:3
    64:1      3848410375    9479        begin transaction
Next are thread stack traces for each of the threads being used by the transaction at the time of the deadlock. Thread stack traces are read from bottom to top.
Transaction 64:1 thread stack:
    TranId    Engine        ThreadId    Stack Type    Method
    64:1      3848410375    9479        Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
    64:1      3848410375    9479        Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker.run(Deadlock.java:110)
    64:1      3848410375    9479        Java          com.kabira.platform.Transaction.execute(Transaction.java:485)
    64:1      3848410375    9479        Java          com.kabira.platform.Transaction.execute(Transaction.java:544)
    64:1      3848410375    9479        Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$MyThread.run(Deadlock.java:79)
The next sections show the same transaction information (when available) for each of the other transactions involved in the deadlock. In this example transaction 65:1 is the other transaction involved in the deadlock.
Other involved transactions:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Node = A.deadlock
Transaction Id = 65:1
Transaction Name = com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker
Isolation Level = serializable
Begin Time = 2023-12-02 07:29:10.913147
The details in this section show that transaction 65:1 is blocked waiting for a write lock on object ...MyManagedObject:6, which is held with a read lock by 64:1, the deadlocked transaction. This is a deadlock because transaction 64:1 is waiting for a write lock on object ...MyManagedObject:3, which is already read locked in transaction 65:1 - the transactions are waiting on each other. Transaction 64:1 is chosen as the deadlock victim so that it releases its locks when it is rolled back, allowing transaction 65:1 to continue.
State = blocked
Lock Type = write lock
Lock Target = com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:6 (3184101770:178056336:1701530943848471000:6)
Write Lock Waiters = 1
Locked By:
    read lock held by transaction 64:1
Threads on node A.deadlock:
    ThreadId    Name
    65283       Thread-3
Transaction 65:1 call stack:
    TranId    Engine        ThreadId    Method
    65:1      3848410375    65283       begin transaction
Transaction 65:1 thread stack:
    TranId    Engine        ThreadId    Stack Type    Method
    65:1      3848410375    65283       Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
    65:1      3848410375    65283       Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker.run(Deadlock.java:110)
    65:1      3848410375    65283       Java          com.kabira.platform.Transaction.execute(Transaction.java:485)
    65:1      3848410375    65283       Java          com.kabira.platform.Transaction.execute(Transaction.java:544)
    65:1      3848410375    65283       Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$MyThread.run(Deadlock.java:79)
Transaction 65:1 holds locks:
    com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:3 (3184101770:178056336:1701530943848471000:3) read lock
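The lock order deadlock traced above arises because the two transactions touch the same two objects in opposite orders. A common general remedy is to always acquire locks in one agreed order. The following sketch shows that idea with plain java.util.concurrent locks rather than managed objects and transactions; the Resource class, its id field, and the update method are all hypothetical and only stand in for the pattern.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustration of consistent lock ordering using plain Java locks.
// This is an analogy only; it does not use the Spotfire Streaming
// transaction API, and all names here are hypothetical.
final class OrderedLocking {
    static final class Resource {
        final int id;
        final ReentrantLock lock = new ReentrantLock();
        int value;

        Resource(int id) {
            this.id = id;
        }
    }

    // Reads 'source' and writes 'target', always locking the resource
    // with the lower id first, whatever the argument order.
    static void update(Resource source, Resource target, int newValue) {
        Resource first = source.id <= target.id ? source : target;
        Resource second = (first == source) ? target : source;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                int observed = source.value; // read under lock
                target.value = newValue;     // write under lock
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Resource r1 = new Resource(1);
        Resource r2 = new Resource(2);
        // Opposite argument orders, as in the Deadlock example, but the
        // locks are still taken in the same order by both threads.
        Thread t1 = new Thread(() -> update(r1, r2, 42));
        Thread t2 = new Thread(() -> update(r2, r1, 43));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(r1.value + " " + r2.value); // prints 43 42
    }
}
```

Because both threads lock r1 before r2, neither can hold one lock while waiting for the other in the opposite order.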
Lock promotion is when a transaction currently holding a read lock on an object attempts to acquire a write lock on the same object (that is, promoting the read lock to a write lock). If blocking for this write lock would result in a deadlock, it is called a promotion deadlock.
The program below generates a single promotion deadlock between two threads, running in a single JVM, in a single node.
package com.tibco.ep.dtm.snippets.tuning;

import java.util.concurrent.TimeUnit;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;

/**
 * Promotion deadlock Example from the Tuning Guide.
 */
public class PromotionDeadlock {
    private static MyManagedObject targetObject;

    /**
     * Main entry point
     *
     * @param args
     *            Not used
     * @throws InterruptedException
     *             Execution interrupted
     */
    public static void main(String[] args) throws InterruptedException {
        //
        // Create a Managed object.
        //
        new Transaction("Create Objects") {
            @Override
            public void run() {
                targetObject = new MyManagedObject();
            }
        }.execute();

        //
        // Create a pair of transaction classes that will both
        // promote lock the Managed object, resulting in a
        // promotion deadlock.
        //
        Deadlocker deadlocker1 = new Deadlocker(targetObject);
        Deadlocker deadlocker2 = new Deadlocker(targetObject);

        //
        // Run them in separate threads until a deadlock is seen.
        //
        while ((deadlocker1.getNumberDeadlocks() == 0)
                && (deadlocker2.getNumberDeadlocks() == 0)) {
            MyThread thread1 = new MyThread(deadlocker1);
            MyThread thread2 = new MyThread(deadlocker2);

            thread1.start();
            thread2.start();

            thread1.join();
            thread2.join();
        }
    }

    @Managed
    private static class MyManagedObject {
        int value;
    }

    private static class MyThread extends Thread {
        private final Deadlocker m_deadlocker;

        MyThread(Deadlocker deadlocker) {
            m_deadlocker = deadlocker;
        }

        @Override
        public void run() {
            m_deadlocker.execute();
        }
    }

    private static class Deadlocker extends Transaction {
        private final MyManagedObject m_targetObject;

        Deadlocker(MyManagedObject targetObject) {
            m_targetObject = targetObject;
        }

        @Override
        public void run() {
            //
            // This will take a transaction read lock on the object.
            //
            @SuppressWarnings("unused")
            int value = m_targetObject.value;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();

            //
            // This will take a transaction write lock on the object
            // (promoting the read lock).
            //
            m_targetObject.value = 42;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();
        }

        private void blockForAMoment() {
            try {
                TimeUnit.MILLISECONDS.sleep(500);
            } catch (InterruptedException ex) {
            }
        }
    }
}
The trace messages are similar to those shown in the previous section for a lock order deadlock, with the difference being that promotion deadlock is mentioned:
============================================================
2023-12-02 08:03:21.689878 Deadlock in transaction 65:1 running on engine application::com_tibco_ep_dtm_snippets_tuning_PromotionDeadlock0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Node = A.deadlock
Transaction Id = 65:1
Transaction Name = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker
Isolation Level = serializable
Begin Time = 2023-12-02 08:03:21.186231
State = deadlocked
Lock Type = promote lock
Lock Target = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:4 (3184101770:8762792:1701532994132362000:4)
Write Lock Waiters = 0
Locked By:
    read lock (and promote waiter) held by transaction 64:1
    read lock held by transaction 65:1
Transaction 65:1 holds locks:
    com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:4 (3184101770:8762792:1701532994132362000:4) read lock
Engines on node A.deadlock:
    Engine        Name
    720944343     System::swcoordadmin
    1568875692    System::administration
    849118958     application::com_tibco_ep_dtm_snippets_tuning_PromotionDeadlock0
Threads on node A.deadlock:
    ThreadId    Name
    64003       Thread-3
Transaction 65:1 call stack:
    TranId    Engine       ThreadId    Method
    65:1      849118958    64003       promotion deadlock on com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:4
    65:1      849118958    64003       begin transaction
Transaction 65:1 thread stack:
    TranId    Engine       ThreadId    Stack Type    Method
    65:1      849118958    64003       Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
    65:1      849118958    64003       Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker.run(PromotionDeadlock.java:106)
    65:1      849118958    64003       Java          com.kabira.platform.Transaction.execute(Transaction.java:485)
    65:1      849118958    64003       Java          com.kabira.platform.Transaction.execute(Transaction.java:544)
    65:1      849118958    64003       Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyThread.run(PromotionDeadlock.java:76)
Other involved transactions:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Node = A.deadlock
Transaction Id = 64:1
Transaction Name = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker
Isolation Level = serializable
Begin Time = 2023-12-02 08:03:21.186138
State = blocked
Lock Type = promote lock
Lock Target = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:4 (3184101770:8762792:1701532994132362000:4)
Write Lock Waiters = 0
Locked By:
    read lock (and promote waiter) held by transaction 64:1
    read lock held by transaction 65:1
Threads on node A.deadlock:
    ThreadId    Name
    8967        Thread-2
Transaction 64:1 call stack:
    TranId    Engine       ThreadId    Method
    64:1      849118958    8967        begin transaction
Transaction 64:1 thread stack:
    TranId    Engine       ThreadId    Stack Type    Method
    64:1      849118958    8967        Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
    64:1      849118958    8967        Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker.run(PromotionDeadlock.java:106)
    64:1      849118958    8967        Java          com.kabira.platform.Transaction.execute(Transaction.java:485)
    64:1      849118958    8967        Java          com.kabira.platform.Transaction.execute(Transaction.java:544)
    64:1      849118958    8967        Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyThread.run(PromotionDeadlock.java:76)
Transaction 64:1 holds locks:
    com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:4 (3184101770:8762792:1701532994132362000:4) read lock
The previous examples showed simple deadlocks, occurring between two transactions. More complex deadlocks are possible involving more than two transactions. For example, transaction 1 deadlocks trying to acquire a lock on an object held by transaction 2, which is blocked waiting on an object held by transaction 3.
To aid in analyzing complex deadlocks, the following is found in the trace messages:
Lock Target = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:4 (3184101770:8762792:1701532994132362000:4)
Write Lock Waiters = 0
Locked By:
    read lock (and promote waiter) held by transaction 64:1
    read lock held by transaction 65:1
For each contended object, a display of the locks is included, including any promotion waiters.
If the runtime detects that a deadlock happens due to a read lock being blocked, it includes the transaction blocked waiting for the promotion.
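When reading a trace for a complex deadlock, it can help to write down which transaction is blocked on which and follow the chain. The sketch below models that as a wait-for graph and checks whether following the chain from one transaction revisits a transaction, which indicates a cycle. It is a reading aid only, not the runtime's detection algorithm, and it makes the simplifying assumption that each transaction waits on at most one other; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical reading aid for deadlock traces; not the runtime's algorithm.
// Simplification: each transaction waits on at most one other transaction.
final class WaitForGraph {
    private final Map<String, String> waitsFor = new HashMap<>();

    // Records that 'waiter' is blocked on a lock held by 'holder'.
    void addWait(String waiter, String holder) {
        waitsFor.put(waiter, holder);
    }

    // Follows the wait chain from 'start'; true if a transaction repeats.
    boolean reachesCycle(String start) {
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null) {
            if (!seen.add(current)) {
                return true; // already visited: the chain loops
            }
            current = waitsFor.get(current);
        }
        return false; // chain ended at a transaction that is not waiting
    }

    public static void main(String[] args) {
        WaitForGraph graph = new WaitForGraph();
        graph.addWait("1", "2");
        graph.addWait("2", "3");
        System.out.println(graph.reachesCycle("1")); // prints false
        graph.addWait("3", "1");
        System.out.println(graph.reachesCycle("1")); // prints true
    }
}
```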
Single node deadlocks are bad for performance because they are a source of contention, leading to lower throughput, higher latency, and higher CPU cost. But the deadlocks are detected immediately, because each node has a built-in transaction lock manager.
Distributed deadlocks are extremely bad for performance because they use a timeout mechanism for deadlock detection. The default setting for this timeout is 60 seconds in a production build.
The program below will generate a distributed transaction lock ordering deadlock between two transactions running across multiple nodes.
    package com.tibco.ep.dtm.snippets.tuning;

    import com.kabira.platform.Transaction;
    import com.kabira.platform.annotation.Managed;
    import com.kabira.platform.highavailability.PartitionManager;
    import com.kabira.platform.highavailability.PartitionManager.EnableAction;
    import com.kabira.platform.highavailability.PartitionMapper;
    import com.kabira.platform.highavailability.ReplicaNode;
    import static com.kabira.platform.highavailability.ReplicaNode.ReplicationType.*;
    import java.util.concurrent.TimeUnit;
    import com.kabira.platform.property.Status;

    /**
     * Distributed deadlock example from the Tuning Guide
     * <h2>Target Nodes</h2>
     * <ul>
     * <li><b>servicename</b>=snippets
     * </ul>
     * Note this sample blocks on B.snippet and C.snippet nodes, and needs to be explicitly stopped.
     */
    public class DistributedDeadlock {
        private static TestObject object1;
        private static TestObject object2;
        private static final String nodeName = System.getProperty(Status.NODE_NAME);
        private static final String NODE_A = "A.snippets";
        private static final String NODE_B = "B.snippets";
        private static final String NODE_C = "C.snippets";

        /**
         * Main entry point
         *
         * @param args
         *            Not used
         * @throws InterruptedException
         *             Execution interrupted
         */
        public static void main(String[] args) throws InterruptedException {
            //
            // Install a partition mapper on each node
            //
            AssignPartitions.installPartitionMapper();

            //
            // Block all but the A node.
            //
            new NodeChecker().blockAllButA();

            //
            // Define the partitions to be used by this snippet
            //
            new PartitionCreator().createPartitions();

            //
            // Create a pair of objects, one active on node B,
            // and the other active on node C.
            //
            new Transaction("Create Objects") {
                @Override
                public void run() {
                    object1 = new TestObject();
                    object2 = new TestObject();

                    //
                    // For each distributed object, assign it a
                    // reference to the other.
                    //
                    object1.otherObject = object2;
                    object2.otherObject = object1;
                }
            }.execute();

            //
            // Create a pair of objects, one active on node B,
            // and the other active on node C.
            //
            new Transaction("Spawn Deadlockers") {
                @Override
                public void run() {
                    //
                    // Ask them each to spawn a Deadlocker thread.
                    // This should execute on node B for one of them
                    // and node C for the other.
                    //
                    object1.spawnDeadlocker();
                    object2.spawnDeadlocker();
                }
            }.execute();

            //
            // Now block main in the A node to keep the JVM from exiting.
            //
            new NodeChecker().block();
        }

        private static class PartitionCreator {
            void createPartitions() {
                new Transaction("Partition Definition") {
                    @Override
                    protected void run() throws Rollback {
                        //
                        // Set up the node lists - notice that the odd node list
                        // has node B as the active node, while the even
                        // node list has node C as the active node.
                        //
                        ReplicaNode[] evenReplicaList = new ReplicaNode[] {
                            new ReplicaNode(NODE_C, SYNCHRONOUS),
                            new ReplicaNode(NODE_A, SYNCHRONOUS)
                        };
                        ReplicaNode[] oddReplicaList = new ReplicaNode[] {
                            new ReplicaNode(NODE_B, SYNCHRONOUS),
                            new ReplicaNode(NODE_A, SYNCHRONOUS)
                        };

                        //
                        // Define two partitions
                        //
                        PartitionManager.definePartition("Even", null, NODE_B, evenReplicaList);
                        PartitionManager.definePartition("Odd", null, NODE_C, oddReplicaList);

                        //
                        // Enable the partitions
                        //
                        PartitionManager.enablePartitions(EnableAction.JOIN_CLUSTER_PURGE);
                    }
                }.execute();
            }
        }

        //
        // Partition mapper that maps objects to either Even or Odd
        //
        private static class AssignPartitions extends PartitionMapper {
            private Integer m_count = 0;

            @Override
            public String getPartition(Object obj) {
                this.m_count++;
                String partition = "Even";
                if ((this.m_count % 2) == 1) {
                    partition = "Odd";
                }
                return partition;
            }

            static void installPartitionMapper() {
                new Transaction("installPartitionMapper") {
                    @Override
                    protected void run() {
                        //
                        // Install the partition mapper
                        //
                        PartitionManager.setMapper(TestObject.class, new AssignPartitions());
                    }
                }.execute();
            }
        }

        @Managed
        private static class TestObject {
            TestObject otherObject;
            @SuppressWarnings("unused")
            private String m_data;

            public void lockObjects() {
                Transaction.setTransactionDescription("locking first object");
                doWork();

                //
                // Delay longer on the B node to try to force the deadlock
                // to occur on the C. Otherwise, both sides could see
                // deadlocks at the same time, making the log files less clear
                // for this snippet.
                //
                if (nodeName.equals(NODE_B)) {
                    block(10000);
                } else {
                    block(500);
                }

                Transaction.setTransactionDescription("locking second object");
                otherObject.doWork();
                block(500);
            }

            public void spawnDeadlocker() {
                new DeadlockThread(this).start();
            }

            private void block(int milliseconds) {
                try {
                    TimeUnit.MILLISECONDS.sleep(milliseconds);
                } catch (InterruptedException ex) {
                }
            }

            private void doWork() {
                m_data = "work";
            }
        }

        private static class DeadlockThread extends Thread {
            private final Transaction m_deadlockTransaction;

            DeadlockThread(TestObject object) {
                m_deadlockTransaction = new DeadlockTransaction("DeadlockThread", object);
            }

            @Override
            public void run() {
                while (true) {
                    if (m_deadlockTransaction.execute() == Transaction.Result.ROLLBACK) {
                        return;
                    }
                }
            }
        }

        private static class DeadlockTransaction extends Transaction {
            private final TestObject m_object;

            DeadlockTransaction(final String name, TestObject object) {
                super(name);
                m_object = object;
            }

            @Override
            public void run() throws Rollback {
                if (getNumberDeadlocks() != 0) {
                    System.out.println("A deadlock has been seen, "
                        + "you may now stop the distributed application");
                    throw new Transaction.Rollback();
                }
                m_object.lockObjects();
            }
        }

        private static class NodeChecker {
            //
            // If we are not the A node, block here forever
            //
            void blockAllButA() {
                while (!nodeName.equals(NODE_A)) {
                    block();
                }
            }

            public void block() {
                while (true) {
                    try {
                        TimeUnit.MILLISECONDS.sleep(500);
                    } catch (InterruptedException ex) {
                    }
                }
            }
        }
    }
The program should produce a deadlock that is processed on node C and reported in the deadlock.log file on node C. The deadlock trace is generated on the node where the distributed transaction was started, which is not the node where the deadlock timeout occurred.
    ===========================================================
    2023-12-02 08:28:32.604580 Global deadlock in transaction 85:1 running on engine application::com_tibco_ep_dtm_snippets_tuning_DistributedDeadlock0
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    Node = C.snippets
    Transaction Id = 85:1
    Transaction Name = DeadlockThread
    Transaction Description = locking second object
    Global Transaction Id = serializable:3080819280765915:85:1:1701534258543620000
    Isolation Level = serializable
    Begin Time = 2023-12-02 08:27:32.065793
    State = distributed deadlock

    Transaction 85:1 holds locks:
        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:7 (3184101770:3037728096:1701534235759231000:7) write lock

    Engines on node C.snippets:
        Engine      Name
        720944343   System::swcoordadmin
        1568875692  System::administration
        2765607519  application::com_tibco_ep_dtm_snippets_tuning_DistributedDeadlock0

    Threads on node C.snippets:
        ThreadId  Name
        11015     Thread-2

    Transaction 85:1 call stack:
        TranId  Engine      ThreadId  Method
        85:1    2765607519  11015     distribution calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:5
        85:1    2765607519  11015     dispatch calling [distributed dispatch] on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:5
        85:1    2765607519  11015     begin transaction

    Transaction 85:1 thread stack:
        TranId  Engine      ThreadId  Stack Type  Method
        85:1    2765607519  11015     Java        com.kabira.platform.NativeRuntime.sendTwoWay(Native Method)
        85:1    2765607519  11015     Java        com.kabira.platform.NativeRuntime.sendTwoWay(NativeRuntime.java:173)
        85:1    2765607519  11015     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.doWork(DistributedDeadlock.java)
        85:1    2765607519  11015     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$lockObjectsImpl(DistributedDeadlock.java:193)
        85:1    2765607519  11015     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.lockObjects(DistributedDeadlock.java)
        85:1    2765607519  11015     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$DeadlockTransaction.run(DistributedDeadlock.java:250)
        85:1    2765607519  11015     Java        com.kabira.platform.Transaction.execute(Transaction.java:485)
        85:1    2765607519  11015     Java        com.kabira.platform.Transaction.execute(Transaction.java:544)
        85:1    2765607519  11015     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$DeadlockThread.run(DistributedDeadlock.java:227)
Next comes information from the remote node, where the deadlock timeout occurred.
    Other involved transactions:
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    Node = B.snippets
    Transaction Id = 89:1
    Transaction Name = DeadlockThread
    Transaction Description = locking second object
    Global Transaction Id = serializable:3124420528571642:89:1:1701534246888375000
    Isolation Level = serializable
    Begin Time = 2023-12-02 08:27:32.072194
    State = state not available, transaction may be running

    Transaction 89:1 call stack:
        TranId  Engine      ThreadId  Method
        89:1    2765607519  44035     distribution calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:7
        89:1    2765607519  44035     dispatch calling [distributed dispatch] on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:7
        89:1    2765607519  44035     begin transaction

    Transaction 89:1 holds locks:
        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:5 (3184101770:3037728096:1701534235759231000:5) write lock
            at com.kabira.platform.NativeRuntime.setReference(Native Method)
            at com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl(DistributedDeadlock.java:211)
Also included from the remote node is a list of all transactions on the node that were blocked at the time of the deadlock.
    All local blocked transactions on node C.snippets:

    Transaction anonymous transaction[serializable:3124420528571642:89:1:1701534246888375000, tid 31503], started at 2023-12-02 08:27:42.076718, is blocked waiting for a write lock on
        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:7 (3184101770:3037728096:1701534235759231000:7)
        locks write { 'DeadlockThread'[serializable:3080819280765915:85:1:1701534258543620000, tid 11015, locking second object] } {1 write waiter}

    Transaction callstack for transaction 87:1:
        Engine 1568875692 Thread 31503 begin transaction
        Engine 2765607519 Thread 6147 dispatch calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:7

    Objects currently locked in transaction anonymous transaction[serializable:3124420528571642:89:1:1701534246888375000, tid 31503]
        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:5 (3184101770:3037728096:1701534235759231000:5) write lock
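The traces show each transaction holding a write lock on one TestObject while waiting for a write lock on the other, which is the signature of a lock-ordering deadlock. One common remedy is to make every code path lock shared objects in the same order. The following is a minimal plain-JDK sketch of that idea, not Spotfire Streaming API: the class OrderedLockingSketch, its Resource type, and the lowest-id-first rule are assumptions made for illustration, and how such a rule maps onto managed objects depends on the application.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustration only: consistent lock ordering removes the deadlock. */
class OrderedLockingSketch {

    /** A lockable resource with a stable id that defines the lock order. */
    static final class Resource {
        final int id;
        final ReentrantLock lock = new ReentrantLock();
        int updates;

        Resource(int id) {
            this.id = id;
        }
    }

    /** Locks both resources lowest id first, whatever the argument order. */
    static void updateBoth(Resource x, Resource y) {
        Resource first = (x.id <= y.id) ? x : y;
        Resource second = (first == x) ? y : x;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                x.updates++;
                y.updates++;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /**
     * Two threads name the resources in opposite order, as the two
     * deadlocking transactions do, and returns the update count of one
     * resource once both threads have finished.
     */
    static int runOppositeCallers(int iterations) {
        Resource r1 = new Resource(1);
        Resource r2 = new Resource(2);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                updateBoth(r1, r2);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                updateBoth(r2, r1);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return r1.updates;
    }

    public static void main(String[] args) {
        System.out.println("updates=" + runOppositeCallers(1000));
    }
}
```

Because both threads always acquire the lock with the lower id first, neither can hold one lock while waiting for a thread that holds the other, so the run completes without any timeout.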
The transaction statistic can show which classes are involved in transaction lock contention. This is often enough for a developer who already knows the application to identify changes that reduce the contention. When the code paths involved in the contention are not already known, the transactioncontention statistic can be useful.
Enabling the transactioncontention statistic causes the Spotfire Streaming runtime to collect a stack backtrace each time a transaction lock encounters contention. The stacks are saved per managed class name.
Note: The collection of transaction contention statistics is computationally very expensive and should be used only in development or test systems.
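As a rough mental model of "a stack backtrace saved per managed class name", the following plain-JDK sketch counts identical call stacks under a type name. It is not the runtime's implementation: the class ContentionReportSketch, its record method, and the report layout are hypothetical, and record is called directly here rather than from a real lock.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Illustration only: aggregates call stacks per type name. */
class ContentionReportSketch {
    // type name -> (stack text -> number of occurrences)
    private final Map<String, Map<String, Integer>> stacksByType = new TreeMap<>();

    /** Records the caller's stack against the given type name. */
    synchronized void record(String typeName) {
        String stack = Arrays.stream(new Throwable().getStackTrace())
                .skip(1) // drop the frame for record() itself
                .map(f -> f.getClassName() + "." + f.getMethodName())
                .collect(Collectors.joining("\n    "));
        stacksByType.computeIfAbsent(typeName, k -> new TreeMap<>())
                .merge(stack, 1, Integer::sum);
    }

    /** Number of distinct call paths seen for a type. */
    synchronized int distinctStacks(String typeName) {
        Map<String, Integer> stacks = stacksByType.get(typeName);
        return stacks == null ? 0 : stacks.size();
    }

    /** Total number of recorded events for a type. */
    synchronized int occurrences(String typeName) {
        Map<String, Integer> stacks = stacksByType.get(typeName);
        int total = 0;
        if (stacks != null) {
            for (int count : stacks.values()) {
                total += count;
            }
        }
        return total;
    }

    synchronized String report() {
        StringBuilder out = new StringBuilder();
        stacksByType.forEach((type, stacks) -> stacks.forEach((stack, count) ->
                out.append(count).append(" occurrences on type ").append(type)
                        .append(" of stack:\n    ").append(stack).append("\n")));
        return out.toString();
    }

    // Two different call paths that "contend" on the same type.
    static void pathOne(ContentionReportSketch r) {
        r.record("MyManaged");
    }

    static void pathTwo(ContentionReportSketch r) {
        r.record("MyManaged");
    }

    static ContentionReportSketch demo() {
        ContentionReportSketch r = new ContentionReportSketch();
        for (int i = 0; i < 3; i++) {
            pathOne(r);
        }
        for (int i = 0; i < 2; i++) {
            pathTwo(r);
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.print(demo().report());
    }
}
```

Capturing and comparing a full stack on every event is the expensive part, which is consistent with the advice to use this statistic only in development or test systems.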
To use transaction contention statistics, enable them with the epadmin enable statistics --statistics=transactioncontention command.
If your application is not already running, start it. This example uses the TransactionContention snippet shown below.
    package com.tibco.ep.dtm.snippets.tuning;

    import java.util.concurrent.TimeUnit;

    import com.kabira.platform.Transaction;
    import com.kabira.platform.annotation.Managed;

    /**
     * Simple transaction contention generator
     * <p>
     * Note this sample needs to be explicitly stopped.
     */
    public class TransactionContention {
        /**
         * Main entry point
         *
         * @param args
         *            Not used
         */
        public static void main(String[] args) {
            //
            // Create a managed object to use for
            // generating transaction lock contention
            //
            final MyManaged myManaged = createMyManaged();

            //
            // Create/start a thread which will
            // transactionally contend for the object.
            //
            new MyThread(myManaged).start();

            while (true) {
                //
                // Contend for the object here from
                // the main thread (competing
                // with the thread started above).
                //
                generateContention(myManaged);
                nap(200);
            }
        }

        private static MyManaged createMyManaged() {
            return new Transaction("createMyManaged") {
                MyManaged m_object;

                @Override
                protected void run() {
                    m_object = new MyManaged();
                }

                MyManaged create() {
                    execute();
                    return m_object;
                }
            }.create();
        }

        private static void generateContention(final MyManaged myManaged) {
            new Transaction("generateContention") {
                @Override
                protected void run() {
                    writeLockObject(myManaged);
                }
            }.execute();
        }

        @Managed
        private static class MyManaged {
        }

        private static void nap(int milliseconds) {
            try {
                TimeUnit.MILLISECONDS.sleep(milliseconds);
            } catch (InterruptedException e) {
            }
        }

        private static class MyThread extends Thread {
            MyManaged m_object;

            MyThread(MyManaged myManaged) {
                m_object = myManaged;
            }

            @Override
            public void run() {
                while (true) {
                    generateContention(m_object);
                    nap(200);
                }
            }
        }
    }
After your application has run long enough to generate some transaction lock contention, stop the data collection with the epadmin disable statistics --statistics=transactioncontention command.
Display the collected data with the epadmin display statistics --statistics=transactioncontention command.
    ======== transaction contention report for A ========

    24 occurrences on type com.kabira.snippets.tuning.TransactionContention$MyManaged of stack:

        com.kabira.platform.Transaction.lockObject(Native Method)
        com.kabira.platform.Transaction.writeLockObject(Transaction.java:706)
        com.kabira.snippets.tuning.TransactionContention$2.run(TransactionContention.java:48)
        com.kabira.platform.Transaction.execute(Transaction.java:484)
        com.kabira.platform.Transaction.execute(Transaction.java:542)
        com.kabira.snippets.tuning.TransactionContention.generateContention(TransactionContention.java:43)
        com.kabira.snippets.tuning.TransactionContention$MyThread.run(TransactionContention.java:84)

    57 occurrences on type com.kabira.snippets.tuning.TransactionContention$MyManaged of stack:

        com.kabira.platform.Transaction.lockObject(Native Method)
        com.kabira.platform.Transaction.writeLockObject(Transaction.java:706)
        com.kabira.snippets.tuning.TransactionContention$2.run(TransactionContention.java:48)
        com.kabira.platform.Transaction.execute(Transaction.java:484)
        com.kabira.platform.Transaction.execute(Transaction.java:542)
        com.kabira.snippets.tuning.TransactionContention.generateContention(TransactionContention.java:43)
        com.kabira.snippets.tuning.TransactionContention.main(TransactionContention.java:16)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:483)
        com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:483)
        com.kabira.platform.MainWrapper.invokeMain(MainWrapper.java:65)
This output shows the two call paths that experienced contention.
The collected data may be cleared with the epadmin clear statistics --statistics=transactioncontention command.
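When the enable, disable, display, and clear steps are scripted, for example from a test harness, the four command lines can be generated from the statistic name. The sketch below only assembles and prints the commands quoted in this section; the class StatisticCycleSketch is hypothetical, and any connection or targeting options your installation needs are not shown in this section and are therefore omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: builds the epadmin command lines for one statistic. */
class StatisticCycleSketch {

    /** Actions in the order in which this section uses them. */
    private static final List<String> ACTIONS =
            Arrays.asList("enable", "disable", "display", "clear");

    static List<String> commandLines(String statistic) {
        List<String> lines = new ArrayList<>();
        for (String action : ACTIONS) {
            lines.add("epadmin " + action + " statistics --statistics=" + statistic);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = commandLines("transactioncontention");
        System.out.println(lines.get(0));
        System.out.println("(run the application long enough to collect data)");
        for (String line : lines.subList(1, lines.size())) {
            System.out.println(line);
        }
    }
}
```

The same helper applies to any other statistic name that is enabled, disabled, displayed, and cleared with these commands.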
Transaction lock promotion can lead to deadlocks. The transaction statistic can show which classes are involved in transaction lock promotion. This is often enough for a developer who already knows the application to identify changes that remove the lock promotions. When the code paths involved in the promotion are not already known, the transactionpromotion statistic can be useful.
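The reason promotion can deadlock is that two transactions may each hold a read lock on the same object and then both ask for a write lock, so each waits for the other to release its read lock. The plain-JDK sketch below models this with java.util.concurrent.locks.StampedLock, whose tryConvertToWriteLock method upgrades a read lock only when no other reader is present. It is an analogy, not the Spotfire Streaming locking implementation, and the class LockPromotionSketch is made up for illustration.

```java
import java.util.concurrent.locks.StampedLock;

/** Illustration only: read-to-write promotion modelled with StampedLock. */
class LockPromotionSketch {

    /** A single reader can be promoted to writer. */
    static boolean promoteAlone() {
        StampedLock lock = new StampedLock();
        long read = lock.readLock();
        long write = lock.tryConvertToWriteLock(read);
        if (write != 0L) {
            lock.unlockWrite(write);
            return true;
        }
        lock.unlockRead(read);
        return false;
    }

    /**
     * Two readers both request promotion. Neither request can be granted
     * while the other read lock is held; if both waited instead of giving
     * up, the result would be a deadlock.
     */
    static boolean eitherPromotesWithTwoReaders() {
        StampedLock lock = new StampedLock();
        long readOne = lock.readLock();
        long readTwo = lock.readLock();
        long writeOne = lock.tryConvertToWriteLock(readOne);
        long writeTwo = lock.tryConvertToWriteLock(readTwo);
        boolean promoted = (writeOne != 0L) || (writeTwo != 0L);
        // Release whichever mode each holder ended up with.
        if (writeOne != 0L) {
            lock.unlockWrite(writeOne);
        } else {
            lock.unlockRead(readOne);
        }
        if (writeTwo != 0L) {
            lock.unlockWrite(writeTwo);
        } else {
            lock.unlockRead(readTwo);
        }
        return promoted;
    }

    /** Taking the write lock first means no promotion is ever needed. */
    static boolean writeLockUpFront() {
        StampedLock lock = new StampedLock();
        long write = lock.writeLock();
        try {
            // Read and then modify under the write lock.
            return lock.isWriteLocked();
        } finally {
            lock.unlockWrite(write);
        }
    }

    public static void main(String[] args) {
        System.out.println("alone=" + promoteAlone()
                + " twoReaders=" + eitherPromotesWithTwoReaders()
                + " upFront=" + writeLockUpFront());
    }
}
```

When a code path is known to modify an object it has just read, taking the write lock before the first read avoids the promotion altogether, at the cost of less read concurrency.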
Enabling the transactionpromotion statistic causes the Spotfire Streaming runtime to collect a stack backtrace each time a transaction lock is promoted from read to write. The stacks are saved per managed class name.
Note: The collection of transaction promotion statistics is computationally very expensive and should be used only in development or test systems.
To use transaction promotion statistics, enable them with the epadmin enable statistics --statistics=transactionpromotion command.
If your application is not already running, start it. This example uses the TransactionPromotion snippet shown below.
    package com.tibco.ep.dtm.snippets.tuning;

    import com.kabira.platform.Transaction;
    import com.kabira.platform.annotation.Managed;

    /**
     * Simple transaction promotion generator
     */
    public class TransactionPromotion {
        private static final MyManaged m_myManaged = createObject();

        /**
         * Main entry point
         *
         * @param args
         *            Not used
         */
        public static void main(String[] args) {
            new Transaction("promotion") {
                @Override
                protected void run() {
                    readLockObject(m_myManaged);
                    // Do promotion
                    writeLockObject(m_myManaged);
                }
            }.execute();
        }

        private static MyManaged createObject() {
            return new Transaction("createObject") {
                MyManaged m_object;

                @Override
                protected void run() {
                    m_object = new MyManaged();
                }

                MyManaged create() {
                    execute();
                    return m_object;
                }
            }.create();
        }

        @Managed
        private static class MyManaged {
        }
    }
After your application has run, stop the data collection with the epadmin disable statistics --statistics=transactionpromotion command.
Display the collected data with the epadmin display statistics --statistics=transactionpromotion command.
    ======== Transaction Promotion report for A ========

    Data gathered between 2015-03-20 10:27:18 PDT and 2015-03-20 10:28:04 PDT.

    1 occurrence on type com.kabira.snippets.tuning.TransactionPromotion$MyManaged of stack:

        com.kabira.platform.Transaction.lockObject(Native Method)
        com.kabira.platform.Transaction.writeLockObject(Transaction.java:706)
        com.kabira.snippets.tuning.TransactionPromotion$1.run(TransactionPromotion.java:29)
        com.kabira.platform.Transaction.execute(Transaction.java:484)
        com.kabira.platform.Transaction.execute(Transaction.java:542)
        com.kabira.snippets.tuning.TransactionPromotion.main(TransactionPromotion.java:22)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:483)
        com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
        sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:483)
        com.kabira.platform.MainWrapper.invokeMain(MainWrapper.java:65)
This output shows the single call path where the promotion occurred.
The collected data may be cleared with the epadmin clear statistics --statistics=transactionpromotion command.