When ActiveSpaces® Transactions detects a deadlock a detailed trace is sent to the log files showing which resource deadlocked, which transactions were involved in the deadlock, which resources they had locked, and which resource they were blocked waiting for. Additionally a stack trace is logged showing where in the application the deadlock occurred.
The program below will generate a single transaction lock ordering deadlock between two threads, running in a single JVM, in a single node.
// $Revision: 1.1.2.1 $ package com.kabira.snippets.tuning; import com.kabira.platform.Transaction; import com.kabira.platform.annotation.Managed; /** * Deadlock Example from the ActiveSpaces Transactions Tuning Guide. * <p> * <h2> Target Nodes</h2> * <ul> * <li> <b>domainnode</b> = A * </ul> */ public class Deadlock { static MyManagedObject object1; static MyManagedObject object2; public static void main(String[] args) throws InterruptedException { // // Create a pair of Managed objects. // new Transaction("Create Objects") { @Override public void run() { object1 = new MyManagedObject(); object2 = new MyManagedObject(); } }.execute(); // // Create a pair of transaction classes to lock them. // Giving the object parameters in reverse order will // cause two different locking orders, resulting in a deadlock. // Deadlocker deadlocker1 = new Deadlocker(object1, object2); Deadlocker deadlocker2 = new Deadlocker(object2, object1); // // Run them in separate threads until a deadlock is seen. // while ((deadlocker1.getNumberDeadlocks() == 0) && (deadlocker2.getNumberDeadlocks() == 0)) { MyThread thread1 = new MyThread(deadlocker1); MyThread thread2 = new MyThread(deadlocker2); thread1.start(); thread2.start(); thread1.join(); thread2.join(); } } @Managed static class MyManagedObject { int value; } static class MyThread extends Thread { private Deadlocker m_deadlocker; MyThread(Deadlocker deadlocker) { m_deadlocker = deadlocker; } @Override public void run() { m_deadlocker.execute(); } } static class Deadlocker extends Transaction { private final MyManagedObject m_object1; private final MyManagedObject m_object2; Deadlocker(MyManagedObject object1, MyManagedObject object2) { m_object1 = object1; m_object2 = object2; } @Override public void run() { // // This will take a transaction read lock on the first object. // int value = m_object1.value; // // Wait a while to maximize the possibility of contention. // blockForAMoment(); // // This will take a transaction write lock on the second object. // m_object2.value = 42; // // Wait a while to maximize the possibility of contention. // blockForAMoment(); } private void blockForAMoment() { try { Thread.sleep(500); } catch (InterruptedException ex) { } } } }
The log file for the A node will contain a trace similar to the following:
2010-05-19 10:58:36.190955|OBJ|WARN |19821|osproxy.cpp(857)|deadlock detected in transaction id 156:2 Transaction 156:2 deadlocked attempting to write lock com.kabira.snippets.tuning.Deadlock$MyManagedObject:1 (890258:6145144:112456:1 offset 64534360) locks read { 155:2 } which is held by transaction id 155:2 Callstack for transaction 156:2: deadlock on com.kabira.snippets.tuning.Deadlock$MyManagedObject:1 Objects currently locked in transaction id 156:2 com.kabira.snippets.tuning.Deadlock$MyManagedObject:2 (890258:6145144:112456:2 offset 64099448) read lock Transaction 155:2 is blocked waiting for a write lock on com.kabira.snippets.tuning.Deadlock$MyManagedObject:2 (890258:6145144:112456:2 offset 64099448) locks read { 156:2 } which is held by transaction 156:2 Objects currently locked in transaction 155:2 com.kabira.snippets.tuning.Deadlock$MyManagedObject:1 (890258:6145144:112456:1 offset 64534360) read lock 2010-05-19 10:58:36.192736|JAV|WARN |19821|native_tran.cpp(373)|Deadlock detected in Java thread: com.kabira.ktvm.transaction.DeadlockError: at com.kabira.platform.ManagedObject.setInteger(Native Method) at com.kabira.snippets.tuning.Deadlock$Deadlocker.run(Deadlock.java:95) at com.kabira.platform.Transaction.execute(Transaction.java:310) at com.kabira.snippets.tuning.Deadlock$MyThread.run(Deadlock.java:65)
Looking at this section by section we see:
2010-05-19 10:58:36.190955|OBJ|WARN |19821|osproxy.cpp(857)|deadlock detected in transaction id 156:2
This line is the beginning of the deadlock trace, and shows which transaction id (156:2) detected the deadlock. The transaction which detects the deadlock rolls back and retries.
Transaction 156:2 deadlocked attempting to write lock com.kabira.snippets.tuning.Deadlock$MyManagedObject:1 (890258:6145144:112456:1 offset 64534360) locks read { 155:2 } which is held by transaction id 155:2 Callstack for transaction 156:2: deadlock on com.kabira.snippets.tuning.Deadlock$MyManagedObject:1 Objects currently locked in transaction id 156:2 com.kabira.snippets.tuning.Deadlock$MyManagedObject:2 (890258:6145144:112456:2 offset 64099448) read lock
This section shows the transaction which detected the deadlock. What type of lock it was trying to take, which resource it was trying to lock, what other transaction is holding a lock which caused contributed to this deadlock, all transaction locks that it was already holding at the time.
Transaction 155:2 is blocked waiting for a write lock on com.kabira.snippets.tuning.Deadlock$MyManagedObject:2 (890258:6145144:112456:2 offset 64099448) locks read { 156:2 } which is held by transaction 156:2 Objects currently locked in transaction 155:2 com.kabira.snippets.tuning.Deadlock$MyManagedObject:1 (890258:6145144:112456:1 offset 64534360) read lock
This section shows the other involved transaction. What type of lock it is blocked waiting to take, which resource it is trying to lock, and what other transaction locks it is currently holding.
2010-05-19 10:58:36.192736|JAV|WARN |19821|native_tran.cpp(373)|Deadlock detected in Java thread: com.kabira.ktvm.transaction.DeadlockError: at com.kabira.platform.ManagedObject.setInteger(Native Method) at com.kabira.snippets.tuning.Deadlock$Deadlocker.run(Deadlock.java:95) at com.kabira.platform.Transaction.execute(Transaction.java:310) at com.kabira.snippets.tuning.Deadlock$MyThread.run(Deadlock.java:65)
This section shows the call stack of the transaction which detected the deadlock.
Lock promotion is when a transaction currently holding a read lock on an object attempts to acquire a write lock on the same object (i.e. Promoting the read lock to a write lock). If blocking for this write lock would result in deadlock, it is called a promotion deadlock.
The program below will generate a single promotion deadlock between two threads, running in a single JVM, in a single node.
// $Revision: 1.1.2.1 $ package com.kabira.snippets.tuning; import com.kabira.platform.Transaction; import com.kabira.platform.annotation.Managed; /** * Promotion deadlock Example from the ActiveSpaces Transactions Tuning Guide. * <p> * <h2> Target Nodes</h2> * <ul> * <li> <b>domainnode</b> = A * </ul> */ public class PromotionDeadlock { static MyManagedObject targetObject; public static void main(String[] args) throws InterruptedException { // // Create a Managed objects. // new Transaction("Create Objects") { @Override public void run() { targetObject = new MyManagedObject(); } }.execute(); // // Create a pair of transaction classes that will both // promote lock the Managed object, resulting in a // promotion deadlock. // Deadlocker deadlocker1 = new Deadlocker(targetObject); Deadlocker deadlocker2 = new Deadlocker(targetObject); // // Run them in separate threads until a deadlock is seen. // while ((deadlocker1.getNumberDeadlocks() == 0) && (deadlocker2.getNumberDeadlocks() == 0)) { MyThread thread1 = new MyThread(deadlocker1); MyThread thread2 = new MyThread(deadlocker2); thread1.start(); thread2.start(); thread1.join(); thread2.join(); } } @Managed static class MyManagedObject { int value; } static class MyThread extends Thread { private Deadlocker m_deadlocker; MyThread(Deadlocker deadlocker) { m_deadlocker = deadlocker; } @Override public void run() { m_deadlocker.execute(); } } static class Deadlocker extends Transaction { private final MyManagedObject m_targetObject; Deadlocker(MyManagedObject targetObject) { m_targetObject = targetObject; } @Override public void run() { // // This will take a transaction read lock on the object. // int value = m_targetObject.value; // // Wait a while to maximize the possibility of contention. // blockForAMoment(); // // This will take a transaction write lock on the object // (promoting the read lock). // m_targetObject.value = 42; // // Wait a while to maximize the possibility of contention. // blockForAMoment(); } private void blockForAMoment() { try { Thread.sleep(500); } catch (InterruptedException ex) { } } } }
The trace messages are similar to those show in the previous section for a lock order deadlock, with the difference being that promotion deadlock will be mentioned:
06-01 08:45:57|runtime::Events::TraceWarning|5391|103:177|Warning|promotion deadlock detected in transaction id 130:2 Transaction 130:2 deadlocked attempting to promote (write lock) com.kabira.snippets.tuning.PromotionDeadlock$MyManagedObject:1 (890258:9384688:2001:1 offset 64698248) locks read { 133:2, 130:2 }, promote waiter { 133:2 } which is held by transaction id 133:2 Callstack for transaction 130:2: promotion deadlock on com.kabira.snippets.tuning.PromotionDeadlock$MyManagedObject:1 Objects currently locked in transaction id 130:2 com.kabira.snippets.tuning.PromotionDeadlock$MyManagedObject:1 (890258:9384688:2001:1 offset 64698248) read lock Transaction 133:2 is blocked waiting for a promote lock on com.kabira.snippets.tuning.PromotionDeadlock$MyManagedObject:1 (890258:9384688:2001:1 offset 64698248) locks read { 133:2, 130:2 }, promote waiter { 133:2 } which is held by transaction 130:2 Objects currently locked in transaction 133:2 com.kabira.snippets.tuning.PromotionDeadlock$MyManagedObject:1 (890258:9384688:2001:1 offset 64698248) read lock 06-01 08:45:57|runtime::Events::TraceWarning|5391|103:179|Warning|Deadlock detected in Java thread: com.kabira.ktvm.transaction.DeadlockError: at com.kabira.platform.ManagedObject.setInteger(Native Method) at com.kabira.snippets.tuning.PromotionDeadlock$Deadlocker.run(PromotionDeadlock.java:92) at com.kabira.platform.Transaction.execute(Transaction.java:309) at com.kabira.snippets.tuning.PromotionDeadlock$MyThread.run(PromotionDeadlock.java:63)
The previous examples showed simple deadlocks, occurring between two transactions. More complex deadlocks are possible involving more than two transactions. For example, transaction 1 deadlocks trying to acquire a lock on an object held by transaction 2 who is blocked waiting on a object held by transaction 3.
To aid in analyzing complex deadlocks the following will be found in the trace messages:
For each contended object, a display of the locks is included, including any promotion waiters.
If the runtime detects that a deadlock happens due to a read lock being blocked, it includes the transaction blocked waiting for the promotion.
Single node deadlocks are bad for performance because they are a source of contention, leading to lower throughput, higher latency and higher CPU cost. But the deadlocks are detected immediately, because each node has a built in transaction lock manager.
Distributed deadlocks are extremely bad for performance because they use a timeout mechanism for deadlock detection. The default setting for this timeout is 60 seconds in a production build.
The program below will generate a distributed transaction lock ordering deadlock between two transactions running across multiple nodes.
// $Revision: 1.1.2.5 $ package com.kabira.snippets.tuning; import com.kabira.platform.Transaction; import com.kabira.platform.annotation.Managed; import com.kabira.platform.highavailability.PartitionManager; import com.kabira.platform.highavailability.PartitionManager.EnableAction; import com.kabira.platform.highavailability.PartitionMapper; import com.kabira.platform.highavailability.ReplicaNode; import static com.kabira.platform.highavailability.ReplicaNode.ReplicationType.*; import com.kabira.platform.property.Status; /** * Distributed deadlock example from the ActiveSpaces Transactions Tuning Guide * <p> * <h2> Target Nodes</h2> * <ul> * <li> <b>domainname</b> = Development * </ul> * Note this sample blocks on B and C nodes, * and needs to be explicitly stopped. */ public class DistributedDeadlock { private static TestObject object1; private static TestObject object2; static final String nodeName = System.getProperty(Status.NODE_NAME); public static void main(String[] args) throws InterruptedException { // // Block all but the A node. // new NodeChecker().blockAllButA(); // // Define the partitions to be used by this snippet // new PartitionCreator().createPartitions(); // // Create a pair of objects, one active on node B, // and the other active on node C. // new Transaction("Create Objects") { @Override public void run() { object1 = new TestObject(); object2 = new TestObject(); // // For each distributed object, assign it a // reference to the other. // object1.otherObject = object2; object2.otherObject = object1; } }.execute(); // // Create a pair of objects, one active on node B, // and the other active on node C. // new Transaction("Spawn Deadlockers") { @Override public void run() { // // Ask them each to spawn a Deadlocker thread. // This should execute on node B for one of them // and node C for the other. // object1.spawnDeadlocker(); object2.spawnDeadlocker(); } }.execute(); // // Now block main in the A node to keep the JVM from exiting. // new NodeChecker().block(); } private static class PartitionCreator { void createPartitions() { new Transaction("Partition Definition") { @Override protected void run() throws Rollback { // // Set up the node lists - notice that the odd node list // has node B as the active node, while the even // node list has node C as the active node. // ReplicaNode [] evenReplicaList = new ReplicaNode [] { new ReplicaNode("C", SYNCHRONOUS), new ReplicaNode("A", SYNCHRONOUS) }; ReplicaNode [] oddReplicaList = new ReplicaNode [] { new ReplicaNode("B", SYNCHRONOUS), new ReplicaNode("A", SYNCHRONOUS) }; // // Define two partitions // PartitionManager.definePartition("Even", null, "B", evenReplicaList); PartitionManager.definePartition("Odd", null, "C", oddReplicaList); // // Enable the partitions // PartitionManager.enablePartitions( EnableAction.JOIN_CLUSTER_PURGE); // // Install the partition mapper // PartitionManager.setMapper( TestObject.class, new AssignPartitions()); } }.execute(); } } // // Partition mapper that maps objects to either Even or Odd // private static class AssignPartitions extends PartitionMapper { @Override public String getPartition(Object obj) { this.m_count++; String partition = "Even"; if ((this.m_count % 2) == 1) { partition = "Odd"; } return partition; } private Integer m_count = 0; } @Managed private static class TestObject { TestObject otherObject; String data; public void lockObjects() { Transaction.setTransactionDescription("locking first object"); this.doWork(); // // Delay longer on the B node to try to force the deadlock // to occur on the C. Otherwise, both sides could see // deadlocks at the same time, making the log files less clear // for this snippet. // if (nodeName.equals("B")) { block(10000); } else { block(500); } Transaction.setTransactionDescription("locking second object"); otherObject.doWork(); block(500); } public void spawnDeadlocker() { new DeadlockThread(this).start(); } private void block(int milliseconds) { try { Thread.sleep(milliseconds); } catch (InterruptedException ex) { } } private void doWork() { data = "work"; } } private static class DeadlockThread extends Thread { private Transaction m_deadlockTransaction; DeadlockThread(TestObject object) { m_deadlockTransaction = new DeadlockTransaction("DeadlockThread", object); } @Override public void run() { while (true) { if (m_deadlockTransaction.execute() == Transaction.Result.ROLLBACK) { return; } } } } private static class DeadlockTransaction extends Transaction { private final TestObject m_object; DeadlockTransaction(final String name, TestObject object) { super(name); m_object = object; } @Override public void run() throws Rollback { if (getNumberDeadlocks() != 0) { System.out.println("A deadlock has been seen, " + "you may now stop the distributed application"); throw new Transaction.Rollback(); } m_object.lockObjects(); } } private static class NodeChecker { // // If we are not the A node, block here forever // void blockAllButA() { while (!nodeName.equals("A")) { block(); } } public void block() { while (true) { try { Thread.sleep(500); } catch (InterruptedException ex) { } } } } }
The program should produce a deadlock that is processed on node C, and found in the node C application log file, looking similar to:
2014-01-30 12:45:05.462557|ENG|WARN |5188|engine.cpp(2844)|com.kabira.ktvm.transaction.DeadlockError: Global transaction serializable:67:142:1:396494485533021 deadlock processed on node C Objects locked in local transaction 'DeadlockThread'[142:1, tid 5188, locking second object] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 (890258:8376920:396485028116479:63) write lock Blocked transactions on local node: Global transaction id: serializable:66:139:1:396489640676425 Transaction [139:3, tid 5153] is blocked waiting for a write lock on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 (890258:8376920:396485028116479:63) locks write { 'DeadlockThread'[142:1, tid 5188, locking second object] } Callstack for transaction 139:3: dispatch calling [java dispatch] on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 Objects currently locked in transaction [139:3, tid 5153] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 (890258:8376920:396485028116479:25) write lock ===== Start deadlock report for remote node B ===== com.kabira.ktvm.transaction.DeadlockError: distributed deadlock detected in transaction id [138:2, tid 5151] [engine application::com_kabira_snippets_tuning_DistributedDeadlock2] Transaction [138:2, tid 5151] deadlocked attempting to write lock com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 (890258:8376920:396485028116479:25) locks write { 'DeadlockThread'[139:1, tid 5189, locking second object] } Callstack for transaction 138:2: dispatch calling [java dispatch] on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 distributed deadlock on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 Objects currently locked in transaction id [138:2, tid 5151] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 (890258:8376920:396485028116479:63) write lock Object is write locked in transaction 'DeadlockThread'[139:1, tid 5189, locking second object] Objects currently locked in transaction 'DeadlockThread'[139:1, tid 5189, locking second object] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 (890258:8376920:396485028116479:25) write lock Object has a total of 1 transactions waiting for a write lock at com.kabira.platform.ManagedObject.setReference(Native Method) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl(DistributedDeadlock.java:206) ===== End deadlock report for remote node B ===== at com.kabira.platform.ManagedObject._sendTwoWay(Native Method) at com.kabira.platform.ManagedObject.sendTwoWay(ManagedObject.java:655) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.doWork(DistributedDeadlock.java) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.$lockObjectsImpl(DistributedDeadlock.java:183) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.lockObjects(DistributedDeadlock.java) at com.kabira.snippets.tuning.DistributedDeadlock$DeadlockTransaction.run(DistributedDeadlock.java:255) at com.kabira.platform.Transaction.execute(Transaction.java:484) at com.kabira.platform.Transaction.execute(Transaction.java:542)
Looking section by section through the B log file we see
2014-01-30 12:45:05.462557|ENG|WARN |5188|engine.cpp(2844)|com.kabira.ktvm.transaction.DeadlockError: Global transaction serializable:67:142:1:396494485533021 deadlock processed on node C Objects locked in local transaction 'DeadlockThread'[142:1, tid 5188, locking second object] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 (890258:8376920:396485028116479:63) write lock
This shows the global transaction in which the deadlock timeout occurred, including the name of the transaction (if set), the transaction identifier, and the current current setting of the transaction desription (if set).
It then shows a list of objects already locked within the transaction.
Blocked transactions on local node: Global transaction id: serializable:66:139:1:396489640676425 Transaction [139:3, tid 5153] is blocked waiting for a write lock on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 (890258:8376920:396485028116479:63) locks write { 'DeadlockThread'[142:1, tid 5188, locking second object] } Callstack for transaction 139:3: dispatch calling [java dispatch] on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 Objects currently locked in transaction [139:3, tid 5153] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 (890258:8376920:396485028116479:25) write lock
Then a list of local transactions that are blocked on a lock(s) held by the current local transaction is shown. Here, we see that transaction 139:3 is blocked waiting for a write lock on TestObject:63, which is held by the deadlocking transaction, 142:1.
===== Start deadlock report for remote node B ===== com.kabira.ktvm.transaction.DeadlockError: distributed deadlock detected in transaction id [138:2, tid 5151] [engine application::com_kabira_snippets_tuning_DistributedDeadlock2] Transaction [138:2, tid 5151] deadlocked attempting to write lock com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 (890258:8376920:396485028116479:25) locks write { 'DeadlockThread'[139:1, tid 5189, locking second object] } Callstack for transaction 138:2: dispatch calling [java dispatch] on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 distributed deadlock on com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 Objects currently locked in transaction id [138:2, tid 5151] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:63 (890258:8376920:396485028116479:63) write lock Object is write locked in transaction 'DeadlockThread'[139:1, tid 5189, locking second object] Objects currently locked in transaction 'DeadlockThread'[139:1, tid 5189, locking second object] com.kabira.snippets.tuning.DistributedDeadlock$TestObject:25 (890258:8376920:396485028116479:25) write lock Object has a total of 1 transactions waiting for a write lock at com.kabira.platform.ManagedObject.setReference(Native Method) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl(DistributedDeadlock.java:206) ===== End deadlock report for remote node B =====
Next we see a description of the local transaction on the node where the deadlock occurred. In this case, node B, where local transaction 138:2 deadlocked trying to acquire a write lock on TestObject:25, which was held by transaction 139:1
at com.kabira.platform.ManagedObject._sendTwoWay(Native Method) at com.kabira.platform.ManagedObject.sendTwoWay(ManagedObject.java:655) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.doWork(DistributedDeadlock.java) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.$lockObjectsImpl(DistributedDeadlock.java:183) at com.kabira.snippets.tuning.DistributedDeadlock$TestObject.lockObjects(DistributedDeadlock.java) at com.kabira.snippets.tuning.DistributedDeadlock$DeadlockTransaction.run(DistributedDeadlock.java:255) at com.kabira.platform.Transaction.execute(Transaction.java:484) at com.kabira.platform.Transaction.execute(Transaction.java:542)
The last section shows a stack backtrace, including source file names, and line numbers, of where the originating distributed transaction deadlock occurred.