Any communication failure to a remote node that is detected during a global transaction before the commit sequence starts causes an exception that the application can handle (see the TIBCO ActiveSpaces® Transactions Java Developer's Guide). This allows the application to decide explicitly whether to commit or roll back the current transaction. If the exception is not caught, the transaction is automatically rolled back.
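The decision described above can be sketched as follows. This is only an illustration of the control flow, not the product API: `RemoteNodeUnavailableException`, `Decision`, and `runWithFallback` are hypothetical names defined in the example itself, and the actual exception type and transaction API are described in the Java Developer's Guide.

```java
final class CommFailureSketch {
    /** Hypothetical stand-in for the exception raised on a detected communication failure. */
    static class RemoteNodeUnavailableException extends RuntimeException {
        RemoteNodeUnavailableException(String message) {
            super(message);
        }
    }

    /** The outcome the application chooses for the current transaction. */
    enum Decision { COMMIT, ROLLBACK }

    /**
     * Runs work that touches a remote node. If the failure is caught and a local
     * fallback exists, the application may still commit; otherwise it rolls back.
     * (Hypothetical helper; an uncaught exception would roll back automatically.)
     */
    static Decision runWithFallback(Runnable remoteWork, Runnable localFallback) {
        try {
            remoteWork.run();
            return Decision.COMMIT;
        } catch (RemoteNodeUnavailableException e) {
            if (localFallback == null) {
                return Decision.ROLLBACK;
            }
            localFallback.run();
            return Decision.COMMIT;
        }
    }

    public static void main(String[] args) {
        Decision d = runWithFallback(
            () -> { throw new RemoteNodeUnavailableException("Node 2 unreachable"); },
            null);
        System.out.println(d);
    }
}
```

Whether to fall back and commit or to roll back is an application-level choice; the sketch only shows that catching the exception is what makes that choice available.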
Undetected communication failures to remote nodes do not impact the commit of the transaction. This failure scenario is shown in Figure 6.5, “Undetected communication failure”. In this case, Node 2 failed and was restarted after all locks were taken on Node 2, but before the commit sequence was started by the transaction initiator, Node 1. Once the commit sequence starts, it continues to completion. The request to commit is ignored on Node 2 because the transaction state was lost when Node 2 restarted.
Transaction initiator node failures are handled transparently using a transaction outcome voting algorithm. There are two cases that must be handled:
Transaction initiator fails before commit sequence starts.
Transaction initiator fails during the commit sequence.
When a node that is participating in a distributed transaction detects the failure of a transaction initiator, it queries all other nodes for the outcome of the transaction. If the transaction was committed on any other participating nodes, the transaction is committed on the node that detected the node failure. If the transaction was aborted on any other participating nodes, the transaction is aborted on the node that detected the failure. If the transaction is still in progress on the other participating nodes, the transaction is aborted on the node that detected the failure.
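The voting rules in the preceding paragraph can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the product's implementation: the names `OutcomeVoting`, `PeerStatus`, `Resolution`, and `resolve` are hypothetical, and the committed case is checked first on the assumption that a correct commit protocol never reports both a commit and an abort for the same transaction.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class OutcomeVoting {
    /** What another participating node reports for the transaction (hypothetical type). */
    enum PeerStatus { COMMITTED, ABORTED, IN_PROGRESS }

    /** What the node that detected the initiator failure does locally. */
    enum Resolution { COMMIT, ABORT }

    /**
     * Applies the voting rules to the answers from the other participating nodes:
     * commit if any peer committed; otherwise abort (a peer aborted, all peers are
     * still in progress, or there are no other participants to ask).
     */
    static Resolution resolve(List<PeerStatus> peerStatuses) {
        if (peerStatuses.contains(PeerStatus.COMMITTED)) {
            return Resolution.COMMIT;
        }
        return Resolution.ABORT;
    }

    public static void main(String[] args) {
        // One peer already committed, so the detecting node commits as well.
        System.out.println(resolve(Arrays.asList(PeerStatus.IN_PROGRESS, PeerStatus.COMMITTED)));
        // No other participants: the transaction is aborted.
        System.out.println(resolve(Collections.<PeerStatus>emptyList()));
    }
}
```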
Transaction outcome voting before the commit sequence is shown in Figure 6.6, “Transaction initiator fails prior to initiating commit sequence”.
In Figure 6.6, “Transaction initiator fails prior to initiating commit sequence”, the initiating node, Node 1, fails before initiating the commit sequence. When Node 2 detects the failure, it performs the transaction outcome voting algorithm by querying the other nodes in the cluster to see if they are participating in this transaction. Since there are no other nodes in this cluster, the Transaction Status request is a no-op and the transaction is immediately aborted on Node 2, releasing all locks held by the distributed transaction.
Transaction outcome voting during a commit sequence is shown in Figure 6.7, “Transaction initiator fails during commit sequence”.
In Figure 6.7, “Transaction initiator fails during commit sequence”, the initiating node, Node 1, fails during the commit sequence after committing the transaction on Node 2, but before it is committed on Node 3. When Node 3 detects the failure, it performs the transaction outcome voting algorithm by querying Node 2 for the resolution of the global transaction. Since the transaction was committed on Node 2, it is committed on Node 3.
To support transaction outcome voting, each node maintains a history of all committed and aborted transactions for each remote node participating in a global transaction. The number of historical transactions to maintain is configurable and should be based on the duration of the longest-running distributed transaction. For example, if 1,000 transactions per second are being processed from a remote node, and the longest transaction is on average ten times longer than the mean, the transaction history buffer should be configured for 10,000 transactions.
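The arithmetic in the example above can be reproduced as follows. The sizing rule used here (transaction rate multiplied by the ratio of the longest to the mean transaction duration) is an assumption inferred from the numbers in the example, and `HistorySizing` and `historyEntries` are hypothetical names, not product configuration parameters.

```java
final class HistorySizing {
    /**
     * Estimates the number of history entries to keep for one remote node.
     * Assumed rule: transactions per second times the longest-to-mean duration ratio.
     */
    static long historyEntries(long transactionsPerSecond, long longestToMeanRatio) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(transactionsPerSecond, longestToMeanRatio);
    }

    public static void main(String[] args) {
        // 1,000 transactions per second, longest transaction ten times the mean.
        System.out.println(historyEntries(1_000, 10)); // prints 10000
    }
}
```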
For each transaction from each remote node, the following is captured:
global transaction identifier
node login time-stamp
transaction resolution
The size of each transaction history record is 24 bytes.
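A bounded history for a single remote node might be sketched like this, to show why the configured size matters: once a record is evicted, a later outcome query for that transaction can no longer be answered from the history. The class and method names, the field types, and the oldest-first eviction policy are all assumptions made for illustration. Only the three captured fields and the 24-byte record size come from the text, and the memory estimate uses that documented size rather than the size of a Java object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TransactionHistory {
    enum Resolution { COMMITTED, ABORTED }

    /** The three fields captured per transaction; the field types are assumptions. */
    static final class HistoryRecord {
        final long globalTransactionId;
        final long nodeLoginTimestamp;
        final Resolution resolution;

        HistoryRecord(long globalTransactionId, long nodeLoginTimestamp, Resolution resolution) {
            this.globalTransactionId = globalTransactionId;
            this.nodeLoginTimestamp = nodeLoginTimestamp;
            this.resolution = resolution;
        }
    }

    /** Record size stated in the text. */
    static final int RECORD_BYTES = 24;

    private final int capacity;
    private final Map<Long, HistoryRecord> records;

    TransactionHistory(int capacity) {
        this.capacity = capacity;
        // Insertion-ordered map that drops the oldest record once capacity is exceeded.
        this.records = new LinkedHashMap<Long, HistoryRecord>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, HistoryRecord> eldest) {
                return size() > TransactionHistory.this.capacity;
            }
        };
    }

    void record(long globalTransactionId, long nodeLoginTimestamp, Resolution resolution) {
        records.put(globalTransactionId,
            new HistoryRecord(globalTransactionId, nodeLoginTimestamp, resolution));
    }

    /** Returns the stored record, or null if it was never recorded or has been evicted. */
    HistoryRecord lookup(long globalTransactionId) {
        return records.get(globalTransactionId);
    }

    /** History size at full capacity, based on the documented 24-byte record. */
    long estimatedBytes() {
        return (long) capacity * RECORD_BYTES;
    }

    public static void main(String[] args) {
        // 10,000 records of 24 bytes each for one remote node.
        System.out.println(new TransactionHistory(10_000).estimatedBytes()); // prints 240000
    }
}
```

Because the text describes a history for each remote node, one such buffer per remote node is assumed here.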