Fault Tolerance

Fault Tolerance allows multiple adapter instances to substitute for each other. When the primary adapter instance terminates unexpectedly, an adapter instance in the standby state can take over the token that the primary instance held. During this takeover, the standby adapter instance is promoted to primary adapter instance.

Note 

When a standby adapter instance becomes a primary adapter instance, it keeps its own instance ID. It does not take over the instance ID of the original primary adapter instance that terminated unexpectedly.
When running a JMS topic as durable, durable names exist on an EMS server for each receiver, regardless of whether the adapter instance is primary or standby.
To detect broken connections more quickly, you can add the client_heartbeat_server=3 property to the tibemsd.conf files of all the primary servers and standby servers.
For multiple primary instances, you must enable load balancing for these instances and set the adb.noDupDetection property to on before you enable Fault Tolerance.
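
As a rough illustration of the two settings mentioned in this note, the fragments below show how the entries might look. The values come from the text above; placing adb.noDupDetection in adbagent.tra and writing it in the space-separated form is an assumption based on how adb.pubAutoN is shown later in this section.

```
# tibemsd.conf (repeat on every primary and standby EMS server)
client_heartbeat_server = 3

# adbagent.tra (assumed location; only needed for multiple primary instances)
adb.noDupDetection on
```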

Fault Tolerance is based on a JMS queue. Before enabling Fault Tolerance, you must define a JMS queue, set the prefetch parameter of the queue to none, and then put JMS messages in the queue to serve as tokens. The number of tokens corresponds to the number of primary adapter instances.
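
A minimal sketch of defining such a queue with the EMS administration tool is shown below. The queue name adb.ft.tokens is hypothetical, and the exact command form should be checked against the EMS documentation for your release. The token messages themselves still have to be sent to the queue separately, for example with a small JMS client.

```
# In tibemsadmin, connected to the EMS server (queue name is hypothetical)
create queue adb.ft.tokens prefetch=none
```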

The following diagram shows how Fault Tolerance works. At first, instance 1 and instance 2 each fetch one of the two tokens in the JMS queue. They hold the tokens and process messages as primary instances. Instance 3 finds no token to fetch and runs in the standby state. If instance 2 terminates unexpectedly, the token it fetched is released. Instance 3 fetches the released token and continues to process messages as a primary instance.

Figure 204: Fault Tolerance
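
The token handoff in the figure can be approximated with a small simulation. This is only an illustrative model, not adapter code: the TokenQueue and AdapterInstance classes and the token names are hypothetical, and an in-memory queue stands in for the JMS queue.

```python
from collections import deque


class TokenQueue:
    """In-memory stand-in for the JMS queue that holds the tokens."""

    def __init__(self, token_count):
        self._tokens = deque(f"token-{i}" for i in range(1, token_count + 1))

    def fetch(self):
        # Return the next available token, or None if the queue is empty.
        return self._tokens.popleft() if self._tokens else None

    def release(self, token):
        # A token that is no longer held becomes available again.
        self._tokens.append(token)


class AdapterInstance:
    """Hypothetical model of an adapter instance (not the real adapter)."""

    def __init__(self, name):
        self.name = name
        self.token = None
        self.running = True

    @property
    def state(self):
        if not self.running:
            return "terminated"
        return "primary" if self.token is not None else "standby"

    def try_promote(self, queue):
        # A running instance without a token tries to fetch one.
        if self.running and self.token is None:
            self.token = queue.fetch()
        return self.state

    def terminate(self, queue):
        # Unexpected termination puts the held token back in the queue.
        self.running = False
        if self.token is not None:
            queue.release(self.token)
            self.token = None


# Two tokens, three instances: instances 1 and 2 become primary.
queue = TokenQueue(token_count=2)
instances = [AdapterInstance(f"instance {i}") for i in (1, 2, 3)]
for inst in instances:
    inst.try_promote(queue)
states_before = {inst.name: inst.state for inst in instances}

# Instance 2 fails; instance 3 picks up the released token.
instances[1].terminate(queue)
instances[2].try_promote(queue)
states_after = {inst.name: inst.state for inst in instances}

print(states_before)
print(states_after)
```

In this model the number of tokens caps the number of primary instances, which mirrors why the number of tokens in the queue corresponds to the number of primary adapter instances.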

Enabling Fault Tolerance

To enable the Fault Tolerance feature, set the tibco.sdk.faultTolerance.ems.enabled property to on in the adbagent.tra file, and set the SDK fault tolerance properties accordingly. For details about the SDK fault tolerance properties, see Predefined Properties in TIBCO ActiveMatrix Adapter for Database.
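
For illustration, the enabling entry in adbagent.tra might look like the fragment below. The space-separated form is an assumption based on how adb.pubAutoN is written elsewhere in this section. The remaining SDK fault tolerance properties are not shown here because their names and values are listed only in Predefined Properties.

```
# adbagent.tra
tibco.sdk.faultTolerance.ems.enabled on
```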

Note 

When Fault Tolerance is enabled, the following restrictions and behaviors apply:

If the number of tokens is more than 1, an exception is thrown in any of the following conditions:
The transport type is RVCM. RVCM does not support multiple primary instances running simultaneously, because the CM name for each instance must be unique.
The transport type is JMS, and the primary instance and standby instance have the same client ID.
The delivery mode is Durable in Subscription Service or Request-Response Service.
If a primary EMS server switches to the standby state, all primary adapter instances that fetch tokens from the primary EMS server restart.
After primary/standby switchover and switchback among the adapter instances, the delivery status of some records in the publishing table remains in an intermediate state, either S or P. The S state indicates that an entry has been marked and is waiting to be processed by an adapter instance or a publication handler thread. The P state indicates that an entry has been processed and published, but no message confirmation has been received. To resolve this issue, set the adb.pubAutoN property in the adbagent.tra file, for example adb.pubAutoN P, adb.pubAutoN S, or adb.pubAutoN PS. When this property is set, Publication Service changes the intermediate state to N when the service starts. For details about the adb.pubAutoN property, see TIBCO ActiveMatrix Adapter for Database Properties.
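
As a short example of the last point, the fragment below uses the PS value from the text above. Reading PS as covering both the P and the S state is an assumption; confirm the exact meaning of each value in the properties reference.

```
# adbagent.tra
adb.pubAutoN PS
```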