Configuring Clients for Fault-Tolerant Connections
When a backup server assumes the role of the primary server during failover, clients attempt to reconnect to the backup server (that is, the new primary). To enable a client to reconnect, you must specify the URLs of both servers when creating a connection.
Specify multiple servers as a comma-separated list of URLs. All URLs in the list must use the same protocol (either tcp or ssl). For example, to identify the first server as tcp://server0:7222 and the second server as tcp://server1:7344 (used if the first server is not available), specify:
   serverUrl=tcp://server0:7222, tcp://server1:7344
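The same-protocol rule can be checked in application code before the URL list is handed to the client. The following is a minimal sketch only: ServerUrlList and parse() are hypothetical names, not part of the EMS client API, and the sketch assumes that whitespace around each entry is insignificant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the EMS client API.
final class ServerUrlList {

    // Splits a comma-separated URL list, trims whitespace around each entry
    // (an assumption), and verifies that every URL uses the same protocol,
    // which must be either tcp or ssl.
    static List<String> parse(String serverUrl) {
        List<String> urls = new ArrayList<>();
        String protocol = null;
        for (String part : serverUrl.split(",")) {
            String url = part.trim();
            if (url.isEmpty()) {
                continue; // skip empty entries, such as one left by a trailing comma
            }
            int sep = url.indexOf("://");
            if (sep <= 0) {
                throw new IllegalArgumentException("Missing protocol: " + url);
            }
            String p = url.substring(0, sep);
            if (!p.equals("tcp") && !p.equals("ssl")) {
                throw new IllegalArgumentException("Unsupported protocol: " + url);
            }
            if (protocol == null) {
                protocol = p; // the first URL fixes the protocol for the list
            } else if (!protocol.equals(p)) {
                throw new IllegalArgumentException(
                        "Mixed protocols in URL list: " + serverUrl);
            }
            urls.add(url);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Prints the two URLs from the example above as a list.
        System.out.println(parse("tcp://server0:7222, tcp://server1:7344"));
    }
}
```

A list that mixes tcp and ssl URLs causes parse() to throw IllegalArgumentException, so the misconfiguration surfaces before any connection is attempted.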
The client attempts to connect to each URL in the order listed. If a connection to one URL fails, the client tries the next URL in the list, continuing in sequence until every URL has been tried. If the first failed connection was not to the first URL in the list, the attempts wrap around to the start of the list, so that each URL is tried. If none of the attempts succeeds, the connection fails.
For information on how to look up a fault-tolerant URL in the EMS naming service, see Performing Fault-Tolerant Lookups.
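The wrap-around order described above can be illustrated with a small sketch. FailoverOrder and attemptOrder() are hypothetical names used only for illustration; the sketch assumes that a pass starts at a given position in the list and visits every URL exactly once. It does not open any connections and is not the client library's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the EMS client API.
final class FailoverOrder {

    // Returns the URLs in the order one pass would try them, beginning at
    // index `start` and wrapping to the start of the list so that each URL
    // appears exactly once.
    static List<String> attemptOrder(List<String> urls, int start) {
        if (!urls.isEmpty() && (start < 0 || start >= urls.size())) {
            throw new IndexOutOfBoundsException("start: " + start);
        }
        List<String> order = new ArrayList<>(urls.size());
        for (int i = 0; i < urls.size(); i++) {
            order.add(urls.get((start + i) % urls.size()));
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> urls = List.of("tcp://server0:7222", "tcp://server1:7344");
        // Starting at the second URL, the pass wraps back to the first.
        System.out.println(attemptOrder(urls, 1));
    }
}
```

With the two-URL list from the earlier example and a starting index of 1, the pass tries tcp://server1:7344 first and then wraps to tcp://server0:7222.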
 
The reconnection logic in the client is triggered by specifying multiple URLs when connecting to a server. If no backup server is present, the client must still provide at least two URLs (typically pointing to the same server) so that it can automatically reconnect to the server when the server becomes available again after a failure.
Specifying More Than Two URLs
Even though there are only two servers (the primary and backup servers), clients can specify more than two URLs for the connection. For example, if each server has more than one listen address, a client can reconnect to the same server at a different address (that is, at a different network interface).
Setting Reconnection Failure Parameters
EMS allows you to establish separate parameters for initial connection attempts and reconnection attempts. How to set the initial connection attempt parameters is described in Setting Connection Attempts, Timeout and Delay Parameters. This section describes the parameters you can establish for reconnection attempts following a fault-tolerant switchover.
Connect and reconnect attempts have separate parameters because the operating system limits the number of connection attempts the EMS server can handle at any one time. (For example, on UNIX, this limit is adjusted by the ulimit setting.) Under normal circumstances, connection attempts are spread out over time, so the server is unlikely to exceed its maximum accept queue. However, during a fault-tolerant switchover, all of the clients automatically try to reconnect to the backup server at approximately the same time. When the number of connections is large, each client may need more time to reconnect than it needed for the initial connection.
By default, a client will attempt reconnection 4 times with a 500 ms delay between each attempt. You can modify these settings in the factories.conf file or by means of your client connection factory API, as demonstrated by the examples in this section.
The following examples establish a reconnection count of 10, a delay of 1000 ms and a timeout of 1000 ms.
Java
Use the TibjmsConnectionFactory object’s setReconnAttemptCount(), setReconnAttemptDelay(), and setReconnAttemptTimeout() methods to establish new reconnection failure parameters:
   factory.setReconnAttemptCount(10);
   factory.setReconnAttemptDelay(1000);
   factory.setReconnAttemptTimeout(1000);
C
Use the tibemsConnectionFactory_SetReconnectAttemptCount, tibemsConnectionFactory_SetReconnectAttemptDelay, and tibemsConnectionFactory_SetReconnectAttemptTimeout functions to establish new reconnection failure parameters:
   status = tibemsConnectionFactory_SetReconnectAttemptCount(
                factory, 10);
   status = tibemsConnectionFactory_SetReconnectAttemptDelay(
                factory, 1000);
   status = tibemsConnectionFactory_SetReconnectAttemptTimeout(
                factory, 1000);
C#
Use the ConnectionFactory.SetReconnAttemptCount, ConnectionFactory.SetReconnAttemptDelay, and ConnectionFactory.SetReconnAttemptTimeout methods to establish new reconnection failure parameters:
   factory.SetReconnAttemptCount(10);
   factory.SetReconnAttemptDelay(1000);
   factory.SetReconnAttemptTimeout(1000);