Copyright © TIBCO Software Inc. All Rights Reserved


Chapter 20 Fault Tolerance : Shared State

Shared State
For the most robust failover protection, the active server and standby server must share the same state. Shared state includes three categories of information:
During a failover, the standby server re-reads all shared state information.
Implementing Shared State
We recommend that you implement shared state using shared storage devices. The shared state must be accessible to both the active and standby servers.
Support Criteria
Several options are available for implementing shared storage using a combination of hardware and software. EMS requires that your storage solution guarantee all four criteria in Table 85.
Always consult your shared storage vendor and your operating system vendor to ascertain that the storage solution you select satisfies all four criteria.
 
Hardware Options
Consider these examples of commonly sold hardware options for shared storage:
SCSI and SAN
Dual-port SCSI and SAN solutions generally satisfy the Write Order and Synchronous Write Persistence criteria. (The clustering software must satisfy the remaining two criteria.) As always, you must confirm all four requirements with your vendors.
NAS
NAS solutions require a cluster server (CS), rather than a clustered file system (CFS), to satisfy the Distributed File Locking criterion (see Software Options below).
Some NAS solutions satisfy the criteria, and some do not; you must confirm all four requirements with your vendors.
NAS with NFS
When NAS hardware uses NFS as its file system, it is particularly difficult to determine whether the solution meets the criteria. Our research indicates the following conclusions:
NFS v2 and NFS v3 definitely do not satisfy the criteria.
NFS v4 with TCP might satisfy the criteria. Consult with the NAS vendor to verify that the NFS server (in the NAS) satisfies the criteria. Consult with the operating system vendor to verify that the NFS client (in the OS on the server host computer) satisfies the criteria. When both vendors certify that their components cooperate to guarantee the criteria, then the shared storage solution supports EMS.
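The NFS conclusions above amount to a small decision rule, which can be sketched as follows. This is only an illustration of the reasoning in the text, not a TIBCO tool: the function name and its parameters are hypothetical, and treating any version or transport not mentioned above as unsupported is a conservative assumption.

```python
def nfs_may_support_ems(version: int, transport: str,
                        server_vendor_certified: bool,
                        client_vendor_certified: bool) -> bool:
    """Return True only when every NFS condition stated in the text is met.

    Hypothetical helper for illustration; it does not probe real hardware.
    """
    # NFS v2 and v3 do not satisfy the criteria, whatever the vendors say.
    if version in (2, 3):
        return False
    # NFS v4 over TCP qualifies only when BOTH the NAS vendor (NFS server)
    # and the OS vendor (NFS client) certify that the criteria are met.
    if version == 4 and transport.lower() == "tcp":
        return server_vendor_certified and client_vendor_certified
    # Assumption: anything the text does not cover is treated as unsupported.
    return False
```

A result of True means only that the stated preconditions hold; the vendors' confirmation remains the actual evidence.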
For more information on how EMS locks shared store files, see How EMS Manages Access to Shared Store Files.
Software Options
Consider these examples of commonly sold software options:
A cluster server (CS) monitors the EMS server processes and their host computers, and ensures that exactly one server process is running at all times. If that server fails, the CS restarts it; if the CS fails to restart it, it starts the other server instead.
A clustered file system (CFS) lets the two EMS server processes run simultaneously. It even lets both servers mount the shared file system simultaneously. However, the CFS assigns the lock to only one server process at a time. The CFS also manages operating system caching of file data, so the standby server has an up-to-date view of the file system (instead of a stale cache).
With dual-port SCSI or SAN hardware, either a CS or a CFS might satisfy the Distributed File Locking criterion. With NAS hardware, only a CS can satisfy this criterion (CFS software generally does not). Of course, you must confirm all four requirements with your vendors.
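The recovery policy described for a cluster server (restart the failed server first, and start the other server only if that restart fails) can be sketched as below. This is a minimal model under stated assumptions, not the behavior of any particular clustering product: `recover`, `try_start`, and the server names are hypothetical.

```python
from typing import Callable, Optional


def recover(failed_server: str, other_server: str,
            try_start: Callable[[str], bool]) -> Optional[str]:
    """Return the name of the server that ends up running, or None.

    try_start is a hypothetical callback that attempts to start the named
    server process and reports whether it succeeded.
    """
    # First choice: restart the server that just failed.
    if try_start(failed_server):
        return failed_server
    # Fallback: start the other server instead.
    if try_start(other_server):
        return other_server
    # Neither server could be started; manual intervention is needed.
    return None


if __name__ == "__main__":
    can_start = {"ems-a": False, "ems-b": True}
    print(recover("ems-a", "ems-b", lambda name: can_start[name]))  # ems-b
```

Because at most one start attempt succeeds before the function returns, the sketch never leaves both servers running, which mirrors the "exactly one server process" guarantee.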
Messages Stored in Shared State
Messages with PERSISTENT delivery mode are stored, and are available in the event of active server failure. Messages with NON_PERSISTENT delivery mode are not available if the active server fails.
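The difference between the two delivery modes during failover can be illustrated with a toy model. It is a deliberate simplification and not how tibemsd is implemented: the class `ToyServer`, the `failover` function, and the use of a plain list as "shared storage" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

PERSISTENT = "PERSISTENT"
NON_PERSISTENT = "NON_PERSISTENT"


@dataclass
class ToyServer:
    shared_store: List[str]                          # stands in for shared storage
    memory: List[str] = field(default_factory=list)  # lost if the process dies

    def publish(self, body: str, delivery_mode: str) -> None:
        # Persistent messages go to the shared store; others stay in memory.
        if delivery_mode == PERSISTENT:
            self.shared_store.append(body)
        else:
            self.memory.append(body)

    def available(self) -> List[str]:
        return self.shared_store + self.memory


def failover(shared_store: List[str]) -> ToyServer:
    """The standby re-reads shared state but starts with empty memory."""
    return ToyServer(shared_store=shared_store)


if __name__ == "__main__":
    store: List[str] = []
    active = ToyServer(store)
    active.publish("order-1", PERSISTENT)
    active.publish("tick-1", NON_PERSISTENT)
    standby = failover(store)
    print(standby.available())  # ['order-1']
```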
For more information about recovery of messages during failover, see Message Redelivery.
Storage Files
By default, the tibemsd server creates three file-based stores to store shared state:
$sys.failsafe: This store holds persistent messages using synchronous I/O calls.
$sys.nonfailsafe: This store holds messages using asynchronous I/O calls.
$sys.meta: This store holds state information about durable subscribers, fault-tolerant connections, and other metadata.
These stores are fully customizable through parameters in the stores configuration file. The files and their default configuration settings are described in stores.conf on page 262.
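The practical difference between synchronous and asynchronous store writes can be sketched in general terms. The code below is not EMS code and does not reflect the server's internal I/O path; it assumes "synchronous" means the call returns only after the operating system has been asked to flush the data to stable storage, and "asynchronous" means the call returns once the data is handed to the OS. The function names are hypothetical.

```python
import os


def write_sync(path: str, data: bytes) -> None:
    """Append data and block until the OS reports it flushed to the device."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to storage


def write_async(path: str, data: bytes) -> None:
    """Append data and return; the OS writes it to the device later."""
    with open(path, "ab") as f:
        f.write(data)
        # No fsync: faster, but data still in the OS cache can be lost
        # if the host fails before the cache is written out.
```

The trade-off is the usual one: the synchronous path costs latency on every write, while the asynchronous path risks losing recently written data on a host failure.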
To prevent two servers from using the same store file, each server restricts access to its store file for the duration of the server process. For more information on how EMS manages shared store files, see How EMS Manages Access to Shared Store Files.
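The idea of one process holding a store file exclusively can be illustrated with an advisory lock on a Unix host. This is a sketch only: it uses `fcntl.flock`, which is not necessarily the mechanism EMS uses, and it demonstrates local locking rather than locking across hosts on shared storage. The helper name `try_lock` and the file name are hypothetical.

```python
import fcntl
import os
import tempfile
from typing import BinaryIO, Optional


def try_lock(path: str) -> Optional[BinaryIO]:
    """Open path and take an exclusive, non-blocking lock (Unix only).

    Returns the open file while the lock is held, or None if another
    holder already has the lock. Closing the file releases the lock.
    """
    f = open(path, "a+b")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    return f


if __name__ == "__main__":
    store = os.path.join(tempfile.mkdtemp(), "example-store.db")
    active = try_lock(store)    # first holder acquires the lock
    standby = try_lock(store)   # second attempt is refused
    print(active is not None, standby is None)
    active.close()              # releasing the lock lets another holder in
```

In a fault-tolerant pair, the standby would retry such a lock until the active server's lock is released, which is why the storage solution must implement locking correctly across hosts.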
Storage Parameters
Several configuration parameters apply to EMS storage files (even when fault-tolerant operation is not configured); see Storage File Parameters.
