Copyright © TIBCO Software Inc. All Rights Reserved



Setting Fault Tolerant Options for a Process
The FT Group Settings panel displays only if the TIBCO BusinessWorks process you have selected has been added to at least two different machines. If your domain includes components that were deployed as part of a fault-tolerant group, the display includes information about the group.
You can start one or more process engines in the group. If more than one engine has started, only one is displayed as Running and all other engines are displayed as Standing By (or, initially, as Starting Up).
When you change the status of a component that has been deployed as part of an FT group, the status change affects all other members of the group.
After you have deployed the process engines, the most efficient approach is to select the check boxes for all process engines and then choose Start. After the primary and secondary engines have communicated, the master displays as Running and all other engines as Standby. If you start only the primary, it first goes to Standby mode while it checks the status of the other engines. It then changes to Running.
To Set Fault Tolerant Options
1. In TIBCO Administrator, click Application Management.
2.
3.
4. Click the General tab.
5. Select Run Fault Tolerant. Change other options as required. See FT Group Settings for field descriptions.
6. Click Save.
Changing the Checkpoint Data Repository for a Process
To run TIBCO BusinessWorks using multiple engines in fault tolerant mode, you must specify a checkpoint data repository. See Failover and Checkpoint Data for more information.
For true fault tolerance, you must store the data in a database. You specify a JDBC Connection resource for the database to be used when you configure your project in TIBCO Designer. The database is then one of the available options on the Checkpoint Data Repository pop-up menu.
To Change Checkpoint Data Repository Properties
1. In TIBCO Administrator, click Application Management.
2.
3.
4. Click the Advanced tab.
5.
6. Click Save.
Configuring Fault-Tolerant Engines
The TIBCO BusinessWorks process engine can be configured to be fault-tolerant. You can start up several engines. In the event of a failure, other engines restart process starters and the corresponding services.
If you use a database to store process engine information, a service is reinstantiated to the state of its last checkpoint. In the event of a failure, any processing done after a checkpoint is lost when the process instance is restarted by another engine. See TIBCO BusinessWorks Palette Reference for more information about Checkpoint activities. See Configuring Storage for TIBCO BusinessWorks Processes for more information about configuring process engine storage.
The next diagram illustrates normal operation of a fault-tolerant configuration. One engine is configured as the master, and it creates and executes services. The second engine is a secondary engine, and it stands by in case of failure of the master. The engines send heartbeats to notify each other they are operating normally.
Figure 47 Normal operation: master processing while secondary stands by
If the master process engine fails, the secondary engine detects that the master's heartbeat has stopped and resumes operation in place of the master. All process starters are restarted on the secondary, and services are restarted to the state of their last checkpoint. The next diagram illustrates a failure and the secondary restarting the service.
Figure 48 Fault-tolerant failover
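The heartbeat check described above can be pictured with a small model. This is a conceptual sketch only, not TIBCO BusinessWorks code: the class name, the interval value, and the number of missed heartbeats tolerated before failover are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Illustrative model of a secondary engine watching the master's heartbeat."""

    interval: float          # seconds between expected heartbeats (hypothetical value)
    missed_allowed: int = 3  # hypothetical: heartbeats that may be missed before failover
    last_seen: float = 0.0   # time of the most recent heartbeat from the master

    def record_heartbeat(self, now: float) -> None:
        self.last_seen = now

    def master_failed(self, now: float) -> bool:
        # The master is presumed failed once the silence exceeds the allowed window.
        return (now - self.last_seen) > self.interval * self.missed_allowed


monitor = HeartbeatMonitor(interval=10.0)
monitor.record_heartbeat(now=100.0)

healthy_check = monitor.master_failed(now=120.0)  # 20 s of silence, inside the 30 s window
failed_check = monitor.master_failed(now=131.0)   # 31 s of silence, outside the window
```

In this model the secondary would take over only after the second check. The actual detection rule used by the process engine depends on the fault-tolerance settings of the group.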
The expected deployment is for master and secondary engines to reside on separate machines. You can have multiple secondary engines, if desired, and you can specify a weight for each engine. The weight determines the type of relationship between the fault-tolerant engines. See Peer or Master and Secondary Relationships for more information about relationships between fault-tolerant engines.
A master and its secondary engines are known as a fault-tolerant group. The group can be configured with several advanced configuration options, such as the heartbeat interval and the weight of each group member. See TIBCO BusinessWorks Palette Reference for a complete description of configuration options for fault tolerance.
Peer or Master and Secondary Relationships
Members of a fault-tolerant group can be configured as peers or as master and secondary engines. If all engines are peers, when the machine containing the currently active process engine fails, another peer process engine resumes processing for the first engine, and continues processing until its machine fails.
If the engines are configured as master and secondary, the secondary engine resumes processing when the master fails. The secondary engine continues processing until the master recovers. Once the master recovers, the secondary engine shuts down and the master takes over processing again.
The Fault Tolerance tab of the Process Engine deployment resource allows you to specify the member weight of each member of a fault-tolerant group. The member with the highest weight is the master. You can select Peer in the first field on the tab to configure all engines as peers (that is, they all have the same weight). You can select Primary/Secondary to configure the engines as master and secondary. You can also select Custom to specify your own values for the weight of each member of the group.
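The difference between the two relationships can be sketched as a selection rule over member weights. The function below is a simplified model, not the engine's actual algorithm: the engine names and weight values are hypothetical, and breaking a tie by name when no engine is currently active is an assumption.

```python
from typing import Dict, Optional, Set


def choose_active(weights: Dict[str, int], alive: Set[str],
                  current: Optional[str]) -> Optional[str]:
    """Pick the engine that should be processing (illustrative model only)."""
    candidates = {name: w for name, w in weights.items() if name in alive}
    if not candidates:
        return None
    top = max(candidates.values())
    # If the current engine already holds the highest weight, it keeps processing.
    # With equal weights this models peers: the active peer runs until it fails.
    if current in candidates and candidates[current] == top:
        return current
    # Otherwise the highest-weight engine takes over (tie broken by name: an assumption).
    return sorted(name for name, w in candidates.items() if w == top)[0]


primary_secondary = {"engineA": 200, "engineB": 100}  # hypothetical weights
peers = {"engineA": 100, "engineB": 100}              # equal weights

# Master fails, then recovers: processing returns to the master.
after_failure = choose_active(primary_secondary, {"engineB"}, "engineA")
after_recovery = choose_active(primary_secondary, {"engineA", "engineB"}, "engineB")

# A peer that took over keeps processing even after the other peer returns.
peer_after_recovery = choose_active(peers, {"engineA", "engineB"}, "engineB")
```

Here `after_failure` is `engineB`, `after_recovery` is `engineA`, and `peer_after_recovery` stays `engineB`, which matches the behavior described for master and secondary engines and for peers.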
Failover and Checkpoint Data
A checkpoint saves the current state of a running process instance. For a secondary process engine to resume running process instances from their last checkpoint, the secondary process engine must have access to the saved state of the process instances from the master process engine.
If you select the service (.par), and then the Advanced tab, a pane named TIBCO BusinessWorks Checkpoint Data Repository is displayed. In this field, you can specify where the state of process instances is stored when a checkpoint is performed. The value defaults to Checkpoint Data Repository. If a JDBC Connection Resource has been configured for the project, you also have the option to choose the database.
Because fault-tolerant engines are expected to be on separate machines, you should specify a database as the storage for each process engine. This way, you can specify the same JDBC Connection resource for the master and secondary engines, so that all engines share the information stored for process instance checkpoints. See Configuring Storage for TIBCO BusinessWorks Processes.
If all engines share the checkpoint information, the secondary engines can recover process instances up to their last checkpoint. If engines do not share the checkpoint information, process instances are not restarted.
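The effect of a shared checkpoint store can be illustrated with a small simulation. This sketch uses an in-memory SQLite database as a stand-in for the shared database; the table layout, function names, and instance identifiers are hypothetical and do not reflect the schema that TIBCO BusinessWorks actually uses.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an empty checkpoint store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE checkpoints (instance_id TEXT PRIMARY KEY, state TEXT)")
    return db


def save_checkpoint(db: sqlite3.Connection, instance_id: str, state: dict) -> None:
    # Each checkpoint replaces the previously saved state of the instance.
    db.execute(
        "INSERT OR REPLACE INTO checkpoints (instance_id, state) VALUES (?, ?)",
        (instance_id, json.dumps(state)),
    )
    db.commit()


def recover_instances(db: sqlite3.Connection) -> dict:
    # A secondary engine can restore only what was saved at the last checkpoint.
    rows = db.execute("SELECT instance_id, state FROM checkpoints").fetchall()
    return {instance_id: json.loads(state) for instance_id, state in rows}


shared_db = open_store()   # store reachable by both master and secondary
private_db = open_store()  # store visible only to a secondary that shares nothing

# The master checkpoints twice, then fails before its next checkpoint.
save_checkpoint(shared_db, "order-1", {"step": "validated"})
save_checkpoint(shared_db, "order-1", {"step": "charged"})

recovered_shared = recover_instances(shared_db)
recovered_private = recover_instances(private_db)
```

With the shared store, the secondary sees `order-1` at its last checkpoint (`charged`); any work done after that checkpoint is not in the store. With the private store, nothing is recovered.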
Process Starters and Fault-Tolerance
When a master process engine fails, its process starters are restarted on the secondary engine. This may not be possible with all process starters. For example, the HTTP Receiver process starter listens for HTTP requests on a specified port on the machine where the process engine resides. If a secondary engine resumes operation for a master engine, the new machine is now listening for HTTP requests on the specified port. HTTP requests always specify the machine name, so incoming HTTP requests will not automatically be redirected to the new machine.
Each process starter has different configuration requirements, and not all process starters may gracefully resume on a different machine. You may have to provide additional hardware or software to redirect the incoming events to the appropriate place in the event of a failure.
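One way to picture the redirection software mentioned above is a routing rule that sends each request to the first engine machine that is currently listening. The following is a minimal sketch under stated assumptions: the host names are hypothetical, and the availability probe is stubbed so that the example runs without a network. It is not a feature of TIBCO BusinessWorks.

```python
from typing import Callable, Iterable, Optional


def pick_target(hosts: Iterable[str],
                is_listening: Callable[[str], bool]) -> Optional[str]:
    """Return the first host that currently accepts requests, or None."""
    for host in hosts:
        if is_listening(host):
            return host
    return None


# Hypothetical machines, listed in order of preference.
engine_hosts = ["bw-master.example.com", "bw-secondary.example.com"]

# Stubbed probe result: the master machine has failed, the secondary is listening.
listening_now = {"bw-secondary.example.com"}

target = pick_target(engine_hosts, lambda host: host in listening_now)
```

In practice this role is usually filled by a load balancer or similar component placed in front of the engines, so that clients address one stable name rather than a specific machine.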
Also, your servers may not have all of the software necessary for restarting all instances. For example, your database may reside on the same machine as your master process engine. If that server goes down, JDBC activities will not be able to execute. Therefore, you may not wish to load process definitions that use JDBC activities in your secondary process engine.
You can specify that your secondary process engine loads different process definitions than the master. You may only want to load the process definitions that can gracefully migrate to a new server during a failure.
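The decision about which process definitions to load on the secondary can be expressed as a simple filter. This is an illustrative sketch only: the process definition names and resource labels are hypothetical, and the actual selection is made through the deployment configuration rather than in code.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class ProcessDefinition:
    name: str
    requires: FrozenSet[str] = frozenset()  # resources the process needs at run time


def loadable_on(definitions: Iterable[ProcessDefinition],
                available: Set[str]) -> List[str]:
    """Names of definitions whose required resources are all available."""
    return [d.name for d in definitions if d.requires <= available]


definitions = [
    ProcessDefinition("OrderIntake.process", frozenset({"jms"})),
    ProcessDefinition("OrderAudit.process", frozenset({"jdbc"})),
    ProcessDefinition("Heartbeat.process"),
]

# Hypothetical case: the database lives on the master machine, so JDBC is
# not available to the secondary once that machine fails.
secondary_resources = {"jms"}

secondary_load = loadable_on(definitions, secondary_resources)
```

In this example `secondary_load` contains `OrderIntake.process` and `Heartbeat.process` but not `OrderAudit.process`, because the latter depends on a database that goes down together with the master.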
See Also
See FT Group Settings for field descriptions.
For information about which process engine starts and what state it starts in, see TIBCO BusinessWorks Process Design Guide.
