Configuring Fault Tolerant Process Engines

You can configure the ActiveMatrix BusinessWorks process engine to be fault-tolerant by starting several engines. If one engine fails, another engine restarts the process starters and the corresponding services.

If you use a database to store process engine information, a process instance is re-instantiated to the state of its last checkpoint. Any processing done after that checkpoint is lost when another engine restarts the process instance following a failure.
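The checkpoint behavior described above can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the CheckpointStore class, the instance ID "order-1", and the step names are all hypothetical, and an in-memory dictionary stands in for the database.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointStore:
    """Hypothetical stand-in for the database that holds checkpoint data."""
    saved: dict = field(default_factory=dict)

    def save(self, instance_id: str, completed_steps: list) -> None:
        # Store a copy so later work cannot alter the checkpointed state.
        self.saved[instance_id] = list(completed_steps)

    def restore(self, instance_id: str) -> list:
        # An instance that never reached a checkpoint restarts from the beginning.
        return list(self.saved.get(instance_id, []))


store = CheckpointStore()

# Engine A runs the instance and takes a checkpoint after the second step.
steps_on_a = ["receive order", "validate order"]
store.save("order-1", steps_on_a)
steps_on_a.append("charge card")  # done after the checkpoint, never persisted

# Engine A fails; engine B re-instantiates the instance from the checkpoint.
steps_on_b = store.restore("order-1")
print(steps_on_b)  # ['receive order', 'validate order']
```

The "charge card" step is absent on the second engine because it ran after the last checkpoint, which is why work that must survive a failover should be followed by a checkpoint.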

For more information about Checkpoint activities, see TIBCO ActiveMatrix BusinessWorks™ Palette Reference.

For more information about configuring process engine storage, see Changing the Checkpoint Data Repository for a Process.

Figure 53 illustrates normal operation of a fault-tolerant configuration. One engine is configured as the master, and it creates and executes services. The second engine is a secondary engine, and it stands by in case the master fails. The engines send heartbeats to notify each other that they are operating normally.

Figure 53: Normal operation: master processing while secondary stands by

If the master process engine fails, the secondary engine detects that the master’s heartbeat has stopped and resumes operation in place of the master. All process starters are restarted on the secondary, and services are restarted to the state of their last checkpoint.
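Heartbeat-based failure detection can be sketched as follows. This is a simplified model under stated assumptions, not the engine's actual logic: the SecondaryEngine class and its timeout field are hypothetical, timestamps are passed in explicitly instead of being read from a clock, and the 35-second tolerance is an arbitrary illustrative value.

```python
from dataclasses import dataclass


@dataclass
class SecondaryEngine:
    """Hypothetical model of a standby engine watching the master's heartbeat."""
    timeout: float              # seconds of silence tolerated before takeover (assumed)
    last_heartbeat: float = 0.0
    active: bool = False

    def on_heartbeat(self, now: float) -> None:
        # Record the time of the most recent heartbeat from the master.
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        # Take over once the master has been silent longer than the tolerance.
        if not self.active and now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active


secondary = SecondaryEngine(timeout=35.0)

# The master sends heartbeats at t=0, 10, and 20 seconds, then fails.
for t in (0.0, 10.0, 20.0):
    secondary.on_heartbeat(t)

standby_at_30 = secondary.check(30.0)  # 10 s of silence: still standing by
active_at_60 = secondary.check(60.0)   # 40 s of silence: secondary takes over
print(standby_at_30, active_at_60)     # False True
```

A tolerance longer than the heartbeat interval keeps a single delayed heartbeat from triggering an unnecessary failover.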

Figure 54 illustrates a failure and the secondary restarting the service.

Figure 54: Fault-tolerant failover

The expected deployment is for master and secondary engines to reside on separate machines. You can have multiple secondary engines, if desired, and you can specify a weight for each engine. The weight determines the type of relationship between the fault-tolerant engines. For more information about relationships between fault-tolerant engines, see Peer or Master and Secondary Relationships.
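As a rough illustration of how weights could define the relationship, the sketch below assumes a convention that this section does not spell out: engines with equal weights act as peers, and otherwise the engine with the highest weight acts as the master. The function names and engine names are hypothetical; confirm the actual rules in Peer or Master and Secondary Relationships.

```python
def relationship(weights: dict) -> str:
    """Classify a group from its member weights (assumed convention)."""
    # Assumption: identical weights mean peers, differing weights mean master and secondary.
    return "peer" if len(set(weights.values())) == 1 else "master and secondary"


def preferred_engine(weights: dict) -> str:
    """Return the engine with the highest weight (assumed to act as master)."""
    return max(weights, key=weights.get)


ranked_group = {"engine-a": 200, "engine-b": 100}
peer_group = {"engine-a": 100, "engine-b": 100}

print(relationship(ranked_group), "/", preferred_engine(ranked_group))
print(relationship(peer_group))
```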

A master and its secondary engines are known as a fault-tolerant group. You can configure the group with several advanced options, such as the heartbeat interval and the weight of each group member. For a complete description of configuration options for fault tolerance, see TIBCO ActiveMatrix BusinessWorks™ Palette Reference.