Configuring Fault Tolerance for In Memory OM Systems
With In Memory object management, data is transient and does not survive total system failure. However, for systems using In Memory object management, BusinessEvents offers priority-based fault tolerance to provide high availability of the BusinessEvents engine process.
Fault tolerance handles the transition of engines between inactive and active states and manages state for channels. It does not manage state for Rete networks or working memories.
With In Memory object management, the failover behavior in a fault tolerance group is: When a node fails, the node with the next highest priority assumes responsibility for that node’s work. So, when a primary engine fails, a secondary engine takes over as primary.
In cases where the primary and secondary are on different machines on the network, the secondary takes over if the primary loses connection to the network.
With In Memory object management, the failback behavior is: When a node restarts (or reconnects to the network in the case of a lost connection), it assumes responsibility from the node with the next lowest priority. So, when a primary engine recovers, it takes over from the secondary that had been functioning as primary.
Additionally, a join time is recorded internally. The priority setting and the join time together determine the failover and failback order in larger fault tolerance groups.
Same Weight (Priority): If two BusinessEvents servers have the same weight setting, then the server that joined the fault tolerance cluster first has the higher priority in the failover and failback order.
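The ordering rules above can be illustrated with a short Python sketch. It is not BusinessEvents code: the Node class, the active_node function, and the node names are hypothetical, and the sketch only models the stated rules (highest weight wins, and among equal weights the earliest joiner wins).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str          # hypothetical label, for illustration only
    weight: int        # higher number means higher priority
    join_time: float   # when the node joined the group (smaller is earlier)
    alive: bool = True


def active_node(nodes):
    """Return the live node that should be active, or None if all are down."""
    live = [n for n in nodes if n.alive]
    if not live:
        return None
    # Highest weight wins; among equal weights, the earliest joiner wins.
    return min(live, key=lambda n: (-n.weight, n.join_time))


group = [
    Node("primary", weight=3, join_time=10.0),
    Node("secondary-a", weight=2, join_time=12.0),
    Node("secondary-b", weight=2, join_time=11.0),
]

print(active_node(group).name)   # primary

group[0].alive = False           # failover: the primary fails
print(active_node(group).name)   # secondary-b (same weight as secondary-a, joined first)

group[0].alive = True            # failback: the primary recovers
print(active_node(group).name)   # primary
```

Recomputing the active node from the full member list on every change gives both failover and failback from a single rule, which matches the behavior described above.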
Fault Tolerance Configuration for In Memory OM—Using TRA Files
To Configure Fault Tolerance for In Memory OM Using TRA files
For each node, set the engine properties as follows:
1. Open be-engine.tra in a text editor:
BE_HOME\bin\be-engine.tra
(Or open the alternative property file you want to use.)
2. Enter the same fault tolerance group name for all members of a fault tolerance cluster, and make sure that the name is unique across fault tolerance clusters.
3. In each engine property file, provide a short, unique name for the engine using the property be.ft.nodename. This name is used by the fault tolerance cluster only and must not be more than 30 characters long.
4. Configure weight. Set the weight property to define the priorities among the servers. The primary engine has the highest priority (that is, the highest value numerically). Secondary engines have lower priorities (and lower numbers).
Engine.FT.Weight integer
If you give two or more engines the same weight value, then the actual priority is determined by join (server startup) time. The engine that joined the cluster first has the highest priority.
5.
 
be.ft.nodename: Sets a stable node name used only in the fault tolerance cluster. The name is used during recovery from situations such as network disconnections.
If you deploy to a TIBCO Administrator domain and this property is not present, then the last 30 characters of the generated engine name are used to ensure a unique node name.
If you deploy outside a TIBCO Administrator domain and this property is not present, then the engine name is used. See Determining the Engine (Node) Name for various ways the engine name can be set.
Note: The node name must not exceed 30 characters.
Engine.FT.GroupName: Defines the fault tolerance cluster (group) name in the be-engine.tra file, for command-line startup.
Engine.FT.UseFT: Enables or disables fault tolerance mode in the be-engine.tra file, for command-line startup.
Engine.FT.Weight: Sets the priority for a server in a fault tolerance group in the be-engine.tra file, for command-line startup. The higher the number, the higher the priority. (The TIBCO Administrator property works in the same way.)
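Before starting the engines, it can help to check the per-node settings against the constraints stated in this procedure. The following Python sketch is one possible way to do that. It is not a BusinessEvents tool: the functions check_ft_group and default_node_name, the group name orders-ft, and the node names are all hypothetical, and the settings are modeled as plain dictionaries rather than parsed from a real property file.

```python
MAX_NODE_NAME = 30  # limit stated in the text for be.ft.nodename


def default_node_name(engine_name):
    """Fallback described for TIBCO Administrator deployments:
    the last 30 characters of the generated engine name."""
    return engine_name[-MAX_NODE_NAME:]


def check_ft_group(members):
    """Return a list of problems found in a list of per-engine property dicts."""
    problems = []

    # All members of one cluster must use the same group name.
    groups = {m.get("Engine.FT.GroupName") for m in members}
    if len(groups) != 1 or None in groups:
        problems.append("all members must share one Engine.FT.GroupName")

    # Node names, where set, must be unique and at most 30 characters long.
    names = [m["be.ft.nodename"] for m in members if "be.ft.nodename" in m]
    for name in names:
        if len(name) > MAX_NODE_NAME:
            problems.append(f"node name longer than {MAX_NODE_NAME} characters: {name}")
    if len(names) != len(set(names)):
        problems.append("be.ft.nodename values must be unique")

    # The weight is documented as an integer.
    for m in members:
        if not isinstance(m.get("Engine.FT.Weight"), int):
            problems.append("Engine.FT.Weight must be an integer")
    return problems


members = [
    {"Engine.FT.GroupName": "orders-ft", "be.ft.nodename": "node-a", "Engine.FT.Weight": 2},
    {"Engine.FT.GroupName": "orders-ft", "be.ft.nodename": "node-b", "Engine.FT.Weight": 1},
]
print(check_ft_group(members))  # []
```

An empty list means that none of the checked constraints is violated. The sketch does not check that the group name is unique across clusters, because that requires knowledge of the other clusters on the network.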
 
Fault Tolerance Configuration for In Memory OM—Using TIBCO Administrator
Unlike some other TIBCO products with which you may be familiar, BusinessEvents does not use Rendezvous for fault tolerance. When you configure fault tolerance for a BusinessEvents system in TIBCO Administrator, you use some of the same settings you would use for Rendezvous-based fault tolerance. However, the settings are used internally by BusinessEvents, not in their usual way.
BusinessEvents uses the following TIBCO Administrator settings for fault tolerance when In Memory object management is used. The property names shown in parentheses appear in the engine property file generated by TIBCO Administrator.
FT Weight (Engine.FT.Weight): This property is used to define the different engines as primary and secondary. The higher the number, the higher the priority.
Run Fault Tolerant (Engine.FT.UseFT): This property is used to enable or disable fault tolerance mode.
(No UI setting) Engine.FT.GroupName: This property defines the members of the fault tolerance group. The value of this property is generated at deploy time, as explained next.
The value of Engine.FT.GroupName is generated as follows:
domain_name.domain_name-deployment_name.BAR_name_prefix2Ebar
For example, if the domain name is acme, the deployment name is test, and the bar name is my.bar, then the generated name would be: acme.acme-test.my2Ebar.
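The naming pattern can be reproduced with a small Python sketch, which may be useful when you need to predict the generated group name. The function ft_group_name is hypothetical and not part of any TIBCO product. It assumes that BAR_name_prefix is the BAR file name without its .bar extension and that 2E stands in for the period before bar, which is consistent with the example above but is not stated explicitly.

```python
def ft_group_name(domain_name, deployment_name, bar_file_name):
    """Build the group name using the pattern shown in the text."""
    suffix = ".bar"
    if not bar_file_name.endswith(suffix):
        # The pattern in the text only covers names that end in ".bar".
        raise ValueError("expected a BAR file name ending in .bar")
    # Assumption: the prefix is the file name without the ".bar" extension,
    # and "2E" replaces the "." that precedes "bar".
    prefix = bar_file_name[: -len(suffix)]
    return f"{domain_name}.{domain_name}-{deployment_name}.{prefix}2Ebar"


print(ft_group_name("acme", "test", "my.bar"))  # acme.acme-test.my2Ebar
```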
To Configure Fault Tolerance for In Memory OM Using TIBCO Administrator
This procedure assumes that you have already configured object management options, generated the project EAR file, and uploaded it into TIBCO Administrator.
1. In TIBCO Administrator, click Application Management.
2.
3.
4.
5.
6. Fault tolerance features appear in the Target Machines pane.
7. The higher the number, the higher the priority of the server. The primary engine has the highest priority (that is, the number with the largest value). Secondary engines have lower priorities (and values closer to 1).
8. In the FT Group Settings pane, check the Run Fault Tolerant checkbox.
9. Click Save.
See Deploying a Project in a TIBCO Administrator Domain for next steps.