The tables in this section show how the fault tolerance and object management options behave in various deployment scenarios, and what each combination does to maintain data integrity. The tables explain what is possible in each type of object management given the following conditions:

Nodes: One or multiple nodes, where a node is a JVM containing one BusinessEvents server.

Agents: One or multiple inference agents. Each inference agent is configured by one BAR resource in a project. An inference agent has a Rete network. See Designing With Multiple Active Inference Agents for related details.

When implementing a recovery strategy for a rule engine product such as BusinessEvents, you must take care to maintain the integrity of stateful objects. Concepts and scorecards are stateful objects and must maintain state across inference agents. Not all object management options support this.

In Memory and Persistence OM: Behavior in Multiple-Agent Nodes

When multiple agents in a node use In Memory or Persistence object management options, concept instances and scorecards are not shared between them. For behavior of multiple agents in a node with In Memory OM, see Local Channels.

For behavior of multiple concurrent agents in Cache OM deployments, see Designing With Multiple Active Inference Agents.
Single node (including n Agents): Data is isolated to a single node JVM. No recovery.

n Nodes: Data is isolated in each node JVM. Failover and failback are allowed. Object state is not preserved or transferred. Recommended only for stateless operations.

n Nodes, n Agents: Data is isolated to each node JVM. No recovery. Object state is not maintained during failover and failback. Recommended only for stateless operations.
Fault Tolerance with Persistence-Based Object Management

As explained in the table below, the BusinessEvents built-in fault tolerance feature is not supported for use with persistence-based object management. You can implement a custom solution, however.
Data is isolated in a single persistence database. On recovery, object state is recovered to the last checkpoint.

n Agents: In all deployment scenarios, each agent’s data is isolated in a separate persistence database. On recovery, object state is recovered to the last checkpoint of the appropriate database.

n Nodes: Not supported with BusinessEvents built-in fault tolerance. Automatic failover and failback are not possible because of the presence of lock files. Use a custom solution.

n Nodes, n Agents: Not supported with BusinessEvents built-in fault tolerance. Automatic failover and failback are not possible because of the presence of lock files. Multiple write operations by agents on the primary node could lead to data inconsistency. Use a custom solution.

In all cases it is assumed that dedicated cache servers are also running. Fault tolerance of the engine process refers to inference agents only. See Distributed Cache and Multi-Engine Architecture and Terms.

If you use multi-engine features, fault tolerance is implicit. When all agents in an agent group are active and any active agent fails, the remaining agents in the group automatically handle the workload.

In all cases, in the event of total system failure, use of a backing store ensures recovery of data written to the backing store.
Table 21 Cache and Fault Tolerance Scenarios

n Agents: (N/A) Each agent in the same node is a different agent, not part of the same agent group.

n Nodes:
Multi-engine mode: If one or more agents in a group fail, the load is distributed among the remaining agents in that group. All agents can be active, or some can be inactive. Configuration uses a MaxActive property and a Priority property.
Single-engine mode: The Priority setting determines which agent in an agent group is active, as well as the failover and failback order.
Cluster data is shared between agents in all groups across all nodes, using the cache cluster.
If the number of cache object backups is one, one cache server (at a time) can fail with no data loss. If the number of backups is two, two servers can fail, and so on.
Because caches exist in memory only, recovery is not available in the case of a total system failure. All data in each JVM memory is lost in a total system failure.
In the event of total system failure, use of a backing store ensures recovery of data written to the backing store.
Multi-engine mode: N/A. Fault tolerance is implicit.

n Nodes, n Agents:
Same as n Nodes 1 agent. Each of the agents in one node is fault tolerant with the agents in the same agent group, which are deployed in other nodes.
Multi-engine mode: N/A. Fault tolerance is implicit.
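As a rough illustration of how a Priority setting and a MaxActive limit could decide which agents in an agent group are active, and in what order failover and failback occur, here is a minimal Java sketch. It is not BusinessEvents code: the `AgentGroupSketch` and `Agent` types and the `selectActive` method are hypothetical, the sketch assumes that a lower priority number marks a more preferred agent, and it models single-engine mode as a limit of one active agent. Check the product's configuration reference for the actual semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative model only; these types are not part of the BusinessEvents API. */
final class AgentGroupSketch {

    /** Hypothetical view of one inference agent in an agent group. */
    static final class Agent {
        final String name;
        final int priority;    // assumption: lower number = more preferred
        final boolean running; // false once the agent (or its node) has failed

        Agent(String name, int priority, boolean running) {
            this.name = name;
            this.priority = priority;
            this.running = running;
        }
    }

    /**
     * Returns the names of the agents that should be active: the running
     * agents in priority order, capped at maxActive.
     */
    static List<String> selectActive(List<Agent> group, int maxActive) {
        List<Agent> candidates = new ArrayList<>();
        for (Agent agent : group) {
            if (agent.running) {
                candidates.add(agent);
            }
        }
        // Stable sort: agents with equal priority keep their listed order.
        candidates.sort(Comparator.comparingInt((Agent a) -> a.priority));

        List<String> active = new ArrayList<>();
        for (Agent agent : candidates) {
            if (active.size() >= maxActive) {
                break;
            }
            active.add(agent.name);
        }
        return active;
    }

    public static void main(String[] args) {
        List<Agent> allUp = Arrays.asList(
                new Agent("A", 1, true), new Agent("B", 2, true), new Agent("C", 3, true));
        List<Agent> aDown = Arrays.asList(
                new Agent("A", 1, false), new Agent("B", 2, true), new Agent("C", 3, true));
        List<Agent> bDown = Arrays.asList(
                new Agent("A", 1, true), new Agent("B", 2, false), new Agent("C", 3, true));

        // One active agent at a time (how this sketch models single-engine mode).
        System.out.println(selectActive(allUp, 1)); // [A]
        System.out.println(selectActive(aDown, 1)); // [B]  failover to the next priority
        System.out.println(selectActive(allUp, 1)); // [A]  failback once A is running again

        // All agents active: the survivors carry the load when one fails.
        System.out.println(selectActive(bDown, 3)); // [A, C]
    }
}
```

Recomputing the active set from the current list of running agents covers both directions with one rule: removing a failed agent promotes the next one in priority order, and a recovered higher-priority agent displaces it again.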
Copyright © TIBCO Software Inc. All Rights Reserved.