
Load Balancing and Fault Tolerance Between Inference Agents
When multi-engine features are enabled, load balancing and fault tolerance between agents in an agent group are available. See Designing With Multiple Active Inference Agents for more on multi-engine features.
In single-engine mode, fault tolerance between agents in an agent group is available, but load balancing is not.
For configuration details, see Configuring Fault Tolerance and Load Balancing for an Inference Agent Group.
Load Balancing of Inference Agents in a Group (Multi-Engine Mode Only)
Load balancing enables horizontal and vertical scaling. The underlying cluster behaves like a database for all the agents connected to it. Concepts are shared among all agents, and the cluster sends notifications to the agents to keep their Rete networks synchronized.
Load balancing makes use of point-to-point messaging, such as JMS queues. With point-to-point communication, messages are automatically distributed among the members of the group. You can also use different agents to listen to different queues.
Every JMS input destination runs in its own JMS Session. This provides good throughput for processing and requires fewer connections (see Each JMS Input Destination Runs a Session). For example, multiple instances of an inference agent can listen to the same JMS queue.
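The point-to-point behavior described above can be illustrated with a minimal sketch. It does not use the JMS API or any product API: a java.util.concurrent.BlockingQueue stands in for the JMS queue, and the class name QueueLoadBalancingSketch, the Agent class, and the event names are all hypothetical. It only shows that each message placed on a shared queue is consumed by exactly one of the listening agents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class QueueLoadBalancingSketch {
    // Marker message telling a consumer to stop (an assumption of this sketch).
    static final String STOP = "__stop__";

    /** Hypothetical stand-in for one inference agent instance listening on the queue. */
    static final class Agent implements Runnable {
        private final String name;
        private final BlockingQueue<String> queue;
        private final Map<String, String> handledBy;

        Agent(String name, BlockingQueue<String> queue, Map<String, String> handledBy) {
            this.name = name;
            this.queue = queue;
            this.handledBy = handledBy;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    String message = queue.take();
                    if (message.equals(STOP)) {
                        return;
                    }
                    // Record the consumer; a comma would reveal a double delivery.
                    handledBy.merge(message, name, (a, b) -> a + "," + b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Sends messageCount messages to agentCount agents; returns message -> consuming agent. */
    static Map<String, String> distribute(int agentCount, int messageCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Map<String, String> handledBy = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        try {
            for (int i = 0; i < agentCount; i++) {
                Thread t = new Thread(new Agent("agent-" + i, queue, handledBy));
                threads.add(t);
                t.start();
            }
            for (int m = 0; m < messageCount; m++) {
                queue.put("event-" + m);
            }
            // One stop marker per agent, queued after all real messages.
            for (int i = 0; i < agentCount; i++) {
                queue.put(STOP);
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while distributing", e);
        }
        return handledBy;
    }

    public static void main(String[] args) {
        Map<String, String> result = distribute(3, 12);
        System.out.println("messages handled: " + result.size());
    }
}
```

How the messages are split among the agents varies from run to run in this sketch; the only guarantee it demonstrates is single delivery per message.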
Certain aspects of the design have to be managed by the application. See Designing With Multiple Active Inference Agents for related information.
Fault Tolerance Between Inference Agents in a Group
In multi-engine mode, all agents in an inference agent group automatically behave in a fault-tolerant manner. The load is distributed equally among all active agents. If any agent fails, the load is automatically redistributed among the remaining active agents.
You can optionally start a certain number of agents and specify that some of them remain inactive. If an active agent fails, an inactive agent is automatically activated.
For many situations, there is no need to maintain inactive nodes.
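As a rough model of the multi-engine behavior described above, the following sketch tracks the state of each agent in a group and activates an inactive agent when an active one fails. The class AgentGroupFailoverSketch, its State enum, and the constructor parameters started and maxActive are hypothetical names chosen for illustration; they are not product settings, and the choice of which inactive agent is activated first is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class AgentGroupFailoverSketch {
    enum State { ACTIVE, INACTIVE, FAILED }

    private final List<State> agents = new ArrayList<>();

    /** Starts `started` agents; the first `maxActive` are active, the rest stay inactive. */
    AgentGroupFailoverSketch(int started, int maxActive) {
        for (int i = 0; i < started; i++) {
            agents.add(i < maxActive ? State.ACTIVE : State.INACTIVE);
        }
    }

    /** Marks an agent as failed; if it was active, activates one inactive agent if any remains. */
    void fail(int index) {
        boolean wasActive = agents.get(index) == State.ACTIVE;
        agents.set(index, State.FAILED);
        if (wasActive) {
            // Assumption: the first inactive agent in the list is chosen.
            int standby = agents.indexOf(State.INACTIVE);
            if (standby >= 0) {
                agents.set(standby, State.ACTIVE);
            }
        }
    }

    State stateOf(int index) {
        return agents.get(index);
    }

    /** Number of agents currently sharing the load. */
    long activeCount() {
        return agents.stream().filter(s -> s == State.ACTIVE).count();
    }

    public static void main(String[] args) {
        AgentGroupFailoverSketch group = new AgentGroupFailoverSketch(4, 3);
        group.fail(0); // the inactive agent takes over
        System.out.println("active after first failure: " + group.activeCount());
        group.fail(1); // no inactive agent left; remaining agents share the load
        System.out.println("active after second failure: " + group.activeCount());
    }
}
```

With no inactive agents configured, a failure simply leaves fewer active agents to share the load, which is why inactive agents are often unnecessary.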
In single-engine mode, only one agent in a group is active at a time (because each agent instance is in a different engine). A priority property determines the startup order and the failover and failback order.
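The priority-driven failover and failback described above can be sketched as follows. This is a simplified model, not the product's implementation: the class PriorityFailoverSketch and its methods are hypothetical, and the sketch assumes that a lower number means a higher priority and that priorities are distinct. Check the configuration reference for the actual meaning of the priority property.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class PriorityFailoverSketch {
    private final Map<String, Integer> priorityByAgent = new HashMap<>();
    private final Set<String> running = new HashSet<>();

    /** Registers an agent with its priority (assumption: lower number = higher priority). */
    void register(String agent, int priority) {
        priorityByAgent.put(agent, priority);
    }

    void start(String agent) {
        if (!priorityByAgent.containsKey(agent)) {
            throw new IllegalArgumentException("unknown agent: " + agent);
        }
        running.add(agent);
    }

    void stop(String agent) {
        running.remove(agent);
    }

    /** The single active agent: the running agent with the highest priority, if any. */
    Optional<String> active() {
        return running.stream()
                .min(Comparator.comparingInt((String name) -> priorityByAgent.get(name)));
    }

    public static void main(String[] args) {
        PriorityFailoverSketch group = new PriorityFailoverSketch();
        group.register("primary", 1);
        group.register("secondary", 2);
        group.start("primary");
        group.start("secondary");
        System.out.println("active: " + group.active().orElse("none"));
        group.stop("primary");   // failover to the next agent by priority
        System.out.println("after failover: " + group.active().orElse("none"));
        group.start("primary");  // failback once the higher-priority agent returns
        System.out.println("after failback: " + group.active().orElse("none"));
    }
}
```

Because the active agent is recomputed from the set of running agents, the same rule covers both failover and failback in this model.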
Behavior of Inactive Agents
Inactive agents maintain a passive Rete network. They do not listen to events from channels, do not update working memory, and do not perform read or write operations on the cache.
On failover to an inactive agent, startup rule functions do not execute when the agent becomes active.
Fault Tolerance of Cache Data
Note that fault tolerance of cache servers is handled transparently by the object management layer, and that query agents do not require fault tolerance. For fault tolerance of cache data, the only configuration tasks are to define the number of backups you want to keep and to provide sufficient storage capacity (see Reliability of Cache Object Management).