Fault Tolerance of Agents
Inference and query agents in an agent group (that is, all agent instances of the same agent class deployed in the same cluster) are automatically fault tolerant.
Load is distributed equally among all active agents in the same group. If any agent fails, the load is automatically redistributed among the remaining active agents in the group.
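The following Python sketch illustrates the idea of equal distribution and automatic redistribution. It is a toy model only, not the product's implementation: the function name distribute, the agent names, and the round-robin strategy are all assumptions made for illustration.

```python
def distribute(tasks, active_agents):
    """Assign tasks round-robin so every active agent gets an equal share
    (shares differ by at most one task). Hypothetical helper, for illustration."""
    if not active_agents:
        raise ValueError("no active agents in the group")
    assignment = {agent: [] for agent in active_agents}
    for i, task in enumerate(tasks):
        assignment[active_agents[i % len(active_agents)]].append(task)
    return assignment


tasks = list(range(12))
group = ["agent-1", "agent-2", "agent-3"]  # hypothetical agent names

# All three agents are active: each one takes four tasks.
before = distribute(tasks, group)

# agent-2 fails: the same load is spread over the two survivors.
survivors = [agent for agent in group if agent != "agent-2"]
after = distribute(tasks, survivors)
```

In this model, each surviving agent's share grows from four tasks to six after the failure, and no task is left unassigned.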
You can optionally start only a certain number of agents in a group and keep the rest as standby agents. If an active agent fails, a standby agent is automatically activated. In most situations, however, there is no need to maintain standby agents.
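Standby activation can be pictured with a short Python sketch. The AgentGroup class, its max_active parameter, and the first-in-line promotion order are assumptions chosen for illustration; they do not describe the product's actual configuration settings or selection logic.

```python
class AgentGroup:
    """Toy model of an agent group with a cap on active agents (hypothetical)."""

    def __init__(self, agents, max_active):
        # The first max_active agents start as active; the rest wait as standby.
        self.active = list(agents[:max_active])
        self.standby = list(agents[max_active:])

    def fail(self, agent):
        """Remove a failed active agent and promote a standby agent, if any.

        Returns the promoted agent's name, or None if no standby was available.
        """
        self.active.remove(agent)
        if self.standby:
            promoted = self.standby.pop(0)
            self.active.append(promoted)
            return promoted
        return None


group = AgentGroup(["a", "b", "c", "d"], max_active=3)
promoted = group.fail("b")      # the standby agent "d" takes over
not_replaced = group.fail("a")  # no standby agents are left
```

After the second failure the group simply continues with fewer active agents, which is the behavior a group without standby agents shows from the start.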
Behavior of Standby Agents
Query agents do not maintain stateful objects. When a standby agent becomes active, it simply begins to take on work.
Standby inference agents maintain a passive Rete network. They do not listen to events from channels, do not update working memory, and do not read from or write to the cache.
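The passive behavior of a standby inference agent can be sketched as follows. This is a simplified model under stated assumptions: the InferenceAgent class, its on_event method, and the dictionary standing in for the cache are hypothetical. A real standby agent does not receive channel events at all, whereas this model receives and ignores them to keep the example short.

```python
class InferenceAgent:
    """Toy model of an inference agent that is passive while on standby."""

    def __init__(self, name, cache, active=False):
        self.name = name
        self.cache = cache          # plain dict standing in for the cache
        self.active = active
        self.working_memory = []    # stand-in for Rete working memory

    def activate(self):
        """Switch the agent from standby to active."""
        self.active = True

    def on_event(self, event):
        """Process an event only when active; return True if it was handled."""
        if not self.active:
            # Standby: no working memory update, no cache read or write.
            return False
        self.working_memory.append(event)
        self.cache[event["id"]] = event
        return True


cache = {}
agent = InferenceAgent("inference-2", cache)         # starts on standby
ignored = agent.on_event({"id": 1, "type": "order"})  # has no effect
agent.activate()
handled = agent.on_event({"id": 2, "type": "order"})  # processed normally
```

Only the event that arrives after activation reaches working memory and the cache in this model.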