Load balancing enables horizontal and vertical scaling. The underlying cluster behaves like a database for all the agents connected to the cluster. Load balancing makes use of point-to-point messaging, such as JMS queues. With point-to-point communication, messages are automatically distributed among the members of an agent group. (You can also use different agents to listen to different queues.)
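The point-to-point behavior described above can be illustrated with a minimal sketch that uses only the Java standard library. It is not the product's API and it does not use JMS. The in-memory queue stands in for a JMS queue, and the class name `PointToPointSketch`, the method `distribute`, and the agent names are all hypothetical. The sketch assumes that every agent in the group consumes from the same queue, so each message is delivered to exactly one agent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class PointToPointSketch {

    /**
     * Starts agentCount competing consumers on one shared queue (a stand-in
     * for a JMS queue) and returns the number of messages each one handled.
     */
    public static Map<String, Integer> distribute(int agentCount, int messageCount) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messageCount; i++) {
            queue.add(i);
        }

        Map<String, Integer> handled = new ConcurrentHashMap<>();
        List<Thread> agents = new ArrayList<>();
        for (int a = 0; a < agentCount; a++) {
            String name = "agent-" + a; // hypothetical agent name
            handled.put(name, 0);
            agents.add(new Thread(() -> {
                // poll() removes the head of the queue, so a message taken by
                // one agent is never seen by another agent in the group.
                while (queue.poll() != null) {
                    handled.merge(name, 1, Integer::sum);
                }
            }));
        }

        for (Thread t : agents) {
            t.start();
        }
        try {
            for (Thread t : agents) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for agents", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = distribute(3, 300);
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        System.out.println("per-agent counts: " + counts + ", total: " + total);
    }
}
```

The per-agent counts vary from run to run because they depend on thread scheduling, but the total always equals the number of messages placed on the queue. A real messaging provider applies its own dispatch policy, so the split between agents may differ from what this simulation shows.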
All inference agents in an agent group (that is, all agent instances of the same agent class deployed in the same cluster) automatically behave in a fault-tolerant manner. Load is distributed equally among all active agents in the group. If an agent fails, its load is automatically redistributed among the remaining active agents in the group.
You can optionally start a certain number of agents in a group and keep the rest as standby agents. If an active agent fails, a standby agent is automatically activated. In most situations, there is no need to maintain standby agents.
Standby agents maintain a passive Rete network. They do not listen to events from channels, do not update working memory, and do not perform read or write operations on the cache.
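The relationship between active and standby agents can be sketched as a small bookkeeping model. This is an illustration under stated assumptions, not the product's implementation: the class `StandbySketch`, the `maxActive` limit, and the first-come promotion order are all hypothetical. It assumes that agents started after the active limit is reached wait as standbys, and that the longest-waiting standby is promoted when an active agent fails.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

public class StandbySketch {
    private final Set<String> active = new LinkedHashSet<>();
    private final Deque<String> standby = new ArrayDeque<>();
    private final int maxActive; // hypothetical limit on active agents in the group

    public StandbySketch(int maxActive) {
        this.maxActive = maxActive;
    }

    /** Starts an agent: active if there is room, otherwise standby. */
    public void start(String agent) {
        if (active.size() < maxActive) {
            active.add(agent);
        } else {
            standby.addLast(agent);
        }
    }

    /** Marks an agent as failed; promotes one standby if an active agent was lost. */
    public void fail(String agent) {
        if (active.remove(agent)) {
            String promoted = standby.pollFirst(); // null when no standby is waiting
            if (promoted != null) {
                active.add(promoted);
            }
        } else {
            standby.remove(agent);
        }
    }

    public Set<String> activeAgents() {
        return new LinkedHashSet<>(active);
    }

    public int standbyCount() {
        return standby.size();
    }

    public static void main(String[] args) {
        StandbySketch group = new StandbySketch(2);
        group.start("agent-a");
        group.start("agent-b");
        group.start("agent-c"); // limit reached, so this agent waits as a standby
        group.fail("agent-a");  // agent-c is promoted in its place
        System.out.println("active: " + group.activeAgents()
                + ", standby: " + group.standbyCount());
    }
}
```

When no standby is waiting, a failure simply shrinks the active set, which corresponds to the case where the remaining active agents absorb the load.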