Designing With Multiple Active Inference Agents
You can use multiple active inference agents to achieve load balancing, scaling, and performance. These capabilities are known as multi-engine features. Keep the following design considerations in mind when you design projects that take advantage of them.
Multi-engine features can be used in two ways:
Deployment of multiple instances of the same agent, each on a different node, for load balancing and fault tolerance.
Deployment of instances of different agents, to achieve rule-chaining that distributes the workload widely and gives high performance.
In both multi-engine cases, the agents are in the same cache cluster and work on the same ontology objects.
Concepts are Shared Across Agents Asynchronously
All concept objects are shared between agents in the cluster in an asynchronous manner.
For instance, agent X receives an event E and fires a rule R1 that creates a concept C1. Agent Z receives an event E2 and fires a rule R2 that joins concept C1 and event E2. Rule R2 can fire only after agent Z has been notified of C1.
There is an inherent latency between an object changing in one agent and the other agents in the cluster receiving the notification.
Because objects are shared asynchronously between agents, give events an infinite time-to-live (TTL) setting and consume them explicitly.
See the StateMachineMultiEngineExample example for a demonstration of this point. (Requires TIBCO BusinessEvents Enterprise Suite.)
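The effect of notification latency on an event's time to live can be illustrated with a small Java model. This is a minimal sketch, not the BusinessEvents API: the class AsyncShareSketch, the method canJoin, and the millisecond timings are hypothetical, and an empty OptionalLong is used here to stand for an infinite TTL.

```java
import java.util.OptionalLong;

/** Toy model (not the BusinessEvents API) of an event waiting for an asynchronously shared concept. */
final class AsyncShareSketch {

    /**
     * @param eventArrivalMs   time at which agent Z asserts event E2
     * @param conceptVisibleMs time at which the notification for concept C1 reaches agent Z
     * @param ttlMs            finite time to live of E2, or empty to model an infinite TTL
     * @return true if E2 is still available when C1 becomes visible to agent Z
     */
    static boolean canJoin(long eventArrivalMs, long conceptVisibleMs, OptionalLong ttlMs) {
        if (conceptVisibleMs <= eventArrivalMs) {
            return true; // C1 was already visible when E2 arrived
        }
        if (!ttlMs.isPresent()) {
            return true; // infinite TTL: E2 waits until a rule consumes it
        }
        // Finite TTL: E2 must outlive the notification latency.
        return eventArrivalMs + ttlMs.getAsLong() >= conceptVisibleMs;
    }

    public static void main(String[] args) {
        // E2 arrives at t=100 ms; C1 becomes visible to agent Z at t=150 ms.
        System.out.println(canJoin(100, 150, OptionalLong.of(20)));  // false: E2 expired first
        System.out.println(canJoin(100, 150, OptionalLong.empty())); // true: E2 is still waiting
    }
}
```

In this model, a finite TTL shorter than the notification latency causes the join to be missed, which is why an infinite TTL combined with explicit consumption is the safer design.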
Scorecards are Local to the Agent
Scorecards are not shared between agents. (This is true in all OM types.) Each inference agent maintains its own set of scorecards and the values in each agent can differ. This enables scorecards to be used for local purposes and minimizes contention between the agents.
As an analogy, consider a bank ATM scenario. Money can be drawn from the same account using different ATMs. However, each ATM maintains a "scorecard" indicating only how much money that ATM has dispensed.
An agent key property (Agent.AgentGroupName.key) is available for tracking scorecards. See Defining a Unique Key for Each Agent.
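The ATM analogy above can be sketched in Java as follows. This is an illustration only, assuming hypothetical class names (ScorecardSketch, Account, Atm); it does not use BusinessEvents scorecard or concept APIs. The account stands in for a shared concept, and each ATM's running total stands in for an agent-local scorecard.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of the ATM analogy; all names are hypothetical. */
final class ScorecardSketch {

    /** Shared state, standing in for a concept in the cache cluster. */
    static final class Account {
        final AtomicLong balance;

        Account(long openingBalance) {
            balance = new AtomicLong(openingBalance);
        }
    }

    /** One ATM, standing in for one inference agent with its own scorecard. */
    static final class Atm {
        private long dispensed; // local "scorecard": never shared with other ATMs

        void withdraw(Account account, long amount) {
            account.balance.addAndGet(-amount); // shared: every ATM sees the new balance
            dispensed += amount;                // local: only this ATM sees this total
        }

        long dispensed() {
            return dispensed;
        }
    }

    public static void main(String[] args) {
        Account account = new Account(500);
        Atm atm1 = new Atm();
        Atm atm2 = new Atm();
        atm1.withdraw(account, 100);
        atm2.withdraw(account, 50);
        atm1.withdraw(account, 25);
        System.out.println(account.balance.get()); // 325
        System.out.println(atm1.dispensed());      // 125
        System.out.println(atm2.dispensed());      // 50
    }
}
```

Because each total is local, the ATMs never contend with each other when updating it. Only the shared balance needs coordination.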
Events are Clustered but not Shared Across Agents in a Group
In a load-balanced and (optionally) fault-tolerant group of agent instances, event instances are clustered between the agents in the agent group; they are not shared. That is, each event instance is present on only one agent in the cluster.
For instance, when agent X receives an event E1, agent B does not see the event.
If a node fails, cache cluster services reassign ownership of the events owned by the failed agent to the remaining agents in the same group. (Therefore there is no single point of failure.)
Note that this reassignment can happen only if the event’s time-to-live (TTL) is long enough for other agents to receive cache notification of the event.
See the Events-MultiEngineExample example for a demonstration of this point.
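The ownership behavior described in this section can be modeled with a short Java sketch. It is a simplified illustration, not the cache cluster implementation: the class EventOwnershipSketch and its methods are hypothetical, and the round-robin redistribution is an assumption made only to keep the example concrete.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of clustered (single-owner) events; all names are hypothetical. */
final class EventOwnershipSketch {

    // Agent name -> events it owns. Each event appears under exactly one agent.
    private final Map<String, List<String>> owned = new LinkedHashMap<>();

    void addAgent(String agent) {
        owned.putIfAbsent(agent, new ArrayList<>());
    }

    /** The receiving agent becomes the only owner of the event. */
    void receive(String agent, String event) {
        owned.get(agent).add(event);
    }

    boolean sees(String agent, String event) {
        List<String> events = owned.get(agent);
        return events != null && events.contains(event);
    }

    /** Hands the failed agent's events to the surviving agents (round-robin is an assumption). */
    void fail(String agent) {
        List<String> orphans = owned.remove(agent);
        if (orphans == null || owned.isEmpty()) {
            return; // unknown agent, or no survivor to take over
        }
        List<String> survivors = new ArrayList<>(owned.keySet());
        for (int i = 0; i < orphans.size(); i++) {
            owned.get(survivors.get(i % survivors.size())).add(orphans.get(i));
        }
    }

    public static void main(String[] args) {
        EventOwnershipSketch group = new EventOwnershipSketch();
        group.addAgent("X");
        group.addAgent("B");
        group.receive("X", "E1");
        System.out.println(group.sees("B", "E1")); // false: E1 is owned by X only
        group.fail("X");
        System.out.println(group.sees("B", "E1")); // true: ownership moved to B
    }
}
```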
Event Related Constraints
Repeating Time Events Not Supported
Time events configured to repeat at intervals are not supported in multiple-agent (multi-engine) configurations. Rule-based time events, however, are supported.
State Machine Timeouts
State machines can be configured to have state timeouts. The state machine objects are shared across all agents, and the agents in the cluster collaborate to take ownership of management of the state machines, thereby providing automatic fault tolerance.
Understanding Concurrency and Locking Issues
Multiple agents can read and write to the same cache cluster and at times operate on the same set of objects. You must therefore deal with the possibility of concurrent modifications. Use locks to avoid race conditions.
See Locking and Synchronization Functions in Preprocessors for details on managing access to objects in preprocessors, which are multi-threaded.
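The general pattern of guarding a read-modify-write with a lock can be sketched in plain Java. This example deliberately uses java.util.concurrent instead of the BusinessEvents locking functions, whose names and signatures are not shown here. The class LockSketch, the key "C1.count", and the two threads standing in for agents are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of per-key locking around a read-modify-write; not the BusinessEvents API. */
final class LockSketch {

    private final Map<String, Long> cache = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    void increment(String key) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            // Without the lock, two writers could read the same value and lose an update.
            long current = cache.getOrDefault(key, 0L);
            cache.put(key, current + 1);
        } finally {
            lock.unlock(); // always release, even if the update throws
        }
    }

    long get(String key) {
        return cache.getOrDefault(key, 0L);
    }

    /** Runs two threads (stand-ins for two agents) that update the same key. */
    static long run(int updatesPerAgent) {
        LockSketch sketch = new LockSketch();
        Runnable agent = () -> {
            for (int i = 0; i < updatesPerAgent; i++) {
                sketch.increment("C1.count");
            }
        };
        Thread a1 = new Thread(agent);
        Thread a2 = new Thread(agent);
        a1.start();
        a2.start();
        try {
            a1.join();
            a2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sketch.get("C1.count");
    }

    public static void main(String[] args) {
        System.out.println(run(1000)); // 2000: no update is lost
    }
}
```

Locking per key rather than globally keeps writers that touch unrelated objects from blocking each other.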
Updates are Done at the Property Level
Multiple agents can work on different properties of the same concept without issues. The object itself does not need to be explicitly locked. Agents update the cluster based on the actual properties being modified.
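The idea of property-level updates can be illustrated with the following Java sketch. It is a simplified model under stated assumptions, not the product's write-back mechanism: the class PropertyUpdateSketch, the nested WorkingCopy class, and the property names are hypothetical. Each working copy records only the properties it changed and writes back only those.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of property-level write-back for one concept; all names are hypothetical. */
final class PropertyUpdateSketch {

    // Cluster copy of the concept: property name -> value.
    private final Map<String, Object> clusterCopy = new HashMap<>();

    /** An agent's working copy that tracks which properties it modified. */
    final class WorkingCopy {
        private final Map<String, Object> dirty = new HashMap<>();

        void set(String property, Object value) {
            dirty.put(property, value);
        }

        /** Writes back only the modified properties, leaving all others untouched. */
        void commit() {
            clusterCopy.putAll(dirty);
            dirty.clear();
        }
    }

    WorkingCopy checkout() {
        return new WorkingCopy();
    }

    Object get(String property) {
        return clusterCopy.get(property);
    }

    public static void main(String[] args) {
        PropertyUpdateSketch c1 = new PropertyUpdateSketch();
        WorkingCopy agentA = c1.checkout();
        WorkingCopy agentB = c1.checkout();
        agentA.set("status", "OPEN");
        agentB.set("amount", 42);
        agentA.commit();
        agentB.commit();
        System.out.println(c1.get("status")); // OPEN
        System.out.println(c1.get("amount")); // 42
    }
}
```

If each agent wrote back the whole object instead, the second commit would overwrite the first agent's change. Writing only modified properties avoids that when the agents touch different properties; two agents changing the same property still need a lock.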
Multi-Engine Example
The following example shows how concepts are shared and events are clustered in a load balancing agent group.
Rule set R has the following rules:
Rule A: Scope: Event E1
   Condition: None
   Action: Create a concept C1
Rule B: Scope: Event E2
   Concept C1
   Condition: E2.x == C1.x;
   Action: Send an event E3
You can start multiple instances of an agent. Suppose agents A1 and A2 are instances of the same agent. Each contains rule set R.
Each agent instance has an internal ID that keeps it distinct from other agents. Assume that the IDs are 1 and 2 respectively.
On startup, both agents are activated according to the maxActive setting, so both listen to the destinations on which events (E1 and E2) arrive.
The following scenario describes the behavior:
1. Agent A1 (id=1) receives an instance of Event E1:
1.
2.
3.
Events are clustered but not shared. For example, agent A1 receives the event E1. If the event's time to live is greater than 0, the event is acknowledged in the channel and moved to the cluster.
2. Agent A2 receives an instance of Event E2:
1.
2.
If the node containing agent A1 fails, agent A2 moves the pending events to its Rete network.