A TIBCO Hawk Agent manages and monitors applications and systems based on configuration objects such as rulebases, schedules, and a rulebase map loaded on the Agent.
A rulebase map directs TIBCO Hawk agents or groups of agents on your network to load particular rulebases at startup. For example, using a rulebase map you can instruct an agent to load a rulebase designed specifically for the operating system where it runs.
Every rulebase contains rules which are made up of data sources, tests, and actions. Each rule contains management logic. The management logic in a rule is defined by the tests and actions to be taken from data collected from a given data source. If a schedule is specified in a rulebase, rule, test or action, it will determine if these objects should be active or not at a specified time.
A rulebase is a configuration object that provides the rules for the monitoring activities that are to be autonomously performed on an agent. At the core of all rulebase monitoring activity is the collection of data, testing of that data, and taking actions based on the test results. All monitored data is provided by the agent's microagents through microagent subscriptions. All actions taken by a rulebase are in the form of method invocations. Rulebase objects specify their data sources and actions using the MethodSubscription and MethodInvocation classes of the Console API. Therefore, understanding these, and related classes, is a prerequisite for using the Configuration API. For more information on these classes, refer to Chapter 2, Console API.
While rulebases are merely configuration objects, it is useful to think of them as having runtime behavior in order to understand how the RuleBaseEngine processes them. Thus this section discusses rulebases, rules, tests, and actions as if they contain logic which carries out their execution.
A rulebase object is primarily composed of a set of rule objects. Each rule has a Data Source and a list of Test objects. Each Test has a TestExpressionOperator object and a list of ConsequenceAction objects. Thus, a rulebase can be represented as a tree structure with a single Rulebase object as the root and Action objects as the leaves.
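The tree shape described above can be pictured with a few stand-in classes. This is only a minimal sketch: `SketchRulebase`, `SketchRule`, `SketchTest`, and `countActions` are hypothetical names invented for illustration and are not the Hawk Configuration API classes.

```java
import java.util.List;

// Hypothetical stand-ins for the configuration objects; these are NOT the Hawk API classes.
class RulebaseTreeSketch {
    static final class SketchTest {
        final String expression;      // stands in for a TestExpressionOperator
        final List<String> actions;   // stands in for ConsequenceAction objects (the leaves)
        SketchTest(String expression, List<String> actions) {
            this.expression = expression;
            this.actions = actions;
        }
    }

    static final class SketchRule {
        final String dataSource;      // stands in for a data source (method subscription)
        final List<SketchTest> tests;
        SketchRule(String dataSource, List<SketchTest> tests) {
            this.dataSource = dataSource;
            this.tests = tests;
        }
    }

    static final class SketchRulebase {
        final String name;            // the root of the tree
        final List<SketchRule> rules;
        SketchRulebase(String name, List<SketchRule> rules) {
            this.name = name;
            this.rules = rules;
        }
    }

    // Walks root -> rules -> tests and counts the action leaves.
    static int countActions(SketchRulebase rulebase) {
        int count = 0;
        for (SketchRule rule : rulebase.rules) {
            for (SketchTest test : rule.tests) {
                count += test.actions.size();
            }
        }
        return count;
    }

    static SketchRulebase example() {
        SketchTest cpuHigh = new SketchTest("cpu > 90", List.of("send alert", "run script"));
        SketchTest diskLow = new SketchTest("disk free < 5", List.of("send alert"));
        return new SketchRulebase("example", List.of(
                new SketchRule("cpu usage", List.of(cpuHigh)),
                new SketchRule("disk usage", List.of(diskLow))));
    }

    public static void main(String[] args) {
        System.out.println("action leaves: " + countActions(example()));
    }
}
```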
The RulebaseElement class provides common methods to get and set the element's name and schedule parameters. The Rule, Test and Action classes do not require a name to be specified in the constructor. Only the Rulebase class requires a name specified in the constructor. In places where an array of RulebaseElement objects is required, all elements in the array must have unique names.
A Rule consists of a data source and a list of tests. The DataSource of a rule specifies a MethodSubscription, which supplies a stream of data samples to be monitored. The method used in the MethodSubscription can be either synchronous or asynchronous. Every new data sample from a Rule object's data source is distributed to all Test objects contained within that Rule object.
The data source for a Rule is its source of input data, and is always a method subscription to a microagent. The data source of a Rule provides information about some condition on a managed node. After information is received, one or more tests are applied to evaluate it. The MethodSubscription of a data source provides a stream of data objects.
The microAgentName and the method name used to construct the DataSource can be obtained from MicroAgentDescriptor and MethodDescriptor, respectively.
Tests define the evaluations that are performed on data from the rule's data source and the actions to take as a result. Each test uses the data to compute a true or false value which is used in determining when to trigger actions. Test objects have a state that is either true or false. The initial state of a new Test object is false. State transitions are caused by evaluating the received data based on the specified conditions and the policies of the test. The possible Test object state transitions are:
All Test object state transitions cause its ConsequenceAction objects to be evaluated. The policy of each ConsequenceAction object governs whether an evaluation results in an action execution. The ClearAction objects are a list of actions that will be executed when the Test object undergoes the T->F transition.
State transitions resulting from the receipt of data start with an evaluation of the TestExpressionOperator against the data. The resulting true or false value of the TestExpression, in conjunction with the Test object's TrueConditionPolicy and ClearConditionPolicy, determines the type of Test object state transition, as follows:
The ClearTest policy specifies an additional test expression (the clear test expression), which governs when the T->F transition occurs. The clear test expression receives data each time the Test object receives data. It causes a T->F transition of the Test object if the current state of the Test object is true and the clear test expression evaluates to true.
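The interaction between a test expression and a clear test expression can be sketched as a small state machine. This is an illustrative model only, not the rules engine's implementation: the class `TestStateSketch` is hypothetical, it assumes the test expression alone drives the F->T transition, and the "F->F" label is simply this sketch's way of reporting that nothing changed.

```java
import java.util.function.Predicate;

// Hypothetical model of a Test object that uses a clear test expression.
class TestStateSketch {
    private boolean state = false;                 // a new Test object starts out false
    private final Predicate<Double> testExpression;
    private final Predicate<Double> clearExpression;

    TestStateSketch(Predicate<Double> testExpression, Predicate<Double> clearExpression) {
        this.testExpression = testExpression;
        this.clearExpression = clearExpression;
    }

    // Feeds one data sample to the test and returns a label for the resulting transition.
    String onData(double sample) {
        boolean before = state;
        if (!before) {
            // Assumption: while false, the test expression decides whether F->T occurs.
            state = testExpression.test(sample);
        } else if (clearExpression.test(sample)) {
            // While true, only the clear test expression can cause T->F.
            state = false;
        }
        return (before ? "T" : "F") + "->" + (state ? "T" : "F");
    }

    public static void main(String[] args) {
        TestStateSketch test = new TestStateSketch(v -> v > 90, v -> v < 70);
        for (double sample : new double[] {50, 95, 80, 60}) {
            System.out.println(sample + " : " + test.onData(sample));
        }
    }
}
```

With a test expression of "value > 90" and a clear test expression of "value < 70", a sample of 80 leaves the test true even though the test expression itself is no longer satisfied, which is the hysteresis that a clear test provides.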
TestExpressionOperator objects are created using the Operator class. The static method Operator.getOperatorDescriptors() returns a list of descriptors describing all available operators. Using the information in the OperatorDescriptor, you can then build instances of the Operator class by supplying the operator name and a list of operands. The operands you supply must be of the same number and type as those specified by the corresponding descriptor. An operand of an operator may itself be another operator, as long as its stated return type matches the operand position it occupies. Operators can thus be nested to form more complex operators.
Although operators can return different types, only those which return a Boolean value may be used in tests (that is, as arguments to Test.setTestExpressionOperator()). The other non-Boolean operators are used only as nested operators.
Test operators access the rule's data source through the COM.TIBCO.hawk.config.rbengine.rulebase.operators.getRuleData operator. This operator takes a name and returns the associated data. As described, if a data source produces TabularData then that data is decomposed into CompositeData objects before being seen by the tests, and thus by the getRuleData operator. The name parameter to this operator references the corresponding data element of the CompositeData object, which is then returned by the getRuleData operator. If the data source produces one of the remaining OpenData types (String, Char, Boolean, Byte, Short, Integer, Long, Float, Double) then that value is accessible via the getRuleData operator using the name assigned to the return type in the MethodDescriptor for this data source.
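The nesting and typing rules for operators can be illustrated with a toy expression tree. The sketch below does not use the Hawk Operator or OperatorDescriptor classes; `Op`, `getRuleData`, `constant`, `greaterThan`, and `evaluateTest` are hypothetical helpers that only mimic the ideas of a declared return type, nested operands, and a Boolean-only root.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical operator tree; NOT the Hawk Operator class.
class OperatorNestingSketch {
    static final class Op {
        final Class<?> returnType;                          // declared return type
        final Function<Map<String, Object>, Object> body;   // evaluates against rule data
        Op(Class<?> returnType, Function<Map<String, Object>, Object> body) {
            this.returnType = returnType;
            this.body = body;
        }
    }

    // Looks up a named data element, in the spirit of the getRuleData operator.
    static Op getRuleData(String name, Class<?> type) {
        return new Op(type, data -> data.get(name));
    }

    static Op constant(Double value) {
        return new Op(Double.class, data -> value);
    }

    // A Boolean operator whose two operands must themselves return Double.
    static Op greaterThan(Op left, Op right) {
        requireType(left, Double.class);
        requireType(right, Double.class);
        return new Op(Boolean.class, data -> {
            Double l = (Double) left.body.apply(data);
            Double r = (Double) right.body.apply(data);
            return l > r;
        });
    }

    static void requireType(Op op, Class<?> expected) {
        if (op.returnType != expected) {
            throw new IllegalArgumentException("operand must return " + expected.getSimpleName());
        }
    }

    // Only an operator that returns Boolean may serve as the root of a test expression.
    static boolean evaluateTest(Op root, Map<String, Object> data) {
        requireType(root, Boolean.class);
        return (Boolean) root.body.apply(data);
    }

    public static void main(String[] args) {
        Op expression = greaterThan(getRuleData("cpu", Double.class), constant(90.0));
        System.out.println(evaluateTest(expression, Map.of("cpu", 95.0)));
    }
}
```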
The ConsequenceAction object extends the Action object. The Test object invokes its ConsequenceAction objects each time the Test object makes a state transition. The type of transition along with the ConsequenceAction object's PerformActionPolicy and EscalationPeriod determines whether or not the action is executed.
A True Series of transitions is defined as a series of transitions that begins with F->T and is followed by one or more T->T transitions. A T->F marks the end of a true series but is not part of it.
Actions are not necessarily enabled during an entire true series. The EscalationPeriod specifies the number of seconds that must elapse since the start of a true series before the action becomes enabled. An EscalationPeriod of 0 indicates that the action is always enabled. Actions may only execute when enabled.
The PerformOnceOnly policy causes the action to be executed only once during a true series. An exception to this rule involves variable substitution. If variable substitution would result in a different action than the last one that has executed within the current true series (for example, raising an alert with different text), then the action will also be re-executed on the current T->T transition.
The PerformAlways policy causes the action to be executed upon every evaluation within a true series (after the action has become enabled).
The PerformCountOnInterval policy is more involved. It takes a count X and an interval of Y seconds. It causes the action to be executed at the start of a true series (or as soon as it becomes enabled), and on subsequent evaluations within the same series that occur more than Y seconds after the last action execution within the current true series. This continues until the action has executed a maximum of X times within the current true series.
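The three policies and the escalation period can be compared side by side in a small decision function. This is a sketch under stated assumptions, not the engine's logic: `ActionPolicySketch` and its members are hypothetical, time is passed in as seconds since the start of the true series, the action is treated as enabled once that time reaches the escalation period, and the variable-substitution exception for PerformOnceOnly is deliberately left out.

```java
// Hypothetical model of how a ConsequenceAction decides whether to execute.
class ActionPolicySketch {
    enum Policy { PERFORM_ONCE_ONLY, PERFORM_ALWAYS, PERFORM_COUNT_ON_INTERVAL }

    private final Policy policy;
    private final long escalationSeconds;  // EscalationPeriod; 0 means always enabled
    private final int maxCount;            // X: maximum executions per true series
    private final long intervalSeconds;    // Y: required gap between executions
    private int executed = 0;
    private long lastExecution = 0;

    ActionPolicySketch(Policy policy, long escalationSeconds, int maxCount, long intervalSeconds) {
        this.policy = policy;
        this.escalationSeconds = escalationSeconds;
        this.maxCount = maxCount;
        this.intervalSeconds = intervalSeconds;
    }

    // Call on the F->T transition that begins a new true series.
    void startTrueSeries() {
        executed = 0;
        lastExecution = 0;
    }

    // Returns true if the action would execute on an evaluation at the given time.
    boolean evaluate(long secondsIntoSeries) {
        if (secondsIntoSeries < escalationSeconds) {
            return false;                  // not yet enabled
        }
        boolean run;
        switch (policy) {
            case PERFORM_ALWAYS:
                run = true;
                break;
            case PERFORM_ONCE_ONLY:
                run = executed == 0;
                break;
            default:                       // PERFORM_COUNT_ON_INTERVAL
                run = executed < maxCount
                        && (executed == 0 || secondsIntoSeries - lastExecution > intervalSeconds);
                break;
        }
        if (run) {
            executed++;
            lastExecution = secondsIntoSeries;
        }
        return run;
    }

    public static void main(String[] args) {
        ActionPolicySketch action =
                new ActionPolicySketch(Policy.PERFORM_COUNT_ON_INTERVAL, 30, 2, 60);
        action.startTrueSeries();
        for (long t : new long[] {0, 30, 60, 100, 200}) {
            System.out.println("t=" + t + "s executes: " + action.evaluate(t));
        }
    }
}
```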
Alerts are generated when a ConsequenceAction invokes the sendAlertMessage method on the RuleBaseEngine microagent. The method takes a single argument named 'message'. The value of the argument may be one of the following objects:
AlertLow, AlertMedium, and AlertHigh correspond to alerts with levels from low to high. They are useful for sending non-alert type messages. All methods take a single string argument called 'alertMsg'. Alerts are cleared when the Test object that generated the alert transitions T->F.
Posted Conditions are "posted" when a ConsequenceAction object invokes the method postCondition on the RuleBaseEngine microagent. A posted condition is an internal status message, similar to an alert message. It takes a single argument called 'condition'. The following code fragment constructs a valid ConsequenceAction which posts the condition "disk full":
A ClearAction may not contain a MethodInvocation with the postCondition method. Posted Conditions are "cleared" or "unposted" when the enclosing Test object transitions T->F. Posted conditions provide a mechanism for different rules within the same rulebase to communicate. One of the restrictions on posted conditions is that no two ConsequenceAction objects in the same rulebase may post the same condition (conditionName). This is enforced by the methods that construct and edit Rulebase objects.
Another restriction is that a posted condition may not be referenced (used in a test operator) from within the same Rule that generates it. (Rules contain tests, tests contain actions, and actions can post conditions. Thus all posted conditions are posted within the context of a particular rule but may only be referenced in tests of other rules in the same rulebase.) This is enforced by the methods that construct and edit Rule objects.
The string arguments of all action MethodInvocation objects may contain variables which are evaluated by the rules engine before invocation. By referencing variables, the rulebase can adapt to changes on multiple machines.
A character is considered to be alphanumeric if and only if it is specified to be a letter or a digit by the Unicode 2.0 standard (category "Lu", "Ll", "Lt", "Lm", "Lo", or "Nd" in the Unicode specification data file). The latest version of the Unicode specification data file can be found at http://www.unicode.org/ucd.
Overruling is a way to have a rule in one rulebase override or overrule a rule in another rulebase in a way that causes only one to be active. Overruling is a way of setting precedence among similar rules.
Each rulebase used by the agent is maintained in a Rulebase object. The agent stores each Rulebase in, and retrieves it from, a rulebase file. The filename corresponds to the name of the rulebase and has the extension .hrb. If the filename does not correspond to the name of the rulebase, the TIBCO Hawk Agent does not load the rulebase and logs an error. When the TIBCO Hawk Agent is running in auto config mode, rulebases are loaded from the autoconfig directory. When it is running in repository config mode, rulebases are loaded from the specified repository.
A Schedule is a configuration object that can be used for determining if a rulebase or part of the rulebase should be 'in-schedule' or 'out-of-schedule' at a given time. If a schedule is not specified in a rulebase, then the rulebase is always in-schedule.
A schedule object is primarily composed of a list of inclusion periods, and a list of exclusion periods. A schedule is in-schedule if at least one of its inclusion periods is in-schedule and none of its exclusion periods are in-schedule. Otherwise, the schedule is out-of-schedule. The inclusion and exclusion periods contain a list of Period objects or PeriodGroup objects.
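The inclusion and exclusion rule reduces to a short boolean expression. The sketch below models each period as a plain predicate over a point in time; `ScheduleSketch` and `inSchedule` are hypothetical names, and the real Period and PeriodGroup objects are not used here.

```java
import java.time.LocalDateTime;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model: each period is a predicate that says whether a time is in-schedule.
class ScheduleSketch {
    static boolean inSchedule(List<Predicate<LocalDateTime>> inclusionPeriods,
                              List<Predicate<LocalDateTime>> exclusionPeriods,
                              LocalDateTime when) {
        // At least one inclusion period must be in-schedule...
        boolean included = inclusionPeriods.stream().anyMatch(p -> p.test(when));
        // ...and no exclusion period may be in-schedule.
        boolean excluded = exclusionPeriods.stream().anyMatch(p -> p.test(when));
        return included && !excluded;
    }

    public static void main(String[] args) {
        Predicate<LocalDateTime> officeHours = t -> t.getHour() >= 9 && t.getHour() < 17;
        Predicate<LocalDateTime> holiday = t -> t.getMonthValue() == 12 && t.getDayOfMonth() == 25;
        LocalDateTime when = LocalDateTime.of(2024, 12, 25, 10, 0);
        System.out.println(inSchedule(List.of(officeHours), List.of(holiday), when));
    }
}
```

Note that a schedule with an empty inclusion list is never in-schedule under this rule, because no inclusion period can match.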
A Period defines the time intervals, days or months that should be included or excluded in a schedule. It is composed of 4 distinct period components: MinutesInDay, DaysInWeek, DaysInMonth and MonthsInYear. A Period object is in-schedule only if all of its 4 components are in-schedule. Otherwise, it is out-of-schedule.
MinutesInDay contains a set of 1440 continuous 1-minute intervals in a day. The MinutesInDay object is in-schedule if the time for checking the schedule is included in the MinutesInDay.
DaysInWeek contains a set of 7 days in a week. A DaysInWeek is in-schedule if the day of the week of the date used for checking the schedule is included in the DaysInWeek.
DaysInMonth contains a set of 31 days in a month. A DaysInMonth is in-schedule if the day in the date for checking the schedule is included in the DaysInMonth.
MonthsInYear contains a set of 12 months in a year. A MonthsInYear is in-schedule if the month of the date used for checking the schedule is included in the MonthsInYear.
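The four components of a period combine with a logical AND. The following sketch models them with standard Java collections; `PeriodSketch` and its fields are hypothetical stand-ins that borrow the component names from the text, and they do not reflect how the Hawk Period class stores its data.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.Month;
import java.util.BitSet;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of a Period with its four components.
class PeriodSketch {
    final BitSet minutesInDay = new BitSet(1440);   // bit m set => minute m of the day is included
    final Set<DayOfWeek> daysInWeek = EnumSet.noneOf(DayOfWeek.class);
    final BitSet daysInMonth = new BitSet(32);      // bits 1..31 are used
    final Set<Month> monthsInYear = EnumSet.noneOf(Month.class);

    // In-schedule only if all four components include the given time.
    boolean inSchedule(LocalDateTime when) {
        int minuteOfDay = when.getHour() * 60 + when.getMinute();
        return minutesInDay.get(minuteOfDay)
                && daysInWeek.contains(when.getDayOfWeek())
                && daysInMonth.get(when.getDayOfMonth())
                && monthsInYear.contains(when.getMonth());
    }

    // Example period: 09:00 up to (not including) 17:00, Monday to Friday, every day and month.
    static PeriodSketch weekdayBusinessHours() {
        PeriodSketch period = new PeriodSketch();
        period.minutesInDay.set(9 * 60, 17 * 60);   // end index is exclusive
        period.daysInWeek.addAll(EnumSet.range(DayOfWeek.MONDAY, DayOfWeek.FRIDAY));
        period.daysInMonth.set(1, 32);
        period.monthsInYear.addAll(EnumSet.allOf(Month.class));
        return period;
    }

    public static void main(String[] args) {
        PeriodSketch period = weekdayBusinessHours();
        System.out.println(period.inSchedule(LocalDateTime.of(2024, 3, 5, 10, 30)));
        System.out.println(period.inSchedule(LocalDateTime.of(2024, 3, 9, 10, 30)));
    }
}
```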
A PeriodGroup object is a logical group of Period objects, useful for defining an abstract group of periods. Period groups are useful when you use a set of periods regularly in defining schedules. They also ease the maintenance of those schedules because you can make a change in the period group and have it automatically reflected in all the schedules that use it.
Schedules may be used to control when a monitoring activity or action is performed. Schedules may be applied to a RuleBase, Rule, Test, and Action by specifying the schedule name in the attribute of these objects. If a RuleBase, Rule, Test, or Action makes use of a schedule name that is not defined, either because the agent could not load the Schedule object or because the Schedule object does not exist, then it is flagged as an error. However, the rulebase processing continues as if no schedule was specified for that component; the component behaves as if it is always in-schedule.
If the schedule name applied to a rulebase component begins with "!" then it refers to the inverse of a schedule. For example, if the schedule BusinessHours is defined in the schedules configuration, a rulebase component may use either BusinessHours or !BusinessHours to refer to it. When using BusinessHours, that component is in-schedule whenever the BusinessHours schedule is in-schedule. When using !BusinessHours, that component is in-schedule whenever the BusinessHours schedule is not in-schedule. If the schedule BusinessHours is not defined in the scheduler then components using either BusinessHours or !BusinessHours will behave as if no schedule is defined (they both will always be in-schedule).
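The name resolution just described, including the "!" prefix and the fallback for undefined schedules, fits in a few lines. This is a minimal sketch: `ScheduleNameSketch` and `componentInSchedule` are hypothetical, and the map of current schedule states stands in for whatever the agent's scheduler tracks internally.

```java
import java.util.Map;

// Hypothetical resolver for a schedule name applied to a rulebase component.
class ScheduleNameSketch {
    static boolean componentInSchedule(String appliedName, Map<String, Boolean> currentStates) {
        if (appliedName == null || appliedName.isEmpty()) {
            return true;                       // no schedule applied: always in-schedule
        }
        boolean inverse = appliedName.startsWith("!");
        String scheduleName = inverse ? appliedName.substring(1) : appliedName;
        Boolean state = currentStates.get(scheduleName);
        if (state == null) {
            return true;                       // undefined schedule: behaves as if none was applied
        }
        return inverse ? !state : state;
    }

    public static void main(String[] args) {
        Map<String, Boolean> states = Map.of("BusinessHours", true);
        System.out.println(componentInSchedule("BusinessHours", states));
        System.out.println(componentInSchedule("!BusinessHours", states));
        System.out.println(componentInSchedule("!Undefined", states));
    }
}
```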
A rulebase is a hierarchical structure: rulebases contain rules, rules contain tests, and tests contain actions. Therefore, a schedule applied to one node in the hierarchy affects all nodes below it. The following sections describe the behavior of RuleBases, Rules, Tests, and Actions when valid schedules are applied.
When a rulebase is loaded it is not activated unless its applied schedule is currently in-schedule. Thereafter, when its applied schedule transitions to an in-schedule state, the rulebase is activated. When its applied schedule transitions to an out-of-schedule state, the rulebase is deactivated. Before a rulebase becomes active, no rules are processed and no monitoring is performed by that rulebase. When a rulebase is activated, its rules are loaded and monitoring may begin. When a rulebase is deactivated, all of its rules are unloaded, which results in the clearing of outstanding alerts (generated from those rules) and the cessation of all monitoring by that rulebase.
When a rule is loaded it isn't activated unless its applied schedule is currently in-schedule. Thereafter, when its applied schedule transitions to an in-schedule state, the rule is activated. When its applied schedule transitions to an out-of-schedule state, the rule is deactivated. Before a rule becomes active, no tests are processed and no monitoring is performed by this rule. When a rule is activated, its tests are loaded and monitoring may begin. When a rule is deactivated, all of its tests are unloaded which results in the clearing of outstanding alerts (generated from those tests) and the cessation of all monitoring by that rule. When a rule is inactive, its enclosing rulebase behaves as if that rule is not there.
When a test is loaded it isn't activated unless its applied schedule is currently in-schedule. Thereafter, when its applied schedule transitions to an in-schedule state, the test is activated. When its applied schedule transitions to an out-of-schedule state, the test is deactivated. Before a test becomes active, no actions are loaded and no monitoring is performed by this test. When a test is activated, its actions are loaded and monitoring begins. When a test is deactivated, all of its actions are unloaded which results in the clearing of outstanding alerts (generated from those actions) and the cessation of all monitoring by that test. When a test is inactive, its enclosing rule behaves as if that test is not there.
When an action is loaded it isn't activated unless its applied schedule is currently in-schedule. Thereafter, when its applied schedule transitions to an in-schedule state, the action is activated. When its applied schedule transitions to an out-of-schedule state, the action is deactivated. Before an action becomes active, it performs no action and does not respond in any way to its test's state transitions. When an action is activated, it begins tracking and responding to its test's state transitions. When an action is deactivated, any outstanding alert it may have generated is cleared and the action ceases to track and respond to the state transitions of its test. When an action is inactive, its enclosing test behaves as if that action is not there.
In most respects, configuration management for Schedules is identical to that for rulebases (when in auto-config mode, the agent will load and store this file from auto-config-dir, etc.). However, all schedules used by the agent are maintained in a single Schedules object. The agent stores the Schedules object in, and retrieves it from, the file schedules.hsf.
Because all schedules are stored in a single file, each agent will load the schedules at startup. However, the scheduler in the agent will evaluate a schedule only if the agent has loaded rulebases that reference that schedule. Such schedules are referred to as active because there is active interest in them.
In general, having a large number of schedules defined in the schedule file may marginally affect the size of the agent but it does not affect the CPU performance.
RulebaseMap is a configuration object that maps rulebases to agents. It is used when the agent is running in a manual configuration mode to determine which rulebases should be loaded on the agent.
There are two types of groups in RulebaseMap: user defined and automatic. A user defined group is a group that a user creates. Automatic groups are groups that agents automatically belong to. A user can define the names of user defined groups but not those of the automatic groups. User defined group names begin with "+" and automatic group names begin with "++".
A user defined group can be composed of a number of agents, groups, or a combination of agents and groups. A user defined group may have both user defined and automatic groups as elements in its definition.
The OS groups are automatic groups whose names correspond to the operating systems of the machines the agents are running on. The OS groups have the form "++OSName" where OSName is the value of the Java system property "os.name". Examples of automatic group names are ++Windows 2000, ++Solaris, and ++HPUX. Examples of user defined group names are "+servers" and "+clients".
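The naming conventions above are easy to check mechanically. In this sketch, `GroupNameSketch`, `classify`, and `osGroupName` are hypothetical helpers; only `System.getProperty("os.name")` is a real Java call, and the value it returns depends on the machine the code runs on.

```java
// Hypothetical helpers for RulebaseMap group naming conventions.
class GroupNameSketch {
    enum Kind { AGENT, USER_DEFINED_GROUP, AUTOMATIC_GROUP }

    static Kind classify(String name) {
        if (name.startsWith("++")) {
            return Kind.AUTOMATIC_GROUP;       // "++" must be tested before "+"
        }
        if (name.startsWith("+")) {
            return Kind.USER_DEFINED_GROUP;
        }
        return Kind.AGENT;                     // anything else is treated as an agent name
    }

    // Builds the OS group name from an operating system name.
    static String osGroupName(String osName) {
        return "++" + osName;
    }

    // OS group for the JVM this code is running in.
    static String localOsGroupName() {
        return osGroupName(System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        System.out.println(classify("+servers"));
        System.out.println(classify("++Solaris"));
        System.out.println(localOsGroupName());
    }
}
```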
+group1 agent1 agent2 agent3
In the preceding example, agentX and +group1 belong to +groupX. Also, agent1 belongs to +group1 as well as +groupX.
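Because groups can contain other groups, membership is transitive. The sketch below resolves it with a simple worklist; `GroupMembershipSketch` and `groupsOf` are hypothetical names. The `+group1` entry matches the definition shown above, while the `+groupX` entry is an assumption reconstructed from the prose, which states that agentX and +group1 belong to +groupX.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical resolver for user defined group membership.
class GroupMembershipSketch {
    // Returns every group the agent belongs to, directly or through nested groups.
    static Set<String> groupsOf(String agent, Map<String, List<String>> definitions) {
        Set<String> result = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(agent);
        while (!pending.isEmpty()) {
            String member = pending.pop();
            for (Map.Entry<String, List<String>> entry : definitions.entrySet()) {
                // result.add returns false for groups already visited, which also guards against cycles.
                if (entry.getValue().contains(member) && result.add(entry.getKey())) {
                    pending.push(entry.getKey());
                }
            }
        }
        return result;
    }

    static Map<String, List<String>> exampleDefinitions() {
        Map<String, List<String>> definitions = new LinkedHashMap<>();
        definitions.put("+group1", List.of("agent1", "agent2", "agent3"));
        definitions.put("+groupX", List.of("agentX", "+group1"));   // assumed from the prose
        return definitions;
    }

    public static void main(String[] args) {
        System.out.println(groupsOf("agent1", exampleDefinitions()));
        System.out.println(groupsOf("agentX", exampleDefinitions()));
    }
}
```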
Rulebase mapping defines which rulebases are assigned to an agent or a group. It defines which agents or groups use a particular rulebase. In the following rulebase mapping:
rulebase agent1 agent2 ++Windows
agent1 uses rulebase1. Agents in +group2 use rulebase2 and rulebase3. All agents use rulebase4 because rulebase4 maps to "++", the all group. All agents that are running under the Windows operating system will use rulebase1.
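Resolving a rulebase mapping for one agent amounts to checking each rulebase's targets against the agent's name and its groups. The following is a sketch only: `RulebaseMappingSketch` and `rulebasesFor` are hypothetical, and the mapping table in `exampleMapping` is an assumption pieced together from the prose above rather than a copy of the original listing.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical resolver for the rulebase mapping component of a RulebaseMap.
class RulebaseMappingSketch {
    // mapping: rulebase name -> agents or groups that use it.
    static Set<String> rulebasesFor(String agent, Set<String> agentGroups,
                                    Map<String, List<String>> mapping) {
        Set<String> rulebases = new TreeSet<>();
        for (Map.Entry<String, List<String>> entry : mapping.entrySet()) {
            for (String target : entry.getValue()) {
                boolean allGroup = target.equals("++");   // "++" is the all group
                if (allGroup || target.equals(agent) || agentGroups.contains(target)) {
                    rulebases.add(entry.getKey());
                }
            }
        }
        return rulebases;
    }

    // Assumed mapping, reconstructed from the description in the text.
    static Map<String, List<String>> exampleMapping() {
        Map<String, List<String>> mapping = new LinkedHashMap<>();
        mapping.put("rulebase1", List.of("agent1", "++Windows"));
        mapping.put("rulebase2", List.of("+group2"));
        mapping.put("rulebase3", List.of("+group2"));
        mapping.put("rulebase4", List.of("++"));
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(rulebasesFor("agent1", Set.of(), exampleMapping()));
        System.out.println(rulebasesFor("agent9", Set.of("+group2"), exampleMapping()));
    }
}
```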
Command mapping allows an external command or executable (script) to be specified for an agent or a group. If specified, it is executed and the returned string is parsed on white space to indicate which rulebases to load. When the executable is invoked, the agent name and its automatic group name are passed as parameters to the command.
The use of command mapping depends on a setting of one of the attributes of the RulebaseMap. The command mapping can be used as the only mechanism to generate the rulebases to be loaded or as a supplement to the groups and rulebases mapping of the RulebaseMap. It can also be ignored for generating the rulebases.
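Parsing the string returned by a mapped command on white space can be sketched as follows. `CommandOutputSketch` and `parseRulebaseNames` are hypothetical names; the sketch covers only the splitting step and does not launch an external command or model the RulebaseMap attribute that controls whether command mapping is used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for the output of a mapped command.
class CommandOutputSketch {
    // Splits the returned string on runs of white space into rulebase names.
    static List<String> parseRulebaseNames(String commandOutput) {
        String trimmed = commandOutput.trim();
        if (trimmed.isEmpty()) {
            return List.of();                  // nothing returned: no rulebases named
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(parseRulebaseNames("rulebase1  rulebase2\nrulebase3\n"));
    }
}
```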
If the agent (more specifically, the RuleBaseEngine MicroAgent) is configured in one of the manual configuration modes, it will attempt to load the RulebaseMap configuration object after initialization. It will first determine which automatic groups it belongs to. Then it will read and process the group definition component to determine which user defined groups it is also a member of. Next it will process the rulebase mapping component to determine which rulebases it should load. Finally it will use the command mapping mechanism, if one is specified, to get the names of additional rulebases it should load. Once the RulebaseMap has been fully processed, the agent will proceed to load the target rulebases.
The -rulebases option supported by the agent (RuleBaseEngine MicroAgent) can be used together with the RulebaseMap to specify additional rulebases.
When constructing or modifying any of the configuration objects, copies of the supplied parameters are made and used. When accessing the data of any configuration objects through one of the get methods, copies of the internal data are returned. This ensures the integrity of the configuration objects and allows proper validity checking to be performed. It also means that changing a configuration object requires that you use one of the set methods on that component. For example, if you extract the tests from a rule using the Rule.getTests() method and then modify one of the tests in the array, the change will not be reflected in the rule until you call Rule.setTests() with the modified test array.
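The copy-in, copy-out behavior can be demonstrated with a stand-in class. `SketchRule` below is hypothetical and stores plain strings instead of Test objects; it only mirrors the get/set pattern described above and is not the Hawk Rule class.

```java
// Hypothetical stand-in that copies data on the way in and on the way out.
class CopySemanticsSketch {
    static final class SketchRule {
        private String[] tests;

        SketchRule(String[] tests) {
            this.tests = tests.clone();        // copy of the supplied parameter
        }

        String[] getTests() {
            return tests.clone();              // copy of the internal data
        }

        void setTests(String[] tests) {
            this.tests = tests.clone();        // a real object would validate here
        }
    }

    public static void main(String[] args) {
        SketchRule rule = new SketchRule(new String[] {"cpu > 90"});

        String[] tests = rule.getTests();
        tests[0] = "cpu > 95";                 // modifies only the caller's copy
        System.out.println(rule.getTests()[0]);

        rule.setTests(tests);                  // the change takes effect only now
        System.out.println(rule.getTests()[0]);
    }
}
```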
When a configuration object is created using the corresponding editor on the TIBCO Hawk WebConsole, the context of the configuration object is implied by the agent or repository for which the configuration object is defined. For example, when creating rulebases in the rulebase editor, this context is used when presenting to you the choices for data sources and actions, in the form of the related microagents, methods and arguments. When creating rulebases using the Configuration Object API, the microagent name, method name and the data item names must be passed to the methods. These are obtained from the following classes of the COM.TIBCO.hawk.talon package of the Console API:
Microagent names, and the information required to build valid MethodSubscription and MethodInvocation objects, are available in MicroAgentDescriptor objects. The methods used as data sources must be Open Methods since the RuleBaseEngine can only process OpenData.
A MicroAgentDescriptor holds MethodDescriptor objects for all methods of a microagent. The MethodDescriptor for the method chosen as the data source describes the data the method returns. This information is needed to construct test expression operators. The getRuleData operator is used by test expressions to access a method's data. It requires the name of a data item. This name needs to be one of the names of the elements in the method's return which are specified in the MethodDescriptor. Obtaining MicroAgentDescriptor objects is accomplished with the COM.TIBCO.hawk.hawkeye package of the Console API.
Retrieving and updating configuration objects on a TIBCO Hawk agent or repository is accomplished by invoking methods on the RuleBaseEngine or Repository microagents. This involves both the monitoring and management components of the TIBCO Hawk Console API, because agents and microagents need to be discovered and method invocations are performed on the required microagents. See Appendix A, Common Configuration Object API Methods, for commonly used methods on the RuleBaseEngine or Repository microagent when using the Configuration Object API.
See the TIBCO Hawk Console API Reference and the TIBCO Hawk Methods Reference for more information.