Monitoring Tab
Rulebases
Click Add to add an existing custom TIBCO Hawk rulebase. The rulebase must have been configured using the TIBCO Hawk Display. For more information, see Adding a Rulebase to a Process or Service.
Events
Click Add to create an event. For more information, see Adding an Event to a Service.
Failure Count
When an instance goes down unexpectedly, the error count and the last failure time are tracked. When the error count is greater than or equal to the value set for Reset Failure Count, or when the time set for Reset Failure Interval expires (whichever comes first), the error count is reset to zero.
• Reset Failure Count. The value in this field defines how many restarts should be attempted before resetting the error counter to 0.
When an instance is down, the TIBCO Hawk Agent will attempt to restart the instance the number of times specified in this field. If the instance restarts after the number of times specified, the event you have defined is triggered.
• Reset Failure Interval (seconds). The value in this field defines how much time should expire before resetting the error counter to 0.
For example, suppose you define the following three events and set Reset Failure Count to 5:
• Event 1: restart the instance and send an alert on the first failure.
• Event 2: restart the instance and send an email on the second failure.
• Event 3: restart the instance and execute a command on subsequent failures.
On the first failure, the error count is 1, the instance is restarted, and an alert is sent.
On the second failure, the error count is 2, the instance is restarted, and an email is sent.
On the third failure, the error count is 3, the instance is restarted, and the command you configured is executed.
On the fourth failure, the error count is 4, the instance is restarted, and the command you configured is executed.
On the fifth failure, the error count reaches 5 and is then reset to 0. The instance is restarted and the command you configured is executed.
On the sixth failure, the error count is 1 again, the instance is restarted, and an alert is sent.
The cycle repeats.
If you do not want to receive alerts frequently, set Reset Failure Count to a high value. When the error count is reset to 0, the last failure time is reset as well. The Reset Failure Interval takes effect only after the first failure occurs.
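The counting rules and the three-event example can be traced with a small simulation. This is only an illustrative model of the behavior described in this section, not TIBCO Hawk code: the names FailureTracker, record_failure, and pick_event are hypothetical, and the sketch assumes that the Reset Failure Interval is measured from the last recorded failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Hypothetical model of the error counter (not a TIBCO Hawk API)."""
    reset_failure_count: int
    reset_failure_interval: float  # seconds
    error_count: int = 0
    last_failure_time: Optional[float] = None

    def record_failure(self, now: float) -> int:
        """Record a failure at time `now`; return the error count for this failure."""
        # The interval only applies once a failure has been recorded.
        # Assumption: it is measured from the last failure time.
        if (self.last_failure_time is not None
                and now - self.last_failure_time >= self.reset_failure_interval):
            self.error_count = 0
            self.last_failure_time = None

        self.error_count += 1
        self.last_failure_time = now
        observed = self.error_count

        # Reaching Reset Failure Count resets the counter and the failure time.
        if self.error_count >= self.reset_failure_count:
            self.error_count = 0
            self.last_failure_time = None
        return observed


def pick_event(count: int) -> str:
    """Map an error count to the three events in the example."""
    if count == 1:
        return "alert"
    if count == 2:
        return "email"
    return "command"


# Six failures one second apart, with Reset Failure Count set to 5.
tracker = FailureTracker(reset_failure_count=5, reset_failure_interval=3600)
actions = [pick_event(tracker.record_failure(now=t)) for t in range(6)]
print(actions)
```

Under these assumptions, the sixth failure triggers the alert again because the fifth failure reset the counter. Raising reset_failure_count in the sketch makes the alert recur less often, which mirrors the advice above about setting a high value.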