BPM Node Recovery Configuration

On a distributed ActiveMatrix BPM system:

  • The Process Engine on each BPM node issues a periodic "heartbeat" to indicate that it is healthy and performing work. Each heartbeat updates the LAST_ACTIVITY value in the PVM_ENGINE table of the BPM database. (The heartbeat interval is defined by the recoveryHeartbeatInterval property in the pvm.properties file.)
  • A Process Engine recovery thread periodically checks the LAST_ACTIVITY value for each Process Engine to detect whether any BPM node has failed. If the value has not been updated for at least the number of seconds defined by the recoveryFailureThreshold property (in the pvm.properties file), the node is deemed to have failed. (The recovery thread can run on any of the BPM nodes. The ActiveMatrix BPM system determines internally which node runs it each time.)
  • If a node failure is detected, an AUDIT message is written to the BPM log file indicating which node failed and when it failed.
    Note: The failed node's outstanding work is NOT recovered by, or resubmitted to, other BPM nodes.
  • When a failed node is restarted, it automatically recovers and resubmits any outstanding work to itself.
    Warning: If a failed node cannot be restarted for any reason, you should contact TIBCO Support for assistance in recovering the node's outstanding work.
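
The failure check described in the list above can be illustrated with a minimal sketch. This is not the Process Engine's actual implementation: the function name find_failed_nodes, the node names, and the in-memory dictionary standing in for the LAST_ACTIVITY values in the PVM_ENGINE table are all hypothetical, and the threshold of 60 seconds is an arbitrary illustrative value, not a product default.

```python
from datetime import datetime, timedelta


def find_failed_nodes(last_activity, now, failure_threshold_seconds):
    """Return the nodes whose last heartbeat is too old.

    last_activity: dict mapping a node name to the datetime of its most
        recent heartbeat (a stand-in for LAST_ACTIVITY in PVM_ENGINE).
    now: the time at which the check is made.
    failure_threshold_seconds: stand-in for recoveryFailureThreshold.
    """
    threshold = timedelta(seconds=failure_threshold_seconds)
    # A node is deemed failed if its value has not been updated
    # for at least the threshold number of seconds.
    return sorted(
        node for node, seen in last_activity.items()
        if now - seen >= threshold
    )


if __name__ == "__main__":
    # Hypothetical data: three nodes checked at a fixed point in time.
    now = datetime(2024, 1, 1, 12, 0, 0)
    last_activity = {
        "node-a": now - timedelta(seconds=10),   # recent heartbeat
        "node-b": now - timedelta(seconds=120),  # stale heartbeat
        "node-c": now - timedelta(seconds=60),   # exactly at the threshold
    }
    print(find_failed_nodes(last_activity, now, 60))
    # prints ['node-b', 'node-c']
```

In this sketch a node whose heartbeat age exactly equals the threshold is treated as failed, matching the "at least" wording above. The sketch only reports failed nodes; consistent with the Note above, it does not move any work to other nodes.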