Node Failure

Loss of network connectivity is considered a node failure; that is, a heartbeat is missed and a TCP connection cannot be established.
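
As a rough illustration of this rule, the Python sketch below treats a peer as failed only when its heartbeat is overdue and a TCP connection attempt also fails. The function names, timeout values, and the choice of port are hypothetical; they are not taken from the appliance software.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as not reachable.
        return False


def is_node_failed(last_heartbeat, now, heartbeat_timeout, can_connect):
    """Hypothetical rule: heartbeat missed AND TCP connection not possible."""
    heartbeat_missed = (now - last_heartbeat) > heartbeat_timeout
    return heartbeat_missed and not can_connect
```

A monitor built this way would call `tcp_reachable` only after a heartbeat is overdue and pass the result to `is_node_failed`, so a single delayed heartbeat on a healthy link does not count as a node failure.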

This can occur because of a hardware failure (the node is down), a network connection failure, a network partition, or a software error escalation. Every process running on the appliance is monitored, and a repeated software error condition can trigger an escalation that reboots the appliance, which counts as a node failure for failover purposes.
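
The escalation behavior can be pictured with a small Python sketch, assuming a sliding-window error count per process. The class name, the threshold of three errors, and the 60-second window are illustrative assumptions; the source does not state the appliance's actual escalation policy.

```python
from collections import defaultdict, deque


class EscalationMonitor:
    """Hypothetical monitor: escalate after repeated errors from one process."""

    def __init__(self, max_errors=3, window_seconds=60.0):
        self.max_errors = max_errors          # assumed threshold
        self.window_seconds = window_seconds  # assumed time window
        self._errors = defaultdict(deque)     # process name -> error timestamps

    def record_error(self, process, timestamp):
        """Record an error; return True if an escalation (reboot) is warranted."""
        recent = self._errors[process]
        recent.append(timestamp)
        # Discard errors that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.max_errors
```

In this sketch, isolated errors spread over a long period never escalate, while a burst of errors from the same process does, which matches the "repeated software error condition" described above.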

Ethernet Disconnection

If the Ethernet cable is unplugged from the primary appliance in an HA pair, a failover is triggered.

HA pair (cluster) memberships fail, and the primary appliance eventually enters failsafe mode. Plugging the Ethernet cable back in stops the failures, but the appliance remains in failsafe mode.

To take the appliance out of failsafe mode, stop and then start the cluster_membership task:

> mtask -s cluster_membership stop
> mtask -s cluster_membership start