A Cluster Node Is DISCONNECTED
A DISCONNECTED node might require intervention:
A Cluster Node Goes Offline
A Cluster Node Is Busy and Does Not Respond
Nodes Have Been Partitioned into Subgroups
A Node Has Been Evicted Due to Metadata Conflicts
A Cluster Node Goes Offline
If a cluster node is rebooted, disconnected, or taken offline, the remaining nodes show its status as DISCONNECTED. When the node comes back up, it automatically rejoins the cluster and is resynchronized. However, the node cannot rejoin automatically if both its own metadata and the cluster metadata were modified while it was disconnected. In that case, rejoin the node by following the procedure in Adding a TDV Server to an Active Cluster.
A Cluster Node Is Busy and Does Not Respond
Each node sends a heartbeat on the base port plus seven (for example, 9407 when the base port is 9400) to tell the cluster that the node is still connected. If a node fails to send a heartbeat within a designated period, the other nodes temporarily disconnect it from the cluster. When the busy node next sends its heartbeat, the other nodes reject it, which prompts the busy node to reset its cluster connections, attempt to resynchronize, and push its changes to the cluster. However, if the metadata changes are in conflict, the node is removed from the cluster. See A Node Has Been Evicted Due to Metadata Conflicts for more information.
The cluster node can rejoin the cluster by following the procedure in Adding a TDV Server to an Active Cluster. However, because joining a cluster wipes all metadata in the joining node, you might need to make your changes again if you want them to be reflected in the cluster.
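The heartbeat behavior described above can be modeled in a few lines. This is only an illustrative sketch, not the server's implementation: the PeerView class, the CONNECTED status name, and the 30-second timeout are assumptions made for the example. Only the port offset of seven and the DISCONNECTED status come from this guide.

```python
from dataclasses import dataclass

HEARTBEAT_PORT_OFFSET = 7  # from the guide: heartbeat port is base port plus seven


def heartbeat_port(base_port: int) -> int:
    """Return the port on which a node sends its cluster heartbeat."""
    return base_port + HEARTBEAT_PORT_OFFSET


@dataclass
class PeerView:
    """Hypothetical model of how one node tracks a peer by heartbeat arrival."""
    timeout_s: float               # assumed value; the real period is set by the server
    last_heartbeat_s: float = 0.0
    status: str = "CONNECTED"      # assumed name for the healthy state

    def tick(self, now_s: float) -> None:
        """Mark the peer DISCONNECTED once its heartbeat is overdue."""
        if now_s - self.last_heartbeat_s > self.timeout_s:
            self.status = "DISCONNECTED"

    def on_heartbeat(self, now_s: float) -> bool:
        """Accept a heartbeat, or reject it if the peer was already dropped."""
        if self.status == "DISCONNECTED":
            return False  # the rejection prompts the peer to reset and resynchronize
        self.last_heartbeat_s = now_s
        return True


peer = PeerView(timeout_s=30.0)
peer.on_heartbeat(10.0)   # accepted while the peer is still connected
peer.tick(50.0)           # 40 s of silence exceeds the 30 s timeout
print(heartbeat_port(9400), peer.status, peer.on_heartbeat(51.0))
# prints: 9407 DISCONNECTED False
```

The late heartbeat at 51 seconds is rejected rather than silently accepted, which mirrors the point that a busy node must reset its connections and resynchronize before it counts as connected again.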
Nodes Have Been Partitioned into Subgroups
If connection failures occur, the network topology and configuration might cause the cluster to be partitioned into subgroups. For example, if some nodes are connected to the cluster through a common, failed router, those nodes could become a cluster subgroup. The DISCONNECTED status of the cluster nodes in Manager can help to troubleshoot this type of event.
For example, in a five-node cluster with nodes A through E, if D and E are simultaneously disconnected, two subgroups might be formed, each with its own timekeeper: A, B, and C in one, and D and E in the other. In Manager, all cluster nodes in both groups would be visible. The status of the subgroup nodes D and E would appear as DISCONNECTED from nodes A, B, and C. The status of nodes A, B, and C would appear as DISCONNECTED from nodes D and E.
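The five-node example can be worked through with a short sketch that computes what each node would report for every other node. The status_view function and the CONNECTED status name are hypothetical and exist only for this illustration; Manager itself is the source of the real statuses.

```python
def status_view(nodes, subgroups):
    """Map each node to the status it would report for every other node.

    Nodes in the same subgroup see each other as connected; nodes in
    different subgroups see each other as DISCONNECTED.
    """
    group_of = {node: i for i, group in enumerate(subgroups) for node in group}
    return {
        viewer: {
            other: "CONNECTED" if group_of[other] == group_of[viewer] else "DISCONNECTED"
            for other in nodes
            if other != viewer
        }
        for viewer in nodes
    }


nodes = ["A", "B", "C", "D", "E"]
view = status_view(nodes, [{"A", "B", "C"}, {"D", "E"}])
print(view["A"])
# prints: {'B': 'CONNECTED', 'C': 'CONNECTED', 'D': 'DISCONNECTED', 'E': 'DISCONNECTED'}
print(view["D"])
# prints: {'A': 'DISCONNECTED', 'B': 'DISCONNECTED', 'C': 'DISCONNECTED', 'E': 'CONNECTED'}
```

Reading the views in reverse is the troubleshooting step: if two sets of nodes each report the other set as DISCONNECTED, the cluster has probably been partitioned rather than having lost individual nodes.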
During partitioning, changes can be made in each cluster group. When connections are successfully re-established, the two subgroups are automatically merged and the metadata synchronized, if there are no metadata conflicts. The original timekeeper would again become the timekeeper for the merged cluster.
See A Node Has Been Evicted Due to Metadata Conflicts for more information.
A Node Has Been Evicted Due to Metadata Conflicts
Situations can occur where metadata changes are in conflict. For example:
Nodes are out of sync because the cluster was partitioned, and metadata changes happened in both partitions.
In such a case, the partition with the original timekeeper prevails and the nodes belonging to the other partitions are removed from the cluster. (A partition can contain just one node.)
Nodes are out of sync because each is modified in the absence of the other.
The presence of a third active node in these scenarios would prevent the conflict from happening, because the third node would propagate the changes to the other nodes.
Typically, the node that is out of sync is automatically evicted from the cluster, and all sessions are terminated. If a node is automatically evicted, review the server and cluster logs to find the cause and resolve it.
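The partition rule described above (the partition that holds the original timekeeper prevails) can be sketched as follows. This is a simplified model under stated assumptions, not the product's algorithm: the reconcile function is hypothetical, it treats any metadata change made in more than one partition as a conflict, and it assumes that partitions without changes simply merge back.

```python
def reconcile(partitions, original_timekeeper, changed_nodes):
    """Return (survivors, evicted) after connectivity is restored.

    partitions: list of sets of node names, one set per subgroup
    original_timekeeper: name of the timekeeper before the partition
    changed_nodes: nodes on which metadata was modified during the partition
    """
    changed = [bool(part & changed_nodes) for part in partitions]
    conflict = sum(changed) > 1  # assumption: changes in 2+ partitions conflict
    survivors, evicted = set(), set()
    for part, did_change in zip(partitions, changed):
        if conflict and did_change and original_timekeeper not in part:
            evicted |= part    # losing partition: its nodes are removed
        else:
            survivors |= part  # timekeeper's partition, or nothing to conflict
    return survivors, evicted


parts = [{"A", "B", "C"}, {"D", "E"}]

# Changes on both sides: the partition holding timekeeper A prevails.
kept, removed = reconcile(parts, "A", {"B", "D"})
print(sorted(kept), sorted(removed))
# prints: ['A', 'B', 'C'] ['D', 'E']

# Changes on one side only: the subgroups merge with no eviction.
kept, removed = reconcile(parts, "A", {"D"})
print(sorted(kept), sorted(removed))
# prints: ['A', 'B', 'C', 'D', 'E'] []
```

Evicted nodes in this model correspond to the nodes that must rejoin through the procedure in Adding a TDV Server to an Active Cluster, after which any changes made on them during the partition might need to be made again.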