Node Resynchronization

A background operation runs periodically to check and resynchronize the archived data. The time needed to resynchronize the two nodes depends on the available network bandwidth and on the size of the data difference. The operation can be time-consuming, from about 5 minutes up to multiple days in the case of an appliance replacement.
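Because resynchronization time is governed by the data difference and the link bandwidth, a rough lower bound is simply one divided by the other. The function below is a minimal illustration of that arithmetic; the name and units are assumptions, not part of the product:

```python
def estimate_resync_seconds(data_diff_bytes: int, bandwidth_bytes_per_sec: int) -> float:
    """Rough lower bound on resynchronization time: data difference / bandwidth.

    Hypothetical helper for illustration only; the real operation also spends
    time checking data, so actual duration will be longer.
    """
    if bandwidth_bytes_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return data_diff_bytes / bandwidth_bytes_per_sec

# Example: a 50 GB difference over a ~100 Mbit/s link (~12.5 MB/s)
# takes at least 50e9 / 12.5e6 = 4000 seconds, about 67 minutes.
```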

An appliance has two types of data:

  • Archived data (read-only)
  • Active data (currently being modified)

Archiving is triggered by a disk-usage threshold. Node resynchronization is a background mechanism, triggered by a failover membership event or activated periodically, that resynchronizes the standby node with the active node and guarantees that both appliances eventually hold identical data.
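The threshold-based archiving decision can be sketched as a simple predicate. This is a hypothetical illustration; the actual threshold value and the function name are assumptions, not documented product behavior:

```python
def should_archive(used_bytes: int, total_bytes: int, threshold: float = 0.80) -> bool:
    """Return True once disk usage reaches the archiving threshold.

    The 80% default is a placeholder; the appliance's real threshold
    is configured elsewhere.
    """
    return used_bytes / total_bytes >= threshold
```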

The implementation is based on an open source utility that provides fast incremental file transfer. A wrapper on top of the utility provides an online checkpoint mechanism for database tables.

When a node that is configured as part of an HA pair connects to its partner for the first time, an automatic data migration takes place. This operation makes one node a copy of the other. The node that keeps its original content is known as the Source node; the node that loses its original content is known as the Destination node. The Destination node is designated by the user when configuring the HA feature on each node. If either member is disconnected or shut down, the migration resumes when the node returns, and continues until it is complete. Only once migration is complete can the pair provide failover. Until then, the Source node holds the public IP and acts as the Active node, while the Destination node acts as the standby node but is not allowed to become Active.
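Two properties of this migration can be shown in a small sketch: progress survives interruptions and resumes until complete, and failover is gated on completion. The class and function names are hypothetical, chosen only to mirror the description above:

```python
class Migration:
    """Hypothetical resumable migration tracker: progress survives
    disconnections and the operation resumes until all items are copied."""

    def __init__(self, items: list[str]):
        self.pending = list(items)
        self.copied: list[str] = []

    def run(self, budget: int) -> None:
        """Copy up to `budget` items, then stop (simulating a disconnect)."""
        for _ in range(min(budget, len(self.pending))):
            self.copied.append(self.pending.pop(0))

    @property
    def complete(self) -> bool:
        return not self.pending

def failover_allowed(migration: Migration) -> bool:
    # The Destination node may become Active only once migration is complete;
    # until then the Source node keeps the public IP and stays Active.
    return migration.complete
```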

After the data migration is complete, and as long as both nodes remain connected, the standby node contains the same data as the active node, with at most one minute of latency (LogLogic LX Appliance) or less than 3 seconds of latency (LogLogic ST Appliance). If a node is temporarily disabled and later rejoins the cluster, it becomes the standby node and starts an operation to resynchronize with the active node. During the initial data migration that occurs when the node rejoins the cluster, the following data is removed:

  • Data collected on the standby node before it rejoins the cluster
  • Data that does not already exist on the master
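The two removal rules above combine as a union: an item on the standby is dropped if it predates the rejoin or is absent from the master. A minimal sketch of that selection, with hypothetical names and a timestamp-per-item model assumed for illustration:

```python
def data_to_remove(standby_items: dict[str, float],
                   master_items: set[str],
                   rejoin_time: float) -> set[str]:
    """Return the standby items to drop during the rejoin migration.

    standby_items maps item name -> collection timestamp.
    An item is removed if it was collected before the node rejoined
    the cluster, or if it does not already exist on the master.
    """
    return {
        name
        for name, collected_at in standby_items.items()
        if collected_at < rejoin_time or name not in master_items
    }
```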