Recovering after Disaster

Add these steps to your enterprise's comprehensive plan for switching business operations to the disaster recovery site. When a disaster disables the main site, administrators complete these steps as part of that plan.

Procedure

  1. Optional. Remap DNS addresses.
    If your disaster recovery plan includes remapping the DNS addresses of FTL servers, then complete that remapping first.
  2. Prepare a new main configuration file.
    It is convenient to prepare the new main configuration file in advance, alongside the usual standby configuration file.
    1. Copy the standby configuration file, and modify the new main copy as follows.
    2. Omit the drfor parameter from all core servers, so that the former disaster recovery servers become primary FTL servers.
    3. If authentication is required, supply the user and password parameters, with credentials that the new primary servers can use to authenticate themselves to affiliated servers.
      Ensure that the user name is in the authorization group ftl-internal.
  3. Stop and restart all remaining core servers, one at a time.
    As each server restarts, it reads the modified configuration file.
    Wait for the persistence services quorum to resynchronize.
    Restart the FTL servers in this order:
    1. Restart the former disaster recovery core servers as the new main site primary core servers.
    2. Adjust the satellite FTL server configuration files.
      If your recovery plan remaps DNS addresses, then you may skip this step.
      Otherwise, adjust the values of the satelliteof parameters so that the satellite servers connect to the new main core servers.
    3. Restart the core servers at each satellite site.
  4. Modify the primary set of each persistence cluster.
    In the persistence clusters grid of the FTL server GUI, change the primary set of each participating cluster so that the persistence services at the disaster recovery site become the primary set.

    Deploy the modified realm definition.

  5. Ensure that the persistence services at the disaster recovery site form a quorum.
    In the persistence clusters status table, verify that the cluster status is Running. In the services list sub-tables, verify that all the services in the cluster are synchronized.

    If the cluster cannot form a quorum, clients cannot connect to its services. Consider forcing a quorum; see Before Forcing a Quorum.

  6. Direct application clients to FTL servers at the disaster recovery site.
    Choose only one of these two alternatives:
    • If you remapped the DNS addresses of FTL servers:
      • Verify that the new DNS information has propagated.
      • Verify that all clients automatically connect or reconnect to FTL servers at the disaster recovery site, and are operating correctly.
    • Otherwise, restart all application clients, and explicitly supply the locations of the new FTL servers where the clients can access realm services and persistence services.
    The disaster recovery site is now the active site of FTL operations.
  7. Arrange another disaster recovery site, to protect against a disaster at the newly active site.
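
Example: Preparing the New Main Configuration

The following Python sketch illustrates the edits in step 2 on a configuration that has already been parsed into nested dictionaries and lists. It is an illustration under stated assumptions, not a supported tool. The function name promote_standby_config, the sample layout, the server names, and the host addresses are all hypothetical, and placing user and password in the same mapping that held drfor is an assumption; check your own configuration file for the actual structure before adapting it.

```python
def promote_standby_config(standby, user=None, password=None):
    """Return a new configuration with every 'drfor' key removed.

    Assumption: the configuration is already parsed into nested dicts and
    lists. If `user` is given, the credentials are added to each mapping
    that contained 'drfor' (an assumed placement, for illustration only).
    """
    def convert(node):
        if isinstance(node, dict):
            had_drfor = "drfor" in node
            # Rebuild the mapping without the drfor parameter.
            result = {k: convert(v) for k, v in node.items() if k != "drfor"}
            if had_drfor and user is not None:
                result["user"] = user
                result["password"] = password
            return result
        if isinstance(node, list):
            return [convert(item) for item in node]
        return node  # scalars are returned unchanged

    # convert() builds new containers, so the standby input is not modified.
    return convert(standby)


# Hypothetical standby layout, for illustration only.
standby = {
    "servers": {
        "dr1": [{"realm": {"drfor": "main1:8085|main2:8085"}}],
        "dr2": [{"realm": {"drfor": "main1:8085|main2:8085"}}],
    }
}

main = promote_standby_config(standby, user="dr-admin", password="example-only")
```

The original standby structure is left untouched, so you can keep the standby file for reference while you deploy the new main copy. Confirm that the user name you supply is in the authorization group ftl-internal.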