Preparing FTL Servers for a Disaster Recovery Site

The first task in arranging disaster recovery for TIBCO FTL is to configure and run FTL servers that parallel the servers at the main site.

This procedure is complex. Before you begin, review the samples/yaml/dr or samples/yaml/dr-secure subdirectory of your FTL installation, and follow the steps in samples/yaml/readme.txt.

Before You Begin

The physical infrastructure at the disaster recovery site must be operational.

The communications infrastructure connecting the main site to the disaster recovery site must be operational.

If authentication is enabled, define administrative users (with the ftl-internal role) for use by both sites. Authentication services must be available at both sites.

Procedure

  1. Start the FTL servers at the primary site.

    You may use the following for reference:

    • samples/yaml/dr/tibftlserver_primary.yaml

    • samples/yaml/dr-secure/tibftlserver_primary.yaml

    In each YAML configuration file, define the drto parameter, which is a list of addresses that the FTL server uses to connect to FTL servers at the disaster recovery site. For details, see FTL Server Configuration Parameters.

    If authentication is required, configure authentication parameters (for example, user and password) that the primary servers can use to authenticate themselves to the disaster recovery FTL servers. For more information, see Authenticating to FTL Server. Ensure that this username is in the ftl-internal authorization group (according to the authentication service at the disaster recovery site). See FTL Server Authorization Groups.

    If TLS security is required, distribute the keystore file and trust file to all FTL servers. See Securing FTL Servers.

    If FTL servers and a persistence cluster are already running, you can instead send the enable_dr REST command to any primary FTL server, with the URLs of the disaster recovery site as the argument. This adds DR connectivity without requiring a restart. The drto URLs persist across restarts even if they do not appear in the YAML configuration file, and you can add them to the YAML configuration file at any later time. For details on enable_dr, see POST cluster.

  2. Start the FTL servers at the disaster recovery site.

    You may use the following for reference:

    • samples/yaml/dr/tibftlserver_dr.yaml

    • samples/yaml/dr-secure/tibftlserver_dr.yaml

    In each YAML configuration file, define the drfor parameter, which is a list of addresses that the FTL server uses to connect to FTL servers at the primary site. For details, see FTL Server Configuration Parameters.

    If authentication is required, configure authentication parameters (for example, user and password) that the disaster recovery servers can use to authenticate themselves to the primary FTL servers. For more information, see Authenticating to FTL Server. Ensure that this username is in the ftl-internal authorization group (according to the authentication service at the primary site). See FTL Server Authorization Groups.

    If TLS security is required, distribute the keystore file and trust file to all FTL servers. See Securing FTL Servers.
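
    Because drto (step 1) and drfor (step 2) must point at each other, it can help to cross-check the two address lists before starting the servers. The following Python sketch is illustrative only and is not part of FTL. It assumes you have already copied the addresses out of your YAML files into plain dictionaries, and that you want every server at one site to list every server at the other site. The function name and the sample addresses are hypothetical.

```python
def find_missing_dr_links(primary_drto, dr_drfor):
    """Return sorted (server, missing_peer) pairs.

    primary_drto: maps each primary server address to its drto list.
    dr_drfor:     maps each DR server address to its drfor list.
    Assumption: each list is expected to name every server at the other site.
    """
    missing = []
    # Each primary server should name every DR server in drto.
    for server, targets in primary_drto.items():
        for peer in dr_drfor:
            if peer not in targets:
                missing.append((server, peer))
    # Each DR server should name every primary server in drfor.
    for server, sources in dr_drfor.items():
        for peer in primary_drto:
            if peer not in sources:
                missing.append((server, peer))
    return sorted(missing)


# Hypothetical addresses, for illustration only.
primary = {
    "p1:8085": ["d1:8085", "d2:8085"],
    "p2:8085": ["d1:8085"],            # d2 was left out of this list
}
dr = {
    "d1:8085": ["p1:8085", "p2:8085"],
    "d2:8085": ["p1:8085", "p2:8085"],
}
problems = find_missing_dr_links(primary, dr)
```

    An empty result means the two sets of lists are mutually consistent under the stated assumption. It does not confirm that the addresses are reachable.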

  3. Configure a persistence cluster and stores.

    You may use the following for reference:

    • samples/yaml/dr/dr-cluster-sample.json

    • samples/yaml/dr-secure/dr-cluster-sample.json

    At the persistence cluster level, select the DR Enabled checkbox.

    At the persistence store level, the store should be replicated. Inspect each persistence store definition, and ensure that its Replicated checkbox is selected.

    The set of persistence services should consist of two equivalent and parallel subsets:

    • Services at the main site

    • Services at the disaster recovery site

    Select the main site subset as the primary set.

    Configure the disaster recovery transport of each persistence service so that every persistence service at the main site can communicate with every persistence service at the disaster recovery site on this transport. That is, enable full mesh connectivity between the two sites within each persistence cluster. (Nonetheless, only one pair of services at a time connects across the WAN link.)

    In addition to configuring the disaster recovery transports in the realm, it is good practice to verify the configuration of routers, firewalls, and other network infrastructure that connects the two sites.

    Deploy the realm definition.
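
    The full mesh requirement in this step means that the number of cross-site paths to verify grows as the product of the two subset sizes. The following Python sketch is illustrative only. It enumerates every main-site and disaster recovery pair and reports the pairs that you have not yet confirmed. The service names are hypothetical, and the set of confirmed pairs is assumed to come from your own network checks.

```python
from itertools import product


def dr_transport_pairs(main_services, dr_services):
    """List every (main, dr) pair that the DR transport must support."""
    return list(product(main_services, dr_services))


def unconfirmed_pairs(main_services, dr_services, confirmed):
    """Return the required pairs that are absent from the confirmed set."""
    return [
        pair
        for pair in dr_transport_pairs(main_services, dr_services)
        if pair not in confirmed
    ]


# Hypothetical service names, for illustration only.
main_site = ["ps_a", "ps_b", "ps_c"]
dr_site = ["ps_a_dr", "ps_b_dr", "ps_c_dr"]

required = dr_transport_pairs(main_site, dr_site)
confirmed = set(required) - {("ps_b", "ps_c_dr")}  # one path not yet checked
todo = unconfirmed_pairs(main_site, dr_site, confirmed)
```

    With three services at each site, nine paths must be possible, even though only one pair is connected at any given time.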

  4. Verify replication of data to the disaster recovery site.

    Use the administrative GUI to verify that the primary site and the disaster recovery site store the same number of messages.
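
    If you record the message counts for several stores, a short script can make the comparison less error-prone. The following Python sketch is illustrative only. It assumes you have noted the counts by hand from the GUI at each site. It does not query FTL, and the store names and numbers are hypothetical.

```python
def compare_store_counts(primary_counts, dr_counts):
    """Return {store: (primary_count, dr_count)} for stores whose counts differ.

    A store that is missing at one site is reported with None for that site.
    """
    mismatches = {}
    for store in sorted(set(primary_counts) | set(dr_counts)):
        primary = primary_counts.get(store)
        dr = dr_counts.get(store)
        if primary != dr:
            mismatches[store] = (primary, dr)
    return mismatches


# Hypothetical counts noted from the GUI at each site.
primary_counts = {"orders": 1200, "audit": 75, "alerts": 10}
dr_counts = {"orders": 1200, "audit": 70}

differences = compare_store_counts(primary_counts, dr_counts)
```

    Counts read at different moments can differ while publishers are still active, so compare readings taken when message traffic is quiet.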