High Availability Cluster Design

The TIBCO Cloud™ API Management - Local Edition architecture is flexible and can be scaled up or down as needed. Beyond TPS considerations, however, achieving High Availability (HA) requires careful cluster design. Container orchestrators (K8S or Swarm) handle component failures: if a Local Edition component shuts down for any reason, the orchestrator starts a replacement to maintain the configured number of instances, provided all scheduling criteria are satisfied. To achieve HA when the K8S or Swarm infrastructure itself fails (for example, a node failure or a zone failure), extra planning is required before you create the cluster. HA at different levels requires different planning. The following sections provide general guidelines for achieving HA so that the Local Edition cluster keeps working as expected.

Cluster considerations for High Availability

Node/Instance Redundancy

If your K8S or Swarm cluster has been sized only to meet a TPS requirement, a node failure might degrade Local Edition functioning. The situation is more challenging if you placed deployment constraints during the initial deployment without considering HA. For example, a constraint might require that three NoSQL pods be deployed on three different nodes in a K8S cluster with three worker nodes. If a node fails in this scenario, the NoSQL pod that was running on it cannot be redeployed on the remaining two nodes. Until the third node rejoins the K8S cluster, the Local Edition cluster continues to work (assuming no other constraint) but remains in an inconsistent state. TIBCO recommends having one or more extra nodes in the K8S/Swarm cluster if any pod/container deployment constraint is in place. Even with no deployment constraint, extra nodes can help maintain TPS when a node fails.
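The three-node NoSQL scenario above can be illustrated with a small simulation. This is only a sketch of a hard one-pod-per-node placement rule, not the actual K8S or Swarm scheduler; the function name, pod names, and node names are hypothetical and chosen for illustration.

```python
def place_with_anti_affinity(replicas, healthy_nodes):
    """Place replicas under a hard one-pod-per-node rule (illustrative only).

    Returns (placement, pending): placement maps each scheduled replica to
    its node, pending lists replicas that found no free node.
    """
    placement = {}
    pending = []
    free_nodes = list(healthy_nodes)
    for replica in replicas:
        if free_nodes:
            # Each node accepts at most one replica.
            placement[replica] = free_nodes.pop(0)
        else:
            pending.append(replica)
    return placement, pending


# Hypothetical pod names for the three NoSQL replicas.
replicas = ["nosql-0", "nosql-1", "nosql-2"]

# Three workers, one has failed: two nodes remain for three replicas.
_, pending_without_spare = place_with_anti_affinity(
    replicas, ["node-a", "node-b"]
)

# Four workers, one has failed: three nodes remain, so every replica fits.
_, pending_with_spare = place_with_anti_affinity(
    replicas, ["node-a", "node-b", "node-d"]
)

print("pending without spare node:", pending_without_spare)
print("pending with spare node:", pending_with_spare)
```

With only three worker nodes, one replica stays unscheduled after a node failure. With a fourth node available, all three replicas can be placed, which is the reason for keeping spare capacity when such constraints are in place.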

Availability Zone Redundancy

API Management - Local Edition supports multi-zone deployment. You can deploy Local Edition components across different availability zones on all major cloud platforms (AWS, Azure, GCP, and so on). In a multi-zone deployment, configuration and token data are continuously synced across zones, so if one zone fails, another zone can still serve the traffic. Plan for multi-zone deployment during the initial planning phase, because extending an existing cluster is currently not supported. Out-of-the-box multi-zone deployment is supported only in K8S, although it can also be achieved in Docker Swarm.
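When planning a multi-zone layout, it can help to check that no component depends on a single zone. The following is a minimal sketch of such a check, assuming you can list each replica together with the zone it runs in. The component and zone names are placeholders, and the functions are hypothetical helpers, not part of Local Edition or of any orchestrator API.

```python
from collections import defaultdict


def zones_per_component(replica_zones):
    """Map each component to the set of zones that host its replicas."""
    zones = defaultdict(set)
    for component, zone in replica_zones:
        zones[component].add(zone)
    return dict(zones)


def single_zone_components(replica_zones):
    """Return components whose replicas all sit in one zone.

    These components would be unavailable if that zone failed.
    """
    by_component = zones_per_component(replica_zones)
    return sorted(
        component
        for component, zones in by_component.items()
        if len(zones) < 2
    )


# Placeholder layout: (component, zone) for every running replica.
layout = [
    ("nosql", "zone-1"),
    ("nosql", "zone-2"),
    ("traffic-manager", "zone-1"),
    ("traffic-manager", "zone-2"),
    ("cache", "zone-1"),
    ("cache", "zone-1"),
]

at_risk = single_zone_components(layout)
print("components confined to one zone:", at_risk)
```

In this placeholder layout, both cache replicas run in the same zone, so the check flags that component as exposed to a single zone failure.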