Primer on Clustering and HA Using StreamBase

What is StreamBase Clustering?

StreamBase clustering is an umbrella term for a group of technologies that allow multiple StreamBase Servers to be managed together to achieve certain goals.

The goals of implementing a server cluster vary widely:

High Availability (HA). Where StreamBase event processing is critical to an enterprise's operations, a server cluster can be used to maximize the uptime of the processing engine.
Fault Tolerance. A server cluster can provide an available backup server to take over in the event of certain hardware or software failures on the primary server.
Disaster Recovery. Server cluster technologies can be used to provide a hot or warm offsite backup for a critical event processing engine.
Scalability. Clustering technologies can be used to add processing power to an event processing engine by adding servers that share the load.

StreamBase Components That Support Clustering

The following StreamBase components and features support a clustered or high-availability solution.

Component	Discussion	Link to topic
Containers	Containers are essential for the implementation of clustering and HA. Containers allow the application logic and HA logic to be separated. The HA container can do the work of monitoring the other server in a cluster pair, leaving the application logic unaffected.	Using Containers
Dynamic Container Control	Containers can be modified while running to add and remove container connections, and to start and stop enqueuing and dequeuing from the container's ports.	Container Overview, sbadmin
Synchronous Container Connections	Connections between containers can be designated as synchronous to reduce the latency of the container connections.	Synchronous Container Connections
HA Heartbeat Adapter	The HA Heartbeat Input adapter is used in an HA logic application to monitor its counterpart in the other member of a cluster pair.	HA Heartbeat Input Adapter
StreamBase to StreamBase Adapters	StreamBase to StreamBase input and output adapters can be used to communicate between components in a cluster of StreamBase applications.	StreamBase to StreamBase Input Adapter, StreamBase to StreamBase Output Adapter
Precompiled Applications	You can start an instance of StreamBase Server with a precompiled application file, which enables fast startup of applications, and fast restart in an HA handover situation.	Precompiled Application Archives
Leadership Status	Each StreamBase Server instance can be designated as leader or nonleader of a cluster. Changes in leadership status are announced on the `system.control` stream. You can connect your application directly to this stream by using an expression to define an input stream as a connection to its container.	sbadmin, Using Control Stream Features, Query Table Replication Sample
Multiple URI Syntax	Some commands and configuration settings accept a string of comma-separated StreamBase URIs, which allows you to send the same command at the same time to each member of a cluster.	Multiple URI Syntax for HA Scenarios
External Process Operators	The External Process operators provide a way for StreamBase applications to run arbitrary operating system commands as if typed at the shell command prompt for the current operating system. This feature is especially useful in HA application contexts, where an application in one container might need to send an sbadmin command to an application in another container or on another StreamBase Server.	Using the External Process Operator
Built-in processes	StreamBase supports automatic table replication and automatic leadership tracking. You can use these features together to simplify and streamline high-availability applications that use Query Tables.	Replication of Query Tables, High Availability Samples
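
As a small illustration of the Multiple URI Syntax row above, the following Python sketch builds a comma-separated URI string for a cluster pair. The helper name `multi_uri`, the host names, and the port value are assumptions made for this example; consult the Multiple URI Syntax for HA Scenarios topic for the exact forms that each command accepts.

```python
def multi_uri(hosts, port=10000):
    """Join one sb:// URI per cluster member into a comma-separated string.

    The hosts and port are illustrative placeholders, not values from this topic.
    """
    return ",".join(f"sb://{host}:{port}" for host in hosts)


# Hypothetical host names for a two-member cluster.
cluster_uri = multi_uri(["primary.example.com", "secondary.example.com"])
print(cluster_uri)
# prints: sb://primary.example.com:10000,sb://secondary.example.com:10000
```

A string of this shape could then be passed wherever a command or configuration setting accepts multiple URIs.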

Using Event Processing to Solve HA Problems

StreamBase clustering and HA technologies have the following characteristics:

The components of clustering are integrated into the product at the design level.
Support for clustering is not a single switch you can enable; it is a group of components that support clustering in different ways.
You can use different cluster-enabled components in different ways, mixing and matching to solve your site's unique problems.

StreamBase recognizes clustering and HA as complex event processing problems. StreamBase clustering technologies are self-hosted, and use StreamBase event processing to implement support for clustered servers. The implementation languages are StreamSQL and EventFlow.

Template Strategies

Implementing StreamBase clustering is a matter of assembling components in different ways. StreamBase recognizes four major patterns for assembling components into working cluster solutions. These patterns form the basis of four template strategies that StreamBase offers for planning the implementation of clustering to meet different goals.

Hot-Hot Server Pair Template

A hot-hot server pair has the following characteristics:

Both servers in the pair receive all input and perform all calculations.
The secondary server is a fully participating server.
Application logic is identical on both servers.
Application logic and HA logic are implemented in separate containers.
Heartbeats are used between the HA containers of the two servers to monitor for a possible failure of the primary server.
The downstream system receives every output twice, once from each server.
Synchronization between application containers is optional.
Shared storage is optional.
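
The heartbeat monitoring described in the list above can be pictured with a generic timeout-based failure detector. This Python sketch is not the HA Heartbeat Input adapter and does not use any StreamBase API; the class name `HeartbeatMonitor`, the timeout value, and the injectable clock are assumptions made for illustration.

```python
import time


class HeartbeatMonitor:
    """Declare the peer failed if no heartbeat arrives within timeout_s seconds."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock            # injectable so the logic can be tested
        self.last_seen = clock()      # treat startup as the first heartbeat

    def beat(self):
        """Record a heartbeat received from the peer's HA container."""
        self.last_seen = self.clock()

    def peer_alive(self):
        """Return True while the most recent heartbeat is within the timeout."""
        return (self.clock() - self.last_seen) <= self.timeout_s


# Usage with a fake clock, so the example runs without real delays.
now = [0.0]
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: now[0])
now[0] = 2.0
monitor.beat()                 # heartbeat received at t=2.0
now[0] = 6.0
print(monitor.peer_alive())    # more than 3 seconds of silence
```

In an HA container, a transition from alive to not alive would be the trigger for the secondary to treat the primary as failed.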

The following diagram provides a high-level overview of hot-hot deployment.

Hot-Warm Server Pair Template

A hot-warm server pair is also called a quiet secondary deployment. It has the following characteristics:

Like hot-hot, both servers receive all input and perform all calculations, and application logic is identical on both servers.
In hot-warm, the secondary server does not send output downstream. All output streams and output adapters are present but disabled.
On failure of the primary, the secondary takes over and begins sending output messages.
Synchronization between application containers is optional, as it is with the Hot-Hot pair.
Shared storage is, again, optional.
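
The quiet secondary behavior above can be sketched as a gate on the output path: the secondary performs every calculation but sends nothing downstream until it is promoted. This is a conceptual Python model only; `QuietSecondary`, `process`, and `send` are hypothetical names, and in StreamBase the equivalent effect would come from enabling the disabled output streams and adapters.

```python
class QuietSecondary:
    """Process every input, but emit output only after promotion."""

    def __init__(self, process, send):
        self.process = process        # application logic, identical to the primary's
        self.send = send              # downstream output path
        self.output_enabled = False   # outputs are present but disabled

    def on_input(self, event):
        result = self.process(event)  # always compute, so state stays current
        if self.output_enabled:
            self.send(result)

    def promote(self):
        """Called when the primary is declared failed."""
        self.output_enabled = True


# Usage: outputs are suppressed until promote() is called.
sent = []
secondary = QuietSecondary(process=lambda x: x * 2, send=sent.append)
secondary.on_input(1)   # computed, not sent
secondary.promote()
secondary.on_input(2)   # computed and sent
print(sent)
```

Because the secondary has processed all input throughout, it can begin sending output immediately on takeover without rebuilding state.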

The following diagram provides a high-level overview of hot-warm deployment.

Shared Disk Template

The shared disk template has the following characteristics:

The backup server is a duplicate in application logic of the primary, and is ready to take over.
Critical application state information is replicated to shared storage.
The shared storage can be a disk query table on a storage area network (SAN), or can be a relational database.
On failure of the primary server, the backup server is started and restores the application state from the same storage medium.
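
The replicate-and-restore cycle above can be sketched with a relational store. This Python example uses SQLite from the standard library purely as a stand-in for the shared relational database; the table name `app_state` and the function names are assumptions, and a real deployment would use storage that both servers can reach.

```python
import json
import sqlite3


def open_store(path):
    """Open the shared store and create the state table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS app_state ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_state(conn, key, value):
    """Primary: replicate one piece of critical state to shared storage."""
    conn.execute(
        "INSERT OR REPLACE INTO app_state (key, value) VALUES (?, ?)",
        (key, json.dumps(value)),
    )
    conn.commit()


def restore_state(conn):
    """Backup: rebuild application state from shared storage on startup."""
    rows = conn.execute("SELECT key, value FROM app_state")
    return {key: json.loads(value) for key, value in rows}
```

In this sketch the primary would call `save_state` whenever critical state changes, and the backup would call `restore_state` once when it is started after a failure.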

The following diagram provides a high-level overview of shared disk deployment.

Fast Restart Template

The fast restart template is the simplest to implement. This template is best for stateless or near-stateless applications, and has the following characteristics:

A backup server is prepared and ready. The same hardware can be assigned as backup for multiple primary servers.
The server process on the primary server is constantly monitored.
On failure of the primary, the primary server process is restarted, using StreamBase components that minimize server startup time, such as precompiled application archives.
If the primary server fails to restart, the backup server is fast-started to take over processing.
Monitoring of the primary server and restart can be automated, or involve operations staff, or both.
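
The recovery sequence above amounts to a short decision procedure: try to restart the primary, and fall back to the backup if that fails. The Python sketch below captures only that ordering. The function `recover`, its callable parameters, and the retry count are hypothetical; how the server processes are actually started is left to the callables, which stand in for whatever your operations tooling runs.

```python
def recover(restart_primary, start_backup, max_restart_attempts=2):
    """Return which server ended up running: 'primary', 'backup', or 'failed'.

    restart_primary and start_backup are callables that return True on success.
    """
    for _ in range(max_restart_attempts):
        if restart_primary():
            return "primary"
    # The primary would not come back, so fast-start the backup server.
    if start_backup():
        return "backup"
    return "failed"


# Usage: simulate a primary that cannot be restarted.
outcome = recover(restart_primary=lambda: False, start_backup=lambda: True)
print(outcome)  # prints: backup
```

A procedure like this can run unattended, or the `"failed"` outcome can be used to page operations staff.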

The following diagram provides a high-level overview of fast restart deployment.

Disaster Recovery Scenario

To implement a disaster recovery scenario, an offsite implementation can combine the hot-warm and shared disk templates. The disaster recovery site would run an identical deployment, with shared storage implemented over a network connection using either SAN or relational database storage.

The following diagram illustrates one possible disaster recovery scenario:

Scalability Scenario

You can use clustering techniques to implement scaling of your StreamBase Server implementation from one to multiple servers. With planning, the same parallel code and data techniques allow you to add new servers to a stream processing cluster to meet higher load demands.
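
One common way to share load across servers is to partition the input by a key, so that each server runs the same code over its own slice of the data. The Python sketch below shows hash-based partitioning as a general technique; it is an assumption about how such a design might look, not a description of a StreamBase feature, and the function name `server_for` and the server URIs are illustrative.

```python
import zlib


def server_for(key, servers):
    """Pick the server responsible for a key using a stable hash."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


# Hypothetical cluster members; the same key always maps to the same server.
servers = ["sb://node1:10000", "sb://node2:10000", "sb://node3:10000"]
for symbol in ["IBM", "MSFT", "ORCL"]:
    print(symbol, "->", server_for(symbol, servers))
```

With simple modulo partitioning, adding a server changes the mapping for many keys, so any per-key state must be moved or rebuilt when the cluster grows. That is part of the planning the paragraph above refers to.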

The following diagram provides a high-level overview of StreamBase Server scalability:

Next Steps

Designing Applications for High Availability