Persistence Architecture
Persistence is flexible. Administrators can tailor various aspects of persistence to meet the needs of applications.
Storage, Replication, and Fault Tolerance
Persistence services manage stores and durable subscriptions (durables). A store holds messages until they are consumed. A persistent store is a store that is replicated to further guard against loss of messages.
Replication of stores across a cluster of persistence services protects against hardware or network failures on a small scale. (However, this replication scheme cannot guarantee delivery after catastrophic failures.)
Note that a replicated store can be configured even for a cluster with only one member.
Message Swapping
Stores typically hold message data in process memory to avoid the latency associated with disk I/O. However, with optional message swapping, if storage requirements exceed configured memory limits, excess messages are temporarily written to disk as needed. Message swapping can hedge against bursts of message traffic.
You can set a memory limit (Swap Byte Limit) on both a per-store and a per-durable basis. If a message would cause either limit to be exceeded, it is swapped out to disk.
As a good practice, for last-value durables, configure either a swap memory limit of zero (swap everything to disk) or a limit high enough to contain everything in that durable. Otherwise, performance for the durable may be variable.
For standard durables without prefetch, configuring a swap memory limit greater than zero is not expected to increase throughput, because messages are typically delivered on the direct path.
For shared durables and standard durables with prefetch, configuring a swap memory limit greater than a typical backlog size for those durables may improve throughput. However, throughput can vary if either the durable or the store swap memory limits are exceeded.
You can enable message swapping from the administrative GUI, via REST API, or via YAML configuration file. For example, in the YAML file, to enable message swapping on the default cluster:
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.swap: true
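For instance, with a persistence server named ftlserver1 (an illustrative name standing in for the <ftlserver name> placeholder), the same setting would look like this:
servers:
  ftlserver1:
    - realm:
        default.cluster.disk.swap: true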
Disk-Based Persistence
You can store FTL messages and metadata on multiple disks when disk access is more readily available and cost-effective than using memory. These messages and metadata can then be automatically recovered on a full restart of the persistence cluster.
Disk persistence is enabled on a per-cluster basis, in one of two modes:
- sync - The client returns from a send-message call after the message has been written to a majority of disks. This mode generally provides consistent data and robustness, but at the cost of increased latency and lower throughput. If the cluster restarts, no data is lost; performance is limited by disk performance.
- async - The client may return from a send-message call before the message has been written to disk by a majority of the FTL servers. This mode generally provides lower latency and higher throughput, but messages could be lost if a majority of servers restart shortly after the API call.
You can enable disk-based persistence from the administrative GUI, via REST API, or via YAML configuration file. For example, in the YAML file, to enable disk persistence on the default cluster:
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.persistence: sync
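For comparison, a sketch of the same property selecting the async mode described above (only the value changes):
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.persistence: async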
Non-replicated stores are never persisted to disk, though they may be swapped to disk if disk swapping is enabled.
Setting Automatic Disk Persistence File Compaction
A persistence service can compact its disk persistence files while running online. There is no interruption to active publishers or subscribers. See Compact Disk Persistence Files with Persistence Service Online.
You can enable automatic file compaction from the administrative GUI, via REST API, or via YAML configuration file. For example, the following YAML configuration disables auto compaction in the initial realm configuration for the default cluster:
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.nocompact: true
Auto compaction can be disabled by YAML file only for the default cluster. For other clusters, use the GUI or the REST API.
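Taken together, the disk-related cluster properties shown in this section could be combined in one realm block, along the following lines. This is only a sketch that gathers the properties already shown above; the values you actually set depend on your deployment.
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.swap: true
        default.cluster.disk.persistence: sync
        default.cluster.disk.nocompact: true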
Latency
Using persistence for delivery assurance is compatible with high-speed, low-latency message delivery. Delivery assurance operates alongside regular direct-path delivery. Transports carry messages directly from publishers to subscribers, without an intermediary hop through a persistence service that would add message latency. Separate transports carry messages from publishers to standard durables in a store within the persistence services, which retain them for as long as subscribers might need to recover them.
However, using persistence to apportion message streams or for last-value availability emphasizes throughput rather than the lowest latency. Delivery through durables replaces direct-path delivery. The persistence service is an intermediary hop, which adds message latency.
Meanwhile, a message broker emphasizes the convenience of a well-known pattern and minimal configuration, at the cost of added latency.
Wide-area stores involve the inherent latency of a WAN.
Publisher Quality of Service
For each store, administrators can strike an appropriate balance between performance requirements and the need to confirm message replication.
Subscriber Acknowledgment
Within application programs, subscribers can acknowledge message delivery automatically or explicitly.
Administrators can configure durables to receive individual acknowledgments synchronously, or in asynchronous batches.
Durable Creation
Administrators can arrange for dynamic durables, which applications create as needed. Dynamic durables require minimal administrative configuration. Programmers take responsibility for the number of durables and their names.
Administrators can define static durables in the realm. Static durables require more administrative configuration and greater coordination between programmers and administrators. Administrators control the number of durables and their names.
Durables can be configured for Total Time to Live (TTL). Durables with a low TTL value are considered ephemeral durables.
Persistence Effectiveness
The flexibility of FTL persistence allows for various levels of persistence effectiveness, depending on factors such as the number of replicated stores, data limits, store and persistence service hosting, and durable TTL. Persistence is generally considered adequately effective when stores are replicated and durables are non-ephemeral.
Logs
The persistence service reports an estimate of its own disk usage, and other statistics, via the monitoring stream and, periodically, in the log.
The persistence service periodically logs statistics about message rates. If disk persistence is configured, statistics about disk usage and disk writes are also periodically logged.
See GET persistence/clusters/<clus_name>/servers and Catalog of Persistence Metrics.