Persistence Architecture
Persistence is flexible. Administrators can tailor various aspects of persistence to meet the needs of applications.
Storage, Replication, and Fault Tolerance
Persistence services manage stores and durable subscriptions (durables). A store holds messages until they are consumed. A persistent store is a store that is replicated to further ensure against loss of messages.
Replication of stores across a cluster of persistence services protects against hardware or network failures on a small scale. (However, this replication scheme cannot guarantee delivery after catastrophic failures.)
Note that for clusters with only one member it is possible to configure a replicated store.
Message Swapping
Stores typically hold message data in process memory to avoid the latency associated with disk I/O. However, with optional message swapping, if storage requirements exceed configured memory limits, excess messages are temporarily written to disk as needed. Message swapping can hedge against message bursts.
You can set a memory limit (Swap Byte Limit) on a per-store and a per-durable basis. If storing a message in memory would exceed either limit, the message is swapped out to disk.
As a good practice, for last-value durables, configure either a swap memory limit of zero (swap everything to disk) or a limit high enough to contain everything in that durable. Otherwise, performance for the durable may be variable.
For standard durables without prefetch, configuring a swap memory limit greater than zero is not expected to increase throughput, because messages are typically delivered on the direct path.
For shared durables and standard durables with prefetch, configuring a swap memory limit greater than a typical backlog size for those durables may improve throughput. However, throughput can vary if either the durable or the store swap memory limits are exceeded.
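To make this sizing guidance concrete, the fragment below sketches per-durable swap limits in a store definition. The layout and the key name swap_byte_limit are hypothetical illustrations only; in practice you set the Swap Byte Limit for a store or durable through the administrative GUI or REST API.

stores:
  orders.store:                      # hypothetical store name
    durables:
      last_value_durable:
        swap_byte_limit: 0           # hypothetical key: zero swaps everything to disk
      shared_durable:
        swap_byte_limit: 536870912   # hypothetical key: sized above the typical backlog (512 MB)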
You can enable message swapping from the administrative GUI, via REST API, or via YAML configuration file. For example, in the YAML file, to enable message swapping on the default cluster:
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.swap: true
Disk-Based Persistence
You can store FTL messages and metadata on multiple disks when disk is more readily available and cost effective than memory. The messages and metadata can then be recovered automatically on a full restart of the persistence cluster.
Disk persistence is enabled on a per-cluster basis, in one of two modes:
- sync - The client returns from a send-message call after the message has been written to a majority of disks. This mode generally provides consistent data and robustness, but at the cost of increased latency and lower throughput. If the cluster restarts, no data is lost; performance is subject to disk performance.
- async - The client may return from a send-message call before the message has been written to disk by a majority of the FTL servers. This mode generally provides lower latency and higher throughput, but messages could be lost if a majority of servers restart shortly after the API call.
You can enable disk-based persistence from the administrative GUI, via REST API, or via YAML configuration file. For example, in the YAML file, to enable disk persistence on the default cluster:
servers:
  <ftlserver name>:
    - realm:
        default.cluster.disk.persistence: sync
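If the higher latency of sync mode is not acceptable, the same key accepts the async mode described above; a minimal sketch, substituting the other documented value:

servers:
  <ftlserver name>:
    - realm:
        # async favors latency and throughput; messages may be lost if a
        # majority of servers restart shortly after the send call
        default.cluster.disk.persistence: async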
Non-replicated stores are never persisted to disk, though they may be swapped to disk if disk swapping is enabled.
Handling Persistence Service Disk Capacity
The FTL Server configuration parameter max.disk.fraction monitors disk capacity and prevents a disk-full state, keeping the persistence cluster running when disk space runs low. For example, assume a disk size of 10GB and the default max.disk.fraction of 0.95. The persistence service would stop accepting messages once disk usage reaches 9.5GB.
For FTL 6.9.0, this feature is only available on Linux platforms.
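For reference, a minimal sketch of setting this parameter in the FTL server YAML file. The section in which persistence service parameters appear is an assumption here; see max.disk.fraction in Persistence Service Configuration Parameters for the authoritative syntax.

servers:
  <ftlserver name>:
    - persistence:                  # assumed location for persistence service parameters
        max.disk.fraction: 0.95     # default; stop accepting messages at 95% of disk capacity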
Considerations
- Persistence services must have disk persistence enabled to enforce max.disk.fraction.
- Only replicated stores are persisted to disk.
- Even with max.disk.fraction enabled, you still need to configure reasonable byte limits and message limits at the cluster level, store level, or both. See max.disk.fraction in Persistence Service Configuration Parameters.
- The persistence service measures total disk usage, not just its own, and compares that to max.disk.fraction. For example, if several persistence services use the same disk, max.disk.fraction is compared to all disk usage across all persistence services and any other processes.
- The disk volume with the persistence data directory should have the same disk capacity for each persistence service in a cluster. Each persistence service in a cluster should also use the same value of max.disk.fraction.
Values and Behavior
The max.disk.fraction default value is 0.95. Publish calls fail once the total disk usage approaches the max.disk.fraction setting multiplied by the capacity of the disk that contains the persistence data directory. The persistence service may go over or under the limit by a small amount. A best practice is to allow for some overage so the persistence service continues to process subscriber acknowledgments while the disk is nearly full. The default value of 0.95 should allow for sufficient overage in common scenarios. In high fan-out cases, where many subscribers must acknowledge the same message before it can be deleted, consider reducing max.disk.fraction.
The impact of publish failures due to the disk usage limit depends on the publisher mode:
- If the publisher mode is store_confirm_send, the publish will be retried automatically by the FTL client library for the publisher's retry duration. This allows some time for subscribers to consume messages and free disk space before the publish call returns an exception.
- If the publisher mode is store_send_noconfirm, there is no retry, and the call returns immediately with no exception.
Once enough disk space has been freed, the publish call will no longer fail.
Values for max.disk.fraction less than 0 or greater than 1 are not allowed.
A value of 0 disables the feature, which means there is no limit on disk usage. When max.disk.fraction is set to 0 and persistence services have disk persistence enabled, publish calls will not stop, and an overfull backlog may cause a system failure.
Logs
The persistence service reports an estimate of its own disk usage, and other statistics, via the monitoring stream and, periodically, in the log.
The persistence service periodically logs statistics about message rates. If disk persistence is configured, statistics about disk usage and disk writes are also logged periodically.
See GET persistence/clusters/<clus_name>/servers and Catalog of Persistence Metrics.
Latency
Using persistence for delivery assurance is compatible with high-speed, low-latency message delivery. Delivery assurance operates alongside regular direct-path delivery. Transports carry messages directly from publishers to subscribers without an intermediary hop through a persistence service, which would add message latency. Separate transports carry messages from publishers to standard durables in a store in the persistence services, which retain them for as long as subscribers might need to recover them.
However, using persistence to apportion message streams or for last-value availability emphasizes throughput rather than the lowest latency. Delivery through durables replaces direct-path delivery. The persistence service is an intermediary hop, which adds message latency.
Meanwhile, a message broker emphasizes the convenience of a well-known pattern and minimal configuration, at the cost of added latency.
Wide-area stores involve the inherent latency of a WAN.
Publisher Quality of Service
For each store, administrators can balance appropriately between performance requirements and the need to confirm message replication.
Subscriber Acknowledgment
Within application programs, subscribers can acknowledge message delivery automatically or explicitly.
Administrators can configure durables to receive individual acknowledgments synchronously, or in asynchronous batches.
Durable Creation
Administrators can arrange for dynamic durables, which applications create as needed. Dynamic durables require minimal administrative configuration. Programmers take responsibility for the number of durables and their names.
Administrators can define static durables in the realm. Static durables require more administrative configuration and greater coordination between programmers and administrators. Administrators control the number of durables and their names.
Durables can be configured with a time to live (TTL). Durables with a low TTL value are considered ephemeral durables.
Persistence Effectiveness
The flexibility of FTL persistence allows for various levels of persistence effectiveness, depending on factors such as the number of replicated stores, data limits, store and persistence service hosting, and durable TTL. Persistence is generally considered adequately effective when stores are replicated and durables are non-ephemeral.