Performance: Disk Persistence
Persistent messages (and acknowledgments of persistent messages) may be stored in memory or on disk.
Storing messages (and acknowledgments) in memory improves performance by eliminating the cost of writing to disk. However, when persistent messages and acknowledgments are stored in memory, they can be lost if a majority of persistence services in a persistence cluster restart simultaneously.
For example, if there are three persistence services in a persistence cluster, and two of them restart simultaneously, then messages or acknowledgments might be lost.
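The majority rule in this example can be expressed as a small calculation. The following is a minimal sketch, not part of any product API: the function names `majority` and `in_memory_data_at_risk` are hypothetical, and "majority" is taken to mean more than half of the services in the cluster.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of services that is more than half of the cluster."""
    return cluster_size // 2 + 1


def in_memory_data_at_risk(cluster_size: int, simultaneous_restarts: int) -> bool:
    """Return True if in-memory messages or acknowledgments might be lost.

    Hypothetical helper: data held only in memory is at risk when the number
    of services restarting at the same time reaches a majority.
    """
    return simultaneous_restarts >= majority(cluster_size)


# Three services, two restart together: a majority, so data might be lost.
print(in_memory_data_at_risk(3, 2))
# Three services, one restarts: the remaining two still hold the data.
print(in_memory_data_at_risk(3, 1))
```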
Storing messages and acknowledgments on disk has the following advantages:
- Greater durability: Messages are not lost when multiple persistence services restart simultaneously (unless the disks themselves are lost).
- Reduced memory usage: After a message is written to disk, the persistence service can optionally free the message data from memory, so that message fields set by an application do not consume memory on the server host. Note that the persistence service must still retain some metadata in memory for each message, so overall memory usage is proportional to the number of pending messages (unless indexes on disk are enabled; also see Performance: Indexes on Disk).
- Reduced recovery time: When a persistence service restarts, it can recover message data from the local disk rather than copying it from another replica. Note that the persistence service must recover metadata for each message, so overall recovery time is proportional to the number of pending messages (unless indexes on disk are enabled; also see Performance: Indexes on Disk).
Enabling disk persistence comes at the cost of some performance. You have two choices:
- Synchronous disk persistence: Confirmations for messages (or acknowledgments) are not sent to producers (or consumers) until the message or acknowledgment has been stored persistently on disk.
- Asynchronous disk persistence: Confirmations for messages (or acknowledgments) are sent to producers (or consumers) once the OS has buffered the write for an eventual flush to disk. The message or acknowledgment may or may not be stored persistently on disk when the application receives a confirmation.
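The difference between the two modes can be illustrated with ordinary file operations. This is a generic sketch of the idea, not the persistence service's actual implementation: the function names and the `"confirmed"` return value are hypothetical, and `os.fsync` stands in for "stored persistently on disk".

```python
import os
import tempfile


def write_all(fd: int, record: bytes) -> None:
    """Write the whole record; os.write may write only part of it per call."""
    offset = 0
    while offset < len(record):
        offset += os.write(fd, record[offset:])


def store_synchronously(fd: int, record: bytes) -> str:
    """Confirm only after asking the OS to flush the data to the device."""
    write_all(fd, record)
    os.fsync(fd)  # blocks until the OS reports the data as written to storage
    return "confirmed"


def store_asynchronously(fd: int, record: bytes) -> str:
    """Confirm as soon as the OS has buffered the write; the flush comes later."""
    write_all(fd, record)
    return "confirmed"


def demo() -> int:
    """Store one record in each mode and return the resulting file size."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "store.dat")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
        try:
            store_synchronously(fd, b"message-1\n")
            store_asynchronously(fd, b"message-2\n")
        finally:
            os.close(fd)
        return os.path.getsize(path)


print(demo())
```

Both calls return the same confirmation to the caller. The only difference is whether the caller waited for the flush before receiving it.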
Synchronous disk persistence offers the greatest durability, since the data has been stored persistently on disk by the time the application receives a confirmation. However, each time the application makes a synchronous send call or acknowledgment call, it must pay the latency cost of a write to disk.
Some disks, often some type of network storage, have high latency but nevertheless offer high throughput. For these cases, asynchronous disk persistence may improve performance significantly, because the OS is allowed to batch multiple messages into one write to disk. The trade-off is that, because applications receive confirmation as soon as the OS has buffered the write, the most recent portion of the message or acknowledgment data might be lost if a majority of the server hosts crash.
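A rough calculation shows why batching helps on a high-latency disk. The sketch below rests on stated assumptions rather than measurements: writes are issued one after another, each write costs a fixed latency regardless of how many messages it carries, and the 5 ms latency and batch size of 100 are hypothetical figures chosen only for illustration.

```python
def max_messages_per_second(write_latency_ms: int, messages_per_write: int) -> float:
    """Upper bound on message rate when writes are issued one after another.

    Simplifying assumption: every write takes write_latency_ms, no matter
    how many messages are batched into it.
    """
    return messages_per_write * 1000 / write_latency_ms


# One message per write, as when every send waits for its own disk write.
print(max_messages_per_second(write_latency_ms=5, messages_per_write=1))
# One hundred messages batched into each write by the OS.
print(max_messages_per_second(write_latency_ms=5, messages_per_write=100))
```

Under these assumptions the per-write latency is unchanged, but it is shared across every message in the batch, which is where the throughput gain comes from.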
For details, see Disk-Based Persistence. For information about how to configure disk persistence, see Cluster Details Panel.