In a distributed cache, cached object data is partitioned among the storage PUs in the cache cluster for efficient use of memory. This means that no two storage PUs have primary responsibility for the same item of data. A distributed caching scheme has the following characteristics:
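The partitioning idea can be sketched as a hash of each key mapping to exactly one storage PU. This is an illustration of the general technique only, not the actual partitioning algorithm of any particular cache provider; the class and method names are hypothetical.

```java
// Sketch: hash-based partitioning of cache keys across storage PUs.
// Illustrative only -- names and algorithm are assumptions, not the
// provider's implementation.
public class Partitioner {

    // Map a key to exactly one storage PU, so no two PUs have primary
    // responsibility for the same entry.
    public static int ownerOf(Object key, int storagePuCount) {
        // Math.floorMod keeps the result non-negative even when
        // hashCode() is negative.
        return Math.floorMod(key.hashCode(), storagePuCount);
    }

    public static void main(String[] args) {
        // Each key maps to exactly one of the four storage PUs.
        System.out.println("order-1 -> PU " + ownerOf("order-1", 4));
    }
}
```

Because the mapping is deterministic, any node can compute which PU owns a key without consulting a central directory.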
When a node hosting a cache agent fails, the object manager redistributes objects among the remaining cache agents, using backup copies, provided that enough cache agents remain to maintain the configured number of backups and that they have sufficient memory to handle the additional load.
However, because this is a memory-based system, if one cache agent fails, and then another cache agent fails before the data can be redistributed, data may be lost. To avoid this issue, use a backing store.
If redistribution is successful, the complete cache of all objects, plus the specified number of backups, is restored. When the failed node starts again, the object management layer again redistributes cache data.
Specifically, when a cache agent JVM fails, the cache agent that maintains the backup of the failed JVM’s cache data objects takes over primary responsibility for that data. If two backup copies are specified, then the cache agent responsible for the second backup copy is promoted to primary backup. Additional backup copies are made according to the configuration requirements. When a new cache agent comes up, data is again redistributed across the cluster to make use of this new cache agent.
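The promotion chain described above can be sketched as an ordered list of owners per partition: the head is the primary, the rest are backups in promotion order. The agent names and the mechanics here are illustrative assumptions, not the product's internal data structures.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of backup promotion when a primary cache agent's JVM fails.
// Illustrative only -- agent names and mechanics are assumptions.
public class PartitionOwners {

    // Head of the deque is the primary; the rest are backups in
    // promotion order (first backup, second backup, ...).
    private final Deque<String> owners = new ArrayDeque<>();

    public PartitionOwners(String primary, String... backups) {
        owners.add(primary);
        for (String b : backups) {
            owners.addLast(b);
        }
    }

    public String primary() {
        return owners.peekFirst();
    }

    // When the primary fails, the first backup takes over primary
    // responsibility, and a surviving agent is enlisted as a new
    // backup to restore the configured backup count.
    public void failPrimary(String replacementBackup) {
        owners.removeFirst();              // failed primary is gone
        owners.addLast(replacementBackup); // restore the backup count
    }

    public static void main(String[] args) {
        PartitionOwners p = new PartitionOwners("agentA", "agentB", "agentC");
        p.failPrimary("agentD");
        System.out.println("new primary: " + p.primary()); // prints "new primary: agentB"
    }
}
```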
Because they store data in memory, cache-based systems are reliable only to the extent that enough cache agents with sufficient memory are available to hold the objects. If one cache agent fails, objects are redistributed to the remaining cache agents, provided they have enough memory. If the backup count is one, then one cache agent at a time can fail without risk of data loss; more generally, a backup count of N tolerates N failures before redistribution completes. In the case of a total system failure, however, the cache is lost.
Performance is best when all the data is in cache. But if the amount of data exceeds the amount of memory available in the cache machines, you must limit the cache size and use a backing store to store additional data. Depending on the application needs, you can use the backing store as the main storage and retrieve objects from the backing store as needed.
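Using the backing store as the main storage amounts to a read-through pattern: the cache serves what it holds and loads misses from the store on demand. The sketch below uses a plain function as the store lookup; in practice this would be a database or other persistent store configured as the backing store, and the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a read-through cache in front of a backing store.
// Illustrative only -- the real backing store would be a persistent
// store, not an in-process function.
public class ReadThroughCache {

    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> backingStore;

    public ReadThroughCache(Function<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    // Return the cached value if present; otherwise load it from the
    // backing store and cache it for subsequent reads.
    public String get(String key) {
        return cache.computeIfAbsent(key, backingStore);
    }

    public static void main(String[] args) {
        ReadThroughCache c = new ReadThroughCache(key -> "loaded:" + key);
        System.out.println(c.get("order-1")); // prints "loaded:order-1"
    }
}
```

With this pattern, the cache size can be limited safely: an evicted object is not lost, only reloaded from the backing store on its next access.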
With a limited cache, objects are evicted when the number of entries exceeds the limit. The Coherence cache provider uses a hybrid eviction policy. TIBCO BusinessEvents DataGrid uses a Least Recently Used (LRU) policy.
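An LRU policy with a fixed entry limit can be sketched with `java.util.LinkedHashMap` in access-order mode. This shows the eviction rule in general terms only; it is not the DataGrid implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an LRU cache with a fixed entry limit, using
// LinkedHashMap's access-order mode. Illustrative only.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.maxEntries = maxEntries;
    }

    // Called after each insert; evicts the least recently used entry
    // once the configured limit is exceeded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.put("c", 3);                 // exceeds the limit; "a" is evicted
        System.out.println(cache.keySet()); // prints [b, c]
    }
}
```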
A hybrid eviction policy chooses which entries to evict based on a weighted score combining how often and how recently they were accessed, evicting first the entries that are accessed least frequently and least recently.
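The weighted score can be sketched as a sum of a frequency term and a recency term. The formula and weights below are assumptions chosen for illustration; the actual hybrid policy's scoring function is internal to the provider.

```java
// Sketch of a hybrid eviction score combining access frequency and
// recency. The formula and weights are illustrative assumptions only.
public class HybridScore {

    // Higher score = more valuable = evicted later. The weighted sum
    // favors entries accessed often (high count) and recently (low age),
    // so the lowest-scoring entries are evicted first.
    public static double score(long accessCount, long lastAccessMillis,
                               long nowMillis, double frequencyWeight,
                               double recencyWeight) {
        double ageSeconds = (nowMillis - lastAccessMillis) / 1000.0;
        return frequencyWeight * accessCount - recencyWeight * ageSeconds;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Accessed 50 times, last touched one second ago.
        System.out.println(score(50, now - 1_000, now, 1.0, 1.0)); // prints 49.0
    }
}
```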