
Characteristics of Distributed Caching Schemes
The cache characteristics are defined by a caching scheme. The provided caching schemes are all distributed caching schemes, and the appropriate scheme is chosen internally based on configuration choices. This section explains in more detail the advantages of a distributed scheme over a replicated scheme, in which all data is replicated in all JVMs.
In a distributed cache, cached object data is partitioned between the storage PUs in the cache cluster for efficient use of memory. This means that no two storage PUs are responsible for the same item of data. A distributed caching scheme has the following characteristics:
Data is written to the cache and to one backup on a different JVM (or to more than one backup copy, depending on configuration). Therefore, memory usage and write performance are better than in a replicated cache scheme. There is a slight performance penalty because modifications to the cache are not considered complete until all backups have acknowledged receipt of the modification. The benefit is that data consistency is assured.
Each piece of data is managed by only one cluster node, so data access over the network is a "single-hop" operation. This type of access is extremely scalable, because it can use point-to-point communication and take advantage of a switched network.
Data is distributed evenly across the JVMs, so the responsibility for managing the data is automatically load-balanced across the cluster. The physical location of each cache is transparent to services (so, for example, API developers don’t need to be concerned about cache location).
The system can scale in a linear manner. No two servers (JVMs) are responsible for the same piece of cached data, so the size of the cache and the processing power associated with the management of the cache can grow linearly as the cluster grows.
Overall, the distributed cache system is the best option for systems with a large data footprint in memory.
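The single-hop property described above can be sketched in a few lines of Java. This is an illustrative sketch only, not the product's actual partitioning code: the node names and the simple hash-modulo placement are assumptions, and a real implementation would use a more sophisticated partition map. The point is that any client can compute a key's primary owner locally, with no directory lookup, and that the backup never shares a JVM with the primary.

```java
import java.util.List;

// Illustrative sketch (not the TIBCO API): hash-based partitioning in which
// every key has exactly one primary owner, enabling single-hop access.
public class PartitionMap {
    private final List<String> nodes;

    public PartitionMap(List<String> nodes) {
        this.nodes = nodes;
    }

    // The primary owner is derived from the key's hash, so any client can
    // compute it locally and contact that node directly (one network hop).
    public String primaryOwner(Object key) {
        int h = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(h);
    }

    // The backup owner is the next node in the ring, so with a backup
    // count of one, the primary and backup copies are always on
    // different JVMs.
    public String backupOwner(Object key) {
        int h = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get((h + 1) % nodes.size());
    }
}
```

Because ownership is a pure function of the key and the node list, no two nodes ever own the same entry, which is what lets cache size and processing capacity grow linearly with the cluster.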
Failover and Failback of Distributed Cache Data
It is not necessary to use fault tolerance for cache agents: the cluster transparently handles failover of data to other cache agents if one cache agent fails.
The object manager handles failover of the cache data on a failed cache agent and it handles failback when the agent recovers.
When a storage node (that is, a node hosting a cache agent) fails, the object manager redistributes objects among the remaining storage nodes using the backup copies, provided that the number of remaining cache nodes is sufficient to maintain the configured number of backups, and that they have sufficient memory to handle the additional load.
However, because this is a memory-based system, if one node fails and another cache node then fails before the data can be redistributed, data loss can occur. To avoid this issue, use a backing store.
If redistribution is successful, the complete cache of all objects, plus the specified number of backups, is restored. When the failed node starts again, the object management layer again redistributes cache data.
Specifically, when a cache agent JVM fails, the cache agent that maintains the backup of the failed JVM’s cache data takes over primary responsibility for that data. If two backup copies are specified, then the cache agent responsible for the second backup copy is promoted to primary backup. Additional backup copies are made according to the configuration requirements. When a new cache agent comes up, data is again redistributed across the cluster to make use of this new cache agent.
Because they store data in memory, cache-based systems are reliable only to the extent that enough cache agents with sufficient memory are available to hold the objects. If a cache agent fails, objects are redistributed to the remaining cache agents, provided they have enough memory. If the backup count is one, then one cache agent can fail without risk of data loss. In the case of a total system failure, however, the cache is lost.
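The promotion behavior described above can be sketched as a small simulation. This is not the TIBCO implementation; the ownership-chain structure and agent names are assumptions made for illustration. Each key has an ordered chain of owners, with the primary at index 0. When an agent fails, it is removed from every chain it appears in: the former backup becomes the new primary, and a new backup is created on a surviving agent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch (not the TIBCO implementation): on agent failure, the
// backup owner of each affected entry is promoted to primary, and a fresh
// backup copy is created on another surviving agent.
public class FailoverSketch {
    // Ordered ownership chain per key: index 0 is the primary, the rest are backups.
    final Map<String, List<String>> owners = new HashMap<>();
    final List<String> liveAgents = new ArrayList<>();

    FailoverSketch(List<String> agents) {
        liveAgents.addAll(agents);
    }

    // Place a key on a primary and one backup, always on two different agents.
    void put(String key) {
        int p = Math.floorMod(key.hashCode(), liveAgents.size());
        List<String> chain = new ArrayList<>();
        chain.add(liveAgents.get(p));                            // primary
        chain.add(liveAgents.get((p + 1) % liveAgents.size()));  // backup
        owners.put(key, chain);
    }

    // Simulate an agent failure: promote backups and re-create lost copies.
    void fail(String agent) {
        liveAgents.remove(agent);
        for (List<String> chain : owners.values()) {
            if (chain.remove(agent)) {
                // The former backup (now at index 0) is the new primary.
                // Create a replacement backup on a surviving agent that
                // does not already hold a copy of this entry.
                for (String candidate : liveAgents) {
                    if (!chain.contains(candidate)) {
                        chain.add(candidate);
                        break;
                    }
                }
            }
        }
    }

    String primaryOf(String key) {
        return owners.get(key).get(0);
    }
}
```

Note that if the failed agent held the second backup rather than the primary, the same logic applies: the chain shrinks by one and a replacement copy is made, matching the promotion order described above.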
Limited and Unlimited Cache Size
The size of a cache can be limited or unlimited. A limited cache size is used when a backing store is available to hold evicted cache entries (see Specifying Limited Cache Size in TIBCO BusinessEvents Administration).
Performance is best when all the data is in cache. But if the amount of data exceeds the amount of memory available in the cache machines, you must limit the cache size and use a backing store to store additional data. Depending on the application needs, you can use the backing store as the main storage and retrieve objects from the backing store as needed.
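The relationship between a limited cache and its backing store can be sketched as a bounded map with write-through and read-through behavior. This is a simplified illustration, not the product's code: the in-memory `HashMap` stands in for the JDBC backing store, and the LRU-style eviction of `LinkedHashMap` stands in for the hybrid policy the product actually uses. The essential behavior is that an evicted entry is never lost; it remains in the store and is reloaded into the cache on the next access.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch (not the TIBCO API): a size-limited cache in front of
// a backing store. A full cache evicts its least recently used entry, and a
// miss transparently reloads the entry from the store.
public class LimitedCache<K, V> {
    private final int maxEntries;
    private final Map<K, V> store = new HashMap<>(); // stands in for the JDBC backing store
    private final LinkedHashMap<K, V> cache;

    public LimitedCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // Access-ordered map: removeEldestEntry gives simple LRU eviction
        // (the real product uses a hybrid frequency/recency policy).
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > LimitedCache.this.maxEntries;
            }
        };
    }

    public void put(K key, V value) {
        store.put(key, value); // write-through: the store always holds every entry
        cache.put(key, value);
    }

    public V get(K key) {
        V v = cache.get(key);
        if (v == null) {          // cache miss: read through from the backing store
            v = store.get(key);
            if (v != null) {
                cache.put(key, v);
            }
        }
        return v;
    }

    public boolean isCached(K key) {
        return cache.containsKey(key);
    }
}
```

With the backing store acting as the main storage in this way, the in-memory cache holds only the working set, which is the configuration the section above describes for data sets larger than available memory.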
Eviction Policy
With a limited cache, objects are evicted from the cache when the number of entries exceeds the limit. The default eviction policy is a hybrid policy:
The hybrid eviction policy chooses which entries to evict based on a weighted score combining how often and how recently each entry was accessed. Entries that are accessed least frequently and have gone unaccessed the longest are evicted first.
The evicted objects are transparently loaded from the backing store when needed by agents.
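A weighted frequency/recency score of the kind described above can be sketched as follows. The specific weights and the linear scoring formula are assumptions for illustration; the product's actual scoring function is internal. An entry's score rises with its access count and falls the longer it sits idle, so the eviction victim is the entry with the lowest score: the one accessed least often and idle the longest.

```java
import java.util.Comparator;
import java.util.Map;

// Illustrative sketch (not the actual TIBCO policy): a hybrid eviction score
// weighting access frequency against idle time. The lowest-scoring entry --
// least frequently used and unaccessed the longest -- is evicted first.
public class HybridEviction {
    static class Entry {
        final long hits;              // how often the entry was accessed
        final long lastAccessMillis;  // when it was last accessed

        Entry(long hits, long lastAccessMillis) {
            this.hits = hits;
            this.lastAccessMillis = lastAccessMillis;
        }
    }

    // Lower score = better eviction candidate. The weights (1.0 per hit,
    // 0.1 per idle second) are arbitrary illustrative values.
    static double score(Entry e, long nowMillis) {
        double idleSeconds = (nowMillis - e.lastAccessMillis) / 1000.0;
        return 1.0 * e.hits - 0.1 * idleSeconds;
    }

    // Pick the entry to evict: the one with the minimum hybrid score.
    static String victim(Map<String, Entry> entries, long nowMillis) {
        return entries.entrySet().stream()
                .min(Comparator.comparingDouble(en -> score(en.getValue(), nowMillis)))
                .map(Map.Entry::getKey)
                .orElseThrow();
    }
}
```

A pure LRU policy would evict only on idle time and a pure LFU policy only on hit count; the weighted combination avoids the pathologies of each, which is why a hybrid is a common default.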
For backing store configuration, see Chapter 15, JDBC Backing Store Configuration in TIBCO BusinessEvents Administration.