Evolution of Caching Technologies


In its earliest form, caching simply meant keeping data in memory within the same process node as the application (local caching, e.g. inside an application server). It then evolved by externalizing the cache into a separate process node, and was eventually re-engineered around a distributed computing architecture. In the current landscape, most caching systems are built on in-memory data grid technology, which is part of in-memory computing and uses grid computing as its underlying architecture.

The complexity of managing a distributed caching system gave rise to many focused caching solutions, from both open-source projects and other vendors. It also led to caching patterns that evolved across industries to solve complex business use cases demanding high performance and high throughput.

Caching is not an afterthought in modern architecture: it not only takes performance to new levels but also improves system resiliency, scalability, and availability, and it elevates the user experience, which drives business value.

Caching is based on the Write-Once, Read-Many in-memory design paradigm, which helps in designing highly scalable, high-performance distributed applications.

Caching Use-cases

Figure 1 – Caching Use-Cases

How it Works

  • Caching starts with a simple PUT operation (a key/value pair); the complexity of managing a distributed caching system is hidden from the cache client
  • Most highly available caching systems operate in distributed mode, using a multi-master setup and replicating cache data (in shards) across cluster nodes
  • The caching deployment topology determines the required caching infrastructure, and most caching solution providers offer a cloud deployment model, including a managed service
  • Cache maintenance (e.g. eviction approach, time-to-live) also plays an important role in overall cache capacity
Figure 2 – Multi-node Cluster (Multi-Master Setup)
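
The PUT operation and the maintenance settings listed above (eviction and time-to-live) can be sketched with a single-process cache. This is a minimal illustration, not how any particular product implements it: the class name `LocalCache`, its parameters, and the choice of least-recently-used eviction are all assumptions made for the example.

```python
import time
from collections import OrderedDict


class LocalCache:
    """Hypothetical in-process cache with an LRU capacity bound and per-entry TTL."""

    def __init__(self, max_entries=2, clock=time.monotonic):
        self._max_entries = max_entries
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = OrderedDict()  # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds=60.0):
        self._entries[key] = (value, self._clock() + ttl_seconds)
        self._entries.move_to_end(key)  # mark as most recently used
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None  # miss
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries count as a miss
            return None
        self._entries.move_to_end(key)  # a read also refreshes recency
        return value
```

A shorter time-to-live or a smaller `max_entries` lowers the memory the cache needs, at the cost of more misses. That trade-off is why maintenance settings feed directly into capacity planning.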

Key Highlights in Multi-Master Setup
– Cluster of nodes holding primary data
– Back-up of primary data is distributed across all other nodes
– Logical view of all data from any node
– All nodes verify the health of each other
– In the event a node is unhealthy, other nodes diagnose state
– Unhealthy node isolated from the cluster
– Remaining nodes redistribute primary and back-up responsibilities to healthy nodes
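
The redistribution step above can be illustrated with a toy model. It is only a sketch under simplifying assumptions: one backup per partition, round-robin placement, and a failure that has already been detected. The names `Cluster`, `isolate`, and `partition_for` are hypothetical and do not correspond to any vendor's API.

```python
import zlib


class Cluster:
    """Toy model: each partition has one primary node and one backup node."""

    def __init__(self, nodes, partitions=8):
        self.nodes = list(nodes)  # assumes at least two nodes
        self.partitions = partitions
        self.owners = {}  # partition -> (primary, backup)
        n = len(self.nodes)
        for p in range(partitions):
            # The backup is the next node in the ring, so it never equals the primary.
            self.owners[p] = (self.nodes[p % n], self.nodes[(p + 1) % n])

    def partition_for(self, key):
        # crc32 is stable across processes, so any node resolves a key the same way.
        return zlib.crc32(key.encode("utf-8")) % self.partitions

    def _next_node(self, node):
        i = self.nodes.index(node)
        return self.nodes[(i + 1) % len(self.nodes)]

    def isolate(self, failed):
        """Remove an unhealthy node and reassign its responsibilities."""
        self.nodes.remove(failed)
        for p, (primary, backup) in self.owners.items():
            if primary == failed:
                # Promote the surviving copy, then choose a fresh backup.
                primary = backup
                backup = self._next_node(primary)
            elif backup == failed:
                backup = self._next_node(primary)
            self.owners[p] = (primary, backup)
```

In this model, after `Cluster(["n1", "n2", "n3"]).isolate("n2")` every partition is again owned by a healthy primary and a distinct healthy backup. Real products also have to copy the data itself to the new backup, which the sketch leaves out.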

Another key aspect is the difference between in-memory distributed caching and an in-memory distributed data grid.

In-memory distributed caching stores frequently accessed data and distributes it across caching nodes for high availability, whereas an in-memory data grid adds compute capability so that computation on large data sets can happen close to the data. This data processing capability is what differentiates the two paradigms.
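
A toy comparison may make the distinction concrete. The sketch below assumes each node holds one shard as a plain dictionary; `Node`, `get_all`, and `execute` are illustrative names, not an actual grid API. The second value each function returns counts how many items travelled back to the caller.

```python
class Node:
    """Hypothetical cluster node holding one shard of numeric values."""

    def __init__(self, data):
        self.data = dict(data)

    def get_all(self):
        # Cache-style access: every value travels to the caller.
        return list(self.data.values())

    def execute(self, fn):
        # Grid-style access: the function runs next to the data and
        # only its result travels back.
        return fn(self.data.values())


def total_via_cache(nodes):
    total, transferred = 0, 0
    for node in nodes:
        values = node.get_all()
        transferred += len(values)
        total += sum(values)
    return total, transferred


def total_via_grid(nodes):
    partials = [node.execute(sum) for node in nodes]
    return sum(partials), len(partials)
```

Both functions return the same total, but the cache-style version moves every entry while the grid-style version moves one partial result per node. The gap grows with the size of the data set.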

Caching Patterns

While there are many design patterns in practice, these are the two most commonly used patterns applied for caching at the architecture level.

Pattern A – Using Caching as an enterprise in-memory data grid

Approach: Use a clustered, multi-node, highly available data grid layer between the application and the storage systems. Choose the right caching deployment approach (e.g. customer data cache can be replicated, whereas transaction data can utilize optimistic cache). Choose the caching strategy for retrieval or storage (e.g. a read-through pattern for frequently accessed data).
Applicability: Application caching that needs sub-millisecond response times and high read performance, and that offloads resource contention from the storage layer. The pattern applies enterprise-wide when a single in-memory grid serves multiple applications.

Figure 3 – Pattern A – Caching with In-memory data grid
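
The read-through strategy mentioned in the approach can be sketched as follows. This is a single-process illustration only: `ReadThroughCache` and its `loader` callback are hypothetical names, and a real data grid would add distribution, expiry, and concurrency control on top.

```python
class ReadThroughCache:
    """On a miss, the cache itself loads the value from the backing store."""

    def __init__(self, loader):
        self._loader = loader  # callable that reads from the storage layer
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._loader(key)  # the application never calls storage directly
        self._store[key] = value
        return value
```

The application only ever calls `get`; repeated reads of the same key reach the storage layer once, which is how the pattern offloads contention from storage.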

Pattern B – Using Cache as a side-car (micro-cache)

Approach: Use a minimal micro-cache, deployed alongside the microservice application as a sidecar, without any distributed caching (each service has its own copy of the cache data).
Applicability: Application caching (mostly L1/L2-level caching) where caching requirements are limited and scalability is constrained.

Figure 4 – Pattern B – Using Caching as a side-car
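
The "own copy per service" property can be illustrated with Python's standard `functools.lru_cache`. Treat it as an in-process stand-in for a sidecar rather than an actual sidecar deployment: `make_service_instance` and `fetch` are hypothetical names introduced for the example.

```python
from functools import lru_cache


def make_service_instance(fetch, max_entries=128):
    """Return a lookup function backed by its own private, bounded cache."""

    @lru_cache(maxsize=max_entries)
    def lookup(key):
        return fetch(key)

    return lookup
```

Two instances created this way do not share entries, so each one fetches a given key once for itself. The absence of coordination keeps the setup minimal, but it also limits how far the pattern scales, because copies can drift apart when the underlying data changes.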

Key Vendors

To implement caching patterns with a focus on time-to-market, choosing a suitable caching product is key to the successful rollout of a caching strategy. The open-source community has played a vital role in building enterprise caching products that support both in-memory caching and in-memory compute based on grid computing. Cloud service providers (AWS, Azure, and GCP) have also partnered with most of these vendors to offer public, private, and hybrid deployment options.

Figure 5 – Key Caching Vendors

To conclude, caching is no longer an afterthought; it needs to be thought through during the architecture and design phase to ensure a high-performance, highly available, and highly scalable application. Focusing on what to cache, where to cache, how to cache, how long to cache, and, most importantly, why to cache helps in applying suitable practices before implementing caching.
Please share your view, experience, and learning as comments. Keep experimenting!
