Demystifying System Design: Designing a Scalable Distributed Cache

Understanding Distributed Caching

In modern web applications, databases are frequently the bottleneck. A distributed cache sits between your application servers and database instances to accelerate reads and reduce database load. But how do you build one that scales to millions of requests per second?

1. Consistent Hashing

Traditional caching structures use a simple modulo function (e.g., hash(key) % N) to map keys to servers. However, when a cache server is added or removed, this formula shifts almost every key, causing a catastrophic cache miss storm. Consistent Hashing solves this by mapping both keys and servers to a circular ring (often called a hash ring).

2. Eviction Policies

When memory is full, the cache must evict entries. Common strategies include:

LRU (Least Recently Used): Evicts the item that has not been accessed for the longest duration.
LFU (Least Frequently Used): Tracks access counters and evicts the least requested item.
FIFO (First In First Out): Evicts the oldest item first.

Demystifying System Design: Designing a Scalable Distributed Cache

Understanding Distributed Caching

1. Consistent Hashing

2. Eviction Policies

ABOUT THE WRITER

Kalash Jindal

Discussion & Comments