System Design Fundamentals

Cache Eviction Policies

When Your Cache Runs Out of Room

Cache memory is finite. You’ve designed a brilliant caching layer that speeds up your application dramatically, but every byte stored is a byte not available for the next hot data. When your cache fills up—and it will—you face a critical decision: which data should you remove to make room for new entries?

This seemingly simple question has profound implications for system performance. Choose poorly, and you’ll evict the data your users need most, causing a cascade of cache misses that defeat the entire purpose of having a cache. Choose well, and you maximize cache hit rates while keeping your system responsive. The policies we’ll explore in this chapter represent different philosophies about predicting which data will be needed next.

We’ve already discussed how to populate caches and how to keep them fresh. Now we need to master the art of deciding what to forget.

Understanding Eviction Policies

When a cache is full and a new item arrives, an eviction policy determines which resident item to remove. Think of this as triage in a hospital with limited beds—when a critical new patient arrives, who gets discharged?

Least Recently Used (LRU) is the most popular policy in practice. It assumes that if data hasn’t been accessed recently, it probably won’t be needed soon. Every time you access an item, you update its position to mark it as “recently used.” When the cache is full, you evict the item with the oldest access time. This works remarkably well because of temporal locality—a principle from computer science stating that data accessed recently tends to be accessed again soon.

Least Frequently Used (LFU) tracks how many times each item has been accessed and removes the least-accessed items first. The intuition here is different: if something rarely gets used, it’s probably not important to keep around. LFU works well for workloads with distinct popular and unpopular items, but it’s more complex to implement and can struggle with workload changes.

First In, First Out (FIFO) simply removes the item that has been in the cache the longest, regardless of how recently or how often it has been accessed. While this sounds naive, FIFO can actually perform well in certain scenarios and requires minimal metadata tracking. It’s simple, fast, and predictable.
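
As a quick illustration (a minimal sketch, not drawn from any particular library), a FIFO cache needs nothing beyond insertion order; in JavaScript, a Map already iterates keys in the order they were inserted:

class FIFOCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // Map preserves insertion order
  }

  get(key) {
    // Reads do not change eviction order under FIFO
    return this.cache.has(key) ? this.cache.get(key) : null;
  }

  put(key, value) {
    if (!this.cache.has(key) && this.cache.size >= this.capacity) {
      // Evict the oldest insertion: the first key in iteration order
      const oldestKey = this.cache.keys().next().value;
      this.cache.delete(oldestKey);
    }
    this.cache.set(key, value); // updating an existing key keeps its position
  }
}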

Time to Live (TTL) takes a different approach entirely. Instead of evicting based on access patterns, you assign each cache entry an expiration time. When that time arrives, the entry automatically becomes invalid, even if it’s never been accessed. TTL is often used alongside other policies—for example, LRU with TTL to handle both temporal patterns and absolute freshness requirements.

Random eviction does exactly what it sounds like: picks a random item to remove. This is surprisingly effective for certain distributed systems and requires almost no overhead, though it lacks the sophistication of smarter policies.
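A random policy can be equally small. This hypothetical helper assumes the cache is a plain Map; real systems such as Redis typically sample a handful of keys rather than materializing all of them:

// Evict one randomly chosen key from a Map-backed cache (illustrative only)
function evictRandom(cache) {
  const keys = [...cache.keys()];
  if (keys.length === 0) return;
  const victim = keys[Math.floor(Math.random() * keys.length)];
  cache.delete(victim);
}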

Real-World Analogy: Your Photo Library

Imagine your smartphone’s storage is nearly full, and you need space for a new video. How do you decide which photos to delete?

If you use a FIFO approach, you delete your oldest photos first—the ones from your vacation last year, your friend’s birthday party, the concert you attended. But some of those might be cherished memories you look at occasionally.

With an LRU approach, you’d delete photos you haven’t viewed in months, reasoning that if you haven’t looked at them recently, you probably don’t need them. This makes intuitive sense for digital hoarding—out of sight, out of mind.

The LFU approach is more nuanced. You’d notice that some photos have never been opened even once (someone’s group photo where you appear in the background), while others you’ve scrolled through dozens of times (your kid’s first steps). You’d delete the former to preserve the latter.

Finally, with TTL, you might set photos to auto-delete after two years, with important exceptions marked to persist indefinitely. This is like having your phone automatically clean up old temporary photos while protecting your intentional archive.

How Eviction Policies Work: Under the Hood

LRU (Least Recently Used) Implementation

LRU is typically implemented using a doubly-linked list combined with a hash map (dictionary). Here’s why this combination works so well:

  • The doubly-linked list maintains the order of access, with the most recently used item at the head and least recently used at the tail
  • The hash map allows constant-time lookups: given a key, we instantly find its node
  • When you access an item, you move its node to the head in O(1) time
  • When you need to evict, you remove the tail node in O(1) time

// Doubly-linked list node: the cache's hash map stores these
class Node {
  constructor(key, value) {
    this.key = key;
    this.value = value;
    this.prev = null;
    this.next = null;
  }
}

class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // key -> node
    this.head = new Node(0, 0); // dummy head
    this.tail = new Node(0, 0); // dummy tail
    this.head.next = this.tail;
    this.tail.prev = this.head;
  }

  get(key) {
    if (!this.cache.has(key)) return -1;
    const node = this.cache.get(key);
    this.moveToHead(node);
    return node.value;
  }

  put(key, value) {
    if (this.cache.has(key)) {
      const node = this.cache.get(key);
      node.value = value;
      this.moveToHead(node);
    } else {
      const node = new Node(key, value);
      this.cache.set(key, node);
      this.addToHead(node);
      if (this.cache.size > this.capacity) {
        this.removeFromTail();
      }
    }
  }

  moveToHead(node) {
    this.removeNode(node);
    this.addToHead(node);
  }

  addToHead(node) {
    node.next = this.head.next;
    node.prev = this.head;
    this.head.next.prev = node;
    this.head.next = node;
  }

  removeNode(node) {
    node.prev.next = node.next;
    node.next.prev = node.prev;
  }

  removeFromTail() {
    const node = this.tail.prev;
    this.removeNode(node);
    this.cache.delete(node.key);
  }
}
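
A short usage sketch (with made-up keys and values) shows the eviction order in action:

const lru = new LRUCache(2);
lru.put('a', 1);
lru.put('b', 2);
lru.get('a');    // returns 1 and marks 'a' as most recently used
lru.put('c', 3); // capacity exceeded: 'b' (least recently used) is evicted
lru.get('b');    // returns -1 (miss)
lru.get('c');    // returns 3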

LFU (Least Frequently Used) Implementation

LFU is more complex because you need to track both frequency and recency. You maintain:

  • A frequency map: key -> frequency count
  • A frequency bucket structure: frequency -> list of keys with that frequency
  • For each frequency, you track the least recently used key (for tie-breaking)

When an item is accessed, you increment its frequency and move it to the next frequency bucket. Eviction removes the least frequently used item (or the least recently used among items with the same minimum frequency).
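
The sketch below is one way to wire this up (illustrative only; the class name and structure are assumptions, not a reference implementation). It keeps one insertion-ordered Set per frequency, so the first key in the minimum-frequency bucket is also the least recently used:

class LFUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.values = new Map();   // key -> value
    this.freqOf = new Map();   // key -> frequency count
    this.buckets = new Map();  // frequency -> Set of keys (insertion order = recency)
    this.minFreq = 0;
  }

  // Move a key from its current frequency bucket to the next one
  touch(key) {
    const freq = this.freqOf.get(key);
    const bucket = this.buckets.get(freq);
    bucket.delete(key);
    if (bucket.size === 0) {
      this.buckets.delete(freq);
      if (this.minFreq === freq) this.minFreq = freq + 1;
    }
    this.freqOf.set(key, freq + 1);
    if (!this.buckets.has(freq + 1)) this.buckets.set(freq + 1, new Set());
    this.buckets.get(freq + 1).add(key);
  }

  get(key) {
    if (!this.values.has(key)) return -1;
    this.touch(key);
    return this.values.get(key);
  }

  put(key, value) {
    if (this.capacity <= 0) return;
    if (this.values.has(key)) {
      this.values.set(key, value);
      this.touch(key);
      return;
    }
    if (this.values.size >= this.capacity) {
      // Evict the least recently used key within the minimum-frequency bucket
      const bucket = this.buckets.get(this.minFreq);
      const evictKey = bucket.keys().next().value;
      bucket.delete(evictKey);
      if (bucket.size === 0) this.buckets.delete(this.minFreq);
      this.values.delete(evictKey);
      this.freqOf.delete(evictKey);
    }
    this.values.set(key, value);
    this.freqOf.set(key, 1);
    if (!this.buckets.has(1)) this.buckets.set(1, new Set());
    this.buckets.get(1).add(key);
    this.minFreq = 1; // a brand-new entry always has the lowest frequency
  }
}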

The challenge with LFU is the implementation overhead and the “cold start” problem: newly added items begin with a low count, so under memory pressure they can be evicted before they’ve had a chance to prove their value.

TTL (Time to Live) Mechanics

TTL is simpler conceptually but requires background cleanup:

class TTLCache {
  constructor() {
    this.cache = new Map();
  }

  set(key, value, ttlMs) {
    this.cache.set(key, {
      value,
      expiresAt: Date.now() + ttlMs
    });
  }

  get(key) {
    const entry = this.cache.get(key);
    if (!entry) return null;

    if (Date.now() > entry.expiresAt) {
      this.cache.delete(key);
      return null;
    }

    return entry.value;
  }

  // Lazy deletion: check on access (shown above)
  // Or active deletion: scan periodically
  cleanupExpired() {
    const now = Date.now();
    for (const [key, entry] of this.cache.entries()) {
      if (now > entry.expiresAt) {
        this.cache.delete(key);
      }
    }
  }
}

TTL can use either lazy deletion (check when accessed) or active deletion (scan periodically). Lazy deletion adds no background work, but expired entries linger in memory until they’re next touched. Active deletion reclaims memory promptly but costs periodic scans.
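
If you opt for active deletion with the TTLCache above, the periodic sweep can be as simple as a timer (the one-minute interval here is an arbitrary choice):

const sessionCache = new TTLCache();

// Sweep for expired entries once a minute
const cleanupTimer = setInterval(() => sessionCache.cleanupExpired(), 60 * 1000);

// Later, when the cache is no longer needed:
// clearInterval(cleanupTimer);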

Comparison of All Policies

Policy | Time Complexity | Space Overhead | Best For | Worst Case
------ | --------------- | -------------- | -------- | ----------
LRU | O(1) for get/put | Medium (two pointers per node) | General workloads, web caches | Sequential access patterns
LFU | O(log n) or O(1)* | High (frequency buckets) | Stable workloads, distinct hot/cold data | Frequency thrashing, cold start
FIFO | O(1) | Low (insertion order only) | Sequential data, streaming | Hot data evicted if accessed late
TTL | O(1) | Low (timestamp only) | Time-sensitive data, sessions | No adaptation to access patterns
Random | O(1) | None | Distributed systems, uniform data | Poor predictability

*With a frequency list implementation, LFU can achieve O(1) amortized time.

Redis Eviction Policies

Redis, a widely used in-memory cache, offers multiple eviction policies you can configure (note that its LRU and LFU implementations are approximations based on random sampling rather than exact bookkeeping):

  • noeviction: Don’t evict anything; return an error on writes once the memory limit is reached
  • allkeys-lru: Evict least recently used key from all keys
  • allkeys-lfu: Evict least frequently used key from all keys
  • allkeys-random: Evict random key from all keys
  • volatile-lru: Evict least recently used key with TTL set
  • volatile-lfu: Evict least frequently used key with TTL set
  • volatile-random: Evict random key with TTL set
  • volatile-ttl: Evict key with shortest TTL first

Notice the distinction between “allkeys” policies (any key may be evicted) and “volatile” policies (only keys with an expiration set are eligible). This lets you combine behaviors: under a volatile policy, the data you’ve explicitly marked as temporary is the only data Redis will evict, while everything else stays resident until you remove it yourself.

Practical Examples in Action

Configuring Redis Eviction

# In redis.conf or via CLI
maxmemory 2gb
maxmemory-policy allkeys-lru

# Or set dynamically
redis-cli CONFIG SET maxmemory 2gb
redis-cli CONFIG SET maxmemory-policy allkeys-lru

With this configuration, Redis uses LRU eviction for all keys once it reaches 2GB of memory.
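
You can confirm the currently active settings with:

# Check the active limit and policy
redis-cli CONFIG GET maxmemory
redis-cli CONFIG GET maxmemory-policy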

Social Media Feed Cache

Consider a social media platform where user feeds are cached. When a user requests their feed, you fetch posts from the database, apply machine learning models to rank them, and cache the result. The cache is capped in size to keep memory usage under control.

Using LRU with TTL makes sense here: you want to keep recently-viewed feeds in cache (LRU) but also let them expire after a few hours so users see fresh content (TTL). Code might look like:

async function getUserFeed(userId) {
  const cacheKey = `feed:${userId}`;
  let feed = cache.get(cacheKey);

  if (!feed) {
    // Cache miss - fetch and rank
    feed = await fetchAndRankPosts(userId);
    // Cache for 4 hours
    cache.set(cacheKey, feed, 4 * 60 * 60 * 1000);
  }

  return feed;
}

When the cache fills up, feeds that haven’t been viewed recently are evicted first (LRU), and any feed older than four hours is removed automatically (TTL). Active users keep getting fast, cached results.
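
One way to get both behaviors in a single structure is sketched below. It is illustrative only: the class name is made up, and instead of the linked-list LRU from earlier it leans on Map insertion order for recency while storing an expiry timestamp with each value. An instance of it could stand in for the cache object used in getUserFeed above, since the get/set signatures match:

class LRUCacheWithTTL {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // key -> { value, expiresAt }; Map iteration order tracks recency
  }

  get(key) {
    const entry = this.cache.get(key);
    if (!entry) return null;
    if (Date.now() > entry.expiresAt) {
      this.cache.delete(key); // lazy TTL expiry
      return null;
    }
    // Re-insert so this key becomes the most recently used
    // (expiresAt is kept, so reads do not extend the TTL)
    this.cache.delete(key);
    this.cache.set(key, entry);
    return entry.value;
  }

  set(key, value, ttlMs) {
    if (this.cache.has(key)) this.cache.delete(key);
    this.cache.set(key, { value, expiresAt: Date.now() + ttlMs });
    if (this.cache.size > this.capacity) {
      // First key in iteration order is the least recently used
      const lruKey = this.cache.keys().next().value;
      this.cache.delete(lruKey);
    }
  }
}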

Trade-offs in Practice

LRU vs. LFU: LRU is simpler and works well for most workloads due to temporal locality. LFU carries more overhead and can better capture long-term popularity, but it adapts slowly when access patterns shift. In practice, LRU dominates because its simplicity and predictable behavior outweigh LFU’s theoretical advantages for most systems.

Memory overhead: Every eviction policy except random requires per-entry metadata. LRU needs two pointers per node (for the doubly-linked list), LFU needs frequency counters, and TTL needs timestamps; random needs nothing. If your cached items are small or memory is extremely constrained, this overhead matters.

Scan resistance and cache pollution: Some workloads (like sequential scans over huge datasets) can “pollute” your cache by evicting useful data. A single pass over a billion records will flush everything useful out of an LRU cache. Some systems use variants such as ARC (Adaptive Replacement Cache) or LIRS to resist this, though they’re less common.

The LFU cold start problem: New data starts with frequency zero. If your cache is under memory pressure immediately, this new data might get evicted before it even proves useful. Some implementations artificially boost new items’ frequency to mitigate this.

Key Takeaways

  • Eviction policies determine which cache entry to remove when the cache is full, directly impacting cache hit rate and system performance
  • LRU (Least Recently Used) is the gold standard: simple, O(1) for both reads and writes, and effective for most workloads thanks to temporal locality
  • LFU (Least Frequently Used) adapts to what’s actually popular but requires more complex implementation and metadata tracking
  • FIFO and random eviction are simple and low-overhead but lack adaptability
  • TTL (Time to Live) ensures data freshness but doesn’t optimize for access patterns
  • Real systems often combine policies: LRU with TTL, volatile policies in Redis, or adaptive algorithms for extreme workloads
  • The right policy depends on your workload characteristics, available memory, and whether data freshness or hit rate matters more

Practice Scenarios

Scenario 1: A real-time analytics dashboard caches query results that change every 5 minutes. Users often re-run the same query multiple times within a session. What eviction policy would you choose, and why? How would you implement TTL for this?

Scenario 2: A content delivery network (CDN) caches popular movies and documents. Some content is accessed thousands of times per day, while other content might be accessed once a month yet still be worth keeping. Under a plain LRU policy, a burst of one-off requests can push out content that will become hot tomorrow. How could you detect and prevent this cache pollution?

Scenario 3: Design an LRU cache that also supports time-to-live for entries. Entries should expire after a set duration even if frequently accessed. What data structures would you use? How would you handle cleanup efficiently?

Next Steps: Consistency and Invalidation

We’ve now mastered how to populate caches, keep them fresh through TTL, and decide what to evict. But we haven’t fully addressed a critical challenge: what happens when the underlying data changes? If someone updates a user’s profile, should all caches holding that profile be invalidated? Should they update automatically? In the next section, we’ll explore cache consistency and invalidation strategies that ensure your cached data stays accurate across distributed systems.

The eviction policies you choose here form the foundation for those discussions. A well-chosen policy keeps your most important data cached longest, which is exactly what you need for consistency strategies to work effectively.