System Design Fundamentals

News Feed System

The Problem: Personalized Feed for Billions

You’ve seen it a thousand times: open Facebook, Twitter, or LinkedIn, and within 200 milliseconds, your personalized feed appears. You see posts from people you follow, sorted by relevance and recency. This is one of the most challenging problems in system design.

Here’s the scale: 500M users, 100M daily active users. Each user follows an average of 200 people. When you open your feed, we need to fetch the most relevant posts from those 200 people, rank them, and serve them in under 200ms. Simultaneously, we’re ingesting 100M new posts per day.

The core tension: do we pre-compute feeds (write-heavy, read-light) or compute them on demand (write-light, read-heavy)? The answer, as usual, is “it depends.”

Functional and Non-Functional Requirements

Functional requirements:

  • Create, edit, delete posts (text, images, video)
  • Follow and unfollow users
  • View personalized feed sorted by time and relevance
  • Like and comment on posts
  • Search posts by hashtag or user

Non-functional requirements:

  • Latency: Feed loads under 200ms (p99)
  • Throughput: 100M DAU making 10 feed requests/day = 1B requests/day = 11,500 requests/second
  • Consistency: Eventual consistency acceptable for feeds; strong consistency for post creation
  • Availability: 99.9% uptime for feed reads
  • Scale: 500M users, each following 200 people on average

Scale Estimation

Let’s break it down:

Feed requests:

  • 100M DAU × 10 requests/day = 1B requests/day
  • 1B / 86,400 seconds = ~11,500 requests/second
  • Peak: ~50,000 requests/second

Post creation:

  • Average user posts once every few days → ~100M posts/day
  • 100M / 86,400 = ~1,160 posts/second
  • Peak: ~5,000 posts/second

Data storage:

  • 500M total users, each following 200 people on average
  • Social graph: 500M × 200 = 100B edges; at ~1 KB per edge (IDs, timestamps, indexes) that is ~100 TB
  • Posts: ~100M posts/day (from the estimate above)
  • Each post: 2 KB metadata (id, user_id, timestamp, text, likes, comments) = 200 GB/day
  • Keep 1 year of posts = ~73 TB

This is massive. We cannot serve all of this from a single database.

Two Fundamental Architectures

Approach 1: Fan-Out on Write (Push Model)

When a user posts, immediately write to the feed of all their followers.

Example:

  • User Alice posts a photo
  • System identifies Alice’s 50K followers
  • System writes Alice’s post_id to the feed cache (Redis) of each follower

Pros:

  • Feed reads are extremely fast (just fetch from cache)
  • Ideal for high-read, low-write systems

Cons:

  • Post creation is slow (write to 50K feeds can take 10+ seconds)
  • If Alice has 10M followers (celebrity), the write latency explodes
  • Cache storage is huge (every user’s feed is pre-computed)

When to use: Regular users (followers in the thousands, not millions)

Approach 2: Fan-Out on Read (Pull Model)

When a user requests their feed, fetch the latest posts from all followed users in real-time.

Example:

  • User Bob requests their feed
  • System fetches the timeline of each of Bob’s 200 followed users
  • System merges, ranks, and returns the top 20 posts

Pros:

  • Post creation is fast (just write to user’s own timeline)
  • Works well for celebrities (no need to fan-out to millions)
  • Less cache storage needed

Cons:

  • Feed read is slow (merge 200 timelines every request)
  • High database/cache load on popular users’ timelines
  • Difficult to rank across timelines in real-time

When to use: For high-fan-out scenarios (celebrities with millions of followers)

Approach 3: Hybrid (The Real-World Solution)

Most platforms use a hybrid:

  • Regular users (followers under 10K): Fan-out on write
  • Celebrity users (followers over 10K): Fan-out on read (or hybrid with selective fan-out)
  • Hot posts (viral posts): pulled at read time, with the merged result cached

The routing at post-creation time:

graph TD
    User["User Creates Post"]
    Celebrity{Is User<br/>a Celebrity?}

    Regular["Regular User<br/>followers under 10K"]
    FanOutWrite["Fan-Out on Write<br/>Write to all follower feeds"]

    Celeb["Celebrity User<br/>followers over 10K"]
    FanOutRead["Fan-Out on Read<br/>Fetch on demand"]

    FeedCache["Feed Cache<br/>Redis Sorted Sets"]

    User --> Celebrity
    Celebrity -->|No| Regular
    Celebrity -->|Yes| Celeb
    Regular --> FanOutWrite
    Celeb --> FanOutRead
    FanOutWrite --> FeedCache
    FanOutRead --> FeedCache
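
In code, the routing decision is a single branch on follower count at post time. A minimal sketch, where the 10K threshold comes from the rules above and the helpers (follower_count, enqueue_fan_out) are hypothetical stand-ins for the services in the diagram:

FANOUT_THRESHOLD = 10_000

def follower_count(author_id: int) -> int:
    # Stand-in for a Social Graph Service lookup
    return 42

def enqueue_fan_out(author_id: int, post_id: str, ts: float) -> None:
    # Stand-in for submitting an async Celery/RQ fan-out job
    pass

def on_post_created(author_id: int, post_id: str, ts: float) -> None:
    # The post is always persisted to PostDB and appended to the author's
    # own timeline first (omitted here); only the fan-out path branches.
    if follower_count(author_id) < FANOUT_THRESHOLD:
        enqueue_fan_out(author_id, post_id, ts)  # push model: write to follower feeds
    # Celebrities take no action here; their posts are pulled and merged
    # into follower feeds at read time.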

High-Level Architecture

graph LR
    API["API Gateway"]
    PostService["Post Service<br/>(CRUD posts)"]
    FanOutService["Fan-Out Service<br/>(Async, Celery)"]
    FeedService["Feed Service<br/>(Read feed, rank)"]
    SocialGraphService["Social Graph Service<br/>(Follow relationships)"]

    FeedCache["Feed Cache<br/>Redis Sorted Sets<br/>per user_id"]
    PostDB["Post DB<br/>MySQL/PostgreSQL"]
    SocialGraphDB["Social Graph DB<br/>MySQL/Neo4j"]
    MediaService["Media Service<br/>(Images/videos)"]

    API -->|POST /posts| PostService
    API -->|GET /feed| FeedService
    API -->|POST /follow| SocialGraphService

    PostService --> PostDB
    PostService --> FanOutService
    FanOutService --> FeedCache

    FeedService --> FeedCache
    FeedService --> SocialGraphService
    FeedService --> PostDB

    SocialGraphService --> SocialGraphDB
    MediaService -.->|CDN| API

Deep Dive: Feed Storage and Computation

Redis Sorted Sets for Feed Cache

We store each user’s feed as a Redis sorted set:

Key: "user_feed:{user_id}"
Score: timestamp of post (for sorting by recency)
Member: post_id

Example:
user_feed:12345
{
    member: post_789, score: 1707043200  (most recent)
    member: post_788, score: 1707042500
    member: post_787, score: 1707041000
    ...
    member: post_001, score: 1706000000  (oldest)
}

Benefits:

  • Members stay sorted by score (timestamp) automatically
  • Range queries: ZREVRANGE user_feed:12345 0 19 returns the 20 most recent posts
  • O(log n) inserts; range reads are O(log n + m) for m returned posts
  • Pagination: ZREVRANGE user_feed:12345 {offset} {offset + 19}, or better, a timestamp cursor via ZREVRANGEBYSCORE (see the Pagination section below); a sketch of these operations follows this list
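
A minimal sketch of these operations with redis-py (assuming a local Redis instance; the 500-post trim matches the fan-out section below):

import time

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def push_to_feed(user_id: int, post_id: str, ts: float) -> None:
    # ZADD with score = timestamp; trim so each feed keeps only the newest 500
    key = f"user_feed:{user_id}"
    r.zadd(key, {post_id: ts})
    r.zremrangebyrank(key, 0, -501)

def read_feed(user_id: int, limit: int = 20) -> list:
    # ZREVRANGE returns members by descending score, i.e. newest first
    return r.zrevrange(f"user_feed:{user_id}", 0, limit - 1)

push_to_feed(12345, "post_789", time.time())
print(read_feed(12345))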

Fan-Out on Write Implementation

When Alice (user_id=100) posts:
1. Write post to PostDB: insert into posts (user_id, content, timestamp)
2. Get Alice's followers: SELECT follower_id FROM follows WHERE following_id = 100
3. For each follower, push to their feed cache:
   ZADD user_feed:{follower_id} {timestamp} {post_id}
   ZREMRANGEBYRANK user_feed:{follower_id} 0 -501  (keep top 500)

This runs asynchronously via a job queue (Celery/RQ) so it doesn’t block the post creation API.
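
A sketch of that worker, assuming Celery with a Redis broker; get_follower_ids is a hypothetical helper wrapping the follows table:

import redis
from celery import Celery

app = Celery("fanout", broker="redis://localhost:6379/0")
r = redis.Redis(decode_responses=True)
FEED_LIMIT = 500

def get_follower_ids(author_id: int) -> list:
    # Stand-in for: SELECT follower_id FROM follows WHERE following_id = %s
    return []

@app.task
def fan_out_post(author_id: int, post_id: str, ts: float) -> None:
    # Batch all Redis writes in one pipeline to cut network round-trips
    pipe = r.pipeline()
    for follower_id in get_follower_ids(author_id):
        key = f"user_feed:{follower_id}"
        pipe.zadd(key, {post_id: ts})
        pipe.zremrangebyrank(key, 0, -(FEED_LIMIT + 1))
    pipe.execute()

Real deployments also chunk large follower lists across many tasks so one slow 50K-follower job cannot stall the queue.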

Fan-Out on Read Implementation

When Bob requests /feed:
1. Get Bob's following list: SELECT following_id FROM follows WHERE follower_id = {bob_id}
2. For each followed user, fetch their timeline:
   ZREVRANGE user_timeline:{followed_id} 0 99  (the 100 most recent posts)
3. Merge all timelines into one sorted list (merge sort)
4. Rank posts:
   - Apply ranking formula: score = time_decay(timestamp) + engagement_score(likes, comments)
5. Return top 20
6. Cache result in user_feed:{bob_id} for 5 minutes (avoid repeated computation)

With a heap-based k-way merge, this is O(k·m·log k) in the worst case, where k is the number of followed users and m is posts fetched per user. For k=200 and m=100 that is at most 20,000 entries, and stopping after the top 20 touches far fewer; see the sketch below.
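
Step 3 translates directly to a heap merge. A sketch, assuming each timeline has already been fetched as (timestamp, post_id) pairs sorted newest-first:

import heapq
from itertools import islice

def merge_timelines(timelines: list, limit: int = 20) -> list:
    # k-way merge of timelines already sorted newest-first; islice stops
    # after `limit` items, so most of the k*m entries are never touched
    merged = heapq.merge(*timelines, key=lambda entry: entry[0], reverse=True)
    return list(islice(merged, limit))

alice = [(1707043200, "post_789"), (1707041000, "post_787")]
carol = [(1707042500, "post_788")]
print(merge_timelines([alice, carol]))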

Ranking and Relevance

We don’t just show posts in reverse chronological order. We rank by relevance:

ranking_score =
    0.5 * recency_score(post_age) +
    0.3 * engagement_score(likes + comments) +
    0.15 * affinity_score(with_followed_user) +
    0.05 * diversity_score(content_type)

Recency score: Posts decay in score over time (exponential decay with 24-hour half-life).

Engagement score: Normalized by post age. Raw like/comment counts favor older posts, which have had more time to accumulate engagement, so we divide by age to remove that bias.

Affinity score: How much does Bob interact with posts from this followed user? (tracked via interactions table).

Diversity score: Slight boost for different content types (text vs image vs video) to avoid monotony.
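
The formula translates into a small scoring function. A sketch; the engagement normalization and squashing below are illustrative assumptions, and affinity/diversity are assumed to arrive pre-scaled to [0, 1]:

def recency_score(age_hours: float, half_life_hours: float = 24.0) -> float:
    # Exponential decay: the score halves every 24 hours
    return 0.5 ** (age_hours / half_life_hours)

def ranking_score(age_hours: float, likes: int, comments: int,
                  affinity: float, diversity: float) -> float:
    # Engagement normalized by post age, then squashed into [0, 1)
    rate = (likes + comments) / (age_hours + 1.0)
    engagement = rate / (rate + 1.0)
    return (0.5 * recency_score(age_hours)
            + 0.3 * engagement
            + 0.15 * affinity
            + 0.05 * diversity)

print(ranking_score(age_hours=2.0, likes=120, comments=15,
                    affinity=0.6, diversity=1.0))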

Pagination and Cursor-Based Pagination

We use cursor-based pagination (not offset):

Request: GET /feed?limit=20&cursor=abc123def

Response:
{
  "posts": [...],
  "next_cursor": "xyz789uvw"
}

Cursor is base64-encoded: {post_id}:{timestamp}
Enables efficient pagination: decode the timestamp from the cursor, then fetch the next page with
ZREVRANGEBYSCORE user_feed:12345 ({timestamp} -inf LIMIT 0 20
(the "(" makes the bound exclusive, so the cursor's own post is not repeated)

This avoids the “offset problem” (offset + limit) where deep pagination scans are expensive.
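
Putting the pieces together, a sketch assuming redis-py (feed_page is a hypothetical handler name):

import base64

import redis

r = redis.Redis(decode_responses=True)

def encode_cursor(post_id: str, ts: float) -> str:
    return base64.urlsafe_b64encode(f"{post_id}:{ts}".encode()).decode()

def decode_cursor(cursor: str) -> tuple:
    post_id, ts = base64.urlsafe_b64decode(cursor).decode().rsplit(":", 1)
    return post_id, float(ts)

def feed_page(user_id: int, cursor: str = None, limit: int = 20):
    key = f"user_feed:{user_id}"
    if cursor is None:
        items = r.zrevrange(key, 0, limit - 1, withscores=True)
    else:
        _, max_ts = decode_cursor(cursor)
        # "(" makes the upper bound exclusive: resume strictly after the cursor
        items = r.zrevrangebyscore(key, f"({max_ts}", "-inf",
                                   start=0, num=limit, withscores=True)
    next_cursor = encode_cursor(items[-1][0], items[-1][1]) if items else None
    return [post_id for post_id, _ in items], next_cursor

One caveat: with identical timestamps, a score-only cursor can skip or repeat posts. Production systems break ties by folding the post_id into the score or filtering client-side.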

Scaling Considerations

Cache Partitioning by User ID

Feed cache is partitioned across Redis clusters by user_id. Illustratively:

User 1-100M: Redis cluster 1
User 100M-200M: Redis cluster 2
User 200M-300M: Redis cluster 3
...

In practice, use consistent hashing on user_id rather than fixed ranges, so adding or removing a cluster only remaps a small fraction of users.
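
A toy consistent-hash ring makes the routing concrete (a sketch; production deployments add virtual nodes per cluster or use Redis Cluster's built-in slot hashing, and the same scheme routes the social-graph shards below):

import bisect
import hashlib

class HashRing:
    # Minimal ring without virtual nodes, for illustration only
    def __init__(self, clusters: list):
        self.ring = sorted((self._hash(c), c) for c in clusters)
        self.points = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value) -> int:
        return int(hashlib.md5(str(value).encode()).hexdigest(), 16)

    def cluster_for(self, user_id: int) -> str:
        # First ring point clockwise from the user's hash
        idx = bisect.bisect(self.points, self._hash(user_id)) % len(self.points)
        return self.ring[idx][1]

ring = HashRing(["redis-1", "redis-2", "redis-3"])
print(ring.cluster_for(12345))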

Social Graph Sharding

The social graph (follow relationships) is sharded by user_id:

Users 1-100M: DB shard 1
Users 100M-200M: DB shard 2

This spreads follow-list reads and writes across machines, so no single shard has to hold or serve the entire graph.

Hot User Optimization

Extremely popular users (celebrities whose timelines are read millions of times per day) are cached separately:

When fetching a celebrity's timeline:
1. Check celebrity_cache:{celebrity_id}
2. If miss, fetch from PostDB and cache for 60 seconds
3. Feeds that include this celebrity's posts use the cached timeline
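
A cache-aside sketch of that read path (fetch_from_db is a stand-in for the PostDB query):

import json

import redis

r = redis.Redis(decode_responses=True)

def celebrity_timeline(celebrity_id: int, fetch_from_db) -> list:
    # Cache-aside: serve celebrity_cache:{id} on a hit, otherwise load
    # from the database and cache with a 60-second TTL
    key = f"celebrity_cache:{celebrity_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    timeline = fetch_from_db(celebrity_id)
    r.setex(key, 60, json.dumps(timeline))  # expire after 60 seconds
    return timeline

A 60-second TTL bounds staleness while reducing database reads on the hottest timelines to roughly one per minute per celebrity.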

Media and CDN

Posts include images and videos. We use:

  1. Media Service: Stores and encodes media (different resolutions)
  2. CDN: Distributes media globally (CloudFront, Akamai)
  3. Client: Loads media from CDN, not from primary servers

Feed API responses are ~2-5 KB per post, but media is served from CDN.

Trade-offs and Design Decisions

Dimension       Fan-Out on Write              Fan-Out on Read               Hybrid
Write latency   Slow for celebrities          Fast                          Fast
Read latency    Fast                          Slow for high-follow users    Fast
Storage         High (pre-computed feeds)     Low                           Medium
Consistency     Eventual (cache lag)          Strong (real-time)            Mixed
Scalability     Limited by celebrity writes   Limited by celebrity reads    Best

Pro tip: Start with fan-out on write (it's simpler) and move to the hybrid as you scale. Pure fan-out on write can carry most platforms to millions of users before fan-out on read becomes necessary.

Key Takeaways

  1. Choose your model based on follow distribution: If most users have a modest follower count, fan-out on write works. If you have many celebrities, use the hybrid.
  2. Cache is everything: Feed reads are under 200ms only because of Redis caching. Your database cannot handle real-time merges at scale.
  3. Ranking > Chronological: Users prefer relevant posts even if slightly older. Apply ranking formulas.
  4. Eventual consistency is acceptable: Users don’t need to see new posts instantly. A 1-2 second propagation delay is fine.
  5. Separate reads and writes: Don’t let the heavy fan-out writes block read latency. Use async workers.

Practice Exercise

Extend this design to support:

  • Stories: Temporary posts that disappear after 24 hours. How does TTL work in Redis?
  • Reels/short videos: Algorithmic feed (not just social) with infinite scroll.
  • Notifications: Alert users when followed people post. Use pub/sub (Redis Streams or Kafka)?

Next up: We’ve designed how users discover content from people they follow. What about discovering new content by keyword? How do we build a search autocomplete system that handles 10B queries per day?