News Feed System
The Problem: Personalized Feed for Billions
You’ve seen it a thousand times: open Facebook, Twitter, or LinkedIn, and within 200 milliseconds, your personalized feed appears. You see posts from people you follow, sorted by relevance and recency. This is one of the most challenging problems in system design.
Here’s the scale: 500M users, 100M daily active users. Each user follows an average of 200 people. When you open your feed, we need to fetch the most relevant posts from those 200 people, rank them, and serve them in under 200ms. Simultaneously, we’re ingesting 100M new posts per day.
The core tension: do we pre-compute feeds (write-heavy, read-light) or compute them on demand (write-light, read-heavy)? The answer, as usual, is “it depends.”
Functional and Non-Functional Requirements
Functional requirements:
- Create, edit, delete posts (text, images, video)
- Follow and unfollow users
- View personalized feed sorted by time and relevance
- Like and comment on posts
- Search posts by hashtag or user
Non-functional requirements:
- Latency: Feed loads under 200ms (p99)
- Throughput: 100M DAU making 10 feed requests/day = 1B requests/day = 11,500 requests/second
- Consistency: Eventual consistency acceptable for feeds; strong consistency for post creation
- Availability: 99.9% uptime for feed reads
- Scale: 500M users, each following 200 people on average
Scale Estimation
Let’s break it down:
Feed requests:
- 100M DAU × 10 requests/day = 1B requests/day
- 1B / 86,400 seconds = ~11,500 requests/second
- Peak: ~50,000 requests/second
Post creation:
- 500M users, each posting roughly once every five days → ~100M posts/day
- 100M / 86,400 = ~1,160 posts/second
- Peak: ~5,000 posts/second
Data storage:
- 100M DAU, each following 200 people
- Social graph: 100M DAU × 200 follows = 20B active edges; at ~1 KB per edge (IDs, timestamps, indexes) that is 20 TB, and the full 500M-user graph is 5× that
- Posts: ~100M posts/day, per the creation estimate above
- Each post: ~2 KB of metadata (id, user_id, timestamp, text, like/comment counts) = ~200 GB/day
- Keep 1 year of posts ≈ 73 TB of metadata (media is far larger and lives in object storage behind the CDN)
This is massive. We cannot serve all of this from a single database.
Two Fundamental Architectures
Approach 1: Fan-Out on Write (Push Model)
When a user posts, immediately write to the feed of all their followers.
Example:
- User Alice posts a photo
- System identifies Alice’s 50K followers
- System writes Alice’s post_id to the feed cache (Redis) of each follower
Pros:
- Feed reads are extremely fast (just fetch from cache)
- Ideal for high-read, low-write systems
Cons:
- Post creation is slow (writing to 50K feeds can take 10+ seconds)
- If Alice has 10M followers (celebrity), the write latency explodes
- Cache storage is huge (every user’s feed is pre-computed)
When to use: Regular users (followers in the thousands, not millions)
Approach 2: Fan-Out on Read (Pull Model)
When a user requests their feed, fetch the latest posts from all followed users in real-time.
Example:
- User Bob requests their feed
- System fetches the timeline of each of Bob’s 200 followed users
- System merges, ranks, and returns the top 20 posts
Pros:
- Post creation is fast (just write to the user’s own timeline)
- Works well for celebrities (no need to fan-out to millions)
- Less cache storage needed
Cons:
- Feed reads are slow (merging 200 timelines on every request)
- High database/cache load on popular users’ timelines
- Difficult to rank across timelines in real-time
When to use: For high-fan-out scenarios (celebrities with millions of followers)
Approach 3: Hybrid (The Real-World Solution)
Most platforms use a hybrid:
- Regular users (followers under 10K): Fan-out on write
- Celebrity users (followers over 10K): Fan-out on read (or hybrid with selective fan-out); the write-path routing is sketched after the diagram below
- Hot posts (viral posts): Pull at read time and cache the merged result
graph TD
User["User Creates Post"]
Celebrity{Is User<br/>a Celebrity?}
Regular["Regular User<br/>followers under 10K"]
FanOutWrite["Fan-Out on Write<br/>Write to all follower feeds"]
Celeb["Celebrity User<br/>followers over 10K"]
FanOutRead["Fan-Out on Read<br/>Fetch on demand"]
FeedCache["Feed Cache<br/>Redis Sorted Sets"]
User --> Celebrity
Celebrity -->|No| Regular
Celebrity -->|Yes| Celeb
Regular --> FanOutWrite
Celeb --> FanOutRead
FanOutWrite --> FeedCache
FanOutRead --> FeedCache
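To make the routing concrete, here is a minimal sketch of the write-path decision. The threshold, function names, and injected dependencies are illustrative assumptions, not a prescribed API:

```python
CELEBRITY_THRESHOLD = 10_000  # follower count above which we skip push fan-out

def handle_new_post(post_id, author_id, follower_count, enqueue_fan_out):
    """Decide push vs. pull delivery for a freshly created post.

    follower_count: int, looked up from the Social Graph Service
    enqueue_fan_out: callable that schedules the async fan-out job
    """
    if follower_count < CELEBRITY_THRESHOLD:
        # Regular user: push the post_id into each follower's cached feed.
        enqueue_fan_out(post_id, author_id)
    # Celebrity: no write-time fan-out; readers pull this author's timeline
    # on demand, and the Feed Service merges it at read time.
```

Selective fan-out (pushing only to a celebrity’s most active followers) slots in as a third branch between these two.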
High-Level Architecture
graph LR
API["API Gateway"]
PostService["Post Service<br/>(CRUD posts)"]
FanOutService["Fan-Out Service<br/>(Async, Celery)"]
FeedService["Feed Service<br/>(Read feed, rank)"]
SocialGraphService["Social Graph Service<br/>(Follow relationships)"]
FeedCache["Feed Cache<br/>Redis Sorted Sets<br/>per user_id"]
PostDB["Post DB<br/>MySQL/PostgreSQL"]
SocialGraphDB["Social Graph DB<br/>MySQL/Neo4j"]
MediaService["Media Service<br/>(Images/videos)"]
API -->|POST /posts| PostService
API -->|GET /feed| FeedService
API -->|POST /follow| SocialGraphService
PostService --> PostDB
PostService --> FanOutService
FanOutService --> FeedCache
FeedService --> FeedCache
FeedService --> SocialGraphService
FeedService --> PostDB
SocialGraphService --> SocialGraphDB
MediaService -.->|CDN| API
Deep Dive: Feed Storage and Computation
Redis Sorted Sets for Feed Cache
We store each user’s feed as a Redis sorted set:
Key: "user_feed:{user_id}"
Score: timestamp of post (for sorting by recency)
Member: post_id
Example:
user_feed:12345
{
member: post_789, score: 1707043200 (most recent)
member: post_788, score: 1707042500
member: post_787, score: 1707041000
...
member: post_001, score: 1706000000 (oldest)
}
Benefits:
- Sorted by score automatically (timestamp)
- Range queries: ZREVRANGE user_feed:12345 0 19 returns the 20 most recent posts in O(log n + k)
- Cursor pagination: ZREVRANGEBYSCORE user_feed:12345 ({last_timestamp} -inf LIMIT 0 20 resumes from the last post seen (the "(" makes the bound exclusive)
Fan-Out on Write Implementation
When Alice (user_id=100) posts:
1. Write the post to PostDB: INSERT INTO posts (user_id, content, timestamp)
2. Get Alice's followers: SELECT follower_id FROM follows WHERE following_id = 100
3. For each follower, push to their feed cache:
ZADD user_feed:{follower_id} {timestamp} {post_id}
ZREMRANGEBYRANK user_feed:{follower_id} 0 -501 (keep top 500)
This runs asynchronously via a job queue (Celery/RQ) so it doesn’t block the post creation API.
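A minimal sketch of that worker, assuming redis-py and Celery (the broker URL, key names, and the get_follower_ids stub are illustrative, not a prescribed API):

```python
import time

import redis
from celery import Celery

app = Celery("feeds", broker="redis://localhost:6379/1")  # assumed broker URL
r = redis.Redis(host="localhost", port=6379, db=0)

FEED_MAX_LEN = 500  # keep only the 500 most recent post_ids per feed

def get_follower_ids(author_id):
    """Illustrative stub; production code runs the SELECT shown above."""
    return []

@app.task
def fan_out_to_followers(post_id, author_id):
    """Push a new post_id into every follower's cached feed (fan-out on write)."""
    now = time.time()
    pipe = r.pipeline(transaction=False)  # batch commands to cut round trips
    for fid in get_follower_ids(author_id):
        key = f"user_feed:{fid}"
        pipe.zadd(key, {post_id: now})
        # Drop everything below the FEED_MAX_LEN highest scores (the oldest posts).
        pipe.zremrangebyrank(key, 0, -(FEED_MAX_LEN + 1))
    pipe.execute()
```

The post service enqueues it with fan_out_to_followers.delay(post_id, author_id), so the HTTP request returns as soon as the post row is written.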
Fan-Out on Read Implementation
When Bob requests /feed:
1. Get Bob's following list: SELECT following_id FROM follows WHERE follower_id = {bob_id}
2. For each followed user, fetch their timeline:
ZREVRANGE user_timeline:{followed_id} 0 99 (100 most recent posts)
3. Merge all timelines into one sorted list (merge sort)
4. Rank posts:
- Apply ranking formula: score = time_decay(timestamp) + engagement_score(likes, comments)
5. Return top 20
6. Cache result in user_feed:{bob_id} for 5 minutes (avoid repeated computation)
Merging k sorted timelines with a heap is O(N log k), where N is the total number of posts pulled (k = 200 timelines × 100 posts = 20,000 here). That is manageable per request, but costly enough to justify the 5-minute result cache.
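A sketch of the merge step using Python’s heapq.merge, which lazily merges already-sorted inputs; the (timestamp, post_id) tuple shape is an assumption for illustration:

```python
import heapq
from itertools import islice

def merge_timelines(timelines, limit=20):
    """Merge per-user timelines, each sorted newest-first, and return the
    `limit` most recent (timestamp, post_id) entries overall."""
    # heapq.merge keeps a k-entry heap, so the merge costs O(N log k), and
    # islice stops the lazy merge as soon as `limit` entries are produced.
    merged = heapq.merge(*timelines, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))

# Example: two followed users' timelines (timestamp, post_id), newest first
alice = [(1707043200, "post_789"), (1707041000, "post_787")]
carol = [(1707042500, "post_788"), (1706000000, "post_001")]
print(merge_timelines([alice, carol], limit=3))
# -> [(1707043200, 'post_789'), (1707042500, 'post_788'), (1707041000, 'post_787')]
```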
Ranking and Relevance
We don’t just show posts in reverse chronological order. We rank by relevance:
ranking_score =
0.5 * recency_score(post_age) +
0.3 * engagement_score(likes + comments) +
0.15 * affinity_score(with_followed_user) +
0.05 * diversity_score(content_type)
Recency score: Posts decay in score over time (exponential decay with 24-hour half-life).
Engagement score: Normalized by post age (recent posts have lower engagement, so we normalize to avoid bias).
Affinity score: How much does Bob interact with posts from this followed user? (tracked via interactions table).
Diversity score: Slight boost for different content types (text vs image vs video) to avoid monotony.
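A minimal sketch of the scoring function under those definitions; the engagement normalization and cap are illustrative assumptions, and affinity/diversity are taken as precomputed inputs in [0, 1]:

```python
import math
import time

HALF_LIFE_S = 24 * 3600  # recency half-life: a post's score halves every 24 hours

def recency_score(post_age_s):
    """Exponential decay in (0, 1]; 1.0 for a brand-new post."""
    return math.exp(-math.log(2) * post_age_s / HALF_LIFE_S)

def ranking_score(post, affinity, diversity, now=None):
    """Weighted relevance score per the formula above.

    post: dict with "timestamp", "likes", "comments"
    affinity, diversity: precomputed sub-scores in [0, 1]
    """
    now = now or time.time()
    age_s = max(now - post["timestamp"], 0.0)
    # Normalize engagement by age in hours so new posts aren't penalized
    # for having had less time to collect likes and comments.
    per_hour = (post["likes"] + post["comments"]) / max(age_s / 3600, 1.0)
    engagement = min(per_hour / 100.0, 1.0)  # illustrative cap into [0, 1]
    return (0.5 * recency_score(age_s)
            + 0.3 * engagement
            + 0.15 * affinity
            + 0.05 * diversity)
```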
Pagination and Cursor-Based Pagination
We use cursor-based pagination (not offset):
Request: GET /feed?limit=20&cursor=abc123def
Response:
{
"posts": [...],
"next_cursor": "xyz789uvw"
}
Cursor is base64-encoded: {post_id}:{timestamp}
Enables efficient pagination: ZREVRANGEBYSCORE user_feed:12345 ({cursor_timestamp} -inf LIMIT 0 20 resumes exactly where the previous page ended.
This avoids the “offset problem”: with OFFSET + LIMIT, the store must scan and discard every row before the offset, so deep pages get progressively slower.
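A sketch of cursor handling with redis-py, following the {post_id}:{timestamp} encoding above (key names and the page size are illustrative):

```python
import base64

import redis

r = redis.Redis()

def encode_cursor(post_id, timestamp):
    """Pack {post_id}:{timestamp} and base64-encode it, per the format above."""
    return base64.urlsafe_b64encode(f"{post_id}:{timestamp}".encode()).decode()

def decode_cursor(cursor):
    post_id, ts = base64.urlsafe_b64decode(cursor).decode().split(":")
    return post_id, int(ts)

def feed_page(user_id, cursor=None, limit=20):
    """Fetch one page of post_ids, newest first, resuming after `cursor`."""
    # "(" makes the score bound exclusive so the cursor post isn't repeated;
    # production cursors also tie-break on post_id for identical timestamps.
    max_score = "+inf" if cursor is None else f"({decode_cursor(cursor)[1]}"
    entries = r.zrevrangebyscore(
        f"user_feed:{user_id}", max_score, "-inf",
        start=0, num=limit, withscores=True,
    )
    if not entries:
        return [], None
    last_id, last_ts = entries[-1]
    next_cursor = encode_cursor(last_id.decode(), int(last_ts))
    return [pid.decode() for pid, _ in entries], next_cursor
```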
Scaling Considerations
Cache Partitioning by User ID
Feed cache is partitioned by user_id:
User 1-100M: Redis cluster 1
User 100M-200M: Redis cluster 2
User 200M-300M: Redis cluster 3
...
Use consistent hashing on user_id to determine the cluster (the fixed ranges above are only an illustration); a hash ring keeps resharding cheap when clusters are added or removed, as sketched below.
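A hand-rolled hash ring for illustration (in practice you would lean on Redis Cluster’s built-in slot mapping or a client library):

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring mapping user_ids to Redis clusters."""

    def __init__(self, clusters, vnodes=100):
        self._ring = []  # sorted list of (hash, cluster) points
        for cluster in clusters:
            for i in range(vnodes):  # virtual nodes smooth the distribution
                self._ring.append((self._hash(f"{cluster}:{i}"), cluster))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def cluster_for(self, user_id):
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._keys, self._hash(str(user_id))) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["redis-1", "redis-2", "redis-3"])
print(ring.cluster_for(12345))  # stable assignment; moves only if clusters change
```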
Social Graph Sharding
The social graph (follow relationships) is sharded by user_id:
Users 1-100M: DB shard 1
Users 100M-200M: DB shard 2
Sharding by user_id spreads follow edges across shards, so no single user’s follower list overloads one database.
Hot User Optimization
Extremely popular users (celebrities with millions of followers) are cached separately, as the steps and sketch below show:
When fetching a celebrity's timeline:
1. Check celebrity_cache:{celebrity_id}
2. If miss, fetch from PostDB and cache for 60 seconds
3. Feeds that include this celebrity's posts use the cached timeline
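A cache-aside sketch of those steps with redis-py; fetch_from_post_db stands in for the real PostDB query:

```python
import json

import redis

r = redis.Redis()
CELEB_TTL_S = 60  # short TTL keeps staleness bounded to about a minute

def get_celebrity_timeline(celebrity_id, fetch_from_post_db):
    """Cache-aside read: serve from Redis, fall back to PostDB on a miss.

    fetch_from_post_db: callable returning the newest posts, e.g. a PostDB query
    """
    key = f"celebrity_cache:{celebrity_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    timeline = fetch_from_post_db(celebrity_id)
    # SET with EX stores the value and its TTL in one atomic command.
    r.set(key, json.dumps(timeline), ex=CELEB_TTL_S)
    return timeline
```

At this request rate a cache miss can stampede PostDB, so production systems typically add a refresh lock or serve slightly stale data while one worker repopulates the key.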
Media and CDN
Posts include images and videos. We use:
- Media Service: Stores and encodes media (different resolutions)
- CDN: Distributes media globally (CloudFront, Akamai)
- Client: Loads media from CDN, not from primary servers
Feed API responses are ~2-5 KB per post, but media is served from CDN.
Trade-offs and Design Decisions
| Dimension | Fan-Out on Write | Fan-Out on Read | Hybrid |
|---|---|---|---|
| Write latency | Slow for celebrities | Fast | Fast |
| Read latency | Fast | Slow for high-follow users | Fast |
| Storage | High (pre-computed feeds) | Low | Medium |
| Consistency | Eventual (cache lag) | Fresh (computed at read time) | Mixed |
| Scalability | Limited by celebrity writes | Limited by celebrity reads | Best |
Pro tip: Start with fan-out on write (simpler) and move to the hybrid as you scale. Pure fan-out on write comfortably serves millions of users; it is celebrity accounts that eventually force the change.
Key Takeaways
- Choose your model based on follow distribution: If most accounts have modest follower counts, fan-out on write is enough. Once you have many celebrities, go hybrid.
- Cache is everything: Feed reads are under 200ms only because of Redis caching. Your database cannot handle real-time merges at scale.
- Ranking > Chronological: Users prefer relevant posts even if slightly older. Apply ranking formulas.
- Eventual consistency is acceptable: Users don’t need to see new posts instantly; a 1-2 second propagation delay is fine.
- Separate reads and writes: Don’t let the heavy fan-out writes block read latency. Use async workers.
Practice Exercise
Extend this design to support:
- Stories: Temporary posts that disappear after 24 hours. How does TTL work in Redis?
- Reels/short videos: Algorithmic feed (not just social) with infinite scroll.
- Notifications: Alert users when followed people post. Use pub/sub (Redis Streams or Kafka)?
Next up: We’ve designed how users discover content from people they follow. What about discovering new content by keyword? How do we build a search autocomplete system that handles 10B queries per day?