URL Shortener Architecture
The Problem
Picture this: you’re at a conference and need to share a link to your paper. Pasting “https://university.edu/research/2024/quantum-computing/paper-v3-final-FINAL-actually-final.pdf” in a tweet is ugly. You need a URL shortener—a service that converts long URLs into compact, shareable links like “https://short.url/abc123”.
We’re designing a system like bit.ly or TinyURL: accept a long URL, return a short one; when someone visits the short URL, redirect them to the original. Sounds simple, but at scale—with 100 million new shortened URLs per month and 1 billion redirects monthly—it’s complex. This system will teach us about data generation, caching, and handling massive read-heavy workloads.
Requirements and Estimation
Functional Requirements
- Shorten URL: Accept a long URL, return a shortened version
- Redirect: Accept a short URL, return HTTP redirect (301/302) to the original
- Custom aliases: Allow users to specify custom short URLs if available (e.g., “bit.ly/mycompany”)
- Expiration: Support URLs that expire after a certain time
- Analytics: Track clicks, referrer, geographic location of visitors
- User accounts: Users can manage and view analytics for their shortened URLs
Non-Functional Requirements
- Latency: Redirect response under 10 milliseconds (perceived as instant)
- Throughput: 100M URLs created per month = ~40 URLs/second write rate
- Read-heavy: 10:1 read-to-write ratio = 400 redirects per second
- Availability: 99.9% uptime (under 43 minutes downtime per month)
- Durability: URLs should persist even if one database fails
- Short URLs: Minimal length (7 characters is ideal)
Back-of-the-Envelope Calculation
Write throughput: 100M URLs/month ÷ 30 days ÷ 86400 seconds ≈ 40 URLs/sec
Read throughput: 400 URLs/sec (from 10:1 ratio)
5-year projection: 100M × 12 × 5 = 6 billion URLs
Storage per URL:
- Short key: 7 bytes (base62)
- Long URL: ~100 bytes (average)
- User ID: 8 bytes
- Metadata (created, expires, clicks): 50 bytes
- Total per record: ~165 bytes
Storage needed: 6B × 165 bytes ≈ 1 TB
With 3x replication for durability: 3 TB
Not massive—even a single database could theoretically handle it. But with 400 redirects per second and the need for sub-10ms latency, we need smart caching and sharding.
Key Generation Strategy
The most critical decision: how do we generate short IDs? Seven base62 characters give 62^7 ≈ 3.5 trillion combinations, comfortable headroom over the 6 billion URLs we expect in five years.
Approach 1: Hashing
Hash the full URL with MD5 or SHA256, base62-encode the digest, and keep the first 7 characters.
hash = SHA256("https://example.com/very/long/url")
short_id = toBase62(hash)[0:7]
Advantages: Simple, stateless, no coordination needed.
Disadvantages: Collision risk (by the birthday paradox, with 62^7 ≈ 3.5 trillion possible keys you expect the first collision after roughly √(62^7) ≈ 2 million URLs), so a collision-resolution strategy is required.
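A minimal sketch of this approach in Python, assuming we base62-encode part of the SHA256 digest; the helper names are illustrative, not from any specific library:
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(num: int) -> str:
    """Convert a non-negative integer to a base62 string."""
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num:
        num, rem = divmod(num, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def hash_short_id(long_url: str, length: int = 7) -> str:
    """Derive a 7-char candidate ID from the URL; collisions must still be checked."""
    digest = hashlib.sha256(long_url.encode("utf-8")).digest()
    # Take 8 bytes of the digest and fold into the 62^7 key space
    # (the small modulo bias is acceptable for a sketch).
    num = int.from_bytes(digest[:8], "big")
    return base62_encode(num % (62 ** length)).rjust(length, ALPHABET[0])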
Approach 2: Counter-Based (Auto-increment)
Use a global counter, convert to base62. Database assigns the next available ID.
SELECT next_id FROM url_counter;
short_id = base62(next_id)
INSERT INTO urls (short_id, long_url, ...);
Advantages: Zero collisions, simple, compact IDs, sequential (predictable).
Disadvantages: Global counter is a bottleneck, sequential IDs reveal usage patterns (security risk), requires careful database design to avoid conflicts in distributed setup.
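Reusing base62_encode from the sketch above, the counter approach reduces to encoding a database-assigned integer; this sketches only the encoding step, not the counter itself:
def counter_short_id(next_id: int, length: int = 7) -> str:
    """Encode a database-assigned counter value as a fixed-length short key."""
    # Sequential input yields sequential, guessable output (the security caveat above).
    return base62_encode(next_id).rjust(length, ALPHABET[0])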
Approach 3: Pre-generated Key Service (Recommended)
A separate service pre-generates billions of unique IDs, stores them in a pool, and hands them out.
Key Generation Service:
- Generate candidate IDs: base62(random(0, 62^7))
- Check against "used" set
- If new, add to queue of available keys
- Maintain pool of 1M+ unused keys at all times
URL Shortener Service:
- Request next available key from Key Service
- Assign to user's URL
- Return short URL
Advantages: Fast (no collisions, O(1) handout), no coordination at write time, high throughput.
Disadvantages: More complex architecture, need to manage the pool.
This is the industry standard. We’ll use this approach.
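A rough sketch of the pool-maintenance loop, assuming Redis holds both the used-key set and the queue of available keys; the key names, threshold, and redis_client parameter are illustrative:
import secrets

POOL_TARGET = 1_000_000   # keep at least this many unused keys queued

def refill_key_pool(redis_client):
    """Top up the queue of available keys; intended to run periodically."""
    while redis_client.llen("keys:available") < POOL_TARGET:
        candidate = base62_encode(secrets.randbelow(62 ** 7)).rjust(7, ALPHABET[0])
        # SADD returns 1 only if the candidate was not already in the used set.
        if redis_client.sadd("keys:used", candidate):
            redis_client.rpush("keys:available", candidate)

def next_key(redis_client) -> str:
    """Hand out a pre-generated key in O(1); the shortener never checks for collisions."""
    return redis_client.lpop("keys:available")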
High-Level Architecture
graph TB
subgraph "Write Path (Shorten URL)"
Client1["Client"]
API1["API Server"]
KeySvc["Key Generation Service"]
DB1[("Master DB")]
Cache1["Redis Cache"]
Client1 -->|POST /shorten| API1
API1 -->|Get next ID| KeySvc
KeySvc -->|Return ID| API1
API1 -->|Write mapping| DB1
API1 -->|Cache result| Cache1
end
subgraph "Read Path (Redirect)"
Client2["Client"]
LB["Load Balancer"]
API2["API Server"]
Cache2["Redis Cache"]
DB2[("Read Replica")]
CDN["CDN"]
Client2 -->|GET /abc123| LB
LB --> API2
API2 -->|Check cache| Cache2
Cache2 -->|Miss| DB2
API2 -->|301/302 redirect| Client2
CDN -->|Popular URLs| Client2
end
Deep Dive: Implementation Details
Database Schema
CREATE TABLE urls (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
short_key VARCHAR(16) UNIQUE NOT NULL,  -- 7 chars for generated keys, longer for custom aliases
long_url VARCHAR(2048) NOT NULL,
user_id BIGINT NOT NULL,
created_at TIMESTAMP DEFAULT NOW(),
expires_at TIMESTAMP,
is_custom BOOLEAN DEFAULT FALSE,
redirect_count INT DEFAULT 0,
INDEX (user_id),
INDEX (expires_at),
INDEX (created_at)
);
CREATE TABLE click_analytics (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
short_key VARCHAR(16) NOT NULL,
timestamp TIMESTAMP DEFAULT NOW(),
referrer VARCHAR(512),
user_agent VARCHAR(512),
country VARCHAR(2),
INDEX (short_key, timestamp),
INDEX (timestamp)
);
Observation: click_analytics is a separate table because:
- URLs table is small and frequently read (redirect path)
- Analytics table is write-heavy (every click) and append-only
- We can shard analytics separately from URLs
301 vs 302 Redirect
This matters more than you’d think:
301 Permanent Redirect:
- Browser caches the mapping (second visit doesn’t hit our server)
- Good for latency, bad for analytics (we miss cached redirects)
- Good for SEO (some authority transfers to the original URL)
- Cannot change the destination later
302 Temporary Redirect:
- Browser doesn’t cache (every visit hits our server)
- Better analytics tracking
- Can update destination without old URLs breaking
- More hits on our servers (higher cost)
Hybrid Approach:
- Use 302 by default (flexibility)
- Switch to 301 for popular URLs (caching reduces load)
- Cache the 301 response in CDN
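A minimal sketch of the hybrid rule, assuming a Flask-style handler and a hypothetical get_click_count helper backed by the analytics store:
from flask import Flask, redirect, abort

app = Flask(__name__)
POPULAR_THRESHOLD = 10_000          # illustrative cutoff for "popular"

@app.route("/<short_key>")
def follow(short_key):
    long_url = get_url(short_key)   # cache-then-database lookup (see Caching Strategy below)
    if long_url is None:
        abort(404)
    # 302 keeps the mapping flexible and every click visible to analytics;
    # 301 lets browsers and the CDN cache hot links and skip our servers.
    code = 301 if get_click_count(short_key) >= POPULAR_THRESHOLD else 302
    return redirect(long_url, code=code)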
Caching Strategy
The read path dominates (400 redirects/sec). A simple LRU cache in Redis helps:
def get_url(short_key):
    # Check Redis cache first
    long_url = redis.get(f"url:{short_key}")
    if long_url:
        return long_url
    # Cache miss: query the database with a parameterized statement
    row = database.query("SELECT long_url FROM urls WHERE short_key = ?", short_key)
    if not row:
        return None
    long_url = row['long_url']
    # Cache for 24 hours (or until expiration)
    redis.setex(f"url:{short_key}", 86400, long_url)
    return long_url
Hit rate: With 80-90% of traffic going to 20% of URLs (a Zipf-like distribution), caching the 100K most popular URLs gives us an 80%+ cache hit rate.
Custom Aliases
Users want “bit.ly/mycompany” instead of “bit.ly/abc123”. Check availability:
def shorten_with_custom_alias(long_url, custom_alias, user_id):
    if not is_valid_alias(custom_alias):
        return error("Invalid alias")
    existing = database.query(
        "SELECT id FROM urls WHERE short_key = ?",
        custom_alias
    )
    if existing:
        return error("Alias already taken")
    # Insert with custom alias
    database.insert("urls", {
        "short_key": custom_alias,
        "long_url": long_url,
        "user_id": user_id,
        "is_custom": True
    })
    return success(custom_alias)
Race condition: Two users might request the same alias simultaneously, and the SELECT check alone cannot prevent it. Rely on the UNIQUE constraint on short_key: the loser of the race gets an "alias already taken" error, while auto-generated keys simply retry with the next key from the pool. A sketch of the constraint-based pattern follows.
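One way to lean on the constraint rather than the check-then-insert above, using the document's pseudocode helpers and assuming the database driver raises some integrity error on duplicate keys (UniqueViolation here is a hypothetical name):
def claim_alias(long_url, custom_alias, user_id):
    try:
        # The UNIQUE index on short_key is the real arbiter, not an earlier SELECT.
        database.insert("urls", {
            "short_key": custom_alias,
            "long_url": long_url,
            "user_id": user_id,
            "is_custom": True,
        })
        return success(custom_alias)
    except UniqueViolation:   # hypothetical, driver-specific exception
        return error("Alias already taken")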
URL Expiration
Some URLs should expire (time-sensitive event links, temporary documents):
def create_url(long_url, user_id, ttl_hours=None):
    short_key = get_next_key_from_key_service()
    expires_at = None
    if ttl_hours:
        expires_at = now() + timedelta(hours=ttl_hours)
    database.insert("urls", {
        "short_key": short_key,
        "long_url": long_url,
        "user_id": user_id,
        "expires_at": expires_at
    })
    # Cache with TTL matching expiration
    cache_ttl = ttl_hours * 3600 if ttl_hours else 86400
    redis.setex(f"url:{short_key}", cache_ttl, long_url)
    return short_key
Analytics Pipeline
Tracking every click in the main request path would block the redirect. Instead, log asynchronously:
User clicks short URL
↓
API returns 301/302 redirect
↓
Meanwhile, emit event to Kafka: {short_key, timestamp, referrer, user_agent, ip}
↓
Analytics consumer reads from Kafka
↓
Writes to time-series database (InfluxDB, MongoDB)
↓
Dashboard queries for graphs (by day, by country, by referrer)
This way, redirects complete in 10ms without waiting for analytics storage.
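A sketch of the fire-and-forget event emission, assuming the kafka-python client; the broker address, topic name, and field names are illustrative:
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def emit_click(short_key, referrer, user_agent, ip):
    """Queue the click event without blocking the redirect response."""
    producer.send("clicks", {
        "short_key": short_key,
        "timestamp": int(time.time()),
        "referrer": referrer,
        "user_agent": user_agent,
        "ip": ip,
    })  # send() is asynchronous; the client batches and flushes in the background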
Scaling for Global Traffic
Database Sharding
With 6 billion URLs and 400 redirects/sec, we need multiple databases:
Strategy: Hash-based sharding
shard_id = hash(short_key) % num_shards
db = database_pool[shard_id]
With 10 shards, each shard handles 40 redirects/sec (easily handled by one database server).
Each shard needs read replicas for the redirect path (reads scale horizontally).
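A sketch of the shard router; note that Python's built-in hash() is randomized per process, so a stable hash is needed (CRC32 here is an assumption, and database_pool is the list from the pseudocode above):
import zlib

NUM_SHARDS = 10

def shard_for(short_key: str) -> int:
    """Map a short key to a shard deterministically across processes and servers."""
    return zlib.crc32(short_key.encode("utf-8")) % NUM_SHARDS

def db_for(short_key):
    return database_pool[shard_for(short_key)]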
Multi-Region Deployment
Users in Europe shouldn’t redirect through US data centers:
- Write: Primary region only (user accounts in US)
- Read: Regional replicas (EU, APAC) replicate from primary
Replication lag is fine (eventual consistency) since URLs are immutable after creation.
CDN for Popular URLs
The most popular 0.1% of URLs account for 10%+ of traffic:
CDN = Content Delivery Network
Cache the redirect response geographically
First user in London clicks "bit.ly/viral":
→ No cache, hits London edge, queries DB
→ Response cached in London
Next user in London:
→ CDN serves from cache (1-2ms latency)
→ Never hits origin server
Reduces origin load by 90% for viral URLs.
Scaling Considerations
| Component | Bottleneck | Solution |
|---|---|---|
| Short URL generation | Key Service becomes bottleneck | Pre-generate billions of keys, distribute across multiple Key Services |
| Database writes | Sharding required | Hash-based sharding (10-20 shards), batch inserts |
| Database reads (redirects) | Read replicas saturate | Add more replicas, Redis cache, CDN |
| Analytics ingestion | Click tracking blocks redirect | Async event pipeline (Kafka), batch writes |
| Expiration cleanup | Old URLs accumulate | Background job runs hourly, deletes by expires_at timestamp |
Pro Tip: Use bulk operations. Instead of inserting 40 URLs/sec one at a time, buffer and batch them (e.g., 1,000 rows every 25 seconds). Batching amortizes round-trips and index maintenance and cuts per-insert overhead substantially, as long as you can tolerate a short window before a new mapping is durable. A sketch of a batched insert follows.
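A sketch of the batched write, assuming a DB-API style connection whose cursor supports executemany and a MySQL/Postgres-style driver for the %s placeholders; short keys are assumed to have already been assigned from the key pool:
def flush_batch(conn, pending):
    """Insert accumulated (short_key, long_url, user_id) rows in a single round trip."""
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO urls (short_key, long_url, user_id) VALUES (%s, %s, %s)",
            pending,
        )
    conn.commit()
    pending.clear()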
Trade-offs
| Decision | Chosen | Alternative | Trade-off |
|---|---|---|---|
| Key generation | Pre-generated pool | Hashing | Guaranteed unique keys vs simpler, stateless implementation |
| Redirect type | 302 + 301 for popular | Always 301 | Flexibility vs client-side caching |
| Analytics | Async (Kafka) | Sync writes | Fast redirects vs real-time analytics |
| Replication | Async to replicas | Sync | Fast writes vs stronger consistency |
| Expiration cleanup | Background job | TTL in cache | Simple vs automatic |
Key Takeaways
- Scale estimation is crucial—40 writes/sec is very different from 40K writes/sec
- Read caching dominates—URLs are immutable, so caching is safe and high-impact
- Analytics must be decoupled from the critical redirect path
- Pre-generated keys eliminate contention at generation time
- Multi-level caching (Redis, CDN) is essential for sub-10ms latency
- Simple data model (key-value mapping) means we can scale horizontally
- Expiration requires thought—both for TTL caches and database cleanup
Practice
Design a feature where users can export analytics for their shortened URLs (clicks per day, top referrers, geographic distribution). The export might be 1GB+ of data. How would you generate this efficiently? Should it be real-time or pre-computed? Where would you store it?
Connection to Next Section
URL shortening is fast because URLs are immutable and we cache aggressively. What if data constantly changes and we need consistency across servers? In the next section, we’ll design a Distributed Cache (like Redis Cluster) that handles mutable data, requires careful consistency strategies, and scales to millions of operations per second—the infrastructure behind services like URL shorteners.