System Design Fundamentals

URL Shortener Architecture

The Problem

Picture this: you’re at a conference and need to share a link to your paper. Pasting “https://university.edu/research/2024/quantum-computing/paper-v3-final-FINAL-actually-final.pdf” in a tweet is ugly. You need a URL shortener—a service that converts long URLs into compact, shareable links like “https://short.url/abc123”.

We’re designing a system like bit.ly or TinyURL: accept a long URL, return a short one; when someone visits the short URL, redirect them to the original. Sounds simple, but at scale—with 100 million new shortened URLs per month and 1 billion redirects monthly—it’s complex. This system will teach us about data generation, caching, and handling massive read-heavy workloads.

Requirements and Estimation

Functional Requirements

  • Shorten URL: Accept a long URL, return a shortened version
  • Redirect: Accept a short URL, return HTTP redirect (301/302) to the original
  • Custom aliases: Allow users to specify custom short URLs if available (e.g., “bit.ly/mycompany”)
  • Expiration: Support URLs that expire after a certain time
  • Analytics: Track clicks, referrer, geographic location of visitors
  • User accounts: Users can manage and view analytics for their shortened URLs

Non-Functional Requirements

  • Latency: Redirect response under 10 milliseconds (perceived as instant)
  • Throughput: 100M URLs created per month = ~40 URLs/second write rate
  • Read-heavy: 10:1 read-to-write ratio = 400 redirects per second
  • Availability: 99.9% uptime (under 43 minutes downtime per month)
  • Durability: URLs should persist even if one database fails
  • Short URLs: Minimal length (7 characters is ideal)

Back-of-the-Envelope Calculation

Write throughput: 100M URLs/month ÷ 30 days ÷ 86400 seconds ≈ 40 URLs/sec
Read throughput: 400 URLs/sec (from 10:1 ratio)

5-year projection: 100M × 12 × 5 = 6 billion URLs

Storage per URL:
  - Short key: 7 bytes (base62)
  - Long URL: ~100 bytes (average)
  - User ID: 8 bytes
  - Metadata (created, expires, clicks): 50 bytes
  - Total per record: ~165 bytes

Storage needed: 6B × 165 bytes ≈ 1 TB

With 3x replication for durability: 3 TB

Not massive—even a single database could theoretically handle it. But with 400 redirects per second and the need for sub-10ms latency, we need smart caching and sharding.
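
If you want to sanity-check the arithmetic, it fits in a few lines (the constants are just the assumptions listed above):

urls_per_month = 100_000_000
writes_per_sec = urls_per_month / (30 * 86_400)    # ~38.6, round to 40
reads_per_sec = writes_per_sec * 10                # 10:1 read:write ratio

total_urls = urls_per_month * 12 * 5               # 6 billion over 5 years
bytes_per_record = 7 + 100 + 8 + 50                # key + URL + user_id + metadata
storage_tb = total_urls * bytes_per_record / 1e12  # ~0.99 TB raw, ~3 TB with 3x replication

print(f"{writes_per_sec:.0f} writes/s, {reads_per_sec:.0f} reads/s, {storage_tb:.2f} TB")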

Key Generation Strategy

The most critical decision: how do we generate short IDs? A 7-character base62 key gives 62^7 ≈ 3.5 trillion combinations, which is ample headroom for 6 billion URLs.

Approach 1: Hashing

Hash the full URL with MD5 or SHA-256, base62-encode the digest, and keep the first 7 characters.

hash = SHA256("https://example.com/very/long/url")
short_id = base62(hash)[0:7]
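
A minimal runnable sketch of this approach in Python (mapping digest bytes onto a base62 alphabet is a simplification of a true base conversion, but it shows the idea):

import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def shorten_by_hash(long_url, length=7):
  # Map the first `length` digest bytes onto the base62 alphabet.
  digest = hashlib.sha256(long_url.encode()).digest()
  return "".join(ALPHABET[b % 62] for b in digest[:length])

print(shorten_by_hash("https://example.com/very/long/url"))  # a 7-character key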

Advantages: Simple, stateless, no coordination needed.

Disadvantages: Collision probability (birthday paradox: with a 62^7 keyspace, collisions become likely after only a few million URLs, on the order of √(62^7) ≈ 1.9M), so you need a collision-resolution strategy.

Approach 2: Counter-Based (Auto-increment)

Use a global counter, convert to base62. Database assigns the next available ID.

SELECT next_id FROM url_counter;
short_id = base62(next_id)
INSERT INTO urls (short_id, long_url, ...);

Advantages: Zero collisions, simple, compact IDs, sequential (predictable).

Disadvantages: Global counter is a bottleneck, sequential IDs reveal usage patterns (security risk), requires careful database design to avoid conflicts in distributed setup.
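
The only non-obvious step is converting the integer ID to a short string. A minimal base62 encoder (the alphabet is an arbitrary choice, as long as it stays fixed):

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62(n):
  # Repeatedly divide by 62, mapping each remainder to a character.
  if n == 0:
    return ALPHABET[0]
  out = []
  while n:
    n, rem = divmod(n, 62)
    out.append(ALPHABET[rem])
  return "".join(reversed(out))

print(base62(6_000_000_000))  # a 6-character key, so 7 characters leave headroom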

Approach 3: Pre-Generated Key Pool (Key Generation Service)

A separate service pre-generates billions of unique IDs, stores them in a pool, and hands them out.

Key Generation Service:
  - Generate candidate IDs: base62(random(0, 62^7))
  - Check against "used" set
  - If new, add to queue of available keys
  - Maintain pool of 1M+ unused keys at all times

URL Shortener Service:
  - Request next available key from Key Service
  - Assign to user's URL
  - Return short URL

Advantages: Fast (no collisions, O(1) handout), no coordination at write time, high throughput.

Disadvantages: More complex architecture, need to manage the pool.

This is a common pattern in production URL shorteners, and it’s the approach we’ll use.
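
A minimal sketch of both sides, assuming the pool lives in a Redis list named key_pool and used keys live in a Redis set (the names, batch size, and low-water mark are illustrative choices, not part of any standard):

import secrets
import redis

r = redis.Redis()
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def refill_pool(batch=10_000, low_water_mark=1_000_000):
  # Key Generation Service: keep the pool of unused keys topped up.
  while r.llen("key_pool") < low_water_mark:
    candidates = ["".join(secrets.choice(ALPHABET) for _ in range(7))
                  for _ in range(batch)]
    # SADD returns 1 only for keys we have never seen before.
    fresh = [k for k in candidates if r.sadd("used_keys", k) == 1]
    if fresh:
      r.rpush("key_pool", *fresh)

def next_key():
  # URL shortener service: O(1) handout, no coordination at write time.
  key = r.lpop("key_pool")
  if key is None:
    raise RuntimeError("key pool exhausted; refill job has fallen behind")
  return key.decode()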

High-Level Architecture

graph TB
    subgraph "Write Path (Shorten URL)"
        Client1["Client"]
        API1["API Server"]
        KeySvc["Key Generation Service"]
        DB1[("Master DB")]
        Cache1["Redis Cache"]
        Client1 -->|POST /shorten| API1
        API1 -->|Get next ID| KeySvc
        KeySvc -->|Return ID| API1
        API1 -->|Write mapping| DB1
        API1 -->|Cache result| Cache1
    end

    subgraph "Read Path (Redirect)"
        Client2["Client"]
        LB["Load Balancer"]
        API2["API Server"]
        Cache2["Redis Cache"]
        DB2[("Read Replica")]
        CDN["CDN"]
        Client2 -->|GET /abc123| LB
        LB --> API2
        API2 -->|Check cache| Cache2
        Cache2 -->|Miss| DB2
        DB2 -->|Long URL| API2
        API2 -->|301/302 redirect| Client2
        CDN -->|Popular URLs| Client2
    end

Deep Dive: Implementation Details

Database Schema

CREATE TABLE urls (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  short_key VARCHAR(7) UNIQUE NOT NULL,
  long_url VARCHAR(2048) NOT NULL,
  user_id BIGINT NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  expires_at TIMESTAMP,
  is_custom BOOLEAN DEFAULT FALSE,
  redirect_count INT DEFAULT 0,

  INDEX (user_id),
  INDEX (expires_at),
  INDEX (created_at)
);

CREATE TABLE click_analytics (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  short_key VARCHAR(7) NOT NULL,
  timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  referrer VARCHAR(512),
  user_agent VARCHAR(512),
  country VARCHAR(2),

  INDEX (short_key, timestamp),
  INDEX (timestamp)
);

Observation: click_analytics is a separate table because:

  • URLs table is small and frequently read (redirect path)
  • Analytics table is write-heavy (every click) and append-only
  • We can shard analytics separately from URLs

301 vs 302 Redirect

This matters more than you’d think:

301 Permanent Redirect:

  • Browser caches the mapping (second visit doesn’t hit our server)
  • Good for latency, bad for analytics (we miss cached redirects)
  • Good for SEO (some authority transfers to the original URL)
  • Cannot change the destination later

302 Temporary Redirect:

  • Browser doesn’t cache (every visit hits our server)
  • Better analytics tracking
  • Can update destination without old URLs breaking
  • More hits on our servers (higher cost)

Hybrid Approach:

  • Use 302 by default (flexibility)
  • Switch to 301 for popular URLs (caching reduces load)
  • Cache the 301 response in CDN
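
To make the hybrid policy concrete, here is a minimal Flask-style handler; lookup_long_url and click_count are hypothetical helpers standing in for the cache/database lookups covered in the next subsection, and the popularity threshold is arbitrary:

from flask import Flask, abort, redirect

app = Flask(__name__)
POPULARITY_THRESHOLD = 10_000  # clicks; an illustrative cutoff

@app.route("/<short_key>")
def follow(short_key):
  long_url = lookup_long_url(short_key)  # hypothetical: Redis cache, then DB
  if long_url is None:
    abort(404)
  if click_count(short_key) > POPULARITY_THRESHOLD:  # hypothetical counter lookup
    # Popular links get a cacheable 301 so browsers and the CDN absorb repeat visits.
    resp = redirect(long_url, code=301)
    resp.headers["Cache-Control"] = "public, max-age=86400"
  else:
    # Everything else stays a 302: full analytics, and the destination can change.
    resp = redirect(long_url, code=302)
  return resp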

Caching Strategy

The read path dominates (400 redirects/sec). A simple LRU cache in Redis helps:

def get_url(short_key):
  # Check Redis cache
  long_url = redis.get(f"url:{short_key}")
  if long_url:
    return long_url

  # Cache miss, query database
  row = database.query(f"SELECT long_url FROM urls WHERE short_key = ?", short_key)
  if row:
    long_url = row['long_url']
    # Cache for 24 hours (or until expiration)
    redis.setex(f"url:{short_key}", 86400, long_url)
    return long_url
  else:
    return None

Hit rate: With 80-90% of traffic going to 20% of URLs (a Zipf-like distribution), caching the 100K most popular URLs gets us an 80%+ cache hit rate.

Custom Aliases

Users want “bit.ly/mycompany” instead of “bit.ly/abc123”. Check availability:

def shorten_with_custom_alias(long_url, custom_alias, user_id):
  if not is_valid_alias(custom_alias):
    return error("Invalid alias")

  existing = database.query(
    "SELECT id FROM urls WHERE short_key = ?",
    custom_alias
  )

  if existing:
    return error("Alias already taken")

  # Insert with custom alias
  database.insert("urls", {
    "short_key": custom_alias,
    "long_url": long_url,
    "user_id": user_id,
    "is_custom": True
  })

  return success(custom_alias)

Race condition: Two users might request the same alias simultaneously, and a check-then-insert like the code above can let both through. Instead, rely on the database constraint (UNIQUE on short_key): attempt the insert and, if it fails with a duplicate-key error, report that the alias is already taken. (For auto-generated keys, the equivalent recovery is to grab the next key from the pool and retry.)
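
One way to close that race, sketched with SQLite so it runs standalone (a production system would catch the equivalent duplicate-key error from its MySQL/Postgres driver):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (short_key TEXT PRIMARY KEY, long_url TEXT, user_id INTEGER)")

def claim_alias(short_key, long_url, user_id):
  # Let the UNIQUE/PRIMARY KEY constraint arbitrate: whichever insert commits
  # first wins, and the loser sees a constraint violation instead of a silent overwrite.
  try:
    with conn:
      conn.execute(
        "INSERT INTO urls (short_key, long_url, user_id) VALUES (?, ?, ?)",
        (short_key, long_url, user_id),
      )
    return True
  except sqlite3.IntegrityError:
    return False  # alias already taken; surface the error to the user

print(claim_alias("mycompany", "https://example.com", 1))  # True
print(claim_alias("mycompany", "https://other.com", 2))    # False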

URL Expiration

Some URLs should expire (time-sensitive event links, temporary documents):

def create_url(long_url, user_id, ttl_hours=None):
  short_key = get_next_key_from_key_service()

  expires_at = None
  if ttl_hours:
    expires_at = now() + timedelta(hours=ttl_hours)

  database.insert("urls", {
    "short_key": short_key,
    "long_url": long_url,
    "user_id": user_id,
    "expires_at": expires_at
  })

  # Cache with TTL matching expiration
  cache_ttl = ttl_hours * 3600 if ttl_hours else 86400
  redis.setex(f"url:{short_key}", cache_ttl, long_url)

  return short_key
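
Expired rows also have to leave the database eventually. A sketch of the hourly cleanup job mentioned in the scaling table below, assuming a MySQL-style database and a DB-API cursor; the batch size and pause are arbitrary:

import time

CLEANUP_SQL = """
  DELETE FROM urls
  WHERE expires_at IS NOT NULL AND expires_at < NOW()
  LIMIT 10000
"""

def cleanup_expired_urls(cursor):
  # Delete in bounded batches so a large backlog never holds long locks.
  while True:
    cursor.execute(CLEANUP_SQL)
    if cursor.rowcount == 0:
      break
    time.sleep(0.1)  # brief pause between batches to reduce contention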

Analytics Pipeline

Tracking every click in the main request path would block the redirect. Instead, log asynchronously:

  1. User clicks the short URL
  2. API returns the 301/302 redirect immediately
  3. Meanwhile, emit an event to Kafka: {short_key, timestamp, referrer, user_agent, ip}
  4. Analytics consumer reads from Kafka
  5. Writes to a time-series database (InfluxDB, MongoDB)
  6. Dashboard queries it for graphs (by day, by country, by referrer)

This way, redirects complete in 10ms without waiting for analytics storage.
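
A sketch of the emit step using the kafka-python client; the broker address and topic name are placeholders, and send() batches asynchronously, so the redirect response never waits on it:

import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
  bootstrap_servers=["localhost:9092"],  # placeholder broker
  value_serializer=lambda event: json.dumps(event).encode(),
)

def emit_click_event(short_key, referrer, user_agent, ip):
  # Fire-and-forget: the producer buffers and sends in the background.
  producer.send("click-events", {
    "short_key": short_key,
    "timestamp": int(time.time()),
    "referrer": referrer,
    "user_agent": user_agent,
    "ip": ip,
  })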

Scaling for Global Traffic

Database Sharding

With 6 billion URLs and 400 redirects/sec, we need multiple databases:

Strategy: Hash-based sharding

shard_id = hash(short_key) % num_shards
db = database_pool[shard_id]

With 10 shards, each shard handles 40 redirects/sec (easily handled by one database server).

Each shard needs read replicas for the redirect path (reads scale horizontally).
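
One caveat if you implement this in Python: the built-in hash() for strings is randomized per process, so shard routing needs a stable hash such as MD5. A minimal sketch:

import hashlib

NUM_SHARDS = 10

def shard_for(short_key):
  # Deterministic across processes and restarts, unlike Python's built-in hash().
  digest = hashlib.md5(short_key.encode()).digest()
  return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# database_pool[shard_for("abc123")] holds that key's row.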

Multi-Region Deployment

Users in Europe shouldn’t redirect through US data centers:

  • Write: Primary region only (user accounts in US)
  • Read: Regional replicas (EU, APAC) replicate from primary

Replication lag is fine (eventual consistency) since URLs are immutable after creation.

CDN for Popular URLs

The most popular 0.1% of URLs account for 10%+ of traffic:

CDN = Content Delivery Network
  Cache the redirect response geographically

First user in London clicks "bit.ly/viral":
  → No cache, hits London edge, queries DB
  → Response cached in London

Next user in London:
  → CDN serves from cache (1-2ms latency)
  → Never hits origin server

Reduces origin load by 90% for viral URLs.

Scaling Considerations

| Component | Bottleneck | Solution |
| --- | --- | --- |
| Short URL generation | Key Service becomes bottleneck | Pre-generate billions of keys, distribute across multiple Key Services |
| Database writes | Sharding required | Hash-based sharding (10-20 shards), batch inserts |
| Database reads (redirects) | Read replicas saturate | Add more replicas, Redis cache, CDN |
| Analytics ingestion | Click tracking blocks redirect | Async event pipeline (Kafka), batch writes |
| Expiration cleanup | Old URLs accumulate | Background job runs hourly, deletes by expires_at timestamp |

Pro Tip: Use bulk operations. Instead of inserting 40 URLs/sec one at a time, batch them (insert 1000 URLs every 25 seconds); batching amortizes per-statement overhead and can cut write cost substantially.

Trade-offs

| Decision | Chosen | Alternative | Trade-off |
| --- | --- | --- | --- |
| Key generation | Pre-generated pool | Hashing | Simpler implementation vs guaranteed availability |
| Redirect type | 302 + 301 for popular | Always 301 | Flexibility vs client-side caching |
| Analytics | Async (Kafka) | Sync writes | Fast redirects vs real-time analytics |
| Replication | Async to replicas | Sync | Fast writes vs stronger consistency |
| Expiration cleanup | Background job | TTL in cache | Simple vs automatic |

Key Takeaways

  1. Scale estimation is crucial—40 writes/sec is very different from 40K writes/sec
  2. Read caching dominates—URLs are immutable, so caching is safe and high-impact
  3. Analytics must be decoupled from the critical redirect path
  4. Pre-generated keys eliminate contention at generation time
  5. Multi-level caching (Redis, CDN) is essential for sub-10ms latency
  6. Simple data model (key-value mapping) means we can scale horizontally
  7. Expiration requires thought—both for TTL caches and database cleanup

Practice

Design a feature where users can export analytics for their shortened URLs (clicks per day, top referrers, geographic distribution). The export might be 1GB+ of data. How would you generate this efficiently? Should it be real-time or pre-computed? Where would you store it?

Connection to Next Section

URL shortening is fast because URLs are immutable and we cache aggressively. What if data constantly changes and we need consistency across servers? In the next section, we’ll design a Distributed Cache (like Redis Cluster) that handles mutable data, requires careful consistency strategies, and scales to millions of operations per second—the infrastructure behind services like URL shorteners.