How CDNs Work
Why Distance Matters More Than You Think
Imagine you’re sitting in Singapore, visiting a website hosted on a server in Virginia. That request doesn’t teleport—it travels across submarine cables, through multiple routers, and across continents. Light in fiber moves at roughly two-thirds of its vacuum speed (about 200,000 km/s), so the round trip alone takes roughly 150 milliseconds before a single router adds its own delay. Now multiply that by dozens of assets: images, stylesheets, JavaScript, videos. A user in Singapore might wait 3–5 seconds for a page that loads instantly in New York, purely because of physics and geography.
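To see where that number comes from, here’s a quick back-of-envelope calculation (a sketch: the distance is an approximate great-circle figure, and real cable routes are longer):

```python
# Back-of-envelope RTT estimate. Fiber carries light at roughly
# two-thirds of its vacuum speed, about 200,000 km/s.
SINGAPORE_TO_VIRGINIA_KM = 15_700   # approximate great-circle distance
FIBER_SPEED_KM_PER_S = 200_000

round_trip_ms = 2 * SINGAPORE_TO_VIRGINIA_KM / FIBER_SPEED_KM_PER_S * 1000
print(f"Best-case RTT: ~{round_trip_ms:.0f} ms")  # ~157 ms, before routers add delay
```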
This is where Content Delivery Networks (CDNs) enter the picture. Instead of every user fetching content from a single origin server thousands of miles away, CDNs distribute copies of your content across servers positioned in cities, regions, and continents worldwide. A user in Singapore now fetches from a server in Singapore or a nearby Asia-Pacific location—reducing latency from 150ms to 10–20ms. The impact is dramatic: faster page loads, better user experience, and happier customers.
Throughout this chapter, you’ll learn how CDNs work, why they’re essential for scale, and how to integrate them into your architecture. We’ll connect this to the caching strategies from Chapter 6 (where we learned about cache layers) and the networking fundamentals from Chapter 3 (latency, bandwidth, and routing). By the end, you’ll understand not just the “what” but the “why” behind one of the internet’s most critical infrastructure pieces.
Understanding CDNs: The Basics
A Content Delivery Network is a geographically distributed network of servers that cache and serve content to users from locations close to them. Think of it as a middleman between your origin server (the source of truth for your content) and your end users. The origin server lives in one location—say, us-east-1 in AWS. The CDN duplicates that content across dozens or hundreds of edge servers worldwide, strategically placed to minimize the distance data travels.
Origin servers are where your content lives initially. They hold the “golden copy”—the authoritative version of all your static assets (images, CSS, JavaScript), and increasingly, dynamic content too. Edge servers (also called Points of Presence or PoPs) are the distributed caches that sit much closer to end users. When a user requests content, the CDN intelligently routes them to the nearest edge server. If that edge server has the content cached, it serves it instantly. If not, it fetches from the origin (or a parent cache) and stores a copy for future requests.
CDNs operate using two primary models: push and pull. In a push model, you actively upload content to the CDN, ensuring it’s pre-positioned everywhere before users request it. This guarantees availability but requires more operational overhead. In a pull model, the CDN only fetches content from your origin when a user requests it, then caches it for future requesters. Pull is simpler to operate but can result in cache misses for less popular content.
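To make the pull model concrete, here’s a minimal sketch of a pull-through cache in Python; fetch_from_origin is a stand-in for whatever actually talks to your origin, and real CDN caches add eviction, size limits, and much more:

```python
import time

class PullThroughCache:
    """Minimal sketch of a pull-model edge cache: content is fetched
    from the origin only on a miss, then stored for future requesters."""

    def __init__(self, fetch_from_origin, ttl_seconds=3600):
        self._fetch = fetch_from_origin   # callable: url -> bytes
        self._ttl = ttl_seconds
        self._store = {}                  # url -> (expires_at, body)

    def get(self, url):
        entry = self._store.get(url)
        if entry and entry[0] > time.time():
            return entry[1]               # cache HIT: serve locally
        body = self._fetch(url)           # cache MISS: go to origin
        self._store[url] = (time.time() + self._ttl, body)
        return body
```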
Major CDN providers dominate the market. Cloudflare is known for DDoS protection and speed. AWS CloudFront integrates seamlessly with AWS services. Akamai is a legacy giant with unmatched reach. Fastly specializes in real-time content manipulation and video delivery. Each has different strengths, pricing models, and geographic coverage. Choosing one depends on your use case, budget, and performance requirements.
CDNs started by caching static content: images, CSS stylesheets, JavaScript bundles, and video files. These are perfect for CDN caching—they don’t change often, and serving them from a nearby edge server is nearly identical to serving from the origin. Over the past decade, CDNs have evolved to handle increasingly dynamic content: personalized HTML, real-time API responses, and even streaming. Some modern CDNs now offer serverless computing at the edge (like Cloudflare Workers or AWS Lambda@Edge), transforming them from purely content-serving infrastructure into programmable platforms.
The Franchise Kitchen Analogy
Imagine a restaurant chain that serves signature dishes worldwide. The naive approach: operate one central kitchen in New York, prepare all food there, and ship it globally. Every meal arrives cold and stale. The modern approach: license the recipes and train local chefs. Now you have kitchens in Tokyo, London, São Paulo, and Sydney. Each kitchen sources local ingredients (fresh eggs, dairy, vegetables) but follows the same recipes (origin content). A customer in Tokyo orders at a Tokyo kitchen, not New York. Their meal arrives hot and fresh in minutes, not weeks.
CDNs work the same way. Your origin server is the central kitchen with the authoritative recipes. Your edge servers are the local kitchens, each serving their geographic neighborhood. When a user in Tokyo requests a video, it comes from a Tokyo edge server, not from your US origin. When a user in London makes a request, it routes to London. The “recipes” (your content and caching logic) remain identical everywhere—that consistency and reliability is the beauty of the model.
How CDN Requests Actually Flow
Let’s trace the journey of a user request through a CDN, step by step:
```
User in Singapore
        ↓
Browser makes HTTP request to your domain
        ↓
DNS resolution: "where should this request go?"
CDN's GeoDNS returns the nearest edge server IP (e.g., Singapore PoP)
        ↓
User connects to Singapore edge server
        ↓
Edge server checks its cache
  ├─ Cache HIT: Serve content immediately (10–50ms roundtrip)
  └─ Cache MISS: Fetch from origin or parent cache, store locally, then serve
        ↓
Content arrives at user
```
This is the fundamental flow, and it’s elegant. The critical piece is DNS resolution. When a user’s browser looks up your domain, the CDN intercepts that DNS query using a technique called GeoDNS or anycast routing. Instead of always returning the same IP (your origin), the CDN’s nameserver detects the user’s location (via their ISP’s DNS resolver IP) and returns the IP of the geographically nearest edge server. Some CDNs use anycast, where multiple data centers announce the same IP address, and BGP (Border Gateway Protocol) routes the user to the nearest one based on network topology.
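Real GeoDNS relies on IP-geolocation databases and BGP, but the core idea (return the IP of whichever PoP is closest to the asking resolver) can be sketched with a simple distance comparison. The PoP coordinates and IPs below are hypothetical:

```python
import math

# Hypothetical PoPs: (latitude, longitude) and the edge IP they announce.
POPS = {
    "singapore": ((1.35, 103.82), "203.0.113.10"),
    "virginia":  ((38.95, -77.45), "203.0.113.20"),
    "frankfurt": ((50.11, 8.68),   "203.0.113.30"),
}

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def resolve(resolver_location):
    """Return the edge IP of the PoP nearest to the user's DNS resolver."""
    coords, ip = min(POPS.values(),
                     key=lambda pop: haversine_km(pop[0], resolver_location))
    return ip

print(resolve((1.29, 103.85)))  # a Singapore resolver gets 203.0.113.10
```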
Cache hierarchies add another layer of efficiency. Not every edge server holds a full copy of your content. A typical hierarchy looks like this:
| Layer | Purpose | Size | Hit Rate |
|---|---|---|---|
| Edge (PoP) | Serve users | Small (100GB–1TB) | 85–95% |
| Shield/Mid-tier | Shared cache for multiple edges | Larger | 90–98% |
| Origin | Authoritative content source | Full | 100% |
When a user in a small city requests content, their local edge might miss. That miss goes to a regional shield cache serving five nearby cities. The shield has better hit rates because it aggregates requests from all those cities. If the shield also misses, only then does the request go back to the origin. This reduces origin load dramatically.
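In code, that hierarchy is just a chain of lookups where each layer stores the object on the way back down. A minimal sketch, with plain dictionaries standing in for real caches (no TTLs or eviction):

```python
def lookup(url, edge_cache, shield_cache, fetch_origin):
    """Two-tier lookup: edge -> shield -> origin."""
    if url in edge_cache:
        return edge_cache[url]       # edge HIT: serve from the PoP
    if url in shield_cache:
        body = shield_cache[url]     # shield HIT: no origin traffic at all
    else:
        body = fetch_origin(url)     # shield MISS: exactly one origin fetch
        shield_cache[url] = body     # now covers every edge behind this shield
    edge_cache[url] = body           # populate the edge for the next local user
    return body
```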
Cache keys determine what the CDN considers “the same resource.” By default, the cache key is the URL, but CDNs let you customize this. Should the cache key include query parameters? The Accept-Encoding header? User cookies? Getting cache keys right is crucial. Too broad a key and you’ll serve one user’s cached variant to another user who should see something different; too narrow and you’ll fragment the cache, multiplying misses and wasting storage.
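Here’s a sketch of a cache-key function, assuming for illustration that only the page and lang query parameters affect the response (everything else, like utm_source tracking parameters, is stripped so it can’t fragment the cache):

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

SIGNIFICANT_PARAMS = {"page", "lang"}  # assumption: only these change the response

def cache_key(url, accept_encoding=""):
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query)
                  if k in SIGNIFICANT_PARAMS)
    # Include the negotiated encoding so gzip and brotli variants cache separately.
    encoding = ("br" if "br" in accept_encoding
                else "gzip" if "gzip" in accept_encoding
                else "identity")
    return f"{parts.path}?{urlencode(kept)}#{encoding}"

# Both URLs map to one entry: utm_source is ignored and parameters are sorted.
assert cache_key("/list?lang=en&utm_source=ad&page=2", "gzip") == \
       cache_key("/list?page=2&lang=en", "gzip, deflate")
```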
TTL (Time To Live) controls how long an edge server keeps a cached object. You can set a default TTL (often 24 hours), but you can also set TTLs per object or per path. If your homepage changes hourly, give it a 1-hour TTL. If your product images change monthly, give them a 30-day TTL. Cache headers from your origin (like Cache-Control: max-age=3600) inform the CDN’s TTL decisions.
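One way to express this at the origin is a per-path table of Cache-Control values; the paths and TTLs below are illustrative, not a recommendation:

```python
# Longest matching path prefix wins; values are plain Cache-Control headers.
TTL_RULES = [
    ("/",        "max-age=3600"),                   # homepage: changes hourly
    ("/images/", "max-age=2592000"),                # product images: ~monthly
    ("/static/", "max-age=31536000, immutable"),    # content-hashed bundles
]

def cache_control_for(path):
    matches = [prefix for prefix, _ in TTL_RULES if path.startswith(prefix)]
    if not matches:
        return "max-age=60"                         # conservative default
    return dict(TTL_RULES)[max(matches, key=len)]

print(cache_control_for("/images/hero.png"))        # max-age=2592000 (30 days)
```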
Cache warming and prepopulation are techniques used for critical content. Instead of waiting for users to request an asset and triggering a cache miss, you proactively push the most important assets to all edge servers before launch. Netflix does this before a show premieres—they don’t want the first 10 million viewers to trigger origin fetches. You can warm caches via the CDN’s API or by running a scheduled job that requests key assets.
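A minimal warming script might look like this; cdn.example.com and the asset paths are hypothetical, and note the caveat in the comments: requests from one machine warm only the PoPs that serve that machine.

```python
import requests  # third-party HTTP client: pip install requests

CDN_HOST = "https://cdn.example.com"  # hypothetical CDN hostname
CRITICAL_ASSETS = ["/index.html", "/static/app.js", "/images/hero.png"]

def warm_cache():
    """Request each critical asset through the CDN so edges cache it before
    real users arrive. This only warms the PoPs serving *this* machine;
    warming every PoP needs the CDN's API or geographically spread workers."""
    for path in CRITICAL_ASSETS:
        resp = requests.get(CDN_HOST + path, timeout=10)
        # Many CDNs report hit/miss in a response header such as X-Cache.
        print(path, resp.status_code, resp.headers.get("X-Cache", "n/a"))

if __name__ == "__main__":
    warm_cache()
```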
Origin shielding protects your backend. When you enable shielding, the CDN adds an extra cache layer between the edge and your origin. Multiple edge servers don’t hammer your origin independently; they all route through a shared shield. This is a lifesaver during traffic spikes or when dealing with popular content.
Here’s a mermaid diagram showing a request flow with origin shielding:
```mermaid
graph TD
    A["User in Brazil"] -->|DNS resolves to nearest edge| B["São Paulo Edge Server"]
    B -->|Cache hit 90% of the time| C["Serve from cache"]
    B -->|Cache miss| D["Query Shield Cache<br/>South America region"]
    D -->|Shield hit 95% of the time| E["Fetch from shield"]
    D -->|Shield miss| F["Fetch from Origin<br/>us-east-1"]
    F --> G["Return to shield, edge, user"]
    E --> H["Return to edge, user"]
    C --> I["User receives content"]
    H --> I
    G --> I
```
Real-World CDN Usage: Case Studies
Setting up CloudFront for static content is straightforward. You create a distribution, point it to your S3 bucket (or any origin), and CloudFront immediately starts caching. Here’s a simplified config (the CachePolicyId references AWS’s managed CachingOptimized policy):
```json
{
  "DistributionConfig": {
    "Enabled": true,
    "Origins": [
      {
        "DomainName": "mybucket.s3.amazonaws.com",
        "Id": "myS3Origin",
        "S3OriginConfig": { "OriginAccessIdentity": "" }
      }
    ],
    "DefaultCacheBehavior": {
      "TargetOriginId": "myS3Origin",
      "ViewerProtocolPolicy": "redirect-to-https",
      "CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
      "Compress": true
    }
  }
}
```
Within minutes, your content is cached globally. A user in Mumbai requesting an image gets served from the Mumbai edge server’s cache, not from your us-east-1 bucket.
Netflix’s Open Connect is the canonical example of a purpose-built CDN. Netflix recognized that content streaming at scale required more than third-party CDNs could offer, so they built their own network of edge servers (called “Open Connect appliances”) installed directly inside ISP data centers worldwide. When you watch Netflix in Brazil, the video streams from a server inside your ISP’s network, not from Netflix’s origin in California. This reduces ISP backbone costs, improves video quality, and takes load entirely off the internet backbone. It’s overkill for most companies but brilliant for Netflix’s scale and use case.
A news website during a viral traffic spike shows CDN value in real time. A political story breaks, and traffic explodes from 100 requests/second to 100,000 requests/second. Without a CDN, your origin would buckle immediately. With a CDN, the edge servers absorb the spike. Requests hit the cache at a 95% rate because everyone’s requesting the same few articles. Your origin sees only 5,000 requests/second instead of 100,000, and it handles that fine. The difference between downtime and business as usual is often just a CDN.
Cache Invalidation: The Hard Part
You know what they say: “There are only two hard things in Computer Science: cache invalidation and naming things.” CDNs make cache invalidation tricky. You push an update to your origin, but thousands of edge servers are still serving the old cached version to users worldwide. How do you flush that cache?
Purging or invalidation is the answer. Most CDNs let you explicitly invalidate specific files or paths. You can invalidate /images/hero.png and wait for the change to propagate across all edges (usually under 60 seconds). You can also use versioning: instead of /images/hero.png, serve /images/hero-v2.png. Old URLs remain cached, new URLs get fresh content, and users gradually migrate to new assets as they reload pages. This avoids invalidation delays entirely.
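With AWS CloudFront, for example, an explicit invalidation is a single API call via boto3 (the distribution ID below is a placeholder, and CallerReference must be unique per request):

```python
import time
import boto3  # AWS SDK for Python

cloudfront = boto3.client("cloudfront")

response = cloudfront.create_invalidation(
    DistributionId="EDFDVBD6EXAMPLE",  # placeholder distribution ID
    InvalidationBatch={
        "Paths": {"Quantity": 1, "Items": ["/images/hero.png"]},
        "CallerReference": str(time.time()),  # must be unique per request
    },
)
print(response["Invalidation"]["Status"])  # "InProgress" until edges purge
```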
Stale content risks are real if you’re not careful. Imagine you patch a security vulnerability in your JavaScript bundle but forget to set a short TTL. Users spend a week running the vulnerable version. Conversely, if you set TTLs too short (5 minutes or less), you lose the benefits of caching and waste CDN resources. The balance is critical.
Trade-offs and Practical Considerations
Cost is a real concern. Major CDNs charge per GB transferred. For a small site with modest traffic, it’s negligible. For a video streaming company transferring petabytes monthly, CDN costs rival server costs. Some providers offer better pricing for “always-on” commitments; others charge per region. Cloudflare offers some services free; AWS CloudFront charges per-request and per-GB.
Complexity increases when you adopt a CDN. You now manage cache keys, TTLs, invalidation, health checks, and origin failover. You need monitoring to detect cache misses, stale content, and origin errors. You need a strategy for keeping edge caches warm. All of this is worth it at scale, but it’s not zero-overhead.
Privacy and data sovereignty matter for some use cases. Sending all your user requests through a third-party CDN means that provider sees your traffic patterns and user locations. Certain regulated industries (healthcare, finance) have data residency requirements. You might need to use a CDN that keeps data in specific geographic regions or manage your own edge servers for sensitive content.
When you don’t need a CDN: If your application is heavily dynamic (every response is personalized, never cacheable), a CDN helps less. If your users are geographically concentrated, adding global edge servers provides minimal benefit. If your content is small and bandwidth is cheap, the cost-benefit math might not work. That said, modern CDNs are so cheap and good that even small sites benefit from using one.
Key Takeaways
- CDNs reduce latency by serving content from edge servers near users instead of a single origin server thousands of miles away.
- CDNs work via DNS routing to the geographically nearest edge server, which caches content and serves hits in milliseconds.
- Cache hierarchies (edge → shield → origin) reduce origin load and improve hit rates through request aggregation.
- Cache keys, TTLs, and invalidation strategies are crucial for correctness and performance.
- Major providers (Cloudflare, AWS CloudFront, Akamai, Fastly) offer different strengths; choose based on use case and budget.
- CDNs now serve dynamic content and enable edge computing, not just static files.
- Cache invalidation, stale content risks, and privacy concerns are real trade-offs to manage.
Practice Scenarios
- The Speed Benchmark: You run a SaaS platform with users in North America and Europe. Without a CDN, users in Europe experience 200ms latency for page loads. You deploy CloudFront with a default 1-hour TTL for static assets. Measure the latency improvement and estimate the percentage of requests that hit the cache. What TTLs would you set for different asset types (hero images vs. avatars vs. product data)?
- The Cache Invalidation Puzzle: Your marketing team pushes a new ad banner image every 4 hours. Your origin has Cache-Control: max-age=86400 (24 hours). Users see stale banners for up to 24 hours after updates. Propose three strategies to fix this: short TTLs, versioned URLs, and explicit invalidation. What are the trade-offs of each?
- The Origin Shield Scenario: Your origin server is in us-east-1 and averages 1,000 requests/second. You’re planning a product launch that’ll spike traffic to 50,000 requests/second. Your CDN edges are distributed globally. Without origin shielding, estimate the origin load during the spike. With shielding, how does that change? Why is the shield valuable here?
Looking Ahead
We’ve now explored how CDNs distribute content globally using edge servers and intelligent routing. In the next section, “Edge Servers and Points of Presence,” we’ll dive deeper into what actually runs on edge servers, how operators decide where to place them, and how modern platforms like Cloudflare Workers enable computation at the edge, not just caching. You’ll see that CDNs are no longer just delivery networks—they’re becoming the distributed compute platform of the internet.