Compression Strategies
The 90% Savings No One Argues About
Your API returns a 500KB JSON response. A user on a slow mobile connection could wait 2+ seconds for that download. Then you enable gzip compression. The response shrinks to 50KB. Same data, 90% reduction in size. The same user now waits around 0.2 seconds.
For a service handling 1 million requests daily, that’s a 450GB bandwidth savings and proportional reduction in egress costs. Compression is one of the few optimizations that’s simultaneously cheap to implement, universally beneficial, and almost impossible to do wrong.
But not all compression is created equal. Gzip is great for most cases, but Brotli gives better ratios for static content. For real-time systems, LZ4 compresses faster at the cost of lower compression ratios. For databases, columnar compression beats row-level compression. Understanding when to use which algorithm is the difference between a well-optimized system and one that leaves bandwidth and latency on the table.
This section builds on network optimization (Chapter 108) by diving into the compression strategies that make those network savings concrete.
Lossless vs. Lossy: Understanding the Boundary
Lossless compression encodes information more efficiently while preserving every bit exactly. Text, code, JSON, databases—these require lossless compression. Algorithms include gzip, Brotli, Zstandard, LZ4, and Snappy. The math is elegant: if your data has patterns or repeated sequences, lossless compression finds and exploits them.
Lossy compression discards information deemed less important. JPEG throws away some color information your eye can’t perceive. WebP goes further, trading quality for size. Video codecs like H.264 drop temporal data. You use lossy compression when you’re willing to trade perfect fidelity for smaller size.
The boundary matters in system design: compressing a JSON API response requires lossless compression (users will notice if a field vanishes). Serving images to browsers can use lossy compression (users rarely care that JPEG discards some color data).
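The lossless guarantee is easy to see in code. A minimal sketch using Python's built-in zlib module (the same DEFLATE algorithm behind gzip); the payload is an invented example:
import zlib

data = b'{"user": "ada", "balance": 1024}' * 50  # invented, repetitive payload
compressed = zlib.compress(data)

# Lossless: every byte survives the round trip exactly.
assert zlib.decompress(compressed) == data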
Lossless Compression Algorithms: Speed vs. Ratio
When you decide to compress text data, you have choices. Each algorithm sits at a different point on the speed-to-compression-ratio spectrum.
gzip: The Universal Standard
gzip has dominated for two decades because it’s universally supported and hits a good balance between speed and compression ratio.
- Compression ratio: 70-80% for typical JSON (100KB becomes 20-30KB)
- Speed: Moderate-speed compression, fast decompression
- Support: Every browser, server, and client library supports gzip
- Best for: General-purpose HTTP compression, backward compatibility
- Overhead: Minimal—modern clients decompress transparently
Configuration (nginx):
gzip on;
gzip_types application/json text/css text/javascript application/javascript;
gzip_comp_level 6; # Balance speed and compression (1=fast, 9=best ratio)
gzip_min_length 1000; # Don't compress tiny responses
The gzip_comp_level parameter is the key trade-off: level 1 compresses very quickly but achieves only around 60% reduction, while level 9 achieves around 80% reduction and is 3-4x slower to compress. Decompression speed is essentially independent of the level.
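To see the trade-off concretely, here is a minimal sketch using Python's zlib (the DEFLATE algorithm underlying gzip); the payload is an invented, repetitive JSON document, and the exact numbers will vary with your data:
import json
import time
import zlib

# Invented payload: repetitive JSON, similar in shape to a typical API response.
payload = json.dumps(
    [{"id": i, "status": "active", "region": "us-east-1"} for i in range(2000)]
).encode()

for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    saved = 100 * (1 - len(compressed) / len(payload))
    print(f"level {level}: {len(compressed)} bytes ({saved:.0f}% smaller) in {elapsed_ms:.2f} ms")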
Brotli: Better Compression, Slower
Brotli was developed by Google specifically to compress text better than gzip, especially for static assets. It’s becoming the standard for content delivery networks and modern browsers.
- Compression ratio: 80-88% for typical JSON (better than gzip)
- Speed: Slower compression, similar decompression to gzip
- Support: Modern browsers and servers support it; older clients don’t
- Best for: Static assets (JS bundles, CSS, HTML), CDN edge caching
- Overhead: Higher CPU cost during compression, negligible during decompression
Decision point: Use Brotli for static assets that are compressed once and served many times. Use gzip for dynamic API responses where compression happens on every request.
nginx configuration:
gzip on;
gzip_types application/json text/css text/javascript;
gzip_comp_level 6;
brotli on;
brotli_types application/json text/css text/javascript;
brotli_comp_level 6;
The server will use Brotli if the client advertises support (Accept-Encoding: br), and fall back to gzip otherwise.
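For static assets, the usual pattern is to pre-compress at build time so the server never pays the compression cost per request. A minimal sketch, assuming the third-party brotli Python package and a hypothetical dist/ build directory; serving the .br and .gz variants directly is then a matter of enabling the static-file modules (gzip_static, and brotli_static from ngx_brotli):
import gzip
from pathlib import Path

import brotli  # assumption: the third-party 'brotli' package is installed

ASSET_DIR = Path("dist")  # hypothetical build output directory

for asset in ASSET_DIR.rglob("*"):
    if asset.suffix not in {".js", ".css", ".html", ".svg"}:
        continue
    data = asset.read_bytes()
    # Maximum settings are fine here: compression runs once, at build time.
    asset.with_name(asset.name + ".br").write_bytes(brotli.compress(data, quality=11))
    asset.with_name(asset.name + ".gz").write_bytes(gzip.compress(data, compresslevel=9))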
Zstandard: The Modern Sweet Spot
Zstandard (zstd) is newer and offers the best compression-to-speed trade-off for many use cases. It’s faster than Brotli but compresses nearly as well.
- Compression ratio: 75-85% (between gzip and Brotli)
- Speed: Fast compression and decompression
- Support: Growing but not yet ubiquitous
- Best for: Real-time systems, services where compression happens frequently, modern infrastructure
- Overhead: Moderate CPU, good performance at both ends
Zstandard is seeing adoption in databases, message queues, and modern APIs because it doesn’t force you to choose between speed and compression ratio.
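A minimal round-trip sketch, assuming the third-party zstandard Python bindings; the sample file path is a placeholder:
from pathlib import Path

import zstandard as zstd  # assumption: the 'zstandard' package is installed

payload = Path("response.json").read_bytes()  # placeholder sample file

# Level 3 is the library default; it already sits between gzip and Brotli on
# ratio while compressing and decompressing quickly.
compressor = zstd.ZstdCompressor(level=3)
decompressor = zstd.ZstdDecompressor()

compressed = compressor.compress(payload)
assert decompressor.decompress(compressed) == payload
print(f"{len(payload)} -> {len(compressed)} bytes")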
LZ4 and Snappy: Speed Kings
When you need compression for real-time systems where latency is more important than disk space, LZ4 and Snappy shine.
- Compression ratio: 50-60% (worse than gzip, but acceptable)
- Speed: Extremely fast compression and decompression (microseconds)
- Support: Less universal; used in internal systems, databases, message queues
- Best for: Internal service communication, message queue payloads, real-time systems
- Overhead: Minimal CPU impact
You’ll see LZ4 in Kafka brokers (producer compresses messages, consumer decompresses rapidly) and in high-frequency trading systems where latency trumps bandwidth.
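A minimal sketch of the speed profile, assuming the third-party lz4 Python package; the message is an invented market-data payload, and timings will vary by machine:
import time

import lz4.frame  # assumption: the 'lz4' package is installed

# Invented payload: a small, repetitive market-data message.
message = b'{"symbol": "ACME", "bid": 101.25, "ask": 101.27}' * 100

start = time.perf_counter()
compressed = lz4.frame.compress(message)
compress_us = (time.perf_counter() - start) * 1_000_000

assert lz4.frame.decompress(compressed) == message
print(f"{len(message)} -> {len(compressed)} bytes, compressed in {compress_us:.0f} us")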
Where Compression Applies: A System-Wide Perspective
Compression isn’t just for HTTP responses. Think of it as a layer that can sit almost anywhere data moves through your system.
HTTP Response Compression
Standard practice—servers apply compression before sending responses to clients.
Client: GET /api/users
Accept-Encoding: gzip, br, zstd
Server: 200 OK
Content-Encoding: gzip
Content-Length: 45000 (was 500000 before compression)
[compressed data]
Modern frameworks handle this automatically (see the sketch after this checklist), but you need to:
- Enable compression on your web server (nginx, Apache, etc.)
- Set appropriate compression levels (avoid level 9 on dynamic responses—too slow)
- Exclude already-compressed formats (images, video, encrypted data)
- Set minimum payload sizes (don’t compress 100-byte responses—overhead exceeds savings)
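If compression lives in the application rather than the web server, the same checklist applies. A minimal sketch, assuming a FastAPI/Starlette service; the endpoint and payload are placeholders:
from fastapi import FastAPI  # assumption: fastapi and starlette are installed
from starlette.middleware.gzip import GZipMiddleware

app = FastAPI()

# Skip responses under 1KB and use a moderate level; clients that don't send
# "Accept-Encoding: gzip" receive the uncompressed body.
# (compresslevel requires a reasonably recent Starlette version.)
app.add_middleware(GZipMiddleware, minimum_size=1000, compresslevel=6)

@app.get("/api/users")
def list_users() -> list[dict]:
    # Placeholder handler returning a large, repetitive JSON payload.
    return [{"id": i, "status": "active"} for i in range(5000)]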
Database Compression
Databases can compress data at multiple levels:
- Page-level compression (MySQL InnoDB): Compress disk pages at rest. Decompression happens automatically when pages are read into memory.
- Column compression (PostgreSQL TOAST): Store large text columns separately, compressed.
- Columnar compression (columnar formats like Parquet and analytical databases): Store data by column instead of by row, so each column's similar values compress together, with the codec tuned per column.
Example (PostgreSQL 14+; TOAST compression is set per column, 'payload' is a placeholder column name):
ALTER TABLE large_table ALTER COLUMN payload SET COMPRESSION lz4; -- requires a server built with LZ4 support; pglz is always available
-- Applies only to newly written values; existing values keep their previous compression method until the data is rewritten (e.g., by a dump and restore)
This reduces storage, but adds CPU overhead on write. For read-heavy analytical workloads, the trade-off is worthwhile.
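Columnar compression is easiest to see with a file-based example. A minimal sketch, assuming the pyarrow package; the table contents are invented:
import pyarrow as pa  # assumption: the 'pyarrow' package is installed
import pyarrow.parquet as pq

# Invented analytical table: a columnar layout groups similar values together,
# which compresses far better than a row-oriented layout.
table = pa.table({
    "region": ["us-east-1"] * 100_000,   # low-cardinality column, compresses extremely well
    "latency_ms": list(range(100_000)),
})

# The codec is chosen per file here; pyarrow also accepts a per-column mapping.
pq.write_table(table, "metrics.parquet", compression="zstd")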
Message Queue Compression
Kafka producers can compress messages before sending to brokers:
# Producer config
compression.type=snappy
Options: lz4, snappy, zstd, gzip. Brokers store messages compressed; consumers decompress automatically. This reduces broker disk usage and network transfer between brokers.
Trade-off: CPU overhead on producers and consumers vs. broker storage savings. For high-throughput topics, using Snappy or LZ4 (fast) is better than gzip.
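A minimal producer-side sketch, assuming the kafka-python client; the broker address and topic are placeholders (Snappy support additionally needs the python-snappy package):
import json

from kafka import KafkaProducer  # assumption: the 'kafka-python' package is installed

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",   # placeholder broker address
    compression_type="snappy",            # or "lz4", "zstd", "gzip"
    linger_ms=20,                         # brief batching delay -> larger batches -> better ratios
    value_serializer=lambda v: json.dumps(v).encode(),
)

# Brokers store the batch compressed; consumers decompress transparently.
producer.send("events", {"user_id": 42, "action": "checkout"})
producer.flush()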
Storage Compression
Filesystems and object stores can compress data:
- ZFS compression: zfs set compression=lz4 pool/dataset
- S3 object compression: Upload pre-compressed objects with Content-Encoding: gzip metadata; S3 stores the bytes as-is, and HTTP clients decompress on download
- Archive compression: Old logs and backups compressed before long-term storage
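A minimal upload sketch, assuming boto3 with AWS credentials configured elsewhere; the bucket and key are placeholders. S3 stores exactly the bytes you upload, so the Content-Encoding metadata is what tells downstream HTTP clients to decompress:
import gzip
import json

import boto3  # assumption: boto3 is installed and credentials are configured

s3 = boto3.client("s3")
body = json.dumps({"event": "export", "rows": list(range(10_000))}).encode()

s3.put_object(
    Bucket="my-archive-bucket",            # placeholder bucket name
    Key="exports/2024/export.json.gz",     # placeholder key
    Body=gzip.compress(body),
    ContentType="application/json",
    ContentEncoding="gzip",
)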
When NOT to Compress
Counter-intuitively, compression isn’t always beneficial.
Already-Compressed Data
JPEG images, MP4 videos, and encrypted data don’t compress further. Applying gzip to a PNG adds CPU overhead with minimal size reduction.
File: image.png (500KB)
After gzip: 480KB (4% reduction)
Wasted CPU cycles: Not worth it
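You can verify this yourself; a minimal sketch that uses high-entropy random bytes as a stand-in for already-compressed or encrypted data:
import gzip
import os

already_dense = os.urandom(500_000)  # stand-in for JPEG/MP4/encrypted bytes
recompressed = gzip.compress(already_dense)

# High-entropy input: the "compressed" output comes out slightly larger.
print(len(already_dense), len(recompressed))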
Modern web servers automatically exclude common formats:
gzip_types application/json text/css text/javascript;
# Images, video, compressed archives excluded by default
Very Small Payloads
Compression has overhead: algorithm initialization, Huffman tables, and header/trailer bytes. For payloads under 1KB, that overhead can exceed the savings.
Payload: 800 bytes
Gzip overhead: ~200 bytes
After gzip: ~600 bytes (25% reduction)
Network time saved: negligible
CPU wasted: unnecessary
Set gzip_min_length to exclude small responses:
gzip_min_length 1000; # Only compress responses over 1KB
Image Optimization: The Lossy Frontier
Images are often 50-80% of page weight. Lossy compression here yields massive savings.
Format Wars: JPEG vs. WebP vs. AVIF
| Format | Quality | File Size | Browser Support | Best For |
|---|---|---|---|---|
| JPEG | Good | Medium | 100% | Photos, existing standard |
| WebP | Excellent | Small (25-30% smaller than JPEG) | ~95% | Modern browsers |
| AVIF | Excellent | Smallest (50% smaller than JPEG) | ~80% | Progressive enhancement |
| PNG | Lossless | Large | 100% | Icons, transparency |
| SVG | Scalable | Small | 100% | Icons, diagrams, logos |
Modern approach: Serve AVIF or WebP to browsers that support them, with JPEG as the fallback. The browser uses the first <source> it supports, so list the most efficient format first.
<picture>
<source srcset="image.avif" type="image/avif">
<source srcset="image.webp" type="image/webp">
<img src="image.jpg" alt="Description">
</picture>
Responsive Images: Right Size for Right Device
Sending a 2MB image for desktop to a mobile user on 3G is wasteful.
<img
srcset="small.jpg 480w, medium.jpg 1024w, large.jpg 2048w"
sizes="(max-width: 480px) 100vw, (max-width: 1024px) 80vw, 1024px"
src="medium.jpg"
alt="Product"
>
The browser downloads the image that matches the viewport size and pixel density.
Lazy Loading Images
Load images only when they’re about to enter the viewport.
<img src="placeholder.jpg" loading="lazy" alt="Product">
Native lazy loading (supported in roughly 95% of browsers) defers image loading until the image is close to entering the viewport; the exact distance threshold is browser-dependent and varies with connection speed.
Compression Trade-offs: The CPU-Bandwidth Spectrum
Every compression decision is a trade-off:
| Scenario | Recommendation | Reasoning |
|---|---|---|
| Static assets (CDN cached) | Brotli, level 11 | Compressed once, served many times; CPU cost amortized |
| Dynamic API responses | gzip, level 6 | Fast compression on every request; good ratio |
| Real-time systems | LZ4 or Snappy | Latency critical; 50% compression acceptable |
| Kafka topics | Snappy or LZ4 | Speed matters; compression ratio secondary |
| Database storage | zstd or pglz | Once-written, many-times-read; CPU cost acceptable |
| Already-compressed data | None | Don’t waste CPU on JPEG, video, encrypted data |
Pro tip: Monitor CPU usage on your compression tier. If gzip is consuming 20% of CPU, consider switching to zstd or LZ4: you give up some compression ratio, and the CPU you free up can absorb additional request processing.
Encryption and Compression: A Security Gotcha
Critical warning: compression can undermine encryption. When attacker-controlled input is compressed together with a secret and the result is then encrypted, the ciphertext length reveals how well the two compressed together, and that is enough to recover the secret one piece at a time.
The CRIME and BREACH attacks demonstrated this: an attacker injects guesses (say, fragments of Authorization: Bearer secret123) into a compressed request or response and observes whether the compressed size shrinks, proving the guess matched part of the real token. HTTP Content-Encoding is applied before TLS encrypts the payload, so standard response compression offers no protection here.
Compressing after encryption is not a useful workaround either: well-encrypted data looks random and barely compresses at all.
Safe: compressing static, public content and responses that contain no secrets
Risky: compressing responses that mix secrets (session cookies, CSRF tokens, bearer tokens) with attacker-influenced content
Practical mitigations: disable TLS-level compression (modern TLS stacks already do), exclude sensitive endpoints from response compression, and mask or rotate tokens per request.
Key Takeaways
- Compression is high-ROI: 80-90% size reduction with minimal code changes. Implement this before almost any other optimization.
- Algorithm choice matters: gzip for compatibility, Brotli for static assets, zstd for modern systems, LZ4/Snappy for real-time. Measure your use case.
- Static vs. dynamic: Pre-compress static assets (Brotli at its maximum level, 11). Compress dynamic responses with moderate settings (gzip level 6). The CPU cost profiles are very different.
- Images dominate page weight: Lossy compression (WebP, AVIF) saves 50% or more. Responsive images and lazy loading complete the picture.
- Compression has limits: Don't compress already-compressed data. Don't compress payloads under 1KB. Don't compress secrets alongside attacker-controlled input.
- Measure CPU cost: Compression trades CPU for bandwidth. If CPU is your bottleneck, choose faster algorithms. If bandwidth is the constraint, choose better compression.
Practice Scenarios
Scenario 1: Your API serves 100MB of traffic daily, mostly JSON responses. You’re on a budget and need to reduce bandwidth costs immediately. Would you implement gzip, Brotli, or Zstandard first? Why? What would you measure to decide?
Scenario 2: Your Kafka cluster handles 1 million messages daily, each 5KB. You need to reduce broker disk usage and network transfer between data centers. Message latency is under 100ms and not a constraint. Which compression algorithm would you choose (lz4, snappy, zstd, gzip) and why?
Scenario 3: Your e-commerce site loads 150 product images on the homepage. Mobile users are abandoning the site due to slow load times. You’re already using HTTP/2. What’s your first compression optimization (image format, responsive images, lazy loading) and why?
Up Next
We’ve optimized the network path and reduced payload sizes. But what about resources that aren’t critical right now? In the next section, we explore lazy loading and prefetching—controlling when resources are loaded to optimize both perceived and actual performance.