System Design Fundamentals

Compression Strategies


The 90% Savings No One Argues About

Your API returns a 500KB JSON response. A user on a slow connection (roughly 2 Mbps) waits 2+ seconds for that download. Then you enable gzip compression. The response shrinks to 50KB. Same data, 90% reduction in size. The user now waits about 0.2 seconds.

For a service handling 1 million requests daily, that’s a 450GB bandwidth savings and proportional reduction in egress costs. Compression is one of the few optimizations that’s simultaneously cheap to implement, universally beneficial, and almost impossible to do wrong.
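
The arithmetic behind that figure is worth a quick sanity check (a throwaway Python sketch using the numbers from the example above):

# Back-of-envelope: savings from shrinking a 500KB response to 50KB
requests_per_day = 1_000_000
saved_per_response_kb = 500 - 50                               # 450KB saved per response
saved_per_day_gb = requests_per_day * saved_per_response_kb / 1_000_000
print(f"{saved_per_day_gb:.0f} GB of egress saved per day")   # ~450 GB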

But not all compression is created equal. Gzip is great for most cases, but Brotli gives better ratios for static content. For real-time systems, LZ4 compresses faster at the cost of lower compression ratios. For databases, columnar compression beats row-level compression. Understanding when to use which algorithm is the difference between a well-optimized system and one that leaves bandwidth and latency on the table.

This section builds on network optimization (Chapter 108) by diving into the compression strategies that make those network savings concrete.


Lossless vs. Lossy: Understanding the Boundary

Lossless compression encodes information more efficiently while preserving every bit exactly. Text, code, JSON, databases—these require lossless compression. Algorithms include gzip, Brotli, Zstandard, LZ4, and Snappy. The math is elegant: if your data has patterns or repeated sequences, lossless compression finds and exploits them.

Lossy compression discards information deemed less important. JPEG throws away some color information your eye can’t perceive. WebP goes further, trading quality for size. Video codecs like H.264 drop temporal data. You use lossy compression when you’re willing to trade perfect fidelity for smaller size.

The boundary matters in system design: compressing a JSON API response requires lossless compression (users will notice if a field vanishes). Serving images to browsers can use lossy compression (users rarely care that JPEG discards some color data).
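
You can see the lossless guarantee directly with Python's standard-library gzip module (a minimal sketch; the payload is made up):

import gzip
import json

payload = json.dumps({"user_id": 42, "roles": ["admin", "editor"] * 50}).encode()
compressed = gzip.compress(payload)

# Lossless: decompression restores every byte exactly
assert gzip.decompress(compressed) == payload
print(f"original={len(payload)} bytes, compressed={len(compressed)} bytes")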


Lossless Compression Algorithms: Speed vs. Ratio

When you decide to compress text data, you have choices. Each algorithm sits at a different point on the speed-to-compression-ratio spectrum.

gzip: The Universal Standard

gzip has dominated for two decades because it’s universally supported and hits a good balance between speed and compression ratio.

  • Compression ratio: 70-80% for typical JSON (100KB becomes 20-30KB)
  • Speed: Fast compression, moderate-speed decompression
  • Support: Every browser, server, and client library supports gzip
  • Best for: General-purpose HTTP compression, backward compatibility
  • Overhead: Minimal—modern clients decompress transparently

Configuration (nginx):

gzip on;
gzip_types application/json text/css text/javascript application/javascript;
gzip_comp_level 6;  # Balance speed and compression (1=fast, 9=best ratio)
gzip_min_length 1000;  # Don't compress tiny responses

The gzip_comp_level parameter is the key trade-off: level 1 compresses very quickly but achieves only around 60% reduction, while level 9 achieves around 80% but is 3-4x slower to compress (decompression speed is roughly the same at any level).
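
You can measure that trade-off yourself with Python's zlib module, which implements the same DEFLATE algorithm gzip uses; the exact numbers depend entirely on your data, so treat the output as illustrative:

import time
import zlib

# Repetitive JSON-like payload; real API responses will compress differently
data = b'{"id": 123, "name": "example", "active": true}' * 2000

for level in (1, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    ratio = 100 * (1 - len(out) / len(data))
    print(f"level {level}: {ratio:.1f}% reduction in {elapsed_ms:.2f} ms")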

Brotli: Better Compression, Slower

Brotli was developed by Google specifically to compress text better than gzip, especially for static assets. It’s becoming the standard for content delivery networks and modern browsers.

  • Compression ratio: 80-88% for typical JSON (better than gzip)
  • Speed: Slower compression, similar decompression to gzip
  • Support: Modern browsers and servers support it; older clients don’t
  • Best for: Static assets (JS bundles, CSS, HTML), CDN edge caching
  • Overhead: Higher CPU cost during compression, negligible during decompression

Decision point: Use Brotli for static assets that are compressed once and served many times. Use gzip for dynamic API responses where compression happens on every request.

nginx configuration:

gzip on;
gzip_types application/json text/css text/javascript;
gzip_comp_level 6;

brotli on;
brotli_types application/json text/css text/javascript;
brotli_comp_level 6;

The server will use Brotli if the client advertises support (Accept-Encoding: br), and fall back to gzip otherwise.
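
If you ever implement that negotiation yourself instead of delegating it to nginx, the core logic is just "pick the best encoding the client advertises." A minimal sketch (the preference order and function name are illustrative, and q-values are ignored for brevity):

# Pick the best Content-Encoding the client supports, preferring Brotli
PREFERENCE = ["br", "zstd", "gzip"]

def choose_encoding(accept_encoding: str) -> str:
    offered = {token.split(";")[0].strip() for token in accept_encoding.split(",")}
    for encoding in PREFERENCE:
        if encoding in offered:
            return encoding
    return "identity"  # no compression

print(choose_encoding("gzip, br, zstd"))   # br
print(choose_encoding("gzip, deflate"))    # gzip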

Zstandard: The Modern Sweet Spot

Zstandard (zstd) is newer and offers the best compression-to-speed trade-off for many use cases. It’s faster than Brotli but compresses nearly as well.

  • Compression ratio: 75-85% (between gzip and Brotli)
  • Speed: Fast compression and decompression
  • Support: Growing but not yet ubiquitous
  • Best for: Real-time systems, services where compression happens frequently, modern infrastructure
  • Overhead: Moderate CPU, good performance at both ends

Zstandard is seeing adoption in databases, message queues, and modern APIs because it doesn’t force you to choose between speed and compression ratio.
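
In Python, zstd is exposed by the third-party zstandard package (assumed installed here via pip install zstandard). A minimal roundtrip sketch:

import zstandard  # third-party: pip install zstandard

data = b'{"event": "page_view", "user": 42}' * 1000

compressed = zstandard.ZstdCompressor(level=3).compress(data)   # level 3 is the library default
restored = zstandard.ZstdDecompressor().decompress(compressed)

assert restored == data
print(f"{len(data)} bytes -> {len(compressed)} bytes")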

LZ4 and Snappy: Speed Kings

When you need compression for real-time systems where latency is more important than disk space, LZ4 and Snappy shine.

  • Compression ratio: 50-60% (worse than gzip, but acceptable)
  • Speed: Extremely fast compression and decompression (microseconds)
  • Support: Less universal; used in internal systems, databases, message queues
  • Best for: Internal service communication, message queue payloads, real-time systems
  • Overhead: Minimal CPU impact

You’ll see LZ4 in Kafka brokers (producer compresses messages, consumer decompresses rapidly) and in high-frequency trading systems where latency trumps bandwidth.
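
The same roundtrip with the third-party lz4 package (assumed installed via pip install lz4) shows the trade in action: a lower ratio than gzip, but compression and decompression cost very little:

import lz4.frame  # third-party: pip install lz4

data = b'{"symbol": "ACME", "bid": 101.25, "ask": 101.27}' * 1000

compressed = lz4.frame.compress(data)
assert lz4.frame.decompress(compressed) == data
print(f"{len(data)} bytes -> {len(compressed)} bytes")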


Where Compression Applies: A System-Wide Perspective

Compression isn’t just for HTTP responses. Think of it as a layer that can sit almost anywhere data moves through your system.

HTTP Response Compression

Standard practice—servers apply compression before sending responses to clients.

Client: GET /api/users
  Accept-Encoding: gzip, br, zstd
Server: 200 OK
  Content-Encoding: gzip
  Content-Length: 45000  (was 500000 before compression)
  [compressed data]

Modern frameworks handle this automatically, but you still need to do the following (a minimal sketch follows this list):

  1. Enable compression on your web server (nginx, Apache, etc.)
  2. Set appropriate compression levels (avoid level 9 on dynamic responses—too slow)
  3. Exclude already-compressed formats (images, video, encrypted data)
  4. Set minimum payload sizes (don’t compress 100-byte responses—overhead exceeds savings)
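
As a concrete illustration of points 2 through 4, here is a minimal, framework-agnostic sketch of that logic in Python (function name, type list, and threshold are illustrative; real middleware also has to handle streaming bodies, caching, and the Vary header):

import gzip

COMPRESSIBLE_TYPES = {"application/json", "text/html", "text/css", "text/javascript"}
MIN_LENGTH = 1000  # don't compress tiny responses

def maybe_compress(body: bytes, content_type: str, accept_encoding: str):
    """Return (body, extra_headers); compress only when it's worthwhile."""
    if (
        "gzip" in accept_encoding
        and content_type in COMPRESSIBLE_TYPES
        and len(body) >= MIN_LENGTH
    ):
        # Level 6 keeps per-request CPU cost moderate for dynamic responses
        return gzip.compress(body, compresslevel=6), {"Content-Encoding": "gzip"}
    return body, {}

body, headers = maybe_compress(b'{"users": []}' * 200, "application/json", "gzip, br")
print(headers, len(body))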

Database Compression

Databases can compress data at multiple levels:

  • Page-level compression (MySQL InnoDB): Compress disk pages at rest. Decompression happens automatically when pages are read into memory.
  • Column compression (PostgreSQL TOAST): Store large text columns separately, compressed.
  • Columnar compression (analytical stores and file formats such as Parquet): Store data by column instead of row, so each column can use a compression scheme suited to its data type.

Example (PostgreSQL):

-- PostgreSQL 14+: TOAST compression is set per column (the column name is illustrative)
ALTER TABLE large_table ALTER COLUMN payload SET COMPRESSION pglz;
-- Existing values keep their previous compression until they are rewritten

This reduces storage, but adds CPU overhead on write. For read-heavy analytical workloads, the trade-off is worthwhile.

Message Queue Compression

Kafka producers can compress messages before sending to brokers:

# Producer config
compression.type=snappy

Options: lz4, snappy, zstd, gzip. Brokers store messages compressed; consumers decompress automatically. This reduces broker disk usage and network transfer between brokers.

Trade-off: CPU overhead on producers and consumers vs. broker storage and network savings. For high-throughput topics, a fast codec like Snappy or LZ4 is usually a better fit than gzip.
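
Programmatically, the choice is a single producer setting. A sketch using the third-party kafka-python client (assumed installed, along with the lz4 package its lz4 codec relies on; the broker address and topic are illustrative):

from kafka import KafkaProducer  # third-party: pip install kafka-python lz4

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",   # illustrative address
    compression_type="lz4",               # fast codec; compression ratio is secondary here
)
producer.send("events", b'{"order_id": 123, "status": "shipped"}')
producer.flush()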

Storage Compression

Filesystems and object stores can compress data:

  • ZFS compression: zfs set compression=lz4 pool/dataset
  • S3 object compression: Upload pre-compressed objects (e.g., gzipped, with Content-Encoding: gzip set as object metadata); S3 stores them as-is and clients decompress on download
  • Archive compression: Old logs and backups compressed before long-term storage

When NOT to Compress

Counter-intuitively, compression isn’t always beneficial.

Already-Compressed Data

JPEG images, MP4 videos, and encrypted data don’t compress further. Applying gzip to a PNG adds CPU overhead with minimal size reduction.

File: image.png (500KB)
After gzip: 480KB (4% reduction)
Wasted CPU cycles: Not worth it

Web servers typically compress only the MIME types you list, so already-compressed formats are left alone:

gzip_types application/json text/css text/javascript;
# Only the listed types are compressed; images, video, and archives are skipped

Very Small Payloads

Compression has overhead: headers and trailers, block framing, Huffman tables, and the CPU cost of running the algorithm at all. For payloads under roughly 1KB, that overhead can exceed the savings.

Payload: 800 bytes
After gzip: ~600 bytes at best (roughly 20 bytes of gzip framing eat into the savings)
Network time saved: negligible
CPU spent: wasted
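
You can verify the effect directly; for a short JSON body, gzip framing and block overhead eat most of the benefit, and the output can even be larger than the input:

import gzip

small = b'{"status": "ok", "id": 7}'             # 25 bytes
print(len(small), len(gzip.compress(small)))     # compressed output is larger here

bigger = b'{"status": "ok", "id": 7}' * 100      # ~2.5KB of repetitive JSON
print(len(bigger), len(gzip.compress(bigger)))   # now compression clearly pays off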

Set gzip_min_length to exclude small responses:

gzip_min_length 1000;  # Only compress responses over 1KB

Image Optimization: The Lossy Frontier

Images are often 50-80% of page weight. Lossy compression here yields massive savings.

Format Wars: JPEG vs. WebP vs. AVIF

Format | Quality   | File Size                        | Browser Support | Best For
JPEG   | Good      | Medium                           | 100%            | Photos, existing standard
WebP   | Excellent | Small (25-30% smaller than JPEG) | ~95%            | Modern browsers
AVIF   | Excellent | Smallest (50% smaller than JPEG) | ~80%            | Progressive enhancement
PNG    | Lossless  | Large                            | 100%            | Icons, transparency
SVG    | Scalable  | Small                            | 100%            | Icons, diagrams, logos

Modern approach: Serve AVIF or WebP to browsers that support them, with JPEG as the fallback. Source order matters: the browser uses the first <source> it supports, so list the best format first.

<picture>
  <source srcset="image.avif" type="image/avif">
  <source srcset="image.webp" type="image/webp">
  <img src="image.jpg" alt="Description">
</picture>

Responsive Images: Right Size for Right Device

Sending a 2MB image for desktop to a mobile user on 3G is wasteful.

<img
  srcset="small.jpg 480w, medium.jpg 1024w, large.jpg 2048w"
  sizes="(max-width: 480px) 100vw, (max-width: 1024px) 80vw, 1024px"
  src="medium.jpg"
  alt="Product"
>

The browser downloads the image that matches the viewport size and pixel density.

Lazy Loading Images

Load images only when they’re about to enter the viewport.

<img src="placeholder.jpg" loading="lazy" alt="Product">

Native lazy loading (supported in roughly 95% of browsers) defers image loading until the image is close to entering the viewport; the exact distance threshold is browser-dependent.


Compression Trade-offs: The CPU-Bandwidth Spectrum

Every compression decision is a trade-off:

Scenario                   | Recommendation   | Reasoning
Static assets (CDN cached) | Brotli, level 11 | Compressed once, served many times; CPU cost amortized
Dynamic API responses      | gzip, level 6    | Fast compression on every request; good ratio
Real-time systems          | LZ4 or Snappy    | Latency critical; ~50% compression acceptable
Kafka topics               | Snappy or LZ4    | Speed matters; compression ratio secondary
Database storage           | zstd or pglz     | Written once, read many times; CPU cost acceptable
Already-compressed data    | None             | Don't waste CPU on JPEG, video, encrypted data

Pro tip: Monitor CPU usage on your compression tier. If gzip is consuming 20% of your CPU, consider switching to zstd or LZ4: you give up a little compression ratio, and the freed CPU can go toward serving additional requests.


Encryption and Compression: A Security Gotcha

Critical warning: Be careful when compressing data that will be encrypted. Encryption hides content but not length, so the compressed size leaks information about the plaintext, especially when secrets and attacker-controlled input are compressed together.

The CRIME and BREACH attacks demonstrated this: an attacker injects guesses into a request or page and watches the encrypted size. When a guess matches part of a secret (a session cookie, a CSRF token), compression deduplicates the repeated bytes and the payload shrinks, confirming the guess piece by piece. Avoid compressing HTTPS request or response bodies that mix secrets with attacker-controlled input.

Correct approach: When you control the pipeline (for example, application-level encryption of payloads), encrypt without compressing anything that carries secrets alongside untrusted input. Compressing after encryption is pointless anyway, because ciphertext looks random and doesn't compress.

Safe:   plaintext → encrypt (no compression of secret-bearing data)
Risky:  secrets + attacker-controlled input → compress → encrypt (the size leaks the secret)

Note that standard HTTP response compression (Content-Encoding) is applied before TLS encryption, which is exactly what BREACH exploits. Practical mitigations include disabling compression for responses that reflect user input next to secrets, and masking per-request tokens such as CSRF tokens.
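
The leak is easy to demonstrate with zlib: when attacker-controlled bytes match a secret elsewhere in the same compressed stream, the output typically shrinks, and that size difference survives encryption. The token and guesses below are purely illustrative:

import zlib

SECRET = b"csrf_token=9f86d081884c7d65"

def compressed_size(attacker_guess: bytes) -> int:
    # Secret and attacker-controlled bytes compressed together, as in BREACH
    return len(zlib.compress(SECRET + b"&q=" + attacker_guess))

print(compressed_size(b"csrf_token=9f86d081"))   # matching prefix: typically smaller
print(compressed_size(b"csrf_token=00000000"))   # wrong guess: typically larger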


Key Takeaways

  1. Compression is high-ROI: 80-90% size reduction with minimal code changes. Implement this before almost any other optimization.

  2. Algorithm choice matters: gzip for compatibility, Brotli for static assets, zstd for modern systems, LZ4/Snappy for real-time. Measure your use case.

  3. Static vs. dynamic: Pre-compress static assets at the maximum setting (Brotli level 11). Dynamically compress responses with moderate settings (gzip level 6). The CPU cost profile is different.

  4. Images dominate page weight: Lossy compression (WebP, AVIF) saves 50% or more. Responsive images and lazy loading complete the picture.

  5. Compression has limits: Don’t compress already-compressed data. Don’t compress payloads under 1KB. Don’t compress before encryption.

  6. Measure CPU cost: Compression trades CPU for bandwidth. If CPU is your bottleneck, choose faster algorithms. If bandwidth is the constraint, choose better compression.


Practice Scenarios

Scenario 1: Your API serves 100MB of traffic daily, mostly JSON responses. You’re on a budget and need to reduce bandwidth costs immediately. Would you implement gzip, Brotli, or Zstandard first? Why? What would you measure to decide?

Scenario 2: Your Kafka cluster handles 1 million messages daily, each 5KB. You need to reduce broker disk usage and network transfer between data centers. Message latency is under 100ms and not a constraint. Which compression algorithm would you choose (lz4, snappy, zstd, gzip) and why?

Scenario 3: Your e-commerce site loads 150 product images on the homepage. Mobile users are abandoning the site due to slow load times. You’re already using HTTP/2. What’s your first compression optimization (image format, responsive images, lazy loading) and why?


Up Next

We’ve optimized the network path and reduced payload sizes. But what about resources that aren’t critical right now? In the next section, we explore lazy loading and prefetching—controlling when resources are loaded to optimize both perceived and actual performance.