System Design Fundamentals

Load Balancers & Proxies

Load Balancers and Reverse Proxies

Load balancers and reverse proxies are the traffic directors of your infrastructure. They sit in front of your backend services, distributing requests, handling SSL termination, and enabling graceful degradation. This reference covers the most common tools, from software solutions you deploy yourself to fully managed cloud services.

NGINX

NGINX is the world’s most popular reverse proxy and load balancer. Its event-driven architecture handles thousands of concurrent connections efficiently.

Architecture:

NGINX uses an event-driven, asynchronous model rather than spawning a new process per request. One master process spawns worker processes (typically one per CPU core), each capable of handling thousands of concurrent connections. This design makes NGINX exceptionally efficient with memory and CPU.
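In nginx.conf, this model maps to a couple of directives (the values shown are common starting points, not prescriptions):

```nginx
# One worker process per CPU core; each worker multiplexes
# many connections using epoll/kqueue event notification.
worker_processes auto;

events {
    worker_connections 4096;  # per-worker cap on concurrent connections
}
```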

Key Features:

  • Reverse proxy and load balancer combined
  • Layer 7 (application) and Layer 4 (transport) load balancing
  • SSL/TLS termination and SNI support
  • Static file serving (excellent for static assets)
  • URL rewriting and redirection
  • Caching (reverse proxy cache)
  • Gzip compression
  • HTTP/2 and HTTP/3 support
  • Stream module for TCP/UDP load balancing
  • Minimal configuration (well-documented, concise syntax)

Load Balancing Algorithms:

  • Round-robin (default)
  • Least connections
  • IP hash
  • Random
  • Weighted (assign more weight to powerful servers)
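Several of these algorithms map directly onto an upstream block; a minimal sketch (server addresses are illustrative):

```nginx
upstream app_servers {
    least_conn;                       # omit this line for the round-robin default
    server 10.0.0.1:8000 weight=3;    # weighted: receives ~3x the traffic
    server 10.0.0.2:8000;
    server 10.0.0.3:8000 backup;      # used only when the others are down
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```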

Session Persistence:

  • IP hash (client IP maps to the same backend)
  • Generic hash on an arbitrary key, e.g. a header or cookie value (hash directive)
  • Sticky cookies set by the load balancer (NGINX Plus)

NGINX Plus (commercial) adds:

  • Active health checks
  • Session persistence with cookies
  • Dashboard and API
  • JWT authentication
  • Rate limiting and DDoS protection

When to Use: As the front-end reverse proxy for almost any system, for static file serving, for simple load balancing, when you want minimal operational overhead.

Typical Use Cases:

  • API gateway
  • Web server reverse proxy
  • Static file CDN
  • SSL termination
  • Simple load balancing across microservices
  • Development and production

Considerations: Configuration is code (text-based), not GUI. Configuration changes require a graceful reload: old workers drain in-flight requests while new workers start, so reloads are effectively zero-downtime for most workloads. Limited observability compared to modern proxies.

Pro Tip: Start with NGINX for most workloads. It’s simple, fast, and battle-tested. Upgrade to specialized tools only when you need features it lacks.

HAProxy

HAProxy is a high-performance, open-source load balancer and reverse proxy. It excels at Layer 4 (TCP) and Layer 7 (HTTP) load balancing.

Key Features:

  • High-performance TCP/HTTP load balancing
  • Layer 4 and Layer 7 simultaneously
  • Connection draining (graceful shutdown)
  • Health checking (active and passive)
  • Stick tables for session persistence
  • Advanced routing based on URL, hostname, headers
  • Stats dashboard (HTML page with live stats)
  • Very low latency (comparable to NGINX)
  • Excellent for database load balancing (MySQL, PostgreSQL)
  • ACL (Access Control Lists) for sophisticated routing

Configuration:

HAProxy is configured via a single text file. While verbose, it’s powerful and explicit:

frontend http_in
    bind *:80
    mode http
    acl is_api path_beg /api
    acl is_static path_beg /static
    use_backend api_servers if is_api
    use_backend static_servers if is_static
    default_backend app_servers

# api_servers and static_servers are defined analogously
backend app_servers
    mode http
    balance roundrobin
    server web1 10.0.0.1:8000 check
    server web2 10.0.0.2:8000 check

When to Use: For sophisticated load balancing requirements, when Layer 4 routing is important, for database load balancing, when you need advanced ACL and routing.

Typical Use Cases:

  • Database load balancing
  • Complex request routing (Layer 4 + Layer 7)
  • Multi-protocol load balancing
  • Telecom and financial systems
  • Mission-critical infrastructure

Considerations: The configuration syntax has a steep learning curve. Less modern tooling than newer proxies. Smaller ecosystem than NGINX. Requires more operational expertise.

Pro Tip: HAProxy is the load balancer of choice for complex routing. If you need IP-level persistence or advanced Layer 4 logic, HAProxy is your answer.

Envoy Proxy

Envoy is a modern Layer 4/Layer 7 proxy designed for microservices architectures. It’s the data plane for service meshes (like Istio).

Architecture:

Envoy is fundamentally different from NGINX/HAProxy. It’s designed to be controlled dynamically via APIs (xDS protocol), not static configuration. This makes it ideal for service meshes where configuration changes constantly.

Key Features:

  • Layer 4 and Layer 7 load balancing
  • Service mesh integration (Istio, Linkerd use Envoy)
  • Advanced traffic management (circuit breaking, retries, timeouts)
  • Distributed tracing (Jaeger, Zipkin integration)
  • Metrics and observability (Prometheus-native)
  • Dynamic configuration (xDS: Cluster Discovery Service, Endpoint Discovery Service, etc.)
  • gRPC load balancing (supports multiple load balancing algorithms per service)
  • Rate limiting
  • Request/response mutation
  • TLS and mTLS support
  • Excellent for HTTP/2 and HTTP/3

When to Use: In service mesh deployments, when you need observability and dynamic configuration, for microservices with frequent topology changes, when using Istio or similar.

Typical Use Cases:

  • Kubernetes service mesh data plane
  • Service-to-service routing
  • Multi-protocol environments (HTTP, gRPC, TCP)
  • Advanced observability (distributed tracing)
  • Zero-trust/mTLS infrastructure

Considerations: Steep learning curve. Configuration is complex (typically managed by control plane like Istio, not by hand). Higher resource overhead than NGINX (due to advanced features). Overkill for simple load balancing.

Pro Tip: Don’t use Envoy unless you’re running a service mesh or have sophisticated dynamic routing needs. It’s powerful but adds operational complexity.

Traefik

Traefik is a cloud-native reverse proxy designed for containerized environments. It auto-discovers services from Docker, Kubernetes, Consul, and other orchestration platforms.

Key Features:

  • Auto-discovery (finds services automatically via Docker labels or Kubernetes annotations)
  • Automatic Let’s Encrypt certificate provisioning
  • HTTP/HTTPS routing with dynamic configuration
  • Web dashboard (friendly UI for monitoring)
  • Middleware system (plugins for auth, rate limiting, circuit breaking)
  • Native Docker and Kubernetes integration
  • Configuration as code (YAML or TOML)
  • Reasonable performance (not as fast as NGINX but sufficient)
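In a Docker Compose stack, auto-discovery is driven by labels on the container itself; a hedged sketch (the image, hostname, and certificate resolver name are illustrative):

```yaml
services:
  api:
    image: example/api:latest   # hypothetical image
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`api.example.com`)"
      - "traefik.http.routers.api.entrypoints=websecure"
      - "traefik.http.routers.api.tls.certresolver=letsencrypt"   # whatever resolver Traefik is configured with
      - "traefik.http.services.api.loadbalancer.server.port=8000"
```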

Configuration Example (Kubernetes):

Traefik can act as a standard Kubernetes Ingress controller (it also ships its own IngressRoute CRD for richer routing):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
spec:
  ingressClassName: traefik
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8000

When to Use: Kubernetes environments, Docker Compose stacks, when you want minimal configuration overhead, when auto-discovery is valuable.

Typical Use Cases:

  • Kubernetes Ingress controller
  • Docker Compose load balancing
  • Cloud-native deployments
  • Microservices in containers
  • Automatic certificate management (Let’s Encrypt)

Considerations: Slower than NGINX for pure performance. Less widely adopted than NGINX or HAProxy. Learning curve for Kubernetes-native deployments.

Pro Tip: Use Traefik in Kubernetes to avoid the complexity of managing certificates and service discovery manually. It just works.

AWS Elastic Load Balancer (ELB) Family

AWS provides three types of load balancers, each for different use cases.

Application Load Balancer (ALB)

ALB is Layer 7 (application layer), designed for HTTP/HTTPS workloads.

Key Features:

  • Layer 7 HTTP/HTTPS routing
  • Path-based routing (/api/* to one target group, /images/* to another)
  • Host-based routing (api.example.com vs www.example.com)
  • Header-based and query parameter-based routing
  • WebSocket and HTTP/2 support
  • Automatic health checks
  • Target groups for flexible backend management
  • CloudWatch metrics and logging
  • Fully managed (AWS handles patching, scaling, availability)

When to Use: Most web applications, microservices on AWS, when you want AWS to manage the load balancer.

Typical Use Cases:

  • Web application load balancing
  • Microservices routing
  • RESTful API load balancing
  • WebSocket applications
  • Multi-tenant SaaS

Considerations: Layer 7 routing adds latency compared to Layer 4. Pricing is per-hour plus per-LCU (Load Balancer Capacity Unit). Limited to AWS.

Network Load Balancer (NLB)

NLB is Layer 4 (transport layer), designed for extreme performance and scale.

Key Features:

  • Layer 4 TCP/UDP load balancing
  • Ultra-low latency (microseconds)
  • Millions of requests per second
  • High throughput (Gbps)
  • Static IP addresses (useful for DNS)
  • Connection draining
  • Target groups with health checks
  • Suitable for non-HTTP protocols (gaming, IoT, DNS)

When to Use: When you need extreme performance, when you’re handling millions of requests per second, for non-HTTP protocols, for extreme low-latency requirements.

Typical Use Cases:

  • Gaming platforms
  • IoT message ingestion
  • DNS and other UDP-based services
  • Financial trading systems
  • Real-time data streaming

Considerations: Layer 4 only (no HTTP routing). More expensive per-unit than ALB. Overkill for most web applications.

Gateway Load Balancer (GWLB)

GWLB is designed for inline virtual appliances (firewalls, deep packet inspection systems, etc.).

Key Features:

  • Balances traffic to virtual appliances
  • Transparent mode (appliance sees actual client IP)
  • Used for security, DPI, inspection

When to Use: Rare. Only when you need inline traffic inspection or transformation.

Considerations: Specialized use case. Not for typical web applications.

Cloudflare Load Balancer

Cloudflare provides a load balancer integrated with their global network and DDoS protection.

Key Features:

  • Global load balancing (via Cloudflare’s worldwide anycast network of datacenters)
  • Geographic routing (route to nearest datacenter)
  • Health checks (monitor backend availability)
  • Failover (automatic reroute on backend failure)
  • Integrated DDoS protection
  • Integrated with Cloudflare DNS
  • Performance routing (route to fastest backend)

When to Use: When you want global load balancing, when you’re already using Cloudflare for DNS/CDN, when DDoS protection is critical.

Typical Use Cases:

  • Global web applications
  • Disaster recovery (failover between regions)
  • DDoS-protected services
  • CDN + load balancing combo

Considerations: Adds Cloudflare into your critical path. Not for latency-sensitive applications where you need local control.

Load Balancer Comparison Matrix

Tool          | Layer | Auto-Discovery       | Health Checks    | Observability                | Best For                       | Managed/Self-Hosted
--------------|-------|----------------------|------------------|------------------------------|--------------------------------|--------------------------------
NGINX         | 4 + 7 | No                   | Passive          | Limited (logs)               | General-purpose, static files  | Self-hosted (or NGINX Plus)
HAProxy       | 4 + 7 | No                   | Active + passive | Good (stats page, logs)      | Complex routing, databases     | Self-hosted (or managed)
Envoy         | 4 + 7 | Via xDS              | Active           | Excellent (metrics, tracing) | Service mesh, advanced routing | Self-hosted (control plane manages)
Traefik       | 4 + 7 | Yes (Docker, K8s)    | Active           | Good (dashboard, metrics)    | Kubernetes, containers         | Self-hosted
ALB           | 7     | Yes (target groups)  | Active           | Good (CloudWatch)            | Web applications on AWS        | AWS (managed)
NLB           | 4     | Yes (target groups)  | Active           | Good (CloudWatch)            | Extreme scale, non-HTTP        | AWS (managed)
Cloudflare LB | 3-7   | Yes (Cloudflare DNS) | Active           | Good (dashboard)             | Global, DDoS protection        | Cloudflare (managed)

Load Balancing Algorithms

Round-robin: Each request goes to the next server in sequence. Simple and fair for uniform workloads.

Least connections: Route to the server currently handling the fewest connections. Better for long-lived connections or variable request duration.

IP hash: Client IP determines destination. Ensures the same client always reaches the same server (useful for session affinity).

Least loaded: Route to the server with lowest CPU/memory usage. Requires metrics collection.

Random: Simple and surprisingly effective for distributed systems.

Weighted: Administrators assign weights to servers (more powerful servers get more traffic).

Consistent hashing: Servers map to points on a hash ring; a request key (client IP, session ID, or cache key) is routed to the nearest server point clockwise. When servers are added or removed, only a small fraction of keys move. Popular in distributed caches.

Choose based on your workload. Round-robin is usually fine. Least connections helps with connection pooling. IP hash enables session affinity.
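Consistent hashing in particular is easy to misremember, so here is a minimal self-contained sketch in Python (the virtual-node count and hash function are arbitrary choices, not a production implementation):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, server) points
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server):
        # Each server owns many points on the ring to smooth the distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [(h, s) for h, s in self._ring if s != server]

    def get(self, key):
        # Route to the first server point clockwise from the key's hash.
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]
```

Removing a server only remaps the keys that were on that server; every other key keeps its assignment, which is exactly the property that makes this algorithm popular for caches.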

Health Checking

Active health checks: Load balancer sends periodic requests (HTTP GET, TCP connect) to each backend. If it fails N times in a row, the backend is marked down.

Passive health checks: Load balancer monitors responses from actual client requests. If they fail consistently, the backend is marked down.

Active checks are more reliable. Passive checks avoid additional network traffic. Most systems use both.
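In HAProxy, for example, an active HTTP check might look like this (the endpoint path and thresholds are illustrative):

```haproxy
backend app_servers
    mode http
    option httpchk GET /health
    http-check expect status 200
    default-server inter 2s fall 3 rise 2   # check every 2s; 3 failures mark down, 2 successes mark up
    server web1 10.0.0.1:8000 check
    server web2 10.0.0.2:8000 check
```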

Session Affinity (Sticky Sessions)

Session affinity ensures requests from the same client reach the same backend server. Necessary for applications storing session state locally.

Methods:

  • IP hash / source IP (simple, but unreliable when clients share an IP or sit behind proxies)
  • Cookie-based (load balancer sets a cookie indicating the backend; requires the client to send it back)
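For cookie-based affinity, the load balancer inserts the cookie itself; an HAProxy sketch (cookie and server names are illustrative):

```haproxy
backend app_servers
    balance roundrobin
    cookie SERVERID insert indirect nocache   # LB inserts the cookie; it is stripped before reaching the backend
    server web1 10.0.0.1:8000 check cookie web1
    server web2 10.0.0.2:8000 check cookie web2
```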

Better approach: Store sessions in Redis or similar external store. Then any backend can serve any client.

Key Takeaways

  • NGINX is the default reverse proxy and load balancer. Use it first for most workloads. Simple, fast, well-documented.
  • HAProxy excels at complex routing and database load balancing. Choose it when you need Layer 4 sophistication.
  • Envoy is for service meshes and microservices with dynamic topology. Overkill otherwise.
  • Traefik is the Kubernetes-native choice. Auto-discovery and automatic certificates are valuable.
  • AWS ALB/NLB are managed services. Excellent for AWS-native applications. ALB for HTTP, NLB for extreme scale.
  • Cloudflare for global load balancing with DDoS protection.

Most new systems should start with NGINX or (if on Kubernetes) Traefik. Upgrade to specialized tools only when specific requirements demand it.

Quick Reference: Which Load Balancer to Use?

I’m building a simple web application: Use NGINX or ALB (if on AWS).

I need Kubernetes Ingress: Use Traefik or NGINX Ingress Controller. Avoid managing load balancing yourself.

I have complex routing requirements (database-level, Layer 4 logic): Use HAProxy.

I’m running a service mesh (Istio, Linkerd): Envoy is built-in. No separate load balancer needed.

I need global load balancing across regions: Use Cloudflare Load Balancer, AWS Route 53 with ALB, or your cloud provider’s global load balancer.

I need extreme performance (millions of requests/second): Use NLB or HAProxy.

I want zero operational overhead: Use ALB or Cloudflare (managed services).

I’m deploying on-premises in production: Use NGINX or HAProxy (open-source, battle-tested).