System Design Fundamentals

Stateless vs Stateful

The Scaling Problem: Why Your Logged-In Users Keep Getting Logged Out

Imagine a user logs into your e-commerce website through Server A. Everything works perfectly—they browse products, add items to their cart, and proceed to checkout. But somewhere in the middle of the checkout flow, their next request gets routed to Server B due to load balancing. Suddenly, they’re logged out. Their cart is empty. The user hits the back button, logs in again, and now they’re on Server C. This frustrating experience happens because the servers don’t know about each other’s conversations with the user.

This is the fundamental tension in horizontal scaling: when we add more servers to handle more traffic, we create a new problem. Each server becomes an island, maintaining its own memory of user sessions, shopping carts, and preferences. This works fine with one server, but breaks immediately with two. The solution lies in understanding the difference between stateless and stateful architectures—and when to use each.

In this chapter, we’ll explore how to build systems that scale horizontally without losing information, and how to know when a stateful service is actually the right choice.

Understanding State in Distributed Systems

Let’s define our terms precisely, because “state” means something specific in system design.

State is any information a server remembers between requests. If Server A knows that User Alice has a shopping cart with 3 items, and Server B doesn’t have that information, then the cart lives in Server A’s state. Common examples include: user sessions, database connection pools in memory, cache data, preferences loaded from a configuration file, or a WebSocket connection with an active user.

A stateless service treats every request as self-contained: the request carries all the information needed to process it, and the server doesn’t rely on previous requests or on memory kept between them. If User Alice wants to add an item to her cart, the request must include her user ID, authentication token, and the current cart contents (or a reference to where they are stored). The server reads this information, modifies it, and returns the response. The next request, whether it goes to Server A, B, or C, can be processed identically because nothing is stored on the server between requests.

A stateful service remembers previous interactions with clients. When User Bob connects to Server A, that server maintains information about Bob: his authentication status, his session ID, his real-time chat messages, or his WebSocket connection. If Bob’s next request goes to Server B, Server B doesn’t have this information and can’t fulfill the request properly. Stateful services therefore depend on “sticky” sessions: each user must keep returning to the same server.
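To make the two definitions concrete, here is a minimal sketch in plain JavaScript with no framework. The function names and request shapes are hypothetical, chosen only for illustration; the point is where the cart lives, not how a real server is built.

```javascript
// Stateful: the cart lives in this process's memory, keyed by user.
function createStatefulServer() {
  const carts = new Map(); // state held between requests
  return {
    addToCart(userId, itemId) {
      const cart = carts.get(userId) ?? [];
      cart.push(itemId);
      carts.set(userId, cart);
      return cart;
    },
  };
}

// Stateless: everything needed arrives with the request; nothing is kept afterwards.
function addToCartStateless(request) {
  return [...request.cart, request.itemId];
}

// Two independent stateful servers: B never saw what A stored.
const serverA = createStatefulServer();
const serverB = createStatefulServer();
serverA.addToCart(42, 'book');
const viaB = serverB.addToCart(42, 'pen');

// The stateless version gives the same answer no matter which server runs it.
const first = addToCartStateless({ userId: 42, cart: [], itemId: 'book' });
const second = addToCartStateless({ userId: 42, cart: first, itemId: 'pen' });

console.log(viaB);   // [ 'pen' ]
console.log(second); // [ 'book', 'pen' ]
```

The stateful pair loses the book as soon as the second request lands on a different instance, while the stateless function cannot lose anything because it never held anything.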

Session management is the art of keeping users logged in across multiple requests in a web environment. We have several strategies:

  1. Sticky Sessions (Server-Side Sessions): A load balancer ensures each user always hits the same server. This is stateful but simple. The problem? If that server dies, all sessions on that server are lost.

  2. External Session Stores: We store session data in a shared database (usually Redis or Memcached) that all servers can access. Server A stores the session; Server B can retrieve it. This is stateless from the perspective of individual servers, though the session store itself is stateful.

  3. JWT Tokens (Stateless Sessions): We encode session information directly into a cryptographically signed token (a JSON Web Token). The client sends this token with each request. The server validates the signature and the expiry time without needing to look up the session in a database. Of the three options, this is the only one in which servers keep no per-user memory at all.
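The weakness of the first strategy can be shown in a few lines. The sketch below assumes the load balancer picks a server by hashing a client identifier; real load balancers often use an affinity cookie or the client address instead, and the `pickServer` helper and server names are hypothetical.

```javascript
const crypto = require('node:crypto');

// Map a client identifier to one of the available servers by hashing it.
function pickServer(clientId, servers) {
  const digest = crypto.createHash('sha256').update(String(clientId)).digest();
  const index = digest.readUInt32BE(0) % servers.length;
  return servers[index];
}

const servers = ['server-a', 'server-b', 'server-c'];
const firstHit = pickServer('alice', servers);
const secondHit = pickServer('alice', servers);
console.log(firstHit === secondHit); // true: same client, same server

// If that server dies, the client is remapped to a survivor,
// and whatever the dead server held in memory is gone.
const survivors = servers.filter((s) => s !== firstHit);
const afterFailure = pickServer('alice', survivors);
console.log(afterFailure !== firstHit); // true
```

Affinity keeps the session reachable only while the chosen server stays alive, which is why the other two strategies move the session somewhere every server can reach.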

The profound insight: statelessness enables horizontal scaling. If every server is interchangeable—if any server can process any request from any user—then we can add or remove servers dynamically, distribute traffic evenly, and survive server failures. This is the foundation of modern cloud infrastructure.

The Restaurant Analogy: Stateless and Stateful Service Patterns

Picture two restaurants on the same street.

The first is a drive-through. You pull up to the speaker, order, pay at one window, get your food at another. The experience is identical whether you speak to Employee A, B, or C. They don’t remember you from yesterday. Each transaction is self-contained: you place an order, give payment information, receive food, and drive away. If Employee A calls in sick and Employee D takes their shift, your experience doesn’t change. The drive-through is stateless—each “request” (order) contains all necessary information.

The second is a sit-down restaurant. You arrive and the host seats you at Table 5. Your server, Marcus, remembers that you’re at Table 5 and brings your water. He remembers your appetizer order, then later your main course order. He checks on you in between, and when you’re ready for dessert, he doesn’t need you to repeat everything, because he knows what you’ve already eaten. Marcus’s memory of your table is state. If Marcus walks out mid-shift, a new server (who doesn’t know anything about your table) has to ask you what you’ve already ordered, and customers get frustrated. The sit-down restaurant is stateful: servers remember previous interactions.

Both models work, but they have different properties. The drive-through can handle much higher volume with fewer employees because every employee is interchangeable. The sit-down restaurant provides better service and more customization, but staffing is more complex. In system design, we choose based on our requirements.

How Stateless Architectures Actually Work at Scale

Let’s dig into the mechanics. When you build a truly stateless service, where does the state go? It doesn’t disappear—it moves outside the application servers.

In a stateless architecture using external session stores, here’s the flow:

User Login Request

[Load Balancer]

    Server A
       ├─ Validate credentials
       ├─ Create session object
       └─ Send to Redis

    [Redis Session Store]
       └─ Stores: { sessionId: "abc123", userId: 42, cart: [...] }

    Server returns response with sessionId cookie



User's Next Request (same sessionId cookie)

[Load Balancer]

    Server B
       ├─ Reads sessionId from cookie
       ├─ Fetches session from Redis
       ├─ Processes request
       └─ Updates Redis

Server A and Server B are completely interchangeable. They read and write to the shared session store. The load balancer can route user traffic randomly. If Server A goes down, users simply hit Server B on their next request.
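The flow above can be simulated without any infrastructure. In this sketch a `Map` stands in for Redis, and `createServer` is a hypothetical helper; the only thing it is meant to show is that the servers hold a reference to the store and nothing else.

```javascript
const crypto = require('node:crypto');

// Stand-in for the shared store; a real deployment would use Redis or similar.
const sessionStore = new Map();

// Each "server" keeps no session data of its own, only a reference to the store.
function createServer(name, store) {
  return {
    name,
    login(userId) {
      const sessionId = crypto.randomUUID();
      store.set(`session:${sessionId}`, JSON.stringify({ userId, cart: [] }));
      return sessionId; // would be sent back to the browser as a cookie
    },
    addToCart(sessionId, itemId) {
      const raw = store.get(`session:${sessionId}`);
      if (raw === undefined) return null; // unknown or expired session
      const session = JSON.parse(raw);
      session.cart.push(itemId);
      store.set(`session:${sessionId}`, JSON.stringify(session));
      return session.cart;
    },
  };
}

const a = createServer('A', sessionStore);
const b = createServer('B', sessionStore);

const sid = a.login(42);                      // login handled by Server A
const cartSeenByB = b.addToCart(sid, 'book'); // next request lands on Server B
console.log(cartSeenByB); // [ 'book' ]
```

Server B serves a session it never created, which is exactly the property the load balancer needs in order to route freely.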

JWT tokens take this further by eliminating the external lookup entirely:

User Login Request

    Server A
       ├─ Validate credentials
       ├─ Create JWT: Header.Payload.Signature
       │  where Payload = { userId: 42, name: "Alice", exp: 1234567890 }
       └─ Sign with server's private key

    Server returns JWT in response



User stores JWT in localStorage/cookie



User's Next Request (includes JWT)

[Load Balancer]

    Server C
       ├─ Reads JWT from request headers
       ├─ Verifies signature with public key
       │  (No database lookup needed!)
       ├─ Extracts userId, exp, etc. from payload
       └─ Processes request

With JWTs, servers keep no session memory at all. They verify the signature, check that the token hasn’t expired, and then trust the claims inside. This is pure statelessness.
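To show what "verify the signature" involves, here is a rough sketch of a JWT-shaped token built with Node’s built-in `crypto` module and an Ed25519 key pair. It is an illustration under stated assumptions, not a replacement for a maintained JWT library: the key pair is generated in-process for the demo, and `issueToken` and `verifyToken` are hypothetical helper names.

```javascript
const crypto = require('node:crypto');

// Demo key pair; a real service would load its keys from a secret store.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');

// Build header.payload.signature, signing with the private key.
function issueToken(claims, ttlSeconds, nowSeconds = Math.floor(Date.now() / 1000)) {
  const header = encode({ alg: 'EdDSA', typ: 'JWT' });
  const payload = encode({ ...claims, exp: nowSeconds + ttlSeconds });
  const signingInput = `${header}.${payload}`;
  const signature = crypto
    .sign(null, Buffer.from(signingInput), privateKey)
    .toString('base64url');
  return `${signingInput}.${signature}`;
}

// Returns the claims if the signature is valid and the token is unexpired, else null.
// The header is not consulted: this sketch only ever accepts Ed25519 signatures.
function verifyToken(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  try {
    const valid = crypto.verify(
      null,
      Buffer.from(`${header}.${payload}`),
      publicKey,
      Buffer.from(signature, 'base64url')
    );
    if (!valid) return null;
    const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
    if (typeof claims.exp !== 'number' || claims.exp <= nowSeconds) return null;
    return claims;
  } catch {
    return null; // malformed input
  }
}

const token = issueToken({ userId: 42, name: 'Alice' }, 3600);
const claims = verifyToken(token);
console.log(claims.userId); // 42
```

Verification needs only the public key and the token itself, so any server holding that key can authenticate the request without contacting a store or the server that issued the token.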

Here’s where stateful services remain necessary: databases and real-time services. A PostgreSQL database is stateful—it remembers data across requests, and you can’t scale it horizontally by adding more database servers without sophisticated replication. WebSocket servers are stateful—they maintain open connections with specific clients. Message queues are stateful—they store messages until consumed. These services become bottlenecks, which is why database scaling is its own chapter (coming next!).

Here’s a simplified diagram of a complete architecture:

graph TB
    Users["Users"]
    LB["Load Balancer"]
    API1["API Server 1 (Stateless)"]
    API2["API Server 2 (Stateless)"]
    API3["API Server 3 (Stateless)"]

    SessionStore["Redis<br/>(Session Store)"]
    DB["PostgreSQL<br/>(Database)"]

    Users -->|"Request with JWT or sessionId"| LB
    LB -->|"Route randomly"| API1
    LB -->|"Route randomly"| API2
    LB -->|"Route randomly"| API3

    API1 -->|"Lookup/store session"| SessionStore
    API2 -->|"Lookup/store session"| SessionStore
    API3 -->|"Lookup/store session"| SessionStore

    API1 -->|"Read/write data"| DB
    API2 -->|"Read/write data"| DB
    API3 -->|"Read/write data"| DB

Converting a Stateful Server to Stateless: A Real Example

Let’s see this in practice with Node.js/Express.

The Stateful Approach (Problematic at Scale):

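// server.js - Stateful implementation
const crypto = require('crypto');
const express = require('express');
const cookieParser = require('cookie-parser');
const app = express();

app.use(express.json());  // populates req.body for JSON requests
app.use(cookieParser());  // populates req.cookies

// This object lives in memory
const sessions = {};

app.post('/login', (req, res) => {
  const { username, password } = req.body;

  // Validate credentials (simplified)
  if (username === 'alice' && password === 'secret') {
    // Create session in THIS server's memory, with an unguessable ID
    const sessionId = crypto.randomUUID();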
    sessions[sessionId] = {
      userId: 42,
      username: 'alice',
      loginTime: Date.now(),
      cartItems: []
    };

    res.cookie('sessionId', sessionId);
    res.json({ success: true });
  } else {
    res.status(401).json({ error: 'Invalid credentials' });
  }
});

app.post('/cart/add', (req, res) => {
  const { sessionId } = req.cookies;
  const { itemId } = req.body;

  // Only works if THIS server has the session
  if (sessions[sessionId]) {
    sessions[sessionId].cartItems.push(itemId);
    res.json({ success: true, cart: sessions[sessionId].cartItems });
  } else {
    res.status(401).json({ error: 'Not logged in' });
  }
});

app.listen(3000);

Problem: If a user logs in on Server A but their next request goes to Server B, the session doesn’t exist in Server B’s memory, so Server B responds with 401 and the user appears logged out.

The Stateless Approach (With Redis):

// server.js - Stateless implementation (uses the node-redis v4+ API)
const crypto = require('crypto');
const express = require('express');
const cookieParser = require('cookie-parser');
const redis = require('redis');
const app = express();

app.use(express.json());  // populates req.body for JSON requests
app.use(cookieParser());  // populates req.cookies

const SESSION_TTL_SECONDS = 86400; // 24 hours

// Connect to shared Redis instance
const redisClient = redis.createClient({
  url: 'redis://redis-server.internal:6379'
});
redisClient.on('error', (err) => console.error('Redis error:', err));

app.post('/login', async (req, res) => {
  const { username, password } = req.body;

  if (username === 'alice' && password === 'secret') {
    const sessionId = crypto.randomUUID();

    // Store session in Redis, not in memory
    const sessionData = {
      userId: 42,
      username: 'alice',
      loginTime: Date.now(),
      cartItems: []
    };

    // Redis key expires after 24 hours
    await redisClient.set(
      `session:${sessionId}`,
      JSON.stringify(sessionData),
      { EX: SESSION_TTL_SECONDS }
    );

    res.cookie('sessionId', sessionId);
    res.json({ success: true });
  } else {
    res.status(401).json({ error: 'Invalid credentials' });
  }
});

app.post('/cart/add', async (req, res) => {
  const { sessionId } = req.cookies;
  const { itemId } = req.body;

  // Look up session in Redis (works from ANY server)
  const sessionData = await redisClient.get(`session:${sessionId}`);

  if (sessionData) {
    const session = JSON.parse(sessionData);
    session.cartItems.push(itemId);

    // Update Redis. This is a read-modify-write, so two concurrent
    // requests for the same session could overwrite each other.
    await redisClient.set(
      `session:${sessionId}`,
      JSON.stringify(session),
      { EX: SESSION_TTL_SECONDS }
    );

    res.json({ success: true, cart: session.cartItems });
  } else {
    res.status(401).json({ error: 'Session not found' });
  }
});

// Connect to Redis before accepting traffic
redisClient.connect().then(() => app.listen(3000));

The difference is critical: now any server can handle any user request because all servers look up session data in the same place. We can scale from 1 to 100 servers without changing the code.

The JWT Approach (No External Lookup):

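// server.js - Stateless with JWT
const express = require('express');
const jwt = require('jsonwebtoken');
const app = express();

app.use(express.json());  // populates req.body for JSON requests

// Demo only: in production, load the secret from the environment or a secret manager
const JWT_SECRET = 'super-secret-key-keep-this-safe';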

app.post('/login', (req, res) => {
  const { username, password } = req.body;

  if (username === 'alice' && password === 'secret') {
    // Create JWT (no database write needed)
    const token = jwt.sign(
      {
        userId: 42,
        username: 'alice',
        cartItems: []
      },
      JWT_SECRET,
      { expiresIn: '24h' }
    );

    res.json({ token });
  } else {
    res.status(401).json({ error: 'Invalid credentials' });
  }
});

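app.post('/cart/add', (req, res) => {
  const authHeader = req.headers.authorization;
  const token = authHeader?.split(' ')[1];

  try {
    // Verify signature and expiry (no Redis lookup needed)
    const { userId, username, cartItems } = jwt.verify(token, JWT_SECRET, {
      algorithms: ['HS256']
    });

    // A token can't be modified in place, so issue a new one carrying the
    // updated cart. (Alternatively, store the cart separately.)
    const updatedCart = [...cartItems, req.body.itemId];
    const newToken = jwt.sign(
      { userId, username, cartItems: updatedCart },
      JWT_SECRET,
      { expiresIn: '24h' }
    );

    res.json({ success: true, cart: updatedCart, token: newToken });
  } catch (err) {
    res.status(401).json({ error: 'Invalid token' });
  }
});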

app.listen(3000);

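JWTs are the fastest of the three approaches because verification involves no external I/O. (The example signs with a shared secret, using jsonwebtoken’s default HS256 algorithm; the flow shown earlier assumes a private/public key pair, which also lets servers verify tokens without being able to issue them.) But there’s a trade-off: a token can’t be changed once issued, so updating JWT contents requires either reissuing the token or storing mutable state (like cart contents) in a database.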
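The second option, keeping mutable state out of the token, might look like the sketch below. It assumes the token has already been verified and reduced to a `claims` object, and it uses a `Map` as a stand-in for whatever shared store holds carts; `cartStore` and `addToCart` are hypothetical names.

```javascript
// Stand-in for a shared store (Redis, a database); hypothetical for this sketch.
const cartStore = new Map();

// `claims` is assumed to be the already-verified token payload.
// The token identifies the user; the cart itself lives in the store.
function addToCart(claims, itemId, store) {
  const key = `cart:${claims.userId}`;
  const cart = store.get(key) ?? [];
  const updated = [...cart, itemId];
  store.set(key, updated);
  return updated;
}

const claims = { userId: 42, username: 'alice' }; // no cart inside the token
addToCart(claims, 'book', cartStore);
const cart = addToCart(claims, 'pen', cartStore);
console.log(cart); // [ 'book', 'pen' ]
```

The token stays small and never needs reissuing when the cart changes, at the cost of one store access per cart operation.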

The Trade-offs: Choosing Your Approach

Each approach has costs.

Stateless + Redis: You add a new dependency, so you need to operate a Redis cluster. Redis failures impact your whole system. There’s network latency on every session lookup (usually 1-5ms, but it adds up). You pay for Redis infrastructure. The upside: simple to reason about, scales easily, standard approach.

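JWT Tokens: No session store to operate means fewer things can break, and the servers stay purely stateless. The downsides: you can’t revoke tokens easily (a logout endpoint alone won’t help, because the client still holds a token that remains valid until it expires). The JWT payload is only encoded, not encrypted, so the client can read it (don’t put sensitive data there). The token travels with every request, so the payload has to stay small. Updating data in the token requires issuing a new token. Best for read-only session data.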

Sticky Sessions (Stateful): Simplest to implement. No external store needed. The downside: load balancer complexity, server failures lose sessions, doesn’t scale well horizontally, and load distribution becomes uneven if users vary in traffic intensity.

Hybrid Approaches: Many real systems mix these. Use JWTs for lightweight auth and validation, store cart/preference data in Redis for shopping experiences, keep user account data in the database. Use sticky sessions for WebSocket connections (they’re stateful anyway), but REST endpoints are stateless.
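One hybrid worth sketching pairs stateless tokens with a very small piece of shared state to make revocation possible. The version below assumes each token carries an `iat` (issued-at) claim alongside `exp`, and that a per-user cutoff timestamp is kept in a shared store, represented here by a `Map`. The helper names are hypothetical, and this is only one of several possible designs.

```javascript
// Stand-in for a shared store keyed by user; hypothetical for this sketch.
const validAfter = new Map(); // userId -> earliest acceptable `iat` (seconds)

// Called on "log out everywhere" or on a password change.
function revokeAllTokens(userId, nowSeconds) {
  validAfter.set(userId, nowSeconds);
}

// `claims` is assumed to be a verified payload with `userId`, `iat`, and `exp`.
function isTokenAccepted(claims, nowSeconds) {
  if (claims.exp <= nowSeconds) return false; // expired
  const cutoff = validAfter.get(claims.userId);
  // Tokens issued before the cutoff are rejected. Timestamps have one-second
  // resolution, so a token issued in the same second as the cutoff still passes.
  if (cutoff !== undefined && claims.iat < cutoff) return false;
  return true;
}

const oldToken = { userId: 42, iat: 1000, exp: 90000 };
console.log(isTokenAccepted(oldToken, 1500)); // true

revokeAllTokens(42, 2000);
console.log(isTokenAccepted(oldToken, 2100)); // false

const freshToken = { userId: 42, iat: 2000, exp: 90000 };
console.log(isTokenAccepted(freshToken, 2100)); // true
```

This brings back one lookup per request, so it gives up some of the speed advantage of JWTs, but the stored data is a single timestamp per user rather than a full session.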

Pro tip: When you add Redis or another session store, you’ve shifted the bottleneck. Now Redis becomes your scaling limit. This is acceptable if Redis can handle 100x more requests than your app servers, but monitor it carefully. Many teams discover that Redis capacity, not app server capacity, limits their scaling.

Key Takeaways

  • Statelessness enables horizontal scaling by making servers interchangeable. Any server can handle any request without prior context.
  • Sessions must move outside application servers (to Redis, database, or JWT tokens) for multi-server architectures to work.
  • JWT tokens are fastest for stateless auth but have payload size limits and can’t be easily revoked.
  • Redis/external stores add latency and dependencies but provide revocation and simple updates.
  • Stateful services (databases, WebSockets, caches) remain necessary and often become scaling bottlenecks—they need their own strategies.
  • Choose your approach based on your data patterns: read-heavy = JWTs, mutable state = Redis, real-time = sticky sessions for WebSockets.

Practice Scenarios

Scenario 1: The Cart Problem

You’re building a checkout flow. A user adds items to their cart on the homepage, then navigates to checkout on what might be a different server. Cart contents vanish because they were stored in-memory on the homepage server.

Design a solution that maintains cart state across multiple server instances. What are the trade-offs between storing cart in Redis vs. in a JWT token?

Scenario 2: The Authentication Revocation Challenge

Your team deployed your application with JWT tokens. Everything works great. Then you realize that when a user changes their password, their old tokens are still valid for another hour (the token expiration time). An attacker who stole a token before the password change could keep using it during that window.

How would you solve the authentication revocation problem with JWTs? (Hint: It might involve a small stateful component.) What’s the simplest solution?


From Statelessness to Database Scaling

Now that we’ve conquered horizontal scaling of application servers, we face a deeper challenge: the database. Databases are stateful by nature—they must maintain consistency across all the data written to them. In the next chapter, we’ll explore how to scale database reads (with replicas) and database writes (with sharding), introducing even more complexity as we build toward production-grade architectures.

The principle remains: move state outside of the expensive component (servers), externalize it to a scalable backing service, and make your application servers ephemeral and replaceable.