System Design Fundamentals

OSI and TCP/IP Models

Picture this: you’re at your computer, you type example.com into your browser and hit enter. Within a fraction of a second, a webpage appears. But what actually traveled from your device to a server somewhere across the world, and how did it come back to exactly your computer, in the right order, uncorrupted?

The answer involves multiple layers working together, each one handling a specific part of the job. Some layers care about the physical electricity flowing through cables. Others care about whether the message is intended for your computer or your neighbor’s. Still others care about what you actually asked for — whether you want a webpage, an email, or a video stream. This chapter will show you how these layers work together and why understanding them is essential for building systems that work reliably.

By the end of this chapter, you’ll understand the OSI model (a framework for thinking about networks) and the TCP/IP model (the framework that actually powers the internet). You’ll know how data travels from application to wire and back, what gets added at each step, and crucially, how to debug network problems by understanding which layer might be broken. This connects directly to Chapter 2’s requirements analysis — because now you can estimate latency, identify bottlenecks, and choose the right protocols for your system.

Introducing Layered Architecture

Networks are complicated. A single request involves electrical signals, physical wires, addressing schemes, error detection, encryption, compression, and application-level logic. Rather than trying to solve all these problems at once, engineers organized network communication into layers. Each layer has a specific job and talks to the layers above and below it.

Think of it like building a house. The foundation engineer doesn’t need to understand interior design. The electrician doesn’t need to know about the plumbing. The architect designs the floor plan without micromanaging the wiring. Each layer handles its responsibility independently. This separation allows teams to innovate at one layer without breaking everything else.

The OSI model (Open Systems Interconnection) defines seven layers and has been around since 1984. It’s a theoretical framework that’s incredibly useful for learning and discussing networks. The TCP/IP model is more practical — it’s the actual model the internet uses, with four layers. We’ll learn both because the OSI model teaches you the concepts deeply, while the TCP/IP model shows you how things actually work.

Here’s the thing: the OSI model is like a detailed instruction manual that shows you every single step. The TCP/IP model is like the condensed quick-start guide that works in practice. Both describe the same journey; one is just more granular.

The OSI Seven-Layer Model

Let’s build from bottom to top:

Layer 1: Physical Layer — This is electricity and light. Cables, voltages, fiber optic pulses. When your computer sends data, it converts it into electrical signals that travel through copper wires or light pulses through fiber. If there’s a broken cable or a signal that’s too weak, the problem is at Layer 1.

Layer 2: Data Link Layer — This is about moving data between two devices on the same local network using MAC addresses (Media Access Control addresses — think of them as hardware identifiers). Ethernet, Wi-Fi, and PPP all work here. If your computer and printer are on the same Wi-Fi network, they communicate at Layer 2. This layer handles “how do I talk to my immediate neighbor?”

Layer 3: Network Layer — This is the internet layer. It uses IP addresses to route data across networks globally. It answers “how do I reach a computer thousands of miles away?” Routers live here. When you send data across the internet, Layer 3 figures out the path it takes.

Layer 4: Transport Layer — This layer delivers data between specific programs on two hosts, identified by port numbers. TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) work here. TCP is careful: it retransmits lost data and delivers everything in order. UDP is fast: it sends each datagram once and doesn’t check whether it arrived. Email relies on TCP. Online games often use UDP.

Layer 5: Session Layer — This manages conversations. Once a connection is established, how is it maintained? How do we know when to close it? This layer manages the session lifecycle. In practice, many developers don’t think about this explicitly — it often happens automatically.

Layer 6: Presentation Layer — This is about translation. Encryption, compression, and format conversion happen here. If data is encrypted for transport, the presentation layer decrypts it. If it’s compressed, it decompresses it. This layer says “the data is traveling as a compressed ZIP, so unzip it before the application sees it.”

Layer 7: Application Layer — This is the software you interact with. HTTP, HTTPS, FTP, SMTP, DNS. When you open a browser and load a webpage, that’s the application layer. When you send an email, that’s the application layer. This is where your code usually lives.

A helpful acronym to remember the layers from bottom to top: Please Do Not Throw Sausage Pizza Away (Physical, Data Link, Network, Transport, Session, Presentation, Application).
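The Layer 4 contrast between TCP and UDP in the list above can be observed with two sockets on the loopback interface. This is a minimal sketch, assuming a machine where `127.0.0.1` is available; the helper `free_port` is a hypothetical name introduced here, and the port it returns is only very likely (not guaranteed) to stay unused.

```python
import socket

def free_port() -> int:
    """Ask the OS for a currently unused TCP port, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = free_port()  # nothing is listening here

# UDP: no handshake, no acknowledgement. sendto() reports the bytes handed
# to the network stack even though no program will ever receive them.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
    sent = udp.sendto(b"hello", ("127.0.0.1", port))

# TCP: a connection must be established before any data moves, so the
# missing listener is reported to the caller right away.
try:
    socket.create_connection(("127.0.0.1", port), timeout=2).close()
    tcp_result = "connected"
except ConnectionRefusedError:
    tcp_result = "refused"

print(f"UDP handed off {sent} bytes; TCP connection was {tcp_result}")
```

UDP leaves it to the application to notice that nobody answered, which is the price of its lower overhead.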

The TCP/IP Four-Layer Model

Now, the TCP/IP model is simpler. In practice, some of those seven OSI layers got combined or aren’t as relevant to how the internet actually works:

Network Access / Link Layer — Roughly corresponds to OSI Layers 1 and 2. Physical hardware and local network communication. How do we physically send and receive bits?

Internet Layer — OSI Layer 3. IP, routing, addressing. How do we reach distant networks?

Transport Layer — OSI Layer 4. TCP and UDP. How do we ensure data gets there, reliably or fast?

Application Layer — OSI Layers 5, 6, and 7 combined. HTTP, SMTP, DNS, everything the user cares about.
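The correspondence between the two models can be written down as a small lookup table. The sketch below only encodes the mapping described above; the names `OSI_TO_TCPIP`, `tcpip_layer`, and `osi_layers_in` are illustrative, not part of any library.

```python
# OSI layer number -> (OSI layer name, TCP/IP layer that absorbs it)
OSI_TO_TCPIP = {
    1: ("Physical", "Network Access"),
    2: ("Data Link", "Network Access"),
    3: ("Network", "Internet"),
    4: ("Transport", "Transport"),
    5: ("Session", "Application"),
    6: ("Presentation", "Application"),
    7: ("Application", "Application"),
}

def tcpip_layer(osi_number: int) -> str:
    """Return the TCP/IP layer that covers the given OSI layer number."""
    return OSI_TO_TCPIP[osi_number][1]

def osi_layers_in(tcpip_name: str) -> list[str]:
    """List the OSI layers folded into one TCP/IP layer, bottom to top."""
    return [
        osi_name
        for _, (osi_name, tcpip) in sorted(OSI_TO_TCPIP.items())
        if tcpip == tcpip_name
    ]

if __name__ == "__main__":
    print(tcpip_layer(6))
    print(osi_layers_in("Application"))
```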

Comparing the Two Models

| Aspect | OSI (7 Layers) | TCP/IP (4 Layers) |
|---|---|---|
| Purpose | Educational framework, vendor-neutral standard | Practical, describes actual internet architecture |
| How it started | Defined by ISO in 1984 | Emerged from ARPANET work in the 1970s |
| Layers | Physical, Data Link, Network, Transport, Session, Presentation, Application | Network Access, Internet, Transport, Application |
| Real-world use | Used to discuss and debug networks; vendor agnostic | Actually powers the internet |
| Complexity | More granular, sometimes redundant | Leaner, focused |

The TCP/IP model won the “which model to actually use” competition. Today, the internet runs on the TCP/IP protocol suite. The OSI model is invaluable for learning and for precise conversations about where problems occur, but when engineers talk about network layers in practice, they’re usually thinking TCP/IP.

Pro tip: When debugging a network issue, the OSI model is your friend. If you know which layer the problem is at, you know exactly which part of the system to inspect.
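One way to put this tip into practice is to probe from the top of the stack down and note where the attempt fails. The function below is a rough sketch rather than a complete diagnostic tool: `diagnose` and its return strings are made up for illustration, and the layer it names is a best guess (a timeout, for example, can come from a firewall as easily as from a routing fault).

```python
import socket

def diagnose(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connection attempt by the layer where it likely failed."""
    try:
        # Turning a name into an address is an application-layer service.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "L7: name resolution failed"

    family, socktype, proto, _, addr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(addr)
        except ConnectionRefusedError:
            # The host answered, but no program is listening on that port.
            return "L4: host reachable, port closed"
        except socket.timeout:
            return "L3/L4: no reply (routing problem or firewall drop)"
        except OSError as exc:
            return f"L1-L3: network unreachable ({exc})"
    return "L4 ok: TCP connection established"

if __name__ == "__main__":
    print(diagnose("127.0.0.1", 9))
```

If the function reports that the connection was established and the user still sees errors, the remaining suspects sit above Layer 4: TLS configuration, HTTP routing, or the application itself.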

A Postal Service Analogy

Imagine you’re sending a letter to a friend across the country. It is a close analogy for how networks move data:

You (the application) write a letter with your message. Because you want privacy, you write it in a secret code and seal it in an envelope (encryption, the Presentation Layer). You and your friend keep up an ongoing correspondence, so you note which conversation this letter belongs to (Session Layer). You write your friend’s name and apartment number on the envelope so it reaches the right person in the building, and you ask for delivery confirmation (Transport Layer). You add the street address and city so the postal network can route it (Network Layer). The postal service loads it onto a truck bound for the next sorting facility (Data Link Layer), and the truck physically drives down the road (Physical Layer).

On the receiving end, the process reverses. The truck arrives (Physical) and is unloaded at the local facility (Data Link). The postal network uses the street address to bring the envelope to the right building (Network), and the name and apartment number get it to the right person (Transport). Your friend recognizes it as part of your ongoing correspondence (Session), opens the envelope and decodes the message (Presentation), and reads your letter (Application).

Each layer doesn’t care about the others. The postal truck driver doesn’t know what’s in the letter. You don’t need to know how electricity works to send mail. Yet all the layers work together perfectly.

Walking Through a Real HTTP Request

Let’s trace what happens when you request a webpage. Your browser makes an HTTP request to fetch example.com:

Layer 7 (Application): Your browser creates an HTTP request: GET / HTTP/1.1 with headers and a destination of example.com.

Layer 6 (Presentation): The request might be compressed and/or encrypted (if it’s HTTPS). Let’s say it’s encrypted. The plaintext becomes ciphertext.

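Layer 5 (Session): The browser and server keep a conversation going across multiple requests, for example by reusing one connection (HTTP keep-alive) or resuming a TLS session. In practice this bookkeeping is handled by the application and its libraries.

Layer 4 (Transport): Before any data is sent, TCP opens a connection with a “three-way handshake” (SYN, SYN-ACK, ACK) to confirm that both sides are ready to talk. TCP then carries the encrypted HTTP request in one or more chunks called segments. Each segment gets a header with source and destination port numbers. The operating system assigns your browser a temporary port (like 54321). The server listens on port 80 (HTTP) or 443 (HTTPS). The segment header includes “from port 54321 to port 443.”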

Layer 3 (Network): IP adds another header with source and destination IP addresses. Your computer’s IP address and the server’s IP address. These segments become packets. Now a router can look at the IP addresses and decide where to send the packet next.

Layer 2 (Data Link): Ethernet or Wi-Fi adds a frame header with source and destination MAC addresses. This frame only moves one hop (from your computer to your router, or from router to router). The MAC addresses change at each hop, but the IP addresses stay the same (unless a NAT device rewrites them along the way). Packets become frames.

Layer 1 (Physical): The frame is converted into electrical signals (copper wire) or light pulses (fiber optic) and sent.
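The port numbers from the Layer 4 step can be seen directly with a loopback connection. This sketch stands in for the browser and server with two sockets on `127.0.0.1`; the server binds port 0 (meaning “any free port”) purely so the example runs without privileges, where a real web server would bind 80 or 443.

```python
import socket

# Server side: listen on the loopback interface on an OS-chosen port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_port = server.getsockname()[1]

# Client side: create_connection() returns once TCP's three-way handshake
# has completed. The OS picks the client's temporary source port.
client = socket.create_connection(("127.0.0.1", server_port))
conn, peer = server.accept()

client_ip, client_port = client.getsockname()
print(f"client {client_ip}:{client_port} -> server 127.0.0.1:{server_port}")

# The server sees that same (IP, port) pair as the source of the connection.
assert peer == (client_ip, client_port)

# Send a minimal HTTP request and read it on the server side.
client.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
request = conn.recv(1024)
print(request.split(b"\r\n")[0].decode())

for s in (client, conn, server):
    s.close()
```

Running it several times usually prints a different client port each time, while the application code never has to choose one.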

Here’s what that layering looks like:

graph TB
    A["Application Layer<br/>HTTP Request GET /"] -->|encrypt| B["Presentation Layer<br/>Encrypted HTTP Request"]
    B -->|add TCP header<br/>Port: 54321 → 443| C["Transport Layer<br/>TCP Segment"]
    C -->|add IP header<br/>192.168.1.5 → 93.184.216.34| D["Network Layer<br/>IP Packet"]
    D -->|add Ethernet header<br/>MAC A → MAC B| E["Data Link Layer<br/>Ethernet Frame"]
    E -->|convert to signals| F["Physical Layer<br/>Electrical Signals"]
    F -->|over the internet| G["Server receives signal"]
    G -->|converts back| H["Physical Layer"]
    H -->|reads Ethernet| I["Data Link Layer"]
    I -->|reads IP| J["Network Layer"]
    J -->|reads TCP| K["Transport Layer"]
    K -->|decrypts| L["Presentation Layer"]
    L -->|parses HTTP| M["Application Layer<br/>Server receives GET /"]

This process is called encapsulation — each layer wraps the data from the layer above it in its own header. On the receiving end, we decapsulate — peel off each layer’s header.
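Encapsulation and decapsulation can be sketched in a few lines by prepending and stripping headers. The header layouts below are deliberately simplified and hypothetical (real TCP, IP, and Ethernet headers carry many more fields), and the addresses and ports are the example values used in the walkthrough.

```python
import struct

# Simplified, hypothetical header layouts (not the real wire formats).
TCP_FMT = "!HH"    # source port, destination port
IP_FMT = "!4s4s"   # source IPv4 address, destination IPv4 address
ETH_FMT = "!6s6s"  # destination MAC, source MAC

def ip_bytes(dotted: str) -> bytes:
    """Convert '192.168.1.5' into its 4-byte form."""
    return bytes(int(part) for part in dotted.split("."))

def encapsulate(payload: bytes) -> bytes:
    """Wrap application data in transport, network, and link headers."""
    segment = struct.pack(TCP_FMT, 54321, 443) + payload
    packet = struct.pack(
        IP_FMT, ip_bytes("192.168.1.5"), ip_bytes("93.184.216.34")
    ) + segment
    frame = struct.pack(ETH_FMT, b"\xbb" * 6, b"\xaa" * 6) + packet
    return frame

def decapsulate(frame: bytes) -> tuple[dict, bytes]:
    """Peel the headers off in reverse order and return them with the payload."""
    headers = {}
    eth_len = struct.calcsize(ETH_FMT)
    headers["dst_mac"], headers["src_mac"] = struct.unpack(ETH_FMT, frame[:eth_len])
    packet = frame[eth_len:]

    ip_len = struct.calcsize(IP_FMT)
    src, dst = struct.unpack(IP_FMT, packet[:ip_len])
    headers["src_ip"] = ".".join(str(b) for b in src)
    headers["dst_ip"] = ".".join(str(b) for b in dst)
    segment = packet[ip_len:]

    tcp_len = struct.calcsize(TCP_FMT)
    headers["src_port"], headers["dst_port"] = struct.unpack(TCP_FMT, segment[:tcp_len])
    return headers, segment[tcp_len:]

if __name__ == "__main__":
    frame = encapsulate(b"GET / HTTP/1.1\r\n\r\n")
    headers, payload = decapsulate(frame)
    print(headers["src_ip"], headers["dst_port"], payload)
```

Each function touches only its own header and treats everything inside it as opaque bytes, which is the independence between layers that the model describes.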

Did you know? With minimum-size headers, Ethernet (14 bytes), IPv4 (20 bytes), and TCP (20 bytes) add 54 bytes to every packet, and TCP options, IPv6, or TLS add more. If you’re sending thousands of small requests, those overhead bytes add up. That’s one reason HTTP/2 and HTTP/3 compress HTTP headers and multiplex many requests over a single connection.
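To see how much header overhead matters, compare it with the payload size. The sketch below assumes minimum header sizes with no options (14 bytes for Ethernet, 20 for IPv4, 40 for IPv6, 20 for TCP) and ignores TLS and the Ethernet trailer; `overhead_fraction` is an illustrative helper name.

```python
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType
IPV4_HEADER = 20      # without options
IPV6_HEADER = 40      # fixed header
TCP_HEADER = 20       # without options

def overhead_fraction(payload_bytes: int, ip_header: int = IPV4_HEADER) -> float:
    """Share of each frame taken up by headers rather than payload."""
    headers = ETHERNET_HEADER + ip_header + TCP_HEADER
    return headers / (headers + payload_bytes)

if __name__ == "__main__":
    for size in (10, 100, 1460):
        share = overhead_fraction(size)
        print(f"{size:>5}-byte payload: {share:.1%} of the frame is headers")
```

Under these assumptions a 10-byte payload travels in a frame that is about 84% headers, while a full 1460-byte payload brings the share below 4%. Small, chatty messages are where the overhead hurts.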

Why Layering Matters for System Design

Understanding layers helps you become a better engineer. Here’s why:

Debugging: If a user says “the website is slow,” you ask yourself: is it a Layer 7 problem (bad code), Layer 4 problem (packet loss), or Layer 3 problem (routing issues)? Each has completely different fixes.

Choosing protocols: Should you use TCP or UDP? TCP guarantees ordered, reliable delivery at the cost of a connection handshake and retransmission delays. UDP skips both but doesn’t guarantee delivery. Understanding what each layer does helps you pick the right tool.

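Estimating latency: Each layer contributes delay on a different scale. Layer 1 is bounded by propagation speed: light in fiber covers roughly 200 km per millisecond, so a 4,000 km path costs about 20 ms each way. Layer 4 adds round trips for handshakes and retransmissions (tens to hundreds of milliseconds over long distances). Layer 7 adds processing time such as database queries (milliseconds to seconds). Knowing which layer contributes which overhead helps you build realistic latency estimates.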

Scalability: If your system is bottlenecked at Layer 4 (many connections), you might implement connection pooling. If it’s bottlenecked at Layer 7 (database queries), you might add caching. The layer tells you the solution space.
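The latency point above lends itself to a back-of-the-envelope calculation. This sketch assumes light travels through fiber at roughly 200 km per millisecond, that a new TCP connection costs one round trip before the request can be sent, and that queuing, TLS, and transmission time are ignored; the function names and the sample numbers are illustrative.

```python
FIBER_KM_PER_MS = 200.0  # roughly two thirds of the speed of light in vacuum

def propagation_ms(distance_km: float) -> float:
    """One-way propagation delay over fiber (a Layer 1 lower bound)."""
    return distance_km / FIBER_KM_PER_MS

def request_latency_ms(
    distance_km: float, server_ms: float, handshake_round_trips: int = 1
) -> float:
    """Rough latency of one request on a fresh connection.

    handshake_round_trips covers Layer 4 setup (1 for a TCP handshake),
    one more round trip carries the request and response, and server_ms
    is the Layer 7 processing time.
    """
    rtt = 2 * propagation_ms(distance_km)
    return handshake_round_trips * rtt + rtt + server_ms

if __name__ == "__main__":
    total = request_latency_ms(distance_km=4000, server_ms=50)
    print(f"estimated latency: {total:.0f} ms")
```

For a server 4,000 km away with 50 ms of processing, the estimate is 130 ms, of which 80 ms is distance. That split suggests whether to optimize the code or move the server closer to users.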

TCP/IP in Practice vs. OSI in Theory

Here’s the honest truth: nobody talks about “Layer 5” in everyday engineering. The TCP/IP model is what matters in practice. But here’s why the OSI model is still useful:

When someone says “layer,” they might be vague. By knowing the OSI model precisely, you can ask clarifying questions: “Are we talking about Layer 3 routing, or Layer 4 congestion?” This precision saves debugging time.

The OSI model also reveals potential problems. For instance, encryption (Layer 6) defeats optimizations that depend on reading the payload: caching proxies and content-aware load balancers can’t inspect encrypted traffic unless they terminate the encryption themselves. Routers are unaffected, because they only read the Layer 3 header. By understanding the layers, you see why some solutions clash with others.

Abstraction leaks: Sometimes a lower layer’s implementation details leak up and affect higher layers. For example, TCP’s flow control (Layer 4) affects your application’s performance (Layer 7). Knowing this, you might implement application-level buffering. Without layer awareness, you’d be confused why your application was mysteriously slow.

Key Takeaways

  • Networking is layered: Each layer handles a specific concern. Physical wires, local addressing, global routing, reliable delivery, and application logic all operate independently.
  • OSI (7 layers) is the educational model: Physical, Data Link, Network, Transport, Session, Presentation, Application. Use this when you need precision or when learning.
  • TCP/IP (4 layers) is the practical model: Network Access, Internet, Transport, Application. This is what actually powers the internet.
  • Encapsulation moves data down: Each layer adds a header. Decapsulation moves it back up, peeling off headers until the application sees the original message.
  • Layer awareness aids debugging: Know which layer your problem lives in, and you know how to fix it.
  • Different layers have different tradeoffs: Speed vs. reliability (TCP vs. UDP), encryption vs. efficiency, direct routing vs. optimized paths.

Practice Scenarios

Scenario 1: A user can’t reach your website. Their phone is on Wi-Fi, the Wi-Fi is connected to the internet, but they get “connection refused.” Walk through the layers. Where might the problem be? (Hint: think about Layer 4 — is the server listening on the right port?)

Scenario 2: An online multiplayer game is playable, but players report “lag.” A colleague suggests “we need more bandwidth.” You suspect it’s actually a Layer 4 issue (TCP’s reliability mechanisms causing retransmissions). How would you investigate which one is correct?

Scenario 3: You’re building a system that streams video. Should you use TCP or UDP at Layer 4? What are the tradeoffs? (Hint: live video can tolerate a lost frame, but it can’t afford to stall while waiting for a retransmission. On-demand video can buffer ahead, which changes the answer.)

Bridging to the Next Chapter

Now you understand how data moves through network layers, but there’s still a mystery: when you type example.com, how does your computer know what IP address to ask for? How does example.com translate to 93.184.216.34? That’s DNS — the Domain Name System — and it’s the focus of our next chapter. DNS is actually an application layer (Layer 7) protocol, but it’s so foundational that it deserves deep exploration. Understanding DNS will show you how the entire internet’s directory works, and how to design systems that respond quickly to name lookups.