System Design Fundamentals

Three-Phase Commit (3PC)

From 2PC’s Fatal Flaw to a “Better” Protocol

If you’ve read the previous section on Two-Phase Commit, you know the critical weakness: after the coordinator collects “YES” votes from all participants and before sending the final commit message, it crashes. The participants sit there holding locks, waiting indefinitely for the commit instruction that will never come. They can’t commit because they haven’t heard the final decision. They can’t abort because they already voted yes. This uncertainty window is the Achilles heel of 2PC.

In 1983, computer scientist Dale Skeen proposed an elegantly simple idea: add another phase. By splitting the commit phase into two steps—a preliminary “prepare to commit” announcement followed by the actual commit—you can eliminate that blocking window. If a participant has heard the “we’re all going to commit” message, it knows that all other participants have agreed, so even if the coordinator disappears, it can safely decide to commit.

The Three-Phase Commit protocol seems to solve the problem that haunted 2PC. But as we’ll see, it’s a classic case of solving one problem while leaving the harder one untouched.

The Three Phases Explained

Phase 1: CanCommit

The coordinator sends a CanCommit? request to every participant. This is a lightweight check with no resource locks acquired yet.

Each participant responds with one of:

  • YES: I can commit this transaction without conflict
  • NO: I cannot commit (perhaps due to local constraint violations or unavailable data)

Key insight: Participants don’t hold any locks during this phase. They’re just checking if they could commit if asked to do so. Think of it as a non-binding poll.

Phase 2: PreCommit

If the coordinator receives YES from all participants, it moves to Phase 2.

The coordinator sends a PreCommit message to all participants. Now participants:

  • Acquire necessary locks and resources
  • Write undo/redo logs to stable storage
  • Send back ACK when ready

This is the critical phase that distinguishes 3PC from 2PC. By sending PreCommit, the coordinator is essentially announcing to all participants: “Everyone agreed. We’re proceeding with the commit.”

From this point forward, every participant knows that all others have said yes. This shared knowledge is what enables safety even if the coordinator fails.

If any participant votes NO in Phase 1, the coordinator broadcasts Abort instead of PreCommit.

Phase 3: DoCommit

Once all participants acknowledge PreCommit, the coordinator sends DoCommit. If an ACK fails to arrive before the coordinator’s timeout, it sends Abort instead.

Participants:

  • Commit the transaction to stable storage
  • Release locks
  • Send final ACK

If the coordinator crashes before Phase 3, participants in the PreCommit state know they must commit because they’ve already confirmed all participants agreed.
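The three phases above can be sketched as a single coordinator loop. This is a minimal, failure-free illustration under stated assumptions, not a production implementation: the `Participant` class, its method names, and `run_3pc` are hypothetical, and the participants live in memory rather than behind a network.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical in-memory participant; a real one would sit behind a network call."""
    name: str
    can_commit: bool = True          # local answer to CanCommit?
    state: str = "initial"
    log: list = field(default_factory=list)

    def on_can_commit(self) -> bool:
        # Phase 1: non-binding check, no locks taken (as described above).
        self.state = "waiting" if self.can_commit else "aborted"
        return self.can_commit

    def on_pre_commit(self) -> bool:
        # Phase 2: acquire locks, write the undo/redo log, then acknowledge.
        self.log.append("undo/redo")
        self.state = "precommitted"
        return True

    def on_do_commit(self) -> None:
        # Phase 3: make the change durable and release locks.
        self.state = "committed"

    def on_abort(self) -> None:
        self.state = "aborted"


def run_3pc(participants: list[Participant]) -> str:
    """Drive one transaction through the three phases; returns the outcome."""
    # Phase 1: CanCommit. Poll everyone before deciding.
    votes = [p.on_can_commit() for p in participants]
    if not all(votes):
        for p in participants:
            p.on_abort()
        return "aborted"

    # Phase 2: PreCommit. Sent only because every vote was YES.
    acks = [p.on_pre_commit() for p in participants]
    if not all(acks):
        for p in participants:
            p.on_abort()
        return "aborted"

    # Phase 3: DoCommit.
    for p in participants:
        p.on_do_commit()
    return "committed"
```

Calling `run_3pc([Participant("A"), Participant("B")])` returns `"committed"`, while a single participant created with `can_commit=False` makes the whole transaction abort. Timeouts and crashes are deliberately left out here; they are the subject of the recovery protocol discussed later.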

The Wedding Ceremony Redux

Let’s extend our 2PC wedding analogy. In 2PC, after both bride and groom say “I do,” if the officiant faints, nobody knows their legal status. Are they married? Nobody can say.

In 3PC, there’s an additional ritual: after both parties say “I do,” the officiant publicly announces: “I have confirmed both parties have agreed. We will now solemnize this marriage.” This announcement is broadcast to all witnesses (participants).

Now, if the officiant faints immediately after this announcement, what happens? Every witness heard the announcement. So a backup officiant (or anyone else) can step forward and complete the ceremony with confidence. The marriage is valid because the necessary agreement was publicly confirmed.

This is what eliminates the uncertainty window—not magical elimination of the failure itself, but distributed knowledge that makes the decision recoverable.

The Mechanics: State Machines and Timeout-Based Recovery

Coordinator State Machine

Start
  ↓
CanCommit phase
  ├→ All YES → PreCommit phase
  └→ Any NO → Abort
  ↓
PreCommit phase
  ├→ All ACK → DoCommit phase
  └→ Timeout or NO → Abort
  ↓
DoCommit phase
  ↓
Committed

Participant State Machine

Initial
  ↓
CanCommit received → (check locally)
  ├→ YES → Waiting for PreCommit
  └→ NO → Abort
  ↓
PreCommit received → (acquire locks, log state)
  ├→ ACK → Precommitted (can safely commit if coordinator fails)
  └→ Can't lock → Abort
  ↓
DoCommit received → Commit
  ↓
Committed
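One way to make the participant state machine concrete is a transition table that rejects any move the diagram does not allow. The sketch below is illustrative only: the `State` enum, the event names (`vote_yes`, `pre_commit`, and so on), and the `step` helper are assumptions chosen for this example.

```python
from enum import Enum, auto


class State(Enum):
    INITIAL = auto()
    WAITING = auto()        # voted YES, waiting for PreCommit
    PRECOMMITTED = auto()
    COMMITTED = auto()
    ABORTED = auto()


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (State.INITIAL, "vote_yes"): State.WAITING,
    (State.INITIAL, "vote_no"): State.ABORTED,
    (State.WAITING, "pre_commit"): State.PRECOMMITTED,
    (State.WAITING, "lock_failed"): State.ABORTED,
    (State.WAITING, "abort"): State.ABORTED,
    (State.PRECOMMITTED, "do_commit"): State.COMMITTED,
    (State.PRECOMMITTED, "abort"): State.ABORTED,
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the diagram has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} in {state.name}") from None
```

A table like this makes the key structural fact visible: there is no edge from `WAITING` straight to `COMMITTED`, so a participant can only commit after passing through `PRECOMMITTED`.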

The Elimination of the Uncertainty Window

In 2PC, the dangerous zone is between “all participants vote YES” and “commit message received.” A crash here leaves everyone hung.

In 3PC, a participant moves from “uncertain” to “certain to commit” when it enters the Precommitted state. Why? Because reaching Precommitted means:

  1. The participant itself voted YES
  2. The participant received PreCommit from the coordinator
  3. PreCommit is only sent if all participants voted YES

Therefore, a participant in Precommitted state can safely assume that all other participants also voted YES. Even without hearing from the coordinator, it can commit.

Timeout-Based Recovery Protocol

If a participant is in Precommitted state and doesn’t hear DoCommit within a timeout:

  1. It tries to contact other participants
  2. If any other participant is in Precommitted state, it knows the transaction was universally agreed upon → commit
  3. If all other participants are in initial or waiting state (they never reached Precommitted), it’s safe to abort

This is the mechanism that makes 3PC non-blocking.
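The recovery rule above can be sketched as a small decision function for a participant that is itself Precommitted and has timed out waiting for DoCommit. The function name and the string state labels are hypothetical. The empty-list case (no peer reachable) is not covered by the three steps above; the sketch assumes "commit" there, which is the default this article’s later worked example uses, and that assumption is exactly what breaks under a partition.

```python
KNOWN_STATES = {"initial", "waiting", "precommitted", "committed", "aborted"}


def decide_on_timeout(reachable_peer_states: list[str]) -> str:
    """Decision for a Precommitted participant whose DoCommit never arrived.

    `reachable_peer_states` holds the states reported by the peers that
    answered; unreachable peers are simply absent from the list.
    """
    unknown = set(reachable_peer_states) - KNOWN_STATES
    if unknown:
        raise ValueError(f"unknown peer states: {sorted(unknown)}")

    # A peer that already aborted settles the outcome.
    if "aborted" in reachable_peer_states:
        return "abort"

    # Any peer at or past Precommitted proves PreCommit was sent,
    # which only happens after every participant voted YES.
    if any(s in ("precommitted", "committed") for s in reachable_peer_states):
        return "commit"

    # Every reachable peer is still initial/waiting: safe to abort.
    if reachable_peer_states:
        return "abort"

    # Assumption: nobody answered, so fall back to commit.
    return "commit"
```

For example, `decide_on_timeout(["precommitted", "waiting"])` yields `"commit"`, and `decide_on_timeout(["waiting", "initial"])` yields `"abort"`.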

The Complexity: A Mermaid Message Flow

sequenceDiagram
    participant Coord as Coordinator
    participant P1 as Participant 1
    participant P2 as Participant 2

    Coord->>P1: CanCommit?
    Coord->>P2: CanCommit?
    P1-->>Coord: YES
    P2-->>Coord: YES

    rect rgb(200, 220, 255)
    Note over Coord: All voted YES<br/>Now sends PreCommit
    end

    Coord->>P1: PreCommit
    Coord->>P2: PreCommit

    rect rgb(200, 255, 200)
    Note over P1,P2: Critical point: if Coord fails here,<br/>both know all agreed and can commit
    end

    P1-->>Coord: PreCommit ACK
    P2-->>Coord: PreCommit ACK

    Coord->>P1: DoCommit
    Coord->>P2: DoCommit
    P1-->>Coord: Commit ACK
    P2-->>Coord: Commit ACK

The Theoretical Advantage (In Theory)

In a synchronous network with no network partitions, 3PC is non-blocking. By “non-blocking,” we mean:

  • No participant is ever left holding locks indefinitely
  • Crash failures can be recovered from through timeouts and inter-participant coordination
  • Progress is guaranteed even if the coordinator dies

This is a genuine improvement over 2PC in those specific conditions.

The Fatal Flaw: Network Partitions

Here’s where 3PC hits a fundamental brick wall: it fails catastrophically under network partitions.

Imagine this scenario:

  1. Both participants, P1 and P2, vote YES in Phase 1
  2. Coordinator sends PreCommit to both participants
  3. A network partition occurs: Coordinator and P1 are isolated from P2

Now:

  • P2 is in Precommitted state. It knows all participants agreed.
  • Coordinator and P1 are on the other side of the partition. The coordinator times out waiting for P2’s ACK, aborts, and tells P1 to abort.
  • P2 times out waiting for DoCommit. It tries to contact the coordinator and P1 and gets no response. Believing all agreed, it commits.
  • P1 has rolled back, and neither side learns what the other decided until the partition heals.

Result: P2 committed a transaction while the other side aborted it. Atomicity violated.

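This is not a bug that a cleverer implementation could patch; it reflects a fundamental limit. A participant that times out cannot tell a crashed coordinator from an unreachable one, so any rule that lets it decide on its own can be wrong. The FLP impossibility result (Fischer, Lynch, Paterson, 1985) formalizes a closely related limit: in an asynchronous network with even one possible process failure, no deterministic consensus protocol can guarantee both safety (never violating correctness) and liveness (always making progress).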

3PC achieves liveness (non-blocking) at the expense of safety when partitions occur.

3PC vs 2PC: The Trade-off Table

| Aspect | 2PC | 3PC |
|---|---|---|
| Blocking window | After YES votes and before commit | Much smaller; eliminated if PreCommit received |
| Network assumptions | Can handle single failures but gets blocked | Requires synchronous, partition-free network |
| Practical failure handling | Blocked indefinitely if coordinator crashes between phases | Non-blocking in synchronous model; catastrophic under partitions |
| Message complexity | 2 rounds (4 messages per participant) | 3 rounds (6 messages per participant) |
| Latency | Lower | Higher |
| Implementation | Simpler | More complex; timeout logic required |
| Real-world applicability | Better than 3PC due to failure patterns | Rare; network partitions are common |
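To put numbers on the message-complexity row, the per-participant counts can be multiplied out for a failure-free run. This is a back-of-the-envelope sketch; the function name is hypothetical, and it ignores retries, recovery traffic, and any batching a real system might do.

```python
# Messages exchanged per participant in a failure-free run, from the table:
# 2PC: prepare, vote, commit, ack                     -> 4
# 3PC: CanCommit, vote, PreCommit, ack, DoCommit, ack -> 6
MESSAGES_PER_PARTICIPANT = {"2pc": 4, "3pc": 6}


def total_messages(protocol: str, participants: int) -> int:
    """Total messages for one transaction with no failures or retries."""
    if participants < 1:
        raise ValueError("need at least one participant")
    return MESSAGES_PER_PARTICIPANT[protocol] * participants
```

With 5 participants this gives 20 messages for 2PC and 30 for 3PC, a 50% increase that holds at any participant count.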

A Concrete Example: The Partition Problem

Let’s walk through a 3PC execution with a network partition:

Initial state: Database transaction across Bank A and Bank B. Transfer 100 units from A to B.

Phase 1 (CanCommit):

  • Coordinator: “Can you commit?”
  • Bank A: “YES”
  • Bank B: “YES”

Phase 2 (PreCommit):

  • Coordinator: “Prepare to commit”
  • Bank A: ACK (locks held, logs written)
  • Bank B: ACK (locks held, logs written) ← At this exact moment, a network partition occurs, and the ACK never reaches the Coordinator

Network partition: Bank B is isolated from Coordinator and Bank A.

Coordinator’s view:

  • It waits for Bank B’s ACK to Phase 2, which was lost in the partition
  • Coordinator times out after waiting for 5 seconds
  • Coordinator concludes: “Bank B didn’t respond. Abort the transaction.”
  • Sends Abort to Bank A

Bank A’s view:

  • Receives Abort from Coordinator
  • Rolls back the transaction

Bank B’s view:

  • Sent ACK for PreCommit, now waiting for DoCommit
  • Times out after 10 seconds
  • Checks the recovery protocol: “Contact other participants to see if all agreed”
  • Can’t reach Coordinator or Bank A (partition)
  • Timeout protocol says: “In PreCommitted state, if you can’t reach the coordinator, commit anyway because all must have agreed”
  • Bank B commits the transaction

Result:

  • Bank A rolled back: A still has its 100 units
  • Bank B committed: B now has 100 additional units
  • Atomicity violation: The transfer completed on one side and didn’t complete on the other

This is the scenario 3PC cannot handle.
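The walkthrough above can be replayed in a few lines to make the divergence visible. This is a toy simulation under stated assumptions: the `Bank` class, the starting balances of 500 and 200, and `simulate_partition` are all hypothetical, and the partition is modeled simply by which calls are made on which bank.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    name: str
    balance: int
    pending: int = 0          # balance change staged at PreCommit
    state: str = "waiting"

    def pre_commit(self, delta: int) -> None:
        self.pending = delta
        self.state = "precommitted"

    def commit(self) -> None:
        self.balance += self.pending
        self.pending = 0
        self.state = "committed"

    def abort(self) -> None:
        self.pending = 0
        self.state = "aborted"


def simulate_partition(amount: int = 100) -> tuple[Bank, Bank]:
    bank_a = Bank("A", balance=500)   # hypothetical starting balances
    bank_b = Bank("B", balance=200)

    # Phase 2: both banks receive PreCommit and stage the transfer.
    bank_a.pre_commit(-amount)
    bank_b.pre_commit(+amount)

    # Partition: B's ACK is lost. The coordinator times out and aborts,
    # but the Abort message can only reach Bank A.
    bank_a.abort()

    # Bank B times out, reaches nobody, and commits on its own.
    bank_b.commit()
    return bank_a, bank_b
```

After `simulate_partition()`, Bank A still holds 500 while Bank B holds 300, so the combined total has grown from 700 to 800: the 100 units were credited without ever being debited.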

Why the Industry Largely Skipped 3PC

Despite its elegance, 3PC remains rarely used in real systems. Here’s why:

1. Network partitions are the common failure mode. The scenarios 3PC solves (coordinator crash without partition) are less common than the scenarios it doesn’t solve (partitions). In the real world, timeouts often indicate a partition, not a clean crash.

2. The FLP impossibility is unavoidable. You cannot build a deterministic consensus protocol that is guaranteed to be both safe and live in an asynchronous network. 3PC chose liveness (non-blocking) at the expense of safety under partitions. The opposite trade-off, the one Paxos makes, is often preferable.

3. Consensus-based alternatives are better. Modern distributed systems use Paxos Commit or Raft-based commit protocols, which handle partitions gracefully by leveraging consensus. These are more robust and have better theoretical guarantees.

4. Application-level patterns are more flexible. Many systems use Sagas (compensating transactions) or event-driven architectures, which abandon the strict atomicity guarantee for better availability and partition tolerance. This is often the right trade-off for modern scale.

Pro Tip: When 3PC Might Still Be Relevant

In highly controlled environments where you can guarantee:

  • Reliable, low-latency networks (LANs, data centers with dedicated links)
  • Fast failure detection
  • Synchronous operation assumptions

…3PC can be a reasonable choice, for instance for databases within a single organization’s data center. But even then, consensus-based approaches offer better guarantees with less complexity.

State Space: The Complete Picture

For a transaction T across participants P1, P2, …, Pn:

Distributed state at each phase:

  • CanCommit phase: (Waiting, Waiting, …, Waiting)
  • After all YES votes: (CanCommit-ACK, CanCommit-ACK, …, CanCommit-ACK)
  • During PreCommit phase: (Precommitted, Waiting, Waiting, …)
  • After all PreCommit ACKs: (Precommitted, Precommitted, …, Precommitted) ← Safety guarantee point
  • After DoCommit: (Committed, Committed, …, Committed)

The safety property of 3PC: If any participant reaches Committed state, all participants must eventually reach Committed state (barring permanent failures).

This property holds as long as there are no network partitions.
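The safety property can be phrased as an invariant over the global state vector: no run may end with one participant Committed and another Aborted. The checker below is a minimal sketch; the function name and the lowercase state labels are assumptions for illustration.

```python
def is_atomic(global_state: tuple[str, ...]) -> bool:
    """True unless the vector mixes a committed and an aborted participant."""
    return not ("committed" in global_state and "aborted" in global_state)
```

A healthy run such as `("committed", "committed", "committed")` passes, as does an in-flight vector like `("precommitted", "waiting", "waiting")`. The partition outcome `("aborted", "committed")` fails, which is the violation described in the partition scenario.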

Key Takeaways

  • 3PC eliminates the blocking window of 2PC by adding a PreCommit phase that signals universal agreement before the commit decision is made.
  • Participants in Precommitted state can unilaterally decide to commit if the coordinator disappears, because they know all participants agreed.
  • The extra phase costs latency: 3PC requires 3 round trips instead of 2, increasing transaction completion time.
  • Network partitions break 3PC’s safety guarantees: A partition can cause one side to commit while the other aborts, violating atomicity. This is the safety-versus-liveness tension that the FLP impossibility result formalizes.
  • The industry largely abandoned 3PC in favor of consensus-based protocols (Paxos, Raft) or application-level patterns (Sagas) that better address real-world failure modes.
  • 3PC is historically significant because it revealed the fundamental trade-off between liveness (non-blocking) and safety under asynchronous failures—a lesson that shaped modern distributed systems design.

Practice Scenarios

Scenario 1: The Multi-Datacenter Transfer. You’re designing a payment system that debits from a customer’s account in Datacenter A and credits a merchant’s account in Datacenter B. Your team suggests using 3PC to ensure atomicity.

What would you need to assume about the network to use 3PC safely? What happens if Datacenter A and B temporarily lose connectivity? How would you redesign this system to handle partitions better?

Scenario 2: The Timeout Configuration Challenge. In a 3PC implementation, you need to set timeout values for:

  • How long the coordinator waits for participants to respond to CanCommit
  • How long the coordinator waits for PreCommit ACKs
  • How long a participant waits before deciding the coordinator has failed

What are the trade-offs in setting these values too short vs. too long? How would you choose different timeouts for different phases?

Scenario 3: Recovery After Partition Healing. Your 3PC system experiences a network partition. When the partition heals and participants can communicate again, what state inconsistencies might exist? How would you detect and recover from them? Why is this problem fundamentally harder in 3PC than in Paxos-based systems?

From Strict Atomicity to Eventual Consistency

The industry’s move away from 3PC represents a paradigm shift in how we think about distributed transactions. Rather than insisting on strict atomicity at the cost of availability, modern systems often embrace eventual consistency through Sagas, event sourcing, or other application-level patterns.

The Saga pattern, which we explore next, represents the opposite philosophy: give up the guarantee of atomicity in exchange for availability and partition tolerance. It’s a trade-off that’s proved far more practical for large-scale distributed systems than trying to bend the laws of physics with protocols like 3PC.


Did you know? The name “Three-Phase Commit” is actually somewhat misleading. Some protocols add a fourth phase for garbage collection and recovery. The essence of what makes 3PC special—the separated PreCommit phase—is what matters, regardless of how many phases you count.