Reserved vs On-Demand Instances

The Commitment Decision

Your production database runs 24/7, 365 days a year. An r5.xlarge instance costs $0.192/hour on-demand pricing. Over one year, that’s $0.192 × 8,760 hours = $1,682 in annual compute costs.

A 1-year reserved instance for the same r5.xlarge costs $1,066 upfront. Savings: 37%.

A 3-year reserved instance costs about $2,022 upfront. Spread over three years, that’s $674/year, or 60% cheaper than on-demand pricing.

But here’s the catch: what if your workload shrinks and you no longer need an r5.xlarge? What if your application architecture changes and you migrate to a different instance type? With a reserved instance, you’re committed. You can sell an unused Standard RI on the AWS Reserved Instance Marketplace, but you will usually lose money.
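One way to size that risk is to ask how many hours the instance must actually run before the upfront payment beats on-demand. The sketch below uses the r5.xlarge figures from this section and assumes an all-upfront reservation with no recurring hourly fee; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760

def break_even_hours(upfront_cost, on_demand_hourly):
    """Hours of use after which an all-upfront reservation beats on-demand."""
    return upfront_cost / on_demand_hourly

def break_even_fraction(upfront_cost, on_demand_hourly, term_years=1):
    """Share of the term the instance must run for the reservation to pay off."""
    term_hours = HOURS_PER_YEAR * term_years
    return break_even_hours(upfront_cost, on_demand_hourly) / term_hours

hours = break_even_hours(1066, 0.192)
fraction = break_even_fraction(1066, 0.192)
print(f"Break-even after {hours:.0f} hours ({fraction:.1%} of the year)")
```

With these numbers the reservation pays off only if the instance runs for roughly 63% of the year. A workload that might be retired or resized before then is safer on-demand.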

Commitment decisions are the core of cost optimization strategy. You’re trading flexibility for savings. This chapter teaches you when to commit, when to stay flexible, and how to optimize across the commitment spectrum.

The Commitment Spectrum

Cloud pricing offers multiple options, arrayed on a spectrum from maximum flexibility to maximum savings:

On-Demand (No Commitment): You pay the full hourly rate. An r5.xlarge costs $0.192/hour, period. You can spin up or shut down instances at any time. Maximum flexibility, maximum cost. Best for unpredictable workloads, new applications, testing environments.

Savings Plans (Flexible Commitment): You commit to spending a certain amount per hour for 1 or 3 years. For example, you commit to $0.14/hour (roughly $100/month) on a Compute Savings Plan. That commitment applies to any instance type in any region; usage beyond it is billed at on-demand rates. Discount: 30–42%. Moderate flexibility, moderate savings. Best for flexible baseline capacity.

Reserved Instances (Specific Commitment): You reserve a specific instance type in a specific region for 1 or 3 years. Reserve an r5.xlarge in us-east-1, for example, and you get a 40–60% discount on matching usage. You’re committed to that instance type in that region. If your needs change, Convertible RIs can be exchanged and Standard RIs can be sold on the Reserved Instance Marketplace, but both involve some friction. Best for steady-state production workloads with stable requirements.

Spot Instances (Interruptible, Cheapest): You run on unused capacity at a discount (up to 90% off on-demand pricing). On AWS you pay the current spot price rather than bidding, though you can set a maximum price. The cloud provider can reclaim your instance at short notice when it needs the capacity back. Unpredictable, ultra-cheap. Best for fault-tolerant batch jobs, non-critical workloads, cost-sensitive work.

Here’s a visual representation:

┌─────────────────────────────────────────────────────────┐
│ Flexibility vs Cost Trade-off                           │
├─────────────────────────────────────────────────────────┤
│ On-Demand    → Savings Plan → Reserved Instance → Spot │
│ (100% cost)  → (70% cost)   → (40% cost)        → (10%) │
└─────────────────────────────────────────────────────────┘

When to Use Each Tier

Different workload types benefit from different pricing tiers:

Workload Type	Best Option	Reason
Production baseline (steady-state)	Reserved Instances (1-3 year)	Predictable, long-term, maximum savings
Production with seasonal spikes	Reserved (baseline) + On-Demand (spikes)	Fixed baseline, flexible overflow
New application (first 3 months)	On-Demand or Savings Plan	Usage patterns unknown, pivot likely
Batch processing jobs	Spot instances	Fault-tolerant, interruptible okay, cost-sensitive
Development/testing	On-Demand	Ephemeral; a commitment would sit idle
Flexible production	Savings Plans	Comfortable with commitment, want flexibility on instance type

The key question: Is this workload steady-state and predictable, or variable and unpredictable? Steady-state → commitment. Variable → flexibility.

Did you know? Combining pricing tiers usually beats relying on any single one. Most mature applications use a mixed fleet: 60–70% reserved for baseline, 20–30% on-demand for overflow, 5–10% spot for non-critical work. This balances cost and flexibility.

AWS Savings Plans vs Reserved Instances

AWS offers two types of commitments, and the distinction matters:

Reserved Instances (RIs): You reserve a specific instance type (e.g., r5.xlarge) in a specific region (e.g., us-east-1) for 1 or 3 years. If you use that instance type in that region, you get the discount. If you switch to a different instance type or region, you lose the discount on the moved workload.

Standard RIs are cheaper but locked to instance type. Convertible RIs let you exchange to a different instance type (with some restrictions) but cost slightly more.

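Savings Plans: You commit to spending a certain amount per hour. Compute Savings Plans apply that commitment across any instance family, size, and region. EC2 Instance Savings Plans are tied to one instance family in one region but stay flexible on size within it, and offer a deeper discount in return. Unlike RIs, Savings Plans have no standard or convertible variants. Compute Savings Plans give slightly less discount than Standard RIs but adapt better to changing workloads.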
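To see how an hourly commitment turns into a bill, here is a minimal sketch. It assumes a single uniform discount rate across all usage, which is a simplification (real plans apply a different rate per instance type), and the function name is hypothetical.

```python
def hourly_bill(commitment, sp_discount, on_demand_usage):
    """Bill for one hour under a Savings Plan (simplified model).

    commitment      -- committed spend per hour in dollars, paid even if unused
    sp_discount     -- plan discount as a fraction, e.g. 0.32 for 32%
    on_demand_usage -- what the hour's usage would cost at on-demand rates
    """
    # Each committed dollar buys 1 / (1 - discount) dollars of on-demand usage.
    covered = commitment / (1.0 - sp_discount)
    # Anything above the covered amount is billed at normal on-demand rates.
    overflow = max(0.0, on_demand_usage - covered)
    return commitment + overflow

# Ten r5.xlarge instances at $0.192/hour against a $1.00/hour commitment.
busy_hour = hourly_bill(1.00, 0.32, 10 * 0.192)
# Five instances: usage falls below the commitment, which is still owed in full.
quiet_hour = hourly_bill(1.00, 0.32, 5 * 0.192)
print(f"busy: ${busy_hour:.2f}, quiet: ${quiet_hour:.2f}")
```

The quiet hour illustrates the trade-off: the commitment is charged whether or not you use it, so an oversized commitment erodes the discount.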

Example comparison for a $0.192/hour workload (r5.xlarge):

Option	Upfront (1 year)	Hourly	Annual Cost	Savings
On-Demand	$0	$0.192	$1,682	—
Standard RI (1-year)	$1,066	$0	$1,066	37%
Convertible RI (1-year)	$1,143	$0	$1,143	32%
Savings Plan (1-year)	$1,150	$0	$1,150	32%

Standard RIs are cheapest for locked-in workloads. Savings Plans offer flexibility. The choice depends on how confident you are in your workload’s stability.
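The savings column can be reproduced from the on-demand rate. This short sketch uses only the figures in the table above and assumes every option is paid fully upfront; `savings_pct` is an illustrative helper.

```python
HOURS_PER_YEAR = 8760
ON_DEMAND_HOURLY = 0.192

def savings_pct(annual_cost, baseline):
    """Percentage saved relative to the on-demand baseline."""
    return 100 * (1 - annual_cost / baseline)

baseline = ON_DEMAND_HOURLY * HOURS_PER_YEAR  # about $1,682

options = {
    "Standard RI (1-year)": 1066,
    "Convertible RI (1-year)": 1143,
    "Savings Plan (1-year)": 1150,
}

for name, annual_cost in options.items():
    print(f"{name}: ${annual_cost:,} ({savings_pct(annual_cost, baseline):.0f}% savings)")
```

Swapping in your own instance's hourly rate and quoted upfront prices gives the same comparison for any workload.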

Pro tip: Use AWS Cost Explorer to get automatic RI recommendations. It analyzes your usage patterns and recommends which instances to reserve, often finding savings you’d miss manually.

GCP Committed Use Discounts and Azure Reservations

GCP and Azure offer commitment pricing similar to AWS:

GCP Committed Use Discounts (CUDs): Commit for 1 or 3 years to get discounts on Compute Engine instances (30–37% for 1 year, 55–70% for 3 years). Resource-based CUDs commit you to an amount of vCPU and memory for a machine series in a region rather than to a specific instance size, which makes them more flexible than AWS Standard RIs. Spend-based (flexible) CUDs extend that flexibility across machine series and regions, at a somewhat lower discount.

Azure Reservations: Similar to AWS RIs. Commit for 1 or 3 years to get discounts (15–40% typically). Reservations can also be exchanged between instance types.

All three clouds operate under the same principle: commitment delivers savings.

Spot Instances: Ultra-Cheap, Interruptible

Spot instances run on unused data center capacity that cloud providers can reclaim anytime. Discount: up to 90% off on-demand prices.

Spot Pricing Dynamics: Spot price fluctuates based on supply and demand. If many people want m5.large instances, spot price rises. If supply exceeds demand, price drops. You can set a max price and your instance terminates if spot price exceeds it.

Interruption Handling: When spot capacity is reclaimed, you get a 2-minute warning (AWS). Your application should be stateless or checkpointable—losing an instance shouldn’t cause data loss.
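On AWS, the warning is published through the instance metadata service at `/latest/meta-data/spot/instance-action`, which returns 404 until an interruption is scheduled. The sketch below covers only the parsing and timing logic. Fetching the endpoint is left out, the function names are illustrative, and the sample payload is made up to match the documented shape.

```python
import json
from datetime import datetime, timezone

def parse_interruption_notice(body):
    """Return the scheduled interruption time, or None if no notice is pending.

    `body` is the response body from the instance-action endpoint, or None
    when the endpoint returned 404 (no interruption scheduled).
    """
    if not body:
        return None
    notice = json.loads(body)
    # The timestamp is UTC, formatted like 2024-05-01T12:02:00Z.
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return deadline.replace(tzinfo=timezone.utc)

def seconds_remaining(deadline, now):
    """Time left to checkpoint and drain before the instance is reclaimed."""
    return max(0.0, (deadline - now).total_seconds())

# Illustrative payload, not captured from a real instance.
sample = '{"action": "terminate", "time": "2024-05-01T12:02:00Z"}'
now = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
deadline = parse_interruption_notice(sample)
print(seconds_remaining(deadline, now))
```

A worker would poll every few seconds and start checkpointing as soon as `parse_interruption_notice` returns a deadline, since the full two minutes is not guaranteed to remain by the time the notice is seen.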

Best Practices for Spot:

Use for fault-tolerant workloads: batch jobs, data processing, non-critical services.
Diversify instance types: instead of all m5.large, request m5.large, m5.xlarge, m6i.large. The scheduler can pick any available.
Combine with on-demand fallback: if spot instances aren’t available, fall back to on-demand to maintain availability.
Monitor interruption frequency: if a particular instance type gets interrupted frequently, switch to a different type.
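The last practice, monitoring interruption frequency, can start as a simple tally. This is a sketch under the assumption that you log one entry per spot launch and one per interruption; the 25% threshold and the sample data are arbitrary illustrations.

```python
from collections import Counter

def interruption_rates(launches, interruptions):
    """Fraction of launched spot instances that were interrupted, per type."""
    launched = Counter(launches)
    interrupted = Counter(interruptions)
    return {itype: interrupted[itype] / count for itype, count in launched.items()}

def types_to_avoid(rates, threshold=0.25):
    """Instance types whose interruption rate exceeds the threshold."""
    return sorted(itype for itype, rate in rates.items() if rate > threshold)

# Hypothetical log: one entry per launch, one per interruption.
launches = ["m5.large"] * 10 + ["m5.xlarge"] * 10 + ["m6i.large"] * 5
interruptions = ["m5.large"] * 4 + ["m6i.large"] * 1

rates = interruption_rates(launches, interruptions)
print(rates)
print(types_to_avoid(rates))
```

In this made-up log only m5.large crosses the threshold, so it would be the first candidate to drop from the diversified request.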

# Kubernetes Pod requesting spot instances with on-demand fallback
apiVersion: v1
kind: Pod
metadata:
  name: batch-job
spec:
  containers:
  - name: job
    image: batch-processor:latest
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: karpenter.sh/capacity-type
            operator: In
            values: ["spot"]
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: karpenter.sh/capacity-type
            operator: In
            values: ["spot", "on-demand"]

Reservation Coverage Analysis: The 70-20-10 Rule

How should you allocate your fleet across pricing tiers? A practical starting framework:

70% Reserved: For your predictable, steady-state baseline capacity. If you consistently run 100 instances, reserve 70. Lowest cost.

20% On-Demand: For peaks and traffic spikes. This is capacity you use frequently but not constantly, so you pay the full rate for it and keep the freedom to release it.

10% Spot: For non-critical work, batch jobs, or additional savings if you can tolerate interruptions.

This balance gives you 37–45% average savings versus pure on-demand pricing while maintaining flexibility for spikes and changes.
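The blended figure is a weighted average of each tier's discount. The sketch below assumes the discount ranges quoted earlier in this chapter (40–60% for reserved, up to 90% for spot) and treats on-demand as a 0% discount; the function name is illustrative.

```python
def blended_savings(mix, discounts):
    """Weighted-average discount versus running the whole fleet on-demand."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("fleet shares must sum to 1")
    return sum(share * discounts[tier] for tier, share in mix.items())

MIX = {"reserved": 0.70, "on_demand": 0.20, "spot": 0.10}

low = blended_savings(MIX, {"reserved": 0.40, "on_demand": 0.0, "spot": 0.90})
high = blended_savings(MIX, {"reserved": 0.60, "on_demand": 0.0, "spot": 0.90})
print(f"{low:.0%} to {high:.0%}")
```

Under these assumptions the blend lands between about 37% and 51%. The 37–45% quoted above sits in the lower part of that range, which is reasonable given that real spot discounts are often below the 90% maximum.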

You adjust this ratio based on your risk tolerance and workload characteristics:

More conservative (higher risk aversion): stay at 70/20/10 or move to 60/30/10
More aggressive (tolerates some risk): 75/15/10 or 80/10/10

The goal: maximize savings while maintaining availability and performance.

Reservation Planning: Coverage and Utilization Metrics

Before committing to reservations, measure two key metrics:

Coverage: What percentage of your resource consumption is covered by reservations? If you run 100 instances daily but only have 40 reserved, your coverage is 40%. If you have 80 reserved, your coverage is 80%.

Higher coverage means more savings. But over-reserving wastes money if workload shrinks.

Utilization: Do you actually use the resources you’ve reserved? If you have 80 reserved instances but only use 70 daily, your utilization is 87.5%. The remaining 12.5% is paid for but wasted.

AWS Cost Explorer shows both metrics. Ideal state: 90–95% utilization on 70–80% coverage. This leaves room for spikes while minimizing waste.
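Both metrics come down to one comparison between what you reserved and what you ran. A minimal sketch, using instance counts as in the examples above (a billing tool would use instance-hours, but the ratios work the same way):

```python
def coverage(reserved, running):
    """Share of running instances that are billed at the reserved rate."""
    if running == 0:
        return 0.0
    return min(reserved, running) / running

def utilization(reserved, running):
    """Share of reserved instances that are actually in use."""
    if reserved == 0:
        return 0.0
    return min(reserved, running) / reserved

print(f"coverage:    {coverage(40, 100):.1%}")     # 40 reserved, 100 running
print(f"utilization: {utilization(80, 70):.1%}")   # 80 reserved, 70 running
```

The two metrics pull in opposite directions: buying more reservations raises coverage but lowers utilization once reservations exceed what you run.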

Reserved Instance Metrics Dashboard
┌─────────────────────────────┐
│ Coverage: 75% (good)        │
│ Utilization: 92% (healthy)  │
│ Unused Reserved: 6 instances│
│ Potential Savings: $8,500   │
└─────────────────────────────┘

Combining Reserved Instances with Auto Scaling

Here’s where reserved instances and Auto Scaling create an optimal cost profile:

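Set your Auto Scaling Group’s minimum capacity equal to your reserved instance count. When load increases and Auto Scaling spins up additional instances, those extra instances are billed at on-demand rates. When load decreases, Auto Scaling scales back toward the minimum. A reservation is a billing discount, not a specific machine: it applies automatically to whichever matching instances are running, so it does not matter which ones Auto Scaling terminates.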

Example: You reserve 50 instances. Your ASG min/max is 50/200. Under normal load, you run 50 instances (all reserved, discounted). During a traffic spike, you auto-scale to 150 instances (50 reserved + 100 on-demand). You get the benefit of discounted baseline capacity and flexible overflow.
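The cost profile of that example can be sketched as a function of the running instance count. The rates here are assumptions borrowed from the r5.xlarge figures earlier in the chapter: $0.192/hour on-demand and a 1-year RI at $1,066, spread evenly over the year.

```python
ON_DEMAND_HOURLY = 0.192
RESERVED_HOURLY = 1066 / 8760  # effective hourly rate of the 1-year RI

def hourly_fleet_cost(running, reserved=50):
    """Hourly cost when reservations cover the first `reserved` instances."""
    # Reservations are paid for whether or not the instances are running.
    committed = reserved * RESERVED_HOURLY
    # Instances beyond the reserved count are billed at on-demand rates.
    overflow = max(0, running - reserved)
    return committed + overflow * ON_DEMAND_HOURLY

for running in (50, 150):
    mixed = hourly_fleet_cost(running)
    all_on_demand = running * ON_DEMAND_HOURLY
    print(f"{running} instances: ${mixed:.2f}/hour vs ${all_on_demand:.2f}/hour on-demand")
```

The reserved portion is a fixed floor, which is why the ASG minimum should match the reservation count: dropping below it saves nothing.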

Load Pattern Over Time
┌─────────────────────────────────────────────┐
│ 200 ├─ Spot/On-demand (spike capacity)     │
│     │                    ╱╲                 │
│ 125 ├─ On-demand        ╱  ╲                │
│     │                  ╱    ╲               │
│  50 ├─ Reserved ────────      ──────────── │
│     └──────────────────────────────────────┤
│       Monday  Tuesday  Wednesday Thursday  │
└─────────────────────────────────────────────┘

The Risk of Over-Reserving

Here’s where commitment becomes dangerous: if you reserve 100 instances expecting 100 daily usage, but actual usage drops to 60, you’re paying for 40 unused instances for the rest of the term. Each month, you’re “locked in” to that cost.

Solutions:

Measure first, reserve later: Before committing for 3 years, run for 2–3 months on-demand, measure, then reserve.
Start conservative: Reserve 60% of your baseline, use on-demand for the rest. Once you’re confident, increase reservation percentage.
Use the Marketplace: The AWS Reserved Instance Marketplace lets you sell unused Standard RIs. You won’t recover 100% of the cost, but you can recoup 50–70%.
Convertible RIs: More expensive than standard, but if your instance type needs change, you can exchange them for a different configuration of equal or greater value instead of being stuck with the original.
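"Measure first, reserve later" can be made concrete: given measured hourly usage, try each reservation count and keep the cheapest. This is a sketch that assumes a flat effective reserved rate and a flat on-demand rate; the usage sample and the rates are illustrative.

```python
def total_cost(reserved, hourly_usage, reserved_hourly, on_demand_hourly):
    """Cost of a usage trace when `reserved` instances are committed."""
    # Reserved capacity is paid for in every hour, used or not.
    committed = reserved * reserved_hourly * len(hourly_usage)
    # Instance-hours above the reservation are billed on-demand.
    overflow_hours = sum(max(0, used - reserved) for used in hourly_usage)
    return committed + overflow_hours * on_demand_hourly

def best_reservation(hourly_usage, reserved_hourly, on_demand_hourly):
    """Reservation count that minimizes cost for the measured usage."""
    candidates = range(max(hourly_usage) + 1)
    return min(
        candidates,
        key=lambda r: total_cost(r, hourly_usage, reserved_hourly, on_demand_hourly),
    )

# One measured day: 60 instances for 16 hours, 100 instances for 8 hours.
usage = [60] * 16 + [100] * 8
best = best_reservation(usage, reserved_hourly=0.122, on_demand_hourly=0.192)
print(best, round(total_cost(best, usage, 0.122, 0.192), 2))
```

For this trace the cheapest choice is to reserve the 60-instance floor and leave the 8-hour peak on-demand. Reserving all 100 costs more because the extra 40 sit idle two thirds of the time.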

Multi-Cloud Cost Optimization

If you operate across AWS, GCP, and Azure, you have an additional optimization lever: use the cheapest cloud for each workload.

Example: Your batch processing job runs 10 hours/day on 50 instances. Where is it cheapest?

Cloud	Instance Type	Hourly Cost	Monthly (300 hours × 50 instances)
AWS	m5.large (on-demand)	$0.096	$1,440
GCP	n2-standard-2 (on-demand)	$0.084	$1,260
Azure	Standard B2s (on-demand)	$0.068	$1,020

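Running on Azure saves $420/month versus AWS, or about $5,040/year. The Monthly column already covers all 50 instances (hourly rate × 300 hours × 50), so this is the fleet-wide saving for the workload. The instance types are only roughly comparable: the B2s is a burstable type with less memory than an m5.large, so benchmark before moving.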
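The Monthly column can be reproduced as hourly rate × 300 hours × 50 instances, which makes it a fleet-wide figure. A short sketch using the rates from the table (the helper name is illustrative):

```python
HOURS_PER_MONTH = 300   # 10 hours/day, 30 days
INSTANCES = 50

HOURLY_RATES = {
    "AWS m5.large": 0.096,
    "GCP n2-standard-2": 0.084,
    "Azure Standard B2s": 0.068,
}

def monthly_fleet_cost(hourly_rate, hours=HOURS_PER_MONTH, instances=INSTANCES):
    """Monthly cost of the whole fleet at a given per-instance hourly rate."""
    return hourly_rate * hours * instances

costs = {name: monthly_fleet_cost(rate) for name, rate in HOURLY_RATES.items()}
cheapest = min(costs, key=costs.get)
saving = costs["AWS m5.large"] - costs[cheapest]

for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f}/month")
print(f"Cheapest: {cheapest}, saving ${saving:,.0f}/month versus AWS")
```

A fuller comparison would add data transfer between clouds and the engineering time to run in more than one, neither of which is modeled here.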

The trade-off: operational complexity of multi-cloud management. But for cost-sensitive workloads, it’s worth considering.

Key Takeaways

Commitment saves money but reduces flexibility: Reserved instances offer 40–60% savings versus on-demand. Use for steady-state workloads; stay on-demand for variable workloads.
Savings Plans offer more flexibility than Reserved Instances: If your instance type might change, Savings Plans let you commit to spending with flexibility on what you buy.
Spot instances are for fault-tolerant workloads: Up to 90% savings, but the cloud provider can terminate you anytime. Use for batch jobs, data processing, non-critical services.
The 70-20-10 rule balances cost and flexibility: 70% reserved, 20% on-demand, 10% spot is a reasonable starting point for most applications.
Measure before committing: Use AWS Cost Explorer or cloud provider tools to analyze usage patterns and get recommendations. Reserve only what you’re confident you’ll use.
Combine reservations with Auto Scaling: Set minimum capacity to your reserved instance count and let Auto Scaling handle spikes with on-demand overflow.

Practice Scenarios

Scenario 1: Your application fleet consists of 100 m5.large instances running 24/7, plus 20 additional instances that run only during business hours (9 AM to 5 PM, 5 days/week). Calculate the optimal mix of on-demand, 1-year reserved, and 3-year reserved instances to minimize cost while maintaining flexibility for growth.

Scenario 2: You operate a batch processing workload that runs on 200 c5.2xlarge instances for 8 hours every night. Spot instances average $0.10/hour (about 70% off on-demand). On-demand c5.2xlarge costs $0.34/hour. Calculate the annual cost difference between (a) pure on-demand, (b) pure spot, and (c) a hybrid approach with spot and on-demand fallback.

Scenario 3: Your reserved instance utilization is 65% (you reserve capacity you don’t fully use). Your coverage is 50% (only half your instances are reserved). Should you buy more reservations or reduce existing reservations? What questions would you ask before deciding?

Next: We’ve covered compute pricing, right-sizing, and commitment decisions. But compute is only part of the cost picture. Storage cost optimization involves different strategies, particularly choosing the right storage class and data lifecycle policies. That’s the focus of Chapter 24. For now, keep in mind that storage costs compound over time, and a poor data lifecycle policy can add thousands to your monthly bill.