System Design Fundamentals

Data Transfer Costs

The Hidden Tax of Cloud Architecture

Your architecture spans two regions for “global redundancy.” Services in us-east-1 communicate with services in eu-west-1 for backup and disaster recovery. It feels right — spread across the globe, resilient to regional failures.

Then the monthly AWS bill arrives: $5,000 for data transfer alone. That’s more than your compute costs.

You investigate. Of the 500GB transferred daily across regions, 80% is internal service communication (microservice A talking to microservice B), not customer-facing traffic. The cross-region redundancy that felt like good architecture is actually a $5,000-per-month tax.

Data transfer is the hidden cost multiplier of cloud computing. The basic principle is deceptively simple: ingress is free, but egress costs money. What catches teams by surprise is how these costs compound in distributed architectures.

The Asymmetry: Why Ingress is Free

Cloud providers have a business reason for this pricing model. Free ingress encourages data migration to their platform, while egress fees act as a lock-in mechanism: once your data is in AWS, you pay to move it anywhere else.

The asymmetry also reflects real infrastructure costs. Ingress traffic comes from customers and partners already paying for their outbound connections. Egress requires the cloud provider to purchase expensive international bandwidth. Cross-region traffic requires peering agreements and backbone connectivity. It’s genuinely expensive to move data globally.

Either way, your architecture decisions directly affect your bill. A microservice architecture that minimizes communication across AZ and region boundaries costs less than one that doesn’t.

The Data Transfer Tier Hierarchy

Data transfer pricing depends on where the data goes:

| Path | Cost per GB | Example |
|---|---|---|
| Within same AZ (private IP) | Free | EC2 to EC2 in us-east-1a |
| Cross-AZ (same region) | $0.01 each direction | EC2 in us-east-1a to us-east-1b |
| Cross-region | $0.02 | us-east-1 to eu-west-1 |
| To Internet (0-10TB/month) | $0.09 | EC2 to external API |
| To Internet (10-50TB/month) | $0.085 | EC2 to external API at scale |
| CloudFront to Internet | $0.085 | Delivered via CDN |
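To make the tiers concrete, here is a minimal Python sketch that turns the table into a lookup. The dictionary keys and the `monthly_transfer_cost` helper are illustrative names rather than any AWS API, and the rates are simply the table’s figures, which vary by region and change over time.

```python
# Per-GB rates (USD) copied from the table above. Assumption: these are
# illustrative list prices; real rates vary by region and over time.
RATES_PER_GB = {
    "same_az": 0.00,
    "cross_az": 0.01,       # billed in each direction
    "cross_region": 0.02,
    "internet": 0.09,       # first pricing tier only
    "cloudfront": 0.085,
}


def monthly_transfer_cost(path: str, gb_per_day: float, days: int = 30) -> float:
    """Return the monthly cost in USD of moving gb_per_day over one path."""
    return RATES_PER_GB[path] * gb_per_day * days


# Compare the same daily volume across every path in the table.
for path in RATES_PER_GB:
    cost = monthly_transfer_cost(path, gb_per_day=500)
    print(f"{path:>12}: ${cost:,.2f}/month for 500 GB/day")
```

Running the same volume through every path makes it easy to see how much a placement decision is worth before committing to it.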

The costs escalate quickly. That’s why architectural decisions around co-location matter.

The NAT Gateway Trap

Here’s a gotcha that catches many teams: NAT Gateways charge for data processing in addition to standard data transfer charges.

When you push data through a NAT Gateway to reach the internet, you pay:

  1. NAT Gateway hourly charge: $0.045/hour
  2. Data processing charge: $0.045 per GB processed
  3. Plus standard data transfer charges: $0.09 per GB to internet

That same 1GB of data to the internet costs $0.09 directly from an EC2 instance with an Elastic IP, but $0.135 through a NAT Gateway ($0.045 + $0.09), before the hourly charge. That’s 1.5x more expensive.
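The three NAT Gateway charges can be combined in a short Python sketch. The constants are the figures quoted in the list above, and the function names are hypothetical helpers for illustration.

```python
NAT_HOURLY = 0.045        # USD per hour the gateway is provisioned
NAT_PROCESSING = 0.045    # USD per GB processed by the gateway
INTERNET_EGRESS = 0.09    # USD per GB to the internet (first tier)


def egress_cost_direct(gb: float) -> float:
    """Cost of sending gb to the internet from an instance with a public IP."""
    return gb * INTERNET_EGRESS


def egress_cost_via_nat(gb: float, hours: float = 0.0) -> float:
    """Cost of the same traffic through a NAT Gateway, plus optional hourly fee."""
    return hours * NAT_HOURLY + gb * (NAT_PROCESSING + INTERNET_EGRESS)


direct = egress_cost_direct(1)
via_nat = egress_cost_via_nat(1)
print(f"direct: ${direct:.3f}/GB, via NAT: ${via_nat:.3f}/GB, "
      f"ratio: {via_nat / direct:.2f}x")

# A full month (about 730 hours) with 1 TB of egress:
print(f"1 TB via NAT for a month: ${egress_cost_via_nat(1000, hours=730):,.2f}")
```

Per GB, the NAT path adds the processing fee on top of normal egress; the hourly fee is a fixed cost that matters most at low volumes.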

And when service-to-service traffic between VPCs is routed through NAT Gateways on both sides, you pay:

  1. Data processing in the source NAT Gateway: $0.045/GB
  2. Standard cross-AZ transfer charge: $0.01/GB (if different AZ)
  3. Data processing in the destination NAT Gateway: $0.045/GB

Suddenly that service-to-service call costs $0.09/GB in NAT processing alone, or $0.10/GB once you add the cross-AZ transfer charge.

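Pro tip: VPC Endpoints provide private connectivity to AWS services (S3, DynamoDB, SQS, SNS, etc.) without traversing the internet or a NAT Gateway. Gateway endpoints, available for S3 and DynamoDB, are free: a query from EC2 to S3 via a gateway endpoint incurs no data transfer or processing charges. Interface endpoints for other services carry an hourly fee and a per-GB processing charge, though that per-GB charge is well below the NAT Gateway rate. The savings add up quickly if you’re doing lots of S3 operations through NAT.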

Content Delivery Networks as Cost Optimization

You might think CDNs are a luxury for high-traffic sites. They’re actually a cost optimization tool.

Serving static assets directly from EC2: $0.09/GB to customers on the internet.

Serving the same assets through CloudFront: $0.085/GB, and with aggressive caching you might serve 90% of requests from CloudFront edge caches. That means only 10% of requests hit your origin, and transfer from an AWS origin to CloudFront is not billed.

Example: 1TB of static assets per month served to global users.

  • Direct from EC2: 1,000GB × $0.09 = $90
  • CloudFront (90% cache hit): 1,000GB × $0.085 edge delivery = $85, with the 100GB of origin fetches free of transfer charges

In this scenario, CloudFront is slightly cheaper, not more expensive. And that’s before factoring in the compute savings from reduced origin requests.

But the real savings come from bandwidth reduction. CDN edge caching means:

  1. Users get faster downloads (edge server is geographically closer)
  2. Your origin servers get fewer requests (less compute, less data transfer from your VPC)
  3. You pay less for global distribution

For a mobile app downloading 50MB per user with 1M monthly active users, that is 50TB per month: about $4,500 at $0.09/GB direct from EC2 versus about $4,250 at $0.085/GB through CloudFront, with a better user experience and far less load on your origin.

Architecture Patterns That Minimize Transfer Costs

1. Co-locate Services That Communicate Frequently

If microservice A calls microservice B 1,000 times per request, and each call sends 10KB of data:

  • Same AZ: 1,000 × 10KB × 1M requests/day = 10TB/day × $0 = $0
  • Cross-AZ: 10TB/day × $0.01 = $100/day = $3,000/month (one direction; double it when both the sending and receiving side are billed)
  • Cross-region: 10TB/day × $0.02 = $200/day = $6,000/month

The co-location difference is dramatic. In a typical microservices architecture, 70-80% of traffic is internal service-to-service communication. Where you place these services directly impacts cost.
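The arithmetic behind these placement numbers is easy to parameterize. The following Python sketch uses a hypothetical helper, `internal_traffic_monthly_cost`, with decimal units (1GB = 1,000,000KB) and the one-way per-GB rates from the list above.

```python
def internal_traffic_monthly_cost(
    calls_per_request: int,
    kb_per_call: float,
    requests_per_day: int,
    rate_per_gb: float,
    days: int = 30,
) -> float:
    """Monthly USD cost of service-to-service traffic at a given per-GB rate."""
    # Decimal units: 1 GB = 1,000,000 KB.
    gb_per_day = calls_per_request * kb_per_call * requests_per_day / 1_000_000
    return gb_per_day * rate_per_gb * days


# One-way rates from the list above; placement is the only variable.
placements = {"same AZ": 0.00, "cross-AZ": 0.01, "cross-region": 0.02}
for name, rate in placements.items():
    cost = internal_traffic_monthly_cost(1000, 10, 1_000_000, rate)
    print(f"{name:>12}: ${cost:,.0f}/month")
```

Plugging in your own call fan-out and payload size shows quickly whether co-location is worth the loss of AZ spread for a given pair of services.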

2. Use VPC Endpoints for AWS Service Access

Instead of accessing S3 or DynamoDB through the internet gateway (via NAT Gateway), use VPC Endpoints:

Query path with NAT Gateway:
VPC → NAT Gateway → Internet Gateway → S3
Cost: $0.045/GB in NAT data processing (same-region transfer to S3 itself is free)

Query path with VPC Endpoint:
VPC → VPC Endpoint → S3 (private AWS network)
Cost: $0/GB with a gateway endpoint

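For a service making 10 billion requests to S3 monthly (retrieving 100GB of data), a gateway VPC Endpoint saves you the NAT processing fee on every byte:

  • 10B requests × $0.0000004 per request (estimate) = $4,000 in S3 request fees, billed on either path, so not a saving
  • 100GB × $0.045/GB NAT processing = $4.50/month saved
  • The saving scales with volume: at 100TB per month, 100,000GB × $0.045/GB = $4,500/month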
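As a rough check on endpoint savings, the sketch below compares the two S3 paths in Python. It assumes same-region S3 access, a gateway endpoint with no per-GB charge, and S3 request fees that are identical on both paths and therefore left out; the function names are illustrative.

```python
NAT_PROCESSING_PER_GB = 0.045  # USD per GB processed by a NAT Gateway


def s3_path_cost(gb: float, via_nat: bool) -> float:
    """Monthly per-GB charges for reaching same-region S3 over one path.

    Assumption: a gateway endpoint adds no per-GB charge, and S3 request
    fees are the same on both paths, so they are excluded here.
    """
    return gb * NAT_PROCESSING_PER_GB if via_nat else 0.0


def endpoint_savings(gb: float) -> float:
    """USD saved per month by moving gb of S3 traffic off the NAT Gateway."""
    return s3_path_cost(gb, via_nat=True) - s3_path_cost(gb, via_nat=False)


for gb in (100, 10_000, 100_000):
    print(f"{gb:>7,} GB/month -> ${endpoint_savings(gb):,.2f} saved")
```

Because the saving is linear in bytes, the endpoint matters most for data-heavy workloads such as backups, analytics, and log shipping.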

3. Compress Data Before Transfer

Compression is one of the highest-ROI optimizations. Most structured data compresses to 30-50% of original size.

A service transferring 100GB of JSON daily:

  • Uncompressed: 100GB/day × $0.01 (cross-AZ) × 30 days = $30/month
  • Compressed (50%): 50GB/day × $0.01 × 30 days = $15/month
  • Savings: $15/month for the cost of a few CPU cycles on compression, scaling linearly with volume (100TB/day would save $15,000/month)
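Compression ratios are worth measuring on real payloads rather than assuming. Here is a minimal Python sketch using the standard library’s `gzip` module; the sample records are synthetic and highly repetitive, so they compress far better than typical production data, and `monthly_cost` is a hypothetical helper using the cross-AZ rate from the example above.

```python
import gzip
import json


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(gzip.compress(payload, compresslevel=level)) / len(payload)


def monthly_cost(gb_per_day: float, ratio: float = 1.0,
                 rate_per_gb: float = 0.01, days: int = 30) -> float:
    """Monthly transfer cost in USD after shrinking the payload by `ratio`."""
    return gb_per_day * ratio * rate_per_gb * days


# Synthetic sample data (assumption): substitute a capture of real traffic.
records = [
    {"user_id": i, "event": "page_view", "path": f"/products/{i % 50}", "ok": True}
    for i in range(1000)
]
raw = json.dumps(records).encode("utf-8")
ratio = compression_ratio(raw)

print(f"raw: {len(raw)} bytes, gzip ratio: {ratio:.2f}")
print(f"uncompressed: ${monthly_cost(100):.2f}/month")
print(f"compressed:   ${monthly_cost(100, ratio):.2f}/month")
```

Measuring the ratio per payload type also shows where compression is pointless, for example on images or data that is already compressed.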

4. Cache Aggressively to Reduce Origin Fetches

Every cache miss is data transfer. In a CDN setup:

  • Cache hit: delivered from edge, no origin transfer
  • Cache miss: origin delivers, you pay for transfer

A 10% cache miss rate on 1TB of content = 100GB of origin transfer = $9/month at $0.09/GB to the internet. Cutting the cache miss rate to 5% halves that to $4.50/month, and the saving grows linearly with volume.
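The relationship between hit rate and origin transfer can be sketched in a few lines of Python. The helper names are illustrative, and the default `rate_per_gb` of $0.09 is an assumption that the origin is billed at internet egress rates; pass a different rate (or 0) if your CDN and origin are billed differently.

```python
def origin_transfer_gb(total_gb: float, hit_rate: float) -> float:
    """GB fetched from the origin for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return total_gb * (1.0 - hit_rate)


def origin_transfer_cost(total_gb: float, hit_rate: float,
                         rate_per_gb: float = 0.09) -> float:
    """Monthly USD cost of origin fetches at an assumed per-GB rate."""
    return origin_transfer_gb(total_gb, hit_rate) * rate_per_gb


# Each step up in hit rate shrinks origin traffic and its cost.
for hit_rate in (0.90, 0.95, 0.99):
    gb = origin_transfer_gb(1000, hit_rate)
    cost = origin_transfer_cost(1000, hit_rate)
    print(f"hit rate {hit_rate:.0%}: {gb:.0f} GB from origin, ${cost:.2f}/month")
```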

Multi-Region Architecture: The Cost Reality

Multi-region deployments are often justified for disaster recovery or compliance. But the ongoing transfer costs are substantial.

Let’s calculate a realistic scenario:

Setup: Primary region (us-east-1), standby region (eu-west-1)

Traffic Patterns:

  • 1M customer requests/day to primary
  • Each request generates 5KB of response = 5GB/day outbound to customers
  • Service-to-service communication: 50GB/day per region
  • Database replication: 10GB/day cross-region

Monthly Costs:

| Component | Monthly Volume | Unit Cost | Total |
|---|---|---|---|
| Customer responses (primary) | 150GB | $0.09/GB | $13.50 |
| Customer responses (secondary) | 150GB | $0.09/GB | $13.50 |
| Service-to-service (each region) | 1,500GB | $0.01/GB (cross-AZ) | $30 |
| Database replication | 300GB | $0.02/GB (cross-region) | $6 |
| Total | | | $63/month |
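The baseline in the table can be reproduced with a small Python sketch, which makes it easy to swap in your own volumes. The line items mirror the table; the assumption that the service-to-service row is billed once per region (1,500GB × $0.01 × 2 regions = $30) is inferred from the table’s total.

```python
# (component, monthly GB, USD per GB, number of regions billed)
LINE_ITEMS = [
    ("Customer responses (primary)",   150,  0.09, 1),
    ("Customer responses (secondary)", 150,  0.09, 1),
    ("Service-to-service, cross-AZ",   1500, 0.01, 2),  # assumed: per region
    ("Database replication",           300,  0.02, 1),
]


def line_total(gb: float, rate: float, regions: int) -> float:
    """USD cost of one line item across the regions it is billed in."""
    return gb * rate * regions


def baseline_total(items=LINE_ITEMS) -> float:
    """Sum of all line items in USD per month."""
    return sum(line_total(gb, rate, regions) for _, gb, rate, regions in items)


for name, gb, rate, regions in LINE_ITEMS:
    print(f"{name:<32} ${line_total(gb, rate, regions):>7.2f}")
print(f"{'Total':<32} ${baseline_total():>7.2f}")
```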

This seems cheap for multi-region. But now add:

  • NAT Gateway processing for services using NAT: +$2,000-5,000/month
  • Cross-region load balancing traffic: +$1,000/month
  • Redundant service communication: 2x the normal service-to-service traffic

Your “cheap” multi-region setup is suddenly roughly $3,000-6,000/month just for data movement, almost all of it from the add-ons rather than the $63 baseline.

Real question: Do you need the standby region active all the time? Or can you keep it cold and activate it only during failover? Keeping a standby “warm” (with services running, data replicating) is expensive. Keeping it “cold” (infrastructure exists, but services stopped, data syncs periodically) is much cheaper.

Cost Audit: Where Does Your Data Transfer Cost Hide?

graph TD
    A["Data Transfer Audit"] --> B["Identify High-Volume Paths"]
    B --> C{"Same AZ?"}
    C -->|No| D["Calculate cross-AZ cost"]
    D --> E{"NAT Gateway Involved?"}
    E -->|Yes| F["Add NAT processing cost"]
    E -->|No| G["Check for VPC Endpoints"]
    G --> H{"Could use VPC Endpoint?"}
    H -->|Yes| I["Potential savings identified"]
    C -->|Yes| J["Look for compression opportunities"]
    J --> K["Quantify bandwidth savings"]

Use VPC Flow Logs and Cost Explorer to identify your high-cost paths:

# List the VPC Flow Logs that are configured. The log records themselves
# must then be queried to find cross-AZ traffic.
aws ec2 describe-flow-logs \
  --query 'FlowLogs[*].[FlowLogId,ResourceId]'

# Break down data transfer spend by region in Cost Explorer.
# End is exclusive, so this range covers all of January.
aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-02-01 \
  --granularity MONTHLY \
  --metrics "UnblendedCost" \
  --filter file://data-transfer-filter.json \
  --group-by Type=DIMENSION,Key=REGION
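Once flow log records are exported, a short script can total the bytes that cross AZ boundaries. The Python sketch below assumes the default (version 2) flow log record format of 14 space-separated fields, and a hypothetical `SUBNET_AZ` mapping that you would replace with your own VPC’s subnets; the sample records are made up for illustration.

```python
import ipaddress
from collections import defaultdict

# Hypothetical subnet-to-AZ mapping; replace with your VPC's real subnets.
SUBNET_AZ = {
    ipaddress.ip_network("10.0.1.0/24"): "us-east-1a",
    ipaddress.ip_network("10.0.2.0/24"): "us-east-1b",
}


def az_of(addr: str):
    """Return the AZ whose subnet contains addr, or None if unknown."""
    ip = ipaddress.ip_address(addr)
    for network, az in SUBNET_AZ.items():
        if ip in network:
            return az
    return None


def cross_az_bytes(lines):
    """Sum accepted bytes per (source AZ, destination AZ) pair.

    Default record format: version account-id interface-id srcaddr dstaddr
    srcport dstport protocol packets bytes start end action log-status
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Skip headers, NODATA/SKIPDATA records, and rejected traffic.
        if len(fields) != 14 or fields[13] != "OK" or fields[12] != "ACCEPT":
            continue
        src_az, dst_az = az_of(fields[3]), az_of(fields[4])
        if src_az and dst_az and src_az != dst_az:
            totals[(src_az, dst_az)] += int(fields[9])
    return dict(totals)


# Made-up sample records for illustration only.
SAMPLE = [
    "2 123456789012 eni-0a1 10.0.1.5 10.0.2.9 44321 443 6 10 8400 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1 10.0.1.5 10.0.1.7 44322 443 6 8 5000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0b2 10.0.2.9 10.0.1.5 443 44321 6 4 1600 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1 - - - - - - - 1700000000 1700000060 - NODATA",
]

for (src, dst), total in sorted(cross_az_bytes(SAMPLE).items()):
    print(f"{src} -> {dst}: {total} bytes")
```

At production volumes the same aggregation is usually done in a query engine over the exported logs, but the logic is the same: map each address to an AZ and sum the bytes where the two differ.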

Key Takeaways

  • Ingress is free; egress costs compound. Every architecture decision that moves data across boundaries has a direct cost. Minimize cross-AZ, cross-region, and internet-bound traffic.
  • NAT Gateways have hidden costs. Each GB through NAT costs $0.045 in processing fees alone, on top of standard transfer charges. Prefer direct EC2 Elastic IPs or VPC Endpoints.
  • VPC Endpoints cut NAT charges. Gateway endpoints for S3 and DynamoDB remove the NAT processing fee from that traffic at no extra cost. If you access S3 or other AWS services frequently, this is essential.
  • CDN isn’t just for performance. CloudFront is cheaper than direct EC2-to-internet transfer at scale and reduces origin load.
  • Co-location matters. Place services that communicate frequently in the same AZ. Cross-region communication should be minimal and asynchronous when possible.
  • Multi-region is expensive. Truly active multi-region can cost thousands of dollars per month in data transfer alone, before accounting for compute duplication. Justify it carefully.

Practice Scenarios

Scenario 1: The Microservices Surprise

Your microservices architecture has 25 services in a single region. Each service makes an average of 50 requests to other services per customer request. Each request is 5KB. You process 1M customer requests/day. Services are spread across 3 AZs for resilience.

Calculate monthly cross-AZ data transfer costs (internal service communication only, not customer traffic).

Answer: 1M × 50 × 5KB = 250GB/day, or 7,500GB/month, assuming the worst case where every call crosses an AZ boundary. At $0.01/GB in each direction, that’s roughly 15,000GB × $0.01 = $150/month. This seems small, but if services were co-located in one AZ, it would be $0. The real question: is the 3-AZ spread worth $150/month? For most teams, that’s cheap resilience. But if your service-to-service traffic is heavier, it compounds.

Scenario 2: The NAT Gateway Decision

A compliance requirement demands all S3 access goes through a specific NAT Gateway in a centralized VPC (for audit logging). That NAT Gateway isn’t in the same AZ as your services. Your services make 100M S3 requests monthly, totaling 500GB of data transfer.

Calculate the cost of routing through NAT Gateway vs using a VPC Endpoint directly to S3 in your service VPC.

Answer: NAT Gateway path = 500GB × $0.045 (processing) + 500GB × $0.01 (cross-AZ) = $22.50 + $5.00 = $27.50/month, plus the gateway’s hourly charge. VPC Endpoint path = $0 (no data transfer through NAT). The gap is small at this volume but grows linearly with data. The solution: use a VPC Endpoint to S3 in the service VPC for actual data transfer, but route API calls through the audit NAT Gateway (small control plane overhead, not data plane). Or negotiate the compliance requirement: audit logging doesn’t require all data to flow through one NAT Gateway.


Next: Implementing cost monitoring and budgeting to catch surprises before they hit your bill.