NAT Gateway Alternatives - Cutting Your AWS Bill Without Losing Sleep

AWS, Networking

NAT Gateways are AWS’s best-kept profit center. They’re easy to set up, fully managed, and quietly drain your budget at $0.045/hour plus $0.045/GB of data processed.

Run the numbers on a moderately busy workload - 1TB of outbound traffic per month - and you’re looking at $77/month. Per NAT Gateway. Per AZ. For something that just routes packets.

In one environment I worked on, NAT Gateway costs were 40% of the total AWS bill. Not compute. Not storage. NAT Gateways.

Let’s fix that.

TL;DR

  • NAT Gateways cost $0.045/hour + $0.045/GB - adds up fast
  • NAT instances can cut costs 80%+ but require management
  • VPC endpoints eliminate NAT entirely for AWS services
  • IPv6 removes the need for NAT for many workloads
  • The right solution depends on your traffic patterns and team capacity

Understanding the Cost

Before optimising, understand where the money goes:

NAT Gateway Pricing (us-east-1):
- Hourly charge: $0.045/hour = $32.40/month per NAT Gateway
- Data processing: $0.045/GB

Example: 3 AZs, 2TB outbound/month each
- Hourly: 3 × $32.40 = $97.20/month
- Data: 6TB × $0.045 = $270/month
- Total: $367.20/month just for NAT

And that’s before data transfer charges to the internet ($0.09/GB for the first 10TB).
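The arithmetic above can be reproduced with a small Python sketch. The function name, the 720-hour month and the 1TB = 1000GB convention are assumptions for illustration; the rates are the us-east-1 figures quoted above. The egress line assumes all NAT traffic leaves AWS, which overstates the bill if part of it goes to AWS services in the same region.

```python
# Rates quoted above for us-east-1; a 720-hour (30-day) month is assumed.
NAT_HOURLY_USD = 0.045
NAT_PER_GB_USD = 0.045
EGRESS_PER_GB_USD = 0.09  # internet data transfer, first 10TB tier
HOURS_PER_MONTH = 720


def nat_gateway_monthly_cost(gateways: int, total_gb: float) -> dict:
    """Break down the monthly bill for a set of NAT Gateways."""
    hourly = gateways * NAT_HOURLY_USD * HOURS_PER_MONTH
    processing = total_gb * NAT_PER_GB_USD
    egress = total_gb * EGRESS_PER_GB_USD  # assumes every GB leaves AWS
    return {
        "hourly": round(hourly, 2),
        "processing": round(processing, 2),
        "nat_total": round(hourly + processing, 2),
        "with_egress": round(hourly + processing + egress, 2),
    }


if __name__ == "__main__":
    # 3 AZs, 2TB each (1TB counted as 1000GB, as in the example above)
    print(nat_gateway_monthly_cost(gateways=3, total_gb=6000))
```

Plugging in your own gateway count and monthly volume from the bill is usually enough to see whether the hourly or the per-GB charge dominates.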

Where does NAT traffic come from?

Most teams are surprised when they analyze their NAT traffic:

  1. AWS API calls - Every aws s3 cp, ECR image pull, Secrets Manager fetch
  2. Package downloads - npm, pip, apt during builds and deployments
  3. External APIs - Payment providers, SaaS integrations
  4. Logging/monitoring - If you’re shipping to external services
  5. Legitimate application traffic - Your actual workload

Categories 1 and 2 often dominate - and they’re the easiest to eliminate.


Solution 1: VPC Endpoints (Gateway & Interface)

Best for: Eliminating NAT traffic to AWS services

VPC Endpoints let private subnets talk directly to AWS services without going through NAT.

Gateway Endpoints (Free)

S3 and DynamoDB have Gateway Endpoints - completely free, just routing table entries.

# Terraform - S3 Gateway Endpoint
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"

  route_table_ids = [
    aws_route_table.private_a.id,
    aws_route_table.private_b.id,
    aws_route_table.private_c.id,
  ]

  tags = {
    Name = "s3-gateway-endpoint"
  }
}

resource "aws_vpc_endpoint" "dynamodb" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.dynamodb"
  vpc_endpoint_type = "Gateway"

  route_table_ids = [
    aws_route_table.private_a.id,
    aws_route_table.private_b.id,
    aws_route_table.private_c.id,
  ]

  tags = {
    Name = "dynamodb-gateway-endpoint"
  }
}

Impact: If you’re pulling container images from ECR (which uses S3), this alone can cut NAT traffic by 50%+.

Interface Endpoints (Paid, but cheaper than NAT)

For other AWS services, Interface Endpoints cost $0.01/hour per endpoint per AZ, plus $0.01/GB - significantly cheaper than NAT’s $0.045/GB once enough traffic flows through them.

Priority order for Interface Endpoints:

# High-value endpoints - create these first
locals {
  interface_endpoints = [
    "ecr.api",           # Container registry API
    "ecr.dkr",           # Container registry Docker
    "logs",              # CloudWatch Logs
    "secretsmanager",    # Secrets Manager
    "ssm",               # Systems Manager
    "ssmmessages",       # Session Manager
    "ec2messages",       # SSM agent
    "sts",               # STS for IAM roles
    "kms",               # KMS for encryption
  ]
}

resource "aws_vpc_endpoint" "interface" {
  for_each = toset(local.interface_endpoints)

  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.${var.region}.${each.value}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true

  tags = {
    Name = "${each.value}-endpoint"
  }
}

resource "aws_security_group" "vpc_endpoints" {
  name_prefix = "vpc-endpoints-"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.main.cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Cost comparison for ECR pulls (1TB/month):

Via NAT Gateway:     1000GB × $0.045 = $45.00
Via Interface EP:    1000GB × $0.01  = $10.00 + $7.20 (hourly, one endpoint in one AZ) = $17.20
Savings: ~62%
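The hourly charge is billed for every endpoint in every AZ, so it is worth checking the break-even volume before creating all nine endpoints from the list above in three AZs. A rough sketch follows; the function names and the 720-hour month are assumptions, and the rates are the ones quoted in this section.

```python
NAT_PER_GB_USD = 0.045
ENDPOINT_PER_GB_USD = 0.01
ENDPOINT_HOURLY_USD = 0.01  # per endpoint, per AZ
HOURS_PER_MONTH = 720       # assumed 30-day month


def endpoint_fixed_cost(endpoints: int, azs: int) -> float:
    """Monthly hourly charges: one billable ENI per endpoint per AZ."""
    return round(endpoints * azs * ENDPOINT_HOURLY_USD * HOURS_PER_MONTH, 2)


def break_even_gb(endpoints: int, azs: int) -> float:
    """Monthly GB that must move off NAT before the endpoints pay for themselves."""
    saving_per_gb = NAT_PER_GB_USD - ENDPOINT_PER_GB_USD
    return round(endpoint_fixed_cost(endpoints, azs) / saving_per_gb, 1)


if __name__ == "__main__":
    for endpoints, azs in [(1, 1), (9, 3)]:
        print(endpoints, azs, endpoint_fixed_cost(endpoints, azs), break_even_gb(endpoints, azs))
```

Under these assumptions a single endpoint in one AZ pays for itself at roughly 206GB/month, while nine endpoints across three AZs carry $194.40/month in fixed charges and need about 5.5TB/month of redirected traffic to break even. Low-traffic services may be cheaper left on NAT.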

Solution 2: NAT Instances

Best for: Teams comfortable with EC2 management, high-throughput workloads

A NAT instance is just an EC2 instance configured to forward traffic. No per-GB charge - just the instance cost.

Modern NAT Instance Setup

# Latest standard Amazon Linux 2023 AMI; NAT is configured in user_data below
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023.*-x86_64"]  # excludes the minimal and ECS variants
  }
}

resource "aws_instance" "nat" {
  ami                         = data.aws_ami.amazon_linux.id
  instance_type               = "t3.micro"  # Start small, monitor
  subnet_id                   = var.public_subnet_id
  associate_public_ip_address = true
  source_dest_check           = false  # Required for NAT

  iam_instance_profile = aws_iam_instance_profile.nat.name

  user_data = <<-EOF
    #!/bin/bash
    # Enable IP forwarding now and persist it across reboots
    echo "net.ipv4.ip_forward = 1" > /etc/sysctl.d/90-nat.conf
    sysctl -p /etc/sysctl.d/90-nat.conf

    # Find the primary interface; on AL2023 it is often ens5 rather than eth0
    IFACE=$(ip -o -4 route show to default | awk '{print $5}' | head -n 1)

    # Configure iptables for NAT
    yum install -y iptables-services
    iptables -t nat -A POSTROUTING -o "$IFACE" -j MASQUERADE
    iptables -A FORWARD -i "$IFACE" -o "$IFACE" -m state --state RELATED,ESTABLISHED -j ACCEPT
    iptables -A FORWARD -i "$IFACE" -o "$IFACE" -j ACCEPT
    service iptables save
    systemctl enable iptables
  EOF

  tags = {
    Name = "nat-instance"
  }
}

# Route table for private subnets
resource "aws_route" "nat_instance" {
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  network_interface_id   = aws_instance.nat.primary_network_interface_id
}

Cost Comparison

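NAT Gateway (3 AZs, 2TB/month total):
- Hourly: 3 × $32.40 = $97.20
- Data: 2000GB × $0.045 = $90.00
- Total: $187.20/month

NAT Instance (t3.small, single AZ):
- Instance: $15.18/month (on-demand)
- Total: $15.18/month

Savings: 92%

Note that this compares three redundant gateways against a single instance with no redundancy, and that subnets in other AZs pay cross-AZ data transfer to reach it. The Terraform example above uses a t3.micro, which costs about half as much as the t3.small priced here.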

The Trade-offs

NAT instances require you to manage:

  1. High availability - Instance failure = no outbound connectivity
  2. Scaling - a t3.micro bursts up to 5Gbps, but its sustained baseline bandwidth is far lower
  3. Patching - It’s your EC2, you patch it
  4. Monitoring - Network throughput, CPU, connections

HA NAT Instance Architecture

For production, run NAT instances in an Auto Scaling Group:

resource "aws_autoscaling_group" "nat" {
  name                = "nat-asg"
  min_size            = 1
  max_size            = 1
  desired_capacity    = 1
  vpc_zone_identifier = [var.public_subnet_id]

  launch_template {
    id      = aws_launch_template.nat.id
    version = "$Latest"
  }

  health_check_type         = "EC2"
  health_check_grace_period = 120

  tag {
    key                 = "Name"
    value               = "nat-instance"
    propagate_at_launch = true
  }

  lifecycle {
    create_before_destroy = true
  }
}

# Lambda to repoint the private route table when the ASG replaces the instance
resource "aws_lambda_function" "nat_failover" {
  filename         = "nat_failover.zip"
  function_name    = "nat-route-failover"
  role             = aws_iam_role.nat_failover.arn
  handler          = "index.handler"
  runtime          = "python3.11"

  environment {
    variables = {
      ROUTE_TABLE_ID = var.private_route_table_id
    }
  }
}
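The contents of nat_failover.zip are not shown here, so the following is only a sketch of what index.py could look like. It assumes an EventBridge rule forwards Auto Scaling "EC2 Instance Launch Successful" events to the function, and that the existing 0.0.0.0/0 route was created beforehand (for example by the aws_route resource earlier). The optional ec2 parameter is a hypothetical addition that lets the logic run against a stub client.

```python
import os

DEFAULT_ROUTE = "0.0.0.0/0"


def handler(event, context, ec2=None):
    """Point the private default route at a freshly launched NAT instance.

    Assumes the event is an Auto Scaling "EC2 Instance Launch Successful"
    notification delivered through EventBridge.
    """
    if ec2 is None:
        import boto3  # imported lazily so the logic can be tested without AWS
        ec2 = boto3.client("ec2")

    instance_id = event["detail"]["EC2InstanceId"]
    route_table_id = os.environ["ROUTE_TABLE_ID"]

    # A NAT instance must forward packets that are not addressed to itself.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        SourceDestCheck={"Value": False},
    )

    # Swap the existing 0.0.0.0/0 route over to the new instance.
    ec2.replace_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock=DEFAULT_ROUTE,
        InstanceId=instance_id,
    )
    return {"route_table": route_table_id, "nat_instance": instance_id}
```

The function's IAM role would need ec2:ReplaceRoute and ec2:ModifyInstanceAttribute. Outbound connectivity is still down from the moment the old instance fails until this handler runs, so treat it as reducing the recovery time, not removing the outage.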

Solution 3: IPv6

Best for: Modern architectures, eliminating NAT entirely

IPv6 addresses are globally routable - no NAT needed. AWS provides them free.

Enabling IPv6

resource "aws_vpc" "main" {
  cidr_block                       = "10.0.0.0/16"
  assign_generated_ipv6_cidr_block = true

  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "private" {
  vpc_id                          = aws_vpc.main.id
  cidr_block                      = "10.0.1.0/24"
  ipv6_cidr_block                 = cidrsubnet(aws_vpc.main.ipv6_cidr_block, 8, 1)
  assign_ipv6_address_on_creation = true

  tags = {
    Name = "private-subnet"
  }
}

# Egress-only internet gateway for IPv6
resource "aws_egress_only_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route" "private_ipv6" {
  route_table_id              = aws_route_table.private.id
  destination_ipv6_cidr_block = "::/0"
  egress_only_gateway_id      = aws_egress_only_internet_gateway.main.id
}

The Catch

Not everything supports IPv6:

  • Many third-party APIs are IPv4-only
  • Some AWS services don’t have IPv6 endpoints
  • Legacy applications may not handle dual-stack

Hybrid approach: Use IPv6 for outbound traffic to destinations that support it, and keep a single NAT Gateway for IPv4-only destinations.
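Before committing to the hybrid approach, it helps to know which of your external dependencies publish IPv6 addresses at all. A minimal sketch using the standard library is shown below; the function names are made up for this example, and the resolver parameter exists only so the logic can be exercised without DNS. A host that resolves over IPv6 is not guaranteed to serve its API over IPv6, so treat the result as a first filter.

```python
import socket


def has_ipv6(host: str, port: int = 443, resolver=socket.getaddrinfo) -> bool:
    """Return True if the host resolves to at least one IPv6 address."""
    try:
        results = resolver(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no AAAA record, or the name does not resolve
    return len(results) > 0


def split_by_ip_version(hosts, resolver=socket.getaddrinfo):
    """Partition hostnames into IPv6-capable and IPv4-only lists."""
    ipv6_ready, ipv4_only = [], []
    for host in hosts:
        target = ipv6_ready if has_ipv6(host, resolver=resolver) else ipv4_only
        target.append(host)
    return ipv6_ready, ipv4_only
```

Calling split_by_ip_version with the hostnames of your payment provider, SaaS integrations and package registries gives a quick list of what would still need the NAT fallback.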


Solution 4: Architectural Changes

Sometimes the best NAT optimization is not needing NAT.

Move builds to public subnets

CI/CD runners pulling packages don’t need to be in private subnets:

# GitLab Runner in public subnet with no private data
[[runners]]
  executor = "docker"
  [runners.docker]
    network_mode = "host"
    # Runner in public subnet, direct internet access

Use ECR pull-through cache

Instead of pulling from Docker Hub (through NAT), cache in ECR:

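# Create pull-through cache rule
# Docker Hub upstreams require credentials stored in Secrets Manager;
# the secret name must start with "ecr-pullthroughcache/" (ARN below is a placeholder)
aws ecr create-pull-through-cache-rule \
  --ecr-repository-prefix docker-hub \
  --upstream-registry-url registry-1.docker.io \
  --credential-arn arn:aws:secretsmanager:us-east-1:123456789012:secret:ecr-pullthroughcache/docker-hub

# Pull via ECR (through VPC endpoint, no NAT)
# Official Docker Hub images live under the library/ namespace
docker pull 123456789012.dkr.ecr.us-east-1.amazonaws.com/docker-hub/library/nginx:latest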

Pre-bake AMIs and container images

Don’t download packages at runtime:

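# Bad: no cached dependency layer, so packages are re-downloaded on every build
FROM node:20
WORKDIR /app
COPY . .
RUN npm install

# Good: dependencies installed once in a cached layer and shipped in the image
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:20-slim
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY . .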

Use S3 for artifact distribution

Instead of downloading from the internet:

# Upload build artifacts to S3 (via gateway endpoint)
aws s3 cp build.zip s3://my-artifacts/

# Download in private subnet (no NAT needed)
aws s3 cp s3://my-artifacts/build.zip .

Decision Framework

Scenario                      | Recommendation
Mostly AWS API calls          | VPC Endpoints (Gateway + Interface)
High throughput, ops capacity | NAT Instances
New/modern architecture       | IPv6 with minimal NAT fallback
Cost-critical, low traffic    | Single NAT Gateway + VPC Endpoints
Multi-AZ HA required          | NAT Gateway (accept the cost)

For most production environments:

# 1. Gateway endpoints (free) - always
resource "aws_vpc_endpoint" "s3" { ... }
resource "aws_vpc_endpoint" "dynamodb" { ... }

# 2. Interface endpoints for heavy AWS services
resource "aws_vpc_endpoint" "ecr_api" { ... }
resource "aws_vpc_endpoint" "ecr_dkr" { ... }
resource "aws_vpc_endpoint" "logs" { ... }

# 3. Single NAT Gateway for remaining traffic
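resource "aws_nat_gateway" "main" {
  # One NAT Gateway, not three
  # No automatic failover: if its AZ fails, the other AZs lose outbound IPv4
  # until a replacement gateway is created and routes are updated
  # Cross-AZ traffic to the gateway costs $0.01/GB in each direction
  # Use for actual internet-bound traffic only
}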

# 4. Enable IPv6 for future flexibility
resource "aws_vpc" "main" {
  assign_generated_ipv6_cidr_block = true
}

Result: 60-80% cost reduction with minimal operational overhead.


Monitoring NAT Costs

Set up alerts before costs spiral:

resource "aws_cloudwatch_metric_alarm" "nat_bytes" {
  alarm_name          = "nat-gateway-high-throughput"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "BytesOutToDestination"
  namespace           = "AWS/NATGateway"
  period              = 86400  # Daily
  statistic           = "Sum"
  threshold           = 107374182400  # 100 GiB/day (100 × 1024^3 bytes)

  dimensions = {
    NatGatewayId = aws_nat_gateway.main.id
  }

  alarm_actions = [aws_sns_topic.alerts.arn]
}

Use VPC Flow Logs to identify what’s generating traffic:

# Query flow logs for traffic on the NAT Gateway's network interface
# Replace eni-0123456789abcdef0 (a placeholder) with your NAT Gateway's ENI ID
aws logs filter-log-events \
  --log-group-name vpc-flow-logs \
  --filter-pattern '[version, account, eni="eni-0123456789abcdef0", srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, status]' \
  --query 'events[*].message' \
  --output text
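Raw flow log lines are hard to read at volume, so a small aggregation step helps. The sketch below assumes the default (version 2) flow log format, one record per line, and sums bytes per destination for a single ENI. The function name and the ENI ID you pass in are placeholders.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs format.
FIELDS = (
    "version", "account", "eni", "srcaddr", "dstaddr", "srcport", "dstport",
    "protocol", "packets", "bytes", "start", "end", "action", "status",
)


def top_destinations(lines, eni_id, limit=10):
    """Sum bytes per destination address for records seen on one ENI."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip blank or custom-format lines
        record = dict(zip(FIELDS, parts))
        if record["eni"] != eni_id or not record["bytes"].isdigit():
            continue  # other interfaces, or NODATA/SKIPDATA records ("-")
        totals[record["dstaddr"]] += int(record["bytes"])
    return totals.most_common(limit)
```

Records on a NAT Gateway ENI cover both legs of each flow, so read the totals as a relative ranking of destinations, not as exact billed bytes. Destinations that turn out to be S3 or other AWS service ranges are the candidates for VPC endpoints.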

Conclusion

NAT Gateways are convenient but expensive. For most workloads:

  1. Start with VPC Endpoints - Free for S3/DynamoDB, cheap for other AWS services
  2. Analyze your traffic - Know what’s going through NAT before optimising
  3. Consider NAT instances - If you have ops capacity and high throughput
  4. Enable IPv6 - Future-proof your architecture

The “right” answer depends on your traffic patterns, team capacity, and risk tolerance. But doing nothing and paying $0.045/GB is almost never the right answer.

