
BGP in the Cloud – A Deep Dive into AWS Direct Connect, Routing Pathologies, and What Breaks in Production

Tags: AWS, Networking

TL;DR

  • AWS Direct Connect is just BGP, but AWS makes several opinionated choices that can surprise experienced network engineers
  • Active/active DX designs will not behave symmetrically unless you explicitly manipulate BGP attributes
  • AS_PATH prepending is often the wrong tool – Local Preference and selective advertisements work better
  • Blackholing during DX ↔ VPN failover is usually caused by route propagation timing, not AWS bugs
  • You must design for BGP convergence, not link availability – these are different failure modes

Introduction

Most cloud engineers treat AWS Direct Connect (DX) as “a faster VPN”.

That mental model breaks the moment you:

  • Run multiple DXs
  • Add backup VPNs
  • Introduce multiple on-prem routers
  • Or try to control traffic directionally

Under the hood, DX is pure eBGP, but AWS enforces a very specific routing model that behaves differently from traditional MPLS or data-centre peering.

This post is a production-grade deep dive into how BGP actually works with AWS Direct Connect, what breaks, and how to design hybrid routing focused on failure behaviour, not happy paths.

AWS Direct Connect – What You’re Actually Getting

At the control plane level, AWS DX gives you:

  • Private Virtual Interface (VIF) → VPC routing
  • Public VIF → AWS public IP ranges
  • Transit VIF → TGW (multi-VPC)

All of them run eBGP:

  • You supply an ASN (private or public)
  • AWS uses ASN 7224 (public VIF) or, for private/transit VIFs, the amazon_side_asn you configure on the gateway (default 64512)
  • IPv4 and IPv6 are separate sessions
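A minimal FRR-style sketch of the on-prem side of a private VIF, showing the IPv4 and IPv6 sessions as separate peerings (all addresses, ASNs, and the MD5 key are placeholders, not AWS-assigned values):

```
! On-prem side of a DX private VIF (illustrative values only).
router bgp 65001
 neighbor 169.254.255.2 remote-as 64512          ! Amazon-side ASN
 neighbor 169.254.255.2 password <bgp-md5-key>   ! DX requires MD5 auth
 neighbor 2001:db8::1 remote-as 64512            ! IPv6 is its own session
 !
 address-family ipv4 unicast
  neighbor 169.254.255.2 activate
 exit-address-family
 !
 address-family ipv6 unicast
  neighbor 2001:db8::1 activate
 exit-address-family
```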

Key constraint:

AWS does not allow you to influence its Local Preference directly.

That single design choice explains most DX “mysteries”.

BGP Route Selection – AWS vs Traditional Networks

Standard BGP decision order (simplified):

  1. Highest Local Preference
  2. Shortest AS_PATH
  3. Lowest MED
  4. eBGP over iBGP
  5. Lowest IGP cost
  6. Oldest route
  7. Lowest router ID

What AWS Ignores or Fixes

| Attribute | AWS Behaviour |
|---|---|
| Local Preference | Fixed internally |
| MED | Ignored |
| Communities | Limited + AWS-specific |
| AS_PATH | Honoured |
| Prefix length | Honoured |

This means:

  • You cannot bias inbound traffic inside AWS using Local Pref
  • AS_PATH prepending only affects AWS → on-prem, not vice versa
  • MED is useless on DX
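For the one direction prepending does influence (AWS → on-prem), a route-map sketch in IOS-style syntax (the neighbor address is a placeholder):

```
! Prepend our own ASN on routes advertised over the less-preferred DX,
! nudging AWS to send AWS -> on-prem traffic via the other path.
route-map PREPEND-TO-AWS permit 10
 set as-path prepend 65001 65001
!
router bgp 65001
 neighbor x.x.x.x route-map PREPEND-TO-AWS out
```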

Active / Active Direct Connect – Why Traffic Isn’t Balanced

A common design:

DC Router A ── DX1 ── AWS
DC Router B ── DX2 ── AWS

Expectation:

  • “Traffic should split evenly”

Reality:

  • AWS will pick one path per prefix
  • ECMP is not guaranteed
  • You often get per-AZ or per-prefix stickiness

What Actually Works

Selective prefix advertisement beats prepending.

Example:

  • DX1 advertises 10.0.0.0/17
  • DX2 advertises 10.0.128.0/17

This forces deterministic ingress paths.

Trade-off: operational complexity vs predictable routing.
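The split above can be expressed with per-router prefix lists (IOS-style sketch; neighbor addresses are placeholders):

```
! Router A (DX1): advertise only the lower half
ip prefix-list DX1-OUT seq 10 permit 10.0.0.0/17
! Router B (DX2): advertise only the upper half
ip prefix-list DX2-OUT seq 10 permit 10.0.128.0/17
!
! On Router A:
router bgp 65001
 neighbor x.x.x.x prefix-list DX1-OUT out
```

One common refinement, if it fits your design, is also advertising the covering 10.0.0.0/16 on both links so either DX can back up the other.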

DX + VPN Failover – The Blackhole Problem

The classic failure:

  1. DX link drops
  2. VPN is “up”
  3. Traffic still dies for 30–120 seconds

Why This Happens

  • BGP withdrawal on DX
  • AWS routing tables update before TGW propagation
  • VPN routes exist but are not yet preferred
  • Packets get dropped during convergence

This is expected BGP behaviour, not an AWS bug.

Designing Proper Failover – Metrics That Matter

Forget link status. Design for:

| Metric | Why |
|---|---|
| BGP Hold Time | Determines failure detection speed |
| Route propagation delay | AWS internal |
| TGW attachment updates | Slowest step |
| Client retry behaviour | Often overlooked |

Practical Defaults That Work

! BGP timers for DX (keepalive 10 s, hold 30 s)
neighbor x.x.x.x timers 10 30
  • Keepalive: 10s
  • Hold: 30s
  • Faster detection without flapping risk
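Timers set a floor of tens of seconds; BFD detects failures in sub-second time, and AWS enables BFD on DX virtual interfaces automatically once the customer side speaks it. An FRR-style sketch (peer address and intervals are illustrative):

```
! FRR sketch: BFD on the DX BGP session.
bfd
 peer 169.254.100.1
  receive-interval 300
  transmit-interval 300
  detect-multiplier 3
!
router bgp 65001
 neighbor 169.254.100.1 bfd
```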

Terraform – Production Direct Connect Setup

DX Gateway + Transit VIF

resource "aws_dx_gateway" "this" {
  name            = "prod-dx-gw"
  amazon_side_asn = 64512  # Dedicated ASN per environment
}

resource "aws_dx_transit_virtual_interface" "tgw" {
  name             = "prod-transit-vif"
  connection_id    = aws_dx_connection.this.id  # required: the physical DX connection (assumed defined elsewhere)
  dx_gateway_id    = aws_dx_gateway.this.id
  vlan             = 101
  address_family   = "ipv4"
  bgp_asn          = 65001  # On-prem ASN

  amazon_address   = "169.254.100.1/30"
  customer_address = "169.254.100.2/30"
}

Why this matters:

  • Dedicated ASN per env prevents route leaks
  • /30 link-local avoids overlap
  • Transit VIF scales far better than Private VIF sprawl
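The VIF above lands on the DX gateway, but nothing reaches the TGW until the two are associated. A hedged Terraform sketch (the aws_ec2_transit_gateway resource is assumed to exist elsewhere; allowed_prefixes bounds what AWS advertises toward on-prem):

```hcl
resource "aws_dx_gateway_association" "tgw" {
  dx_gateway_id         = aws_dx_gateway.this.id
  associated_gateway_id = aws_ec2_transit_gateway.this.id

  # Only these prefixes are advertised from AWS toward on-prem
  allowed_prefixes = ["10.0.0.0/16"]
}
```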

On-Prem BGP – What to Advertise (and What Not To)

DO

  • Aggregate aggressively
  • Advertise only what AWS must see
  • Use explicit prefix lists

ip prefix-list AWS-OUT seq 10 permit 10.0.0.0/16
!
router bgp 65001
 neighbor x.x.x.x prefix-list AWS-OUT out

DON’T

  • Advertise default routes blindly
  • Leak RFC1918 ranges you don’t own
  • Depend on AS_PATH prepending for primary control

Monitoring – What You Actually Need Visibility On

CloudWatch metrics alone are insufficient.

You want:

  • BGP session state (on-prem)
  • Prefix count drift
  • Route flap detection
  • TGW route table changes

Tools that actually help:

  • BIRD / FRR + Prometheus exporter
  • ThousandEyes for path validation
  • AWS Reachability Analyzer (limited but useful)

Gotchas & Pitfalls

1. Route Leaks Across Environments

Shared DX + shared ASN = eventual disaster.

Fix: unique ASN per environment.

2. TGW Route Table Explosion

Each VPC attachment adds propagation latency.

Fix: isolate critical paths into dedicated TGWs.

3. IPv6 Is a Different Beast

Separate sessions, separate failures.

Fix: treat IPv6 as first-class, not “later”.

4. MTU Mismatches

DX supports jumbo frames, VPN often doesn’t.

Fix: clamp MSS on VPN failover paths.
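A one-line clamp on the tunnel interface is usually enough (IOS-style sketch; the interface name is a placeholder, and 1379 is the value AWS documents for Site-to-Site VPN):

```
! Clamp TCP MSS so flows survive the jumbo-frame -> 1500-byte MTU drop
! when traffic fails over from DX to the VPN tunnel.
interface Tunnel1
 ip tcp adjust-mss 1379
```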

Results From a Real Migration

From a 3-DC hybrid estate → AWS:

| Metric | Before | After |
|---|---|---|
| Median latency | 42 ms | 18 ms |
| Failover time | ~3–5 min | ~45 s |
| Route incidents | 2–3 / quarter | 0 in 12 months |
| Ops overhead | High | Low |

The biggest gain wasn’t speed – it was predictability.

Alternatives & Trade-Offs

| Option | Pros | Cons |
|---|---|---|
| DX only | Low latency | No instant failover |
| VPN only | Simple | Latency + jitter |
| DX + VPN | Resilient | Operational complexity |
| SD-WAN | Dynamic | Cost + vendor lock-in |

Conclusion

AWS Direct Connect doesn’t fail often.

When it does, your BGP design decides whether users notice.

If you treat DX as “a cable”, you’ll get burned. If you treat it as distributed routing, you’ll sleep better.

Next deep dive:

  • BGP Communities on AWS
  • Multi-Region DX with TGW peering
  • Traffic steering without prepending
