
BGP in the Cloud – A Deep Dive into AWS Direct Connect, Routing Pathologies, and What Breaks in Production

Tags: AWS, Networking

TL;DR

  • AWS Direct Connect is just BGP, but AWS makes several opinionated choices that can surprise experienced network engineers
  • Active/active DX designs will not behave symmetrically unless you explicitly manipulate BGP attributes
  • AS_PATH prepending is often the wrong tool – Local Preference and selective advertisements work better
  • Blackholing during DX ↔ VPN failover is usually caused by route propagation timing, not AWS bugs
  • You must design for BGP convergence, not link availability – these are different failure modes

Introduction

Most cloud engineers treat AWS Direct Connect (DX) as “a faster VPN”.

That mental model breaks the moment you:

  • Run multiple DXs
  • Add backup VPNs
  • Introduce multiple on-prem routers
  • Or try to control traffic directionally

Under the hood, DX is pure eBGP, but AWS enforces a very specific routing model that behaves differently from traditional MPLS or data-centre peering.

This post is a production-grade deep dive into how BGP actually works with AWS Direct Connect, what breaks, and how to design hybrid routing focused on failure behaviour, not happy paths.

AWS Direct Connect – What You’re Actually Getting

At the control plane level, AWS DX gives you:

  • Private Virtual Interface (VIF) → VPC routing
  • Public VIF → AWS public IP ranges
  • Transit VIF → TGW (multi-VPC)

All of them run eBGP:

  • You supply an ASN (private or public)
  • AWS uses ASN 7224 (public VIF) or, for private/transit VIFs, the amazon_side_asn you configure on the gateway (default 64512)
  • IPv4 and IPv6 are separate sessions
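A minimal FRR-style sketch of the on-prem side of a private VIF, showing the IPv4 and IPv6 sessions as separate peerings (all addresses, ASNs, and the MD5 key are placeholders, not AWS-assigned values):

```
! On-prem side of a DX private VIF (illustrative values only).
router bgp 65001
 neighbor 169.254.255.2 remote-as 64512          ! Amazon-side ASN
 neighbor 169.254.255.2 password <bgp-md5-key>   ! DX requires MD5 auth
 neighbor 2001:db8::1 remote-as 64512            ! IPv6 is its own session
 !
 address-family ipv4 unicast
  neighbor 169.254.255.2 activate
 exit-address-family
 !
 address-family ipv6 unicast
  neighbor 2001:db8::1 activate
 exit-address-family
```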

Key constraint:

AWS does not allow you to influence its Local Preference directly.

That single design choice explains most DX “mysteries”.

BGP Route Selection – AWS vs Traditional Networks

Standard BGP decision order (simplified):

  1. Highest Local Preference
  2. Shortest AS_PATH
  3. Lowest MED
  4. eBGP over iBGP
  5. Lowest IGP cost
  6. Oldest route
  7. Lowest router ID

What AWS Ignores or Fixes

| Attribute | AWS Behaviour |
|---|---|
| Local Preference | Fixed internally |
| MED | Ignored |
| Communities | Limited + AWS-specific |
| AS_PATH | Honoured |
| Prefix length | Honoured |

This means:

  • You cannot bias inbound traffic inside AWS using Local Pref
  • AS_PATH prepending only affects AWS → on-prem, not vice versa
  • MED is useless on DX
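For the one direction prepending does influence (AWS → on-prem), a route-map sketch in IOS-style syntax (the neighbor address is a placeholder):

```
! Prepend our own ASN on routes advertised over the less-preferred DX,
! nudging AWS to send AWS -> on-prem traffic via the other path.
route-map PREPEND-TO-AWS permit 10
 set as-path prepend 65001 65001
!
router bgp 65001
 neighbor x.x.x.x route-map PREPEND-TO-AWS out
```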

Active / Active Direct Connect – Why Traffic Isn’t Balanced

A common design:

DC Router A ── DX1 ── AWS
DC Router B ── DX2 ── AWS

Expectation:

  • “Traffic should split evenly”

Reality:

  • AWS will pick one path per prefix
  • ECMP is not guaranteed
  • You often get per-AZ or per-prefix stickiness

What Actually Works

Selective prefix advertisement beats prepending.

Example:

  • DX1 advertises 10.0.0.0/17
  • DX2 advertises 10.0.128.0/17

This forces deterministic ingress paths.

Trade-off: operational complexity vs predictable routing.
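The split above can be expressed with per-router prefix lists (IOS-style sketch; neighbor addresses are placeholders):

```
! Router A (DX1): advertise only the lower half
ip prefix-list DX1-OUT seq 10 permit 10.0.0.0/17
! Router B (DX2): advertise only the upper half
ip prefix-list DX2-OUT seq 10 permit 10.0.128.0/17
!
! On Router A:
router bgp 65001
 neighbor x.x.x.x prefix-list DX1-OUT out
```

One common refinement, if it fits your design, is also advertising the covering 10.0.0.0/16 on both links so either DX can back up the other.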

DX + VPN Failover – The Blackhole Problem

The classic failure:

  1. DX link drops
  2. VPN is “up”
  3. Traffic still dies for 30–120 seconds

Why This Happens

  • BGP withdrawal on DX
  • AWS routing tables update before TGW propagation
  • VPN routes exist but are not yet preferred
  • Packets get dropped during convergence

This is expected BGP behaviour, not an AWS bug.

Designing Proper Failover – Metrics That Matter

Forget link status. Design for:

| Metric | Why |
|---|---|
| BGP Hold Time | Determines failure detection speed |
| Route propagation delay | AWS internal |
| TGW attachment updates | Slowest step |
| Client retry behaviour | Often overlooked |

Practical Defaults That Work

! BGP timers for DX (keepalive 10 s, hold 30 s)
neighbor x.x.x.x timers 10 30
  • Keepalive: 10s
  • Hold: 30s
  • Faster detection without flapping risk
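Timers set a floor of tens of seconds; BFD detects failures in sub-second time, and AWS enables BFD on DX virtual interfaces automatically once the customer side speaks it. An FRR-style sketch (peer address and intervals are illustrative):

```
! FRR sketch: BFD on the DX BGP session.
bfd
 peer 169.254.100.1
  receive-interval 300
  transmit-interval 300
  detect-multiplier 3
!
router bgp 65001
 neighbor 169.254.100.1 bfd
```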

Terraform – Production Direct Connect Setup

DX Gateway + Transit VIF

resource "aws_dx_gateway" "this" {
  name            = "prod-dx-gw"
  amazon_side_asn = 64512  # Dedicated ASN per environment
}

resource "aws_dx_transit_virtual_interface" "tgw" {
  name             = "prod-transit-vif"
  connection_id    = aws_dx_connection.this.id  # required: the physical DX connection (assumed defined elsewhere)
  dx_gateway_id    = aws_dx_gateway.this.id
  vlan             = 101
  address_family   = "ipv4"
  bgp_asn          = 65001  # On-prem ASN

  amazon_address   = "169.254.100.1/30"
  customer_address = "169.254.100.2/30"
}

Why this matters:

  • Dedicated ASN per env prevents route leaks
  • /30 link-local avoids overlap
  • Transit VIF scales far better than Private VIF sprawl
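The VIF above lands on the DX gateway, but nothing reaches the TGW until the two are associated. A hedged Terraform sketch (the aws_ec2_transit_gateway resource is assumed to exist elsewhere; allowed_prefixes bounds what AWS advertises toward on-prem):

```hcl
resource "aws_dx_gateway_association" "tgw" {
  dx_gateway_id         = aws_dx_gateway.this.id
  associated_gateway_id = aws_ec2_transit_gateway.this.id

  # Only these prefixes are advertised from AWS toward on-prem
  allowed_prefixes = ["10.0.0.0/16"]
}
```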

On-Prem BGP – What to Advertise (and What Not To)

DO

  • Aggregate aggressively
  • Advertise only what AWS must see
  • Use explicit prefix lists

ip prefix-list AWS-OUT seq 10 permit 10.0.0.0/16
!
router bgp 65001
 neighbor x.x.x.x prefix-list AWS-OUT out

DON’T

  • Advertise default routes blindly
  • Leak RFC1918 ranges you don’t own
  • Depend on AS_PATH prepending for primary control

Monitoring – What You Actually Need Visibility On

CloudWatch metrics alone are insufficient.

You want:

  • BGP session state (on-prem)
  • Prefix count drift
  • Route flap detection
  • TGW route table changes

Tools that actually help:

  • BIRD / FRR + Prometheus exporter
  • ThousandEyes for path validation
  • AWS Reachability Analyzer (limited but useful)

Gotchas & Pitfalls

1. Route Leaks Across Environments

Shared DX + shared ASN = eventual disaster.

Fix: unique ASN per environment.

2. TGW Route Table Explosion

Each VPC attachment adds propagation latency.

Fix: isolate critical paths into dedicated TGWs.

3. IPv6 Is a Different Beast

Separate sessions, separate failures.

Fix: treat IPv6 as first-class, not “later”.

4. MTU Mismatches

DX supports jumbo frames, VPN often doesn’t.

Fix: clamp MSS on VPN failover paths.
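A one-line clamp on the tunnel interface is usually enough (IOS-style sketch; the interface name is a placeholder, and 1379 is the value AWS documents for Site-to-Site VPN):

```
! Clamp TCP MSS so flows survive the jumbo-frame -> 1500-byte MTU drop
! when traffic fails over from DX to the VPN tunnel.
interface Tunnel1
 ip tcp adjust-mss 1379
```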

Results From a Real Migration

From a 3-DC hybrid estate → AWS:

| Metric | Before | After |
|---|---|---|
| Median latency | 42 ms | 18 ms |
| Failover time | ~3–5 min | ~45 s |
| Route incidents | 2–3 / quarter | 0 in 12 months |
| Ops overhead | High | Low |

The biggest gain wasn’t speed – it was predictability.

Alternatives & Trade-Offs

| Option | Pros | Cons |
|---|---|---|
| DX only | Low latency | No instant failover |
| VPN only | Simple | Latency + jitter |
| DX + VPN | Resilient | Operational complexity |
| SD-WAN | Dynamic | Cost + vendor lock-in |

Conclusion

AWS Direct Connect doesn’t fail often.

When it does, your BGP design decides whether users notice.

If you treat DX as “a cable”, you’ll get burned. If you treat it as distributed routing, you’ll sleep better.

Next deep dive:

  • BGP Communities on AWS
  • Multi-Region DX with TGW peering
  • Traffic steering without prepending
