RDS Proxy for Lambda - Solving the Connection Exhaustion Problem
Your Lambda function connects to RDS. Works fine in development. Then you hit production traffic - 500 concurrent executions - and your database falls over with “too many connections.”
This is the Lambda-RDS connection problem. Each Lambda execution creates a new database connection. At scale, you exhaust your database’s connection limit, causing failures across your entire application.
RDS Proxy solves this by sitting between Lambda and your database, pooling and reusing connections. Instead of 500 Lambda executions creating 500 database connections, they share a pool of maybe 50 connections managed by the proxy.
This post covers when to use RDS Proxy, how it works, and complete Terraform setup for production.
TL;DR
- Lambda functions create new DB connections per invocation - doesn’t scale
- RDS Proxy pools connections between Lambda and RDS/Aurora
- Reduces connection count, improves failover handling
- Use IAM authentication from Lambda to the proxy
- Proxy connects to RDS using Secrets Manager credentials
- Lambda must be in a VPC to use RDS Proxy
Code Repository: All code from this post is available at github.com/moabukar/blog-code/rds-proxy-lambda
The Problem: Lambda Connection Exhaustion
Traditional applications maintain a connection pool - open connections at startup, reuse them for requests. Lambda doesn’t work that way:
Traditional App:
┌─────────────┐ 10 connections ┌──────────┐
│ App │ ══════════════════════► │ RDS │
│ (pooled) │ (reused) │ │
└─────────────┘ └──────────┘
Lambda at Scale:
┌─────────────┐ ─┐
│ Lambda 1 │ │
├─────────────┤ │
│ Lambda 2 │ │
├─────────────┤ │ 500 connections ┌──────────┐
│ Lambda 3 │ ├═══════════════════════►│ RDS │
├─────────────┤ │ (new each time) │ 💥 │
│ ... │ │ └──────────┘
├─────────────┤ │
│ Lambda 500 │ │
└─────────────┘ ─┘
Problems:
- Connection limit exhaustion - RDS instances have max connection limits based on instance size (e.g., db.t3.micro = ~85 connections)
- Connection overhead - Each new connection requires TCP handshake, TLS negotiation, authentication
- Cold starts are worse - Establishing DB connections adds latency
- Failover handling - If RDS fails over, Lambda functions holding connections get errors
RDS Connection Limits
Connection limits vary by instance class:
| Instance Class | Max Connections (approx) |
|---|---|
| db.t3.micro | 85 |
| db.t3.small | 170 |
| db.t3.medium | 340 |
| db.r5.large | 1,000 |
| db.r5.xlarge | 2,000 |
With 500 concurrent Lambda executions, even a db.r5.large might struggle.
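To make the arithmetic concrete, here is a rough sketch that subtracts Lambda concurrency from the approximate limits in the table above. The `headroom` helper and the `reserved` parameter are illustrative names, and the limits are the table's approximations, not values read from a live instance.

```python
# Approximate limits copied from the table above; real values depend on
# engine, memory, and parameter group settings.
APPROX_MAX_CONNECTIONS = {
    "db.t3.micro": 85,
    "db.t3.small": 170,
    "db.t3.medium": 340,
    "db.r5.large": 1000,
    "db.r5.xlarge": 2000,
}


def headroom(instance_class, lambda_concurrency, reserved=0):
    """Connections left if every concurrent execution opens one connection.

    `reserved` is a hypothetical allowance for admin tools, migrations,
    and other non-Lambda clients.
    """
    limit = APPROX_MAX_CONNECTIONS[instance_class]
    return limit - reserved - lambda_concurrency


if __name__ == "__main__":
    for instance_class in APPROX_MAX_CONNECTIONS:
        print(instance_class, headroom(instance_class, 500))
```

A negative result means the instance runs out of connections before Lambda reaches that concurrency. A positive result still has to cover every other client of the database.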
How RDS Proxy Solves This
RDS Proxy sits between Lambda and your database:
┌─────────────┐ ─┐
│ Lambda 1 │ │
├─────────────┤ │
│ Lambda 2 │ │ 500 Lambda ┌───────────┐ 50 DB ┌──────────┐
├─────────────┤ │ connections │ │ connections │ │
│ Lambda 3 │ ├════════════════════►│ RDS Proxy │═══════════════►│ RDS │
├─────────────┤ │ (to proxy) │ (pool) │ (reused) │ ✓ │
│ ... │ │ └───────────┘ └──────────┘
├─────────────┤ │
│ Lambda 500 │ │
└─────────────┘ ─┘
The proxy:
- Maintains a connection pool to your database
- Multiplexes Lambda requests over fewer database connections
- Reuses connections - no per-request connection overhead
- Handles failovers - automatically reconnects to new primary
- Queues requests when the pool is busy (instead of failing)
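The pooling and queueing behaviour can be illustrated with a toy simulation. This is not how RDS Proxy is implemented. It is a minimal sketch in which threads stand in for Lambda executions and a `queue.Queue` stands in for the pool. The names `simulate`, `clients`, and `pool_size` are made up for the example.

```python
import queue
import threading
import time


def simulate(clients=500, pool_size=50, query_seconds=0.001):
    """Run `clients` fake queries over a shared pool of `pool_size` connections."""
    pool = queue.Queue()
    for conn_id in range(pool_size):
        pool.put(conn_id)

    lock = threading.Lock()
    stats = {"in_use": 0, "peak": 0, "completed": 0}

    def client():
        conn = pool.get()  # blocks (queues) while every connection is busy
        with lock:
            stats["in_use"] += 1
            stats["peak"] = max(stats["peak"], stats["in_use"])
        try:
            time.sleep(query_seconds)  # stand-in for running a query
        finally:
            with lock:
                stats["in_use"] -= 1
                stats["completed"] += 1
            pool.put(conn)  # hand the connection back for reuse

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return stats


if __name__ == "__main__":
    print(simulate())
```

However many clients run, the peak number of connections in use never exceeds the pool size. Extra clients wait their turn instead of failing.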
When to Use RDS Proxy
Use RDS Proxy when:
- Lambda functions make frequent, short-lived database queries
- You have high concurrency (100+ concurrent executions)
- You’re hitting connection limits
- You need improved failover handling
- You want IAM-based database authentication
Don’t use RDS Proxy when:
- Low concurrency (a few requests per second)
- Long-running transactions (proxy pins connections)
- Cost is a major concern (proxy adds ~$0.015/hour per vCPU of target DB)
- You need features that cause connection pinning (see below)
Connection Pinning - The Important Gotcha
RDS Proxy multiplexes connections - multiple Lambda executions share database connections. But some operations require a dedicated connection. This is called pinning.
When a connection is pinned, that Lambda execution holds a database connection until the session ends. This reduces the effectiveness of pooling.
Operations that cause pinning:
| Operation | Causes Pinning |
|---|---|
| Open transaction | Yes (until COMMIT/ROLLBACK) |
| Temporary tables | Yes |
| User-defined variables | Yes |
| Prepared statements | Depends on settings |
| SET statements | Some |
| LOCK TABLES | Yes |
| Large statements | Yes (statement text > 16 KB) |
Best practices to minimise pinning:
# BAD - Transaction stays open across Lambda invocation
def handler(event, context):
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("BEGIN")
    cursor.execute("INSERT INTO orders ...")
    # Connection is pinned until COMMIT.
    # Never committed! Connection stays pinned.
    return {"statusCode": 200}

# GOOD - Complete transactions quickly
def handler(event, context):
    conn = get_connection()
    cursor = conn.cursor()
    try:
        cursor.execute("BEGIN")
        cursor.execute("INSERT INTO orders ...")
        cursor.execute("UPDATE inventory ...")
        conn.commit()  # Transaction complete, unpinned
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
    return {"statusCode": 200}
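One way to make the commit-or-rollback pattern hard to forget is a small context manager. This is a minimal sketch, assuming a DB-API style connection such as the psycopg2 one used in this post. `short_transaction` is a hypothetical helper name, not part of any library.

```python
from contextlib import contextmanager


@contextmanager
def short_transaction(conn):
    """Yield a cursor, then commit on success or roll back on any error.

    Works with any DB-API connection that has cursor(), commit(), and
    rollback(). The transaction ends when the with-block exits.
    """
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()
```

Inside a handler this reads as `with short_transaction(conn) as cursor: cursor.execute(...)`. The connection is only held for the statements inside the block.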
Architecture Overview
Here’s what we’ll build:
┌─────────────────────────────────────────────────────────────────────┐
│ VPC │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ ┌─────────────┐ │
│ │ Lambda Function │────►│ RDS Proxy │────►│ RDS │ │
│ │ (private subnet)│ │ (private subnet)│ │ (private) │ │
│ └──────────────────┘ └──────────────────┘ └─────────────┘ │
│ │ │ │ │
│ │ IAM Auth │ │ │
│ ▼ ▼ │ │
│ ┌──────────────────┐ ┌──────────────────┐ │ │
│ │ IAM Role │ │ Secrets Manager │───────────┘ │
│ │ (rds-db:connect) │ │ (DB credentials) │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Authentication flow:
- Lambda → Proxy: IAM authentication (recommended) or Secrets Manager
- Proxy → RDS: Credentials from Secrets Manager
Terraform Implementation
1. VPC and Networking
RDS Proxy must be in the same VPC as the database, and Lambda needs network access to the proxy. The simplest setup puts all three in one VPC:
# vpc.tf
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = {
Name = "lambda-rds-vpc"
}
}
# Private subnets for Lambda, RDS Proxy, and RDS
resource "aws_subnet" "private" {
count = 2
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
availability_zone = data.aws_availability_zones.available.names[count.index]
tags = {
Name = "private-${count.index + 1}"
}
}
data "aws_availability_zones" "available" {
state = "available"
}
2. Security Groups
# security-groups.tf
# Lambda security group
resource "aws_security_group" "lambda" {
name = "lambda-sg"
description = "Security group for Lambda functions"
vpc_id = aws_vpc.main.id
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "lambda-sg"
}
}
# RDS Proxy security group
resource "aws_security_group" "rds_proxy" {
name = "rds-proxy-sg"
description = "Security group for RDS Proxy"
vpc_id = aws_vpc.main.id
# Allow inbound from Lambda
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.lambda.id]
description = "PostgreSQL from Lambda"
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = {
Name = "rds-proxy-sg"
}
}
# RDS security group
resource "aws_security_group" "rds" {
name = "rds-sg"
description = "Security group for RDS"
vpc_id = aws_vpc.main.id
# Allow inbound from RDS Proxy only
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.rds_proxy.id]
description = "PostgreSQL from RDS Proxy"
}
tags = {
Name = "rds-sg"
}
}
3. RDS Instance
# rds.tf
resource "aws_db_subnet_group" "main" {
name = "main"
subnet_ids = aws_subnet.private[*].id
tags = {
Name = "main-db-subnet-group"
}
}
resource "aws_db_instance" "main" {
identifier = "lambda-app-db"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.t3.medium"
allocated_storage = 20
max_allocated_storage = 100
storage_type = "gp3"
storage_encrypted = true
db_name = "appdb"
username = "dbadmin"
password = random_password.db_password.result
db_subnet_group_name = aws_db_subnet_group.main.name
vpc_security_group_ids = [aws_security_group.rds.id]
# Required for RDS Proxy
iam_database_authentication_enabled = true
backup_retention_period = 7
skip_final_snapshot = true
deletion_protection = false
tags = {
Name = "lambda-app-db"
}
}
resource "random_password" "db_password" {
length = 32
special = false
}
4. Secrets Manager
RDS Proxy needs database credentials stored in Secrets Manager:
# secrets.tf
resource "aws_secretsmanager_secret" "db_credentials" {
name = "rds-proxy/db-credentials"
description = "Database credentials for RDS Proxy"
}
resource "aws_secretsmanager_secret_version" "db_credentials" {
secret_id = aws_secretsmanager_secret.db_credentials.id
secret_string = jsonencode({
username = aws_db_instance.main.username
password = random_password.db_password.result
engine = "postgres"
host = aws_db_instance.main.address
port = aws_db_instance.main.port
dbname = aws_db_instance.main.db_name
})
}
5. RDS Proxy
# rds-proxy.tf
resource "aws_db_proxy" "main" {
name = "lambda-app-proxy"
debug_logging = false
engine_family = "POSTGRESQL"
idle_client_timeout = 1800
require_tls = true
role_arn = aws_iam_role.rds_proxy.arn
vpc_security_group_ids = [aws_security_group.rds_proxy.id]
vpc_subnet_ids = aws_subnet.private[*].id
auth {
auth_scheme = "SECRETS"
client_password_auth_type = "POSTGRES_SCRAM_SHA_256"
iam_auth = "REQUIRED"
secret_arn = aws_secretsmanager_secret.db_credentials.arn
}
tags = {
Name = "lambda-app-proxy"
}
}
resource "aws_db_proxy_default_target_group" "main" {
db_proxy_name = aws_db_proxy.main.name
connection_pool_config {
connection_borrow_timeout = 120
max_connections_percent = 100
max_idle_connections_percent = 50
}
}
resource "aws_db_proxy_target" "main" {
db_instance_identifier = aws_db_instance.main.identifier
db_proxy_name = aws_db_proxy.main.name
target_group_name = aws_db_proxy_default_target_group.main.name
}
# IAM role for RDS Proxy to access Secrets Manager
resource "aws_iam_role" "rds_proxy" {
name = "rds-proxy-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "rds.amazonaws.com"
}
}]
})
}
resource "aws_iam_role_policy" "rds_proxy_secrets" {
name = "rds-proxy-secrets"
role = aws_iam_role.rds_proxy.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"secretsmanager:GetSecretValue"
]
Resource = [aws_secretsmanager_secret.db_credentials.arn]
},
{
Effect = "Allow"
Action = [
"kms:Decrypt"
]
Resource = "*"
Condition = {
StringEquals = {
"kms:ViaService" = "secretsmanager.${var.region}.amazonaws.com"
}
}
}
]
})
}
6. Lambda Function
# lambda.tf
resource "aws_lambda_function" "api" {
filename = data.archive_file.lambda.output_path
function_name = "api-handler"
role = aws_iam_role.lambda.arn
handler = "index.handler"
source_code_hash = data.archive_file.lambda.output_base64sha256
runtime = "python3.11"
timeout = 30
memory_size = 256
vpc_config {
subnet_ids = aws_subnet.private[*].id
security_group_ids = [aws_security_group.lambda.id]
}
environment {
variables = {
DB_PROXY_ENDPOINT = aws_db_proxy.main.endpoint
DB_NAME = aws_db_instance.main.db_name
DB_PORT = "5432"
DB_USER = aws_db_instance.main.username
AWS_REGION_NAME = var.region
}
}
tags = {
Name = "api-handler"
}
}
data "archive_file" "lambda" {
type = "zip"
source_dir = "${path.module}/lambda"
output_path = "${path.module}/lambda.zip"
}
# IAM role for Lambda
resource "aws_iam_role" "lambda" {
name = "lambda-api-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "lambda.amazonaws.com"
}
}]
})
}
# Basic Lambda execution policy
resource "aws_iam_role_policy_attachment" "lambda_basic" {
role = aws_iam_role.lambda.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}
# VPC access for Lambda
resource "aws_iam_role_policy_attachment" "lambda_vpc" {
role = aws_iam_role.lambda.name
policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}
# RDS Proxy IAM authentication
resource "aws_iam_role_policy" "lambda_rds_proxy" {
name = "lambda-rds-proxy-connect"
role = aws_iam_role.lambda.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = "rds-db:connect"
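# The resource segment must be the proxy's resource ID (prx-...), which is the last field of the proxy ARN, not the proxy name
Resource = "arn:aws:rds-db:${var.region}:${data.aws_caller_identity.current.account_id}:dbuser:${split(":", aws_db_proxy.main.arn)[6]}/${aws_db_instance.main.username}"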
}]
})
}
data "aws_caller_identity" "current" {}
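The `rds-db:connect` resource ARN is easy to get wrong, so it can help to see how it is assembled. The sketch below assumes the resource segment is the proxy's `prx-...` resource ID, taken from the last field of the proxy ARN. `proxy_connect_arn` is a hypothetical helper, and the ARN in the usage note is a made-up example.

```python
def proxy_connect_arn(proxy_arn, db_user):
    """Build the rds-db:connect resource ARN for a proxy and database user.

    Expects a proxy ARN shaped like
    arn:<partition>:rds:<region>:<account>:db-proxy:<prx-id>
    """
    parts = proxy_arn.split(":")
    if len(parts) != 7 or parts[2] != "rds" or parts[5] != "db-proxy":
        raise ValueError(f"not an RDS Proxy ARN: {proxy_arn!r}")
    partition, region, account, proxy_id = parts[1], parts[3], parts[4], parts[6]
    return f"arn:{partition}:rds-db:{region}:{account}:dbuser:{proxy_id}/{db_user}"
```

For example, `proxy_connect_arn("arn:aws:rds:eu-west-1:123456789012:db-proxy:prx-0123456789abcdef0", "dbadmin")` yields an ARN ending in `dbuser:prx-0123456789abcdef0/dbadmin`.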
7. Lambda Function Code (Python)
# lambda/index.py
import os

import boto3
import psycopg2


def get_connection():
    """
    Connect to RDS via RDS Proxy using IAM authentication.
    """
    # Get RDS auth token
    client = boto3.client('rds')
    token = client.generate_db_auth_token(
        DBHostname=os.environ['DB_PROXY_ENDPOINT'],
        Port=int(os.environ['DB_PORT']),
        DBUsername=os.environ['DB_USER'],
        Region=os.environ['AWS_REGION_NAME']
    )

    # Connect using the token as password
    conn = psycopg2.connect(
        host=os.environ['DB_PROXY_ENDPOINT'],
        port=os.environ['DB_PORT'],
        database=os.environ['DB_NAME'],
        user=os.environ['DB_USER'],
        password=token,
        sslmode='require'
    )
    return conn


def handler(event, context):
    """
    Example Lambda handler that queries the database.
    """
    conn = None
    try:
        conn = get_connection()
        cursor = conn.cursor()

        # Example query
        cursor.execute("SELECT version();")
        version = cursor.fetchone()[0]
        cursor.close()

        return {
            'statusCode': 200,
            'body': f'Connected via RDS Proxy! PostgreSQL version: {version}'
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': f'Error: {str(e)}'
        }
    finally:
        if conn:
            conn.close()
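The handler above opens and closes a connection on every invocation. A common variation is to keep the connection in a module-level variable so warm invocations reuse it. Here is a minimal sketch, assuming a driver whose connections expose a `closed` attribute, as psycopg2 does. `get_cached_connection` is a hypothetical helper, and `connect` stands for any zero-argument factory such as `get_connection`.

```python
_cached_conn = None  # survives between invocations while the environment stays warm


def get_cached_connection(connect):
    """Return the cached connection, creating one with connect() when needed."""
    global _cached_conn
    # psycopg2 sets `closed` to 0 while the connection is open. Drivers
    # without the attribute are treated as open once created.
    if _cached_conn is None or getattr(_cached_conn, "closed", 0):
        _cached_conn = connect()
    return _cached_conn
```

If you adopt this, the handler should no longer call `conn.close()` in its `finally` block. The `closed` flag only reflects closes made by the client, so a connection dropped by the proxy (for example after `idle_client_timeout`) still needs a retry path.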
Connection Pool Configuration
The proxy’s connection pool settings are critical:
resource "aws_db_proxy_default_target_group" "main" {
db_proxy_name = aws_db_proxy.main.name
connection_pool_config {
# How long a client can wait for a connection from the pool
connection_borrow_timeout = 120 # seconds
# Maximum connections as percentage of max_connections
max_connections_percent = 100
# Idle connections to keep as percentage of max_connections
max_idle_connections_percent = 50
# Optional: SQL to run when connection is created
# init_query = "SET timezone='UTC'"
}
}
Tuning guidance:
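- max_connections_percent: Start at 100%, reduce if you want to reserve connections for direct access
- max_idle_connections_percent: Higher = faster response to traffic spikes, but more idle connections
- connection_borrow_timeout: How long Lambda waits if the pool is exhausted. Keep it lower than the Lambda timeout minus expected query time, so the function gets a clear error from the proxy instead of timing out (the 120 seconds shown above is longer than the 30-second function timeout used in this post)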
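To see what the percentages mean in absolute numbers, here is a small sketch. It assumes both percentages are taken against the database's `max_connections`, as the comments in the config above describe. It also assumes you want the proxy to give up on a borrow before the function itself times out. `pool_limits` and `borrow_timeout_fits` are hypothetical helpers.

```python
def pool_limits(db_max_connections, max_connections_percent, max_idle_connections_percent):
    """Translate the pool percentages into connection counts."""
    max_pool = db_max_connections * max_connections_percent // 100
    max_idle = db_max_connections * max_idle_connections_percent // 100
    return max_pool, max_idle


def borrow_timeout_fits(lambda_timeout, borrow_timeout, expected_query_seconds):
    """True if a full borrow wait plus the query still fits in the Lambda timeout."""
    return borrow_timeout + expected_query_seconds < lambda_timeout
```

With the values used in this post (a db.t3.medium at roughly 340 connections, 100% and 50%), `pool_limits(340, 100, 50)` gives a pool of up to 340 connections with up to 170 kept idle. `borrow_timeout_fits(30, 120, 1)` is false for the 30-second function and 120-second borrow timeout shown earlier.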
Monitoring and Metrics
Key CloudWatch metrics for RDS Proxy:
# cloudwatch.tf
resource "aws_cloudwatch_metric_alarm" "proxy_connections" {
alarm_name = "rds-proxy-high-connections"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 2
metric_name = "DatabaseConnections"
namespace = "AWS/RDS"
period = 60
statistic = "Average"
threshold = 80
alarm_description = "RDS Proxy connections high"
dimensions = {
# RDS Proxy metrics are published under the ProxyName dimension
ProxyName = aws_db_proxy.main.name
}
alarm_actions = [aws_sns_topic.alerts.arn]
}
resource "aws_cloudwatch_metric_alarm" "client_connections" {
alarm_name = "rds-proxy-high-client-connections"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 2
metric_name = "ClientConnections"
namespace = "AWS/RDS"
period = 60
statistic = "Average"
threshold = 400
alarm_description = "Too many Lambda connections to proxy"
dimensions = {
# RDS Proxy metrics are published under the ProxyName dimension
ProxyName = aws_db_proxy.main.name
}
alarm_actions = [aws_sns_topic.alerts.arn]
}
Key metrics to watch:
| Metric | Description | What to look for |
|---|---|---|
| ClientConnections | Connections from Lambda to proxy | Should be <= your Lambda concurrency |
| DatabaseConnections | Connections from proxy to RDS | Should be much lower than ClientConnections |
| DatabaseConnectionsBorrowLatency | Time to get a connection from pool | Spikes indicate pool exhaustion |
| QueryDatabaseResponseLatency | Query response time | Baseline for your queries |
| ClientConnectionsReceived | New connections per second | High values = Lambda cold starts |
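A quick way to read the first two metrics together is as a ratio. The sketch below takes plain numbers, however you pulled them from CloudWatch. `multiplexing_ratio` is an illustrative helper, not an AWS metric.

```python
def multiplexing_ratio(client_connections, database_connections):
    """Client connections served per database connection, or None if idle.

    A ratio near 1 suggests little multiplexing, which often points at
    pinned sessions. Higher values mean the pool is doing its job.
    """
    if database_connections <= 0:
        return None
    return client_connections / database_connections
```

With the 500 and 50 from the diagram earlier in the post, `multiplexing_ratio(500, 50)` works out to ten clients per database connection.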
Pricing
RDS Proxy pricing is based on the vCPUs of your target database:
| Target Instance vCPUs | Proxy Cost (per hour) |
|---|---|
| 1 vCPU | ~$0.015 |
| 2 vCPUs | ~$0.030 |
| 4 vCPUs | ~$0.060 |
| 8 vCPUs | ~$0.120 |
Example: db.t3.medium (2 vCPUs) = ~$21.60/month for the proxy
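The monthly figure is the hourly rate multiplied out over a 30-day month. Here is a small sketch of that arithmetic, using the approximate per-vCPU price from the table. `monthly_proxy_cost` is an illustrative helper; check current regional pricing before relying on it.

```python
PRICE_PER_VCPU_HOUR = 0.015  # approximate figure from the table above
HOURS_PER_MONTH = 720        # 30 days x 24 hours


def monthly_proxy_cost(vcpus):
    """Approximate monthly proxy cost in USD for a target with `vcpus` vCPUs."""
    return round(vcpus * PRICE_PER_VCPU_HOUR * HOURS_PER_MONTH, 2)
```

For the 2-vCPU db.t3.medium this reproduces the roughly $21.60 per month quoted above.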
When is it worth it?
- If connection exhaustion is causing errors: worth it
- If you need improved failover: worth it
- If Lambda cold start latency is hurting you: worth it
- If you’re running low concurrency with no issues: probably not
Common Issues and Solutions
1. “IAM authentication is not enabled”
Error: IAM authentication is not enabled for this database instance
Fix: Enable IAM authentication on the RDS instance:
resource "aws_db_instance" "main" {
# ...
iam_database_authentication_enabled = true
}
2. “Access denied for user”
Error: Access denied for user 'dbadmin'@'%'
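Fix: Check that the user exists in the database, that the secret holds its current password, and that the Lambda role's rds-db:connect policy names the same user. For direct IAM connections to the instance (bypassing the proxy), the user also needs the rds_iam role. Be careful granting it to the user the proxy logs in as: on PostgreSQL, a user with rds_iam can no longer authenticate with a password, which is how the proxy connects with the Secrets Manager credentials: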
-- For PostgreSQL
CREATE USER dbadmin WITH LOGIN;
GRANT rds_iam TO dbadmin;
3. Proxy not becoming available
The proxy can take 5-10 minutes to become available after creation. Check the status:
aws rds describe-db-proxies \
--db-proxy-name lambda-app-proxy \
--query 'DBProxies[0].Status'
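The same check can be scripted when a deployment step has to wait for the proxy. This is a minimal sketch that polls `describe_db_proxies` on a boto3 RDS client passed in by the caller. `wait_for_proxy`, its timeout defaults, and the injectable `sleep` are assumptions for the example, not an AWS-provided waiter.

```python
import time


def wait_for_proxy(rds_client, proxy_name, timeout_seconds=900,
                   poll_seconds=15, sleep=time.sleep):
    """Poll until the proxy status is 'available' or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = rds_client.describe_db_proxies(DBProxyName=proxy_name)
        status = response["DBProxies"][0]["Status"]
        if status == "available":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"proxy {proxy_name!r} still {status!r} after {timeout_seconds}s"
            )
        sleep(poll_seconds)
```

Typical use would be `wait_for_proxy(boto3.client("rds"), "lambda-app-proxy")`, run from a machine with credentials that allow `rds:DescribeDBProxies`.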
4. Lambda timeout connecting to proxy
Causes:
- Security group not allowing traffic
- Lambda not in VPC
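- NAT Gateway or VPC endpoints missing, if the function also calls AWS APIs such as Secrets Manager
Fix: Ensure Lambda can route to the proxy (same VPC, or a connected one) and that the security groups allow traffic on the database port. Generating the IAM token with generate_db_auth_token is a local signing step and makes no network call, so outbound internet access only matters if the function calls other AWS services.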
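When debugging a timeout, it helps to separate network reachability from authentication. Here is a small sketch of a TCP check you could run temporarily from inside the function. `can_reach` is a hypothetical helper, and it only proves that a TCP connection to the port can be opened, nothing about TLS or credentials.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling `can_reach(os.environ["DB_PROXY_ENDPOINT"], 5432)` from the handler and logging the result shows whether the problem is routing and security groups or something later in the connection.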
Best Practices Summary
- Use IAM authentication - More secure than password-based, with short-lived tokens instead of a stored password
- Keep transactions short - Long transactions cause pinning
- Avoid session state - Temp tables, user variables cause pinning
- Monitor pool metrics - Watch for connection exhaustion
- Set appropriate timeouts - connection_borrow_timeout should be less than Lambda timeout
- Use connection pooling in code - Even with proxy, reuse connections within a Lambda execution
- Test failover - RDS Proxy handles failover, but test your application’s behaviour
Key Takeaways
- RDS Proxy solves Lambda connection exhaustion by pooling connections
- IAM authentication is recommended for Lambda → Proxy
- Connection pinning reduces effectiveness - minimise transactions and session state
- Monitor CloudWatch metrics to tune pool configuration
- Cost is per-vCPU of target database - factor into decisions
The Lambda-RDS connection problem catches many teams off guard. RDS Proxy isn’t free, but it’s far cheaper than the alternative: scaling your database instance just to handle connection overhead.