RDS Proxy for Lambda - Solving the Connection Exhaustion Problem

AWS, Databases

Your Lambda function connects to RDS. Works fine in development. Then you hit production traffic - 500 concurrent executions - and your database falls over with “too many connections.”

This is the Lambda-RDS connection problem. Each concurrent Lambda execution environment opens its own database connection. At scale, you exhaust your database’s connection limit, causing failures across your entire application.

RDS Proxy solves this by sitting between Lambda and your database, pooling and reusing connections. Instead of 500 Lambda executions creating 500 database connections, they share a pool of maybe 50 connections managed by the proxy.

This post covers when to use RDS Proxy, how it works, and a complete Terraform setup for production.

TL;DR

  • Each concurrent Lambda execution opens its own DB connection - doesn’t scale
  • RDS Proxy pools connections between Lambda and RDS/Aurora
  • Reduces connection count, improves failover handling
  • Use IAM authentication from Lambda to the proxy
  • Proxy connects to RDS using Secrets Manager credentials
  • Lambda must be in a VPC to use RDS Proxy

Code Repository: All code from this post is available at github.com/moabukar/blog-code/rds-proxy-lambda


The Problem: Lambda Connection Exhaustion

Traditional applications maintain a connection pool - open connections at startup, reuse them for requests. Lambda doesn’t work that way:

Traditional App:
┌─────────────┐     10 connections      ┌──────────┐
│    App      │ ══════════════════════► │   RDS    │
│  (pooled)   │     (reused)            │          │
└─────────────┘                         └──────────┘

Lambda at Scale:
┌─────────────┐ ─┐
│  Lambda 1   │  │
├─────────────┤  │
│  Lambda 2   │  │
├─────────────┤  │    500 connections    ┌──────────┐
│  Lambda 3   │  ├═══════════════════════►│   RDS    │
├─────────────┤  │    (new each time)    │  💥      │
│    ...      │  │                        └──────────┘
├─────────────┤  │
│  Lambda 500 │  │
└─────────────┘ ─┘

Problems:

  1. Connection limit exhaustion - RDS instances have max connection limits based on instance size (e.g., db.t3.micro = ~85 connections)
  2. Connection overhead - Each new connection requires TCP handshake, TLS negotiation, authentication
  3. Cold starts are worse - Establishing DB connections adds latency
  4. Failover handling - If RDS fails over, Lambda functions holding connections get errors

RDS Connection Limits

Connection limits vary by instance class:

Instance Class    Max Connections (approx)
db.t3.micro       85
db.t3.small       170
db.t3.medium      340
db.r5.large       1,000
db.r5.xlarge      2,000

With 500 concurrent Lambda executions, even a db.r5.large might struggle.
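
A quick way to sanity-check this is to subtract the expected concurrency from the instance limit, leaving a few connections for admin sessions and migrations. The sketch below uses the approximate limits from the table above. The function name and the `reserved` allowance of 10 are assumptions for illustration, not AWS values.

```python
# Approximate limits from the table above (not exact values).
APPROX_MAX_CONNECTIONS = {
    "db.t3.micro": 85,
    "db.t3.small": 170,
    "db.t3.medium": 340,
    "db.r5.large": 1000,
    "db.r5.xlarge": 2000,
}


def connection_headroom(instance_class: str, concurrency: int, reserved: int = 10) -> int:
    """Connections left over if every concurrent execution holds one connection.

    A negative result means the instance would run out of connections.
    `reserved` is a hypothetical allowance for admin tools and migrations.
    """
    limit = APPROX_MAX_CONNECTIONS[instance_class]
    return limit - reserved - concurrency


if __name__ == "__main__":
    for instance_class in APPROX_MAX_CONNECTIONS:
        print(instance_class, connection_headroom(instance_class, concurrency=500))
```

Under these assumptions every t3 class in the table goes negative at 500 concurrent executions, and db.r5.large has roughly half of its connections left before any other client connects.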


How RDS Proxy Solves This

RDS Proxy sits between Lambda and your database:

┌─────────────┐ ─┐
│  Lambda 1   │  │
├─────────────┤  │
│  Lambda 2   │  │    500 Lambda        ┌───────────┐    50 DB       ┌──────────┐
├─────────────┤  │    connections       │           │  connections   │          │
│  Lambda 3   │  ├════════════════════►│ RDS Proxy │═══════════════►│   RDS    │
├─────────────┤  │    (to proxy)        │  (pool)   │   (reused)     │   ✓      │
│    ...      │  │                      └───────────┘                └──────────┘
├─────────────┤  │
│  Lambda 500 │  │
└─────────────┘ ─┘

The proxy:

  1. Maintains a connection pool to your database
  2. Multiplexes Lambda requests over fewer database connections
  3. Reuses connections - no per-request connection overhead
  4. Handles failovers - automatically reconnects to new primary
  5. Queues requests when the pool is busy (instead of failing)
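
The pooling and queueing behaviour in the list above can be illustrated with a toy model. This is only a sketch of the idea, not how RDS Proxy is implemented: a semaphore stands in for the pool of 50 database connections, 500 threads stand in for Lambda executions, and a short sleep stands in for a query.

```python
import threading
import time

POOL_SIZE = 50   # database connections held by the "proxy"
CLIENTS = 500    # concurrent Lambda executions


def simulate(clients: int = CLIENTS, pool_size: int = POOL_SIZE) -> int:
    """Run `clients` short queries through `pool_size` slots; return peak slots in use."""
    pool = threading.BoundedSemaphore(pool_size)
    lock = threading.Lock()
    in_use = 0
    peak = 0

    def query() -> None:
        nonlocal in_use, peak
        with pool:                  # blocks (queues) while every slot is busy
            with lock:
                in_use += 1
                peak = max(peak, in_use)
            time.sleep(0.001)       # stand-in for a short query
            with lock:
                in_use -= 1

    threads = [threading.Thread(target=query) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    print(f"peak database connections: {simulate()}")
```

However many clients run, the peak never exceeds the pool size. Extra clients wait for a slot instead of failing, which is the behaviour described in point 5.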

When to Use RDS Proxy

Use RDS Proxy when:

  • Lambda functions make frequent, short-lived database queries
  • You have high concurrency (100+ concurrent executions)
  • You’re hitting connection limits
  • You need improved failover handling
  • You want IAM-based database authentication

Don’t use RDS Proxy when:

  • Low concurrency (a few requests per second)
  • Long-running transactions (proxy pins connections)
  • Cost is a major concern (proxy adds ~$0.015/hour per vCPU of target DB)
  • You need features that cause connection pinning (see below)

Connection Pinning - The Important Gotcha

RDS Proxy multiplexes connections - multiple Lambda executions share database connections. But some operations require a dedicated connection. This is called pinning.

When a connection is pinned, that Lambda execution holds a database connection until the session ends. This reduces the effectiveness of pooling.

Operations that cause pinning:

Operation                 Causes Pinning
Open transaction          Yes (until COMMIT/ROLLBACK)
Temporary tables          Yes
User-defined variables    Yes
Prepared statements       Depends on settings
SET statements            Some
LOCK TABLES               Yes
Large statements          Yes (statement text > 16 KB)

Best practices to minimise pinning:

# BAD - Transaction stays open across Lambda invocation
def handler(event, context):
    conn = get_connection()
    cursor = conn.cursor()
    cursor.execute("BEGIN")
    cursor.execute("INSERT INTO orders ...")
    # Connection is pinned until COMMIT
    return {"statusCode": 200}
    # Never committed! Connection stays pinned.

# GOOD - Complete transactions quickly
def handler(event, context):
    conn = get_connection()
    cursor = conn.cursor()
    try:
        cursor.execute("BEGIN")
        cursor.execute("INSERT INTO orders ...")
        cursor.execute("UPDATE inventory ...")
        conn.commit()  # Transaction complete, unpinned
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()
    return {"statusCode": 200}
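
One way to make the "GOOD" pattern hard to get wrong is to wrap it in a context manager. The helper below is a minimal sketch. The name `short_transaction` is hypothetical, and it assumes a DB-API 2.0 connection such as the one psycopg2 returns.

```python
from contextlib import contextmanager


@contextmanager
def short_transaction(conn):
    """Yield a cursor, then commit on success or roll back on error.

    `conn` is any DB-API 2.0 connection (for example one returned by
    psycopg2.connect). The connection is always closed on exit.
    """
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()      # transaction ends here
    except Exception:
        conn.rollback()    # also ends the transaction
        raise
    finally:
        cursor.close()
        conn.close()
```

A handler can then run `with short_transaction(get_connection()) as cursor:` and execute its statements inside the block. The transaction is committed or rolled back before the handler returns, so the connection cannot stay pinned by a forgotten COMMIT.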

Architecture Overview

Here’s what we’ll build:

┌─────────────────────────────────────────────────────────────────────┐
│                              VPC                                     │
│                                                                      │
│  ┌──────────────────┐     ┌──────────────────┐     ┌─────────────┐  │
│  │  Lambda Function │────►│    RDS Proxy     │────►│    RDS      │  │
│  │  (private subnet)│     │  (private subnet)│     │  (private)  │  │
│  └──────────────────┘     └──────────────────┘     └─────────────┘  │
│           │                        │                      │          │
│           │ IAM Auth               │                      │          │
│           ▼                        ▼                      │          │
│  ┌──────────────────┐     ┌──────────────────┐           │          │
│  │    IAM Role      │     │ Secrets Manager  │───────────┘          │
│  │ (rds-db:connect) │     │ (DB credentials) │                      │
│  └──────────────────┘     └──────────────────┘                      │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Authentication flow:

  1. Lambda → Proxy: IAM authentication (recommended) or the database username and password
  2. Proxy → RDS: Credentials from Secrets Manager

Terraform Implementation

1. VPC and Networking

Lambda and RDS Proxy must be in the same VPC:

# vpc.tf

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "lambda-rds-vpc"
  }
}

# Private subnets for Lambda, RDS Proxy, and RDS
resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "private-${count.index + 1}"
  }
}

data "aws_availability_zones" "available" {
  state = "available"
}

2. Security Groups

# security-groups.tf

# Lambda security group
resource "aws_security_group" "lambda" {
  name        = "lambda-sg"
  description = "Security group for Lambda functions"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "lambda-sg"
  }
}

# RDS Proxy security group
resource "aws_security_group" "rds_proxy" {
  name        = "rds-proxy-sg"
  description = "Security group for RDS Proxy"
  vpc_id      = aws_vpc.main.id

  # Allow inbound from Lambda
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.lambda.id]
    description     = "PostgreSQL from Lambda"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "rds-proxy-sg"
  }
}

# RDS security group
resource "aws_security_group" "rds" {
  name        = "rds-sg"
  description = "Security group for RDS"
  vpc_id      = aws_vpc.main.id

  # Allow inbound from RDS Proxy only
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.rds_proxy.id]
    description     = "PostgreSQL from RDS Proxy"
  }

  tags = {
    Name = "rds-sg"
  }
}

3. RDS Instance

# rds.tf

resource "aws_db_subnet_group" "main" {
  name       = "main"
  subnet_ids = aws_subnet.private[*].id

  tags = {
    Name = "main-db-subnet-group"
  }
}

resource "aws_db_instance" "main" {
  identifier     = "lambda-app-db"
  engine         = "postgres"
  engine_version = "15.4"
  instance_class = "db.t3.medium"

  allocated_storage     = 20
  max_allocated_storage = 100
  storage_type          = "gp3"
  storage_encrypted     = true

  db_name  = "appdb"
  username = "dbadmin"
  password = random_password.db_password.result

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]

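  # Not needed by the proxy in this setup (it logs in with the Secrets Manager password);
  # only required for IAM connections made directly to the instance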
  iam_database_authentication_enabled = true

  backup_retention_period = 7
  skip_final_snapshot     = true
  deletion_protection     = false

  tags = {
    Name = "lambda-app-db"
  }
}

resource "random_password" "db_password" {
  length  = 32
  special = false
}

4. Secrets Manager

RDS Proxy needs database credentials stored in Secrets Manager:

# secrets.tf

resource "aws_secretsmanager_secret" "db_credentials" {
  name        = "rds-proxy/db-credentials"
  description = "Database credentials for RDS Proxy"
}

resource "aws_secretsmanager_secret_version" "db_credentials" {
  secret_id = aws_secretsmanager_secret.db_credentials.id
  secret_string = jsonencode({
    username = aws_db_instance.main.username
    password = random_password.db_password.result
    engine   = "postgres"
    host     = aws_db_instance.main.address
    port     = aws_db_instance.main.port
    dbname   = aws_db_instance.main.db_name
  })
}

5. RDS Proxy

# rds-proxy.tf

resource "aws_db_proxy" "main" {
  name                   = "lambda-app-proxy"
  debug_logging          = false
  engine_family          = "POSTGRESQL"
  idle_client_timeout    = 1800
  require_tls            = true
  role_arn               = aws_iam_role.rds_proxy.arn
  vpc_security_group_ids = [aws_security_group.rds_proxy.id]
  vpc_subnet_ids         = aws_subnet.private[*].id

  auth {
    auth_scheme               = "SECRETS"
    client_password_auth_type = "POSTGRES_SCRAM_SHA_256"
    iam_auth                  = "REQUIRED"
    secret_arn                = aws_secretsmanager_secret.db_credentials.arn
  }

  tags = {
    Name = "lambda-app-proxy"
  }
}

resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name

  connection_pool_config {
    connection_borrow_timeout    = 120
    max_connections_percent      = 100
    max_idle_connections_percent = 50
  }
}

resource "aws_db_proxy_target" "main" {
  db_instance_identifier = aws_db_instance.main.identifier
  db_proxy_name          = aws_db_proxy.main.name
  target_group_name      = aws_db_proxy_default_target_group.main.name
}

# IAM role for RDS Proxy to access Secrets Manager
resource "aws_iam_role" "rds_proxy" {
  name = "rds-proxy-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "rds.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy" "rds_proxy_secrets" {
  name = "rds-proxy-secrets"
  role = aws_iam_role.rds_proxy.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "secretsmanager:GetSecretValue"
        ]
        Resource = [aws_secretsmanager_secret.db_credentials.arn]
      },
      {
        Effect = "Allow"
        Action = [
          "kms:Decrypt"
        ]
        Resource = "*"
        Condition = {
          StringEquals = {
            "kms:ViaService" = "secretsmanager.${var.region}.amazonaws.com"
          }
        }
      }
    ]
  })
}

6. Lambda Function

# lambda.tf

resource "aws_lambda_function" "api" {
  filename         = data.archive_file.lambda.output_path
  function_name    = "api-handler"
  role             = aws_iam_role.lambda.arn
  handler          = "index.handler"
  source_code_hash = data.archive_file.lambda.output_base64sha256
  runtime          = "python3.11"
  timeout          = 30
  memory_size      = 256

  vpc_config {
    subnet_ids         = aws_subnet.private[*].id
    security_group_ids = [aws_security_group.lambda.id]
  }

  environment {
    variables = {
      DB_PROXY_ENDPOINT = aws_db_proxy.main.endpoint
      DB_NAME           = aws_db_instance.main.db_name
      DB_PORT           = "5432"
      DB_USER           = aws_db_instance.main.username
      AWS_REGION_NAME   = var.region
    }
  }

  tags = {
    Name = "api-handler"
  }
}

data "archive_file" "lambda" {
  type        = "zip"
  source_dir  = "${path.module}/lambda"
  output_path = "${path.module}/lambda.zip"
}

# IAM role for Lambda
resource "aws_iam_role" "lambda" {
  name = "lambda-api-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "lambda.amazonaws.com"
      }
    }]
  })
}

# Basic Lambda execution policy
resource "aws_iam_role_policy_attachment" "lambda_basic" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

# VPC access for Lambda
resource "aws_iam_role_policy_attachment" "lambda_vpc" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}

# RDS Proxy IAM authentication
resource "aws_iam_role_policy" "lambda_rds_proxy" {
  name = "lambda-rds-proxy-connect"
  role = aws_iam_role.lambda.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "rds-db:connect"
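      # rds-db:connect expects the proxy's resource ID (prx-...), the last segment of its ARN
      Resource = "arn:aws:rds-db:${var.region}:${data.aws_caller_identity.current.account_id}:dbuser:${element(split(":", aws_db_proxy.main.arn), 6)}/${aws_db_instance.main.username}"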
    }]
  })
}

data "aws_caller_identity" "current" {}
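
The resource in an rds-db:connect policy for a proxy has the form arn:aws:rds-db:<region>:<account>:dbuser:<prx-resource-id>/<db-user>, where the prx- ID is the last segment of the proxy's own ARN. The helper below is a small sketch for checking what the policy should contain. The function name and the example account and proxy IDs are made up for illustration.

```python
def rds_db_connect_arn(proxy_arn: str, db_user: str) -> str:
    """Build the rds-db:connect resource ARN for a proxy.

    proxy_arn looks like arn:aws:rds:<region>:<account>:db-proxy:prx-<id>
    """
    parts = proxy_arn.split(":")
    if len(parts) != 7 or parts[5] != "db-proxy":
        raise ValueError(f"not an RDS Proxy ARN: {proxy_arn}")
    partition, region, account, proxy_id = parts[1], parts[3], parts[4], parts[6]
    return f"arn:{partition}:rds-db:{region}:{account}:dbuser:{proxy_id}/{db_user}"


if __name__ == "__main__":
    # Hypothetical account and proxy IDs
    print(rds_db_connect_arn(
        "arn:aws:rds:eu-west-1:123456789012:db-proxy:prx-0123456789abcdef0",
        "dbadmin",
    ))
```

Comparing this string with the Resource in the deployed policy is a quick check when IAM authentication to the proxy is denied.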

7. Lambda Function Code (Python)

# lambda/index.py
import os
import boto3
import psycopg2

def get_connection():
    """
    Connect to RDS via RDS Proxy using IAM authentication.
    """
    # Get RDS auth token
    client = boto3.client('rds')
    
    token = client.generate_db_auth_token(
        DBHostname=os.environ['DB_PROXY_ENDPOINT'],
        Port=int(os.environ['DB_PORT']),
        DBUsername=os.environ['DB_USER'],
        Region=os.environ['AWS_REGION_NAME']
    )
    
    # Connect using the token as password
    conn = psycopg2.connect(
        host=os.environ['DB_PROXY_ENDPOINT'],
        port=os.environ['DB_PORT'],
        database=os.environ['DB_NAME'],
        user=os.environ['DB_USER'],
        password=token,
        sslmode='require'
    )
    
    return conn

def handler(event, context):
    """
    Example Lambda handler that queries the database.
    """
    conn = None
    try:
        conn = get_connection()
        cursor = conn.cursor()
        
        # Example query
        cursor.execute("SELECT version();")
        version = cursor.fetchone()[0]
        
        cursor.close()
        
        return {
            'statusCode': 200,
            'body': f'Connected via RDS Proxy! PostgreSQL version: {version}'
        }
    
    except Exception as e:
        return {
            'statusCode': 500,
            'body': f'Error: {str(e)}'
        }
    
    finally:
        if conn:
            conn.close()

Connection Pool Configuration

The proxy’s connection pool settings are critical:

resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name

  connection_pool_config {
    # How long a client can wait for a connection from the pool
    connection_borrow_timeout = 120  # seconds

    # Maximum connections as percentage of max_connections
    max_connections_percent = 100

    # Idle connections to keep as percentage of max_connections
    max_idle_connections_percent = 50

    # Optional: SQL to run when connection is created
    # init_query = "SET timezone='UTC'"
  }
}

Tuning guidance:

  • max_connections_percent: Start at 100%, reduce if you want to reserve connections for direct access
  • max_idle_connections_percent: Higher = faster response to traffic spikes, but more idle connections
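  • connection_borrow_timeout: How long Lambda waits if the pool is exhausted. Set it lower than the Lambda timeout minus expected query time, so the function gets a clear error instead of timing out. The 120 seconds shown above exceeds the 30-second timeout of the example function, so lower it to fit your own timeout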
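
To make the percentages concrete, the sketch below turns the pool settings into absolute numbers and derives a borrow timeout that stays below the function timeout. The function names and the 2-second `margin_s` are assumptions. The 340-connection figure is the approximate db.t3.medium limit from earlier in the post.

```python
def pool_budget(db_max_connections: int,
                max_connections_percent: int,
                max_idle_connections_percent: int) -> tuple[int, int]:
    """Return (max proxy-to-DB connections, max idle connections kept open)."""
    max_conns = db_max_connections * max_connections_percent // 100
    max_idle = db_max_connections * max_idle_connections_percent // 100
    return max_conns, max_idle


def borrow_timeout(lambda_timeout_s: int, expected_query_s: int, margin_s: int = 2) -> int:
    """Longest wait for a pooled connection that still leaves time for the query.

    `margin_s` is a hypothetical buffer for returning a clean error to the caller.
    """
    return max(lambda_timeout_s - expected_query_s - margin_s, 1)


if __name__ == "__main__":
    print(pool_budget(340, 100, 50))   # db.t3.medium with the settings above
    print(borrow_timeout(30, 3))       # 30 s Lambda timeout, ~3 s queries
```

With the settings in this post, the proxy may open up to about 340 connections and keep about 170 idle. A function with a 30-second timeout and roughly 3-second queries would want a borrow timeout of around 25 seconds.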

Monitoring and Metrics

Key CloudWatch metrics for RDS Proxy:

# cloudwatch.tf

resource "aws_cloudwatch_metric_alarm" "proxy_connections" {
  alarm_name          = "rds-proxy-high-connections"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "DatabaseConnections"
  namespace           = "AWS/RDS"
  period              = 60
  statistic           = "Average"
  threshold           = 80
  alarm_description   = "RDS Proxy connections high"

  dimensions = {
    ProxyName = aws_db_proxy.main.name
  }

  alarm_actions = [aws_sns_topic.alerts.arn]
}

resource "aws_cloudwatch_metric_alarm" "client_connections" {
  alarm_name          = "rds-proxy-high-client-connections"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "ClientConnections"
  namespace           = "AWS/RDS"
  period              = 60
  statistic           = "Average"
  threshold           = 400
  alarm_description   = "Too many Lambda connections to proxy"

  dimensions = {
    ProxyName = aws_db_proxy.main.name
  }

  alarm_actions = [aws_sns_topic.alerts.arn]
}

Key metrics to watch:

Metric                              Description                          What to look for
ClientConnections                   Connections from Lambda to proxy     Should be <= your Lambda concurrency
DatabaseConnections                 Connections from proxy to RDS        Should be much lower than ClientConnections
DatabaseConnectionsBorrowLatency    Time to get a connection from pool   Spikes indicate pool exhaustion
QueryDatabaseResponseLatency        Query response time                  Baseline for your queries
ClientConnectionsReceived           New connections per second           High values = Lambda cold starts
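
A single number that summarises the first two rows is the ratio of client connections to database connections. The helper below is an illustrative sketch, not an AWS metric. It assumes you already have the two values, for example from CloudWatch.

```python
def multiplexing_ratio(client_connections: float, database_connections: float) -> float:
    """Client connections served per database connection (higher is better)."""
    if database_connections <= 0:
        return 0.0
    return client_connections / database_connections


if __name__ == "__main__":
    # The 500-to-50 example used earlier in this post
    print(multiplexing_ratio(500, 50))
```

A ratio that drifts towards 1 means the proxy is holding roughly one database connection per client. Given the pinning behaviour described earlier, that is worth investigating.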

Pricing

RDS Proxy pricing is based on the vCPUs of your target database:

Target Instance vCPUs    Proxy Cost (per hour)
1 vCPU                   ~$0.015
2 vCPUs                  ~$0.030
4 vCPUs                  ~$0.060
8 vCPUs                  ~$0.120

Example: db.t3.medium (2 vCPUs) = ~$21.60/month for the proxy
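
The monthly figure is the hourly rate multiplied out over a 30-day month. The sketch below uses the approximate $0.015 per vCPU-hour quoted above. Check current pricing for your region before relying on it.

```python
PRICE_PER_VCPU_HOUR = 0.015   # approximate figure quoted in this post
HOURS_PER_MONTH = 720         # 30-day month


def monthly_proxy_cost(vcpus: int) -> float:
    """Approximate monthly RDS Proxy cost in USD for a target with `vcpus` vCPUs."""
    return round(vcpus * PRICE_PER_VCPU_HOUR * HOURS_PER_MONTH, 2)


if __name__ == "__main__":
    for vcpus in (1, 2, 4, 8):
        print(vcpus, monthly_proxy_cost(vcpus))
```

For the 2-vCPU db.t3.medium this gives 2 × 0.015 × 720 = 21.60.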

When is it worth it?

  • If connection exhaustion is causing errors: worth it
  • If you need improved failover: worth it
  • If Lambda cold start latency is hurting you: worth it
  • If you’re running low concurrency with no issues: probably not

Common Issues and Solutions

1. “IAM authentication is not enabled”

Error: IAM authentication is not enabled for this database instance

Fix: Enable IAM authentication on the RDS instance:

resource "aws_db_instance" "main" {
  # ...
  iam_database_authentication_enabled = true
}

2. “Access denied for user”

Error: Access denied for user 'dbadmin'@'%'

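Fix: Check that the user in the rds-db:connect policy matches DB_USER, and that the same user exists in the database with the password stored in the proxy's Secrets Manager secret. The rds_iam grant below applies only when clients use IAM authentication directly against the instance. On RDS for PostgreSQL a role with rds_iam can no longer log in with a password, which is how the proxy connects in this setup, so don't grant it to the user the proxy logs in as:

-- For PostgreSQL, direct IAM connections to the instance only (not via the proxy)
-- iam_app_user is a hypothetical user name for illustration
CREATE USER iam_app_user WITH LOGIN;
GRANT rds_iam TO iam_app_user;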

3. Proxy not becoming available

The proxy can take 5-10 minutes to become available after creation. Check the status:

aws rds describe-db-proxies \
  --db-proxy-name lambda-app-proxy \
  --query 'DBProxies[0].Status'
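
The same check can be scripted from Python. The sketch below polls `describe_db_proxies` until the status is "available". The function name, the 15-minute default timeout and the 30-second poll interval are assumptions. The client is passed in so the function can be used with `boto3.client("rds")` or with a stub.

```python
import time


def wait_for_proxy(rds_client, proxy_name: str, timeout_s: int = 900, poll_s: int = 30) -> bool:
    """Poll describe_db_proxies until the proxy is 'available' or the timeout passes.

    Returns True when the proxy is available, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        response = rds_client.describe_db_proxies(DBProxyName=proxy_name)
        status = response["DBProxies"][0]["Status"]
        if status == "available":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Typical usage would be `wait_for_proxy(boto3.client("rds"), "lambda-app-proxy")` in a deployment script, before running tests that depend on the proxy.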

4. Lambda timeout connecting to proxy

Causes:

  • Security group not allowing traffic
  • Lambda not in VPC
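  • NAT Gateway or VPC endpoints missing, if the function also calls AWS APIs such as Secrets Manager

Fix: Ensure Lambda is in the same VPC as the proxy and that the security groups allow traffic on port 5432. Generating the IAM auth token is a local signing operation in the SDK, so it needs no internet access. Outbound access (a NAT Gateway or VPC endpoints) is only needed if the function calls other AWS services.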
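
To separate a network problem from an authentication problem, it can help to test plain TCP reachability from inside the function first. The helper below is a minimal sketch using only the standard library. The function name and the 3-second default timeout are assumptions.

```python
import socket


def can_reach(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures
        return False
```

Calling `can_reach(os.environ["DB_PROXY_ENDPOINT"], 5432)` from a temporary handler and getting False points at security groups, subnets or routing rather than at credentials.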


Best Practices Summary

  1. Use IAM authentication - More secure than password-based, with short-lived tokens (valid for 15 minutes)
  2. Keep transactions short - Long transactions cause pinning
  3. Avoid session state - Temp tables, user variables cause pinning
  4. Monitor pool metrics - Watch for connection exhaustion
  5. Set appropriate timeouts - connection_borrow_timeout should be less than Lambda timeout
  6. Use connection pooling in code - Even with proxy, reuse connections within a Lambda execution
  7. Test failover - RDS Proxy handles failover, but test your application’s behaviour
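
One reading of point 6 is to keep a connection at module scope, so that warm invocations in the same execution environment reuse it instead of reconnecting. The sketch below shows that pattern. The name `get_cached_connection` is hypothetical, and it relies on the psycopg2 convention that `conn.closed` is 0 while the connection is open.

```python
_conn = None  # lives as long as the Lambda execution environment


def get_cached_connection(connect):
    """Return a cached connection, calling `connect()` only when needed.

    `connect` is any zero-argument callable that returns a DB-API connection,
    for example the get_connection function from the Lambda code in this post.
    psycopg2 connections expose `closed` (0 while open); other drivers may differ.
    """
    global _conn
    if _conn is None or getattr(_conn, "closed", 0):
        _conn = connect()
    return _conn
```

A handler using this pattern would call `get_cached_connection(get_connection)` and would not close the connection in its `finally` block. Transactions should still be committed or rolled back before returning, so the cached connection does not stay pinned.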

Key Takeaways

  • RDS Proxy solves Lambda connection exhaustion by pooling connections
  • IAM authentication is recommended for Lambda → Proxy
  • Connection pinning reduces effectiveness - minimise transactions and session state
  • Monitor CloudWatch metrics to tune pool configuration
  • Cost is per-vCPU of target database - factor into decisions

The Lambda-RDS connection problem catches many teams off guard. RDS Proxy isn’t free, but it’s far cheaper than the alternative: scaling your database instance just to handle connection overhead.


Questions? Find me on LinkedIn or GitHub.
