Cloud Unit Economics for Multi-Tenant SaaS - Cost Per Customer, Not Per Service

Your AWS bill tells you that EKS costs £50,000/month and Aurora costs £15,000/month. But what does Customer A cost? What about Customer B who does 10x the transactions? Traditional cloud billing shows you spend by service - it doesn’t show you spend by customer, transaction, or business unit.

This is the unit economics problem, and for multi-tenant SaaS platforms, it’s critical. Without it, you can’t answer:

  • Which customers are profitable?
  • What’s the true margin on each deal?
  • Where should we optimise?
  • How should we price?

I recently helped a client solve this for their multi-tenant platform running on EKS with shared Aurora, DynamoDB, MSK, and Keyspaces backends. This post covers the approach, the tooling, and the gotchas.

The Problem: Shared Infrastructure, Unknown Attribution

Consider this typical multi-tenant architecture:

┌─────────────────────────────────────────────────────────────────────┐
│                           Customers                                  │
│  ┌────────┐    ┌────────┐    ┌────────┐                            │
│  │ Cust A │    │ Cust B │    │ Cust C │                            │
│  └───┬────┘    └───┬────┘    └───┬────┘                            │
│      │             │             │                                   │
│      └─────────────┼─────────────┘                                   │
│                    │                                                  │
│                    ▼                                                  │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                      CloudFront                              │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                    │                                                  │
│                    ▼                                                  │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                    EKS Cluster                               │    │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐        │    │
│  │  │ Login   │  │ Orders  │  │ Payment │  │ Common  │        │    │
│  │  │ Service │  │ Service │  │ Service │  │ Services│        │    │
│  │  └─────────┘  └─────────┘  └─────────┘  └─────────┘        │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                    │                                                  │
│      ┌─────────────┼─────────────────────────────┐                   │
│      │             │             │               │                   │
│      ▼             ▼             ▼               ▼                   │
│  ┌────────┐   ┌────────┐   ┌──────────┐   ┌──────────┐             │
│  │ Aurora │   │DynamoDB│   │ KeySpaces│   │   MSK    │             │
│  │(shared)│   │(shared)│   │ (shared) │   │ (shared) │             │
│  └────────┘   └────────┘   └──────────┘   └──────────┘             │
└─────────────────────────────────────────────────────────────────────┘

The challenge:

  • All customers hit the same EKS pods
  • All customers share the same Aurora cluster
  • All customers write to the same DynamoDB tables
  • Tenants are isolated at the data level, not the infrastructure level

AWS Cost Explorer will tell you Aurora costs £15k/month. It won’t tell you that Customer A costs £8k and Customer B costs £2k.

Unit Economics Defined

Unit economics = Cost to serve one unit of business value

Common units:

  • Cost per customer - Total cost / number of customers
  • Cost per transaction - Total cost / number of transactions
  • Cost per API call - Total cost / number of API requests
  • Cost per user - Total cost / active users
  • Cost per order - Total cost / orders processed

The “right” unit depends on your business model:

  • Per-seat SaaS → Cost per user
  • Transaction platform → Cost per transaction
  • API business → Cost per 1M requests
  • E-commerce → Cost per order
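Whatever unit you pick, the arithmetic is the same: total cost for the period divided by units delivered. A minimal sketch, reusing the illustrative dashboard figures from later in this post:

```python
# Minimal sketch: computing unit costs from monthly totals.
# All figures are illustrative, not from a real bill.

def unit_cost(total_cost: float, units: int) -> float:
    """Cost to serve one unit of business value."""
    return total_cost / units if units else 0.0

monthly_cost = 85_000.0      # total cloud spend (GBP)
customers = 150
transactions = 6_854_839     # transactions processed this month

cost_per_customer = unit_cost(monthly_cost, customers)
cost_per_1k_transactions = unit_cost(monthly_cost, transactions) * 1_000

print(f"Cost per customer: £{cost_per_customer:,.2f}")
print(f"Cost per 1k transactions: £{cost_per_1k_transactions:.2f}")
```

The hard part isn't the division - it's producing a defensible per-tenant numerator, which is what the rest of this post is about.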

The Solution: Multi-Dimensional Cost Attribution

To solve this, we need to:

  1. Tag everything possible at the AWS level
  2. Instrument applications to emit tenant context
  3. Collect resource usage at the tenant level
  4. Allocate shared costs proportionally
  5. Build a cost model that combines direct and allocated costs

Step 1: AWS Tagging Strategy

Start with consistent tagging. Every resource needs:

tenant_id: customer-123        # Direct tenant if applicable
service: orders-api            # Which service
environment: production        # Environment
cost_center: platform          # Business allocation

For shared resources, tag with:

allocation_type: shared
allocation_basis: request_count  # How to split the cost

The problem: Most shared resources can’t be tagged per-tenant because multiple tenants use them simultaneously.

Step 2: Kubernetes Cost Attribution with OpenCost

OpenCost is the CNCF project for Kubernetes cost monitoring. It allocates cluster costs to namespaces, deployments, and labels.

Install OpenCost:

helm repo add opencost https://opencost.github.io/opencost-helm-chart
helm install opencost opencost/opencost \
  --namespace opencost --create-namespace \
  --set opencost.prometheus.internal.enabled=true \
  --set opencost.ui.enabled=true

Configure for tenant attribution:

The key is labeling your pods with tenant information when possible, or tracking tenant metrics separately.

For shared pods (most multi-tenant setups), OpenCost gives you cost-per-pod, but you need application-level metrics to split by tenant.

# Example: Pod with tenant label (for tenant-dedicated resources)
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: orders-api
    tenant: customer-123  # Only works for tenant-dedicated pods

For shared pods serving multiple tenants, you need a different approach.
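The core of that approach: take the pod's cost from OpenCost and split it in proportion to per-tenant request counts from your own application metrics. A sketch (the helper and figures are illustrative, not an OpenCost API):

```python
# Sketch: splitting one shared pod's cost across tenants by request
# share. Pod cost comes from OpenCost; per-tenant request counts come
# from your application metrics.

def split_pod_cost(pod_cost: float,
                   tenant_requests: dict[str, int]) -> dict[str, float]:
    """Allocate a shared pod's cost proportionally to tenant requests."""
    total = sum(tenant_requests.values())
    if total == 0:
        return {t: 0.0 for t in tenant_requests}
    return {t: pod_cost * n / total for t, n in tenant_requests.items()}

# £1,200/month orders-api pod, split by requests served
print(split_pod_cost(1200.0, {'cust-a': 750_000, 'cust-b': 250_000}))
# → {'cust-a': 900.0, 'cust-b': 300.0}
```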

Step 3: Application-Level Tenant Metrics

This is where most cost attribution projects fail. You need your application to emit tenant-tagged metrics.

Instrument your services:

# Python example with Prometheus metrics (assumes a Flask app)
from flask import Flask
from prometheus_client import Counter, Histogram

app = Flask(__name__)

# Request counter by tenant
requests_total = Counter(
    'http_requests_total',
    'Total HTTP requests',
    ['service', 'tenant_id', 'endpoint']
)

# Request duration by tenant
request_duration = Histogram(
    'http_request_duration_seconds',
    'Request duration',
    ['service', 'tenant_id', 'endpoint']
)

# In your request handler
@app.route('/api/orders')
def handle_orders():
    tenant_id = get_tenant_from_request()  # Extract from JWT, header, etc.
    
    with request_duration.labels(
        service='orders-api',
        tenant_id=tenant_id,
        endpoint='/api/orders'
    ).time():
        # Process request
        result = process_order()
    
    requests_total.labels(
        service='orders-api',
        tenant_id=tenant_id,
        endpoint='/api/orders'
    ).inc()
    
    return result

Key metrics to collect per tenant:

  • Request count
  • CPU time consumed
  • Memory high-water mark
  • Database queries executed
  • Storage bytes read/written
  • Kafka messages produced/consumed
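Request counts alone can understate heavy tenants (one tenant's requests may be far more expensive than another's), so one option is to blend several of these signals into a single allocation weight. A sketch with illustrative weights - the right blend is a judgment call for your workload:

```python
# Sketch: turning raw per-tenant metrics into allocation weights by
# blending normalised shares of several usage signals.

def shares(usage: dict[str, float]) -> dict[str, float]:
    """Normalise raw usage into fractions that sum to 1."""
    total = sum(usage.values())
    return {t: v / total for t, v in usage.items()} if total else {}

def blended_shares(metrics: dict[str, dict[str, float]],
                   weights: dict[str, float]) -> dict[str, float]:
    """Blend several per-tenant usage signals into one share per tenant."""
    per_metric = {name: shares(usage) for name, usage in metrics.items()}
    tenants = {t for usage in metrics.values() for t in usage}
    return {
        t: sum(weights[name] * per_metric[name].get(t, 0.0)
               for name in metrics)
        for t in tenants
    }

metrics = {
    'requests':    {'cust-a': 900, 'cust-b': 100},
    'cpu_seconds': {'cust-a': 300, 'cust-b': 700},
}
weights = {'requests': 0.5, 'cpu_seconds': 0.5}

# cust-a: 0.5*0.9 + 0.5*0.3 = 0.60; cust-b: 0.5*0.1 + 0.5*0.7 = 0.40
print(blended_shares(metrics, weights))
```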

Step 4: Database Cost Attribution

Shared databases are the hardest to attribute. Tenants are isolated at the row/table level, not the instance level.

Aurora/RDS Attribution

Aurora costs have multiple components:

  • Instance hours (compute)
  • Storage (GB-months)
  • I/O requests
  • Backup storage

Attribution approach:

-- Track storage per tenant.
-- Assumes a tenant_tables mapping (tenant_id, schemaname, tablename),
-- e.g. for schema-per-tenant or table-per-tenant layouts. For row-level
-- tenancy in shared tables, approximate with per-tenant row counts instead.
SELECT
    m.tenant_id,
    SUM(pg_total_relation_size(format('%I.%I', m.schemaname, m.tablename))) AS bytes
FROM tenant_tables m
GROUP BY m.tenant_id;

-- Track query activity per tenant (requires pg_stat_statements).
-- Assumes an application-maintained tenant_query_log mapping queryid
-- to tenant_id. On PostgreSQL 12 and earlier, total_exec_time is total_time.
SELECT
    l.tenant_id,
    SUM(s.total_exec_time) AS query_time_ms,
    SUM(s.calls) AS query_count,
    SUM(s.shared_blks_read + s.shared_blks_hit) AS blocks_accessed
FROM pg_stat_statements s
JOIN tenant_query_log l ON l.queryid = s.queryid
GROUP BY l.tenant_id;

For Aurora I/O costs:

  • Track read/write IOPS per tenant via application metrics
  • Use CloudWatch VolumeReadIOPs and VolumeWriteIOPs for total
  • Allocate proportionally based on application-tracked I/O
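Putting the components together, each slice of the Aurora bill gets allocated on its own basis. A minimal sketch - the component split and usage figures below are illustrative assumptions; real numbers come from the CUR, pg_stat_statements, and your application metrics:

```python
# Sketch: allocating Aurora's cost components on different bases -
# compute by query time, storage by bytes, I/O by tracked IOPS.

def allocate(component_cost: float,
             usage: dict[str, float]) -> dict[str, float]:
    """Split one cost component proportionally to a usage basis."""
    total = sum(usage.values())
    return {t: component_cost * v / total for t, v in usage.items()} if total else {}

aurora_costs = {'instance': 9_000.0, 'storage': 3_000.0, 'io': 3_000.0}
bases = {
    'instance': {'cust-a': 600, 'cust-b': 400},   # query time (s)
    'storage':  {'cust-a': 80,  'cust-b': 20},    # GB stored
    'io':       {'cust-a': 500, 'cust-b': 1500},  # tracked IOPS
}

per_tenant: dict[str, float] = {}
for component, cost in aurora_costs.items():
    for tenant, amount in allocate(cost, bases[component]).items():
        per_tenant[tenant] = per_tenant.get(tenant, 0.0) + amount

print(per_tenant)  # per-tenant totals sum back to the £15k Aurora bill
```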

DynamoDB Attribution

DynamoDB billing is simpler - it’s based on:

  • Read Capacity Units (RCU)
  • Write Capacity Units (WCU)
  • Storage (GB)

Enable DynamoDB Contributor Insights:

aws dynamodb update-contributor-insights \
    --table-name YourTable \
    --contributor-insights-action ENABLE

This shows top partition keys (often tenant IDs) and their access patterns.

Custom attribution via application:

# Track DynamoDB operations per tenant
dynamodb_reads = Counter(
    'dynamodb_read_units_total',
    'DynamoDB consumed read units',
    ['table', 'tenant_id']
)

dynamodb_writes = Counter(
    'dynamodb_write_units_total',
    'DynamoDB consumed write units',
    ['table', 'tenant_id']
)

# After each DynamoDB operation. ReturnConsumedCapacity must be set
# explicitly, or ConsumedCapacity is omitted from the response.
response = dynamodb.query(
    TableName='Orders',
    KeyConditionExpression='tenant_id = :tid',
    ExpressionAttributeValues={':tid': {'S': tenant_id}},
    ReturnConsumedCapacity='TOTAL'
)

consumed_rcu = response['ConsumedCapacity']['CapacityUnits']
dynamodb_reads.labels(table='Orders', tenant_id=tenant_id).inc(consumed_rcu)

Step 5: The Cost Attribution Pipeline

Now we combine everything into an attribution pipeline:

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  AWS Cost    │    │  OpenCost    │    │ Application  │
│  & Usage     │    │  (K8s costs) │    │   Metrics    │
│   Report     │    │              │    │  (Prometheus)│
└──────┬───────┘    └──────┬───────┘    └──────┬───────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                           ▼
                  ┌────────────────┐
                  │  Cost          │
                  │  Attribution   │
                  │  Engine        │
                  └────────┬───────┘
                           │
                           ▼
                  ┌────────────────┐
                  │  Tenant Cost   │
                  │  Dashboard     │
                  └────────────────┘

Example attribution logic:

def calculate_tenant_costs(period):
    # 1. Get total AWS costs from Cost & Usage Report
    aws_costs = get_cur_costs(period)  # {'eks': 50000, 'aurora': 15000, ...}
    
    # 2. Get tenant resource usage from Prometheus
    tenant_metrics = query_prometheus(f'''
        sum by (tenant_id) (
            rate(http_requests_total{{service=~".+"}}[{period}])
        )
    ''')
    
    total_requests = sum(tenant_metrics.values())
    
    # 3. Get tenant-specific metrics where available
    tenant_db_usage = get_database_usage_by_tenant(period)
    tenant_storage = get_storage_by_tenant(period)
    
    # 4. Calculate allocation ratios
    tenant_costs = {}
    for tenant_id, request_count in tenant_metrics.items():
        request_ratio = request_count / total_requests
        db_ratio = tenant_db_usage.get(tenant_id, 0) / sum(tenant_db_usage.values())
        storage_ratio = tenant_storage.get(tenant_id, 0) / sum(tenant_storage.values())
        
        tenant_costs[tenant_id] = {
            # Allocate EKS costs by request ratio
            'eks': aws_costs['eks'] * request_ratio,
            
            # Allocate Aurora by DB usage
            'aurora': aws_costs['aurora'] * db_ratio,
            
            # Allocate storage by storage ratio
            's3': aws_costs['s3'] * storage_ratio,
            
            # Direct costs (if any tenant-specific resources)
            'direct': get_direct_tenant_costs(tenant_id, period),
        }
        
        tenant_costs[tenant_id]['total'] = sum(tenant_costs[tenant_id].values())
    
    return tenant_costs

Tools Comparison

Several tools can help with this:

OpenCost

  • What: Open-source Kubernetes cost monitoring
  • Good for: Pod/namespace/label cost allocation
  • Limitation: Doesn’t handle non-K8s resources, needs app metrics for tenant split
  • Cost: Free

CloudZero

  • What: SaaS unit economics platform
  • Good for: End-to-end unit cost tracking, pre-built integrations
  • Limitation: SaaS pricing can be high, less customisable
  • Cost: $$$

Kubecost

  • What: Commercial K8s cost monitoring (OpenCost fork)
  • Good for: K8s-focused with better UI, alerting
  • Limitation: Still K8s-centric
  • Cost: Free tier, paid for advanced features

Attrb.io

  • What: Cost attribution sensors for K8s
  • Good for: Works with Karpenter, fine-grained attribution
  • Limitation: Newer tool, less mature
  • Cost: Check pricing

Custom Build

  • What: Build your own with CUR + Prometheus + custom logic
  • Good for: Full control, handles edge cases
  • Limitation: Engineering effort, maintenance burden
  • Cost: Engineering time

Our Recommendation

For most multi-tenant platforms:

  1. Start with OpenCost for K8s visibility
  2. Add application-level tenant metrics (non-negotiable)
  3. Build a custom attribution layer for shared resources
  4. Consider CloudZero if you need quick time-to-value and can afford it

Implementation Checklist

## Tagging
- [ ] Define tenant tagging strategy
- [ ] Tag all AWS resources
- [ ] Label all K8s resources

## Instrumentation
- [ ] Add tenant_id to all application metrics
- [ ] Instrument request counts per tenant
- [ ] Instrument database operations per tenant
- [ ] Instrument storage usage per tenant
- [ ] Instrument queue operations per tenant

## Collection
- [ ] Deploy OpenCost for K8s costs
- [ ] Configure Cost & Usage Report
- [ ] Set up Prometheus for application metrics
- [ ] Enable database monitoring (pg_stat_statements, DynamoDB Contributor Insights)

## Attribution
- [ ] Define cost allocation rules
- [ ] Build attribution pipeline
- [ ] Handle shared resource allocation
- [ ] Handle idle/unattributed costs

## Reporting
- [ ] Build tenant cost dashboard
- [ ] Set up cost anomaly alerting
- [ ] Create margin reports
- [ ] Enable drill-down by service/time/tenant

Common Pitfalls

1. Ignoring Idle Costs

Not all costs map to tenant activity. Idle EKS nodes, standby Aurora replicas, unused reserved capacity - these need a policy:

  • Spread evenly: Divide among all tenants
  • Spread by usage: Allocate proportionally to active tenants
  • Keep separate: Track as “platform overhead”
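The three policies are trivial to express in code - the hard part is picking one and applying it consistently. A sketch with illustrative tenants:

```python
# Sketch of the three idle-cost policies. idle_cost is whatever remains
# after activity-based allocation.

def spread_evenly(idle_cost: float, tenants) -> dict[str, float]:
    """Divide idle cost equally among all tenants."""
    return {t: idle_cost / len(tenants) for t in tenants}

def spread_by_usage(idle_cost: float, usage: dict[str, float]) -> dict[str, float]:
    """Allocate idle cost proportionally to each tenant's activity."""
    total = sum(usage.values())
    return {t: idle_cost * v / total for t, v in usage.items()}

def keep_separate(idle_cost: float) -> dict[str, float]:
    """Track idle cost as platform overhead, unattributed to tenants."""
    return {'platform_overhead': idle_cost}

usage = {'cust-a': 3_000, 'cust-b': 1_000}
print(spread_evenly(1000.0, usage))    # → {'cust-a': 500.0, 'cust-b': 500.0}
print(spread_by_usage(1000.0, usage))  # → {'cust-a': 750.0, 'cust-b': 250.0}
print(keep_separate(1000.0))           # → {'platform_overhead': 1000.0}
```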

2. Point-in-Time vs. Averaged

Tenant usage varies. A tenant might spike to 50% of capacity for an hour, then drop to 5%.

Don’t: Take a single measurement.
Do: Average over the billing period, or use peak-based allocation for reserved capacity.
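A time-weighted average captures this: weight each observed usage level by how long it held. A sketch with illustrative samples for the spiky tenant above:

```python
# Sketch: averaging spiky usage over the billing period instead of
# sampling once. Samples are (hours_at_level, share_of_capacity) pairs.

def time_weighted_share(samples: list[tuple[float, float]]) -> float:
    """Average capacity share, weighted by how long each level held."""
    total_hours = sum(h for h, _ in samples)
    return sum(h * share for h, share in samples) / total_hours

# 1 hour at 50% of capacity, then 23 hours at 5%
samples = [(1, 0.50), (23, 0.05)]
print(f"{time_weighted_share(samples):.3f}")  # ≈ 0.069, vs 0.50 point-in-time
```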

3. Forgetting Support and People Costs

Cloud costs aren’t the full picture:

  • Support tickets per tenant
  • Engineering time per tenant
  • Onboarding costs
  • Account management

For true unit economics, you need these too.

4. Over-Engineering Early

Start simple:

  1. Track total costs
  2. Track tenant request counts
  3. Allocate by request ratio

Add complexity (DB-level, storage-level, network-level) only when the simple model is insufficient.

Example Dashboard

A good unit economics dashboard shows:

┌─────────────────────────────────────────────────────────────────┐
│                    Unit Economics Dashboard                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  SUMMARY                           TREND (Last 6 Months)        │
│  ─────────────────────────         ────────────────────────     │
│  Total Cost:     £85,000           [Line chart showing          │
│  Customers:      150                cost per customer trend]    │
│  Avg Cost/Cust:  £567                                           │
│  Cost/1K Trans:  £12.40                                         │
│                                                                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  TOP 10 CUSTOMERS BY COST          COST BREAKDOWN BY SERVICE    │
│  ─────────────────────────         ────────────────────────     │
│  1. BigCorp Inc     £12,400        EKS:        58%             │
│  2. MegaTech Ltd    £8,200         Aurora:     18%             │
│  3. StartupXYZ      £6,100         DynamoDB:   12%             │
│  4. Enterprise Co   £5,800         MSK:         7%             │
│  5. Growth Inc      £4,200         Other:       5%             │
│  ...                                                            │
│                                                                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  MARGIN ANALYSIS                                                 │
│  ─────────────────                                              │
│  Customer     Revenue    Cost    Margin    Margin %             │
│  BigCorp      £25,000    £12,400  £12,600    50.4%             │
│  MegaTech     £10,000    £8,200   £1,800     18.0%  ⚠️         │
│  StartupXYZ   £15,000    £6,100   £8,900     59.3%             │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Key Takeaways

  1. AWS billing ≠ business visibility - You need tenant-level attribution
  2. Tag everything - But know that tagging alone isn’t enough for shared resources
  3. Instrument applications - Tenant-aware metrics are essential
  4. Start simple - Request-based allocation is a good first step
  5. Handle shared costs explicitly - Define allocation rules upfront
  6. Include non-cloud costs - Support, engineering, sales for true unit economics
  7. Iterate - Your first model will be wrong; refine based on learnings

Unit economics turns your cloud bill from a mystery into a business tool. You’ll finally know which customers are profitable, where to optimise, and how to price your product.


Building unit economics for your platform? Questions about the approach? Find me on LinkedIn or GitHub.
