Cloud Tagging Strategies That Actually Work
Every cloud governance conversation starts the same way: “We need better tagging.”
Tags are simple key-value pairs. Yet most organisations struggle with them. Inconsistent naming. Missing tags. Tags that exist but mean different things to different teams. The result: you can’t answer basic questions like “how much does Team X spend?” or “who owns this resource?”
I’ve seen tagging done badly at scale. I’ve also worked at a place where we got it right - using a context module pattern that made consistent tagging almost automatic. This post covers that pattern and other approaches that actually work.
TL;DR
- Tags are the foundation of cost allocation, security, and automation
- Enforce tags through policies, not documentation
- Use a context module pattern to inject consistent tags automatically
- Combine multiple enforcement layers: SCPs, Terraform validation, CI checks
- Start with a minimal required set and expand gradually
Why Tagging Matters
Tags enable:
Cost Allocation
# Without tags: "AWS costs $500k/month"
# With tags: "Team Platform costs $120k, Team Data costs $180k..."
Security & Compliance
# Find all production resources
aws resourcegroupstaggingapi get-resources \
--tag-filters Key=Environment,Values=production
Automation
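# Auto-shutdown dev resources at night (stop-instances takes instance IDs, not filters)
aws ec2 stop-instances --instance-ids $(aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=dev" "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].InstanceId" --output text)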
Ownership & Accountability
# Who owns this? Check the tags.
Owner: platform-team
ContactEmail: platform@company.com
Without consistent tagging, you’re flying blind.
The Context Module Pattern
At a previous company, we solved tagging consistency with a context module. Every Terraform stack used it, and it provided standardised context that flowed through to all resources.
How It Works
The context module reads from the CI/CD environment (we used Spacelift, but this works with any CI system) and outputs consistent values:
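# modules/context/main.tf
variable "context" {
  description = "Override context values for local development"
  type = object({
    stack_name     = optional(string)
    component_name = optional(string)
    environment    = optional(string)
    owner          = optional(string)
    contact_email  = optional(string)
    repository     = optional(string)
  })
  default = {}
}

# Terraform has no env() function, and TF_VAR_* values only reach root-module
# variables. This sketch reads them through the hashicorp/external data source
# instead, and assumes jq is installed on the machine running Terraform.
data "external" "ci" {
  program = ["jq", "-n", "env | with_entries(select(.key | test(\"^TF_VAR_(environment|component|stack_name|owner|contact_email|repository)$\")))"]
}

locals {
  ci = data.external.ci.result

  # Use the override if set, then the CI environment, then a fallback.
  # coalesce() skips both null and empty strings.
  environment    = coalesce(var.context.environment, lookup(local.ci, "TF_VAR_environment", ""), "unknown")
  component_name = coalesce(var.context.component_name, lookup(local.ci, "TF_VAR_component", ""), "unknown")
  stack_name     = coalesce(var.context.stack_name, lookup(local.ci, "TF_VAR_stack_name", ""), "unknown")
  owner          = coalesce(var.context.owner, lookup(local.ci, "TF_VAR_owner", ""), "unknown")
  repository     = coalesce(var.context.repository, lookup(local.ci, "TF_VAR_repository", ""), "unknown")

  # coalesce() errors when every argument is null or empty, so try() turns
  # "no contact email anywhere" into null.
  contact_email = try(coalesce(var.context.contact_email, lookup(local.ci, "TF_VAR_contact_email", "")), null)
}

output "environment" {
  value = local.environment
}

output "component_name" {
  value = local.component_name
}

output "stack_name" {
  value = local.stack_name
}

output "owner" {
  value = local.owner
}

output "contact_email" {
  value = local.contact_email
}

output "repository" {
  value = local.repository
}

# Computed tags ready to apply to any resource
output "tags" {
  value = merge(
    {
      "Environment"          = local.environment
      "Owner"                = local.owner
      "Component"            = local.component_name
      "Terraform:Stack"      = local.stack_name
      "Terraform:Repository" = local.repository
    },
    # ContactEmail is only added when a value exists
    local.contact_email != null ? { "ContactEmail" = local.contact_email } : {}
  )
}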
Using the Context Module
Every stack starts with:
module "context" {
source = "app.terraform.io/myorg/context/aws"
version = "~> 2.0"
}
Then every resource uses the tags:
resource "aws_s3_bucket" "data" {
bucket = "${module.context.component_name}-data-${module.context.environment}"
tags = module.context.tags
}
resource "aws_lambda_function" "processor" {
function_name = "${module.context.component_name}-processor"
# ... config ...
tags = module.context.tags
}
Local Development Override
When developing locally (not in CI), you provide context manually:
module "context" {
source = "app.terraform.io/myorg/context/aws"
version = "~> 2.0"
context = {
stack_name = "data-pipeline-dev"
component_name = "data-pipeline"
environment = "sandbox"
owner = "data-team"
contact_email = "data-team@company.com"
repository = "https://github.com/myorg/data-pipeline"
}
}
Passing Context to Child Modules
All your internal modules accept a context input:
# modules/ecs-service/variables.tf
variable "context" {
description = "Context from the context module"
type = object({
environment = string
component_name = string
stack_name = string
owner = string
contact_email = optional(string)
repository = string
tags = map(string)
})
}
# modules/ecs-service/main.tf
resource "aws_ecs_service" "this" {
name = var.service_name
cluster = var.cluster_arn
task_definition = aws_ecs_task_definition.this.arn
tags = var.context.tags
}
Usage:
module "api_service" {
source = "app.terraform.io/myorg/ecs-service/aws"
version = "~> 3.0"
context = module.context
service_name = "api"
# ... other config ...
}
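Child modules sometimes need a resource-specific tag on top of the inherited set. A minimal sketch of how that could look inside a module, assuming a hypothetical Service tag key and file name that are not part of the original setup. The context type is trimmed to the single attribute used here:

```hcl
# modules/ecs-service/tags.tf (hypothetical file)
variable "context" {
  description = "Context from the context module (trimmed to the attribute used here)"
  type = object({
    tags = map(string)
  })
}

variable "service_name" {
  type = string
}

locals {
  # merge() lets later arguments win on key conflicts, so the inherited
  # context tags are listed last and cannot be overridden by the module.
  tags = merge(
    { "Service" = var.service_name }, # hypothetical extra tag
    var.context.tags
  )
}

output "tags" {
  value = local.tags
}
```

Putting the context tags last is a design choice: it keeps Environment and Owner authoritative even if a module author reuses one of those keys by accident.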
Why This Pattern Works
- Single source of truth - Context defined once, used everywhere
- CI/CD integration - Automatically populated from pipeline
- Local dev friendly - Easy overrides for development
- Composable - Child modules inherit context automatically
- Extensible - Add new fields without changing every stack
Alternative Approaches
1. Default Tags Provider (AWS)
The AWS provider supports default tags, which it applies to every taggable resource it manages:
provider "aws" {
region = "eu-west-1"
default_tags {
tags = {
Environment = var.environment
Owner = var.owner
ManagedBy = "terraform"
Repository = var.repository
}
}
}
Pros:
- Simple, built-in
- Applies to all resources automatically
Cons:
- Provider-level only (can’t vary per module)
- Doesn’t work with aws_autoscaling_group propagated tags
- Can conflict with resource-level tags
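One way around the Auto Scaling group gap is to read the provider's default tags back through the aws_default_tags data source and emit them as explicit tag blocks. This is a sketch under assumptions: the resource name, the size limits, and the two input variables are hypothetical.

```hcl
variable "subnet_ids" {
  type = list(string) # hypothetical input
}

variable "launch_template_id" {
  type = string # hypothetical input
}

# Exposes the default_tags configured on the current AWS provider
data "aws_default_tags" "current" {}

resource "aws_autoscaling_group" "web" {
  name_prefix         = "web-"
  min_size            = 1
  max_size            = 3
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }

  # One tag block per default tag, propagated to instances at launch
  dynamic "tag" {
    for_each = data.aws_default_tags.current.tags
    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }
}
```

The same dynamic block works with module.context.tags as the for_each source if you use the context module instead of default_tags.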
2. Terraform Variables + Locals
The simplest approach - define tags in variables:
# variables.tf
variable "environment" {
type = string
}
variable "owner" {
type = string
}
variable "common_tags" {
type = map(string)
default = {}
}
# locals.tf
locals {
tags = merge(
{
Environment = var.environment
Owner = var.owner
ManagedBy = "terraform"
},
var.common_tags
)
}
# main.tf
resource "aws_instance" "web" {
ami = var.ami
instance_type = "t3.micro"
tags = local.tags
}
Pros:
- Simple, no external dependencies
- Easy to understand
Cons:
- Repeated in every stack
- No enforcement
- Easy to forget
3. Terragrunt Inputs
If you use Terragrunt, inject tags from the hierarchy:
# terragrunt.hcl (root)
locals {
  common_tags = {
    ManagedBy  = "terraform"
    Repository = "https://github.com/myorg/infra"
  }
}
# environments/prod/terragrunt.hcl
locals {
  environment_tags = {
    Environment = "production"
  }
}
include "root" {
  path   = find_in_parent_folders()
  expose = true # root locals are not inherited; expose makes them readable via include.root
}
inputs = {
  tags = merge(
    include.root.locals.common_tags,
    local.environment_tags,
    {
      Owner = "platform-team"
    }
  )
}
Pros:
- DRY across environments
- Hierarchical inheritance
- Works well with Terragrunt’s folder structure
Cons:
- Requires Terragrunt
- Another layer of abstraction
4. Tag Policies (AWS Organizations)
Enforce tag requirements at the AWS level:
{
"tags": {
"Environment": {
"tag_key": {
"@@assign": "Environment"
},
"tag_value": {
"@@assign": ["production", "staging", "development", "sandbox"]
},
"enforced_for": {
"@@assign": ["ec2:instance", "rds:db", "s3:bucket"]
}
},
"Owner": {
"tag_key": {
"@@assign": "Owner"
},
"enforced_for": {
"@@assign": ["ec2:instance", "rds:db"]
}
}
}
}
Pros:
- Enforced at AWS level
- Works regardless of how resources are created
- Compliance reporting built-in
Cons:
- Limited to tag key/value validation
- Can’t enforce tag presence on all resource types
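- Doesn’t block untagged resources: enforcement only rejects non-compliant values on the listed resource types, and everything else is just reported as non-compliant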
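If the organization itself is managed with Terraform, the tag policy can live next to the rest of the code. A minimal sketch, assuming the TAG_POLICY policy type is already enabled for the organization. The policy name and the target_ou_id variable are hypothetical:

```hcl
variable "target_ou_id" {
  description = "Organizational unit or account ID to attach the policy to (hypothetical input)"
  type        = string
}

resource "aws_organizations_policy" "tagging" {
  name = "allowed-environment-values" # hypothetical name
  type = "TAG_POLICY"

  # Same structure as the JSON policy, reduced to a single tag
  content = jsonencode({
    tags = {
      Environment = {
        tag_key      = { "@@assign" = "Environment" }
        tag_value    = { "@@assign" = ["production", "staging", "development", "sandbox"] }
        enforced_for = { "@@assign" = ["ec2:instance"] }
      }
    }
  })
}

resource "aws_organizations_policy_attachment" "tagging" {
  policy_id = aws_organizations_policy.tagging.id
  target_id = var.target_ou_id
}
```

Starting with one tag and one resource type in enforced_for keeps the blast radius small while you check the compliance report.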
5. Service Control Policies (SCPs)
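Block resource creation without required tags. Each tag gets its own statement, because keys inside a single condition block are ANDed and a combined statement would only deny requests that are missing every tag:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RequireEnvironmentTagOnEC2",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "ec2:CreateVolume"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*"
      ],
      "Condition": {
        "Null": {
          "aws:RequestTag/Environment": "true"
        }
      }
    },
    {
      "Sid": "RequireOwnerTagOnEC2",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "ec2:CreateVolume"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*"
      ],
      "Condition": {
        "Null": {
          "aws:RequestTag/Owner": "true"
        }
      }
    }
  ]
}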
Pros:
- Hard enforcement - resources can’t be created without tags
- Works for all creation methods (console, CLI, SDK, Terraform)
Cons:
- Only works at creation time
- Doesn’t cover all resource types
- Can break automation if tags are missing
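The creation-time limitation can be narrowed with a second SCP that stops required tags from being removed later. The sketch below is one possible shape, not a complete guardrail: it assumes ec2:DeleteTags honours the aws:TagKeys condition key, it does not stop a value from being overwritten through ec2:CreateTags, and the policy name is hypothetical. Attaching it works the same way as for any other organization policy.

```hcl
data "aws_iam_policy_document" "protect_tags" {
  statement {
    sid     = "DenyRemovingRequiredTags"
    effect  = "Deny"
    actions = ["ec2:DeleteTags"]
    resources = [
      "arn:aws:ec2:*:*:instance/*",
      "arn:aws:ec2:*:*:volume/*",
    ]

    # Matches when any tag key in the request is one of the required keys
    condition {
      test     = "ForAnyValue:StringEquals"
      variable = "aws:TagKeys"
      values   = ["Environment", "Owner"]
    }
  }
}

resource "aws_organizations_policy" "protect_tags" {
  name    = "protect-required-tags" # hypothetical name
  type    = "SERVICE_CONTROL_POLICY"
  content = data.aws_iam_policy_document.protect_tags.json
}
```

Test this in a sandbox OU first, because Terraform itself calls DeleteTags when a tag key is removed from a resource.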
Enforcement Layers
The best tagging strategies use multiple enforcement layers:
┌─────────────────────────────────────────────────────┐
│ Layer 4: Alerts │
│ (AWS Config rules, CloudWatch alarms) │
├─────────────────────────────────────────────────────┤
│ Layer 3: SCPs │
│ (Block untagged resource creation) │
├─────────────────────────────────────────────────────┤
│ Layer 2: CI/CD │
│ (terraform validate, tflint, checkov) │
├─────────────────────────────────────────────────────┤
│ Layer 1: Code │
│ (Context module, default_tags) │
└─────────────────────────────────────────────────────┘
Layer 1: Code (Context Module)
Make tagging the default path:
module "context" {
source = "./modules/context"
}
# All resources get tags automatically
resource "aws_s3_bucket" "this" {
tags = module.context.tags
}
Layer 2: CI/CD Validation
Catch missing tags before apply:
# .github/workflows/terraform.yml
- name: Check for required tags
  run: |
    # Custom script to verify all resources have tags
    ./scripts/check-tags.sh
- name: Run tflint
  run: |
    tflint --config .tflint.hcl
# .tflint.hcl
# aws_resource_missing_tags ships with the tflint-ruleset-aws plugin, which must be enabled
rule "aws_resource_missing_tags" {
  enabled = true
  tags    = ["Environment", "Owner"]
}
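The same requirement can also be checked inside Terraform, so a plan in CI fails before a linter or an SCP gets involved. A minimal sketch for an internal module, assuming the three required keys from this post. The context type is trimmed to the tags attribute for brevity:

```hcl
variable "context" {
  description = "Context from the context module (trimmed to the attribute used here)"
  type = object({
    tags = map(string)
  })

  validation {
    # setsubtract() returns the required keys that are absent from the tag map
    condition = length(setsubtract(
      ["Environment", "Owner", "Component"],
      keys(var.context.tags)
    )) == 0
    error_message = "context.tags must include the Environment, Owner and Component keys."
  }
}
```

This only checks that the keys exist. Constraining the values is a separate validation on the variables that feed the context.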
Layer 3: SCPs
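Last line of defence. Scope the statement to the resources being created, because RunInstances also touches images, subnets, and other resources that never carry request tags:
{
  "Effect": "Deny",
  "Action": ["ec2:RunInstances"],
  "Resource": "arn:aws:ec2:*:*:instance/*",
  "Condition": {
    "Null": {
      "aws:RequestTag/Environment": "true"
    }
  }
}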
Layer 4: Alerts
Catch resources that slip through:
resource "aws_config_config_rule" "required_tags" {
name = "required-tags"
source {
owner = "AWS"
source_identifier = "REQUIRED_TAGS"
}
input_parameters = jsonencode({
tag1Key = "Environment"
tag2Key = "Owner"
})
}
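A Config rule only records non-compliance, so something still has to notify a human. The sketch below routes compliance changes for the required-tags rule to an SNS topic through EventBridge. The topic and rule names are hypothetical, and subscriptions to the topic are left out:

```hcl
resource "aws_sns_topic" "tag_alerts" {
  name = "tag-compliance-alerts" # hypothetical name
}

# Fires when the required-tags rule marks a resource as NON_COMPLIANT
resource "aws_cloudwatch_event_rule" "untagged" {
  name = "required-tags-noncompliant"

  event_pattern = jsonencode({
    source        = ["aws.config"]
    "detail-type" = ["Config Rules Compliance Change"]
    detail = {
      configRuleName      = ["required-tags"]
      newEvaluationResult = { complianceType = ["NON_COMPLIANT"] }
    }
  })
}

resource "aws_cloudwatch_event_target" "sns" {
  rule = aws_cloudwatch_event_rule.untagged.name
  arn  = aws_sns_topic.tag_alerts.arn
}

# EventBridge needs permission to publish to the topic
data "aws_iam_policy_document" "allow_events" {
  statement {
    actions   = ["sns:Publish"]
    resources = [aws_sns_topic.tag_alerts.arn]

    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
  }
}

resource "aws_sns_topic_policy" "tag_alerts" {
  arn    = aws_sns_topic.tag_alerts.arn
  policy = data.aws_iam_policy_document.allow_events.json
}
```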
Recommended Tag Schema
Start minimal and expand:
Required Tags
| Tag | Purpose | Example |
|---|---|---|
| Environment | Deployment environment | production, staging, dev |
| Owner | Team responsible | platform-team, data-team |
| Component | Logical component name | api, worker, database |
Recommended Tags
| Tag | Purpose | Example |
|---|---|---|
| CostCenter | Finance allocation | CC-1234 |
| ContactEmail | Escalation contact | team@company.com |
| Repository | Source code location | github.com/org/repo |
| ManagedBy | How it’s managed | terraform, cloudformation |
Optional Tags
| Tag | Purpose | Example |
|---|---|---|
| DataClassification | Security classification | public, internal, confidential |
| Backup | Backup policy | daily, weekly, none |
| AutoShutdown | Cost saving automation | true, false |
Common Mistakes
1. Too Many Required Tags
# Bad - too many required tags, people will game it
tags = {
Environment = "prod"
Owner = "team"
CostCenter = "unknown" # People just put garbage
Project = "unknown"
Application = "unknown"
DataClass = "unknown"
Compliance = "unknown"
}
Start with 3-4 required tags. Add more once the basics are consistent.
2. Inconsistent Naming
# Bad - different conventions
tags = { "environment" = "prod" } # lowercase
tags = { "Environment" = "prod" } # PascalCase
tags = { "ENVIRONMENT" = "prod" } # UPPERCASE
tags = { "env" = "prod" } # abbreviated
Pick one convention and enforce it.
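Enforcement can start in the variable that accepts extra tags. A minimal sketch, assuming PascalCase keys with an optional namespace prefix such as Terraform:Stack. The variable name extra_tags is hypothetical:

```hcl
variable "extra_tags" {
  description = "Additional tags; keys must follow the PascalCase convention"
  type        = map(string)
  default     = {}

  validation {
    # Accepts "CostCenter" or "Terraform:Stack"; rejects "env", "ENVIRONMENT", "cost_center"
    condition = alltrue([
      for key in keys(var.extra_tags) :
      can(regex("^([A-Z][A-Za-z0-9]*:)?[A-Z][a-z0-9]+([A-Z][a-z0-9]+)*$", key))
    ])
    error_message = "Tag keys must be PascalCase, for example Environment or CostCenter."
  }
}
```

The pattern is deliberately strict about all-uppercase keys. Loosen it if your schema includes acronyms.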
3. No Validation of Values
# Bad - environment can be anything
variable "environment" {
type = string
}
# Good - constrained values
variable "environment" {
type = string
validation {
condition = contains(["production", "staging", "development", "sandbox"], var.environment)
error_message = "Environment must be: production, staging, development, or sandbox."
}
}
4. Manual Tagging
If humans have to remember to add tags, they won’t. Make it automatic:
# Bad - manual
resource "aws_instance" "web" {
tags = {
Environment = "prod" # Hope they remember
}
}
# Good - automatic
resource "aws_instance" "web" {
tags = module.context.tags # Always there
}
Quick Wins
Week 1: Audit Current State
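# Find resources with an empty tag set. This API only returns resources that are
# tagged or were tagged in the past, so never-tagged resources need another source such as AWS Config.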
aws resourcegroupstaggingapi get-resources \
--tags-per-page 100 \
| jq '.ResourceTagMappingList[] | select(.Tags | length == 0)'
# Count resources by tag coverage
aws resourcegroupstaggingapi get-resources \
--tags-per-page 100 \
| jq '[.ResourceTagMappingList[] | .Tags | length] | group_by(.) | map({count: length, tags: .[0]})'
Week 2: Implement Context Module
Create and deploy the context module pattern.
Week 3: Add CI Validation
Block PRs that create untagged resources.
Week 4: Enable SCPs
Hard enforcement for critical tags.
Conclusion
Tagging isn’t glamorous, but it’s foundational. Without it, you can’t:
- Allocate costs accurately
- Automate based on resource attributes
- Identify ownership during incidents
- Enforce security policies
The context module pattern works because it makes tagging automatic. Engineers don’t have to think about it - they use the module and tags flow through.
Start simple: 3-4 required tags, enforced at multiple layers. Expand once you have consistency.