Vitess for MySQL: Horizontal Sharding Done Right
MySQL doesn’t scale horizontally. Vitess makes it scale. Born at YouTube to handle billions of rows, it’s now a CNCF project powering Slack, GitHub, and many others.
TL;DR
- Vitess = MySQL horizontal sharding layer
- Automatic shard routing
- Online schema migrations
- Connection pooling and query rewriting
- Kubernetes operator included
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Application │
│ (MySQL protocol) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ VTGate │
│ (Query router, connection pooler) │
└─────────────────────────────────────────────────────────────────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ VTTablet │ │ VTTablet │ │ VTTablet │
│ (Shard -80) │ │ (Shard 80-) │ │ (Replica) │
│ ┌──────────┐ │ │ ┌──────────┐ │ │ ┌──────────┐ │
│ │ MySQL │ │ │ │ MySQL │ │ │ │ MySQL │ │
│ └──────────┘ │ │ └──────────┘ │ │ └──────────┘ │
└──────────────────┘ └──────────────────┘ └──────────────────┘
Install Vitess Operator
# Install operator
kubectl apply -f https://github.com/vitessio/vitess/releases/download/v18.0.0/operator.yaml
# Wait for operator
kubectl wait --for=condition=Available deployment/vitess-operator -n vitess
Deploy Cluster
apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
name: example
spec:
images:
vtgate: vitess/lite:v18.0.0
vttablet: vitess/lite:v18.0.0
vtbackup: vitess/lite:v18.0.0
mysqld:
mysql80Compatible: vitess/lite:v18.0.0
mysqldExporter: prom/mysqld-exporter:v0.14.0
cells:
- name: zone1
gateway:
replicas: 2
resources:
requests:
cpu: 500m
memory: 512Mi
keyspaces:
- name: commerce
turndownPolicy: Immediate
partitionings:
- equal:
parts: 2
shardTemplate:
databaseInitScriptSecret:
name: commerce-init
key: init.sql
replication:
enforceSemiSync: true
tabletPools:
- cell: zone1
type: replica
replicas: 2
vttablet:
resources:
requests:
cpu: 500m
memory: 1Gi
mysqld:
resources:
requests:
cpu: 500m
memory: 1Gi
dataVolumeClaimTemplate:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 100Gi
storageClassName: gp3
VSchema (Sharding Config)
{
"sharded": true,
"vindexes": {
"hash": {
"type": "hash"
},
"customer_keyspace_id": {
"type": "hash"
}
},
"tables": {
"customer": {
"column_vindexes": [
{
"column": "customer_id",
"name": "hash"
}
]
},
"orders": {
"column_vindexes": [
{
"column": "customer_id",
"name": "customer_keyspace_id"
}
]
},
"products": {
"type": "reference"
}
}
}
Apply VSchema
vtctldclient ApplyVSchema --vschema-file vschema.json commerce
Application Connection
import (
"database/sql"
_ "github.com/go-sql-driver/mysql"
)
func main() {
// Connect to VTGate (MySQL protocol)
db, err := sql.Open("mysql", "user:password@tcp(vtgate.vitess:3306)/commerce")
if err != nil {
log.Fatal(err)
}
// Queries are automatically routed to correct shard
rows, err := db.Query("SELECT * FROM customer WHERE customer_id = ?", 123)
// Cross-shard queries work automatically
rows, err = db.Query("SELECT c.name, o.total FROM customer c JOIN orders o ON c.customer_id = o.customer_id")
}
Online Schema Change
# Safe ALTER TABLE across shards
vtctldclient ApplySchema \
--sql "ALTER TABLE customer ADD COLUMN email VARCHAR(255)" \
commerce
Vitess uses gh-ost/pt-osc under the hood for non-blocking changes.
Resharding
Split shards when they get too big:
# Split shard -80 into -40 and 40-80
vtctldclient Reshard \
--source_shards "-80" \
--target_shards "-40,40-80" \
commerce.reshard1
# Monitor progress
vtctldclient Reshard Show commerce.reshard1
# Complete when ready
vtctldclient Reshard SwitchTraffic commerce.reshard1
vtctldclient Reshard Complete commerce.reshard1
Backup and Restore
apiVersion: planetscale.com/v2
kind: VitessBackupSchedule
metadata:
name: daily-backup
spec:
backup:
storage:
s3:
bucket: vitess-backups
region: eu-west-2
authSecret:
name: s3-credentials
schedule: "0 2 * * *"
keyspace: commerce
Monitoring
# ServiceMonitor for Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: vitess
spec:
selector:
matchLabels:
app: vitess
endpoints:
- port: web
path: /debug/vars
References
- Vitess Docs: https://vitess.io/docs
- Operator: https://github.com/planetscale/vitess-operator
- VSchema: https://vitess.io/docs/reference/features/vschema