Everyone’s hiring platform engineers now. Job postings are everywhere. But talk to most of the people in those roles and you’ll hear the same story: they’re building Kubernetes clusters, setting up Terraform modules, and wondering why developers still complain about shipping speed.
That’s because we’ve confused the tools with the discipline.
Platform engineering isn’t about Kubernetes. It isn’t about Backstage. It isn’t about whatever shiny internal developer portal someone’s pitching this week. Those are implementation details. The discipline is something else entirely.
What Platform Engineering Actually Is
Platform engineering is product management for infrastructure.
Read that again. It’s not “DevOps but with a platform team.” It’s not “SRE but we build things.” It’s treating your internal infrastructure as a product, with your developers as customers.
This means:
- You do user research (talking to developers about their pain points)
- You prioritise features (not everything gets built)
- You measure success (adoption, satisfaction, time-to-deploy)
- You iterate based on feedback (not based on what’s technically interesting)
Most platform teams skip all of this. They build what they think is cool, ship it, and then wonder why adoption is 20%.
The Golden Path Misconception
Everyone talks about “golden paths” now. The idea is simple: provide a paved road that makes the right thing the easy thing. Developers follow the path, they get security, observability, and compliance for free.
Sounds great in theory. In practice, most golden paths fail because they’re actually golden cages.
The difference is autonomy. A golden path says “here’s a great way to deploy a service, use it if you want.” A golden cage says “here’s the only way to deploy a service, deal with it.”
The moment your platform feels like a cage, developers will find workarounds. They’ll deploy to that one account you don’t control. They’ll spin up that VM that’s “just for testing.” They’ll do whatever it takes to ship, because shipping is their job.
The best platforms I’ve seen follow an 80/20 rule: 80% of use cases should be trivially easy with the golden path. The remaining 20% should still be possible, just with less hand-holding.
Why Most Platform Teams Fail
I’ve watched platform teams fail in three predictable ways.
Failure mode 1: Building for yourself
Platform engineers are usually senior. They’ve seen things. They have opinions about the “right” way to do infrastructure. So they build platforms that would’ve solved the problems they had five years ago.
But your developers aren’t you. They don’t care about your elegant Terraform abstraction. They want to ship a feature by Friday.
The fix: Talk to your users. Not once at the start of the project. Continuously. Weekly user interviews should be non-negotiable.
Failure mode 2: Over-engineering
A team of four platform engineers does not need to build a multi-cluster, multi-region, active-active Kubernetes platform on day one. They need to solve the problems they actually have.
I’ve seen platform teams spend 18 months building Netflix-scale infrastructure for a company with 30 developers. By the time they shipped, half the engineering org had quit from frustration.
The fix: Start embarrassingly simple. Single cluster. Single region. Manual processes where automation doesn’t pay off yet. Iterate based on real pain points.
Failure mode 3: No product ownership
Platform teams without product ownership build features. Platform teams with product ownership build outcomes.
Features: “We shipped a service mesh.” Outcomes: “Developers can now do canary deployments in 2 clicks instead of 2 days.”
If you can’t articulate the outcome, you probably shouldn’t build the feature.
What Good Looks Like
The best platform teams I’ve worked with share some characteristics.
They measure developer experience.
Not just uptime. Not just deployment frequency. Actual developer satisfaction. How long does it take a new engineer to ship their first change? How many support tickets does the platform team get per week? Would developers recommend the platform to a colleague?
These are soft metrics, but they’re the ones that matter.
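Those three questions can be tracked with very little tooling. Here is a minimal sketch, assuming you can export onboarding dates, support ticket dates, and 0-10 survey answers from wherever they live. The record shape and function names are hypothetical, and the recommend score borrows the common NPS-style convention (promoters minus detractors) rather than anything specific to platform teams.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass
class Onboarding:
    # Hypothetical record shape; adapt to whatever your HR or Git data exposes.
    engineer: str
    start_date: date
    first_change_shipped: date


def median_days_to_first_change(records: list[Onboarding]) -> float:
    """Median number of days between starting and shipping a first change."""
    return median((r.first_change_shipped - r.start_date).days for r in records)


def tickets_per_week(ticket_dates: list[date]) -> dict[tuple[int, int], int]:
    """Count platform support tickets per ISO (year, week)."""
    counts: dict[tuple[int, int], int] = {}
    for d in ticket_dates:
        iso = d.isocalendar()
        key = (iso[0], iso[1])
        counts[key] = counts.get(key, 0) + 1
    return counts


def recommend_score(ratings: list[int]) -> float:
    """NPS-style score from 0-10 answers to 'would you recommend the platform?'.

    Assumed convention: 9-10 are promoters, 0-6 are detractors.
    """
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)
```

The absolute numbers matter less than the trend: if the median time to first change drops and weekly tickets fall after a platform release, that release probably solved a real problem.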
They have strong opinions, weakly held.
Good platforms are opinionated. They make choices for you. But good platform teams know when to bend. If three different teams need the same escape hatch, that’s not an edge case - that’s a missing feature.
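One way to notice that pattern early is to log every escape hatch request and count distinct teams per request. This is only a sketch: the `(team, escape_hatch)` pairs and the function name are hypothetical, and the threshold of three comes straight from the rule of thumb above.

```python
from collections import defaultdict


def missing_features(requests, threshold=3):
    """Return escape hatches requested by at least `threshold` distinct teams.

    `requests` is an iterable of (team, escape_hatch) pairs, a hypothetical
    log format. Repeat requests from the same team count only once.
    """
    teams_by_hatch = defaultdict(set)
    for team, hatch in requests:
        teams_by_hatch[hatch].add(team)
    return sorted(
        hatch for hatch, teams in teams_by_hatch.items() if len(teams) >= threshold
    )
```

Anything this returns is a candidate for the roadmap, not a verdict. It still needs a conversation with the teams involved.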
They deprecate ruthlessly.
Every platform accumulates cruft. Old deployment methods. Legacy clusters. That one custom solution from 2019. The best teams deprecate aggressively, with clear timelines and migration support. The worst teams let options proliferate until nobody knows what to use.
They write documentation like it’s code.
Because for developers, docs are the interface. If your platform requires a 45-minute walkthrough to use, your platform has a bug. The fix isn’t better training - it’s a simpler platform.
The Technology Is the Easy Part
Let me tell you a secret: the technology choices barely matter.
Kubernetes vs ECS vs Lambda? Doesn’t matter. ArgoCD vs Flux vs whatever? Doesn’t matter. Backstage vs Port vs custom? Doesn’t matter.
What matters is whether developers can ship with confidence. Whether they trust the platform. Whether using it feels like an acceleration, not a tax.
I’ve seen teams build great platforms on “boring” tech stacks. I’ve seen teams build unusable platforms on cutting-edge infrastructure. The technology is not the differentiator.
The differentiator is whether you’re solving real problems, getting feedback, and iterating. That’s it. That’s the whole discipline.
Where Platform Engineering Goes From Here
Platform engineering is maturing as a discipline. Here’s what I think the next few years look like.
AI-assisted development will change everything. When AI can scaffold entire services, the platform becomes the guardrails. Your golden path becomes less about templates and more about policies, security boundaries, and compliance automation.
Developer experience will become a competitive advantage. Companies will compete for talent based on how fast developers can ship. Platform quality directly impacts recruiting and retention.
Platform teams will get smaller, not bigger. Better tooling means fewer people can do more, and the teams that survive will be the ones that take advantage of it.
The “platform” will become invisible. The end state isn’t developers loving your platform. It’s developers not thinking about infrastructure at all. They push code, it runs, it scales, it’s secure. Magic.
Getting Started
If you’re building a platform team from scratch, here’s what I’d do:
- Interview 10 developers this week. Ask them what’s painful. Write it down. Don’t argue.
- Identify the one thing that would make the biggest difference. Not three things. One thing.
- Build the simplest possible solution. Ship it. Get feedback.
- Iterate. Repeat forever.
That’s it. No complex framework. No multi-year roadmap. Just solve problems, get feedback, iterate.
Platform engineering isn’t about building the perfect infrastructure. It’s about continuously making developers’ lives better. The teams that understand this build great platforms. The teams that don’t build elaborate systems that nobody uses.
Which one are you building?