Agentic Engineering: A Definition
The value of engineering has moved from possessing implementation knowledge to mastering the architecture of intent.
For as long as I’ve known software development, the work was translation. Someone (a customer, a product manager, sometimes yourself) held an intent in their head: build a billing system, index this corpus, ship this feature. The job of the engineer was to take that intent and produce something a computer could execute. Lines of code. Schemas. Configuration. Deployment scripts. The translation was the craft. We hired for it, interviewed for it, and built whole industries around teaching it.
That work has stopped being scarce.
I want to be careful about what I’m claiming. Translation didn’t vanish. Models still get it wrong. Production systems still need people who understand memory layouts, distributed consistency, and why a 200ms tail latency might be load-bearing for a business. But the act of writing the code (the daily, hourly motion that defined what software engineering felt like for two generations) has been removed from the critical path. Around late 2024, the output of frontier models crossed a quiet threshold. The code they produced started being more often right than wrong, more often readable than not, and good enough that engineers started shipping it without rewriting it. That’s a different relationship to the machine than we had a year before.
What’s left, when translation isn’t the bottleneck, is a discipline without a name.
This essay is an attempt to give it one. Call it agentic engineering: the practice of producing reliable software by directing AI agents through specification, implementation, review, testing, deployment, and maintenance, while retaining human ownership of intent, architecture, and correctness.
Why the old names don’t fit
We’ve been working without vocabulary for almost two years, and it shows. Every team I talk to calls what they’re doing something different, and most of them are wrong in the same way: they pick a name that describes the tool and pretend it describes the practice.
“AI-assisted coding” was the first attempt. It’s the name a coding tool gives itself when it wants to sound non-threatening. The metaphor is the autocomplete bar. The engineer drives; the AI suggests. This was true in 2023. It stopped being true the moment you could hand an agent a repo and a task and walk away for an hour. A practice where the AI is doing most of the typing isn’t “assisted.”
“Vibe coding” captured something real. It went viral because it gave a name to the moment anyone could ship a working app by describing what they wanted. That story about lowering the floor matters, but it’s not the story of how serious software gets built. Karpathy drew the line cleanly: vibe coding raises the floor; agentic engineering raises the ceiling. They use the same infrastructure, but they’re not the same activity. Conflating them is how production systems end up with bugs no one can find because no one ever read the code.
“Prompt engineering” had a six-month run as a job title in 2023 and has since collapsed back into being a skill, not a discipline. Writing a good prompt is part of agentic engineering the way writing a good function name is part of software engineering. Table stakes. Not the work.
What’s missing from all of these names is the part that’s actually hard: getting a stochastic, fallible, occasionally brilliant collaborator to produce software that holds together when ten thousand users hit it on a Tuesday morning. That’s a discipline. It deserves a name. The name is agentic engineering.
What an agent is
A quick clarification before going further, because the word “agent” has been smeared across about fifteen different things in the trade press.
An agent, here, is an AI system that takes a goal expressed in natural language, plans its own intermediate steps, calls tools, reads the results, and revises its plan until the goal is met or a budget runs out. It is not a chatbot. It is not a single LLM call inside an IDE. It is a loop. The loop is the thing that makes the work feel different from anything we’ve called “AI” before.
The agent reads the codebase you point it at. It runs the tests it wrote. It opens a pull request. If the build fails, it reads the failure and tries again. That loop is doing the typing now.
The discipline of agentic engineering is the discipline of being the human at the other end of that loop.
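To make the loop concrete, here is a minimal sketch of it in Python. It is an illustration under stated assumptions, not how any particular agent framework works: `run_agent_loop`, `LoopResult`, and the three callables (`propose_step`, `run_tool`, `goal_met`) are hypothetical names, and the model and the tools are replaced with stubs so the control flow is the only thing on display.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    succeeded: bool
    steps_used: int
    history: list[str] = field(default_factory=list)


def run_agent_loop(
    goal: str,
    propose_step: Callable[[str, list[str]], str],  # hypothetical: stands in for the model
    run_tool: Callable[[str], str],                 # hypothetical: stands in for tool calls
    goal_met: Callable[[list[str]], bool],          # hypothetical: the stopping check
    max_steps: int = 10,                            # the budget
) -> LoopResult:
    """Plan, act, observe, and revise until the goal is met or the budget runs out."""
    history: list[str] = []
    for step in range(1, max_steps + 1):
        action = propose_step(goal, history)          # plan the next step from what was seen so far
        observation = run_tool(action)                # call a tool
        history.append(f"{action} -> {observation}")  # record the result for the next plan
        if goal_met(history):
            return LoopResult(True, step, history)
    return LoopResult(False, max_steps, history)


# Stubs: a "build" that fails twice and passes on the third attempt.
attempts = {"count": 0}


def propose_step(goal: str, history: list[str]) -> str:
    return "run tests" if not history else "fix and rerun tests"


def run_tool(action: str) -> str:
    attempts["count"] += 1
    return "pass" if attempts["count"] >= 3 else "fail"


def goal_met(history: list[str]) -> bool:
    return history[-1].endswith("pass")


result = run_agent_loop("make the build green", propose_step, run_tool, goal_met, max_steps=5)
print(result.succeeded, result.steps_used)  # prints: True 3
```

The human does not appear anywhere in this loop, which is the point: the spec goes in as `goal`, and everything the engineer can influence afterwards happens through what the tools report back and when the loop is stopped.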
The four practices
Four practices distinguish agentic engineering from what came before. I’ll claim each one is necessary and that the four together are sufficient, at least at the resolution at which I can see the field today, in 2026.
1. Specifying intent precisely enough that an agent can act on it
The first thing that changes when an agent can write code is that the spec stops being a comms artifact and starts being the actual program.
In the old world, a spec was a thing you wrote so that humans on your team would understand what you wanted. Engineers reading the spec did the hard interpretive work, filling in the gaps, asking clarifying questions, making judgment calls about what the writer probably meant. The code was the source of truth. The spec was scaffolding.
In agentic engineering, the spec gets executed. Not literally. But what the agent produces is downstream of how clearly you described what you wanted, and the agent doesn’t ask the clarifying questions a junior engineer would. It guesses. And it guesses confidently. A spec that says “the user should be able to recover their account” produces wildly different code depending on whether the agent assumed email recovery, SMS recovery, security questions, or all three.
This is the central skill of the new discipline: writing intent that survives a long agent run. Specs that survive have a few properties. They distinguish what’s load-bearing from what’s incidental. They name the failure modes the engineer cares about. They say what the system is not supposed to do. They tell the agent which tradeoffs the engineer has already made, so the agent doesn’t relitigate them halfway through.
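One way to picture those properties is as fields a spec either fills in or leaves for the agent to guess. The sketch below is an assumption-laden illustration, not a standard format: `IntentSpec` and its field names are hypothetical, and the example contents for the account-recovery case are invented for the demonstration.

```python
from dataclasses import dataclass, field


@dataclass
class IntentSpec:
    """A hypothetical container for the properties of a spec that survives a long run."""
    goal: str
    load_bearing: list[str] = field(default_factory=list)       # what must hold
    incidental: list[str] = field(default_factory=list)         # what the agent may decide
    failure_modes: list[str] = field(default_factory=list)      # what the engineer worries about
    non_goals: list[str] = field(default_factory=list)          # what the system must not do
    settled_tradeoffs: list[str] = field(default_factory=list)  # decisions not to relitigate

    def _required_sections(self) -> dict[str, list[str]]:
        return {
            "Load-bearing": self.load_bearing,
            "Failure modes": self.failure_modes,
            "Non-goals": self.non_goals,
            "Settled tradeoffs": self.settled_tradeoffs,
        }

    def missing_sections(self) -> list[str]:
        """Sections left empty, i.e. places where the agent would have to guess."""
        return [name for name, items in self._required_sections().items() if not items]

    def render(self) -> str:
        """Flatten the spec into plain text that can be handed to an agent."""
        parts = [f"Goal: {self.goal}"]
        sections = {**self._required_sections(), "Incidental": self.incidental}
        for title, items in sections.items():
            if items:
                parts.append(f"{title}:")
                parts.extend(f"- {item}" for item in items)
        return "\n".join(parts)


vague = IntentSpec(goal="The user should be able to recover their account")
print(vague.missing_sections())

sharper = IntentSpec(
    goal="The user should be able to recover their account",
    load_bearing=["Recovery is by email link only"],
    failure_modes=["A recovery link that works more than once"],
    non_goals=["No SMS recovery", "No security questions"],
    settled_tradeoffs=["Links expire quickly, even if some users must request a second one"],
)
print(sharper.render())
```

The vague spec reports all four sections as missing, which is a fair picture of how much room the agent had to guess. A checklist like this does not make a spec good; it only makes the silences visible before the run starts.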
You don’t learn this by writing prompts. You learn it by reading the code an agent produced from a vague spec, and watching where it went wrong.
2. Architecting systems the agent can verify itself against
The second practice is about what you build around the agent so it can tell whether it’s done.
A coder who can’t tell whether their code is working is a liability. The same is true of an agent. Most of the work of getting agents to ship production-grade software is in giving them feedback loops short enough and sharp enough that they can self-correct before they go off the rails.
What does that look like in practice? Type systems the agent has to satisfy before tests even run. Test suites that run in seconds, not minutes. Linters that fail loudly on the patterns the agent is prone to. Schemas that make malformed data impossible to commit. Observability that lets the agent read what its own code did in production. The shape of the codebase changes when its primary author is a machine that can’t yet tell good from bad without help.
This is the part of agentic engineering that looks most like traditional software architecture, and it’s the part most teams underinvest in. They focus on the model and the prompt. They neglect the question of how the agent knows when it’s wrong. Then they wonder why their agent-written code passes review but breaks in production.
A useful working rule: if your test suite takes ten minutes to run, your agent is flying blind. Get it under sixty seconds or accept that the agent will produce code that compiles, passes review, and silently breaks.
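The rule is easy to turn into a check. The sketch below, using only the Python standard library, runs a command and reports whether it finished inside a time budget. The names `check_feedback_budget` and `FeedbackCheck` are hypothetical, and the command shown in the usage comment is a placeholder for whatever runs your own suite.

```python
import subprocess
import sys
import time
from typing import NamedTuple


class FeedbackCheck(NamedTuple):
    within_budget: bool    # did the command finish before the budget ran out?
    passed: bool           # did it exit with status 0?
    elapsed_seconds: float


def check_feedback_budget(command: list[str], budget_seconds: float = 60.0) -> FeedbackCheck:
    """Run a command and report whether it completed within the time budget."""
    start = time.monotonic()
    try:
        completed = subprocess.run(command, timeout=budget_seconds, capture_output=True)
    except subprocess.TimeoutExpired:
        # The child process is killed once the budget is exceeded.
        return FeedbackCheck(False, False, time.monotonic() - start)
    return FeedbackCheck(True, completed.returncode == 0, time.monotonic() - start)


# Usage, with a trivial command standing in for a real test suite.
# In practice the command would be something like [sys.executable, "-m", "pytest"].
check = check_feedback_budget([sys.executable, "-c", "pass"], budget_seconds=60.0)
print(check.within_budget, check.passed)
```

Keeping `within_budget` separate from `passed` is deliberate: a suite that fails fast is still giving the agent usable feedback, while a suite that passes slowly is not.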
3. Steering the agent through the work without micromanaging the code
The third practice is the hardest to teach because it looks, from the outside, like doing nothing.
An engineer steering an agent well spends most of their time reading. Reading the agent’s plan before it executes. Reading its commit messages. Reading the diff. Catching the moment the agent went off the road two steps ago, before it builds the next ten steps on a wrong foundation. The steering happens in interruptions. Stop, you’re solving the wrong problem. Or: stop, the idea is right, but use the existing service instead of writing a new one.
This is the skill that separates engineers who get production-grade output from agents and engineers who don’t. It’s a taste-level skill, not a syntax-level one. The engineer doesn’t need to remember the API. The agent has the API. The engineer needs to recognize, in two lines of code, whether the agent has understood the problem or is improvising plausibly around it.
There’s a corollary worth pulling out, because it changes how teams should hire: the engineers best at this are not the ones who can type the fastest. They’re the ones who’ve spent ten years reading other people’s code and developing an instinct for when something is subtly wrong. Code review used to be a tax. It’s now the job.
4. Owning everything that ships, including what the agent got subtly wrong
The last practice is the one most teams haven’t internalized, and it’s the one that turns agentic engineering into a discipline rather than a workflow.
If an agent writes the code, who’s responsible when it breaks?
The engineer is. Always. Every time. There is no version of agentic engineering where the answer is “the model did it.” The engineer set the spec. The engineer accepted the diff. The engineer’s name is on the commit. When the billing system charges customers twice on the first of the month, no court and no customer cares whether the bug was in the engineer’s spec, the agent’s interpretation, or the model’s training data.
This is not a complaint about the technology. It’s a structural property of how accountability works. The model is not a legal person. It cannot be sued, fired, or asked to explain itself under oath. Responsibility flows to the human who decided what to ship. It always has and it always will.
The implication is that agentic engineering, done well, is a more demanding discipline than what came before, not a less demanding one. The engineer is now responsible for a much larger volume of code produced at a much faster rate, with the same standard of correctness applied at the end. The model speeds up the production. It does not change the quality bar. If anything, it raises it, because the team next door is shipping at the same speed.
The engineer’s job is no longer “wrote the code.” It’s “owns the system.” The latter is harder.
What it isn’t
A definition gets sharper when you say what it excludes.
Agentic engineering is not AI pair programming. Pair programming assumes a peer relationship; the agent isn’t a peer, it’s a tool with high variance. Treating it like a peer (“let me hear your opinion on the design”) misallocates the relationship. The agent doesn’t have opinions. It has distributions over completions. The engineer’s job is to know which distribution to draw from and when to override it.
It is not vibe coding. Vibe coding is describing what you want and shipping what comes back without reading it. That’s fine for a weekend toy. It’s malpractice for a system that handles money or health data or anything else where being wrong has consequences. The discipline of agentic engineering is exactly the discipline of reading what comes back, and the people who skip that step aren’t doing agentic engineering. They’re gambling.
It is not prompt engineering. Prompt engineering is a tactic inside the practice. Calling agentic engineering “prompt engineering” is like calling software engineering “typing.”
And it is not, despite a lot of recent press to the contrary, something only large research teams or model labs can do. The agents are commodities now. The discipline of using them well is open to anyone willing to learn it. That openness is the most interesting thing about this moment.
What used to matter, and what matters now
For most of software’s history, the engineer’s edge came from being able to produce code other people couldn’t. Knowing the language deeply. Knowing the library. Knowing the trick. The engineer’s career was built on a stockpile of implementation knowledge that other people didn’t have.
That stockpile is worth less now. The model has the library in its head. It has the trick. It can produce in three seconds a working implementation of something it would have taken a senior engineer three hours to write a year ago.
What’s worth more is the part the engineer carries that the model can’t fake.
Knowing what to build. The model can write the code for a feature; it can’t tell you the feature is the wrong feature for the product to ship this quarter. Product judgment was always part of engineering. It’s now the larger part.
Knowing what good looks like in this codebase. The model has seen ten million codebases. It hasn’t seen yours. It doesn’t know which patterns this team uses, which it’s tried and abandoned, which conventions are load-bearing because of an outage six months ago. The engineer who carries the institutional memory of a system steers an agent in that system better than any engineer who doesn’t.
Knowing when to stop. The model will keep building. It will write tests, refactor structures, add features, until the budget runs out. The engineer is the one who says: this is enough; ship it; the next thing is more important. Saying “enough” was always part of engineering. It’s now decisive.
The shorthand: we used to optimize the code. We now optimize the intent, and let the code follow.
Why this needs to be a discipline, not a workflow
I’ll close on the part of the argument I most want to fix in place because it determines what we build next.
A workflow is something you adopt. A discipline is something you train for. A workflow has tips and tricks. A discipline has a body of knowledge, a set of failure modes the field has learned to anticipate, and standards that distinguish a practitioner from someone who’s winging it.
Presently, agentic engineering looks like a workflow. Every team is figuring it out independently. The patterns that work are being rediscovered in different shapes inside every company. Failure modes recur (agents fabricate API calls; agents delete files they shouldn’t have touched) and the lessons get written up in private Slacks and then forgotten because nobody owns the body of knowledge.
That doesn’t last. It can’t. The economic gravity is too large. Every team in production software in 2026 is doing some version of this, and the gap between teams that do it well and teams that don’t is going to widen quickly. When the gap is wide enough, the discipline coalesces. People will start calling themselves agentic engineers the way they started calling themselves machine learning engineers in the 2010s, or site reliability engineers a decade before that. There will be conferences and benchmarks, and a canon of mistakes practitioners are expected to have read about so they don’t repeat them in production.
We can let that happen passively over the next five years, or we can name the discipline now and start writing down what we’ve learned. That’s the bet I’m taking with The Intent Layer and with the Center for Agentic Engineering. The bet is that the field is ready for the vocabulary, and the people writing the vocabulary will shape what the field becomes.
So here’s the definition again, and it’s the definition I’d like to see cited:
Agentic engineering is the practice of producing reliable software by directing AI agents through specification, implementation, review, testing, deployment, and maintenance, while retaining human ownership of intent, architecture, and correctness.
The four practices: specify intent the agent can act on; architect systems the agent can verify itself against; steer the agent through the work without micromanaging the code; own everything that ships.
If you’re already doing this, you have a name for it now. If you’re not, you have a target to train for.
The next decade of software gets built by the people who take it seriously.








