You Were the Scheduler
The next productivity leap comes from designing loops, not juggling workers.
A year ago, the most productive engineer you knew had ten terminal windows open and believed that was the frontier. Each window ran an agent. Each agent stalled the moment it finished a task and waited for a human to hand it the next one. The poor soul running all ten felt like an orchestrator, moving between windows, keeping every agent fed. He wasn’t orchestrating. He was polling. He was the scheduler, the router, and the memory of a small system, and he was holding all three roles together with nothing but his own attention.
We built agents to take the typing away, and the typing did leave. What arrived to replace it was a job nobody applied for: dispatcher. Ten agents is a team of direct reports, and managing ten direct reports one keystroke at a time is a worse job than doing the work yourself. The productivity gain was real and the workflow was still broken. The human had been promoted into the one position in the system that doesn’t scale.
To see why that position exists at all, it helps to separate the two loops that agentic work actually runs on. There’s an inner loop: read the issue, write the code, run the tests, read the failure, and try again. And there’s an outer loop: decide what’s worth building, judge whether the result is correct, and choose what happens next. The agent runs the inner loop. You run the outer one. For most of the last two years those loops were welded together because the models couldn’t be trusted to run the inner loop alone. You watched the work stream past, caught the moment it went the wrong direction, hit escape, and steered it back.
That babysitting was rational when the models needed it. It isn’t rational now. The current generation reads intent well enough that watching it type is like standing behind a competent colleague and reading each line over their shoulder as they write it. You wouldn’t do that to a person you trusted, and the reason you still do it to the agent is habit, not necessity. The reflex to watch was a workaround for a weakness the models mostly grew out of. Keeping the reflex after the weakness is gone just burns the one resource you can’t refill.
Managing one manager
Once you stop watching the worker, the shape of the job changes. You stop managing ten workers and start managing one manager. Picture a single long-lived agent that never closes. An issue lands on an open-source repo. The manager wakes, reads the issue against the project’s stated goals and notes, and decides whether it’s even a fit. If it is, it spins up a worker. The worker investigates, writes the change, and runs the tests. A second agent reviews the diff. You don’t see any of that. What reaches you is a pull request with the original issue, the proposed change, and maybe a running build you can open over VNC and click around in yourself. You leave one note. You approve, or you don’t. The loop keeps going and the change lands after the checks pass.
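One way to picture a single pass of that manager is as a small pipeline: triage, delegate, review, then surface one pull request. The sketch below is only an illustration under stated assumptions. `Issue`, `PullRequest`, `handle_issue`, and the three callables are hypothetical names, and the agents are stand-in functions rather than any real agent API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Issue:
    title: str
    body: str


@dataclass
class PullRequest:
    issue: Issue
    diff: str
    review_notes: str


def handle_issue(
    issue: Issue,
    fits_goals: Callable[[Issue], bool],   # hypothetical: manager's triage step
    worker: Callable[[Issue], str],        # hypothetical: worker agent, returns a diff
    reviewer: Callable[[str], str],        # hypothetical: second agent, returns notes
) -> Optional[PullRequest]:
    """Run one pass of the manager for a single incoming issue."""
    # Triage: decide whether the issue is a fit before spending any work on it.
    if not fits_goals(issue):
        return None
    # Delegate: a worker investigates and produces the change.
    diff = worker(issue)
    # Review: a second agent reads the diff before any human sees it.
    notes = reviewer(diff)
    # Only now does something reach the human: one pull request to approve or not.
    return PullRequest(issue=issue, diff=diff, review_notes=notes)
```

The human appears nowhere inside `handle_issue`. In this sketch the only thing that crosses the boundary is the returned `PullRequest`, or nothing at all when triage says no.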
Three things make that arrangement hold together, and none of them is the model. The first is persistent context: server-side compaction, which compresses a long task’s history so the work doesn’t collapse when the context window fills. Before that existed, every long job had to be nursed back from a fresh session, which is why people optimized their whole workflow around short tasks. The second is delegation: one thread that can create and steer sub-projects instead of doing everything itself. The third is triggers: events that wake the manager when an issue arrives or a build breaks, so the loop starts without you starting it. Persistent context, delegation, a way to be woken. That’s a loop. One engineer already runs a “chief of staff” agent that wakes every ten minutes to coordinate his repository and opens a thread in the sidebar whenever it needs a human decision. He isn’t watching it. He’s answering it.
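Compaction is the least intuitive of the three, so a toy version may help. The sketch below is a client-side illustration of the idea only, not how any provider's server-side compaction actually works. `compact`, `budget`, `keep_recent`, and `summarize` are hypothetical names, and size is measured in characters rather than real tokens.

```python
from typing import Callable, List


def compact(
    history: List[str],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],  # hypothetical: e.g. a model call
    count: Callable[[str], int] = len,      # assumption: characters stand in for tokens
) -> List[str]:
    """Fold older entries into one summary once the history exceeds the budget."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    total = sum(count(entry) for entry in history)
    # Nothing to do if the history fits, or if there is nothing older to fold.
    if total <= budget or len(history) <= keep_recent:
        return list(history)
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    # Replace everything older with a single summary; keep recent entries verbatim.
    return [summarize(older)] + recent
```

The design choice worth noticing is that the most recent entries survive untouched. The task keeps its working detail and loses only the verbatim record of how it got there.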
The bottleneck keeps moving
The interesting part is what happens to the bottleneck once the loop is running, because it doesn’t disappear. It moves. The first wall people hit was tokens: not enough model budget to run the work they wanted. Solve that and the next wall is compute: ten threads running at once until the laptop sounds like it’s about to take off, which is why agents started shipping their test runs to separate machines. Solve compute and the wall you hit is your own attention, and that one has no upgrade path. You can add tokens. You can rent compute. You cannot buy a second person to sit in your chair and make your judgment calls. Every constraint you remove promotes the next one, and the human is always the last constraint standing.
Which is the actual point, and it’s easy to miss while admiring the models. The models are improving faster than anything built around them. The harness the agent runs inside, the org chart of agents that decides who does what, the triggers that wake them, and the review gates that let a change land safely without you reading every line: all of that is underbuilt, and it’s underbuilt because we spent the last two years staring at the model instead of the structure. A model that can write a correct pull request in ten seconds is worth very little to you if it still requires you to be personally present for every one of them. The gap between what the model can do and what your workflow lets it do is now the whole game.
Engineering the loop
So the thing worth engineering has changed underneath us. It used to be the feature. Then, for a brief and clarifying while, it was the pull request, the unit of reviewable change. Now the object worth designing is the loop itself: the standing arrangement of manager and workers and triggers and review that turns an incoming issue into a landed change while you’re doing something else. That’s a design problem with real decisions in it. What does the manager escalate and what does it decide alone? Which work runs in the cloud and which needs your machine? What has to be true before a change lands without a human reading it? Those questions have the texture of architecture, because that’s what they are.
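Those questions get easier to argue about once they are written down as a policy. The sketch below shows one possible shape for the last of them: what has to be true before a change lands unread. Every name and threshold here is a hypothetical assumption (`Change`, `Policy`, `decide`, the 50-line limit, the sensitive path prefixes), chosen only to show that the answer can be a small, reviewable piece of code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Decision(Enum):
    LAND = "land"          # merge without a human reading it
    ESCALATE = "escalate"  # ask the human for a decision
    REJECT = "reject"      # send back to the worker


@dataclass(frozen=True)
class Change:
    checks_passed: bool
    agent_review_approved: bool
    lines_changed: int
    paths: Tuple[str, ...]


@dataclass(frozen=True)
class Policy:
    # Hypothetical defaults; a real project would pick its own.
    max_unattended_lines: int = 50
    sensitive_prefixes: Tuple[str, ...] = ("auth/", "billing/", "migrations/")


def decide(change: Change, policy: Policy = Policy()) -> Decision:
    """Return what the manager should do with a finished change."""
    # Failing checks never reach a human; the worker tries again.
    if not change.checks_passed:
        return Decision.REJECT
    # Anything the reviewing agent did not approve needs a person.
    if not change.agent_review_approved:
        return Decision.ESCALATE
    # Large changes are escalated regardless of how clean they look.
    if change.lines_changed > policy.max_unattended_lines:
        return Decision.ESCALATE
    # So is anything touching a path marked as sensitive.
    if any(path.startswith(policy.sensitive_prefixes) for path in change.paths):
        return Decision.ESCALATE
    return Decision.LAND
```

Under these assumed defaults, `decide(Change(True, True, 5, ("docs/readme.md",)))` returns `Decision.LAND`, while the same small change under `auth/` is escalated. The specific rules matter less than the fact that they live in one place where they can be read and changed.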
The engineer juggling ten terminals wasn’t behind because he lacked good agents. He had good agents. He was behind because he was still the loop, doing by hand the coordination that should have been built once and left to run. The future isn’t twenty terminals open at once, each one waiting for you to notice it. It’s the loop you build so the twentieth terminal never has to open.


