Twenty stories. Easy, right? Write a spec, review with colleagues and agents, then kick off a long agent run. The first story sounds charming. The tenth sounds like the first dressed up in new nouns. The twentieth... well, you've already stopped listening by then. Twenty stories, one shape.
For the twentieth story to land, something has to change between story one and story two. A draft used "warm" twenty-one times, and before the next batch ran, a vocabulary gate went in. An intro made a four-year-old fidget by minute two, and we started checking opening DNA. We rewrote the prompt, and the next batch paced toward sleep instead of describing it.
That was not in the first spec. It came from sitting beside a child at bedtime and translating the observation into the next morning's prompt.
The Prompt Amplifier
If you think you're getting value from a super long agent run, think again. Every technology shift moves advantage to whatever stays scarce after the cost of an input collapses. Agents will keep getting better, so the output is already a commodity anyone else can buy with tokens.
Execution is getting cheap.
The moat is moving to signal. Signal can't be predicted in advance because it requires the product to be used to take shape: a child fidgets, a user hesitates, a reader skips, a buyer bounces. Signal capture is the only capability the next model upgrade can't kill.
Long runs scale stale assumptions. Tight loops scale with the frontier.
What scales as agents improve
Tight loop
software gathers, judgment filters
The loop stays close to the frontier as agent capability rises.
- ship surfaces
- capture use
- filter
- next prompt
Long loop
sparse feedback, late filter
Capability improves, but the work is still shaped by stale input.
- static spec
- long run
- late review
- rerun
Tight loops do not win by interrupting more. They win because each agent improvement has more clean usage and a better filter to multiply.
A signal system has two parts. Instrumentation: software surfaces that capture behavior. Organization: structure that lets what users do rewrite a prompt and ship the output by Friday. The latter is the most transformational need right now. (Dashboards live with PMs. Prompts live with engineers. Resource allocation lives with management. They'll have a meeting next Tuesday.)
There's lots of talk about the value of context, but it misses precision. Static context (docs, role definitions, style guides) has a half-life. The only context that compounds is novel: not customer contact or feedback, but usage behavior. A physical book company can ship the book and never see a page turn. An instrumented digital product sees every page.
Most engineers will spend the next year trying to extend the boundary outward: more autonomy, longer runs, fewer interruptions. That is still an execution mindset. The new posture is the prompt amplifier: a signal system that converts live attention on fresh usage into the next prompt before the observation cools.
The execution is the agent's. The judgment is yours. The amplification is in the gap.
Distance Kills the Loop
When I was an engineer at a large tech company, I was several layers from the customer. PMs, managers, eng leads filtered everything that flowed in either direction. By the time anything I built reached a user, it had passed through a dozen human filters.
This is a different kind of judgment than the prompt-amplifier kind. The dozen filters didn't add fresh customer usage. They diluted what little reached me.
It was the right org for an expensive build. When the marginal cost of a wrong build was six weeks of a team's time (and finding the right team with the right knowledge added more), investing ten people upfront to filter and route the work made sense. Layers were a defense mechanism: spend coordination overhead, reduce expensive rework.
Now the marginal cost of a wrong build is an afternoon of tokens. The same structure that protected against rework now blocks the usage observation that compounds: the live read from the person on the other end. Upfront investment in spec accuracy is the wrong place to spend.
Two cost regimes
When build was expensive, layered orgs paid for themselves. When it's cheap, the layers block the only signal that compounds.
Expensive build, layered org
Signal filters through four roles before any code runs.
Customer
User researcher
PM
Eng lead
Engineer
Loop closes from engineer back to customer after weeks or months, passing through user researcher, PM, and eng lead.
Cheap build, signal-driven loop
One observer; agents handle execution between observations.
Customer
Builder
+ Agents
Loop closes from Builder back to customer within hours or days.
Fewer layers between observation and rework.
Two product-development loops compared side by side. On the left, an expensive-build loop where a signal from the customer passes clockwise through a user researcher, PM, eng lead, and engineer before the cycle closes back to the customer in weeks or months. On the right, a much tighter loop where the Builder (with Agents handling execution) observes the customer directly and closes the cycle in hours or days.
Three Regimes
The throughput regime is what we came from. Execution was scarce. The advantage went to whoever shipped more for less. A spec, many builders, a review at the end. Volume was the moat.
The slop regime is where the previous regime's momentum carried us. Execution got cheap. The bottleneck moved to review, and most teams responded by reporting the savings instead of spending them. A hundred agents under one human produce output that competes with itself for average. The market floods. The ceiling is flat.
The signal regime is the third act. The new moat is the ability to gather many clean usage reads quickly and turn the best ones into the next prompt before the room has cooled. Fewer agents, tighter loops, more autonomy for the human bound to the work by minutes instead of memos.
| Regime | Bottleneck | Advantage | Failure mode |
|---|---|---|---|
| Throughput | Execution | Volume per unit cost | Slow ship |
| Slop | Review bandwidth | Cheap output | Average floods the market |
| Signal | Usage capture | Signal-to-ship velocity | Hardens around the wrong read |
A Monday test if you ship with agents: by Monday afternoon, how many clean usage reads do you have from the thing the agents helped make? How fast does the best one rewrite the next prompt? If the gap isn't shrinking, you're stuck in throughput regardless of how many tokens you used.
The signal system is not optional. The next decade goes to whoever designs the loop that writes the next prompt before the room cools.
Anthropic ships Claude Code at sixty to a hundred internal releases a day, sometimes prompted by a user's tweet, with ninety percent of the code written by the tool itself. Cursor builds the tool they themselves use to build it. The testable execution disappeared into the model, as the pattern predicts.
What is left is the signal system: software surfaces that gather usage at scale, agents that sift the flood, taste gates that preserve quality, and a person at the keyboard writing the next prompt fast enough to matter.
Most teams will still be counting tokens while it happens.
You can try Endless Storytime or find me on LinkedIn.
*APRIL 25, 2026