LmCast :: Stay tuned in

Provide agents with automated feedback

Recorded: Jan. 19, 2026, 10:03 a.m.

Moss' blog

Don't waste your back pressure

Published Sat, Jan 17, 2026 by Moss

Estimated reading time: 4 min

Back pressure for agents
You might notice a pattern in the most successful applications of agents over the last year. Projects that set up structure around the agent itself, providing it with automated feedback on quality and correctness, have been able to push agents to work on longer-horizon tasks.

This back pressure helps the agent identify mistakes as it progresses, and models are now good enough that this feedback can keep them aligned to a task for much longer. As an engineer, this means you can increase your leverage by delegating progressively more complex tasks to agents, while increasing your trust that the completed work meets a satisfactory standard.
Imagine for a second if you only gave an agent tools that allow it to edit files. Without a way to interact with a build system, the model relies on you for feedback about whether or not the change it made is sensible. This means you spend your back pressure (the time you spend giving feedback to agents) typing a message telling the agent it missed an import. That scales poorly and limits you to working on simple problems.

If you’re directly responsible for checking that each line of code produced is syntactically valid, that’s time taken away from thinking about the larger goals or problems in your software. You’re going to struggle to derive more leverage out of agents because you are caught up in trivial changes. If instead you give the agent tools that allow it to run bash commands, it can run a build, read the feedback, and correct itself. You remove yourself from needing to be involved in those tasks and can instead focus on more complex ones.
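
In practice the wiring for this can be tiny. Here is a minimal sketch in Python of what such a loop might look like, assuming a hypothetical `agent.act()` that edits files in the working tree; the build command is whatever your project already uses (a Rust `cargo build` is used as a stand-in below).

```python
import subprocess

def build_feedback(cmd: list[str]) -> str | None:
    """Run the project's build and return its output on failure, or None when clean."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return None
    # Compiler diagnostics can land on either stream, so hand both back.
    return (result.stdout + "\n" + result.stderr).strip()

def run_with_back_pressure(agent, task: str, build_cmd: list[str], max_rounds: int = 5) -> None:
    """Loop the agent against build output until the build passes or we give up."""
    message = task
    for _ in range(max_rounds):
        agent.act(message)  # hypothetical agent API: makes edits in the repo
        errors = build_feedback(build_cmd)
        if errors is None:
            return  # build is green; no human had to point out the missing import
        # The error output *is* the back pressure: it goes straight back to the agent.
        message = f"The build failed with this output, please fix it:\n\n{errors}"
    raise RuntimeError("Build still failing after several rounds; escalate to a human.")

# Example: run_with_back_pressure(my_agent, "Add a --verbose flag", ["cargo", "build"])
```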

Languages with expressive type systems have been growing in popularity in part because of back pressure. Type systems allow you to describe better contracts in your program. They can make it impossible to even represent invalid states. They can help you identify edge cases that you might not otherwise handle. Leaning on these features is another way of creating back pressure which you can direct as feedback on changes made by an agent.

Bonus points go to languages that work to produce excellent error messages (think Rust, Elm and even Python). These messages are fed directly back into the LLM, so the more guidance, or even suggested resolutions, they contain the better.
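
Even in Python you can buy a slice of this with type hints plus a checker like mypy or pyright. A small illustrative sketch (the `Response` states are made up) of making an invalid state unrepresentable, so the checker pushes back on the agent instead of you:

```python
from dataclasses import dataclass
from typing import Union, assert_never  # assert_never requires Python 3.11+

@dataclass
class Loading: ...

@dataclass
class Loaded:
    body: str

@dataclass
class Failed:
    error: str

# A response is exactly one of these; "loaded but also failed" cannot be expressed.
Response = Union[Loading, Loaded, Failed]

def render(resp: Response) -> str:
    if isinstance(resp, Loading):
        return "spinner"
    if isinstance(resp, Loaded):
        return resp.body
    if isinstance(resp, Failed):
        return f"error: {resp.error}"
    # If a new state is ever added to Response, mypy/pyright reports an error
    # here for every render-like function that forgot to handle it.
    assert_never(resp)
```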

Another example of back pressure is the rapid uptake in people giving agents a way to see rendered pages using MCP servers for Playwright or Chrome DevTools. In either case these tools give the agent a way to make a change and compare what it expected to see in the UI against the actual result. Attaching these tools means you remove yourself from needing to keep telling the agent that a UI element isn’t loading correctly or something isn’t centered. Not working on a UI application? Use MCP servers that bridge to LSPs for lints or other feedback.
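
The MCP servers do the plumbing for you, but the kind of check they make available is easy to picture. A sketch using the Playwright Python API directly, with a made-up selector and a crude "is it centered?" rule, producing feedback the agent can read instead of you:

```python
from playwright.sync_api import sync_playwright

def check_ui(url: str) -> list[str]:
    """Return a list of UI problems in plain text; empty means nothing to report."""
    problems: list[str] = []
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1280, "height": 800})
        page.goto(url)
        banner = page.locator("#hero-banner")  # placeholder selector
        if not banner.is_visible():
            problems.append("#hero-banner did not render")
        else:
            box = banner.bounding_box()
            if box is not None:
                center_x = box["x"] + box["width"] / 2
                if abs(center_x - 640) > 20:  # crude centering check on a 1280px viewport
                    problems.append(f"#hero-banner is off-center (center x = {center_x:.0f}px)")
        page.screenshot(path="after_change.png")  # artifact the agent can also inspect
        browser.close()
    return problems
```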

Even outside of engineering tasks, proof assistants like Lean combined with AI (see the recent work on the Erdős Problems, where Kevin Barreto and Liam Price used Aristotle to formalise a proof written by GPT-5.2 Pro into Lean), randomized fuzzing to evaluate correctness when generating CUDA kernels, and logic programming with agents are all powerful combinations, because they let you keep pulling the LLM slot machine lever until the result you have can be trusted.
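
The back pressure in the proof assistant case comes from the checker itself: Lean only accepts a file when every proof term type-checks, so "does it compile" becomes the trust signal. A toy illustration, unrelated to the Erdős work:

```lean
-- Lean accepts this only because the proof term actually checks.
-- Swap in a wrong proof (say, a bare `rfl`) and the agent gets a concrete
-- error message to retry against.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```
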
I think the payoff of investing in higher-quality testing is growing massively, and an increasing part of engineering will involve designing and building back pressure in order to scale the rate at which contributions from agents can be accepted.
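
The CUDA-kernel fuzzing mentioned above follows the same shape: a trusted reference plus randomized inputs turns "is this kernel correct?" into a harness the agent can run itself. A rough sketch, assuming the generated kernel has been wrapped so it is callable from Python (softmax here is just a stand-in for whatever the kernel computes):

```python
import numpy as np

def reference_softmax(x: np.ndarray) -> np.ndarray:
    """Trusted-but-slow reference the generated kernel must agree with."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def fuzz_kernel(candidate, trials: int = 1000, seed: int = 0) -> None:
    """Throw random shapes and values at the candidate; any mismatch becomes
    concrete feedback (shape and error size) that goes straight back to the agent."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        rows = int(rng.integers(1, 64))
        cols = int(rng.integers(1, 1024))
        x = rng.standard_normal((rows, cols)).astype(np.float32) * 10
        got = candidate(x)
        want = reference_softmax(x)
        if not np.allclose(got, want, rtol=1e-4, atol=1e-5):
            raise AssertionError(
                f"mismatch on shape {x.shape}: max abs error "
                f"{np.abs(got - want).max():.3e}"
            )
```
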
If you’re doing spec-driven development and you want the agent to generate a specific API schema, set up automatic generation of documentation based on the OpenAPI schema from your application so the agent can compare the result it produced against what it intended to make. There are many more techniques like this you can apply once you recognize the pattern.

In your projects, think about how you can build back pressure into your workflow. Once you have it, you can loop agents until they have stamped out all of the inconsistencies and issues for you.
Without it, you’re going to be stuck spending your time telling the agent about each mistake it makes yourself.
So next time, ask yourself: are you wasting your back pressure?

Tagged: ai


Moss’ blog post explores the concept of “back pressure” as a critical mechanism for improving the effectiveness and reliability of agents—likely referring to AI-driven or automated systems—in complex tasks. The author argues that successful applications of agents over the past year have relied on structured feedback loops to maintain alignment with goals, enabling them to handle longer-term objectives. Back pressure, in this context, refers to systems or tools that provide automated, real-time feedback to agents, allowing them to identify and correct errors independently. This approach reduces the need for manual intervention by humans, who would otherwise have to constantly validate each step of an agent’s work. Moss emphasizes that by embedding back pressure into workflows, engineers can delegate increasingly complex tasks to agents while maintaining confidence in their outputs. The core argument is that without such feedback mechanisms, agents remain limited to simple tasks, and human effort is wasted on trivial corrections rather than strategic problem-solving.

A central example in the post involves agents that lack access to systems capable of evaluating their work, such as a build system. If an agent is only given the ability to edit files without tools to compile or test code, a human must manually confirm whether changes are valid. This process is inefficient and scales poorly, as it diverts attention from higher-level tasks. Moss contrasts this with scenarios where agents are equipped to run commands, execute tests, or interact with build systems. For instance, an agent that can trigger a CI/CD pipeline and analyze results autonomously eliminates the need for human intervention in basic validation. This shift not only saves time but also allows engineers to focus on broader architectural decisions or complex problem-solving. The author suggests that the more back pressure an agent can handle, the more it can operate independently, effectively increasing human leverage.

The discussion extends beyond software development to programming languages with expressive type systems, which Moss frames as another form of back pressure. These systems enforce constraints that prevent invalid states and encourage engineers to define clearer contracts within their code. For example, languages like Rust or Elm provide robust error messaging that helps identify edge cases and guide corrections. When integrated with AI agents, such type systems offer immediate feedback, enabling the agent to refine its output without human oversight. Moss highlights that languages with excellent error messages—such as Python, Rust, and Elm—are particularly beneficial because they provide detailed guidance that can be directly fed back into the agent’s learning or decision-making process. This synergy between type systems and AI agents reduces ambiguity, ensuring that the agent’s work adheres to expected standards without constant human correction.

Another key point is the role of specialized tools in providing back pressure for agents, especially in non-code domains. Moss cites the use of MCP servers with Playwright or Chrome DevTools, which allow agents to interact with rendered web pages and compare expected outcomes against actual results. This capability eliminates the need for humans to manually verify UI elements, such as whether a button is centered or if an element loads correctly. Even for projects not focused on user interfaces, tools that connect to LSPs (Language Server Protocols) for linting or static analysis serve as a form of back pressure. These systems provide automated feedback on code quality, enabling agents to self-correct without human intervention. Moss suggests that adopting such tools is essential for scaling agent contributions, as they allow the system to iterate and refine its work until it meets predefined criteria.

Outside of traditional engineering tasks, the post highlights how back pressure applies to proof assistants like Lean when combined with AI. Moss references recent work on solving the Erdős Problems, where an agent (Aristotle) formalized a proof written by GPT-5.2 Pro into Lean, demonstrating how automated feedback can validate and refine complex logical structures. Similarly, randomized fuzzing techniques for generating CUDA kernels or logic programming with agents benefit from back pressure by enabling iterative testing and refinement. These examples illustrate that back pressure is not limited to code but extends to any domain where automated validation can replace manual oversight. Moss argues that the growing importance of such feedback mechanisms is reshaping engineering practices, as teams increasingly prioritize designing systems that allow agents to operate autonomously.

The author also emphasizes the importance of aligning back pressure with specific development methodologies, such as spec-driven development. For instance, if an agent is tasked with generating a specific API schema, integrating automatic documentation generation based on OpenAPI schemas allows the agent to compare its output against intended specifications. This creates a feedback loop where the agent can adjust its work iteratively until it meets the required criteria. Moss suggests that similar techniques can be applied across various domains, from testing to documentation, once the principle of back pressure is recognized. The underlying message is that investing in structured feedback systems enables agents to produce reliable results more efficiently, reducing the burden on human operators.

Moss concludes by urging engineers to critically evaluate their workflows and identify opportunities to introduce back pressure. By doing so, they can transform agents from tools that require constant supervision into autonomous collaborators capable of resolving inconsistencies and issues independently. Without back pressure, the author warns, engineers risk becoming trapped in repetitive, low-value tasks that hinder progress. The post serves as a call to action for developers and teams to prioritize the design of systems that leverage feedback mechanisms, ensuring that agents can scale effectively while maintaining high standards of quality. Ultimately, the concept of back pressure is positioned as a foundational element for maximizing the potential of agents in both technical and non-technical contexts, fostering a shift toward more efficient and reliable automation.