The Code-Only Agent
Recorded: Jan. 19, 2026, 10:03 a.m.
| Original | Summarized |
The Code-Only Agent • Rijnard van Tonder 𝕏 @rvtond When Code Execution Really is All You Need If you're building an agent, you're probably overwhelmed. Tools. What if the agent only had Truly one tool means: no `bash`, no `ls`, no `grep`. Only When you watch an agent run, you might think: "I wonder what tools The simpler Code-Only paradigm makes that question irrelevant. The > Agent, do thing > Agent Contrast with: > Agent, do thing > Agent It does this every time. No, really, It needs to create a script that crawls a website? It doesn't write We make it so that there is literally no way for the agent to So what? Why do this? You're probably thinking, how is this useful? Let's think a bit more deeply what's happening. Traditional agents At some point the agent might even write a Python script to do this The Code-Only agent produces something more precise than an answer Try ❯❯ Code-Only plugin for Claude Code Code witnesses are semantic guarantees Let's follow the consequences. The code witness must abide by Is a Code-Only agent really enough, or too extreme? I'll be frank: I Code-Only agents are not too extreme. I think they're the only way This lens says the Code-Only agent is a producer of proofs, So you want to go Code-Only. What happens? The paradigm is simple, First, the harness. The LLM's output is code, and you execute that I've personally, e.g., had the tool return results directly if under Next, enforcement. Let's say you're using Claude Code. It's not Next, the language runtime. Python, TypeScript, Rust, Bash. Any Once you get into the Code-Only mindset, you'll see the potential What about heterogeneous languages and runtimes for our `execute_tool`? I don't think we've thought that far yet. The agent landscape is quickly evolving. 
My thoughts on how the prose.md; Welcome to Gas Town; Anthropic Code Execution with MCP article; Anthropic Agent Skills article; Cloudflare Code Mode article; Ralph Wiggum as a "software engineer"; Tools: Code is All You Need; How to Build an Agent. What's Next Two directions feel inevitable. First, agent orchestration. Tools Second, hybrid tooling. Skills work well for processes that live in 1 There is something beautifully Timestamped 9 Jan 2026 |
The Code-Only Agent, as conceptualized by Rijnard van Tonder, is a radical departure from traditional agent architectures that rely on multiple tools, subagents, or predefined skills. Its core claim is that the only tool an agent needs is code execution itself: a Turing-complete mechanism that can, in principle, express any computable task. Van Tonder argues that this simplifies the agent’s design by removing the need to manage separate tools like `bash`, `ls`, or `grep`. Instead, every action, whether searching for files, parsing data, or interacting with external systems, is reduced to generating and executing code. This streamlines the agent’s workflow and makes its outputs precise, repeatable, and open to verification. The author calls the resulting artifact a “code witness”: an executable program that is both the solution and an inspectable record of how the result was obtained. By framing tasks in terms of code rather than natural language or tool calls, the agent avoids much of the ambiguity and inefficiency of traditional methods.

The article critiques conventional agents for their tendency to employ a patchwork of tools, which can lead to inconsistencies, missed details, or even hallucinations when processing complex tasks. For instance, an agent asked to analyze 1,000 files might combine `ls` and `grep`, but this risks skipping directories or misinterpreting results. The Code-Only Agent, by contrast, does not need to be told to write a script; its design forces it to generate code as the only means of achieving any goal. Even mundane tasks, such as file searching or web scraping, are therefore handled by generated programs.
Van Tonder illustrates this with examples like using Python’s `os.walk` or `Path.rglob` to locate files, or writing code that outputs a script for web crawling without saving it to the filesystem. The result is a system where every action is codified, creating a traceable and reproducible process. This approach also aligns with the principle of “executable descriptions of behavior,” where the agent’s outputs are not just answers but working programs that can be rerun, modified, or analyzed.

A critical component of the Code-Only Agent’s design is its enforcement mechanism, which ensures that no alternative tools are used. Van Tonder acknowledges the challenges of implementing this in practice, particularly with agent harnesses like Claude Code that may resist being confined to a single tool. He describes strategies such as using PreHook plugins to intercept and block prohibited tool calls, even if the agent then spends extra iterations retrying. The author also highlights the importance of managing output, as large datasets or complex computations could overflow the agent’s context window. Solutions include returning results directly if they fall under a size threshold, or writing them to disk and returning the file path instead. The handling of `stdout` and `stderr` also requires care, to balance transparency with usability. These challenges call for deliberate design choices, such as selecting a runtime language (Python, TypeScript, Rust, or Bash) that fits the agent’s intended use cases. Dynamic languages like Python offer advantages by allowing code to run natively within the agent’s environment, while statically typed languages may provide stronger guarantees for certain applications.

Beyond technical implementation, the Code-Only Agent raises philosophical and practical questions about the nature of computation and trustworthiness.
Van Tonder argues that code execution provides a form of “semantic guarantee” rooted in the runtime semantics of the chosen language. Unlike natural language responses, which are prone to interpretation errors, code is a formal construct that can be executed and verified. This is particularly valuable in scenarios requiring rigorous guarantees, such as data analysis or system automation, where even minor errors can have significant consequences. The author references the concept of “proofs-as-programs,” suggesting that Code-Only Agents function as producers of computational proofs, with their code serving as both the solution and the evidence of its validity. This perspective is further reinforced by the potential integration of languages like Lean, which offer formal verification capabilities. However, Van Tonder also acknowledges the limitations of this approach: while code provides a high degree of reliability, it is still subject to the semantics and constraints of its implementation. The trustworthiness of a Code-Only Agent, therefore, depends on the robustness of its underlying programming language and the fidelity of its execution environment.

The article also explores the broader implications of the Code-Only paradigm for agent orchestration and hybrid systems. Van Tonder speculates that future agents will likely blend Code-Only execution with other paradigms, such as natural language skills or API-based tools. For example, while Skills (as defined by Anthropic) enable reusable processes framed in natural language, the Code-Only approach offers a more precise and executable alternative. He envisions a future where these two models coexist, with Skills handling high-level coordination and Code-Only agents managing low-level computation. This hybrid model could leverage the strengths of both approaches, combining the flexibility of natural language with the rigor of code.
Additionally, the article points to trends like “prose.md,” which uses natural language constructs to orchestrate agents, as a potential complement to Code-Only systems. By using natural language for coordination and code for execution, such hybrid agents could achieve a balance between human readability and machine precision.

Van Tonder also addresses the practical challenges of adopting the Code-Only paradigm, including the need for custom infrastructure and the limitations of existing tools. He notes that while approaches like Anthropic’s code execution with MCP (Model Context Protocol) or Cloudflare’s Code Mode provide some support for code-centric agents, they are not inherently designed for the extreme simplicity of a single-tool system. This has led to workarounds, such as using PreHook plugins to enforce code-only behavior or adapting existing agents through custom modifications. The author acknowledges that building a Code-Only Agent from scratch may be the most straightforward approach, though it requires careful consideration of factors like runtime selection, output management, and error handling. He also highlights the importance of further research into heterogeneous runtimes, where different languages or execution environments could be integrated into a unified Code-Only framework.

Finally, the article reflects on the philosophical underpinnings of the Code-Only Agent, framing it as a reaction to the growing complexity of agent ecosystems. Van Tonder critiques the “right way” mentality that often drives the development of tools and skills, arguing that it can stifle innovation by prioritizing established patterns over novel solutions. By stripping away extraneous components and focusing on code execution, the Code-Only Agent embodies a more minimalist and principled approach to agent design.
This perspective is echoed in the author’s appreciation for quines (programs that print their own source code) as a metaphor for the self-referential nature of the Code-Only paradigm. Ultimately, Van Tonder sees the Code-Only Agent not as a replacement for traditional methods but as a complementary approach that challenges developers to rethink how agents interact with the world. As he concludes, the future of agent systems may lie in a synthesis of Code-Only execution with other paradigms, creating a flexible and powerful framework for tackling computational tasks. |
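
The harness behavior described in the summary (execute the model's code, capture `stdout` and `stderr`, return small results directly and spill large ones to disk) can be illustrated with a minimal Python sketch. This is not the author's implementation. The function name `run_code_tool`, the cutoff `MAX_INLINE_CHARS`, the result keys, and the sample `FIND_PY_FILES` snippet are all hypothetical names chosen for illustration; the function stands in for the single execution tool the article refers to as `execute_tool`.

```python
import subprocess
import sys
import tempfile
from typing import Optional

# Hypothetical cutoff: the article only says results "under" some size are
# returned directly; the actual threshold is an assumption here.
MAX_INLINE_CHARS = 4096

# Example of code an agent might emit instead of calling `ls` or `grep`:
# list every .py file under the working directory.
FIND_PY_FILES = (
    "from pathlib import Path\n"
    "for p in sorted(Path('.').rglob('*.py')):\n"
    "    print(p)\n"
)


def run_code_tool(code: str, cwd: Optional[str] = None,
                  timeout_s: float = 30.0) -> dict:
    """Run agent-written Python in a fresh interpreter and package the result.

    Raises subprocess.TimeoutExpired if the code runs longer than timeout_s.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout and stderr separately
        text=True,
        cwd=cwd,
        timeout=timeout_s,
    )
    result = {
        "exit_code": proc.returncode,
        # Keep only the tail of stderr so a long traceback cannot flood context.
        "stderr": proc.stderr[-MAX_INLINE_CHARS:],
    }
    if len(proc.stdout) <= MAX_INLINE_CHARS:
        # Small output: hand it back to the model directly.
        result["stdout"] = proc.stdout
    else:
        # Large output: write it to disk and return only the path.
        with tempfile.NamedTemporaryFile(
            "w", suffix=".txt", delete=False, encoding="utf-8"
        ) as f:
            f.write(proc.stdout)
        result["stdout_path"] = f.name
    return result
```

Calling `run_code_tool(FIND_PY_FILES, cwd=some_dir)` returns a dict with either a `stdout` entry or a `stdout_path` entry, never both, so the model always sees a bounded amount of text. Running each snippet in a fresh subprocess is one design choice among several; the summary notes that a dynamic language could also run code natively inside the agent's own environment.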