About Agent Harnesses
Brief overview of Agent Harnesses such as Claude Code, Codex, Opencode and Pi.

Overview
You may have heard the term agent harness throwed around here and there, but what exactly is a agent harness.
Lets understand what exactly a harness is in this blog.
- What is harness is
- how tools like Claude Code,Cursor actually work
- why the harness affects output quality so much
- and why the core of all this is much simpler than it looks
The model only generates text
At the heart of every coding agent there is a LLM. And regardless whatever this agents do, the model itself is still doing one basic thing:
Predicting Text
That's it. It's just guessing what's come after a given word over and over again.
It cannot "open files", "edit code", or "delete your production database :)"
So if this is the case, how does coding agents like Cursor and Claude Code can:
- explore your project
- search for files
- edit code
- run shell commands
- and keep working step and step until a given task is completed.
The answer is tool calling, orchestrated by a harness
What is a harness?
A harness is the environment and tool layer around the model.
It is a software layer that:
- tells the model what tools it has access to
- receives tool calls from the model
- decides whether those tool calls are safe to run
- executes them
- feeds the results back into the conversation
- and keeps that whole loop going until the model is done
So when people are comparing different coding agents, they are not just comparing the models.
They're comparing:
- Tools
- tool descriptions
- system prompt
- the permission model
- the context management
All of this can massively influence how good the final result is.
How tool calling works
Lets say you ask your agent:
What are the contents of this directory:
A plain chat model can’t actually inspect your filesystem.
But a harness can give it a tool like bash.
Instructions on how to use this tool are already present in the context, and the tool descriptions are send via API.
The model outputs special tokens that tells the harness, that it wants to call a tool with given parameters.
<tool: bash>
ls -la
</tool>At that point, the model stops.
Then the harness takes over:
- It sees the tool call
- It runs
ls -la - It captures the output
- It appends that output to the chat history
- It sends the updated conversation back to the model
- The model continues from there
So every tool call creates a loop:
model → tool call → harness executes → result added to history → model continues
That’s the core architecture behind nearly every modern AI coding tool.