Why This Matters
Your agent now has real power. It can read files, write patches, and execute tools. But not all tool calls carry the same risk. Writing a comment to a config file is harmless. Running rm -rf / is catastrophic. Between those extremes lies a spectrum of operations that need different levels of oversight.
Most production agents implement some form of policy gate. Claude Code asks for confirmation before running shell commands. Codex runs untrusted code in sandboxed containers. The principle is the same in each case: evaluate the risk of an action before executing it, and require approval proportionate to that risk.
Without policies, you are one hallucinated shell command away from disaster.
What You Will Build
An ApprovalPolicy class that intercepts tool calls, evaluates them against a chain of rules, and returns one of three decisions: allow, deny, or ask (require human confirmation). You will also implement a basic sandboxing concept that restricts what tools can access.
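To make the sandboxing idea concrete, here is a minimal sketch that confines file access to a single root directory. The class name PathSandbox and its is_allowed method are hypothetical, chosen for illustration only; the version built later in the chapter may look different.

```python
from pathlib import Path


class PathSandbox:
    """Hypothetical helper: confines file access to one root directory."""

    def __init__(self, root: str) -> None:
        # Resolve once so later comparisons use a normalised absolute path.
        self.root = Path(root).resolve()

    def is_allowed(self, candidate: str) -> bool:
        # Join against the root, then resolve, so "../" segments and
        # absolute paths are normalised before the containment check.
        target = (self.root / candidate).resolve()
        return target == self.root or self.root in target.parents


sandbox = PathSandbox("/tmp/project")
print(sandbox.is_allowed("src/main.py"))     # inside the root
print(sandbox.is_allowed("../secrets.txt"))  # escapes the root
```

A tool that reads or writes files could call is_allowed before touching the filesystem and refuse anything outside the root.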
        Tool Call from LLM
                |
                v
      +-------------------+
      | policy.evaluate() |
      +---------+---------+
                |
        +-------+-------+
        |       |       |
        v       v       v
      ALLOW    ASK     DENY
        |       |       |
        v       v       v
     Execute  Prompt  Return
      tool    user    error
              for     to LLM
              approval
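The three branches of the diagram can be sketched as a small dispatch function. This is an illustration under assumptions, not the chapter's implementation: the Decision enum, the dispatch function, and the error strings are hypothetical names, and treating a declined prompt as an error returned to the LLM is an assumption the diagram does not spell out.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


def dispatch(
    decision: Decision,
    run_tool: Callable[[], str],
    confirm: Callable[[], bool],
) -> str:
    """Act on a policy decision (hypothetical helper)."""
    if decision is Decision.DENY:
        # DENY: the tool never runs; the LLM receives an error instead.
        return "error: blocked by policy"
    if decision is Decision.ASK and not confirm():
        # ASK: the user was prompted and said no (assumed behaviour).
        return "error: user declined"
    # ALLOW, or ASK with approval: execute the tool.
    return run_tool()


print(dispatch(Decision.ALLOW, lambda: "tool output", lambda: False))
print(dispatch(Decision.DENY, lambda: "tool output", lambda: True))
```

Passing confirm as a callable keeps the prompt out of the policy logic, so the same function works with a terminal prompt, a GUI dialog, or a test stub.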
The policy chain evaluates rules in order. The first rule that matches determines the outcome. If no rule matches, the default is "ask" — fail safe, not fail open.
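A first-match-wins chain with an "ask" default could look roughly like this. Only the ApprovalPolicy name and the evaluate method come from the text above; the constructor, the rule signature, the two example rules, and the tool names "shell" and "read_file" are assumptions made for this sketch.

```python
from typing import Callable, List, Optional

# A rule inspects a tool call and returns "allow", "ask", or "deny",
# or None when it has no opinion about the call.
Rule = Callable[[str, dict], Optional[str]]


class ApprovalPolicy:
    def __init__(self, rules: List[Rule], default: str = "ask") -> None:
        self.rules = rules
        self.default = default

    def evaluate(self, tool: str, args: dict) -> str:
        for rule in self.rules:
            decision = rule(tool, args)
            if decision is not None:
                return decision  # first matching rule wins
        return self.default      # no rule matched: fail safe


def deny_recursive_delete(tool: str, args: dict) -> Optional[str]:
    # Hypothetical rule: block shell commands containing "rm -rf".
    if tool == "shell" and "rm -rf" in args.get("command", ""):
        return "deny"
    return None


def allow_reads(tool: str, args: dict) -> Optional[str]:
    # Hypothetical rule: reading files never needs confirmation.
    return "allow" if tool == "read_file" else None


policy = ApprovalPolicy([deny_recursive_delete, allow_reads])
print(policy.evaluate("read_file", {"path": "README.md"}))  # allow
print(policy.evaluate("shell", {"command": "rm -rf /"}))    # deny
print(policy.evaluate("shell", {"command": "ls"}))          # ask
```

Because evaluation stops at the first match, rule order matters: putting deny rules ahead of broad allow rules keeps a permissive rule from shadowing a restrictive one.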
What's Next
You now have two layers of safety: structured patches (Chapter 5) for safe file editing, and policy gates (this chapter) for controlling what the agent can do. But there is still a fundamental problem: the agent works in your main working directory.
If the agent is working on a feature and you are also editing files, changes collide. If two agents run in parallel, they step on each other. You need isolation — each task should get its own copy of the codebase.
In Chapter 7: Worktree Isolation, you will use git worktrees to give each agent task its own isolated workspace, preventing cross-contamination and enabling safe parallel execution.