Blog

Agents Need Deterministic Sandboxes

Alex Kesling — CTO·

I still want my coding agents to run in a robust sandbox by default, one that restricts file access and network connections in a deterministic way. I trust those a whole lot more than prompt-based protections like this new auto mode. @simonw

Claude presents two options out of the box:

  • babysit its every action or
  • give it free rein over the system with --dangerously-skip-permissions.

I’m tired of babysitting my coding agents, but I don’t want them running rm -rf /. The approach Auto mode for Claude uses to take the burden off of us is an application of an LLM as Judge system. Claude nests untrusted prompts within trusted ones to interrogate their safety. E.g. “Is it safe to run the following prompt: <UNVALIDATED_PROMPT_HERE>”.

This is an improvement! It’s, however, akin to protecting against script-injection by injecting into a “safer” script first. SQL and shell injection have shown us time and time again that there's no replacement for a real sandbox.

I believe there's a third way beyond babysitting and chaos that lets me run Claude where I have all my tools and context (instead of having to spin up and juggle all the tools, credentials, etc. installed on a Sprite or exe.dev instance). One that's both safer and more productive: run each action Claude requests in its own action-specific sandbox..

That's why we made Clash, a sandbox built for users first.

The insight: I trust the Claude Code harness itself differently than the actions the LLM takes through it. Throwing the whole shebang in a container isn't actually performing the job I want. What I want is for Claude to extend my abilities with all the context and power I have locally (including helping review PRs, pushing to dev branches, performing QA on secure staging environments, etc.). To do that I need different actions to be sandboxed differently. I don't want to babysit, but I want control over what Claude runs and visibility into what's happening as a result. Safety and control shouldn't come at the cost of productivity. I should be able to do this without containerizing or replicating my environment across multiple machines.

Clash policies match actions being performed to an OS-enforced sandbox (on macOS w/ Seatbelt and Linux w/ Landlock) you want them to run in. Claude provides a plugin interface to intercept every action performed. Clash hooks into this layer and rewrites / wraps every action performed in a session. Want Claude to be able to curl, but only to your local server running on port 8080? Done. git fetch can pull from GitHub but nothing else can read your .ssh/ keys? Done. Want any bash command to be able to safely read from your repository directory but not see anything else? Done. With Clash we can control exactly which tools Claude can use and route any shell command to precise OS-enforced sandboxes.

The project is a few months old now. We’ve spent a majority of our time iterating on the ergonomics of policy creation and management. We’ll soon post about our journey (it started with a YAML DSL, then S-expressions which grew into their own Scheme… and now Starlark+JSON). It’s a great experience once you settle into a policy that fits your workflow. We’ve found meaningful productivity (and peace of mind) gains once we’ve gotten set up, and we're working on prebuilt "policy packs" to bootstrap new users. If Clash resonates with you, please give it a try!

Head over to the quickstart guide to get started. Please file any and all issues you might find (or feedback you have) on the Clash GitHub project.