All agents have access to a cloud-based sandbox. Think of it as a clean, disposable computer that’s attached to your chat. The agent can use it to compile and run code, manipulate files, start web servers, run tests and benchmarks, and much more. The sandbox is a full Fedora Linux environment.

Port 8080 is exposed publicly via a per-sandbox (per-chat) preview URL, so you can view web services your agent starts for debugging and testing. Please note that this is not a long-term deployment solution.

At a high level, the sandbox is the primary way for the LLM-based agent to ground its execution in the real world. Observations from test suites, compiler errors, log lines, benchmark numbers, and so forth feed back into the agent’s plan and next actions.

Lifecycle and Setup

A new sandbox is started for each chat on the first use of a terminal command. The agent accesses the x86-based Fedora Linux machine as the sandbox user, which has password-less root access. This means the agent can install system packages using dnf, or make any other system changes needed as part of environment setup.
Rather than creating virtual environments (venv, uv, conda, micromamba, etc.), we recommend simply using the system Python installation inside the sandbox.
A few minutes after the chat ends (i.e. the agent is inactive and the browser tab is disconnected), the sandbox associated with that chat is terminated. If you start the chat again, the sandbox will be resumed.
Containers: since the sandbox is a Firecracker microVM, we do not support nesting further containers or virtualisation within the sandbox. As a result, we currently don’t support workflows based on OCI container images (such as Docker or Podman).

Files

Files that the agent is working with are synchronised to the sandbox user’s home directory: /home/sandbox. That is, if the agent has been editing the file foo.txt, it will end up at /home/sandbox/foo.txt. In general, files under /home/sandbox are visible to the agent (i.e. its file tree is rooted in this directory) and are synchronised to persist across sandbox restarts. One notable exception: any files caught by .gitignore rules, or by the filtering heuristics from scc, will not be synchronised.
We are working to remove these intricate synchronisation ignore rules, and in the near future we will synchronise almost all files.
If you have any large files or datasets, it is highly recommended to place them outside of the /home/sandbox directory (for instance under /tmp/my_dataset) to avoid slowing down the agent during file syncs.
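For example, a large dataset can be staged under /tmp so that it stays out of the sync path. A minimal Python sketch (the directory and file names are illustrative):

```python
import os
import tempfile

# Stage a (hypothetical) dataset outside the synced /home/sandbox tree.
# tempfile.gettempdir() resolves to /tmp on Linux.
data_dir = os.path.join(tempfile.gettempdir(), "my_dataset")
os.makedirs(data_dir, exist_ok=True)

blob_path = os.path.join(data_dir, "blob.bin")
with open(blob_path, "wb") as f:
    f.write(os.urandom(1024))  # stand-in for a much larger file

print(os.path.getsize(blob_path))  # → 1024
```

Anything written here is still fully usable by the agent in the terminal; it just won't be pulled into the file sync.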

Sandbox Tools

There are two main ways the agent will interact with the sandbox:
  1. execute_command: this is what the agent uses to run a command in a terminal window. Any commands already running in that terminal window are cancelled immediately to allow the new command to start. You will see the terminal ‘streamer’ in the UI while the agent is watching the command execute. Note, however, that when the terminal streamer disappears, a command that has not yet exited keeps running in the background.
  2. terminal_operator: this is for more agentic terminal interaction, such as navigating TUI menus, operating htop, or playing terminal-based games. In contrast to the execute_command tool, the terminal operator picks up from the current terminal window state without cancelling a previously running command. This is useful, for instance, if the agent ran a command with execute_command that is now hanging at a [Y/n] prompt: the terminal operator can pick up where the previous tool left off.

Long-Running Processes

Any servers you start, such as vite, uvicorn, or next dev, will keep running until the agent stops them, closes their terminal window, or the sandbox pauses due to chat inactivity. Note that you may turn on the ‘persist sandbox’ toggle in the chat settings, which keeps the sandbox running for 24 hours after the chat ends (e.g. for ephemeral, shareable demos).
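The same detached behaviour can be seen with any process launcher: starting a process does not block on it, and it keeps running until something stops it. A small Python sketch, where a sleeping child process stands in for a dev server:

```python
import subprocess
import sys

# Launch a stand-in "server" (it just sleeps); Popen does not wait for it.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# poll() returns None while the process is still running in the background.
print(proc.poll())  # → None

# Stop it explicitly, analogous to the agent closing the terminal window.
proc.terminate()
proc.wait()
```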

Environment Variables

If you wish to expose any environment variables (such as AWS credentials, HuggingFace tokens, or anything else), first make sure that they are defined in the credential manager UI menu (click on your avatar in the top-right, then Manage Credentials). The agent can see all your registered credentials, and ‘activate’ them (with your consent) on a per-chat basis. After your explicit consent, these credentials will be made available as environment variables in the agent’s terminal windows.
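Once activated, credentials are ordinary environment variables, so the agent reads them the usual way. A minimal sketch, assuming a hypothetical credential named HF_TOKEN:

```python
import os

# HF_TOKEN is a hypothetical credential name registered in the
# credential manager; it only appears after explicit activation.
token = os.environ.get("HF_TOKEN")

if token is None:
    print("HF_TOKEN is not activated for this chat")
else:
    print("HF_TOKEN is available")
```

Using os.environ.get (rather than os.environ[...]) lets the code degrade gracefully when a credential has not been activated.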

Limitations

There are a number of limitations to keep in mind:
  • Containers: nested OCI containers are not available inside the sandbox
  • GPU: accelerated compute is not yet available (coming soon)
  • Public ports: only port 8080 is available on the preview URL. Consider using a reverse proxy and path-based routing.
  • File visibility: only text files (i.e. source code) plus png, jpg, webp, gif, and pdf files sync back to the agent. Other binaries and ignored files won’t show up for the agent.
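Since only one port is public, path-based routing lets several services share it: the request path decides which backend answers. A minimal sketch using only the standard library (it binds an ephemeral port so it runs anywhere; in the sandbox you would bind 8080):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal path-based router: everything shares one port, and the path
# prefix decides which "backend" answers. A sketch, not a production proxy.
class Router(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"api backend" if self.path.startswith("/api") else b"frontend"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Router)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_port

api_resp = urllib.request.urlopen(f"http://127.0.0.1:{port}/api/users").read().decode()
root_resp = urllib.request.urlopen(f"http://127.0.0.1:{port}/").read().decode()
print(api_resp)   # → api backend
print(root_resp)  # → frontend
server.shutdown()
```

In practice you would more likely put a real reverse proxy (e.g. Caddy or nginx) on 8080 and forward path prefixes to the internal ports of each service.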

Tricks

You may treat the sandbox as a general-purpose computer. This means that you can:
  • install VSCode, start the server, and access it (plus the file viewer, editor, and terminal emulator) in your browser via the preview URL on port 8080
  • install Jupyter Lab, and similarly access notebooks, a Python REPL, and a terminal emulator from your browser
  • ask the agent to zip up entire directories, then start a file server with python -m http.server 8080 and download arbitrary files from the sandbox via the preview URL
  • let Maestro (or you) use browsers to view any servers you host on port 8080, including using the vision-based browser tools to navigate and debug your front-end systems
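The zip-and-serve trick above can be scripted from the standard library. A sketch with illustrative paths (shutil.make_archive appends the .zip extension itself):

```python
import os
import shutil
import tempfile

# Illustrative directory standing in for the agent's working files.
src = os.path.join(tempfile.gettempdir(), "demo_project")
os.makedirs(src, exist_ok=True)
with open(os.path.join(src, "notes.txt"), "w") as f:
    f.write("hello from the sandbox\n")

# Zip the whole directory into /tmp/demo_project.zip.
archive = shutil.make_archive(
    os.path.join(tempfile.gettempdir(), "demo_project"), "zip", src
)
print(os.path.basename(archive))  # → demo_project.zip

# In the sandbox, you would then serve it on the public port:
#   python -m http.server 8080
# and download it through the preview URL.
```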