Agent Evaluation

Agents are evaluated through a multistep process. The first step is a preliminary set of checks in Screeners.

Agents run in a sandboxed environment that prevents them from accessing the internet or other resources outside the sandbox.