Evals

Use an eval when you want a saved, reusable evaluation setup that your team can run again later.

Before You Begin

  • You have a project.
  • You have at least one agent.
  • You have at least one specification.

What an Eval Connects

An eval links:

  • one or more agents
  • one or more specifications

The eval is the saved setup that combines these reusable assets into a single named workflow.
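As a rough mental model, the relationship can be sketched in Python. This is not the product's API: the `Eval` class, its field names, and the example IDs are all hypothetical and only illustrate the "one or more agents, one or more specifications" rule described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eval:
    """Hypothetical model of a saved eval: a name plus linked asset IDs."""

    name: str
    agent_ids: tuple[str, ...]
    specification_ids: tuple[str, ...]

    def __post_init__(self) -> None:
        # An eval links one or more agents and one or more specifications,
        # so an empty list on either side is rejected.
        if not self.name.strip():
            raise ValueError("an eval needs a descriptive name")
        if not self.agent_ids:
            raise ValueError("an eval must link at least one agent")
        if not self.specification_ids:
            raise ValueError("an eval must link at least one specification")


# Example IDs are made up for illustration.
nightly = Eval(
    name="nightly-regression",
    agent_ids=("support-agent",),
    specification_ids=("refund-policy", "tone-guidelines"),
)
```

The object is frozen to mirror the idea that an eval is a saved, stable setup: the same named combination of assets is what gets run each time.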

Create an Eval

  1. Open Evals inside a project.
  2. Create a new eval and give it a descriptive name.
  3. Select the agents the eval should use.
  4. Select the specifications the eval should use.
  5. Save the eval and open the detail page.
  6. Review the overview, schedule, linked agents, linked specifications, and results summary.
  7. Start a run from the eval detail page when you are ready.

Review the Eval Detail Page

Use the eval detail page to confirm that the saved setup matches your intent before you run it. Review:

  • the eval overview
  • any configured schedule
  • linked agents
  • linked specifications
  • the latest results summary
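If you keep a written record of what an eval is supposed to include, the pre-run check can be expressed as a simple comparison. The sketch below is only an illustration: `setup_mismatches` is a hypothetical helper, and both inputs are assumed to be entered by hand from your own notes and from what the detail page shows, not fetched from any product API.

```python
def setup_mismatches(
    intended: dict[str, set[str]],
    linked: dict[str, set[str]],
) -> list[str]:
    """Return readable differences between intended and linked assets."""
    problems: list[str] = []
    for kind in ("agents", "specifications"):
        want = intended.get(kind, set())
        have = linked.get(kind, set())
        singular = kind[:-1]  # "agents" -> "agent"
        # Assets you expected to see but that are not linked.
        for name in sorted(want - have):
            problems.append(f"missing {singular}: {name}")
        # Assets that are linked but were not part of the plan.
        for name in sorted(have - want):
            problems.append(f"unexpected {singular}: {name}")
    return problems


# Hypothetical values: what you planned versus what the detail page lists.
intended = {
    "agents": {"support-agent"},
    "specifications": {"refund-policy", "tone-guidelines"},
}
linked = {
    "agents": {"support-agent", "draft-agent"},
    "specifications": {"refund-policy"},
}

for problem in setup_mismatches(intended, linked):
    print(problem)
# prints:
# unexpected agent: draft-agent
# missing specification: tone-guidelines
```

An empty result means the linked agents and specifications match the plan, which is the condition to confirm before starting a run.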

When to Use Evals

Use evals when the setup should be reusable, named, and easy for a team to run again later.

Evals are a good fit when you want:

  • a stable workflow that combines saved assets
  • recurring or repeatable runs
  • a shared entry point for teammates
  • consistent review of results over time

Important Notes

  • Evals are the best place for repeatable team workflows.
  • Some projects also configure schedules for recurring eval runs.
  • Eval detail pages summarize the latest run and any configured schedule.
  • Run details preserve status, outputs, and robustness-related summaries.