Troubleshooting

A run did not start

Check:

the eval has at least one linked agent
the eval has at least one linked specification
any dependent entries or preparation steps are complete
organization usage limits have not blocked the work
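The checks above can be read as a pre-flight routine. The sketch below is illustrative only: `EvalSetup`, its fields, and `run_blockers` are hypothetical names invented for this example and do not correspond to a documented API. It assumes you have already gathered the relevant state by hand from the eval and organization pages.

```python
from dataclasses import dataclass, field


@dataclass
class EvalSetup:
    """Hypothetical snapshot of an eval's setup; not a real product object."""
    linked_agents: list[str] = field(default_factory=list)
    linked_specifications: list[str] = field(default_factory=list)
    pending_preparation_steps: list[str] = field(default_factory=list)
    usage_limit_reached: bool = False


def run_blockers(setup: EvalSetup) -> list[str]:
    """Return, in checklist order, the reasons a run would not start."""
    blockers = []
    if not setup.linked_agents:
        blockers.append("no linked agent")
    if not setup.linked_specifications:
        blockers.append("no linked specification")
    if setup.pending_preparation_steps:
        pending = ", ".join(setup.pending_preparation_steps)
        blockers.append("preparation incomplete: " + pending)
    if setup.usage_limit_reached:
        blockers.append("organization usage limit reached")
    return blockers
```

An empty list means none of the four conditions applies, so the cause is likely elsewhere.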

Agent preview is blocked

Check:

whether the agent requires missing Secrets
whether the secret key names match what the agent expects
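A key-name mismatch is easy to miss by eye, so it can help to compare the two lists mechanically. The following is a minimal sketch, assuming you have copied the key names the agent expects and the key names you configured into two plain lists. `diagnose_secrets` and the sample key names are hypothetical, and treating case or whitespace differences as a mismatch is an assumption about how names are compared.

```python
def diagnose_secrets(required: list[str], configured: list[str]) -> dict[str, list]:
    """Split required keys into truly missing ones and near-miss name mismatches."""
    configured_set = set(configured)
    # Map a normalized form (trimmed, lowercased) back to the configured spelling.
    normalized = {name.strip().lower(): name for name in configured}
    missing, mismatched = [], []
    for key in required:
        if key in configured_set:
            continue  # exact match, nothing to report
        near = normalized.get(key.strip().lower())
        if near is not None:
            mismatched.append((key, near))  # same name apart from case or whitespace
        else:
            missing.append(key)
    return {"missing": missing, "mismatched": mismatched}


report = diagnose_secrets(["API_TOKEN", "DB_URL"], ["api_token"])
print(report)
```

Entries under `mismatched` point to a configured secret that probably needs renaming; entries under `missing` need to be added.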

I do not see an asset I expected to see

Check:

the active organization
whether the asset belongs to a different project
whether ownership or visibility rules are filtering it out

A specification is stuck in preparing or has failed

Check:

whether the primary entries are valid and complete
whether the selected attack methods still make sense for the workflow
whether the failure state on the specification detail page gives a specific error
whether retrying preparation is available from the detail page

A judge-based score looks wrong

Check:

the judge configuration
the selected built-in judge version
any shared context you added
the sample input and output you used during judge testing
whether judge-based scoring is the right method for the task

My results are hard to interpret

Start with:

run status
latest step
per-row correctness
console output or error details

Then compare the run setup against the eval and specification.

I hit a usage limit sooner than expected

Check:

the current organization plan
consumed versus reserved units
whether previews, specification preparation, or queued runs have already allocated usage in the current week
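The distinction between consumed and reserved units matters because both can reduce what is left. As a rough sketch, assuming reserved units count against the same weekly limit as consumed units (an assumption; confirm this against your plan details), the remaining headroom works out as follows. The function name and the numbers are made up for illustration.

```python
def remaining_units(weekly_limit: int, consumed: int, reserved: int) -> int:
    """Units still available this week, never reported below zero."""
    # Reserved units (for example queued runs or in-progress preparation)
    # are assumed to be held against the limit before they are consumed.
    return max(weekly_limit - consumed - reserved, 0)


# Illustrative numbers only: 400 consumed plus 350 reserved out of 1000.
print(remaining_units(weekly_limit=1000, consumed=400, reserved=350))  # 250
```

Under this assumption, a limit can appear to be hit early when consumed usage alone looks low but queued or in-progress work has already reserved the rest.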
