Specifications
Use Specifications to define the reusable evaluation recipe for a task.
Before You Begin
- You have a project.
- You have at least one primary dataset.
- You know whether you want attack methods, adversarial datasets, or judge-based scoring.
What a Specification Can Include
A specification can include:
- a primary dataset
- attack methods
- adversarial dataset selections
- an evaluation method
- optional judge configuration
- additional context
- an execution mode
Specifications also carry a team type:
- Gold Team
- Red Team
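To make the parts listed above concrete, here is a minimal sketch of a specification as a plain data record. It is illustrative only: the class names, field names, and the sample dataset, attack, and judge names are all hypothetical placeholders, and nothing here reflects Spec27's actual storage format or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class TeamType(Enum):
    GOLD = "Gold Team"
    RED = "Red Team"


class ExecutionMode(Enum):
    BATCH = "Batch"
    ITERATIVE = "Iterative"


@dataclass
class Specification:
    # Hypothetical field names that mirror the list of parts above.
    primary_dataset: str
    evaluation_method: str  # "strict equality", "permitted values", or "judge"
    team_type: TeamType
    execution_mode: ExecutionMode
    attack_methods: list[str] = field(default_factory=list)
    adversarial_datasets: list[str] = field(default_factory=list)
    judge_configuration: Optional[str] = None  # optional, per the list above
    additional_context: str = ""


# Placeholder values for illustration only.
spec = Specification(
    primary_dataset="example-primary-dataset",
    evaluation_method="judge",
    team_type=TeamType.RED,
    execution_mode=ExecutionMode.BATCH,
    attack_methods=["example-attack-method"],
    judge_configuration="example-judge-config",
)
```

The required fields in this sketch are the ones every specification in this guide appears to have; treating the remaining fields as optional is an assumption.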
Create a Specification
- Open Specs inside a project.
- Choose the team flow first: Gold Team or Red Team.
- Choose the primary dataset you want to test.
- Set the evaluation method:
  - strict equality
  - permitted values
  - judge
- Add any attack methods you want Spec27 to use.
- If relevant, include adversarial dataset selections.
- If the evaluation method is judge, choose the judge configuration that should score outputs.
- Save the specification and open the detail page.
- Review the status, datasets, linked eval usage, and results summary.
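The three evaluation methods named in the steps above can be pictured as three scoring functions. This is a rough sketch of one plausible reading, not Spec27's implementation: the function names are hypothetical, and this guide does not say whether comparisons normalize case or whitespace, so the sketch compares strings exactly.

```python
from typing import Callable, Iterable


def score_strict_equality(output: str, expected: str) -> bool:
    # Passes only when the output matches the expected value exactly.
    return output == expected


def score_permitted_values(output: str, permitted: Iterable[str]) -> bool:
    # Passes when the output is any one of the allowed values.
    return output in set(permitted)


def score_with_judge(output: str, judge: Callable[[str], bool]) -> bool:
    # Delegates the decision to a judge; here the judge is a stand-in callable.
    return judge(output)
```

For example, `score_permitted_values("b", ["a", "b"])` passes, while `score_strict_equality("Yes", "yes")` fails under the exact-match assumption.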
Review Preparation Status and Execution Mode
Specification status can move through:
- Preparing
- Ready
- Failed
Execution mode can be:
- Batch
- Iterative
Check the specification detail page after saving so you can confirm that the configuration is valid and ready to use in an eval.
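The check described above amounts to a simple gate on status. The sketch below assumes that only a specification in the Ready state should be attached to an eval; that rule, and the names `SpecStatus` and `usable_in_eval`, are assumptions for illustration rather than documented behavior.

```python
from enum import Enum


class SpecStatus(Enum):
    PREPARING = "Preparing"
    READY = "Ready"
    FAILED = "Failed"


def usable_in_eval(status: SpecStatus) -> bool:
    # Assumption: Preparing and Failed specifications are not yet usable.
    return status is SpecStatus.READY
```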
Reuse a Specification Across Runs
A specification describes what should be tested and how it should be scored. It is not the run itself.
Because of that separation, you can reuse one specification across multiple evals and multiple runs without recreating the evaluation recipe each time.
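The separation between recipe and run can be sketched as two record types, where many runs point at one shared specification. All names below (`Specification`, `Run`, the eval and run labels) are hypothetical and chosen only to illustrate the relationship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specification:
    # The recipe: what to test and how to score it.
    name: str
    evaluation_method: str


@dataclass
class Run:
    # One execution; it references a specification rather than copying it.
    spec: Specification
    eval_name: str
    label: str


spec = Specification(name="example-spec", evaluation_method="judge")

# The same specification is reused across two evals and three runs.
runs = [
    Run(spec, "eval-a", "run-1"),
    Run(spec, "eval-a", "run-2"),
    Run(spec, "eval-b", "run-1"),
]
```

Marking the specification as frozen in this sketch reflects the idea that runs share one stable recipe; whether a saved specification can be edited later is not covered here.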
Important Notes
- A specification can be reused across multiple evals.
- A specification can include multiple attack methods for the same primary dataset.
- In plain terms, a specification is the evaluation recipe, not the run itself.
- Red Team specifications use judge-based evaluation.
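The last note can be expressed as a small consistency check. This is a sketch under assumptions: the function name and the message text are hypothetical, and this guide does not state whether the product enforces the rule at save time or simply preselects the judge method.

```python
def check_team_and_method(team_type: str, evaluation_method: str) -> list[str]:
    """Return a list of problems; an empty list means the pairing is consistent."""
    problems = []
    # Per the note above, Red Team specifications use judge-based evaluation.
    if team_type == "Red Team" and evaluation_method != "judge":
        problems.append("Red Team specifications should use the judge evaluation method.")
    return problems
```

A Gold Team specification passes this check with any of the three evaluation methods, since the notes above place no restriction on it.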