Datasets

Use Datasets to define the inputs and expected behavior your evaluations will use.

Before You Begin

You have a project.

What a Dataset Contains

For the current user-facing workflow, dataset entries typically contain:

input_text
expected_output
optional category

Use category when you want to group cases into meaningful slices for later analysis.

Dataset Types

A Primary dataset is the starting point for evaluation.
An Adversarial dataset is derived from a primary dataset and linked back to it.
Parent datasets can show their related adversarial datasets in the detail view.

Create a Primary Dataset

Open Datasets inside a project.
Create a primary dataset for the workflow you want to evaluate.
Give the dataset a clear name that reflects the workflow or behavior under test.
Add entries manually, or use CSV import if you already have source data.
Review the saved dataset on the detail page.

Add and Review Dataset Entries

Review each entry to confirm that the fields match the evaluation method you plan to use:

input_text contains the user input or test prompt
expected_output reflects the intended answer or acceptable target
category is used consistently when you want grouped analysis later

Use categories for slices such as product area, failure mode, or scenario type. Consistent categories make the results views easier to interpret.

Import and Export CSV Data

Use CSV import when you already have a case list outside Spec27.
Map columns for input text, expected output, and optional category.
Export CSV when you want to inspect the current state of the dataset outside the app.

CSV import is useful when you are onboarding an existing case set. CSV export is useful when you want to share the current dataset with teammates or review it outside Spec27.

Create Adversarial or Derivative Datasets

Create an adversarial or derivative dataset when you want to preserve the original primary dataset and explore transformed variants separately.

This is useful when you want to:

keep a stable baseline dataset
compare baseline and transformed behavior
organize generated or attack-based cases without mixing them into the original source set

Best Practices

Keep one dataset focused on one test surface.
Use categories consistently if you want meaningful result slices later.
Prefer creating an adversarial dataset over editing the primary dataset when you want to preserve a baseline.
Expect some delete actions to be blocked when related work depends on the dataset.

Before You Begin​

What a Dataset Contains​

Dataset Types​

Create a Primary Dataset​

Add and Review Dataset Entries​

Import and Export CSV Data​

Create Adversarial or Derivative Datasets​

Best Practices​

Related Pages​