Attack Methods

Attack methods are how Spec27 generates adversarial variants of your primary entries. Applying them in a specification produces a robust accuracy score — the share of primary entries that stay correct across all of their variants.
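
To make the definition concrete, here is a minimal sketch of how such a score could be computed. It assumes an entry counts as robust only when the primary entry and every one of its variants are answered correctly. The function name and the result layout are hypothetical and are not part of Spec27.

```python
from typing import Dict, List


def robust_accuracy(results: Dict[str, List[bool]]) -> float:
    """Share of primary entries whose primary run and every variant were correct.

    `results` maps a primary entry ID to correctness flags: the first flag is
    the primary entry, the rest are its variants (hypothetical layout).
    """
    if not results:
        return 0.0
    robust = sum(1 for flags in results.values() if all(flags))
    return robust / len(results)


example = {
    "entry-1": [True, True, True],   # correct on the primary and both variants
    "entry-2": [True, False, True],  # one variant fails, so not robust
    "entry-3": [True, True],         # robust
    "entry-4": [False, True],        # the primary itself is wrong
}
score = robust_accuracy(example)  # 2 of 4 entries are robust -> 0.5
```

Under this reading, a single failing variant is enough to mark the whole entry as not robust, which is why the score can sit well below plain accuracy on the primary entries.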

How you select attack methods depends on the team flow: Gold Team specifications select individual methods, grouped by category, while Red Team specifications select an attack suite.

Gold Team attack methods

Gold Team methods are organised into four categories. You can pick any combination in the specification form.

Natural Errors

Simulate the mistakes people naturally make when typing or entering text, to test how well an agent handles imperfect input. Examples include keyboard typos, touchscreen typos, added noise, repetition, whitespace changes, and stylistic variation.
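
As an illustration of what a keyboard typo variant might look like, the sketch below swaps one character for a neighbouring key. The adjacency map is partial and hypothetical, and this is not Spec27's own generator; it only shows the general idea.

```python
import random

# Partial QWERTY adjacency map (hypothetical, for illustration only).
NEIGHBOURS = {
    "a": "qwsz", "e": "wrds", "i": "uojk", "o": "ipkl",
    "n": "bhjm", "r": "etdf", "s": "awedxz", "t": "rygf",
}


def keyboard_typo(text: str, seed: int = 0) -> str:
    """Replace one character with an adjacent key, if any character qualifies."""
    rng = random.Random(seed)  # seeded so the same input gives the same variant
    candidates = [i for i, ch in enumerate(text) if ch.lower() in NEIGHBOURS]
    if not candidates:
        return text  # nothing in the map to perturb
    i = rng.choice(candidates)
    replacement = rng.choice(NEIGHBOURS[text[i].lower()])
    return text[:i] + replacement + text[i + 1:]


original = "reset my password"
variant = keyboard_typo(original, seed=7)
```

The variant has the same length as the original and differs in exactly one position, so the intent stays recognisable to a human reader.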

Semantic Variance

Test whether an agent understands the same intent when it is expressed with different wording, structure, or framing. Examples include paraphrasing (lexical, sentential, document, and ring paraphrase), lexical substitution, sentence-structure rewrites, causal reframings, and format changes.

Persona Variance

Evaluate how well an agent handles requests written by people with different backgrounds, communication styles, or professional expertise. Examples include ESL-style rewrites and broad or professional persona rewrites.

Other Variance

A broad range of input variations that do not fit the categories above. Examples include homoglyph substitution, a reversible Caesar cipher, bias substitution, reversal, and worked-example framing.
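
The Caesar cipher is a good example of why this category includes reversible transformations: the original text can be recovered exactly by applying the opposite shift. The sketch below is a generic implementation; the shift value of 3 is an assumption made for the example, and the page does not state which shift Spec27 applies.

```python
def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # spaces, digits, and punctuation pass through
    return "".join(out)


prompt = "Cancel my order"
encoded = caesar(prompt, 3)    # "Fdqfho pb rughu"
decoded = caesar(encoded, -3)  # back to "Cancel my order"
```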

The specification form lists every selectable method under its category, with a short description of each. Leave all methods unchecked to run only the primary entries, with no adversarial run and no robustness score.

Red Team attack suites

Red Team specifications use curated suites rather than individual method selection:

Light Suite — a fast pass: the top single-turn jailbreak attacks plus a multi-turn red-team attack.
Heavy Suite — a fuller sweep: a broader set of single-turn red-team attacks plus a multi-turn red-team attack.

Leave both suites unselected to create a baseline specification with only the primary entries. See Red-team multi-turn evaluation for the multi-turn adversarial flow.

Practical guidance

Choose methods by what you are testing: everyday input variation (Natural Errors, Semantic Variance, Persona Variance) versus resistance to misuse (Red Team suites).
For red teaming, start from a suite rather than assembling methods by hand.
Review robustness per attack method in the robustness views to find where an agent is weakest.
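
If you want to reproduce a per-method breakdown outside the product, the following sketch shows one way to do it. It assumes you have rows of (entry ID, method name, variant correct); that layout and the method names are hypothetical and do not describe a Spec27 export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def robustness_by_method(rows: Iterable[Tuple[str, str, bool]]) -> Dict[str, float]:
    """Per method, the share of entries whose variants from that method were all correct."""
    # An (method, entry) pair stays True only while every variant is correct.
    per_entry = defaultdict(lambda: True)
    for entry_id, method, correct in rows:
        key = (method, entry_id)
        per_entry[key] = per_entry[key] and correct

    totals = defaultdict(int)
    robust = defaultdict(int)
    for (method, _), ok in per_entry.items():
        totals[method] += 1
        robust[method] += int(ok)
    return {m: robust[m] / totals[m] for m in totals}


rows = [
    ("entry-1", "keyboard_typo", True),
    ("entry-1", "paraphrase", False),
    ("entry-2", "keyboard_typo", True),
    ("entry-2", "paraphrase", True),
    ("entry-2", "paraphrase", False),
    ("entry-3", "paraphrase", True),
]
scores = robustness_by_method(rows)
weakest = min(scores, key=scores.get)  # method with the lowest robustness
```

In this sample data the paraphrase variants break two of three entries, so that method surfaces as the weakest.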
