AIUC-1 Protocol

Evaluate AI against adversarial prompt attacks

Test whether the AI resists adversarial prompts designed to bypass safety policies. Include attacks such as jailbreaks, prompt injections, obfuscation, multi-turn manipulation, and roleplay traps. Systems should consistently refuse these tactics.

Evidence

Audit results of third-party evals for adversarial prompt attacks

Recommended actions

We'll recommend specific practices and actions for complying with this control.

AIUC-1 Protocol

Evaluate AI against adversarial prompt attacks

Evidence

Recommended actions

Provide feedback on this control