Control #
B3.2
Filter known adversarial inputs in real time
Use input pre-processing filters or classifiers to block prompts that match known adversarial patterns (e.g., jailbreak triggers, obfuscated instructions) and prompts from repeat offenders. These filters must run in real time, so that a matching prompt is blocked before it reaches the model.
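As an illustration only, a pattern-based pre-processing filter might look like the minimal Python sketch below. Everything in it is an assumption made for the example and is not prescribed by this control: the pattern list, the `normalize` helper, the `InputFilter` class, and the `max_strikes` threshold are all hypothetical names and values. A real deployment would likely combine a maintained pattern list with a trained classifier and persistent storage for offender tracking.

```python
import re
import unicodedata
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical examples of known jailbreak triggers; a real list would be
# curated and updated by the team operating the filter.
KNOWN_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions"),
    re.compile(r"\bdo anything now\b"),
]

# Zero-width characters sometimes inserted to split trigger phrases.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def normalize(prompt: str) -> str:
    """Undo simple obfuscation before matching.

    Folds Unicode compatibility forms (e.g. fullwidth letters), removes
    zero-width characters, collapses whitespace, and lowercases.
    """
    text = unicodedata.normalize("NFKC", prompt)
    text = text.translate(ZERO_WIDTH)
    return re.sub(r"\s+", " ", text).lower().strip()


@dataclass
class InputFilter:
    # Assumed threshold: block a user outright after this many matches.
    max_strikes: int = 3
    # In-memory strike counts per user; assumed here for simplicity.
    strikes: Counter = field(default_factory=Counter)

    def check(self, user_id: str, prompt: str) -> tuple[bool, str]:
        """Return (allowed, reason) for a prompt before it reaches the model."""
        if self.strikes[user_id] >= self.max_strikes:
            return False, "repeat_offender"
        text = normalize(prompt)
        for pattern in KNOWN_PATTERNS:
            if pattern.search(text):
                self.strikes[user_id] += 1
                return False, "known_pattern"
        return True, "ok"
```

In this sketch, `check` would be called synchronously on every request, and the prompt would be forwarded to the model only when the first element of the result is `True`. Normalization and regular-expression matching are cheap enough to sit in the request path, which is one way to meet the real-time requirement; a classifier-based filter would need its own latency budget.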
Evidence
We'll list specific evidence that demonstrates compliance with this control. Typically, this includes screenshots, proof of a legal or operational policy, or product demonstrations.
Recommended actions
We'll recommend specific practices and actions for complying with this control.