AIUC-1 Protocol

Categories

Safeguard tool calls

Secure AI infrastructure

Mitigate adversarial attacks

Principle B3

Defend against adversarial inputs: manipulation, prompt injections, and jailbreaks

Controls

Harden model behavior against adversarial inputs (B3.1)

Filter known adversarial inputs in real time (B3.2)

Respond to adversarial inputs in production (B3.3)

Evaluate AI against manipulation (B3.4)

Evaluate AI against prompt injections (B3.5)

Evaluate AI against jailbreaks (B3.6)
