AIUC-1
Vendor Questionnaire
for AI Applications and AI Agents

Based on the AIUC-1 Framework and extensive research into best practices and emerging risks, this questionnaire helps organizations, businesses, and enterprises assess AI vendors—especially those whose systems generate content, automate decisions, or interact with users.

It covers key areas of risk including security, privacy, safety, governance, and regulatory compliance, and supports structured diligence across a range of AI systems and use cases.

A. Safety

A1

Mitigate generation of harmful outputs

For the purposes of this questionnaire, harmful outputs include responses from your AI system that may cause emotional distress, reinforce dangerous behavior, enable unsafe decision-making, or violate expectations in high-risk domains such as health, finance, or law. This includes both direct harms (e.g., offensive language, risky advice) and indirect harms (e.g., escalation of user distress, poor refusal behavior, or unsafe tone modulation). 1. Do you maintain a harm severity taxonomy to classify and triage AI-generated outputs? Describe the harm tiers or levels (e.g., low, moderate, severe), the criteria used to assign severity, and provide representative examples of each level. If available, include documentation or policy artifacts. 2. How do you evaluate your AI system’s behavior in response to emotionally charged prompts, such as distressed or angry user inputs? Describe the evaluation methodology and frequency. Provide examples of actual responses to both distressed and angry prompts. 3. What measures are in place to prevent escalation, de-escalate risk, or appropriately hand off to humans? 4. Does your AI system provide responses in high-risk domains such as health, financial, or legal topics? a. If yes, describe the types of responses typically provided and the guardrails or safeguards in place (e.g., disclaimers, refusal patterns, escalation triggers). b. If no, describe how the system avoids or refuses to engage in these topics. Provide sample responses. c. In either case, describe your audit process for reviewing the AI’s behavior in high-risk domains. Include recent findings if available. 5. Do you conduct regular audits or reviews focused on harmful or risky outputs in high-risk deployment domains? Describe audit frequency, scope, who conducts the reviews, and how findings are documented and remediated.

A2

Prevent adversarial inputs from causing harmful outputs

For the purposes of this questionnaire, adversarial inputs refer to prompts or interactions intentionally crafted to bypass an AI system’s safety policies—such as jailbreaks, prompt injections, obfuscated language, multi-step manipulation, or roleplay-based traps. This section evaluates your system’s ability to detect, block, resist, and respond to these inputs through testing, moderation layers, and ongoing review. 1. How do you evaluate your AI system’s behavior under adversarial prompting? a. Describe your evaluation or red-teaming process, including the types of attacks tested (e.g., roleplay, injections, chaining, obfuscation). b. How frequently are these evaluations conducted, and who is responsible? c. What severity classifications or failure thresholds do you use to track results? 2. Do you use real-time content moderation systems to detect and block adversarial prompts before they reach the model? a. Describe the rule-based, ML-based, or hybrid techniques in use. b. Have you implemented any model-side defenses such as refusal tuning or adversarial training? c. How do you measure coverage, latency, and false positive/negative rates? 3. How do you evaluate the performance and resilience of your input moderation systems? a. What metrics do you use to measure detection accuracy and evasion resistance? b. How often do you re-test these systems against new or emerging attack strategies? c. Do you benchmark them against internal or third-party adversarial inputs? 4. How does your system detect and respond to adversarial behavior in production? a. What types of behavior are flagged in real time (e.g., repeated probing, prompt tampering)? b. What automated or manual response mechanisms are in place (e.g., rate limiting, user blocks, alerts)? c. How are production responses logged, reviewed, and improved over time? 5. How are flagged or blocked prompts reviewed and analyzed on an ongoing basis? a. Who conducts prompt log reviews, and how frequently are they performed? b. What criteria guide triage, escalation, and resolution of incidents? c. How do findings from reviews feed back into your system improvements?

A3

Allow escalation of AI interactions to a human for review

1. Can users escalate AI interactions to a human for immediate review? If yes, describe how escalation is initiated (e.g., via UI, keywords, model-detected conditions). Include a walkthrough of the user steps and examples of escalation mechanisms, including any automation or AI-based detection involved. 2. Can users flag AI interactions for later review? If yes, describe how flagging works, including the user flow, interface elements, and an example of a typical flagging scenario. 3. What systems are in place to retain the history of user escalations and flagged interactions? Include details on retention periods, access controls, and any measures taken to protect this data. 4. Can you provide documentation or summaries of past user-initiated escalations that were reviewed by a human? What were the outcomes of these reviews, and how were they tracked or remediated? 4. Have you undergone any third-party audits or assessments of your escalation and flagging processes? If so, provide details on the scope, assessor, and any findings or remediations from the most recent evaluation.

A4

Run third-party safety testing regularly

1. How often do you conduct third-party safety testing or AI red-teaming for your products, and who performs them? Include the date and scope of your most recent engagement. 2. What were the key findings from your most recent third-party red-teaming? How were the identified vulnerabilities prioritized and addressed? 3. How do you verify that remediation efforts were effective? Include any re-testing procedures, signoff steps, or validation metrics used. 4. What criteria, standards, or protocols do your third-party testers follow during their evaluations? For example: AIUC-1, OWASP, MITRE, or internal red-teaming frameworks. 5. Do you track whether red-teaming results lead to safety improvements in your product? Describe how learnings are used in future development cycles or governance reviews. 6. Have you undergone any audits or assessments specifically focused on your third-party testing or red-teaming program? Include timing, scope, and outcomes of any external evaluations.

B. Security

B1

Limit AI access to external systems

For the purposes of this questionnaire, AI tool calls refer to actions initiated by the AI system that interact with external tools, services, APIs, or system components—such as retrieving files, triggering workflows, calling APIs, executing commands, or performing transactions. These capabilities may introduce security, operational, or compliance risk depending on how they are scoped and governed. 1. Do your AI systems have the ability to call external tools or systems (e.g., APIs, file systems, code execution, third-party services)? If yes, describe the criteria used to determine which tools are enabled and under what conditions the AI is permitted to initiate calls. 2. Do you detect and respond to anomalous or unexpected AI tool call usage? Describe your detection methods and response process. Include examples of anomalies identified and remediated in the past 12 months. 3. Are any AI-initiated tool calls subject to human approval prior to execution? If yes, describe the approval workflow, criteria for determining high-risk calls, and how approvals are documented. 4. Do you log AI tool calls? Describe what information is captured (e.g., inputs, timestamps, parameters, outcomes), how long logs are retained, and how they are protected. 5. How do you evaluate your AI systems for unauthorized or excessive tool usage? Include the frequency of evaluations, the types of misuse you look for, and how you address issues when found.

B2

Protect access to AI systems, data, and model assets

For the purposes of this questionnaire, AI systems include infrastructure used to train, manage, or deploy models; model artifacts themselves; training data and prompt logs; and associated APIs or interfaces used to operate or monitor AI components. 1. What access controls and logging practices are in place for AI training data, models, and outputs? Describe how access is restricted (e.g., RBAC, least privilege), how access is logged, and how these controls are maintained. 2. Is multi-factor authentication (MFA) required for access to AI model management or deployment systems? If so, specify which systems are covered and how MFA is enforced. 3. How do you monitor for anomalous API usage within your AI systems? Describe any detection tools, thresholds, or behavioral baselines used, and how anomalies are triaged and responded to. 4. How do you assess AI systems for unauthorized access risks? Outline your assessment process, including frequency, scope (e.g., penetration tests, cloud audits), and mitigation workflows for identified risks.

C. Privacy

C1

Promise customers they own AI outputs

For the purposes of this questionnaire, AI outputs refer to any model-generated content—text, code, scores, classifications, summaries, or recommendations—produced using input data provided by the buyer (us) or the buyer’s end users. This includes core product outputs as well as secondary outputs used in analytics, tooling, or internal processes. 1. How do you categorize AI outputs across your product and internal systems? Include both core product outputs and auxiliary outputs such as classifier results, analytics summaries, internal model logs, or retraining artifacts. 2. For each category of AI output, provide the following: a. Name of the output category b. A representative or illustrative example c. Who owns the AI output d. How the AI output is used (especially if repurposed beyond the user-facing product) e. How the output is stored and retained (include links to documentation or relevant user-facing policies) 3. Do you offer any contractual guarantees regarding AI output ownership, usage, or storage? If so, please provide the relevant contract language (e.g., MSA, DPA) or summary. 4. What operational or technical measures are in place to ensure compliance with your AI output ownership policy? 5. Have you conducted any third-party audits or internal assessments of compliance with your AI output ownership policy in the past 12 months? If yes, provide details and summaries of findings or remediation actions. 6. How will you communicate changes to your AI output ownership policy to us? Provide examples of recent updates and describe how they were communicated (e.g., dashboard notices, email, legal addenda).

C2

Don’t train on customer data without consent

1. Do you have a formal policy governing how customer or end-user data is used for AI training? Please summarize the policy and specify: a. Whether blanket consent is applied by default b. When explicit, case-by-case consent is required c. Whether consent can be retracted, and in what circumstances d. Whether (and how) we or our end users will be notified of policy changes 2. Are your AI training practices and consent requirements documented in our contract? If so, provide the relevant language from your MSA, DPA, or other contractual documents that describes limits on training, consent terms, or disclosure obligations. 3. How can we request the deletion of customer data previously used or eligible for AI training? Describe the process, timing, confirmation of deletion, and any exceptions. 4. How can our customers or end users request deletion of their data if it has been used for AI training (or could be)? Describe the available mechanisms, whether this is self-service or mediated through us, and any restrictions or limitations.

C3

Don’t allow AI access to PII

For the purposes of this questionnaire, personally identifiable information (PII) refers to any data that can directly or indirectly identify an individual. Because definitions vary by regulation and context, we ask that you describe your internal definition of PII and how it informs your practices. 1. How do you define PII within your organization, and what sources inform this definition (e.g., GDPR, CCPA, NIST)? Describe any categories or examples you explicitly include or exclude. 2. What mechanisms do you use to detect and redact unnecessary PII from AI inputs? Are these protections applied automatically? How do you ensure consistency and minimize false negatives? 3. Do you scan or review AI outputs for the presence of PII before delivery to the user? If yes, describe how this process is implemented and whether it runs in real time, batch mode, or asynchronously. Provide examples if available. 4. How do you evaluate the performance of your PII filtering systems (input and output)? Include any benchmarks, detection thresholds, test sets, or error rates. Share evaluation reports if available. 5. Have you undergone any third-party reviews or audits of your PII protection mechanisms in the past 12 months? If so, describe the scope, findings, and any remediations or improvements made as a result. 6. What policies or procedures are in place in the event that PII is inadvertently processed or exposed by an AI system? Include how incidents are detected, reported, triaged, and resolved.

C4

Log AI interactions with clear policies for retention and integrity

1. What are your logging practices for AI interactions? Describe: a. What is logged (e.g., inputs, outputs, system metadata, timestamps) b. How integrity is ensured (e.g., tamper-resistance, immutability) c. Whether logs are complete or selectively captured 2. What is your retention policy for AI interaction logs? Specify default retention periods. If different types of interactions (e.g., training logs, user prompts, tool calls) have different retention timelines, list them separately. 3. Are your logging and retention practices contractually defined? If so, provide the relevant sections or language from your MSA, DPA, or service terms that address log storage, access, or deletion commitments. 4. How do you enforce deletion of AI logs according to retention policies? Indicate whether deletion is automated, what systems enforce it, and whether audits are performed (and how frequently) to verify compliance.

D. Governance

D1

Clearly identify AI-generated content, conversations, and decisions

Answer the following questions where applicable based on your product’s capabilities. If a feature (e.g., generative content, AI-driven conversations, or automated decisions) does not apply to your system, you may indicate that clearly and skip the corresponding question. 1. How is AI-generated content labeled in your product? Describe the visual or textual indicators used (e.g., banners, icons, badges). Are these labels configurable or removable by us or by our end users? Please include examples, screenshots, or documentation if available. 2. Do you provide a disclosure statement at the beginning of AI-driven conversations (e.g., chat, voice, or phone-based interactions)? Describe when and how this disclosure is presented. Include representative language or transcripts if available. 3. Do you label or disclose when AI is involved in automated decision-making (e.g., filtering, ranking, approvals)? If so, describe the form and placement of the disclosure. Include examples of how this appears to users or affected individuals. 4. How do you manage updates to your labeling and disclosure practices? Describe how these updates are tracked and deployed, and how you ensure they remain compliant with emerging governance or regulatory requirements.

D2

Run AI models and store AI data in approved regions

For the purposes of this questionnaire, approved regions refer to geographic locations that you (the vendor) have designated as authorized for running AI workloads or storing AI-related data. These regions may be defined based on internal policy, customer contracts, or applicable legal and regulatory requirements. This section assesses how those region lists are maintained and enforced. 1. Do you maintain a list of approved regions for AI model execution and data storage? If yes, provide the current list and describe the criteria or frameworks (e.g., regulatory, contractual, internal risk) you use to approve or exclude regions. 2. How do you ensure that AI workloads are executed only within the approved regions? Describe the systems or enforcement mechanisms in place (e.g., cloud provider restrictions, deployment policy enforcement, monitoring). 3. What safeguards prevent AI data from transiting outside of approved regions? Include technical controls, encryption boundaries, routing constraints, or alerting mechanisms that support regional data residency. 4. Have you conducted any audits or assessments in the past 12 months to verify compliance with your approved-region restrictions? If so, summarize who conducted the review, what was assessed, and any findings or actions taken.

D3

Assess AI vendors for security, privacy, and compliance

For the purposes of this questionnaire, a third-party AI vendor is an external service provider that processes, transmits, or stores customer data on behalf of the primary AI system provider and applies generative artificial intelligence models to that data. These vendors typically qualify as subprocessors under data protection frameworks (e.g., SOC 2, GDPR), but this designation is limited here to those whose core function involves the use of AI systems—such as hosted foundation models, AI-powered feature layers (e.g., summarization, classification), or embedded LLM infrastructure. For each third-party AI provider, please describe: 1. The name of the vendor and a brief description of their role and the function they support within your system. 2. Whether this vendor’s obligations are obligated through contractual agreements in place with us. Please provide snippets from the MSA, DPA or other documentation that outline these agreements. 3. Whether this vendor has a contract or policy with you that explicitly prohibits them from using or training on our data without our prior written consent. Describe how this policy is enforced in practice. 4. Whether this vendor processes our personally identifiable information (PII), and whether there are contractual or technical controls that restrict PII processing, require redaction, or enforce data minimization. 5. Whether this vendor retains any user or system data, including for logging, auditing, or debugging purposes. If so, describe the types of data retained, the retention period, and whether it is linked to identifiable users. 6. What security certifications this vendor holds (e.g., SOC 2, ISO 27001). Please provide documentation or attestations for each certification. 7. Whether this vendor is contractually required to disclose changes in their security posture or risk profile, and how these changes are communicated to you and to us. 8. How this vendor is assessed on an ongoing basis against security, privacy, and responsible AI practices. Include frequency and scope of reassessment. 9. Whether there have been incidents or non-compliance issues involving this vendor in the past 24 months. If so, describe the issue and the remediation steps taken. 10. In which geographic regions this vendor operates its AI infrastructure (including training, inference, and fine-tuning workloads). 11. Whether this vendor has technical or procedural mechanisms in place to mitigate harmful outputs, adversarial prompts, or adversarial attacks (e.g., prompt injection, model exploitation). Please provide evidence of these safeguards, such as evaluation results, internal documentation, red-teaming summaries, or system design descriptions. 12. Whether this vendor has technical or procedural mechanisms in place to mitigate high-severity misuse risks, including (a) Deception or influence operations; (b) Cyber exploitation (e.g., vulnerability discovery, malware generation); and (c) Catastrophic misuse (e.g., CBRN, autonomous weaponization). Provide evidence of these safeguards, such as evaluation results, misuse red-teaming reports, policy thresholds, or internal documentation outlining how these scenarios are detected and handled. 13. Whether this vendor is subject to export controls related to AI models or infrastructure (e.g., U.S. EAR, ITAR). If so, describe how you confirm compliance.

D4

Align with AI regulation and bias/anti-discrimination law

This section focuses on how your organization tracks and complies with laws that govern AI systems, including both AI-specific regulations (e.g., EU AI Act, NYC AEDT Law) and general anti-discrimination or bias-related laws (e.g., GDPR, civil rights legislation, sector-specific rules in employment or finance). These laws may require obligations such as explainability, fairness, impact assessments, or public disclosures. Please answer the following questions based on your applicable systems and use cases. 1. Which AI-related or anti-discrimination laws does your organization consider your systems subject to? For each law or regulation you’ve identified (e.g., EU AI Act, NYC AEDT Law, GDPR Article 22), describe the relevant AI use cases and how you determined the law applies. 2. How do you monitor and manage your compliance obligations across different laws and jurisdictions? Describe the process for tracking compliance status, reviewing changes in regulatory requirements, and updating internal policies. 3. Do you have a team or function responsible for managing legal and regulatory risks related to AI and bias? If yes, describe its structure, responsibilities, and how it collaborates across legal, engineering, and product teams. 4. How can users, customers, candidates, or other affected individuals report concerns related to discrimination or fairness in your AI systems? Describe how these reports are submitted, reviewed, and resolved. 5. Do you perform legal or policy reviews of high-risk AI use cases before deployment? If yes, describe the criteria for what constitutes a high-risk use case, what the review process entails, and who is responsible for conducting it.

E. Efficacy

E1

Keep AI use within its intended scope

1. What is the intended scope of your AI product, and what use cases or behaviors are explicitly out of scope? Please provide documentation or summaries that define supported use cases, prohibited behaviors, and unsupported applications. Include examples of how these are communicated to users (e.g., UI disclosures, usage policies, API documentation). 2. What technical or procedural safeguards are in place to detect and reject out-of-scope requests or behaviors? Describe how these safeguards function at runtime or during user onboarding/configuration. 3. Do you evaluate your AI product to verify that its behavior remains within the defined scope? Provide examples of evaluations conducted in the past 12 months, including what scenarios were tested and what actions were taken based on the results. 4. Do you have a formal process for updating the intended scope and prohibited behaviors as the product evolves? Describe how changes are reviewed, approved, and reflected in documentation, product interfaces, and user-facing policies.

E2

Mitigate hallucinations

1. What techniques do you employ to detect or flag hallucinated or unreliable content in your AI product? Please provide documentation or examples of how these techniques are implemented in production, including any filtering, scoring, or user-facing indicators. 2. Does your system provide source attribution or citation for factual claims? If so, please describe how this feature works and include screenshots or UI examples. Indicate whether the citation is programmatically enforced, user-optional, or available on request. 3. What features or design choices have you implemented to help users understand when an AI-generated claim may be inaccurate, uncertain, or unsupported? This may include confidence signals, visual disclaimers, retrieval grounding, or prompt-based disclaimers. Please describe and provide examples. 4. How have you evaluated your system's performance in reducing or identifying hallucinations? Include any structured evaluations of (a) Factual accuracy (e.g., correctness against ground truth); (b) Logical consistency (e.g., internal contradictions or unsupported inferences); and (c) Structural integrity (e.g., broken references, incomplete citations, jumbled summaries). Please share findings, metrics, or reports from the past 12 months if available.

E3

Classify AI failures by severity and respond with internal review, customer disclosure, and support practices

1. How do you define and categorize AI incidents? Describe what constitutes an AI-related incident under your policy, including examples (e.g., model failures, tool misuse, safety violations, regulatory exposure). Explain how incidents are classified by severity and how these differ from routine product issues or bugs. 2. Do you maintain a severity-based incident response plan for AI failures? Describe how your AI incident response plan is structured. Include the severity tiers you use, how impact is assessed, and the corresponding escalation and resolution actions for each tier. Provide illustrative scenarios if available. 3. How do you conduct post-incident reviews for significant AI incidents? Detail your process for reviewing serious AI failures. Include when a review is triggered, who participates, how findings are documented, and how identified changes are tracked and implemented. Summarize a recent review (if shareable) to illustrate. 4. What is your process for disclosing high-impact AI incidents to customers? Describe the conditions under which customers are notified of AI incidents. Include how you determine materiality, what information is shared, the timeline for notification, and how ongoing transparency is maintained during resolution. 5. What commitments do you make to customers regarding AI failure response and support? Explain how you communicate and uphold your AI incident response commitments including: a. Operational support (e.g., service-level agreements, incident response timelines) b. Legal practices (e.g., notification obligations) c. Financial remedies (e.g., indemnities, credits, insurance coverage)

E4

Measure and reduce unfair bias in AI outcomes

1. How do you evaluate AI outcomes for potential bias across relevant demographic or stakeholder groups? Please describe your evaluation process, including which attributes or fairness metrics are considered and how often evaluations are performed. Provide examples or summaries of recent evaluations. 2. What actions do you take when material fairness issues are identified in your AI systems? Describe your remediation process and decision-making criteria. Share examples of past fairness issues and how they were addressed. 3. How do you determine which AI systems are considered high-risk in terms of fairness impact? Provide criteria or frameworks used in this classification, and list examples of systems that you’ve classified as high-risk. 4. Do you validate fairness in high-risk systems through independent, third-party audits? If yes, provide details of your most recent fairness audit, including the auditor, scope, findings, and remediation steps taken. 5. What mechanisms are in place for ongoing monitoring and improvement of fairness in your AI systems? Describe tools, frameworks, or governance processes used to track fairness over time and adapt to changing conditions or data distributions.

F. Society

F1

Prevent AI behaviors that mislead or manipulate users

For the purposes of this questionnaire, deceptive AI behavior refers to outputs that may cause users to misinterpret the model’s identity, authority, intent, or emotional state. This includes impersonation, claims of false credentials, simulated trust or empathy, or language that may manipulate or unduly influence users. The questions below assess your safeguards against these risks. 1. How do you prevent your AI system from generating outputs that simulate identity, credentials, or emotional intent? a. Describe any refusal behaviors, prompt filters, or tuning approaches used to block impersonation, false authority, or emotional simulation. b. Include examples of restricted roles (e.g., “as a doctor…”) or blocked capabilities that address deception risk. 2. Do you monitor for AI-generated outputs that could mislead or manipulate users? a. Describe any logging or flagging systems in place for suspected manipulative or misleading outputs. b. How often are these logs reviewed, and by whom? c. What actions are taken based on the findings (e.g., retraining, escalation, updates)? 3. How do you assess whether your AI system tends to mislead users about its capabilities, authority, or identity? a. Describe any evaluations, audits, or scenario testing conducted to identify user-facing deception risks. b. Do you evaluate for patterns like simulated authority, emotional influence, or trust-building language? 4. What internal policies or operational practices guide how you address AI-driven deception or manipulation risks? a. Do you maintain internal guidance or design principles that prohibit certain types of outputs? b. How are these policies communicated across product, engineering, and safety teams? 5. Have there been any known incidents of your AI system misleading users through its outputs? If yes, describe the incident(s), how the issue was detected, and what mitigations were implemented to prevent recurrence.

F2

Prevent AI-enabled cyber exploitation

For the purposes of this questionnaire, cyber exploitation refers to the misuse of AI systems to assist with malicious technical activity—including vulnerability discovery, exploit generation, malware development, and scalable abuse of APIs, infrastructure, or system misconfigurations. These questions assess your safeguards across technical filtering, monitoring, evaluation, and model-level restrictions. 1. How do you restrict your AI from generating outputs that could assist in cyber exploitation or scalable abuse? a. Describe refusal behaviors, technical filters, and other enforcement mechanisms in place. b. What types of capabilities or outputs are explicitly restricted (e.g., exploit generation, bypass techniques, automated tooling)? c. Share examples of restrictions implemented in practice. 2. Do you log or flag prompts related to technical misuse or abuse attempts? a. What types of prompts are logged or flagged (e.g., exploit crafting, enumeration attempts)? b. How frequently are these logs reviewed, and by whom? c. What actions are taken based on those reviews (e.g., filter updates, escalation, training data changes)? 3. How do you evaluate your AI systems for susceptibility to cyber misuse? a. What scenarios or misuse vectors do you test (e.g., privilege escalation, malware automation, recon tooling)? b. Who conducts these evaluations (internal teams, red teams, external auditors)? c. How often are these evaluations performed, and how are results used to improve safeguards? 4. Do you review and document your model’s policy on offensive security use? a. Are use cases like penetration testing, red teaming, or exploit development explicitly allowed or restricted? b. How do you communicate these restrictions internally and/or to users? c. If you use third-party models, do you assess their stance on offensive cybersecurity enablement?

F3

Prevent catastrophic misuse (CBRN)

For the purposes of this questionnaire, catastrophic harm refers to AI outputs that could materially assist in the creation, deployment, or proliferation of Chemical, Biological, Radiological, or Nuclear (CBRN) threats—or other forms of extreme, high-impact misuse. This includes step-by-step guidance, novel synthesis pathways, or outputs that could directly lower the barrier to catastrophic capabilities. 1. How do you prevent your AI system from producing outputs that contain or enable access to CBRN-related or other catastrophic misuse knowledge? Describe the filtering, refusal mechanisms, or redaction systems used. Include examples of prohibited queries or system-level restrictions designed to address these risks. 2. Do you maintain logs of prompts or outputs related to CBRN or catastrophic misuse? If so, describe what is logged, how frequently these logs are reviewed, and how novel misuse scenarios are identified and escalated. 3. How do you evaluate your AI system’s vulnerability to CBRN or other catastrophic misuse scenarios? Describe how you conduct red-teaming or adversarial evaluations for high-risk misuse, including prompt design, test cases, and evaluation cadence. Include any evaluations performed in the last 12 months. 4. How do you assess the effectiveness of your CBRN-related safeguards? What metrics, performance thresholds, or real-world testing methods do you use to validate that filtering and mitigation systems are functioning as intended? 5. Have any third-party audits or evaluations been conducted on your CBRN safeguards? If yes, provide details of the most recent external reviews, including scope, evaluator, findings, and remediation actions taken.

F4

Comply with export controls and national security regulations

1. What measures do you have in place to prevent use in sanctioned or restricted jurisdictions? Please describe geofencing, access controls, or usage restrictions. Provide evidence of how these controls are implemented and enforced in practice. 2. Is your AI product, or any of its components, subject to export controls or national security regulations (e.g., EAR, ITAR, dual-use classification)? If you answered yes to Question 2, please also answer the following: 3. Please specify which jurisdictions and laws apply, and provide documentation (e.g., export classification, licensing status, or internal determinations). 4. Do you review new AI capabilities for export control or dual-use risks before release? Describe the review process, frequency, responsible roles, and criteria used to flag potential national security concerns. 5. Have you undergone any third-party audits, legal reviews, or regulatory consultations regarding export control compliance in the last 12 months? If so, please provide summaries of findings and any resulting policy or product changes. 6. How do you stay current with evolving export control and national security regulations that may affect your AI product? Describe how changes are tracked and how you ensure timely updates to your compliance posture.