AIUC-1
Capability-Specific Evidence

Capability: Text AI

Requirement
·
Mandatory Requirement
Control activity
Documenting foundation model provider IP protections which may serve as primary infringement safeguards. For example, indemnification clauses or copyright/trademark guardrails.
Evidence
A007.1 Documentation: Model provider IP infringement protections
Foundation model provider contract, terms of service, or data processing agreement showing IP protection commitments including copyright/trademark handling policies, indemnification clauses, liability coverage, and any documented limitations or exclusions. May include vendor questionnaire responses or certification documents addressing IP protections.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing supplementary content filtering mechanisms where provider protections have gaps or limitations. For example, detecting copyrighted material in outputs, implementing trademark screening.
Evidence
A007.2 Config: IP infringement filtering
Code, API configuration, or filtering system showing detection of copyrighted material, trademark screening, or content validation checks applied to AI outputs - this could be pattern matching logic, third-party API integration (e.g. copyright detection services), or custom filtering rules.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing user guidance and guardrails to reduce IP risk. For example, usage policies that explain prohibited content types, user warnings in product, restricting output generation in known infringement domains.
Implementing restrictions in AI acceptable use policy.
Evidence
A007.3 Logs: User-facing notices
User-facing IP risk guidance - may include warning messages when attempting high-risk operations, help center articles about IP infringement guidance, or UI elements explaining prohibited use cases.
Tags
Supplemental Control
Technical Implementation
Product
Acceptable Use Policy
Requirement
·
Optional Requirement
Control activity
Integrating automated moderation tools to filter inputs before they reach the foundation model. For example, integrating third-party moderation APIs, implementing custom filtering rules, configuring blocking or warning actions for flagged content, and establishing confidence thresholds based on risk category and severity.
Evidence
B005.1 Config: Input filtering
Moderation tool integration showing API configuration, filtering rules, action settings (block/warn/modify), and confidence thresholds for different violation categories - this could be screenshots of configuration files, admin dashboard settings, or API integration code. Example moderation tools: OpenAI Moderation API, Claude content filtering, VirtueAI/Hive/Spectrum Labs.
Tags
Mandatory Control
Technical Implementation
Eng: User LLM input filtering logic
Engineering Tooling
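As an illustration of what B005.1 evidence might look like in code, the following is a minimal sketch of per-category confidence thresholds mapped to block or warn actions. The category names, threshold values, and the `decide` function are hypothetical; the scores are assumed to come from whichever moderation tool is in use, and no specific vendor API is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    threshold: float  # minimum classifier confidence that triggers the action
    action: str       # "block" or "warn"


# Hypothetical categories and thresholds, for illustration only.
RULES = {
    "violence": Rule(threshold=0.80, action="block"),
    "harassment": Rule(threshold=0.60, action="warn"),
}

SEVERITY = {"allow": 0, "warn": 1, "block": 2}


def decide(scores: dict[str, float]) -> str:
    """Return the most severe action triggered by the category scores."""
    decision = "allow"
    for category, score in scores.items():
        rule = RULES.get(category)
        if rule is None or score < rule.threshold:
            continue  # unknown category or below threshold: no action
        if SEVERITY[rule.action] > SEVERITY[decision]:
            decision = rule.action
    return decision
```

Keeping thresholds in a single table like `RULES` makes the configuration easy to screenshot or export as evidence, and easy to adjust per risk category.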
Requirement
·
Optional Requirement
Control activity
Documenting the moderation logic and rationale. For example, explaining chosen moderation tools, threshold justifications, and decision criteria for different risk categories.
Evidence
B005.2 Documentation: Input moderation approach
Document explaining moderation approach including tool selection rationale, threshold settings with justifications, action logic for different violation types, and examples of how different input categories are handled.
Tags
Supplemental Control
Technical Implementation
Internal processes
Engineering Practice
Requirement
·
Optional Requirement
Control activity
Providing feedback to users when inputs are blocked.
Evidence
B005.3 Demonstration: Warning for blocked inputs
User-facing messages or UI flows showing how blocked inputs are communicated to users - this could be error messages, warning dialogs, or alternative suggestions provided when content is filtered.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Optional Requirement
Control activity
Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations.
Evidence
B005.4 Logs: Input filtering
Logging system showing how flagged inputs are captured, what metadata is included/excluded for privacy, retention policies, and audit trail - may include privacy documentation explaining logging disclosures to users.
Tags
Supplemental Control
Technical Implementation
Logs
Requirement
·
Optional Requirement
Control activity
Periodically evaluating filter performance and adjusting thresholds accordingly. For example, accuracy, latency, false positives/negatives.
Evidence
B005.5 Documentation: Input filter performance
Report or dashboard showing analysis of filter performance metrics (false positives, false negatives, accuracy, latency) and documented threshold adjustments made based on performance data - should include timestamps and rationale for changes.
Tags
Supplemental Control
Technical Implementation
Engineering Practice
Requirement
·
Mandatory Requirement
Control activity
Limiting the results shown in outputs to only those that are relevant, balancing security and utility. For example, character limits, limits on inference time.
Evidence
B009.1 Config: Output volume limits
Code or configuration implementing output restrictions - may include character or token limits, inference time limits, result count restrictions, or timeout configurations preventing excessive output. Can be demonstrated by product demo showing system timeout when requesting output exceeding limits.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Product
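One possible shape for B009.1 evidence is a small function that caps both the number of results and the total characters returned. This is a sketch under stated assumptions: `MAX_RESULTS`, `MAX_CHARS`, and `limit_output` are hypothetical names and values, not part of the standard.

```python
MAX_RESULTS = 5    # hypothetical cap on the number of results returned
MAX_CHARS = 2000   # hypothetical cap on total characters across results


def limit_output(
    results: list[str],
    max_results: int = MAX_RESULTS,
    max_chars: int = MAX_CHARS,
) -> tuple[list[str], bool]:
    """Apply result-count and character limits.

    Returns the limited results and a flag telling the caller whether
    anything was dropped or cut, so the UI can show a truncation notice.
    """
    limited: list[str] = []
    remaining = max_chars
    truncated = len(results) > max_results
    for item in results[:max_results]:
        if remaining <= 0:
            truncated = True  # character budget exhausted; drop the rest
            break
        if len(item) > remaining:
            item = item[:remaining]  # cut the item to fit the budget
            truncated = True
        limited.append(item)
        remaining -= len(item)
    return limited, truncated
```

The returned flag is what a product would use to drive the user-facing notice described under B009.2.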
Requirement
·
Mandatory Requirement
Control activity
Providing user-facing notices or documentation about output limitations.
Evidence
B009.2 Demonstration: User output notices
Product interface showing user notices about output limitations - may include messages indicating truncated or suppressed outputs for security or privacy reasons, user documentation explaining limitation policies, or help articles describing output restrictions.
Tags
Supplemental Control
Operational Practices
Product
Requirement
·
Mandatory Requirement
Control activity
Limiting the fidelity of model outputs in certain use cases. For example, applying output rounding, threshold bands, or obfuscation techniques to reduce the risk of model inversion.
Evidence
B009.3 Config: Output precision controls
Code implementing output fidelity limitations - may include rounding logic for numerical outputs, threshold bands reducing precision, or obfuscation techniques preventing model inversion, precision-sensitive data disclosure, or adversarial model extraction attacks.
Tags
Supplemental Control
Technical Implementation
Engineering Code
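The rounding and threshold-band techniques named in B009.3 can be sketched as follows. The band boundaries and function names are assumptions chosen for illustration; the idea is only that callers see a coarse value rather than the model's full-precision score.

```python
# Hypothetical band boundaries (upper bound, label), in ascending order.
BANDS = [(0.33, "low"), (0.66, "medium"), (1.0, "high")]


def round_score(score: float, decimals: int = 1) -> float:
    """Expose a score at reduced precision."""
    return round(score, decimals)


def to_band(score: float) -> str:
    """Map a score in [0, 1] to a coarse label instead of a number."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    for upper, label in BANDS:
        if score <= upper:
            return label
    return BANDS[-1][1]  # defensive fallback; not reached for valid input
```

For example, `to_band(0.5)` returns `"medium"`, so repeated queries reveal far less about the underlying score than the raw value would.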
Requirement
·
Mandatory Requirement
Control activity
Implementing content filtering for harmful output types. For example, detecting and blocking distressed responses, angry language, offensive content, biased statements, and deceptive information.
Evidence
C003.1 Config: Harmful output filtering
Content filtering rules, moderation API configuration, or classifier settings showing detection and blocking logic for harmful output types - may include filtering rules in code, third-party moderation tool configuration (e.g., OpenAI Moderation API, Perspective API), or custom classifier model settings with harm category definitions.
Tags
Mandatory Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing guardrails for advice generation. For example, restricting high-risk recommendations in sensitive domains, requiring disclaimers for guidance.
Evidence
C003.2 Config: Guardrails for high-risk advice
System prompts, guardrail rules, or domain restrictions showing safety controls on advice generation - may include defensive prompting, domain-specific output restrictions (e.g., medical/legal/financial advice blocklists), or conditional response templates that add warnings for sensitive topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
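A conditional response template of the kind mentioned in C003.2 might look like the sketch below. The keyword lists, disclaimer wording, and function names are hypothetical, and exact-word matching is a deliberately crude stand-in for a real domain classifier.

```python
import re

# Hypothetical keyword sets per sensitive domain.
SENSITIVE_KEYWORDS = {
    "medical": {"dosage", "diagnosis", "symptoms"},
    "financial": {"invest", "loan", "tax"},
}

# Hypothetical disclaimer text per domain.
DISCLAIMERS = {
    "medical": "This is general information, not medical advice.",
    "financial": "This is general information, not financial advice.",
}


def detect_domains(text: str) -> list[str]:
    """Return sensitive domains whose keywords appear as whole words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return [d for d, keywords in SENSITIVE_KEYWORDS.items() if words & keywords]


def with_disclaimers(user_request: str, response: str) -> str:
    """Append a disclaimer for each sensitive domain detected in the request."""
    notes = [DISCLAIMERS[d] for d in detect_domains(user_request)]
    if not notes:
        return response
    return response + "\n\n" + "\n".join(notes)
```

Whole-word matching avoids false hits such as "tax" inside "syntax", but it also misses inflected forms like "investing"; a production guardrail would likely need something more robust.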
Requirement
·
Mandatory Requirement
Control activity
Implementing bias detection and mitigation controls. For example, monitoring for discriminatory patterns, implementing fairness checks in outputs.
Evidence
C003.3 Config: Guardrails for biased outputs
Documentation of bias eval results testing for stereotypical responses across demographic attributes, manual review logs documenting bias assessments, or output filtering rules blocking discriminatory patterns - may include automated fairness evaluation tools or bias monitoring dashboards if implemented.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Evaluating harm mitigation controls using performance metrics.
Evidence
C003.4 Documentation: Filtering performance benchmarks
Test results, metrics dashboard, or evaluation report showing performance of harm controls - may include false positive/negative rates, coverage analysis of test scenarios, benchmark results against harm datasets (e.g., ToxiGen, RealToxicityPrompts), or confusion matrices showing filtering accuracy across harm categories.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Detecting and blocking out-of-scope requests. For example, detecting conversations outside intended use cases, blocking prohibited topics, providing redirection messages when users hit boundaries, and escalating or restricting access for repeated violations.
Evidence
C004.1 Config: Out-of-scope guardrails
Blocking rules, defensive prompting, or filtering configuration showing how out-of-scope requests are detected and handled - may include topic blocklists, redirection message templates, escalation rules for repeated attempts, or system prompts defining allowed topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
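The blocklist, redirection, and escalation elements of C004.1 could be combined roughly as in this sketch. It assumes an upstream step has already assigned a topic label to the request; the topic names, the violation limit, and the `ScopeGuard` class are all hypothetical.

```python
from collections import Counter

BLOCKED_TOPICS = {"politics", "gambling"}  # hypothetical prohibited topics
MAX_VIOLATIONS = 3                         # hypothetical escalation limit
REDIRECT_MESSAGE = "I can only help with questions about this product."


class ScopeGuard:
    """Track out-of-scope attempts per user and escalate on repeats."""

    def __init__(self) -> None:
        self.violations: Counter[str] = Counter()

    def check(self, user_id: str, topic: str) -> str:
        """Return "allow", "redirect", or "restrict" for a classified topic."""
        if topic not in BLOCKED_TOPICS:
            return "allow"
        self.violations[user_id] += 1
        if self.violations[user_id] >= MAX_VIOLATIONS:
            return "restrict"  # repeated violations: restrict access
        return "redirect"      # caller would show REDIRECT_MESSAGE
```

The per-user counter here lives in memory only; a real deployment would presumably persist it and feed the same events into the logs described under C004.2.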
Requirement
·
Mandatory Requirement
Control activity
Tracking out-of-scope violations and updating boundaries. For example, logging boundary violations, adjusting restrictions based on misuse patterns.
Evidence
C004.2 Logs: Out-of-scope attempts
Logs showing out-of-scope attempts with frequency data. May include documentation of boundary updates made in response to violations, monitoring dashboard of flagged requests, change log showing restriction updates with rationale, or incident reports triggering scope adjustments.
Tags
Mandatory Control
Technical Implementation
Logs
Requirement
·
Mandatory Requirement
Control activity
Providing user guidance on system capabilities and limitations. For example, communicating what the AI system can and cannot do, intended use cases, and topics or requests outside the system's scope.
Evidence
C004.3 Demonstration: User guidance on scope
User-facing guidance explaining system capabilities and limitations - may include onboarding tooltips or welcome screens, help documentation or FAQs describing intended use, UI warnings when approaching scope boundaries, or published usage guidelines.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Establishing output sanitization and validation procedures before presenting content to users. For example, encoding or stripping potentially malicious content, validating structured outputs against safe schemas, blocking unsafe URLs, and enforcing secure rendering modes.
Evidence
C006.1 Config: Output sanitization
Code or configuration implementing output sanitization - may include HTML/JavaScript/shell syntax encoding functions, URL validation or rewriting rules blocking unsafe links, schema validation checking structured outputs (JSON/YAML/XML) against whitelists, CSP header configuration, or template rendering with auto-escaping enabled.
Tags
Mandatory Control
Technical Implementation
Engineering Code
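Three of the techniques listed for C006.1 (HTML encoding, URL allowlisting, and checking structured output against expected keys) can be sketched with the Python standard library. The allowed hosts and the function names are assumptions for illustration, and the key check is a simplified stand-in for full schema validation.

```python
import html
import json
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"example.com", "docs.example.com"}  # hypothetical allowlist


def escape_for_html(text: str) -> str:
    """Encode characters that would otherwise be interpreted as markup."""
    return html.escape(text, quote=True)


def is_safe_url(url: str) -> bool:
    """Accept only https links to allowlisted hosts."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and parsed.hostname in ALLOWED_HOSTS


def validate_json_output(raw: str, allowed_keys: set[str]) -> dict:
    """Parse model output as JSON and reject unexpected top-level keys."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    unexpected = set(data) - allowed_keys
    if unexpected:
        raise ValueError(f"unexpected keys: {sorted(unexpected)}")
    return data
```

For instance, `is_safe_url("javascript:alert(1)")` returns `False` because the scheme is not in the allowlist, and `escape_for_html` turns `<script>` into `&lt;script&gt;` before rendering.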
Requirement
·
Mandatory Requirement
Control activity
Implementing security labeling and content handling based on trust level. For example, marking untrusted or third-party content, distinguishing external data from system-generated content, and applying differentiated security controls based on content source.
Evidence
C006.2 Demonstration: Warning labels for untrusted content
UI or code showing trust-based content handling - may include visual indicators marking third-party content (badges, styling, warning icons), metadata tags tracking content source and trust level, or code applying conditional security controls based on content origin (e.g., stricter sanitization for external sources).
Tags
Mandatory Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Detecting advanced output-based attack patterns. For example, identifying prompt injection attempts, model subversion techniques, payloads targeting downstream systems, or obfuscated exploits designed to bypass filters.
Evidence
C006.3 Config: Adversarial output detection
Detection rules or monitoring system identifying advanced attack patterns in outputs - may include pattern matching for prompt injection chains or jailbreak tokens, payload signature scanning detecting command injection or SQL queries, or anomaly detection flagging obfuscated exploits bypassing basic filters.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks like ToxiGen, coordinating with internal security and testing teams.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments of out-of-scope outputs at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C011.1 Report: Out-of-scope output testing
Third-party evaluation report showing out-of-scope output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing factual accuracy controls. For example, deploying available fact-checking mechanisms, flagging uncertain or low-confidence responses.
Evidence
D001.1 Config: Groundedness filter
Code or configuration showing groundedness validation - may include filters checking responses against source documents, fact-checking API integration, or logic comparing generated content to retrieved context for factual accuracy.
Tags
Mandatory Control
Technical Implementation
Engineering Code
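The "logic comparing generated content to retrieved context" mentioned for D001.1 could, in its simplest form, be a lexical overlap check like the sketch below. This is a crude heuristic shown only to make the idea concrete; the function names and the 0.7 threshold are assumptions, and real groundedness filters typically rely on stronger methods than token overlap.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def groundedness_score(response: str, context: str) -> float:
    """Fraction of distinct response tokens that also occur in the context."""
    response_tokens = _tokens(response)
    if not response_tokens:
        return 0.0
    return len(response_tokens & _tokens(context)) / len(response_tokens)


def is_grounded(response: str, context: str, threshold: float = 0.7) -> bool:
    """Flag a response as grounded when enough of it overlaps the context."""
    return groundedness_score(response, context) >= threshold
```

With the context "The refund window is 30 days from purchase.", the response "The refund window is 30 days." scores 1.0, while "Refunds take 90 days" scores 0.25 and would be flagged for review.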
Requirement
·
Mandatory Requirement
Control activity
Establishing information source validation. For example, requiring citations for factual claims, implementing source reliability checks.
Evidence
D001.2 Demonstration: User-facing citations & source attributions
UI or output format showing citations and source attributions provided to users - may include inline citations, source links, reference lists, or attribution labels identifying where information originated.
Tags
Mandatory Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Maintaining uncertainty communication. For example, displaying confidence levels, providing appropriate disclaimers for generated information.
Evidence
D001.3 Demonstration: User-facing uncertainty labels
UI or output format showing confidence levels, uncertainty disclaimers, or warnings for generated information - may include confidence score displays, low-certainty warnings, or standard disclaimers about potential inaccuracies.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
D002.1 Report: Hallucination testing results
Third-party evaluation report showing hallucination testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing customer communication protocols. For example, disclosure procedures, explanation of corrective actions, and follow-up commitments with executive approval for significant incidents.
Establishing immediate mitigation steps with designated staff responsibilities. For example, system freeze capabilities, output suppression, customer notification, and system adjustments.
Evidence
E002.1 Documentation: AI failure plan for harmful outputs
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining harmful output categories with reference to risk taxonomy. For example, discriminatory content, offensive material, inappropriate recommendations, ideally with concrete examples.
Coordinating external support engagement. For example, legal counsel consultation, PR support, and insurance claim procedures.
Evidence
E002.2 Documentation: Additional harmful output failure procedures
May include harmful output category definitions referenced to risk taxonomy, external support contact list (legal counsel, PR firms, insurance providers), support engagement procedures or runbooks, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Establishing compensation assessment procedures. For example, loss evaluation methods, settlement approaches, and payment authorization levels with appropriate approval requirements.
Implementing remediation measures. For example, system freeze capabilities, model adjustments, output validation improvements, customer notification, and enhanced monitoring.
Evidence
E003.1 Documentation: AI failure plan for hallucinations
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining hallucination incident types.
Coordinating potential external support. For example, legal consultation for significant claims, financial review when needed, and insurance coverage activation.
Evidence
E003.2 Documentation: Additional hallucination failure procedures
May include hallucination incident categories (e.g. factual errors, incorrect recommendations), external support contact list (legal counsel, financial reviewers, insurance providers), support engagement procedures, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on offensive cyber capabilities and mitigations.
Evidence
F001.1 Documentation: Foundation model cyber capabilities
Provider model cards, cybersecurity assessment reports from model developers, or foundation model documentation describing offensive cyber capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Implementing malicious use detection and blocking. For example, deploying available content filtering to detect requests for malicious code generation, attack planning, and vulnerability exploitation guidance, configuring automated blocking of cyber attack assistance requests, maintaining databases of prohibited use patterns.
Evidence
F001.2 Config: Cyber use detection
Content filtering rules blocking cyber attack requests, keyword or pattern matching detecting malicious code generation attempts, automated blocking configuration for exploit development queries, or prohibited use pattern database.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on CBRN capabilities and mitigations.
Evidence
F002.1 Documentation: Foundation model CBRN capabilities
List of foundation models used with CBRN capability information - may include provider model cards with CBRN assessments, weapons of mass destruction risk evaluations from model developers, or other documentation describing CBRN-related capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing catastrophic misuse monitoring. For example, monitoring AI system interactions for patterns indicating weapons development or mass harm intent, implementing real-time alerting for detected catastrophic misuse attempts, documenting suspicious queries and system responses.
Evidence
F002.2 Config: Catastrophic misuse monitoring
Monitoring dashboard or alert configuration for catastrophic misuse patterns - may include usage monitoring flagging CBRN-related queries, alert rules for weapons development patterns, logs of detected and blocked catastrophic misuse attempts, or incident records documenting suspicious CBRN-related interactions.
Tags
Supplemental Control
Technical Implementation
Engineering Code

Capability: Voice AI

Requirement
·
Mandatory Requirement
Control activity
Documenting foundation model provider IP protections which may serve as primary infringement safeguards. For example, indemnification clauses or copyright/trademark guardrails.
Evidence
A007.1 Documentation: Model provider IP infringement protections
Foundation model provider contract, terms of service, or data processing agreement showing IP protection commitments including copyright/trademark handling policies, indemnification clauses, liability coverage, and any documented limitations or exclusions. May include vendor questionnaire responses or certification documents addressing IP protections.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing supplementary content filtering mechanisms where provider protections have gaps or limitations. For example, detecting copyrighted material in outputs, implementing trademark screening.
Evidence
A007.2 Config: IP infringement filtering
Code, API configuration, or filtering system showing detection of copyrighted material, trademark screening, or content validation checks applied to AI outputs - this could be pattern matching logic, third-party API integration (e.g. copyright detection services), or custom filtering rules.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Eng: LLM output filtering logic
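The "pattern matching logic" option for A007.2 might be as simple as screening outputs against a list of protected terms. The marks below are made-up placeholders and `find_marks` is a hypothetical helper; a real deployment would maintain its own list or call a dedicated detection service.

```python
import re

# Hypothetical protected marks, for illustration only.
PROTECTED_MARKS = ("Acme Rocket", "Globex")

# One case-insensitive pattern that matches any mark as a whole phrase.
_MARK_RE = re.compile(
    "|".join(rf"\b{re.escape(mark)}\b" for mark in PROTECTED_MARKS),
    re.IGNORECASE,
)


def find_marks(output: str) -> list[str]:
    """Return the distinct protected marks found in an output, lowercased."""
    return sorted({match.group(0).lower() for match in _MARK_RE.finditer(output)})
```

A non-empty result would then trigger whatever action the filtering rules define, such as blocking the output or routing it for review.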
Requirement
·
Mandatory Requirement
Control activity
Implementing user guidance and guardrails to reduce IP risk. For example, usage policies that explain prohibited content types, user warnings in product, restricting output generation in known infringement domains.
Implementing restrictions in AI acceptable use policy.
Evidence
A007.3 Logs: User-facing notices
User-facing IP risk guidance - may include warning messages when attempting high-risk operations, help center articles about IP infringement guidance, or UI elements explaining prohibited use cases.
Tags
Supplemental Control
Technical Implementation
Product
Acceptable Use Policy
Requirement
·
Optional Requirement
Control activity
Integrating automated moderation tools to filter inputs before they reach the foundation model. For example, integrating third-party moderation APIs, implementing custom filtering rules, configuring blocking or warning actions for flagged content, and establishing confidence thresholds based on risk category and severity.
Evidence
B005.1 Config: Input filtering
Moderation tool integration showing API configuration, filtering rules, action settings (block/warn/modify), and confidence thresholds for different violation categories - this could be screenshots of configuration files, admin dashboard settings, or API integration code. Example moderation tools: OpenAI Moderation API, Claude content filtering, VirtueAI/Hive/Spectrum Labs.
Tags
Mandatory Control
Technical Implementation
Eng: User LLM input filtering logic
Engineering Tooling
Requirement
·
Optional Requirement
Control activity
Documenting the moderation logic and rationale. For example, explaining chosen moderation tools, threshold justifications, and decision criteria for different risk categories.
Evidence
B005.2 Documentation: Input moderation approach
Document explaining moderation approach including tool selection rationale, threshold settings with justifications, action logic for different violation types, and examples of how different input categories are handled.
Tags
Supplemental Control
Technical Implementation
Internal processes
Engineering Practice
Requirement
·
Optional Requirement
Control activity
Providing feedback to users when inputs are blocked.
Evidence
B005.3 Demonstration: Warning for blocked inputs
User-facing messages or UI flows showing how blocked inputs are communicated to users - this could be error messages, warning dialogs, or alternative suggestions provided when content is filtered.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Optional Requirement
Control activity
Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations.
Evidence
B005.4 Logs: Input filtering
Logging system showing how flagged inputs are captured, what metadata is included/excluded for privacy, retention policies, and audit trail - may include privacy documentation explaining logging disclosures to users.
Tags
Supplemental Control
Technical Implementation
Logs
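To make the "metadata included/excluded for privacy" point in B005.4 concrete, here is a sketch of a log record that stores a salted hash of the user identifier and the prompt length rather than the raw values. The field names, the 30-day retention value, and `build_log_record` are assumptions; actual privacy obligations determine what may be stored.

```python
import hashlib
from datetime import datetime, timezone

RETENTION_DAYS = 30  # hypothetical retention period


def build_log_record(
    user_id: str, prompt: str, category: str, score: float, salt: str
) -> dict:
    """Build a log entry for a flagged input without raw identifiers or text."""
    user_ref = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()[:16]
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_ref": user_ref,          # salted hash, not the raw user ID
        "category": category,          # which filter category fired
        "score": round(score, 3),      # classifier confidence
        "prompt_chars": len(prompt),   # length only; prompt text is excluded
        "retention_days": RETENTION_DAYS,
    }
```

Whether to retain the flagged prompt text at all is a policy decision; this sketch takes the conservative option of excluding it.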
Requirement
·
Optional Requirement
Control activity
Periodically evaluating filter performance and adjusting thresholds accordingly. For example, accuracy, latency, false positives/negatives.
Evidence
B005.5 Documentation: Input filter performance
Report or dashboard showing analysis of filter performance metrics (false positives, false negatives, accuracy, latency) and documented threshold adjustments made based on performance data - should include timestamps and rationale for changes.
Tags
Supplemental Control
Technical Implementation
Engineering Practice
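The metrics named in B005.5 follow directly from confusion counts gathered on a labelled sample of inputs. The sketch below shows one way to compute them; `filter_metrics` is a hypothetical helper, and latency would be measured separately.

```python
def filter_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute filter metrics from confusion counts.

    tp: harmful inputs correctly flagged    fp: benign inputs wrongly flagged
    tn: benign inputs correctly passed      fn: harmful inputs missed
    """

    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }
```

Recomputing these figures after each threshold change, and recording the date and rationale alongside them, would produce the kind of timestamped adjustment history the evidence description asks for.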
Requirement
·
Mandatory Requirement
Control activity
Limiting the results shown in outputs to only those that are relevant, balancing security and utility. For example, character limits, limits on inference time.
Evidence
B009.1 Config: Output volume limits
Code or configuration implementing output restrictions - may include character or token limits, inference time limits, result count restrictions, or timeout configurations preventing excessive output. Can be demonstrated by product demo showing system timeout when requesting output exceeding limits.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Product
Requirement
·
Mandatory Requirement
Control activity
Providing user-facing notices or documentation about output limitations.
Evidence
B009.2 Demonstration: User output notices
Product interface showing user notices about output limitations - may include messages indicating truncated or suppressed outputs for security or privacy reasons, user documentation explaining limitation policies, or help articles describing output restrictions.
Tags
Supplemental Control
Operational Practices
Product
Requirement
·
Mandatory Requirement
Control activity
Limiting the fidelity of model outputs in certain use cases. For example, applying output rounding, threshold bands, or obfuscation techniques to reduce the risk of model inversion.
Evidence
B009.3 Config: Output precision controls
Code implementing output fidelity limitations - may include rounding logic for numerical outputs, threshold bands reducing precision, or obfuscation techniques preventing model inversion, precision-sensitive data disclosure, or adversarial model extraction attacks.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Implementing content filtering for harmful output types. For example, detecting and blocking distressed responses, angry language, offensive content, biased statements, and deceptive information.
Evidence
C003.1 Config: Harmful output filtering
Content filtering rules, moderation API configuration, or classifier settings showing detection and blocking logic for harmful output types - may include filtering rules in code, third-party moderation tool configuration (e.g., OpenAI Moderation API, Perspective API), or custom classifier model settings with harm category definitions.
Tags
Mandatory Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing guardrails for advice generation. For example, restricting high-risk recommendations in sensitive domains, requiring disclaimers for guidance.
Evidence
C003.2 Config: Guardrails for high-risk advice
System prompts, guardrail rules, or domain restrictions showing safety controls on advice generation - may include defensive prompting, domain-specific output restrictions (e.g., medical/legal/financial advice blocklists), or conditional response templates that add warnings for sensitive topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Implementing bias detection and mitigation controls. For example, monitoring for discriminatory patterns, implementing fairness checks in outputs.
Evidence
C003.3 Config: Guardrails for biased outputs
Documentation of bias eval results testing for stereotypical responses across demographic attributes, manual review logs documenting bias assessments, or output filtering rules blocking discriminatory patterns - may include automated fairness evaluation tools or bias monitoring dashboards if implemented.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Evaluating harm mitigation controls using performance metrics.
Evidence
C003.4 Documentation: Filtering performance benchmarks
Test results, metrics dashboard, or evaluation report showing performance of harm controls - may include false positive/negative rates, coverage analysis of test scenarios, benchmark results against harm datasets (e.g., ToxiGen, RealToxicityPrompts), or confusion matrices showing filtering accuracy across harm categories.
Tags
Supplemental Control
Operational Practices
Internal processes
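The false positive and false negative rates named in C003.4 can be computed from a labelled test set as sketched below. The sample labels and predictions are made-up data for illustration, not results from ToxiGen or any other benchmark.

```python
from collections import Counter


def confusion_counts(labels, predictions):
    """Count TP/FP/TN/FN where True means 'harmful' (label) or 'blocked' (prediction)."""
    counts = Counter()
    for is_harmful, was_blocked in zip(labels, predictions):
        if is_harmful and was_blocked:
            counts["tp"] += 1
        elif is_harmful and not was_blocked:
            counts["fn"] += 1
        elif not is_harmful and was_blocked:
            counts["fp"] += 1
        else:
            counts["tn"] += 1
    return counts


def error_rates(labels, predictions):
    """False negative rate over harmful items, false positive rate over benign items."""
    c = confusion_counts(labels, predictions)
    harmful = c["tp"] + c["fn"]
    benign = c["fp"] + c["tn"]
    return {
        "false_negative_rate": c["fn"] / harmful if harmful else 0.0,
        "false_positive_rate": c["fp"] / benign if benign else 0.0,
    }


# Hypothetical labelled test set: 4 harmful and 4 benign outputs.
labels = [True, True, True, True, False, False, False, False]
predictions = [True, True, True, False, True, False, False, False]
rates = error_rates(labels, predictions)
```

Running the same computation per harm category gives the per-category breakdown that a confusion-matrix report would show.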
Requirement
·
Mandatory Requirement
Control activity
Detecting and blocking out-of-scope requests. For example, detecting conversations outside intended use cases, blocking prohibited topics, providing redirection messages when users hit boundaries, and escalating or restricting access for repeated violations.
Evidence
C004.1 Config: Out-of-scope guardrails
Blocking rules, defensive prompting, or filtering configuration showing how out-of-scope requests are detected and handled - may include topic blocklists, redirection message templates, escalation rules for repeated attempts, or system prompts defining allowed topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
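A minimal sketch of the topic blocklist, redirection template, and escalation rule described for C004.1 follows. The topics, keywords, redirect wording, and the three-strike threshold are hypothetical values chosen for the example.

```python
from collections import defaultdict

# Hypothetical configuration values, for illustration only.
BLOCKED_TOPICS = {
    "politics": ("election", "candidate"),
    "gambling": ("casino", "betting odds"),
}
REDIRECT_TEMPLATE = "I can't help with {topic}. I can help with billing and account questions."
MAX_VIOLATIONS = 3


class ScopeGuard:
    """Detects blocked topics and escalates after repeated attempts by one user."""

    def __init__(self):
        self.violations = defaultdict(int)

    def check(self, user_id: str, message: str) -> dict:
        lowered = message.lower()
        for topic, keywords in BLOCKED_TOPICS.items():
            if any(k in lowered for k in keywords):
                self.violations[user_id] += 1
                # Restrict access once the user reaches the violation limit.
                limit_hit = self.violations[user_id] >= MAX_VIOLATIONS
                return {
                    "in_scope": False,
                    "topic": topic,
                    "action": "restrict" if limit_hit else "redirect",
                    "message": REDIRECT_TEMPLATE.format(topic=topic),
                }
        return {"in_scope": True, "topic": None, "action": "allow", "message": None}
```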
Requirement
·
Mandatory Requirement
Control activity
Tracking out-of-scope violations and updating boundaries. For example, logging boundary violations, adjusting restrictions based on misuse patterns.
Evidence
C004.2 Logs: Out-of-scope attempts
Logs showing out-of-scope attempts with frequency data. May include documentation of boundary updates made in response to violations, monitoring dashboard of flagged requests, change log showing restriction updates with rationale, or incident reports triggering scope adjustments.
Tags
Mandatory Control
Technical Implementation
Logs
Requirement
·
Mandatory Requirement
Control activity
Providing user guidance on system capabilities and limitations. For example, communicating what the AI system can and cannot do, intended use cases, and topics or requests outside the system's scope.
Evidence
C004.3 Demonstration: User guidance on scope
User-facing guidance explaining system capabilities and limitations - may include onboarding tooltips or welcome screens, help documentation or FAQs describing intended use, UI warnings when approaching scope boundaries, or published usage guidelines.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Establishing output sanitization and validation procedures before presenting content to users. For example, encoding or stripping potentially malicious content, validating structured outputs against safe schemas, blocking unsafe URLs, and enforcing secure rendering modes.
Evidence
C006.1 Config: Output sanitization
Code or configuration implementing output sanitization - may include HTML/JavaScript/shell syntax encoding functions, URL validation or rewriting rules blocking unsafe links, schema validation checking structured outputs (JSON/YAML/XML) against whitelists, CSP header configuration, or template rendering with auto-escaping enabled.
Tags
Mandatory Control
Technical Implementation
Engineering Code
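Three of the sanitization steps listed for C006.1 (HTML encoding, URL validation, and schema validation of structured output) can be sketched with the Python standard library alone. The allowed schemes, hosts, and JSON keys are hypothetical policy values, and the functions are an illustration rather than a complete sanitizer.

```python
import html
import json
from urllib.parse import urlparse

# Hypothetical policy values, for illustration only.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"example.com", "docs.example.com"}
ALLOWED_KEYS = {"title": str, "summary": str}


def escape_for_html(text: str) -> str:
    """Encode characters that would otherwise be interpreted as markup."""
    return html.escape(text, quote=True)


def is_safe_url(url: str) -> bool:
    """Accept only URLs whose scheme and host are both on the allowlists."""
    parsed = urlparse(url)
    return parsed.scheme in ALLOWED_SCHEMES and parsed.hostname in ALLOWED_HOSTS


def validate_structured_output(raw: str) -> dict:
    """Parse JSON and reject unknown keys or wrong value types."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key, value in data.items():
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unexpected key: {key}")
        if not isinstance(value, ALLOWED_KEYS[key]):
            raise ValueError(f"wrong type for key: {key}")
    return data
```

Allowlisting schemes and hosts rejects `javascript:` and plain `http:` links by default, instead of trying to enumerate every unsafe form.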
Requirement
·
Mandatory Requirement
Control activity
Implementing security labeling and content handling based on trust level. For example, marking untrusted or third-party content, distinguishing external data from system-generated content, and applying differentiated security controls based on content source.
Evidence
C006.2 Demonstration: Warning labels for untrusted content
UI or code showing trust-based content handling - may include visual indicators marking third-party content (badges, styling, warning icons), metadata tags tracking content source and trust level, or code applying conditional security controls based on content origin (e.g., stricter sanitization for external sources).
Tags
Mandatory Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Detecting advanced output-based attack patterns. For example, identifying prompt injection attempts, model subversion techniques, payloads targeting downstream systems, or obfuscated exploits designed to bypass filters.
Evidence
C006.3 Config: Adversarial output detection
Detection rules or monitoring system identifying advanced attack patterns in outputs - may include pattern matching for prompt injection chains or jailbreak tokens, payload signature scanning detecting command injection or SQL queries, or anomaly detection flagging obfuscated exploits bypassing basic filters.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors, including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing, including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks such as ToxiGen, and coordinating with internal security and testing teams.
Maintaining documentation, including testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
Control activity
Appointing qualified third-party assessors, including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing, including defining testing scope and methodologies based on the risk taxonomy and performing assessments of out-of-scope outputs at least every quarter.
Maintaining documentation, including testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
C011.1 Report: Out-of-scope output testing
Third-party evaluation report showing out-of-scope output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing factual accuracy controls. For example, deploying available fact-checking mechanisms, flagging uncertain or low-confidence responses.
Evidence
D001.1 Config: Groundedness filter
Code or configuration showing groundedness validation - may include filters checking responses against source documents, fact-checking API integration, or logic comparing generated content to retrieved context for factual accuracy.
Tags
Mandatory Control
Technical Implementation
Engineering Code
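The "logic comparing generated content to retrieved context" mentioned for D001.1 could, in its simplest form, be a lexical overlap check like the one below. Token overlap is a deliberately crude stand-in for groundedness scoring; the `MIN_SUPPORT` threshold and function names are assumptions for this sketch, and real systems often use entailment models or a dedicated fact-checking service instead.

```python
import re

# Hypothetical threshold; a real deployment would tune this on labelled data.
MIN_SUPPORT = 0.6


def _tokens(text: str) -> set:
    """Lowercased alphanumeric tokens of the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def sentence_support(sentence: str, context: str) -> float:
    """Fraction of the sentence's tokens that also appear in the context."""
    sent = _tokens(sentence)
    if not sent:
        return 1.0
    return len(sent & _tokens(context)) / len(sent)


def ungrounded_sentences(response: str, context: str) -> list:
    """Sentences of the response whose support in the context is below threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]
    return [s for s in sentences if sentence_support(s, context) < MIN_SUPPORT]
```

Flagged sentences can then be removed, rewritten, or shown to the user with a low-confidence warning.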
Requirement
·
Mandatory Requirement
Control activity
Establishing information source validation. For example, requiring citations for factual claims, implementing source reliability checks.
Evidence
D001.2 Demonstration: User-facing citations & source attributions
UI or output format showing citations and source attributions provided to users - may include inline citations, source links, reference lists, or attribution labels identifying where information originated.
Tags
Mandatory Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Maintaining uncertainty communication. For example, displaying confidence levels, providing appropriate disclaimers for generated information.
Evidence
D001.3 Demonstration: User-facing uncertainty labels
UI or output format showing confidence levels, uncertainty disclaimers, or warnings for generated information - may include confidence score displays, low-certainty warnings, or standard disclaimers about potential inaccuracies.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors, including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing, including defining testing scope and methodologies based on the risk taxonomy and performing assessments at least every quarter.
Maintaining documentation, including testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
D002.1 Report: Hallucination testing results
Third-party evaluation report showing hallucination testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing customer communication protocols. For example, disclosure procedures, explanation of corrective actions, and follow-up commitments with executive approval for significant incidents.
Establishing immediate mitigation steps with designated staff responsibilities. For example, system freeze capabilities, output suppression, customer notification, and system adjustments.
Evidence
E002.1 Documentation: AI failure plan for harmful outputs
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining harmful output categories with reference to risk taxonomy. For example, discriminatory content, offensive material, inappropriate recommendations, ideally with concrete examples.
Coordinating external support engagement. For example, legal counsel consultation, PR support, and insurance claim procedures.
Evidence
E002.2 Documentation: Additional harmful output failure procedures
May include harmful output category definitions referenced to risk taxonomy, external support contact list (legal counsel, PR firms, insurance providers), support engagement procedures or runbooks, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Establishing compensation assessment procedures. For example, loss evaluation methods, settlement approaches, and payment authorization levels with appropriate approval requirements.
Implementing remediation measures. For example, system freeze capabilities, model adjustments, output validation improvements, customer notification, and enhanced monitoring.
Evidence
E003.1 Documentation: AI failure plan for hallucinations
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining hallucination incident types.
Coordinating potential external support. For example, legal consultation for significant claims, financial review when needed, and insurance coverage activation.
Evidence
E003.2 Documentation: Additional hallucination failure procedures
May include hallucination incident categories (e.g. factual errors, incorrect recommendations), external support contact list (legal counsel, financial reviewers, insurance providers), support engagement procedures, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on offensive cyber capabilities and mitigations.
Evidence
F001.1 Documentation: Foundation model cyber capabilities
Provider model cards, cybersecurity assessment reports from model developers, or foundation model documentation describing offensive cyber capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Implementing malicious use detection and blocking. For example, deploying available content filtering to detect requests for malicious code generation, attack planning, and vulnerability exploitation guidance, configuring automated blocking of cyber attack assistance requests, maintaining databases of prohibited use patterns.
Evidence
F001.2 Config: Cyber use detection
Content filtering rules blocking cyber attack requests, keyword or pattern matching detecting malicious code generation attempts, automated blocking configuration for exploit development queries, or prohibited use pattern database.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on CBRN capabilities and mitigations.
Evidence
F002.1 Documentation: Foundation model CBRN capabilities
List of foundation models used with CBRN capability information - may include provider model cards with CBRN assessments, weapons of mass destruction risk evaluations from model developers, or other documentation describing CBRN-related capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing catastrophic misuse monitoring. For example, monitoring AI system interactions for patterns indicating weapons development or mass harm intent, implementing real-time alerting for detected catastrophic misuse attempts, documenting suspicious queries and system responses.
Evidence
F002.2 Config: Catastrophic misuse monitoring
Monitoring dashboard or alert configuration for catastrophic misuse patterns - may include usage monitoring flagging CBRN-related queries, alert rules for weapons development patterns, logs of detected and blocked catastrophic misuse attempts, or incident records documenting suspicious CBRN-related interactions.
Tags
Supplemental Control
Technical Implementation
Engineering Code

Capability: Automation

Requirement
·
Mandatory Requirement
Control activity
Implementing technical restrictions that limit agent capabilities to authorized scope. For example, restricting agent access to approved backend services, APIs and MCP servers, enforcing network segmentation or API gateway rules, or implementing service-level authorization preventing access to sensitive systems.
Evidence
B006.1 Config: Agent service access restrictions
Configuration showing technical limitations on agent backend access - may include API gateway rules restricting accessible services, network policies defining allowed endpoints, MCP server allowlist or registration configuration restricting which MCP servers and tools the agent may connect to, service-level authorization configuration, or architecture diagram showing agent isolation boundaries including MCP server placement and network segmentation.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Deploying monitoring and alerting for agent actions that exceed security boundaries. For example, logging all agent service interactions, alerting on access attempts to unauthorized systems or APIs, or anomaly detection flagging unusual connection patterns.
Evidence
B006.2 Config: Agent security monitoring and alerting
Implementation of monitoring configuration tracking agent security-relevant actions - may include logging setup capturing agent service calls and authentication attempts, alert rules for unauthorized system access, security monitoring dashboard showing agent infrastructure interactions, or example logs demonstrating boundary violations are detected.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Logs
Requirement
·
Mandatory Requirement
Control activity
Implementing additional safeguards to contain runtime risk. For example, applying sandboxed execution environments with restricted filesystem, network, and credential access for first-party MCP servers; monitoring MCP tool definitions for unauthorized changes after initial approval; providing pre-execution authorization hooks that verify runtime tool calls against defined policy before execution proceeds; or equivalent containment approaches appropriate to the deployment architecture.
Evidence
B006.3 Config: Execution-level safeguards
Configuration, documentation or code demonstrating runtime containment controls applied to agents and MCP servers - may include tool definition integrity controls showing how unauthorized post-approval changes are detected, container or VM configuration showing restricted filesystem and network access for first-party MCP servers, pre-execution hook or policy engine configuration showing tool calls are verified at runtime.
Tags
Supplemental Control
Technical Implementation
Engineering Code
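One way to detect the unauthorized post-approval changes to tool definitions described for B006.3 is to record a hash of each definition at approval time and compare it at runtime. The sketch below assumes tool definitions are available as JSON-serializable dictionaries; the function names and the shape of the `approved` mapping are hypothetical.

```python
import hashlib
import json


def fingerprint(tool_definition: dict) -> str:
    """Stable SHA-256 over a canonical JSON encoding of the tool definition."""
    canonical = json.dumps(tool_definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(approved: dict, current_definitions: dict) -> list:
    """Names of tools that are new or whose definition differs from the approved fingerprint.

    `approved` maps tool name to the fingerprint recorded at approval time.
    """
    return sorted(
        name
        for name, definition in current_definitions.items()
        if approved.get(name) != fingerprint(definition)
    )
```

Sorting keys before hashing keeps the fingerprint stable when a server reorders fields without changing their content.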
Requirement
·
Mandatory Requirement
Control activity
Implementing tool call validation and authorization. For example, restricting tool calls to approved functions and MCP servers, validating parameters before execution.
Evidence
D003.1 Config: Tool authorization & validation
Code or configuration showing function and tool allowlists, parameter validation logic, or authz checks before tool execution - may include tool permission schemas, input validation functions, or access control lists restricting available tools per agent/user.
Tags
Mandatory Control
Technical Implementation
Engineering Code
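The allowlist and parameter validation described for D003.1 might look like the following sketch. The agent names, tool names, schema format, and refund cap are all hypothetical; the point is only that every call passes an authorization check and a parameter check before it executes.

```python
# Hypothetical tool permission schema, for illustration only.
TOOL_SCHEMAS = {
    "get_order": {"order_id": str},
    "issue_refund": {"order_id": str, "amount": float},
}
AGENT_ALLOWLIST = {
    "support_agent": {"get_order"},
    "billing_agent": {"get_order", "issue_refund"},
}
MAX_REFUND = 100.0


class ToolCallRejected(Exception):
    """Raised when a tool call fails authorization or validation."""


def authorize_tool_call(agent: str, tool: str, params: dict) -> None:
    """Raise ToolCallRejected unless the call passes allowlist and parameter checks."""
    if tool not in AGENT_ALLOWLIST.get(agent, set()):
        raise ToolCallRejected(f"{agent} may not call {tool}")
    schema = TOOL_SCHEMAS[tool]
    if set(params) != set(schema):
        raise ToolCallRejected("unexpected or missing parameters")
    for name, expected_type in schema.items():
        if not isinstance(params[name], expected_type):
            raise ToolCallRejected(f"{name} must be {expected_type.__name__}")
    # Example of a value-level rule on top of the type checks.
    if tool == "issue_refund" and not 0 < params["amount"] <= MAX_REFUND:
        raise ToolCallRejected("amount outside permitted range")
```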
Requirement
·
Mandatory Requirement
Control activity
Enforcing rate limits and transaction caps for autonomous tool use.
Evidence
D003.2 Config: Rate limits for tools
Code or configuration showing rate limits and transaction caps on tool usage - may include per-tool usage quotas, time-windowed limits, or circuit breakers preventing excessive autonomous tool calls.
Tags
Mandatory Control
Technical Implementation
Engineering Code
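A time-windowed, per-tool limit of the kind listed for D003.2 can be sketched as a sliding-window counter. The class name and limits are assumptions for illustration; the clock is injectable so the behaviour can be tested without waiting.

```python
import time
from collections import defaultdict, deque


class ToolRateLimiter:
    """Sliding-window limit on calls per tool; limits here are hypothetical."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self.calls = defaultdict(deque)

    def allow(self, tool: str) -> bool:
        """Record and permit the call if the tool is under its limit, else refuse."""
        now = self.clock()
        recent = self.calls[tool]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True
```

A transaction cap (for example, a maximum total refund amount per hour) would follow the same pattern, summing values in the window instead of counting calls.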
Requirement
·
Mandatory Requirement
Control activity
Establishing execution monitoring and logging. For example, tracking all tool calls, monitoring for unauthorized access attempts or scope violations.
Evidence
D003.3 Config: Tool call log
Logging configuration, monitoring dashboard, or audit logs showing tracked tool calls - may include log entries capturing the originating MCP server, tool name, tool version, input parameters, and timestamps per invocation, alerts for unauthorized tool access attempts, or monitoring system flagging scope violations.
Tags
Mandatory Control
Technical Implementation
Logs
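The per-invocation fields listed for D003.3 (originating MCP server, tool name, tool version, input parameters, timestamp) map naturally onto one structured log line per call. The sketch below uses Python's standard `logging` module; the field names, the logger name, and the `authorized` flag are assumptions for this example.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("tool_calls")


def build_log_entry(mcp_server: str, tool_name: str, tool_version: str,
                    params: dict, authorized: bool) -> dict:
    """Assemble one audit record for a single tool invocation."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "mcp_server": mcp_server,
        "tool_name": tool_name,
        "tool_version": tool_version,
        "params": params,
        "authorized": authorized,
    }


def log_tool_call(entry: dict) -> str:
    """Emit the record as one JSON line and return that line."""
    line = json.dumps(entry, sort_keys=True)
    # Unauthorized attempts are logged at WARNING so alert rules can key on level.
    logger.log(logging.INFO if entry["authorized"] else logging.WARNING, line)
    return line
```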
Requirement
·
Mandatory Requirement
Control activity
Requiring human approval for sensitive tool operations. For example, requiring human confirmation before executing high-risk actions or multi-step tool calls, or implementing approval workflows for operations beyond autonomous boundaries.
Evidence
D003.4 Config: Human-approval workflows
Approval workflow, code requiring human confirmation, or ticketing system for sensitive, high-risk, or multi-step tool operations.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Reviewing patterns of AI tool usage. For example, identifying anomalies, updating tool permissions, and retiring unused or high-risk functions during scheduled evaluations.
Evidence
D003.5 Documentation: Tool call log reviews
Reports or documentation showing periodic review of tool usage patterns, permission updates, and function retirement decisions - may include usage analytics identifying anomalies, change logs showing permission adjustments, or records of deprecated/retired tools with rationale.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors, including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing, including defining testing scope and methodologies based on the risk taxonomy and performing assessments of tool calls at least every quarter.
Maintaining documentation, including testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
D004.1 Report: Tool call testing
Third-party evaluation report showing tool call testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on offensive cyber capabilities and mitigations.
Evidence
F001.1 Documentation: Foundation model cyber capabilities
Provider model cards, cybersecurity assessment reports from model developers, or foundation model documentation describing offensive cyber capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Implementing malicious use detection and blocking. For example, deploying available content filtering to detect requests for malicious code generation, attack planning, and vulnerability exploitation guidance, configuring automated blocking of cyber attack assistance requests, maintaining databases of prohibited use patterns.
Evidence
F001.2 Config: Cyber use detection
Content filtering rules blocking cyber attack requests, keyword or pattern matching detecting malicious code generation attempts, automated blocking configuration for exploit development queries, or prohibited use pattern database.
Tags
Supplemental Control
Technical Implementation
Engineering Code

Capability: Image generation

Requirement
·
Mandatory Requirement
Control activity
Documenting foundation model provider IP protections which may serve as primary infringement safeguards. For example, indemnification clauses or copyright/trademark guardrails.
Evidence
A007.1 Documentation: Model provider IP infringement protections
Foundation model provider contract, terms of service, or data processing agreement showing IP protection commitments including copyright/trademark handling policies, indemnification clauses, liability coverage, and any documented limitations or exclusions. May include vendor questionnaire responses or certification documents addressing IP protections.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing supplementary content filtering mechanisms where provider protections have gaps or limitations. For example, detecting copyrighted material in outputs, implementing trademark screening.
Evidence
A007.2 Config: IP infringement filtering
Code, API configuration, or filtering system showing detection of copyrighted material, trademark screening, or content validation checks applied to AI outputs - this could be pattern matching logic, third-party API integration (e.g. copyright detection services), or custom filtering rules.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing user guidance and guardrails to reduce IP risk. For example, usage policies that explain prohibited content types, user warnings in product, restricting output generation in known infringement domains.
Implementing restrictions in AI acceptable use policy.
Evidence
A007.3 Logs: User-facing notices
User-facing IP risk guidance - may include warning messages when attempting high-risk operations, help center articles about IP infringement guidance, or UI elements explaining prohibited use cases.
Tags
Supplemental Control
Technical Implementation
Product
Acceptable Use Policy
Requirement
·
Optional Requirement
Control activity
Integrating automated moderation tools to filter inputs before they reach the foundation model. For example, integrating third-party moderation APIs, implementing custom filtering rules, configuring blocking or warning actions for flagged content, and establishing confidence thresholds based on risk category and severity.
Evidence
B005.1 Config: Input filtering
Moderation tool integration showing API configuration, filtering rules, action settings (block/warn/modify), and confidence thresholds for different violation categories - this could be screenshots of configuration files, admin dashboard settings, or API integration code. Example moderation tools: OpenAI Moderation API, Claude content filtering, VirtueAI/Hive/Spectrum Labs.
Tags
Mandatory Control
Technical Implementation
Eng: User LLM input filtering logic
Engineering Tooling
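The per-category confidence thresholds and block/warn actions described for B005.1 can be expressed as a small policy table applied to whatever scores the moderation tool returns. The sketch below assumes scores between 0.0 and 1.0 keyed by category name; the categories, threshold values, and `decide_action` function are hypothetical and do not reflect the response format of any specific moderation API.

```python
# Hypothetical per-category thresholds, for illustration only.
POLICY = {
    "violence": {"block": 0.80, "warn": 0.50},
    "self_harm": {"block": 0.40, "warn": 0.20},  # lower thresholds for higher severity
}


def decide_action(scores: dict) -> dict:
    """Map moderation scores (0.0 to 1.0 per category) to allow, warn, or block."""
    decision = {"action": "allow", "categories": []}
    for category, score in scores.items():
        thresholds = POLICY.get(category)
        if thresholds is None:
            continue  # categories without a policy entry are ignored in this sketch
        if score >= thresholds["block"]:
            decision["action"] = "block"
            decision["categories"].append(category)
        elif score >= thresholds["warn"]:
            # A warn never downgrades an earlier block decision.
            if decision["action"] == "allow":
                decision["action"] = "warn"
            decision["categories"].append(category)
    return decision
```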
Requirement
·
Optional Requirement
Control activity
Documenting the moderation logic and rationale. For example, explaining chosen moderation tools, threshold justifications, and decision criteria for different risk categories.
Evidence
B005.2 Documentation: Input moderation approach
Document explaining moderation approach including tool selection rationale, threshold settings with justifications, action logic for different violation types, and examples of how different input categories are handled.
Tags
Supplemental Control
Technical Implementation
Internal processes
Engineering Practice
Requirement
·
Optional Requirement
Control activity
Providing feedback to users when inputs are blocked.
Evidence
B005.3 Demonstration: Warning for blocked inputs
User-facing messages or UI flows showing how blocked inputs are communicated to users - this could be error messages, warning dialogs, or alternative suggestions provided when content is filtered.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Optional Requirement
Control activity
Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations.
Evidence
B005.4 Logs: Input filtering
Logging system showing how flagged inputs are captured, what metadata is included/excluded for privacy, retention policies, and audit trail - may include privacy documentation explaining logging disclosures to users.
Tags
Supplemental Control
Technical Implementation
Logs
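One way to capture flagged inputs while limiting what is stored, as B005.4 describes, is to log digests and metadata rather than raw prompt text. The sketch below is an assumption-heavy illustration: the 30-day retention period and field names are hypothetical, and an unsalted hash of a user ID is only weak pseudonymization, so a real system would likely use a keyed hash or a separate identifier mapping.

```python
import hashlib
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # hypothetical retention period


def flagged_input_record(user_id: str, prompt: str, category: str, now=None) -> dict:
    """Build a log record that stores digests rather than raw prompt text or user IDs."""
    now = now or datetime.now(timezone.utc)
    return {
        "logged_at": now.isoformat(),
        "expires_at": (now + timedelta(days=RETENTION_DAYS)).isoformat(),
        "user_ref": hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16],
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_length": len(prompt),
        "category": category,
    }
```

The `expires_at` field gives a retention job something concrete to act on, and the prompt digest still lets reviewers count repeated submissions of the same input.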
Requirement
·
Optional Requirement
Control activity
Periodically evaluating filter performance and adjusting thresholds accordingly. For example, measuring accuracy, latency, and false positive/negative rates.
Evidence
B005.5 Documentation: Input filter performance
Report or dashboard showing analysis of filter performance metrics (false positives, false negatives, accuracy, latency) and documented threshold adjustments made based on performance data - should include timestamps and rationale for changes.
Tags
Supplemental Control
Technical Implementation
Engineering Practice
Requirement
·
Mandatory Requirement
Control activity
Implementing content filtering for harmful output types. For example, detecting and blocking distressed responses, angry language, offensive content, biased statements, and deceptive information.
Evidence
C003.1 Config: Harmful output filtering
Content filtering rules, moderation API configuration, or classifier settings showing detection and blocking logic for harmful output types - may include filtering rules in code, third-party moderation tool configuration (e.g., OpenAI Moderation API, Perspective API), or custom classifier model settings with harm category definitions.
Tags
Mandatory Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing guardrails for advice generation. For example, restricting high-risk recommendations in sensitive domains, requiring disclaimers for guidance.
Evidence
C003.2 Config: Guardrails for high-risk advice
System prompts, guardrail rules, or domain restrictions showing safety controls on advice generation - may include defensive prompting, domain-specific output restrictions (e.g., medical/legal/financial advice blocklists), or conditional response templates that add warnings for sensitive topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Implementing bias detection and mitigation controls. For example, monitoring for discriminatory patterns, implementing fairness checks in outputs.
Evidence
C003.3 Config: Guardrails for biased outputs
Documentation of bias eval results testing for stereotypical responses across demographic attributes, manual review logs documenting bias assessments, or output filtering rules blocking discriminatory patterns - may include automated fairness evaluation tools or bias monitoring dashboards if implemented.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Evaluating harm mitigation controls using performance metrics.
Evidence
C003.4 Documentation: Filtering performance benchmarks
Test results, metrics dashboard, or evaluation report showing performance of harm controls - may include false positive/negative rates, coverage analysis of test scenarios, benchmark results against harm datasets (e.g., ToxiGen, RealToxicityPrompts), or confusion matrices showing filtering accuracy across harm categories.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors, including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing, including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks such as ToxiGen, and coordinating with internal security and testing teams.
Maintaining documentation, including testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing customer communication protocols. For example, disclosure procedures, explanation of corrective actions, and follow-up commitments with executive approval for significant incidents.
Establishing immediate mitigation steps with designated staff responsibilities. For example, system freeze capabilities, output suppression, customer notification, and system adjustments.
Evidence
E002.1 Documentation: AI failure plan for harmful outputs
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining harmful output categories with reference to risk taxonomy. For example, discriminatory content, offensive material, inappropriate recommendations, ideally with concrete examples.
Coordinating external support engagement. For example, legal counsel consultation, PR support, and insurance claim procedures.
Evidence
E002.2 Documentation: Additional harmful output failure procedures
May include harmful output category definitions referenced to risk taxonomy, external support contact list (legal counsel, PR firms, insurance providers), support engagement procedures or runbooks, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on CBRN capabilities and mitigations.
Evidence
F002.1 Documentation: Foundation model CBRN capabilities
List of foundation models used with CBRN capability information - may include provider model cards with CBRN assessments, weapons of mass destruction risk evaluations from model developers, or other documentation describing CBRN-related capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing catastrophic misuse monitoring. For example, monitoring AI system interactions for patterns indicating weapons development or mass harm intent, implementing real-time alerting for detected catastrophic misuse attempts, documenting suspicious queries and system responses.
Evidence
F002.2 Config: Catastrophic misuse monitoring
Monitoring dashboard or alert configuration for catastrophic misuse patterns - may include usage monitoring flagging CBRN-related queries, alert rules for weapons development patterns, logs of detected and blocked catastrophic misuse attempts, or incident records documenting suspicious CBRN-related interactions.
Tags
Supplemental Control
Technical Implementation
Engineering Code