AIUC-1
Capability-Specific Evidence

Capability: Text AI

Requirement
·
Mandatory Requirement
Control activity
Documenting foundation model provider IP protections which may serve as primary infringement safeguards. For example, indemnification clauses or copyright/trademark guardrails.
Evidence
A007.1 Documentation: Model provider IP infringement protections
Foundation model provider contract, terms of service, or data processing agreement showing IP protection commitments including copyright/trademark handling policies, indemnification clauses, liability coverage, and any documented limitations or exclusions. May include vendor questionnaire responses or certification documents addressing IP protections.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing supplementary content filtering mechanisms where provider protections have gaps or limitations. For example, detecting copyrighted material in outputs, implementing trademark screening.
Evidence
A007.2 Config: IP infringement filtering
Screenshot of code, API configuration, or filtering system showing detection of copyrighted material, trademark screening, or content validation checks applied to AI outputs - this could be pattern matching logic, third-party API integration (e.g. copyright detection services), or custom filtering rules.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Eng: LLM output filtering logic
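As an illustration of the "pattern matching logic" that A007.2 evidence might show, here is a minimal Python sketch. The term list, the placeholder passage, and the function name `screen_output` are all hypothetical; a real deployment would more likely load maintained lists or call a third-party detection service.

```python
import re

# Hypothetical lists for illustration only.
TRADEMARK_TERMS = ["ExampleBrand", "AcmeCola"]
PROTECTED_PASSAGES = ["placeholder protected passage for illustration"]


def screen_output(text: str) -> list[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    # Whole-word, case-insensitive trademark screening.
    for term in TRADEMARK_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
            findings.append(f"trademark:{term}")
    # Verbatim-passage check on whitespace-normalized, lowercased text.
    normalized = " ".join(text.lower().split())
    for passage in PROTECTED_PASSAGES:
        if passage in normalized:
            findings.append("verbatim_passage")
    return findings
```

Exact matching of this kind only catches literal reuse. It is a sketch of the control's shape, not a complete infringement detector.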
Requirement
·
Mandatory Requirement
Control activity
Implementing user guidance and guardrails to reduce IP risk. For example, usage policies that explain prohibited content types, user warnings in product, restricting output generation in known infringement domains.
Implementing restrictions in AI acceptable use policy.
Evidence
A007.3 Logs: User-facing notices
Screenshot of user-facing IP risk guidance - may include warning messages when attempting high-risk operations, help center articles about IP infringement guidance, or UI elements explaining prohibited use cases.
Tags
Supplemental Control
Technical Implementation
Product
Acceptable Use Policy
Requirement
·
Optional Requirement
Control activity
Integrating automated moderation tools to filter inputs before they reach the foundation model. For example, integrating third-party moderation APIs, implementing custom filtering rules, configuring blocking or warning actions for flagged content, and establishing confidence thresholds based on risk category and severity.
Evidence
B005.1 Config: Input filtering
Screenshot of moderation tool integration showing API configuration, filtering rules, action settings (block/warn/modify), and confidence thresholds for different violation categories - this could be screenshots of configuration files, admin dashboard settings, or API integration code. Example moderation tools: OpenAI Moderation API, Claude content filtering, VirtueAI/Hive/Spectrum Labs.
Tags
Mandatory Control
Technical Implementation
Eng: User LLM input filtering logic
Engineering Tooling
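The action settings and per-category confidence thresholds described for B005.1 could be expressed along the following lines. This is a sketch under stated assumptions: the category names, threshold values, and the `decide` function are hypothetical, and the per-category scores are assumed to come from whichever moderation tool is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    threshold: float  # minimum score at which the rule fires
    action: str       # "warn" or "block"


# Hypothetical categories and thresholds; stricter thresholds for higher-severity harm.
RULES = {
    "self_harm": Rule(0.30, "block"),
    "violence": Rule(0.50, "block"),
    "profanity": Rule(0.80, "warn"),
}

SEVERITY = {"allow": 0, "warn": 1, "block": 2}


def decide(scores: dict[str, float]) -> str:
    """Return the most severe action triggered by any category score."""
    decision = "allow"
    for category, score in scores.items():
        rule = RULES.get(category)
        if rule is None or score < rule.threshold:
            continue
        if SEVERITY[rule.action] > SEVERITY[decision]:
            decision = rule.action
    return decision
```

Keeping thresholds in a single table like `RULES` makes them easy to capture in a screenshot and to justify one by one in the B005.2 documentation.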
Requirement
·
Optional Requirement
Control activity
Documenting the moderation logic and rationale. For example, explaining chosen moderation tools, threshold justifications, and decision criteria for different risk categories.
Evidence
B005.2 Documentation: Input moderation approach
Document explaining moderation approach including tool selection rationale, threshold settings with justifications, action logic for different violation types, and examples of how different input categories are handled.
Tags
Supplemental Control
Technical Implementation
Internal processes
Engineering Practice
Requirement
·
Optional Requirement
Control activity
Providing feedback to users when inputs are blocked.
Evidence
B005.3 Demonstration: Warning for blocked inputs
Screenshot of user-facing messages or UI flows showing how blocked inputs are communicated to users - this could be error messages, warning dialogs, or alternative suggestions provided when content is filtered.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Optional Requirement
Control activity
Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations.
Evidence
B005.4 Logs: Input filtering
Screenshot of logging system showing how flagged inputs are captured, what metadata is included/excluded for privacy, retention policies, and audit trail - may include privacy documentation explaining logging disclosures to users.
Tags
Supplemental Control
Technical Implementation
Logs
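One way to show "what metadata is included/excluded for privacy" in B005.4 is a log record that stores a digest of the flagged prompt instead of its raw text. The sketch below assumes that approach; the field names, the salt handling, and the 30-day retention default are hypothetical choices, not requirements of the control.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def build_log_record(prompt: str, category: str, score: float,
                     user_id: str, salt: str, retention_days: int = 30) -> str:
    """Build a JSON log line for a flagged prompt without storing raw text."""
    now = datetime.now(timezone.utc)
    record = {
        "timestamp": now.isoformat(),
        "expires_at": (now + timedelta(days=retention_days)).isoformat(),
        "category": category,
        "score": round(score, 3),
        # Included: digest and length, useful for deduplication and analysis.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_length": len(prompt),
        # Excluded: raw prompt and raw user ID; only a salted pseudonym is kept.
        "user_ref": hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()[:16],
    }
    return json.dumps(record, sort_keys=True)
```

Hashing is pseudonymization, not anonymization: short or guessable prompts can still be matched by brute force, so whether this satisfies a given privacy obligation is a separate assessment.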
Requirement
·
Optional Requirement
Control activity
Periodically evaluating filter performance and adjusting thresholds accordingly. For example, tracking accuracy, latency, and false positive/negative rates.
Evidence
B005.5 Documentation: Input filter performance
Report or dashboard showing analysis of filter performance metrics (false positives, false negatives, accuracy, latency) and documented threshold adjustments made based on performance data - should include timestamps and rationale for changes.
Tags
Supplemental Control
Technical Implementation
Engineering Practice
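The metrics named for B005.5 can be computed from a labeled sample of inputs. The sketch below assumes a hypothetical review set in which `labels[i]` is True when input `i` truly violates policy and `flagged[i]` is True when the filter flagged it; latency would be measured separately.

```python
def filter_metrics(labels: list[bool], flagged: list[bool]) -> dict[str, float]:
    """Compute accuracy, false positive rate, and false negative rate."""
    if not labels or len(labels) != len(flagged):
        raise ValueError("labels and flagged must be non-empty and equal length")
    pairs = list(zip(labels, flagged))
    tp = sum(1 for truth, flag in pairs if truth and flag)
    tn = sum(1 for truth, flag in pairs if not truth and not flag)
    fp = sum(1 for truth, flag in pairs if not truth and flag)
    fn = sum(1 for truth, flag in pairs if truth and not flag)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of benign inputs that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of violating inputs that slipped through.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

Recording these values with a timestamp each time a threshold changes gives the "rationale for changes" trail the evidence description asks for.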
Requirement
·
Mandatory Requirement
Control activity
Limiting outputs to only the relevant results, to balance security and utility. For example, character limits, limits on inference time.
Evidence
B009.1 Config: Output volume limits
Screenshot of code or configuration implementing output restrictions - may include character or token limits, inference time limits, result count restrictions, or timeout configurations preventing excessive output. Can be demonstrated by product demo showing system timeout when requesting output exceeding limits.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Product
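Character limits and result count restrictions for B009.1 can be as simple as the following sketch. The limit values, the notice text, and the function names are hypothetical; token limits and inference timeouts would normally be set in the model API or serving configuration and are not shown here.

```python
MAX_CHARS = 2000   # hypothetical character limit
MAX_RESULTS = 5    # hypothetical cap on returned items
NOTICE = "[Output truncated to stay within length limits.]"


def limit_results(results: list[str], max_results: int = MAX_RESULTS) -> list[str]:
    """Keep only the first max_results items."""
    return results[:max_results]


def limit_text(text: str, max_chars: int = MAX_CHARS) -> str:
    """Truncate text to max_chars and append a user-facing notice if cut."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "\n" + NOTICE
```

Appending a notice on truncation also supports B009.2, which asks for user-facing communication about output limitations.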
Requirement
·
Mandatory Requirement
Control activity
Providing user-facing notices or documentation about output limitations.
Evidence
B009.2 Demonstration: User output notices
Screenshot of product interface showing user notices about output limitations - may include messages indicating truncated or suppressed outputs for security or privacy reasons, user documentation explaining limitation policies, or help articles describing output restrictions.
Tags
Supplemental Control
Operational Practices
Product
Requirement
·
Mandatory Requirement
Control activity
Limiting the fidelity of model outputs in certain use cases. For example, applying output rounding, threshold bands, or obfuscation techniques to reduce the risk of model inversion.
Evidence
B009.3 Config: Output precision controls
Screenshot of code implementing output fidelity limitations - may include rounding logic for numerical outputs, threshold bands reducing precision, or obfuscation techniques preventing model inversion, precision-sensitive data disclosure, or adversarial model extraction attacks.
Tags
Supplemental Control
Technical Implementation
Engineering Code
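Output rounding and threshold bands, the two techniques B009.3 names first, might look like the sketch below for a model that returns a score between 0 and 1. The band boundaries and labels are hypothetical.

```python
# Hypothetical band boundaries: (upper bound, label).
BANDS = [(0.33, "low"), (0.66, "medium"), (1.0, "high")]


def round_score(score: float, decimals: int = 1) -> float:
    """Reduce numeric precision so small input changes are less observable."""
    return round(score, decimals)


def to_band(score: float) -> str:
    """Replace an exact score with a coarse band label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    for upper, label in BANDS:
        if score <= upper:
            return label
    return BANDS[-1][1]
```

Coarser outputs give an attacker less signal per query, which is the stated aim of reducing model inversion risk. The right granularity depends on how much precision legitimate users need.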
Requirement
·
Mandatory Requirement
Control activity
Implementing content filtering for harmful output types. For example, detecting and blocking distressed responses, angry language, offensive content, biased statements, and deceptive information.
Evidence
C003.1 Config: Harmful output filtering
Screenshot of content filtering rules, moderation API configuration, or classifier settings showing detection and blocking logic for harmful output types - may include filtering rules in code, third-party moderation tool configuration (e.g., OpenAI Moderation API, Perspective API), or custom classifier model settings with harm category definitions.
Tags
Mandatory Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing guardrails for advice generation. For example, restricting high-risk recommendations in sensitive domains, requiring disclaimers for guidance.
Evidence
C003.2 Config: Guardrails for high-risk advice
Screenshot of system prompts, guardrail rules, or domain restrictions showing safety controls on advice generation - may include defensive prompting, domain-specific output restrictions (e.g., medical/legal/financial advice blocklists), or conditional response templates that add warnings for sensitive topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
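For C003.2, a "conditional response template" combined with a domain blocklist could be sketched as follows. Keyword matching stands in for whatever domain classifier is actually used; the keywords, disclaimer wording, and the choice to block the legal domain are hypothetical.

```python
from typing import Optional

# Hypothetical keyword lists standing in for a real domain classifier.
DOMAIN_KEYWORDS = {
    "medical": ("dosage", "diagnosis", "symptom"),
    "legal": ("lawsuit", "contract dispute"),
    "financial": ("invest", "stock", "retirement"),
}
DISCLAIMERS = {
    "medical": "This is general information, not medical advice.",
    "financial": "This is general information, not financial advice.",
}
BLOCKED_DOMAINS = {"legal"}  # hypothetical: advice in this domain is refused
REFUSAL = "I can't help with that topic. Please consult a qualified professional."


def detect_domain(user_message: str) -> Optional[str]:
    """Return the first sensitive domain whose keywords appear, else None."""
    lowered = user_message.lower()
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return domain
    return None


def apply_guardrail(user_message: str, draft_response: str) -> str:
    """Refuse blocked domains; append a disclaimer for other sensitive ones."""
    domain = detect_domain(user_message)
    if domain in BLOCKED_DOMAINS:
        return REFUSAL
    if domain in DISCLAIMERS:
        return f"{draft_response}\n\n{DISCLAIMERS[domain]}"
    return draft_response
```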
Requirement
·
Mandatory Requirement
Control activity
Implementing bias detection and mitigation controls. For example, monitoring for discriminatory patterns, implementing fairness checks in outputs.
Evidence
C003.3 Config: Guardrails for biased outputs
Documentation of bias eval results testing for stereotypical responses across demographic attributes, manual review logs documenting bias assessments, or output filtering rules blocking discriminatory patterns - may include automated fairness evaluation tools or bias monitoring dashboards if implemented.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Evaluating harm mitigation controls using performance metrics.
Evidence
C003.4 Documentation: Filtering performance benchmarks
Test results, metrics dashboard, or evaluation report showing performance of harm controls - may include false positive/negative rates, coverage analysis of test scenarios, benchmark results against harm datasets (e.g., ToxiGen, RealToxicityPrompts), or confusion matrices showing filtering accuracy across harm categories.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Detecting and blocking out-of-scope requests. For example, detecting conversations outside intended use cases, blocking prohibited topics, providing redirection messages when users hit boundaries, and escalating or restricting access for repeated violations.
Evidence
C004.1 Config: Out-of-scope guardrails
Screenshot of blocking rules, defensive prompting, or filtering configuration showing how out-of-scope requests are detected and handled - may include topic blocklists, redirection message templates, escalation rules for repeated attempts, or system prompts defining allowed topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
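The detection, redirection, and escalation steps described for C004.1 can be sketched as a small stateful guard. The topic label is assumed to come from an upstream classifier that is not shown; the allowed topics, redirect text, and violation limit are hypothetical.

```python
from collections import defaultdict

ALLOWED_TOPICS = {"billing", "shipping", "returns"}  # hypothetical scope
REDIRECT_MESSAGE = "I can only help with billing, shipping, and returns."


class ScopeGuard:
    """Track out-of-scope attempts per user and escalate on repeats."""

    def __init__(self, max_violations: int = 3):
        self.max_violations = max_violations
        self.violations = defaultdict(int)

    def check(self, user_id: str, topic: str) -> str:
        """Return 'allow', 'redirect', or 'restrict' for this request."""
        if topic in ALLOWED_TOPICS:
            return "allow"
        self.violations[user_id] += 1
        if self.violations[user_id] >= self.max_violations:
            return "restrict"  # repeated violations: restrict access or escalate
        return "redirect"      # caller shows REDIRECT_MESSAGE to the user
```

In-memory counters reset on restart. A production version would likely persist counts and expire them over time.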
Requirement
·
Mandatory Requirement
Control activity
Tracking out-of-scope violations and updating boundaries. For example, logging boundary violations, adjusting restrictions based on misuse patterns.
Evidence
C004.2 Logs: Out-of-scope attempts
Logs showing out-of-scope attempts with frequency data. May include documentation of boundary updates made in response to violations, monitoring dashboard of flagged requests, change log showing restriction updates with rationale, or incident reports triggering scope adjustments.
Tags
Mandatory Control
Technical Implementation
Logs
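The "frequency data" mentioned for C004.2 could be derived from violation logs with a short aggregation like the one below. It assumes each log event is a dict with a `topic` key, which is a hypothetical schema.

```python
from collections import Counter


def violation_frequency(events: list[dict]) -> list[tuple[str, int]]:
    """Count out-of-scope attempts per topic, most frequent first."""
    return Counter(event["topic"] for event in events).most_common()
```

Topics that rise to the top of this list are candidates for a boundary update, which can then be recorded in a change log with its rationale.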
Requirement
·
Mandatory Requirement
Control activity
Providing user guidance on system capabilities and limitations. For example, communicating what the AI system can and cannot do, intended use cases, and topics or requests outside the system's scope.
Evidence
C004.3 Demonstration: User guidance on scope
Screenshot of user-facing guidance explaining system capabilities and limitations - may include onboarding tooltips or welcome screens, help documentation or FAQs describing intended use, UI warnings when approaching scope boundaries, or published usage guidelines.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks like ToxiGen, coordinating with internal security and testing teams.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments of out-of-scope outputs at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C011.1 Report: Out-of-scope output testing
Third-party evaluation report showing out-of-scope output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing factual accuracy controls. For example, deploying available fact-checking mechanisms, flagging uncertain or low-confidence responses.
Evidence
D001.1 Config: Groundedness filter
Screenshot of code or configuration showing groundedness validation - may include filters checking responses against source documents, fact-checking API integration, or logic comparing generated content to retrieved context for factual accuracy.
Tags
Mandatory Control
Technical Implementation
Engineering Code
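The "logic comparing generated content to retrieved context" described for D001.1 can be illustrated with a deliberately simple lexical check. This sketch flags response sentences whose words mostly do not appear in the context; the 0.6 overlap threshold and the function names are hypothetical. Word overlap is a crude proxy: it cannot detect a sentence that reuses context words to say something false, which is why entailment models or fact-checking services are often used instead.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(text: str) -> list[str]:
    """Naive sentence split on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ungrounded_sentences(response: str, context: str,
                         min_overlap: float = 0.6) -> list[str]:
    """Return response sentences with too little word overlap with context."""
    context_tokens = _tokens(context)
    flagged = []
    for sentence in split_sentences(response):
        tokens = _tokens(sentence)
        if not tokens:
            continue
        overlap = len(tokens & context_tokens) / len(tokens)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged
```

Flagged sentences could be suppressed, or shown with a low-confidence warning of the kind D001.3 describes.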
Requirement
·
Mandatory Requirement
Control activity
Establishing information source validation. For example, requiring citations for factual claims, implementing source reliability checks.
Evidence
D001.2 Demonstration: User-facing citations & source attributions
Screenshot of UI or output format showing citations and source attributions provided to users - may include inline citations, source links, reference lists, or attribution labels identifying where information originated.
Tags
Mandatory Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Maintaining uncertainty communication. For example, displaying confidence levels, providing appropriate disclaimers for generated information.
Evidence
D001.3 Demonstration: User-facing uncertainty labels
Screenshot of UI or output format showing confidence levels, uncertainty disclaimers, or warnings for generated information - may include confidence score displays, low-certainty warnings, or standard disclaimers about potential inaccuracies.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
D002.1 Report: Hallucination testing results
Third-party evaluation report showing hallucination testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing customer communication protocols. For example, disclosure procedures, explanation of corrective actions, and follow-up commitments with executive approval for significant incidents.
Establishing immediate mitigation steps with designated staff responsibilities. For example, system freeze capabilities, output suppression, customer notification, and system adjustments.
Evidence
E002.1 Documentation: AI failure plan for harmful outputs
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining harmful output categories with reference to risk taxonomy. For example, discriminatory content, offensive material, inappropriate recommendations, ideally with concrete examples.
Coordinating external support engagement. For example, legal counsel consultation, PR support, and insurance claim procedures.
Evidence
E002.2 Documentation: Additional harmful output failure procedures
May include harmful output category definitions referenced to risk taxonomy, external support contact list (legal counsel, PR firms, insurance providers), support engagement procedures or runbooks, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Establishing compensation assessment procedures. For example, loss evaluation methods, settlement approaches, and payment authorization levels with appropriate approval requirements.
Implementing remediation measures. For example, system freeze capabilities, model adjustments, output validation improvements, customer notification, and enhanced monitoring.
Evidence
E003.1 Documentation: AI failure plan for hallucinations
Can be a standalone document or integrated into existing incident response procedures/policies.
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining hallucination incident types.
Coordinating potential external support. For example, legal consultation for significant claims, financial review when needed, and insurance coverage activation.
Evidence
E003.2 Documentation: Additional hallucination failure procedures
May include hallucination incident categories (e.g. factual errors, incorrect recommendations), external support contact list (legal counsel, financial reviewers, insurance providers), support engagement procedures, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on offensive cyber capabilities and mitigations.
Evidence
F001.1 Documentation: Foundation model cyber capabilities
Provider model cards, cybersecurity assessment reports from model developers, or foundation model documentation describing offensive cyber capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Implementing malicious use detection and blocking. For example, deploying available content filtering to detect requests for malicious code generation, attack planning, and vulnerability exploitation guidance, configuring automated blocking of cyber attack assistance requests, maintaining databases of prohibited use patterns.
Evidence
F001.2 Config: Cyber use detection
Content filtering rules blocking cyber attack requests, keyword or pattern matching detecting malicious code generation attempts, automated blocking configuration for exploit development queries, or prohibited use pattern database.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on CBRN capabilities and mitigations.
Evidence
F002.1 Documentation: Foundation model CBRN capabilities
List of foundation models used with CBRN capability information - may include provider model cards with CBRN assessments, weapons of mass destruction risk evaluations from model developers, or other documentation describing CBRN-related capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing catastrophic misuse monitoring. For example, monitoring AI system interactions for patterns indicating weapons development or mass harm intent, implementing real-time alerting for detected catastrophic misuse attempts, documenting suspicious queries and system responses.
Evidence
F002.2 Config: Catastrophic misuse monitoring
Monitoring dashboard or alert configuration for catastrophic misuse patterns - may include usage monitoring flagging CBRN-related queries, alert rules for weapons development patterns, logs of detected and blocked catastrophic misuse attempts, or incident records documenting suspicious CBRN-related interactions.
Tags
Supplemental Control
Technical Implementation
Engineering Code

Capability: Voice AI

Requirement
·
Mandatory Requirement
Control activity
Documenting foundation model provider IP protections which may serve as primary infringement safeguards. For example, indemnification clauses or copyright/trademark guardrails.
Evidence
A007.1 Documentation: Model provider IP infringement protections
Foundation model provider contract, terms of service, or data processing agreement showing IP protection commitments including copyright/trademark handling policies, indemnification clauses, liability coverage, and any documented limitations or exclusions. May include vendor questionnaire responses or certification documents addressing IP protections.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing supplementary content filtering mechanisms where provider protections have gaps or limitations. For example, detecting copyrighted material in outputs, implementing trademark screening.
Evidence
A007.2 Config: IP infringement filtering
Screenshot of code, API configuration, or filtering system showing detection of copyrighted material, trademark screening, or content validation checks applied to AI outputs - this could be pattern matching logic, third-party API integration (e.g. copyright detection services), or custom filtering rules.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing user guidance and guardrails to reduce IP risk. For example, usage policies that explain prohibited content types, user warnings in product, restricting output generation in known infringement domains.
Implementing restrictions in AI acceptable use policy.
Evidence
A007.3 Logs: User-facing notices
Screenshot of user-facing IP risk guidance - may include warning messages when attempting high-risk operations, help center articles about IP infringement guidance, or UI elements explaining prohibited use cases.
Tags
Supplemental Control
Technical Implementation
Product
Acceptable Use Policy
Requirement
·
Optional Requirement
Control activity
Integrating automated moderation tools to filter inputs before they reach the foundation model. For example, integrating third-party moderation APIs, implementing custom filtering rules, configuring blocking or warning actions for flagged content, and establishing confidence thresholds based on risk category and severity.
Evidence
B005.1 Config: Input filtering
Screenshot of moderation tool integration showing API configuration, filtering rules, action settings (block/warn/modify), and confidence thresholds for different violation categories - this could be screenshots of configuration files, admin dashboard settings, or API integration code. Example moderation tools: OpenAI Moderation API, Claude content filtering, VirtueAI/Hive/Spectrum Labs.
Tags
Mandatory Control
Technical Implementation
Eng: User LLM input filtering logic
Engineering Tooling
Requirement
·
Optional Requirement
Control activity
Documenting the moderation logic and rationale. For example, explaining chosen moderation tools, threshold justifications, and decision criteria for different risk categories.
Evidence
B005.2 Documentation: Input moderation approach
Document explaining moderation approach including tool selection rationale, threshold settings with justifications, action logic for different violation types, and examples of how different input categories are handled.
Tags
Supplemental Control
Technical Implementation
Internal processes
Engineering Practice
Requirement
·
Optional Requirement
Control activity
Providing feedback to users when inputs are blocked.
Evidence
B005.3 Demonstration: Warning for blocked inputs
Screenshot of user-facing messages or UI flows showing how blocked inputs are communicated to users - this could be error messages, warning dialogs, or alternative suggestions provided when content is filtered.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Optional Requirement
Control activity
Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations.
Evidence
B005.4 Logs: Input filtering
Screenshot of logging system showing how flagged inputs are captured, what metadata is included/excluded for privacy, retention policies, and audit trail - may include privacy documentation explaining logging disclosures to users.
Tags
Supplemental Control
Technical Implementation
Logs
Requirement
·
Optional Requirement
Control activity
Periodically evaluating filter performance and adjusting thresholds accordingly. For example, tracking accuracy, latency, and false positive/negative rates.
Evidence
B005.5 Documentation: Input filter performance
Report or dashboard showing analysis of filter performance metrics (false positives, false negatives, accuracy, latency) and documented threshold adjustments made based on performance data - should include timestamps and rationale for changes.
Tags
Supplemental Control
Technical Implementation
Engineering Practice
Requirement
·
Mandatory Requirement
Control activity
Limiting outputs to only the relevant results, to balance security and utility. For example, character limits, limits on inference time.
Evidence
B009.1 Config: Output volume limits
Screenshot of code or configuration implementing output restrictions - may include character or token limits, inference time limits, result count restrictions, or timeout configurations preventing excessive output. Can be demonstrated by product demo showing system timeout when requesting output exceeding limits.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Product
Requirement
·
Mandatory Requirement
Control activity
Providing user-facing notices or documentation about output limitations.
Evidence
B009.2 Demonstration: User output notices
Screenshot of product interface showing user notices about output limitations - may include messages indicating truncated or suppressed outputs for security or privacy reasons, user documentation explaining limitation policies, or help articles describing output restrictions.
Tags
Supplemental Control
Operational Practices
Product
Requirement
·
Mandatory Requirement
Control activity
Limiting the fidelity of model outputs in certain use cases. For example, applying output rounding, threshold bands, or obfuscation techniques to reduce the risk of model inversion.
Evidence
B009.3 Config: Output precision controls
Screenshot of code implementing output fidelity limitations - may include rounding logic for numerical outputs, threshold bands reducing precision, or obfuscation techniques preventing model inversion, precision-sensitive data disclosure, or adversarial model extraction attacks.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Implementing content filtering for harmful output types. For example, detecting and blocking distressed responses, angry language, offensive content, biased statements, and deceptive information.
Evidence
C003.1 Config: Harmful output filtering
Screenshot of content filtering rules, moderation API configuration, or classifier settings showing detection and blocking logic for harmful output types - may include filtering rules in code, third-party moderation tool configuration (e.g., OpenAI Moderation API, Perspective API), or custom classifier model settings with harm category definitions.
Tags
Mandatory Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing guardrails for advice generation. For example, restricting high-risk recommendations in sensitive domains, requiring disclaimers for guidance.
Evidence
C003.2 Config: Guardrails for high-risk advice
Screenshot of system prompts, guardrail rules, or domain restrictions showing safety controls on advice generation - may include defensive prompting, domain-specific output restrictions (e.g., medical/legal/financial advice blocklists), or conditional response templates that add warnings for sensitive topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Implementing bias detection and mitigation controls. For example, monitoring for discriminatory patterns, implementing fairness checks in outputs.
Evidence
C003.3 Config: Guardrails for biased outputs
Documentation of bias eval results testing for stereotypical responses across demographic attributes, manual review logs documenting bias assessments, or output filtering rules blocking discriminatory patterns - may include automated fairness evaluation tools or bias monitoring dashboards if implemented.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Evaluating harm mitigation controls using performance metrics.
Evidence
C003.4 Documentation: Filtering performance benchmarks
Test results, metrics dashboard, or evaluation report showing performance of harm controls - may include false positive/negative rates, coverage analysis of test scenarios, benchmark results against harm datasets (e.g., ToxiGen, RealToxicityPrompts), or confusion matrices showing filtering accuracy across harm categories.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Detecting and blocking out-of-scope requests. For example, detecting conversations outside intended use cases, blocking prohibited topics, providing redirection messages when users hit boundaries, and escalating or restricting access for repeated violations.
Evidence
C004.1 Config: Out-of-scope guardrails
Screenshot of blocking rules, defensive prompting, or filtering configuration showing how out-of-scope requests are detected and handled - may include topic blocklists, redirection message templates, escalation rules for repeated attempts, or system prompts defining allowed topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Tracking out-of-scope violations and updating boundaries. For example, logging boundary violations, adjusting restrictions based on misuse patterns.
Evidence
C004.2 Logs: Out-of-scope attempts
Logs showing out-of-scope attempts with frequency data. May include documentation of boundary updates made in response to violations, monitoring dashboard of flagged requests, change log showing restriction updates with rationale, or incident reports triggering scope adjustments.
Tags
Mandatory Control
Technical Implementation
Logs
Requirement
·
Mandatory Requirement
Control activity
Providing user guidance on system capabilities and limitations. For example, communicating what the AI system can and cannot do, intended use cases, and topics or requests outside the system's scope.
Evidence
C004.3 Demonstration: User guidance on scope
Screenshot of user-facing guidance explaining system capabilities and limitations - may include onboarding tooltips or welcome screens, help documentation or FAQs describing intended use, UI warnings when approaching scope boundaries, or published usage guidelines.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks like ToxiGen, coordinating with internal security and testing teams.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments of out-of-scope outputs at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C011.1 Report: Out-of-scope output testing
Third-party evaluation report showing out-of-scope output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing factual accuracy controls. For example, deploying available fact-checking mechanisms, flagging uncertain or low-confidence responses.
Evidence
D001.1 Config: Groundedness filter
Screenshot of code or configuration showing groundedness validation - may include filters checking responses against source documents, fact-checking API integration, or logic comparing generated content to retrieved context for factual accuracy.
Tags
Mandatory Control
Technical Implementation
Engineering Code
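The "logic comparing generated content to retrieved context" described for D001.1 could, in its simplest form, look like the sketch below. It uses token overlap as a crude stand-in for groundedness; this is an assumption made for illustration, and real deployments often rely on entailment models or a fact-checking service instead. The function names and the 0.6 threshold are hypothetical.

```python
import re


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens; punctuation is ignored.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def groundedness_score(sentence: str, context_chunks: list[str]) -> float:
    """Best fraction of the sentence's tokens found in any single context chunk."""
    sentence_tokens = _tokens(sentence)
    if not sentence_tokens:
        return 1.0
    best = 0.0
    for chunk in context_chunks:
        overlap = len(sentence_tokens & _tokens(chunk)) / len(sentence_tokens)
        best = max(best, overlap)
    return best


def flag_ungrounded(response: str, context_chunks: list[str],
                    threshold: float = 0.6) -> list[str]:
    """Return the sentences of the response whose score falls below threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]
    return [s for s in sentences if groundedness_score(s, context_chunks) < threshold]
```

Flagged sentences could then be suppressed, rewritten, or surfaced to the user with a low-confidence label.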
Requirement
·
Mandatory Requirement
Control activity
Establishing information source validation. For example, requiring citations for factual claims, implementing source reliability checks.
Evidence
D001.2 Demonstration: User-facing citations & source attributions
Screenshot of UI or output format showing citations and source attributions provided to users - may include inline citations, source links, reference lists, or attribution labels identifying where information originated.
Tags
Mandatory Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Maintaining uncertainty communication. For example, displaying confidence levels, providing appropriate disclaimers for generated information.
Evidence
D001.3 Demonstration: User-facing uncertainty labels
Screenshot of UI or output format showing confidence levels, uncertainty disclaimers, or warnings for generated information - may include confidence score displays, low-certainty warnings, or standard disclaimers about potential inaccuracies.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
D002.1 Report: Hallucination testing results
Third-party evaluation report showing hallucination testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing customer communication protocols. For example, disclosure procedures, explanation of corrective actions, and follow-up commitments with executive approval for significant incidents.
Establishing immediate mitigation steps with designated staff responsibilities. For example, system freeze capabilities, output suppression, customer notification, and system adjustments.
Evidence
E002.1 Documentation: AI failure plan for harmful outputs
Can be standalone document or integrated in existing incident response procedures/policies
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining harmful output categories with reference to risk taxonomy. For example, discriminatory content, offensive material, inappropriate recommendations, ideally with concrete examples.
Coordinating external support engagement. For example, legal counsel consultation, PR support, and insurance claim procedures.
Evidence
E002.2 Documentation: Additional harmful output failure procedures
May include harmful output category definitions referenced to risk taxonomy, external support contact list (legal counsel, PR firms, insurance providers), support engagement procedures or runbooks, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Establishing compensation assessment procedures. For example, loss evaluation methods, settlement approaches, and payment authorization levels with appropriate approval requirements.
Implementing remediation measures. For example, system freeze capabilities, model adjustments, output validation improvements, customer notification, and enhanced monitoring.
Evidence
E003.1 Documentation: AI failure plan for hallucinations
Can be standalone document or integrated in existing incident response procedures/policies
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining hallucination incident types.
Coordinating potential external support. For example, legal consultation for significant claims, financial review when needed, and insurance coverage activation.
Evidence
E003.2 Documentation: Additional hallucination failure procedures
May include hallucination incident categories (e.g. factual errors, incorrect recommendations), external support contact list (legal counsel, financial reviewers, insurance providers), support engagement procedures, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on offensive cyber capabilities and mitigations.
Evidence
F001.1 Documentation: Foundation model cyber capabilities
Provider model cards, cybersecurity assessment reports from model developers, or foundation model documentation describing offensive cyber capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Implementing malicious use detection and blocking. For example, deploying available content filtering to detect requests for malicious code generation, attack planning, and vulnerability exploitation guidance, configuring automated blocking of cyber attack assistance requests, maintaining databases of prohibited use patterns.
Evidence
F001.2 Config: Cyber use detection
Content filtering rules blocking cyber attack requests, keyword or pattern matching detecting malicious code generation attempts, automated blocking configuration for exploit development queries, or prohibited use pattern database.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on CBRN capabilities and mitigations.
Evidence
F002.1 Documentation: Foundation model CBRN capabilities
List of foundation models used with CBRN capability information - may include provider model cards with CBRN assessments, weapons of mass destruction risk evaluations from model developers, or other documentation describing CBRN-related capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing catastrophic misuse monitoring. For example, monitoring AI system interactions for patterns indicating weapons development or mass harm intent, implementing real-time alerting for detected catastrophic misuse attempts, documenting suspicious queries and system responses.
Evidence
F002.2 Config: Catastrophic misuse monitoring
Monitoring dashboard or alert configuration for catastrophic misuse patterns - may include usage monitoring flagging CBRN-related queries, alert rules for weapons development patterns, logs of detected and blocked catastrophic misuse attempts, or incident records documenting suspicious CBRN-related interactions.
Tags
Supplemental Control
Technical Implementation
Engineering Code

Capability: Automation

Requirement
·
Mandatory Requirement
Control activity
Implementing technical restrictions that limit agent capabilities to authorized scope. For example, restricting agent access to approved backend services and APIs, enforcing network segmentation or API gateway rules, or implementing service-level authorization preventing access to sensitive systems.
Evidence
B006.1 Config: Agent service access restrictions
Screenshot of configuration showing technical limitations on agent backend access - may include API gateway rules restricting accessible services, network policies defining allowed endpoints, service-level authorization configuration, or architecture diagram showing agent isolation boundaries.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Deploying monitoring and alerting for agent actions that exceed security boundaries. For example, logging all agent service interactions, alerting on access attempts to unauthorized systems or APIs, or anomaly detection flagging unusual connection patterns.
Evidence
B006.2 Config: Agent security monitoring and alerting
Screenshot of monitoring configuration tracking agent security-relevant actions - may include logging setup capturing agent service calls and authentication attempts, alert rules for unauthorized system access, security monitoring dashboard showing agent infrastructure interactions, or example logs demonstrating boundary violations are detected.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Logs
Requirement
·
Mandatory Requirement
Control activity
Implementing function call validation and authorization. For example, restricting tool access to approved functions, validating parameters before execution.
Evidence
D003.1 Config: Tool authorization & validation
Screenshot of code or configuration showing function allowlists, parameter validation logic, or authz checks before tool execution - may include tool permission schemas, input validation functions, or access control lists restricting available tools per agent/user.
Tags
Mandatory Control
Technical Implementation
Engineering Code
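A function allowlist with parameter validation, as described for D003.1, might be sketched as follows. The roles, tool names, and schemas are hypothetical placeholders; the point is only that the authorization check and the parameter check both run before any tool executes.

```python
# Hypothetical allowlist: agent role -> tools it may call.
TOOL_ALLOWLIST = {
    "support_agent": {"lookup_order", "issue_refund"},
    "readonly_agent": {"lookup_order"},
}

# Hypothetical parameter schemas: tool -> {parameter name: expected type}.
PARAM_SCHEMAS = {
    "lookup_order": {"order_id": str},
    "issue_refund": {"order_id": str, "amount": float},
}


class ToolCallRejected(Exception):
    """Raised when a tool call fails authorization or validation."""


def authorize_tool_call(agent_role: str, tool: str, params: dict) -> None:
    """Raise ToolCallRejected unless the call is allowed and well-formed."""
    if tool not in TOOL_ALLOWLIST.get(agent_role, set()):
        raise ToolCallRejected(f"{agent_role!r} may not call {tool!r}")
    schema = PARAM_SCHEMAS.get(tool)
    if schema is None:
        raise ToolCallRejected(f"no parameter schema registered for {tool!r}")
    if set(params) != set(schema):
        raise ToolCallRejected(f"expected parameters {sorted(schema)}, got {sorted(params)}")
    for name, expected_type in schema.items():
        if not isinstance(params[name], expected_type):
            raise ToolCallRejected(f"{name!r} must be {expected_type.__name__}")
```

Rejecting by default (unknown role, unknown tool, missing schema) keeps newly added tools unavailable until someone explicitly permits them.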
Requirement
·
Mandatory Requirement
Control activity
Enforcing rate limits and transaction caps for autonomous tool use.
Evidence
D003.2 Config: Rate limits for tools
Screenshot of code or configuration showing rate limits and transaction caps on tool usage - may include per-tool usage quotas, time-windowed limits, or circuit breakers preventing excessive autonomous tool calls.
Tags
Mandatory Control
Technical Implementation
Engineering Code
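For D003.2, a time-windowed limit combined with a per-call transaction cap could be implemented along these lines. This is a minimal in-memory sketch; the class name and parameters are assumptions, and a multi-instance deployment would need shared state (for example, in a database or cache) rather than a local dictionary.

```python
import time
from collections import deque
from typing import Optional


class ToolRateLimiter:
    """Sliding-window call limit plus a cap on the amount of a single call."""

    def __init__(self, max_calls: int, window_seconds: float,
                 max_amount_per_call: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.max_amount_per_call = max_amount_per_call
        self._calls: dict[str, deque] = {}  # tool name -> recent call timestamps

    def allow(self, tool: str, amount: float = 0.0,
              now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Transaction cap: reject any single call above the configured amount.
        if amount > self.max_amount_per_call:
            return False
        calls = self._calls.setdefault(tool, deque())
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

The optional `now` argument exists only to make the limiter deterministic in tests.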
Requirement
·
Mandatory Requirement
Control activity
Establishing execution monitoring and logging. For example, tracking all tool calls, monitoring for unauthorized access attempts or scope violations.
Evidence
D003.3 Config: Tool call log
Screenshot of logging configuration, monitoring dashboard, or audit logs showing tracked tool calls - may include tool execution logs with timestamps and parameters, alerts for unauthorized access attempts, or monitoring system flagging scope violations.
Tags
Mandatory Control
Technical Implementation
Logs
Requirement
·
Mandatory Requirement
Control activity
Requiring human approval for sensitive tool operations. For example, requiring human confirmation before executing high-risk actions, implementing approval workflows for operations beyond autonomous boundaries.
Evidence
D003.4 Config: Human-approval workflows
Screenshot of approval workflow, code requiring human confirmation, or ticketing system for sensitive tool operations.
Tags
Supplemental Control
Operational Practices
Internal processes
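"Code requiring human confirmation" for D003.4 might resemble the sketch below, in which sensitive tool calls are parked until a named approver releases them. The tool names and the `ApprovalGate` class are hypothetical; in practice the pending queue would typically live in a ticketing or workflow system.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical set of tools that always require human approval.
SENSITIVE_TOOLS = {"delete_account", "wire_transfer"}


@dataclass
class ApprovalGate:
    """Holds sensitive tool calls until a human approves or denies them."""
    pending: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)
    _next_id: int = 1

    def submit(self, tool: str, params: dict,
               execute: Callable[[dict], str]) -> str:
        # Non-sensitive tools run immediately; sensitive ones are queued.
        if tool not in SENSITIVE_TOOLS:
            return execute(params)
        request_id = self._next_id
        self._next_id += 1
        self.pending[request_id] = (tool, params, execute)
        return f"pending approval #{request_id}"

    def approve(self, request_id: int, approver: str) -> str:
        tool, params, execute = self.pending.pop(request_id)
        self.audit.append((request_id, tool, approver, "approved"))
        return execute(params)

    def deny(self, request_id: int, approver: str) -> None:
        tool, _, _ = self.pending.pop(request_id)
        self.audit.append((request_id, tool, approver, "denied"))
```

The `audit` list records who approved or denied each request, which is the kind of trail an assessor would likely ask to see.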
Requirement
·
Mandatory Requirement
Control activity
Reviewing patterns of AI tool usage. For example, identifying anomalies, updating tool permissions, and retiring unused or high-risk functions during scheduled evaluations.
Evidence
D003.5 Documentation: Tool call log reviews
Reports or documentation showing periodic review of tool usage patterns, permission updates, and function retirement decisions - may include usage analytics identifying anomalies, change logs showing permission adjustments, or records of deprecated/retired tools with rationale.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on risk taxonomy and performing assessments of tool calls at least every quarter.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
D004.1 Report: Tool call testing
Third-party evaluation report showing tool call testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on offensive cyber capabilities and mitigations.
Evidence
F001.1 Documentation: Foundation model cyber capabilities
Provider model cards, cybersecurity assessment reports from model developers, or foundation model documentation describing offensive cyber capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Implementing malicious use detection and blocking. For example, deploying available content filtering to detect requests for malicious code generation, attack planning, and vulnerability exploitation guidance, configuring automated blocking of cyber attack assistance requests, maintaining databases of prohibited use patterns.
Evidence
F001.2 Config: Cyber use detection
Content filtering rules blocking cyber attack requests, keyword or pattern matching detecting malicious code generation attempts, automated blocking configuration for exploit development queries, or prohibited use pattern database.
Tags
Supplemental Control
Technical Implementation
Engineering Code

Capability: Image generation

Requirement
·
Mandatory Requirement
Control activity
Documenting foundation model provider IP protections which may serve as primary infringement safeguards. For example, indemnification clauses or copyright/trademark guardrails.
Evidence
A007.1 Documentation: Model provider IP infringement protections
Foundation model provider contract, terms of service, or data processing agreement showing IP protection commitments including copyright/trademark handling policies, indemnification clauses, liability coverage, and any documented limitations or exclusions. May include vendor questionnaire responses or certification documents addressing IP protections.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing supplementary content filtering mechanisms where provider protections have gaps or limitations. For example, detecting copyrighted material in outputs, implementing trademark screening.
Evidence
A007.2 Config: IP infringement filtering
Screenshot of code, API configuration, or filtering system showing detection of copyrighted material, trademark screening, or content validation checks applied to AI outputs - this could be pattern matching logic, third-party API integration (e.g. copyright detection services), or custom filtering rules.
Tags
Supplemental Control
Technical Implementation
Engineering Code
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing user guidance and guardrails to reduce IP risk. For example, usage policies that explain prohibited content types, user warnings in product, restricting output generation in known infringement domains.
Implementing restrictions in AI acceptable use policy.
Evidence
A007.3 Logs: User-facing notices
Screenshot of user-facing IP risk guidance - may include warning messages when attempting high-risk operations, help center articles about IP infringement guidance, or UI elements explaining prohibited use cases.
Tags
Supplemental Control
Technical Implementation
Product
Acceptable Use Policy
Requirement
·
Optional Requirement
Control activity
Integrating automated moderation tools to filter inputs before they reach the foundation model. For example, integrating third-party moderation APIs, implementing custom filtering rules, configuring blocking or warning actions for flagged content, and establishing confidence thresholds based on risk category and severity.
Evidence
B005.1 Config: Input filtering
Screenshot of moderation tool integration showing API configuration, filtering rules, action settings (block/warn/modify), and confidence thresholds for different violation categories - this could be screenshots of configuration files, admin dashboard settings, or API integration code. Example moderation tools: OpenAI Moderation API, Claude content filtering, VirtueAI, Hive, Spectrum Labs.
Tags
Mandatory Control
Technical Implementation
Eng: User LLM input filtering logic
Engineering Tooling
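The per-category confidence thresholds and block/warn actions described for B005.1 could be configured roughly as in the sketch below. It assumes an upstream moderation tool has already returned a score between 0 and 1 per category; the category names, threshold values, and `decide_action` function are illustrative and not taken from any specific vendor.

```python
# Hypothetical thresholds: stricter (lower) values for higher-severity categories.
THRESHOLDS = {
    "violence": {"block": 0.8, "warn": 0.5},
    "self_harm": {"block": 0.4, "warn": 0.2},
}
# Fallback for categories without an explicit entry.
DEFAULT_THRESHOLD = {"block": 0.9, "warn": 0.7}


def decide_action(scores: dict[str, float]) -> str:
    """Map moderation scores to 'block', 'warn', or 'allow'.

    `scores` is assumed to be category -> confidence from a moderation tool.
    """
    action = "allow"
    for category, score in scores.items():
        limits = THRESHOLDS.get(category, DEFAULT_THRESHOLD)
        if score >= limits["block"]:
            return "block"  # any single blocking category is decisive
        if score >= limits["warn"]:
            action = "warn"
    return action
```

Keeping the thresholds in one table also makes it straightforward to document the justification for each value, as B005.2 asks.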
Requirement
·
Optional Requirement
Control activity
Documenting the moderation logic and rationale. For example, explaining chosen moderation tools, threshold justifications, and decision criteria for different risk categories.
Evidence
B005.2 Documentation: Input moderation approach
Document explaining moderation approach including tool selection rationale, threshold settings with justifications, action logic for different violation types, and examples of how different input categories are handled.
Tags
Supplemental Control
Technical Implementation
Internal processes
Engineering Practice
Requirement
·
Optional Requirement
Control activity
Providing feedback to users when inputs are blocked.
Evidence
B005.3 Demonstration: Warning for blocked inputs
Screenshot of user-facing messages or UI flows showing how blocked inputs are communicated to users - this could be error messages, warning dialogs, or alternative suggestions provided when content is filtered.
Tags
Supplemental Control
Technical Implementation
Product
Requirement
·
Optional Requirement
Control activity
Logging flagged prompts for analysis and refinement of filters, while ensuring compliance with privacy obligations.
Evidence
B005.4 Logs: Input filtering
Screenshot of logging system showing how flagged inputs are captured, what metadata is included/excluded for privacy, retention policies, and audit trail - may include privacy documentation explaining logging disclosures to users.
Tags
Supplemental Control
Technical Implementation
Logs
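For B005.4, a log record that captures flagged inputs while limiting personal data might be built as sketched below. The field names, the salted-hash pseudonymization, and the email-only redaction are assumptions chosen for brevity; actual privacy obligations may call for broader redaction and a defined retention period.

```python
import hashlib
import re
from datetime import datetime, timezone

# Simple email pattern used for redaction in this sketch only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def build_flag_record(user_id: str, prompt: str, category: str,
                      score: float, salt: str) -> dict:
    """Build a log entry for a flagged prompt without storing raw identifiers."""
    user_hash = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()[:16]
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_hash": user_hash,          # pseudonymous, not the raw user ID
        "category": category,
        "score": round(score, 3),
        "prompt_redacted": EMAIL_RE.sub("[EMAIL]", prompt),
    }
```

Because the hash is salted and deterministic, repeat offenders can still be counted without the log exposing who they are.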
Requirement
·
Optional Requirement
Control activity
Periodically evaluating filter performance and adjusting thresholds accordingly. For example, accuracy, latency, false positives/negatives.
Evidence
B005.5 Documentation: Input filter performance
Report or dashboard showing analysis of filter performance metrics (false positives, false negatives, accuracy, latency) and documented threshold adjustments made based on performance data - should include timestamps and rationale for changes.
Tags
Supplemental Control
Technical Implementation
Engineering Practice
Requirement
·
Mandatory Requirement
Control activity
Implementing content filtering for harmful output types. For example, detecting and blocking distressed responses, angry language, offensive content, biased statements, and deceptive information.
Evidence
C003.1 Config: Harmful output filtering
Screenshot of content filtering rules, moderation API configuration, or classifier settings showing detection and blocking logic for harmful output types - may include filtering rules in code, third-party moderation tool configuration (e.g., OpenAI Moderation API, Perspective API), or custom classifier model settings with harm category definitions.
Tags
Mandatory Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Implementing guardrails for advice generation. For example, restricting high-risk recommendations in sensitive domains, requiring disclaimers for guidance.
Evidence
C003.2 Config: Guardrails for high-risk advice
Screenshot of system prompts, guardrail rules, or domain restrictions showing safety controls on advice generation - may include defensive prompting, domain-specific output restrictions (e.g., medical/legal/financial advice blocklists), or conditional response templates that add warnings for sensitive topics.
Tags
Mandatory Control
Technical Implementation
Engineering Code
Requirement
·
Mandatory Requirement
Control activity
Implementing bias detection and mitigation controls. For example, monitoring for discriminatory patterns, implementing fairness checks in outputs.
Evidence
C003.3 Config: Guardrails for biased outputs
Documentation of bias eval results testing for stereotypical responses across demographic attributes, manual review logs documenting bias assessments, or output filtering rules blocking discriminatory patterns - may include automated fairness evaluation tools or bias monitoring dashboards if implemented.
Tags
Supplemental Control
Technical Implementation
Eng: LLM output filtering logic
Requirement
·
Mandatory Requirement
Control activity
Evaluating harm mitigation controls using performance metrics.
Evidence
C003.4 Documentation: Filtering performance benchmarks
Test results, metrics dashboard, or evaluation report showing performance of harm controls - may include false positive/negative rates, coverage analysis of test scenarios, benchmark results against harm datasets (e.g., ToxiGen, RealToxicityPrompts), or confusion matrices showing filtering accuracy across harm categories.
Tags
Supplemental Control
Operational Practices
Internal processes
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas, maintaining records of assessor qualifications and independence.
Conducting regular testing. Including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks like ToxiGen, coordinating with internal security and testing teams.
Maintaining documentation. Including testing scope, results, and remediation actions taken, tracking follow-up activities and resolution timelines.
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Implementing customer communication protocols. For example, disclosure procedures, explanation of corrective actions, and follow-up commitments with executive approval for significant incidents.
Establishing immediate mitigation steps with designated staff responsibilities. For example, system freeze capabilities, output suppression, customer notification, and system adjustments.
Evidence
E002.1 Documentation: AI failure plan for harmful outputs
Can be standalone document or integrated in existing incident response procedures/policies
Tags
Mandatory Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Defining harmful output categories with reference to risk taxonomy. For example, discriminatory content, offensive material, inappropriate recommendations, ideally with concrete examples.
Coordinating external support engagement. For example, legal counsel consultation, PR support, and insurance claim procedures.
Evidence
E002.2 Documentation: Additional harmful output failure procedures
May include harmful output category definitions referenced to risk taxonomy, external support contact list (legal counsel, PR firms, insurance providers), support engagement procedures or runbooks, or escalation criteria for involving external parties.
Tags
Supplemental Control
Operational Practices
AI failure plan
Requirement
·
Mandatory Requirement
Control activity
Results of testing from foundation model developer on CBRN capabilities and mitigations.
Evidence
F002.1 Documentation: Foundation model CBRN capabilities
List of foundation models used with CBRN capability information - may include provider model cards with CBRN assessments, weapons of mass destruction risk evaluations from model developers, or other documentation describing CBRN-related capabilities and mitigations.
Tags
Mandatory Control
Legal Policies
Vendor Contracts
Requirement
·
Mandatory Requirement
Control activity
Establishing catastrophic misuse monitoring. For example, monitoring AI system interactions for patterns indicating weapons development or mass harm intent, implementing real-time alerting for detected catastrophic misuse attempts, documenting suspicious queries and system responses.
Evidence
F002.2 Config: Catastrophic misuse monitoring
Monitoring dashboard or alert configuration for catastrophic misuse patterns - may include usage monitoring flagging CBRN-related queries, alert rules for weapons development patterns, logs of detected and blocked catastrophic misuse attempts, or incident records documenting suspicious CBRN-related interactions.
Tags
Supplemental Control
Technical Implementation
Engineering Code