AIUC-1
Third-Party Evals

Requirement
Control activity
Establishing a taxonomy for adversarial risks. For example, drawing on the attack classifications in NIST AI 100-2 E2023 and aligning them to the system architecture and use cases.
Conducting comprehensive adversarial testing at least quarterly. For example, performing structured red-teaming, prompt injection assessments, jailbreaking attempts, adversarial perturbation testing, semantic manipulation, and simulated malicious tool invocations.
Maintaining secure testing documentation. For example, recording test cases, methods, outcomes, and system behaviors under restricted access controls, and implementing secure storage for sensitive testing materials.
Establishing improvement processes based on findings. For example, assigning owners and remediation timelines based on test severity, tracking fixes through risk registers or issue management systems, and documenting updates to safeguards and procedures.
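The improvement process above (owners and remediation timelines assigned by severity) can be sketched as a small data structure. This is only an illustration: the severity labels, the remediation windows in REMEDIATION_DAYS, and the Finding fields are assumptions made for the example and are not defined by AIUC-1.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical remediation windows in days, keyed by severity.
# These values are illustrative assumptions, not AIUC-1 requirements.
REMEDIATION_DAYS = {"critical": 7, "high": 30, "medium": 60, "low": 90}


@dataclass
class Finding:
    finding_id: str
    category: str                  # taxonomy label, e.g. "prompt_injection"
    severity: str                  # one of the REMEDIATION_DAYS keys
    found_on: date
    owner: str
    resolved_on: Optional[date] = None

    @property
    def due_on(self) -> date:
        # Deadline is the discovery date plus the window for this severity.
        return self.found_on + timedelta(days=REMEDIATION_DAYS[self.severity])


def overdue(findings, today):
    """Return unresolved findings past their deadline, oldest deadline first."""
    late = [f for f in findings if f.resolved_on is None and f.due_on < today]
    return sorted(late, key=lambda f: f.due_on)
```

In practice the same fields would usually live in a risk register or issue tracker; the point of the sketch is that each finding carries a taxonomy category, an owner, and a deadline derived from its severity.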
Evidence
B001.1 Report: Adversarial testing results
Third-party evaluation report showing adversarial robustness testing - must include risk taxonomy tested, testing methodology and findings, secure documentation practices, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing. Including performing assessments of harmful outputs at least every quarter, defining testing scope and methodologies based on risk classifications and industry benchmarks such as ToxiGen, and coordinating with internal security and testing teams.
Maintaining documentation. Including recording testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
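The "at least every quarter" cadence can be checked mechanically from the dates of completed assessments. The sketch below is a minimal illustration; the function name cadence_gaps and the 92-day threshold (the longest calendar quarter) are assumptions, since the source does not define how a quarter is measured.

```python
from datetime import date

# Assumption: treat "quarterly" as no more than 92 days between assessments.
MAX_GAP_DAYS = 92


def cadence_gaps(assessment_dates, as_of, max_gap_days=MAX_GAP_DAYS):
    """Return (earlier, later) date pairs whose spacing exceeds max_gap_days.

    The final pair compares the most recent assessment with as_of, so a
    lapsed schedule is reported even if no later assessment exists.
    """
    if not assessment_dates:
        raise ValueError("no assessments recorded")
    checkpoints = sorted(assessment_dates) + [as_of]
    gaps = []
    for earlier, later in zip(checkpoints, checkpoints[1:]):
        if (later - earlier).days > max_gap_days:
            gaps.append((earlier, later))
    return gaps
```

An empty result means every interval, including the one ending at as_of, fits within the threshold.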
Evidence
C010.1 Report: Harmful output testing
Third-party evaluation report showing harmful output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on the risk taxonomy and performing assessments of out-of-scope outputs at least every quarter.
Maintaining documentation. Including recording testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
C011.1 Report: Out-of-scope output testing
Third-party evaluation report showing out-of-scope output testing - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on the risk taxonomy and performing assessments of high-risk areas at least every quarter.
Maintaining documentation. Including recording testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
C012.1 Report: Customer-defined risk testing
Third-party evaluation report showing testing of customer-defined risk - must include documentation of assessor qualifications, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on the risk taxonomy and performing assessments of hallucinations at least every quarter.
Maintaining documentation. Including recording testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
D002.1 Report: Hallucination testing results
Third-party evaluation report showing hallucination testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
Requirement
·
Mandatory Requirement
Control activity
Appointing qualified third-party assessors. Including selecting assessors with relevant technical capabilities for identified risk areas and maintaining records of assessor qualifications and independence.
Conducting regular testing. Including defining testing scope and methodologies based on the risk taxonomy and performing assessments of tool calls at least every quarter.
Maintaining documentation. Including recording testing scope, results, and remediation actions taken, and tracking follow-up activities and resolution timelines.
Evidence
D004.1 Report: Tool call testing
Third-party evaluation report showing tool call testing - must include risk taxonomy tested, testing methodology and findings, and improvement tracking with remediation timelines and documentation.
Tags
Mandatory Control
Third-party Evals
Third-party evaluation report
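The "must include" lists in the evidence descriptions above can be turned into a simple completeness check before a report is submitted. The sketch below is an assumption-laden illustration: the evidence IDs and required elements are taken from this page, but the short section keys (such as "risk_taxonomy") and the function missing_elements are hypothetical names invented for the example.

```python
# Required report elements per evidence item, as listed on this page.
# The short keys are hypothetical labels chosen for this sketch.
_TAXONOMY_BASED = {"risk_taxonomy", "methodology_and_findings", "improvement_tracking"}
_ASSESSOR_BASED = {"assessor_qualifications", "methodology_and_findings", "improvement_tracking"}

REQUIRED_ELEMENTS = {
    "B001.1": _TAXONOMY_BASED | {"secure_documentation"},  # adversarial testing
    "C010.1": _ASSESSOR_BASED,   # harmful output testing
    "C011.1": _ASSESSOR_BASED,   # out-of-scope output testing
    "C012.1": _ASSESSOR_BASED,   # customer-defined risk testing
    "D002.1": _TAXONOMY_BASED,   # hallucination testing
    "D004.1": _TAXONOMY_BASED,   # tool call testing
}


def missing_elements(evidence_id, report_sections):
    """Return the required elements absent from a report, sorted by name."""
    return sorted(REQUIRED_ELEMENTS[evidence_id] - set(report_sections))
```

A report that covers only methodology and findings for C010.1, for example, would be flagged as missing the assessor qualifications and the improvement tracking.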