AIUC-1
B002

Detect adversarial input

Implement monitoring capabilities to detect and respond to adversarial inputs and prompt injection attempts

Keywords
Monitor
Adversarial
Jailbreak
Prompt Injection
Application
Optional
Frequency
Every 3 months
Type
Detective
Crosswalks
AML-M0003: Model Hardening
AML-M0015: Adversarial Input Detection
AML-M0024: AI Telemetry Logging
AML-M0021: Generative AI Guidelines
Article 15: Accuracy, Robustness and Cybersecurity
Article 72: Post-Market Monitoring by Providers and Post-Market Monitoring Plan for High-Risk AI Systems
GOVERN 1.5: Risk monitoring and review
MEASURE 2.4: Production monitoring
MEASURE 2.7: Security and resilience
MEASURE 3.1: Emergent risk tracking
LLM01:25 - Prompt Injection
LLM08:25 - Vector and Embedding Weaknesses
LLM10:25 - Unbounded Consumption
AIS-08: Input Validation
MDS-07: Robustness against Adversarial Attack / Model Hardening
TVM-01: Threat and Vulnerability Management Policy and Procedures
TVM-04: Detection Updates
UEM-09: Anti-Malware Detection and Prevention
TVM-02: Malware and Malicious Instructions Protection Policy and Procedure
AIS-10: API Security
LOG-14: Input Monitoring
Establishing detection and alerting. For example, implementing monitoring for prompt injection patterns, jailbreak techniques, adversarial input attempts, and rate-limit violations, and configuring alerts and threat notifications for suspicious activity.
B002.1 Config: Adversarial input detection and alerting

Screenshot of monitoring system, SIEM, or detection code showing rules and alerts for adversarial inputs - may include prompt injection detection patterns, jailbreak technique signatures, rate limit monitoring with threshold alerts, or notification configurations (Slack, PagerDuty, email)

Engineering Code
Universal
Implementing incident logging and response procedures. For example, logging suspected adversarial attacks with relevant context, escalating to designated personnel based on severity, and documenting response actions in a centralized system.
B002.2 Logs: Adversarial incident and response

Screenshot of incident management system or logs showing adversarial attack handling - may include log entries with timestamps and user/session context, escalation runbooks defining severity thresholds, or incident tickets in Jira/PagerDuty/ServiceNow documenting response actions and workflows.

Logs, Engineering Tooling
Universal
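A minimal sketch of structured incident logging with severity-based routing is shown below. The severity levels and escalation targets are hypothetical placeholders; B002.2 expects the real thresholds and escalation paths to be defined in an organization-specific runbook.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("adversarial_incidents")

# Hypothetical severity-to-escalation mapping; actual routing is
# organization-specific and belongs in a documented runbook.
ESCALATION = {
    "low": "ticket_queue",
    "medium": "on_call_engineer",
    "high": "security_team_page",
}

def log_incident(user_id: str, session_id: str, reason: str, severity: str) -> dict:
    """Record a suspected adversarial input with context and route it by severity."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "session_id": session_id,
        "reason": reason,
        "severity": severity,
        "escalated_to": ESCALATION.get(severity, "ticket_queue"),
    }
    # Emit as JSON so a log shipper or incident system can ingest it.
    logger.warning(json.dumps(entry))
    return entry
```

The structured entry carries the timestamp and user/session context called out in the evidence description, and the `escalated_to` field records which response path was triggered.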
Maintaining detection effectiveness through quarterly reviews. For example, updating detection rules based on emerging adversarial techniques, analyzing incident patterns, and documenting system improvements.
B002.3 Documentation: Updates to detection config

Quarterly review documentation showing detection updates - for example, review meeting notes with incident pattern analysis, updated detection rules with version history, or tracking records showing rule improvements (e.g. GitHub/Jira tickets).

Engineering Practice, Internal processes
Universal
Implementing adversarial input detection prior to AI model processing where feasible. For example, using pre-processing filters to flag likely threats before they reach the model.
B002.4 Config: Pre-processing adversarial detection

Screenshot of pre-processing filtering logic or gateway - may include pattern-matching or heuristic code checking inputs before model processing, WAF or API gateway rules blocking adversarial patterns, or IP-based filtering.

Engineering Code
Universal
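The pre-processing gateway idea can be sketched as a simple scoring filter that runs before the model call. The heuristics and threshold here are illustrative assumptions; production gateways typically combine pattern rules with WAF policies or a trained classifier.

```python
import re

# Hypothetical heuristics for demonstration only.
SUSPICIOUS = [
    re.compile(r"disregard .*(instructions|rules)", re.I),
    re.compile(r"base64|\\x[0-9a-f]{2}", re.I),  # possible encoded payloads
]

def score(prompt: str) -> int:
    """Count suspicion signals: pattern hits plus a non-printable-character check."""
    s = sum(1 for p in SUSPICIOUS if p.search(prompt))
    if prompt and sum(not c.isprintable() for c in prompt) / len(prompt) > 0.1:
        s += 1
    return s

def guarded_call(prompt: str, model_call, threshold: int = 1) -> dict:
    """Invoke the model only if the prompt scores below the threshold."""
    if score(prompt) >= threshold:
        return {"blocked": True, "reason": "adversarial_preprocessing_filter"}
    return {"blocked": False, "output": model_call(prompt)}
```

Blocked inputs never reach the model, which is the point of placing detection at the pre-processing stage rather than relying solely on post-hoc monitoring.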
Integrating adversarial input detection into existing security operations tooling. For example, forwarding flagged inputs to SIEM platforms, correlating detections with authentication and network logs, and enabling SOC teams to triage AI-related security events.
B002.5 Config: AI security alerts

Screenshot of SIEM platform, SOC tooling, or log forwarding configuration showing adversarial detection integration - may include Splunk/Datadog/Elastic SIEM ingesting AI adversarial alerts, correlation rules linking AI events with authentication or network logs, SOC dashboard displaying AI security event triage, or code forwarding flagged inputs to security platforms.

Engineering Tooling
Universal
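A sketch of the SIEM hand-off is shown below. The field names loosely follow the Elastic Common Schema as an assumed example; the actual schema depends on what your SIEM ingests (Splunk HEC, Datadog logs, etc.), and the transport is left as an injected callable so the same mapping works with any HTTP client or log shipper.

```python
import json
from datetime import datetime, timezone

def to_siem_event(flagged: dict) -> dict:
    """Map a flagged AI input to a generic SIEM event (ECS-like field names assumed)."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "event": {"kind": "alert", "category": "threat",
                  "action": "ai_adversarial_input"},
        "user": {"id": flagged["user_id"]},
        "message": flagged["reason"],
        "labels": {"source": "ai_gateway", "session": flagged["session_id"]},
    }

def forward(flagged: dict, send) -> None:
    """Serialize the event and hand it to a transport (HTTP client, log shipper)."""
    send(json.dumps(to_siem_event(flagged)))
```

Keeping the mapping separate from the transport makes it easy to correlate these events with authentication and network logs once they land in the SIEM, which is what enables SOC triage of AI-related alerts.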

Organizations can submit alternative evidence demonstrating how they meet the requirement.

AIUC-1 is built with industry leaders

Phil Venables, Former CISO of Google Cloud:

"We need a SOC 2 for AI agents— a familiar, actionable standard for security and trust."

Dr. Christina Liaghati, MITRE ATLAS lead:

"Integrating MITRE ATLAS ensures AI security risk management tools are informed by the latest AI threat patterns and leverage state of the art defensive strategies."

Hyrum Anderson, Senior Director, Security & AI at Cisco:

"Today, enterprises can't reliably assess the security of their AI vendors— we need a standard to address this gap."

Prof. Sanmi Koyejo, Lead for Stanford Trustworthy AI Research:

"Built on the latest advances in AI research, AIUC-1 empowers organizations to identify, assess, and mitigate AI risks with confidence."

John Bautista, Partner at Orrick:

"AIUC-1 standardizes how AI is adopted. That's powerful."

Lena Smart, Head of Trust for SecurityPal and former CISO of MongoDB:

"An AIUC-1 certificate enables me to sign contracts much faster— it's a clear signal I can trust."