AIUC-1
The Security, Safety, and Reliability standard for AI agents
B002. Detect adversarial input

Implement monitoring capabilities to detect adversarial inputs and prompt injection attempts and to enable response to them

Keywords: Monitor, Adversarial, Jailbreak, Prompt Injection

Application: Optional

Frequency: Every 3 months

Type: Detective

Crosswalks

MITRE ATLAS
AML-M0003: Model Hardening
AML-M0015: Adversarial Input Detection
AML-M0024: AI Telemetry Logging
AML-M0021: Generative AI Guidelines
EU AI Act
Article 15: Accuracy, Robustness and Cybersecurity
Article 72: Post-Market Monitoring by Providers and Post-Market Monitoring Plan for High-Risk AI Systems
NIST AI RMF
GOVERN 1.5: Risk monitoring and review
MEASURE 2.4: Production monitoring
MEASURE 2.7: Security and resilience
MEASURE 3.1: Emergent risk tracking
OWASP Top 10
LLM01:2025 - Prompt Injection
LLM08:2025 - Vector and Embedding Weaknesses
LLM10:2025 - Unbounded Consumption
CSA AICM
AIS-08: Input Validation
MDS-07: Robustness against Adversarial Attack / Model Hardening
TVM-01: Threat and Vulnerability Management Policy and Procedures
TVM-04: Detection Updates
UEM-09: Anti-Malware Detection and Prevention
TVM-02: Malware and Malicious Instructions Protection Policy and Procedure
AIS-10: API Security
LOG-14: Input Monitoring
OWASP AIVSS
Agent Goal and Instruction Manipulation
IBM AI Risk Atlas
IBM 41: Inference - Evasion attack
IBM 43: Inference - Jailbreaking
IBM 46: Inference - Prompt injection attack
IBM 49: Inference - Context overload attack
IBM 52: Inference - Indirect instructions attack
IBM 53: Inference - Social hacking attack
Cisco AI Security Framework
AITech-1.1: Direct Prompt Injection
AITech-1.2: Indirect Prompt Injection
AITech-1.4: Multi-Modal Injection and Manipulation
AITech-2.1: Jailbreak
AITech-3.1: Masquerading / Obfuscation / Impersonation
AITech-4.3: Protocol Manipulation
AITech-5.1: Memory System Persistence
AITech-5.2: Configuration Persistence
AITech-7.2: Memory System Corruption
AITech-7.4: Token Manipulation
AITech-9.1: Model or Agentic System Manipulation
AITech-9.2: Detection Evasion
AITech-11.2: Model-Selective Evasion
AITech-17.1: Sensor Spoofing

Control activities and typical evidence

B002.1 Config: Adversarial input detection and alerting

Control activity: Establishing detection and alerting. For example, implementing monitoring for prompt injection patterns, jailbreak techniques, adversarial input attempts, and rate-limit violations, and configuring alerts and threat notifications for suspicious activities.

Typical evidence: Monitoring system, SIEM, or detection code showing rules and alerts for adversarial inputs. May include prompt injection detection patterns, jailbreak technique signatures, rate limit monitoring with threshold alerts, or notification configurations (Slack, PagerDuty, email).

Category: Technical Implementation, Engineering Code, Universal
B002.2 Logs: Adversarial incident and response

Control activity: Implementing incident logging and response procedures. For example, logging suspected adversarial attacks with relevant context, escalating to designated personnel based on severity, and documenting response actions in a centralized system.

Typical evidence: Incident management system or logs showing adversarial attack handling. May include log entries with timestamps and user/session context, escalation runbooks defining severity thresholds, or incident tickets in Jira/PagerDuty/ServiceNow documenting response actions and workflows.

Category: Technical Implementation, Logs, Engineering Tooling, Universal
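A sketch of what the logging side could look like, assuming a JSON-lines incident log and a severity-to-responder mapping; the severity names and queue identifiers are invented for illustration and would live in an escalation runbook in practice:

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("adversarial_incidents")

# Hypothetical escalation mapping; real thresholds belong in a runbook.
ESCALATION = {
    "low": "triage-queue",
    "medium": "on-call-engineer",
    "high": "security-team",
}

def log_incident(user_id: str, session_id: str, rule: str, severity: str) -> dict:
    """Record a suspected adversarial attack with enough context to investigate,
    routed to the designated responder for its severity."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "session_id": session_id,
        "detection_rule": rule,
        "severity": severity,
        "escalated_to": ESCALATION.get(severity, "triage-queue"),
    }
    logger.warning(json.dumps(record))  # one JSON object per line
    return record
```

Structured records like these are what make the timestamped, user/session-scoped log entries described above straightforward to query during an incident review.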
B002.3 Documentation: Updates to detection config

Control activity: Maintaining detection effectiveness through quarterly reviews. For example, updating detection rules based on emerging adversarial techniques, analyzing incident patterns, and documenting system improvements.

Typical evidence: Quarterly review documentation showing detection updates. For example, review meeting notes with incident pattern analysis, updated detection rules with version history, or tracking records showing rule improvements (e.g. GitHub/Jira tickets).

Category: Technical Implementation, Engineering Practice, Internal Processes, Universal
B002.4 Config: Pre-processing adversarial detection

Control activity: Implementing adversarial input detection prior to AI model processing where feasible. For example, using pre-processing filters to flag likely threats before model processing.

Typical evidence: Pre-processing filtering logic or gateway. May include pattern-matching or heuristic code checking inputs before model processing, WAF or API gateway rules blocking adversarial patterns, or IP-based filtering.

Category: Technical Implementation, Engineering Code, Universal
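One way to sketch such a pre-processing filter, assuming an application that wraps the model call: cheap heuristics run first, and flagged input never reaches the model. The heuristics here are illustrative, not a vetted ruleset:

```python
import re
from typing import Callable

# Illustrative heuristics checked before the input reaches the model.
HEURISTICS: list[tuple[str, re.Pattern]] = [
    ("injection", re.compile(r"(?i)ignore (all )?(previous|prior) instructions")),
    ("encoding_evasion", re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")),  # long base64-like runs
]

def guarded_call(user_input: str, model: Callable[[str], str]) -> str:
    """Screen input with pre-processing heuristics; only clean input is processed."""
    for name, pattern in HEURISTICS:
        if pattern.search(user_input):
            # Flagged before model processing; in practice also log and alert here.
            return f"Input rejected by pre-processing filter ({name})."
    return model(user_input)
```

The same shape applies at a WAF or API gateway, with the checks expressed as gateway rules instead of application code.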
B002.5 Config: AI security alerts

Control activity: Integrating adversarial input detection into existing security operations tooling. For example, forwarding flagged inputs to SIEM platforms, correlating detections with authentication and network logs, and enabling SOC teams to triage AI-related security events.

Typical evidence: SIEM platform, SOC tooling, or log forwarding configuration showing adversarial detection integration. May include Splunk/Datadog/Elastic SIEM ingesting AI adversarial alerts, correlation rules linking AI events with authentication or network logs, SOC dashboards displaying AI security event triage, or code forwarding flagged inputs to security platforms.

Category: Technical Implementation, Engineering Tooling, Universal
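A minimal sketch of the forwarding piece, assuming flagged detections are serialized as JSON events and handed to a log-forwarding sink (HTTP client, syslog, agent). The field names loosely follow Splunk HEC conventions but are an assumption, not a fixed schema:

```python
import json
from datetime import datetime, timezone
from typing import Callable

def to_siem_event(detection: dict, source: str = "ai-gateway") -> str:
    """Serialize a flagged input as a JSON event for SIEM ingestion."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "sourcetype": "ai:adversarial_input",  # lets SOC correlation rules key on AI events
        "event": detection,
    })

def forward(detection: dict, sink: Callable[[str], None]) -> None:
    """Hand the event to whatever transport ships logs to the SIEM."""
    sink(to_siem_event(detection))
```

Tagging AI events with a dedicated sourcetype is what lets SOC teams write correlation rules joining them against authentication and network logs.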

Organizations can submit alternative evidence demonstrating how they meet the requirement.

AIUC-1 is built with industry leaders

"We need a SOC 2 for AI agents — a familiar, actionable standard for security and trust."
Phil Venables, former CISO of Google Cloud

"Integrating MITRE ATLAS ensures AI security risk management tools are informed by the latest AI threat patterns and leverage state-of-the-art defensive strategies."
Dr. Christina Liaghati, MITRE ATLAS lead

"Today, enterprises can't reliably assess the security of their AI vendors — we need a standard to address this gap."
Hyrum Anderson, Senior Director, Security & AI, Cisco

"Built on the latest advances in AI research, AIUC-1 empowers organizations to identify, assess, and mitigate AI risks with confidence."
Prof. Sanmi Koyejo, lead for Stanford Trustworthy AI Research

"AIUC-1 standardizes how AI is adopted. That's powerful."
John Bautista, Partner at Orrick

"An AIUC-1 certificate enables me to sign contracts much faster — it's a clear signal I can trust."
Lena Smart, Head of Trust for SecurityPal and former CISO of MongoDB