Cisco's AI Security Framework offers one of the most comprehensive pictures of today's AI threat landscape: providing a unified taxonomy of AI threats, by identifying 19 attacker objectives with 150+ subtechniques.
AIUC-1 operationalizes this framework with requirements that enterprise customers demand and vendors can pragmatically meet, supported by underlying controls. Certification against AIUC-1:
Maps Cisco's AI threat taxonomy to concrete requirements and controls
Strengthens robustness against the attacker objectives and subtechniques identified by Cisco
Goes beyond threat identification to provide actionable, auditable requirements
AITech-1.2: Indirect Prompt Injection
Actors embed malicious instructions in external data sources (such as web pages, documents, emails, databases, API responses) that the LLM or the agent retrieves and processes, causing hidden instructions to be executed without the user's knowledge or awareness. The attack exploits the model's inability to distinguish between trusted system instructions and untrusted external content.
AITech-1.3: Goal Manipulation
Actors gain access to LLM/agent systems, functions, data, or resources without proper authorization through exploitation of authentication weaknesses, access control bypasses, or privilege boundaries. This includes both gaining initial unauthorized access and accessing restricted capabilities within a legitimately accessed system.
AITech-1.4: Multi-Modal Injection and Manipulation
Actors deliberately insert, alter, or create malicious content across various modalities (such as text, image, audio, video, or sensor streams) to influence, corrupt, or exploit an AI system's fusion, interpretation, or decision-making processes. Attacks exploit the integration points where different data types are combined.
AITech-2.1: Jailbreak
Actors employ adversarial technique that circumvents or defeats a model's safety guardrails, content filters, or usage restrictions, enabling output generation that violates the system's policies or intended constraints. The resulting harm comes from whatever guardrail was circumvented: for example, causing the system to violate its operators' policies, bypass safety guardrails and result in unsafe output. In a multi-agent system, jailbreaks can become more complex. Jailbreak prompts could be ingested by one agent while designed to target another agent downstream or parts of the jailbreak could be generated as part of multiple agent interactions after an initial trigger payload. These multi-agent jailbreak prompts will likely present differently than standard jailbreak attacks that are a single prompt or multi-turn interaction with a single chatbot.
AITech-3.1: Masquerading / Obfuscation / Impersonation
Actors introduce a malicious agent that impersonates a legitimate agent, user, system, or authority within a multi-agent ecosystem and causes other agents, systems, or users to accept it as authentic and trust its communications or actions.
AITech-4.1: Agent Injection
Actors manipulate an agent system to introduce or executive unauthorized agent instances, malicious agent code, or fake agent capabilities by inserting misleading data, exploiting registration mechanisms, or subverting agent deployment processes. This results in unauthorized agents operating within the system or legitimate agents executing malicious code.
AITech-4.2: Context Boundary Attacks
Actors manipulate contextual information that an agent sees, processes, or transmits by exploiting unclear or porous boundaries between different context segments (such as descriptions, prompts, tool outputs, memory, or status messages). An agent may misinterpret the source, trustworthiness, or intent of information.
AITech-4.3: Protocol Manipulation
Actors exploit assumptions or ambiguities in structured communication protocols (e.g., JSON, XML, YAML, API specifications, custom schemas) used for agentic communication by introducing adversarial inputs that corrupt parsing, processing, or interpretation.
AITech-5.1: Memory System Persistence
Actors insert poisoned, harmful, or adversarial data or instructions into an agent or model's short-term or long-term memory to corrupt its behavior over extended periods of time.
AITech-5.2: Configuration Persistence
Actor subverts an agent's internal configuration, settings, or initialization parameters to create lasting malicious influence over its behavior. This goes beyond a single, isolated prompt injection by modifying the system's core settings, context, or memory that defines its behavior. An example is modifying MCP configuration to include a malicious server or inserting malicious instructions into an integrated development environment's internal config files.
AITech-6.1: Training Data Poisoning
Actors insert malicious, inaccurate, or corrupted data into a model's training pipeline, with the aim of distorting the model's outputs, embedding payload, degrading performance, or introducing systematic biases.
AITech-7.1: Reasoning Corruption
Actors subvert a model/agent's planning, decision-making, or logical inference processes to produce unintended or harmful outcomes. As AI agents are increasingly trusted with autonomous actions, this attack targets the agent's "reasoning engine" i.e, the core mechanisms for analysis, judgment, and action selection.
AITech-7.2: Memory System Corruption
Actors inject false, malicious, or adversarial data into an AI agent's memory systems to cause immediate harmful decisions, corruption of system functionality or degraded performance.
AITech-7.3: Data Source Abuse and Manipulation
Actors inject malicious, misleading, or poisoned data into external data sources (APIs, databases, RAG systems) that AI systems consume in order to manipulate model behavior, corrupt decision-making, or inject malicious instructions through trusted data channels.
AITech-7.4: Token Manipulation
Actors misuse, poison, or tamper with tokens used by model/agentic systems (e.g., model-specific tokens, or security tokens) to control access, encode information, or manage interactions.
AITech-8.1: Membership Inference
Actors determine whether a specific data point, record, or individual was included in a model's training dataset, exposing sensitive information about training data subjects, reveal proprietary datasets, or indicate the presence of confidential information in model training.
AITech-8.2: Data Exfiltration / Exposure
Actors intentionally or unintentionally cause sensitive information to be exposed through model outputs, API responses, or system behaviors. This includes leakage of personal data, intellectual property, proprietary algorithms, training data, system internals, or other confidential information to unauthorized parties.
AITech-8.3: Information Disclosure
Actors trigger the unintentional or unauthorized exposure, leakage, or sharing of data, model parameters, or system details to parties who should not have access to them. The disclosure occurs through model behaviors, responses, or system interactions that reveal more information than intended.
AITech-8.4: Prompt/Meta Extraction
Actors extract, reveal, or reconstruct the system prompts, meta-prompts, hidden instructions, initial context, or operational directives that define a model/agent's behavior, constraints, and capabilities. This can expose intellectual property, security mechanisms, and operational logic.
AITech-9.1: Model or Agentic System Manipulation
Actor coerces or manipulates a model/agent into producing false, misleading or malicious outputs through adversarial inputs, context manipulation, or exploiting model behaviors, without directly modifying the model itself.
AITech-9.2: Detection Evasion
Actors manipulate or alter the input data, behaviors, or attack patterns to intentionally bypass security monitoring, content moderation, anomaly detection, and safety mechanisms.
AITech-9.3: Dependency / Plugin Compromise
Actors insert malicious code, backdoors, or vulnerabilities into third-party dependencies, plugins, extensions, or libraries used by models/agents/AI applications, creating supply chain attacks that affect all systems using the compromised component.
AITech-10.1: Model Extraction
Actors attempt to reconstruct or replicate a model partially or entirely by interacting with it through its APIs, inference endpoints, or other access points, with the goal of replicating its functionality, architecture, parameters, or outputs without authorized access.
AITech-10.2: Model Inversion
Actors reconstructs private training data, sensitive features, or confidential information from a trained model by exploiting the model's learned representations, decision boundaries, or output patterns.
AITech-11.1: Environment-Aware Evasion
Actors manipulate the model by exploiting aspects of its operational context, or "environment," to induce malicious or unintended behavior. Rather than simply altering a user's prompt, this type of attack crafts input to leverage the system's broader state and external interactions to bypass safety measures.
AITech-11.2: Model-Selective Evasion
Actors craft prompts or inputs specifically designed to bypass the safety controls, content filters, or detection mechanisms of a particular target LLM, while similar or related models would successfully detect and refuse the malicious input. This technique exploits model-specific quirks, training differences, and implementation variations.
AITech-12.1: Tool Exploitation
Actors abuse connected tools, APIs, functions, or external capabilities through subversive prompts, commands, or context manipulation. This can include attacks like: unauthorized tool execution, exploitation of security vulnerabilities in tools, tool poisoning, injection attacks through tool interfaces.
AITech-12.2: Insecure Output Handling
Actors exploit insufficient sanitization, validation, filtering, or security controls applied to content generated by models or agents before it is passed to downstream systems, databases, APIs, or components. This inadequate handling enables injection attacks, code execution, or broader system compromise.
AITech-13.1: Disruption of Availability
Actors prevent authorized users from accessing or using AI models, agents, or their services through resource exhaustion, system degradation, or complete service outage.
AITech-13.2: Cost Harvesting / Repurposing
Actors leverage computational or memory attacks that exploit AI systems to generate excessive costs for the service provider or owner through resource abuse, unauthorized usage for attacker benefit, or deliberate cost inflation. Examples can incude overwhelming human-in-the-loop processes and other denial of service conditions.
AITech-14.1: Unauthorized Access
Actors gain unauthorized access to the underlying infrastructure, deployment environment, or hosting systems of AI models and agents (e.g., servers, databases, cloud accounts). This is distinct from accessing the model/agentic system itself through exploitation.
AITech-14.2: Abuse of Delegated Authority
Actors exploit legitimate delegation mechanisms where agents are granted authority to act on behalf of users, systems, or other agents, and use this delegated authority to perform unauthorized actions, exceed intended scope, or abuse trust relationships for malicious purposes.
AITech-15.1: Harmful Content
Actors prompt generative models or agents to produce unauthorized, unsafe, or malicious content, including harmful instructions, toxic speech, misinformation, or unsafe media. The attack targets weaknesses in content guardrails to bypass safety constraints and generate content that should otherwise be restricted.
A004Protect IP & trade secrets
C002Conduct pre-deployment testing
C004Prevent out-of-scope outputs
C005Prevent customer-defined high risk outputs
C010Third-party testing for harmful outputs
C011Third-party testing for out-of-scope outputs
C012Third-party testing for customer-defined risk
D001Prevent hallucinated outputs
AITech-16.1: Eavesdropping
Actors conduct unauthorized passive monitoring, interception, or extraction of data from model/agentic system communications, including agent-to-agent messages, agent-to-user interactions, API calls, or other internal communications without directly manipulating the data flow.
AITech-17.1: Sensor Spoofing
Actors use of highly-realistic but deceptive signals or manipulating sensor data to deliberately mislead an AI system and its decision-making processes through falsified sensory inputs.
AITech-18.1: Fraudulent Use
Actors leverage models/agents specifically for generating, executing, scaling, or automating fraud, scams, deception, and other illegal activities.
AITech-18.2: Malicious Workflows
Actors leverage AI models to generate, execute, and scale harmful activities like cyberattacks, data theft, and disinformation campaigns. among other deceptive purposes.
AITech-19.1: Cross-Modal Inconsistency Exploits
Actors exploit contradictions, mismatches, or inconsistencies between different data modalities (i.e., text, image, audio, video) to confuse AI systems, bypass security controls, evade detection, or manipulate decision-making by presenting conflicting information across modalities that the system cannot properly reconcile.
AITech-19.2: Fusion Payload Split
Actors exploit weaknesses in fusion algorithms, arbitration logic, and decision-making processes to manipulate final system behavior. Combining payload components delivered through different inputs, outputs, or sources (including data sources, multiple models, or agents) and using the system’s fusion or arbitration mechanisms to assemble them to create an injection or attack payload.