
AIUC-1 × Cisco AI Security & Safety Framework

Cisco's AI Security Framework offers one of the most comprehensive pictures of today's AI threat landscape, providing a unified taxonomy of AI threats that identifies 19 attacker objectives with 150+ subtechniques.

AIUC-1 operationalizes this framework with requirements that enterprise customers demand and vendors can pragmatically meet, supported by underlying controls. Certification against AIUC-1:

Maps Cisco's AI threat taxonomy to concrete requirements and controls

Strengthens robustness against the attacker objectives and subtechniques identified by Cisco

Goes beyond threat identification to provide actionable, auditable requirements

Cisco AI Security & Safety Framework crosswalks by attacker objective

Attacker Objective

AITech-1.2: Indirect Prompt Injection

Description

Actors embed malicious instructions in external data sources (such as web pages, documents, emails, databases, API responses) that the LLM or the agent retrieves and processes, causing hidden instructions to be executed without the user's knowledge. The attack exploits the model's inability to distinguish between trusted system instructions and untrusted external content.
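A common mitigation for this objective is to treat all retrieved content as untrusted: screen it for instruction-like text and wrap it in explicit delimiters before it reaches the model. The sketch below is illustrative only — the patterns, delimiter tags, and function names are assumptions, and a production system would use a trained injection classifier rather than regexes.

```python
import re

# Illustrative patterns that often signal embedded instructions in
# retrieved content; a real deployment would use a trained classifier.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]

def flag_retrieved_content(text: str) -> bool:
    """Return True if retrieved external content looks like it carries
    instructions aimed at the model rather than plain data."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def build_prompt(system: str, retrieved: str, user: str) -> str:
    """Wrap untrusted retrieved content in explicit delimiters so the
    model can distinguish it from trusted instructions."""
    if flag_retrieved_content(retrieved):
        retrieved = "[content withheld: possible embedded instructions]"
    return (
        f"{system}\n\n"
        f"<untrusted_data>\n{retrieved}\n</untrusted_data>\n\n"
        f"User: {user}"
    )
```

The delimiter wrapping addresses the root cause named in the description: giving the model a structural cue to separate trusted instructions from untrusted data.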

Attacker Objective

AITech-1.3: Goal Manipulation

Description

Actors gain access to LLM/agent systems, functions, data, or resources without proper authorization through exploitation of authentication weaknesses, access control bypasses, or privilege boundaries. This includes both gaining initial unauthorized access and accessing restricted capabilities within a legitimately accessed system.

Attacker Objective

AITech-1.4: Multi-Modal Injection and Manipulation

Description

Actors deliberately insert, alter, or create malicious content across various modalities (such as text, image, audio, video, or sensor streams) to influence, corrupt, or exploit an AI system's fusion, interpretation, or decision-making processes. Attacks exploit the integration points where different data types are combined.

Attacker Objective

AITech-2.1: Jailbreak

Description

Actors employ adversarial techniques that circumvent or defeat a model's safety guardrails, content filters, or usage restrictions, enabling output generation that violates the system's policies or intended constraints. The resulting harm comes from whatever guardrail was circumvented: for example, causing the system to violate its operators' policies or produce unsafe output. In a multi-agent system, jailbreaks can become more complex: a jailbreak prompt may be ingested by one agent while designed to target another agent downstream, or parts of the jailbreak may be generated across multiple agent interactions after an initial trigger payload. These multi-agent jailbreaks will likely present differently from standard jailbreak attacks, which are a single prompt or a multi-turn interaction with a single chatbot.

Attacker Objective

AITech-3.1: Masquerading / Obfuscation / Impersonation

Description

Actors introduce a malicious agent that impersonates a legitimate agent, user, system, or authority within a multi-agent ecosystem and causes other agents, systems, or users to accept it as authentic and trust its communications or actions.

Attacker Objective

AITech-4.1: Agent Injection

Description

Actors manipulate an agent system to introduce or execute unauthorized agent instances, malicious agent code, or fake agent capabilities by inserting misleading data, exploiting registration mechanisms, or subverting agent deployment processes. This results in unauthorized agents operating within the system or legitimate agents executing malicious code.

Attacker Objective

AITech-4.2: Context Boundary Attacks

Description

Actors manipulate contextual information that an agent sees, processes, or transmits by exploiting unclear or porous boundaries between different context segments (such as descriptions, prompts, tool outputs, memory, or status messages). An agent may misinterpret the source, trustworthiness, or intent of information.

Attacker Objective

AITech-4.3: Protocol Manipulation

Description

Actors exploit assumptions or ambiguities in structured communication protocols (e.g., JSON, XML, YAML, API specifications, custom schemas) used for agentic communication by introducing adversarial inputs that corrupt parsing, processing, or interpretation.
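A typical defense against this objective is strict, fail-closed parsing: reject messages with unknown fields, duplicate keys, or out-of-schema values instead of silently accepting attacker-shaped input. The sketch below is a minimal illustration — the field names, allowed actions, and the `parse_agent_message` helper are hypothetical, not drawn from any specific agent protocol.

```python
import json

# Hypothetical message schema for agent-to-agent calls; field names
# and actions are illustrative, not from any specific protocol.
ALLOWED_FIELDS = {"sender", "action", "payload"}
ALLOWED_ACTIONS = {"query", "report"}

def parse_agent_message(raw: str) -> dict:
    """Parse and strictly validate a structured agent message,
    rejecting unknown fields and duplicate keys rather than letting
    parser ambiguity decide which value wins."""
    def reject_duplicates(pairs):
        keys = [k for k, _ in pairs]
        if len(keys) != len(set(keys)):
            raise ValueError("duplicate keys in message")
        return dict(pairs)

    msg = json.loads(raw, object_pairs_hook=reject_duplicates)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    extra = set(msg) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    if msg.get("action") not in ALLOWED_ACTIONS:
        raise ValueError("unknown action")
    return msg
```

Duplicate-key handling matters here because many JSON parsers silently keep the last value, which is exactly the kind of interpretation ambiguity this objective exploits.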

Attacker Objective

AITech-5.1: Memory System Persistence

Description

Actors insert poisoned, harmful, or adversarial data or instructions into an agent or model's short-term or long-term memory to corrupt its behavior over extended periods of time.

Attacker Objective

AITech-5.2: Configuration Persistence

Description

Actors subvert an agent's internal configuration, settings, or initialization parameters to create lasting malicious influence over its behavior. This goes beyond a single, isolated prompt injection by modifying the system's core settings, context, or memory that defines its behavior. An example is modifying MCP configuration to include a malicious server or inserting malicious instructions into an integrated development environment's internal config files.
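Because this objective relies on changes that persist across sessions, a natural countermeasure is configuration attestation: fingerprint the config at a known-good point and check both the fingerprint and a server allowlist on startup. The sketch below is an assumption-laden illustration — the `servers` key, the allowlist, and the helper names are hypothetical, and a real deployment would pin trusted entries in a signed, access-controlled store.

```python
import hashlib
import json

# Hypothetical allowlist of trusted MCP-style server endpoints.
TRUSTED_SERVERS = {"https://tools.internal.example/mcp"}

def config_fingerprint(config: dict) -> str:
    """Stable hash of a config, for change detection across restarts."""
    canonical = json.dumps(config, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def check_config(config: dict, expected_fingerprint: str) -> list:
    """Return a list of findings; an empty list means the config passed."""
    findings = []
    if config_fingerprint(config) != expected_fingerprint:
        findings.append("config changed since last attestation")
    for server in config.get("servers", []):
        if server not in TRUSTED_SERVERS:
            findings.append(f"untrusted server: {server}")
    return findings
```

Running such a check at agent startup turns a silent persistence mechanism into a detectable event.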

Attacker Objective

AITech-6.1: Training Data Poisoning

Description

Actors insert malicious, inaccurate, or corrupted data into a model's training pipeline, with the aim of distorting the model's outputs, embedding payloads, degrading performance, or introducing systematic biases.

Attacker Objective

AITech-7.1: Reasoning Corruption

Description

Actors subvert a model/agent's planning, decision-making, or logical inference processes to produce unintended or harmful outcomes. As AI agents are increasingly trusted with autonomous actions, this attack targets the agent's "reasoning engine," i.e., the core mechanisms for analysis, judgment, and action selection.

Attacker Objective

AITech-7.2: Memory System Corruption

Description

Actors inject false, malicious, or adversarial data into an AI agent's memory systems to cause immediate harmful decisions, corruption of system functionality or degraded performance.

Attacker Objective

AITech-7.3: Data Source Abuse and Manipulation

Description

Actors inject malicious, misleading, or poisoned data into external data sources (APIs, databases, RAG systems) that AI systems consume in order to manipulate model behavior, corrupt decision-making, or inject malicious instructions through trusted data channels.

Attacker Objective

AITech-7.4: Token Manipulation

Description

Actors misuse, poison, or tamper with tokens used by model/agentic systems (e.g., model-specific tokens, or security tokens) to control access, encode information, or manage interactions.

Attacker Objective

AITech-8.1: Membership Inference

Description

Actors determine whether a specific data point, record, or individual was included in a model's training dataset, which can expose sensitive information about training data subjects, reveal proprietary datasets, or indicate the presence of confidential information in model training.

Attacker Objective

AITech-8.2: Data Exfiltration / Exposure

Description

Actors intentionally or unintentionally cause sensitive information to be exposed through model outputs, API responses, or system behaviors. This includes leakage of personal data, intellectual property, proprietary algorithms, training data, system internals, or other confidential information to unauthorized parties.

Attacker Objective

AITech-8.3: Information Disclosure

Description

Actors trigger the unintentional or unauthorized exposure, leakage, or sharing of data, model parameters, or system details to parties who should not have access to them. The disclosure occurs through model behaviors, responses, or system interactions that reveal more information than intended.

Attacker Objective

AITech-8.4: Prompt/Meta Extraction

Description

Actors extract, reveal, or reconstruct the system prompts, meta-prompts, hidden instructions, initial context, or operational directives that define a model/agent's behavior, constraints, and capabilities. This can expose intellectual property, security mechanisms, and operational logic.
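One simple output-side control against this objective is an egress check that flags responses reproducing long verbatim spans of the system prompt. The sketch below is illustrative: the 40-character window is an arbitrary threshold, not a standard, and the helper name is hypothetical.

```python
def leaks_system_prompt(output: str, system_prompt: str, window: int = 40) -> bool:
    """Flag a model response that reproduces a long verbatim span of the
    system prompt (a simple canary-style egress check; the window size
    is an illustrative threshold, not a standard)."""
    haystack = " ".join(output.split()).lower()
    needle = " ".join(system_prompt.split()).lower()
    if len(needle) < window:
        return False
    # Slide a fixed-size window over the prompt and look for any
    # verbatim reproduction of that span in the output.
    for start in range(len(needle) - window + 1):
        if needle[start:start + window] in haystack:
            return True
    return False
```

Verbatim matching catches naive extraction; paraphrased leaks would need semantic comparison, which this sketch deliberately does not attempt.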

Attacker Objective

AITech-9.1: Model or Agentic System Manipulation

Description

Actors coerce or manipulate a model/agent into producing false, misleading, or malicious outputs through adversarial inputs, context manipulation, or exploiting model behaviors, without directly modifying the model itself.

Attacker Objective

AITech-9.2: Detection Evasion

Description

Actors manipulate or alter the input data, behaviors, or attack patterns to intentionally bypass security monitoring, content moderation, anomaly detection, and safety mechanisms.

Attacker Objective

AITech-9.3: Dependency / Plugin Compromise

Description

Actors insert malicious code, backdoors, or vulnerabilities into third-party dependencies, plugins, extensions, or libraries used by models/agents/AI applications, creating supply chain attacks that affect all systems using the compromised component.

Attacker Objective

AITech-10.1: Model Extraction

Description

Actors attempt to reconstruct or replicate a model partially or entirely by interacting with it through its APIs, inference endpoints, or other access points, with the goal of replicating its functionality, architecture, parameters, or outputs without authorized access.

Attacker Objective

AITech-10.2: Model Inversion

Description

Actors reconstruct private training data, sensitive features, or confidential information from a trained model by exploiting the model's learned representations, decision boundaries, or output patterns.

Attacker Objective

AITech-11.1: Environment-Aware Evasion

Description

Actors manipulate the model by exploiting aspects of its operational context, or "environment," to induce malicious or unintended behavior. Rather than simply altering a user's prompt, this type of attack crafts input to leverage the system's broader state and external interactions to bypass safety measures.

Attacker Objective

AITech-11.2: Model-Selective Evasion

Description

Actors craft prompts or inputs specifically designed to bypass the safety controls, content filters, or detection mechanisms of a particular target LLM, while similar or related models would successfully detect and refuse the malicious input. This technique exploits model-specific quirks, training differences, and implementation variations.

Attacker Objective

AITech-12.1: Tool Exploitation

Description

Actors abuse connected tools, APIs, functions, or external capabilities through subversive prompts, commands, or context manipulation. This can include attacks like: unauthorized tool execution, exploitation of security vulnerabilities in tools, tool poisoning, injection attacks through tool interfaces.

Attacker Objective

AITech-12.2: Insecure Output Handling

Description

Actors exploit insufficient sanitization, validation, filtering, or security controls applied to content generated by models or agents before it is passed to downstream systems, databases, APIs, or components. This inadequate handling enables injection attacks, code execution, or broader system compromise.
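The standard mitigation for this objective is to treat model output as untrusted input to the next system: encode or quote it for each downstream context before it is used. The sketch below shows two context-specific examples using Python's standard library; the wrapper names are illustrative.

```python
import html
import shlex

def render_model_output(text: str) -> str:
    """HTML-escape model output before embedding it in a web page, so
    generated markup cannot execute as script (stored/reflected XSS)."""
    return html.escape(text)

def safe_shell_arg(text: str) -> str:
    """Quote model output before it reaches a shell command line, so
    embedded metacharacters are treated as data, not syntax."""
    return shlex.quote(text)
```

The key design point is that sanitization is per-context: HTML escaping does nothing for a shell, and shell quoting does nothing for a browser, so each downstream sink needs its own encoding step.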

Attacker Objective

AITech-13.1: Disruption of Availability

Description

Actors prevent authorized users from accessing or using AI models, agents, or their services through resource exhaustion, system degradation, or complete service outage.

Attacker Objective

AITech-13.2: Cost Harvesting / Repurposing

Description

Actors leverage computational or memory attacks that exploit AI systems to generate excessive costs for the service provider or owner through resource abuse, unauthorized usage for attacker benefit, or deliberate cost inflation. Examples can include overwhelming human-in-the-loop processes and other denial of service conditions.

Attacker Objective

AITech-14.1: Unauthorized Access

Description

Actors gain unauthorized access to the underlying infrastructure, deployment environment, or hosting systems of AI models and agents (e.g., servers, databases, cloud accounts). This is distinct from accessing the model/agentic system itself through exploitation.

Attacker Objective

AITech-14.2: Abuse of Delegated Authority

Description

Actors exploit legitimate delegation mechanisms where agents are granted authority to act on behalf of users, systems, or other agents, and use this delegated authority to perform unauthorized actions, exceed intended scope, or abuse trust relationships for malicious purposes.

Attacker Objective

AITech-15.1: Harmful Content

Description

Actors prompt generative models or agents to produce unauthorized, unsafe, or malicious content, including harmful instructions, toxic speech, misinformation, or unsafe media. The attack targets weaknesses in content guardrails to bypass safety constraints and generate content that should otherwise be restricted.

Attacker Objective

AITech-16.1: Eavesdropping

Description

Actors conduct unauthorized passive monitoring, interception, or extraction of data from model/agentic system communications, including agent-to-agent messages, agent-to-user interactions, API calls, or other internal communications without directly manipulating the data flow.

Attacker Objective

AITech-17.1: Sensor Spoofing

Description

Actors use highly realistic but deceptive signals or manipulate sensor data to deliberately mislead an AI system and its decision-making processes through falsified sensory inputs.

Attacker Objective

AITech-18.1: Fraudulent Use

Description

Actors leverage models/agents specifically for generating, executing, scaling, or automating fraud, scams, deception, and other illegal activities.

Attacker Objective

AITech-18.2: Malicious Workflows

Description

Actors leverage AI models to generate, execute, and scale harmful activities such as cyberattacks, data theft, and disinformation campaigns, among other malicious purposes.

Attacker Objective

AITech-19.1: Cross-Modal Inconsistency Exploits

Description

Actors exploit contradictions, mismatches, or inconsistencies between different data modalities (i.e., text, image, audio, video) to confuse AI systems, bypass security controls, evade detection, or manipulate decision-making by presenting conflicting information across modalities that the system cannot properly reconcile.

Attacker Objective

AITech-19.2: Fusion Payload Split

Description

Actors exploit weaknesses in fusion algorithms, arbitration logic, and decision-making processes to manipulate final system behavior. They combine payload components delivered through different inputs, outputs, or sources (including data sources, multiple models, or agents), using the system's fusion or arbitration mechanisms to assemble them into an injection or attack payload.

Last updated March 27, 2026.