AIUC-1 Research
Rajiv Dattani & Emil Lassen
Feb 23, 2026 · 6 min read

Setting the standard for voice AI agents

Voice agents are growing rapidly in demand – but come with their own unique risks that must be managed appropriately. This is how we evolved AIUC-1 to cover voice agent security, safety and reliability, working with Technical Contributors from Stanford University, ElevenLabs and Gray Swan.

Voice AI can deliver immense value: agents can autonomously resolve customer support issues, process refunds, route calls, qualify leads, and much more, all in a natural, human-sounding way. It is therefore no surprise that demand for voice agents has grown significantly over the past year.

Voice agents introduce trust and safety challenges that the research community has been studying closely over the past year. Integrating these insights into AIUC-1 was an opportunity to ensure that what we know about adversarial robustness, manipulation risks, and vulnerable user protection is reflected in the standards that enterprises will actually deploy against.

Professor Sanmi Koyejo, Leader, Stanford Trustworthy AI Research Lab

As AI agents evolve from text-based chatbots to voice-first conversational systems capable of real-time reasoning, emotional nuance, and autonomous action, new safeguards are needed for enterprises to adopt these agents with confidence and unlock the full value.

New risks of voice agents

Voice AI introduces a fundamentally different threat landscape compared to text-based systems. Three categories of risk stand out.

Misuse of generated audio: Every generated audio clip can be shared, repurposed, and weaponized in ways that text outputs rarely can. Voice cloning technology — when left ungoverned — enables impersonation of executives, celebrities, or private individuals at scale. Fraudsters have already used cloned voices to authorize wire transfers and deceive family members. The ease of capturing a voice sample and replicating it means that the barrier to misuse is dangerously low.

Real-time manipulation: Unlike a text chatbot that a user reads and reflects on, a voice agent operates in real time, under social pressure, and often with a human-like persona. This creates conditions for manipulation — both of the user by a malicious agent, and of the agent itself through adversarial prompting delivered verbally. Users are more likely to comply with requests from a voice they perceive as authoritative or trustworthy, raising the stakes when guardrails fail.

Expanded attack surface: Voice agents typically integrate with backend systems — CRMs, payment processors, scheduling tools — to take autonomous action. A compromised or poorly governed voice agent is therefore not just a reputational risk; it is a direct pathway to data exfiltration, unauthorized transactions, and operational disruption. Audio inputs also introduce new vectors for prompt injection that are harder to detect and filter than their text equivalents.

Voice systems expose a class of adversarial vulnerabilities that most security teams haven't encountered before — acoustic injection, prosodic manipulation, cross-lingual attack transfer. Stress-testing these vectors systematically, rather than waiting for them to be exploited in the wild, is what separates a robust voice agent from a liability — and what we focused on when adapting AIUC-1 testing requirements.

Zico Kolter, Co-founder & Chief Scientist, Gray Swan

Taken together, these risks mean that deploying voice AI without purpose-built safeguards is not a calculated risk — it is an unmanaged one.

Voice-specific requirements now integrated in AIUC-1

In collaboration with researchers from Stanford University working on trustworthy AI, ElevenLabs security leaders providing real-world insights, and experts in adversarial threats from Gray Swan, AIUC-1 has now been adapted to cover voice-specific requirements. 

Below we highlight a subset of the requirements that voice AI vendors must meet to earn AIUC-1 certification, grouped across the six AIUC-1 principles. The full list comprises 40+ requirements, including frequent technical testing of voice-specific risks.

Several requirements were already updated in the January 15, 2026 release of AIUC-1; additional updates, covering e.g. voice cloning, vulnerable user protection, and safeguards against misuse, are planned for the April 15, 2026 release.

A. Data & Privacy

  • Data ownership - A001 & A002: Clarify ownership of voice inputs and outputs, including for training purposes.
  • Voice data protection & consent - A007: Ensure proper safeguards for voice biometrics and audio data to prevent unauthorized cloning or identity theft — including clear, explicit consent processes governing when and how a voice may be cloned.

B. Security

  • Adversarial testing - B001: Include voice-specific attack vectors such as background noise injection, acoustic adversarial inputs, and attempts to manipulate agent behavior through speech characteristics such as tone, urgency, or impersonation.
  • Adversarial input detection - B002: Deploy real-time classifiers to detect and block adversarial inputs before they reach the model, including jailbreak attempts delivered via spoken language and prompt injection embedded in audio streams.
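To make the B002 requirement concrete, here is a minimal, hypothetical sketch of an input-screening gate that sits between speech transcription and the agent model. A production classifier would be a trained model rather than keyword matching; the `INJECTION_MARKERS` list, function names, and stubbed model call below are illustrative assumptions, not part of AIUC-1 itself.

```python
from dataclasses import dataclass

# Illustrative markers of spoken jailbreak / prompt-injection attempts.
# A real deployment would use a trained classifier, not a keyword list.
INJECTION_MARKERS = (
    "ignore previous instructions",
    "you are now",
    "system prompt",
)

@dataclass
class ScreenResult:
    allowed: bool
    reason: str

def screen_transcript(transcript: str) -> ScreenResult:
    """Screen a transcribed turn before it reaches the agent model."""
    lowered = transcript.lower()
    for marker in INJECTION_MARKERS:
        if marker in lowered:
            return ScreenResult(False, f"injection marker: {marker!r}")
    return ScreenResult(True, "clean")

def call_agent_model(transcript: str) -> str:
    # Stub for the downstream model call.
    return f"[agent response to: {transcript}]"

def handle_turn(transcript: str) -> str:
    result = screen_transcript(transcript)
    if not result.allowed:
        # Blocked inputs are refused (and would be logged) rather than
        # forwarded to the model.
        return "I can't help with that request."
    return call_agent_model(transcript)
```

The key design point is that screening happens before model inference, so a blocked input never influences the agent's behavior or its tool calls.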

C. Safety

  • Harmful output prevention - C003: Implement guardrails and filtering to detect and block harmful voice outputs — including aggressive or threatening tone, biased or discriminatory language, and emotionally manipulative speech patterns that could cause harm or damage brand trust.
  • Vulnerable user protection - C005 & C007: Implement detection mechanisms to identify interactions with vulnerable users — including minors and individuals in emotional distress — and adapt agent behavior accordingly, including escalation to human agents or safe exit protocols.
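As a rough illustration of the C005/C007 escalation logic, the sketch below routes a turn to a human when distress signals or a minor are detected. The `DISTRESS_SIGNALS` list and routing labels are illustrative placeholders; real detection would combine acoustic and semantic classifiers.

```python
# Illustrative distress phrases; a production system would use trained
# classifiers over both the transcript and vocal features.
DISTRESS_SIGNALS = ("hurt myself", "can't cope", "emergency")

def route_turn(transcript: str, caller_is_minor: bool) -> str:
    """Decide the handling path for one conversational turn."""
    lowered = transcript.lower()
    if caller_is_minor or any(s in lowered for s in DISTRESS_SIGNALS):
        # Adapt behavior: hand off to a human or trigger a safe exit.
        return "escalate_to_human"
    return "continue_agent"
```

The point of the sketch is that escalation is a routing decision made on every turn, not a one-time check at the start of the call.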

D. Reliability

  • Hallucinations - D001 & D002: Continuously evaluate voice agents for hallucination rates across languages, dialects, and accents — ensuring factual reliability does not degrade for non-English speakers or users with non-standard speech patterns.
  • Tool call verification - D003 & D004: Implement verbal confirmation protocols before executing consequential tool calls — such as processing payments, booking appointments, or modifying account data.
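The D003/D004 confirmation protocol can be sketched as a gate around tool execution: consequential tools require explicit assent before they run. Everything here is a hypothetical illustration — the `CONSEQUENTIAL_TOOLS` set, the callback, and the stubbed backend are assumptions, and a real agent would synthesize the confirmation prompt as speech and parse the caller's spoken reply.

```python
# Tools whose execution requires verbal confirmation (illustrative set).
CONSEQUENTIAL_TOOLS = {"process_payment", "modify_account", "book_appointment"}

def run_tool(tool_name: str, args: dict) -> str:
    # Stub for the real backend call (CRM, payment processor, etc.).
    return f"executed {tool_name}"

def execute_tool(tool_name: str, args: dict, confirm_with_user) -> str:
    """Run a tool call, gating consequential ones behind confirmation.

    `confirm_with_user` takes a prompt string and returns True only if
    the caller verbally confirmed the action.
    """
    if tool_name in CONSEQUENTIAL_TOOLS:
        prompt = (
            f"To confirm, you want me to {tool_name.replace('_', ' ')} "
            f"with {args}. Is that correct?"
        )
        if not confirm_with_user(prompt):
            return "cancelled"
    return run_tool(tool_name, args)
```

For example, `execute_tool("process_payment", {"amount": 40}, confirm)` only reaches the payment backend if `confirm` returns True, while a low-stakes lookup runs without the extra round trip.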

E. Accountability

  • Acceptable use - E010: Define clear prohibited use policies specific to voice AI — including restrictions on impersonation, synthetic voice fraud, and unauthorized cloning — and enforce them through active monitoring, user vetting, and sanctions for violations.
  • Transparency documentation - E017: Maintain up-to-date, publicly accessible documentation of voice system capabilities, known limitations, and the safety and security controls in place — enabling enterprise customers and auditors to make informed deployment decisions.

F. Society

  • Safeguard against misuse - F001: Ensure that voice AI cannot be exploited for fraud, impersonation, election interference, or other societal harms.

Outlook for secure, safe and reliable voice agents

Voice AI is maturing rapidly, and so is the security and safety infrastructure required to deploy it responsibly at scale. The risks are real — but they are manageable when addressed systematically, and the industry is beginning to converge on what good looks like.

ElevenLabs, one of the leading voice AI platforms and a Technical Contributor to AIUC-1, offers a concrete example of what this looks like in practice, moving beyond reactive enforcement toward proactive, architecture-level safeguards.

Voice AI interactions have novel risks that existing frameworks weren't designed to address. Working with AIUC-1 as a technical contributor let us translate what we've learned building and hardening these systems into concrete, auditable requirements. The voice-specific additions to AIUC-1 reflect real-world threat scenarios, not theoretical ones.

Marco Mancini, Head of Safety & Security, ElevenLabs

Their approach includes publicly accessible AI speech classifiers that enable third-party deepfake detection, model-level blocking of high-risk voice clones, and technological verification gates for advanced cloning features, functioning as the voice equivalent of CAPTCHA.

Continuous monitoring via AI classifiers and human reviewers, combined with public abuse reporting mechanisms and law enforcement referrals, is becoming the baseline expectation for platforms operating at scale. Participation in provenance standards like C2PA signals a broader shift toward ecosystem-level accountability, where the origin of AI-generated audio can be traced and verified across platforms.

These emerging practices directly informed the voice-specific requirements now codified in AIUC-1 — helping ensure the framework reflects operational realities rather than theoretical risk models.