What are EchoLeak and GeminiJack, and what did they demonstrate?

EchoLeak (CVE-2025-32711) and GeminiJack are vulnerabilities discovered in 2025 in Microsoft 365 Copilot and Google Gemini Enterprise, respectively. An attacker could send a single email or share a document containing hidden instructions that, when later processed by the AI assistant, caused it to silently search the victim's emails, files, and documents and transmit the contents to the attacker. No user interaction was required. Both have been patched, but they demonstrated that AI email agents require architectural security controls - not just vendor patches - to be deployed safely.

What is the 'lethal trifecta' and how do we address it?

The lethal trifecta, named by AI researcher Simon Willison in June 2025, is the combination of three properties in any AI agent: access to private data, exposure to untrusted content, and an exfiltration channel like the ability to render images or follow links. If your AI agent has all three - and virtually every email-connected agent does - it is structurally exploitable. The defense strategy is to break at least one leg of the trifecta at the infrastructure level. In practice, this typically means severing the exfiltration channel through information-flow controls, scoping permissions tightly, and requiring human confirmation for irreversible actions.

Can attackers really hide instructions that are invisible to humans but visible to AI?

Yes, and this has been demonstrated in production systems. Techniques include white text on a white background, zero-width Unicode characters between words, HTML comment nodes, instructions embedded in image alt attributes, and prompt injections buried in shared documents or calendar invitations. The AI agent processes the raw text, not the visual rendering, so it reads instructions that no human would ever see. This is how both EchoLeak and GeminiJack operated. Input sanitization - stripping these hidden elements - is one of the five defense layers that experienced teams deploy.

Is prompt injection a challenge that will be permanently solved?

Current evidence suggests prompt injection will be a persistent architectural consideration rather than a one-time fix. In December 2025, OpenAI stated that prompt injection 'is unlikely to ever be fully solved,' and the U.K.'s National Cyber Security Centre issued a similar assessment. However, this is consistent with many security challenges that organizations manage successfully every day. Business email compromise has not been 'solved' either, but companies with proper controls have reduced their exposure by orders of magnitude. The same layered defense approach - input sanitization, information-flow control, least-privilege permissions, human confirmation gates, and behavioral monitoring - makes prompt injection a manageable risk rather than an unacceptable one.

What is the single highest-leverage step we can take right now?

Require human confirmation for every irreversible action your AI agent can take. Sending email, modifying records, sharing documents externally, initiating any communication outside your organization - all of these should require a human approval step in a separate interface, presented to a person who can see what the agent is about to do and why. This does not prevent prompt injection, but it prevents injected instructions from causing real damage. It is the highest-leverage single control, and it can be implemented quickly without rearchitecting your entire agent pipeline.

How is this different from the email security we already have in place?

Traditional email security looks for threats aimed at humans: malicious attachments, deceptive links, impersonation attempts. Generation Six attacks target the AI agent, not the person. The email may look completely benign to every existing security filter and to every human who reads it. The payload is not a link or a file - it is hidden text that only becomes dangerous when an AI model processes it. Your existing email gateway, spam filters, and security awareness training remain important but do not address this attack class. The good news is that the additional controls needed - input sanitization, permission scoping, human confirmation gates, and behavioral monitoring - complement your existing security stack rather than replacing it.

The Six-Generation Pattern Behind Every Email Attack - and How to Stay Ahead of Generation Seven

In January 2025, researchers at Aim Security sent a single email to a Microsoft 365 Copilot environment. No malicious links. No infected attachments. No social engineering. Just a carefully crafted message containing hidden prompt injection instructions disguised as ordinary business correspondence. When an employee later asked Copilot to "summarize my recent emails," the AI retrieved the attacker's message, followed the hidden instructions, and silently transmitted confidential files to an external server.

Microsoft assigned it CVE-2025-32711 with a severity score of 9.3 out of 10. The researchers named it EchoLeak.

What made EchoLeak significant was not the exploit itself. It was the pattern it fit into. Every time email has gained a new capability over the past 27 years - attachments, HTML rendering, clickable links, trusted-sender identity, AI-generated content, and now AI agents - attackers found the new attack surface within months. The organizations that came out ahead in each cycle were not the ones with the biggest security budgets. They were the ones that recognized the pattern one generation earlier than their peers.

That pattern recognition is what separates proactive security from reactive damage control. And right now, it points clearly to AI email agents as the surface that needs attention.

Six Generations, One Playbook

Looking across 27 years of email attacks, a consistent cycle emerges: a new capability is deployed for productivity, attackers discover the attack surface it creates, exploitation scales, and the industry builds defenses - typically after significant damage. The value in understanding this history is not to catalog disasters. It is to spot where we are in the current cycle and act accordingly.

Generation 1 - Executable Attachments (1999-2002). The Melissa virus hit in March 1999, spreading through Microsoft Word macros so fast that Microsoft, Intel, and the Marine Corps shut down their email gateways within hours. The FBI estimated $80 million in damages. A year later, ILOVEYOU infected 45 million computers in 10 days, causing an estimated $10 billion in damage. The new capability was email attachments carrying executable code. The industry responded with attachment scanning, file-type blocking, and macro restrictions. Effective defenses - but built after the damage.

Generation 2 - HTML Rendering (2003-2007). When email clients started rendering HTML, they became tiny web browsers sitting inside your inbox. Attackers embedded invisible iframes, tracking pixels, and scripts that executed the moment someone opened a message. Phishing emails became pixel-perfect replicas of bank login pages. The organizations that fared best were those that restricted HTML rendering in their email clients before the threat fully materialized.

Generation 3 - Weaponized Links (2007-2015). As attachment filters improved, attackers shifted tactics. They sent clean emails with malicious links - no payload, no macro, no executable. The Anti-Phishing Working Group documented a 162% increase in phishing sites between 2010 and 2014. The industry built URL reputation scoring and sandboxed link previews. Again, the teams that implemented link-inspection controls earliest absorbed the least damage.

Generation 4 - Business Email Compromise (2015-present). BEC stripped away every technical indicator. No malware, no links, no attachments - just a plain-text email from what appeared to be the CEO, asking the CFO to wire $400,000 to a new vendor account. The FBI's Internet Crime Complaint Center reports $55 billion in cumulative BEC losses worldwide. In the U.S. alone, BEC caused $2.77 billion in reported losses in 2024 across over 21,000 complaints. Sixty-three percent of organizations experienced a BEC attack last year, according to the Association for Financial Professionals. The organizations that built multi-step approval processes for financial transactions early avoided the worst outcomes.

Generation 5 - AI-Enhanced Phishing (2023-2025). Generative AI gave attackers the ability to produce flawless, personalized phishing at industrial scale. A 2024 Oxford University study found that AI-generated phishing emails achieved a 60% higher click rate than human-crafted ones. By March 2025, AI phishing agents were outperforming human social engineers by 24%, according to Hoxhunt research. Some security firms reported phishing volumes up over 1,000% since the launch of generative AI tools. The teams that invested in behavioral analysis rather than relying solely on pattern-matching filters adapted fastest.

Figure 1. Six generations of email attacks, each following the same cycle. Generation Six is currently in the proactive defense window - the stage where preparation delivers the highest return.

Generation Six: What the EchoLeak and GeminiJack Disclosures Revealed

EchoLeak was not an isolated finding. Six months after its disclosure, security researchers at Noma Labs disclosed GeminiJack - a structurally identical vulnerability in Google Gemini Enterprise. An attacker could embed hidden instructions in a shared Google Doc, a calendar invitation, or an email. When any employee queried Gemini Enterprise for anything that surfaced the poisoned content - "show me our budgets," for example - the AI retrieved the document, executed the hidden instructions, searched across Gmail, Calendar, and Docs for sensitive data, and exfiltrated the results through a disguised image URL. No click. No notification. No security alert.

Both vulnerabilities have been patched. But what matters more than the individual patches is what these disclosures revealed about the architecture of AI email agents - and how experienced security teams are already addressing it.

Understanding the Architecture: The "Lethal Trifecta" and How to Break It

In June 2025, AI researcher Simon Willison named the underlying structural pattern. He called it the "lethal trifecta" - three properties that, when combined in any AI agent, create an exploitable surface for data exfiltration through prompt injection:

1. Access to private data. The agent can read emails, documents, calendars, and internal files. This is the entire point of deploying it.

2. Exposure to untrusted content. The agent processes input from sources outside your organization - incoming emails, shared documents, web pages, calendar invitations from external parties.

3. An exfiltration channel. The agent can make external requests - rendering images, following links, calling APIs, or sending messages. Any of these can carry data outbound.

If your AI email agent has all three - and nearly every commercial deployment does - the architecture is exploitable. As Willison put it: "If your agentic system has all three, it is vulnerable. Period."

The good news: this framework also provides the blueprint for defense. You do not need to eliminate all three properties. You need to break at least one leg of the trifecta at the infrastructure level. In practice, this is exactly what the organizations getting ahead of Generation Six are doing - and the approach is well understood.

What the Attack Looks Like in Practice

Before diving into the defense playbook, it helps to understand what a Generation Six attack looks like from the perspective of a company that has not yet secured its AI email agents - because the subtlety is what makes the pattern different from previous generations.

An email arrives in someone's inbox. It looks like a routine vendor message, a meeting request, or a newsletter. Nothing about it triggers a spam filter or a security alert. A human reading it would see nothing unusual.

But buried in the HTML - in white text on a white background, in zero-width Unicode characters, in an image alt attribute, or in a comment tag invisible to every email client - is a set of instructions. Not instructions for a person. Instructions for the AI agent that is about to process this email.

Hours or days later, an employee asks the AI assistant a routine question. The agent retrieves the poisoned email as context. The hidden instructions activate. The agent searches for sensitive files, compiles confidential information, and transmits the data outbound through a channel that looks like normal operation: an image render, a link preview, a formatted response.

This is precisely what EchoLeak and GeminiJack demonstrated against Microsoft 365 Copilot and Google Gemini Enterprise. The important detail: both attacks were stopped once the organizations understood the pattern and applied architectural controls. The vulnerability is real, and the defenses work.

The Defense Playbook: Five Layers That Break the Chain

The organizations that have gotten ahead of Generation Six did not wait for patches or vendor announcements. They recognized the pattern from Generations One through Five and applied a layered defense strategy. Across the published incident write-ups and security advisories from teams deploying AI email agents, the following five controls - implemented in this order - deliver the strongest protection with the most practical deployment path.

Map the trifecta before deployment. Before connecting any AI agent to email, ask three questions: What private data can this agent access? What untrusted content will it process? What external communication channels does it have? If the answer to all three is "yes" - which it almost always is with email agents - treat the deployment as a high-risk integration, not a productivity upgrade. This framing changes every subsequent decision about architecture, permissions, and monitoring. A pattern emerging from public deployment write-ups is that this assessment takes one to two days and saves weeks of remediation later.

Break the trifecta by design. Microsoft's own security team published their approach in July 2025: a framework called FIDES that uses information-flow control to deterministically prevent an agent from moving data between untrusted inputs and external outputs. The principle is straightforward - an agent that reads untrusted email content cannot, architecturally, render external images or follow external links in the same execution context. The exfiltration channel is severed at the infrastructure level, not by asking the model to behave. This is the most technically involved control, and it is also the most durable.

Scope permissions like you mean it. An agent that summarizes email does not need permission to send email. An agent that drafts replies does not need access to the file system. An agent that triages support tickets does not need to modify CRM records. Leaders who get this right define the narrowest possible permission set for each agent role and enforce it with tooling, not with prompt instructions. When ServiceNow disclosed a privilege escalation vulnerability in late 2025 - where a low-privilege AI agent was tricked into asking a higher-privilege agent to act on its behalf - the organizations that had enforced hard permission boundaries at the infrastructure level were unaffected.

Require human confirmation for anything irreversible. Sending an email, modifying a record, sharing a document externally, initiating a payment - any action that cannot be undone requires a human to approve it. Not a prompt that says "are you sure?" to the model. An actual confirmation step in a separate interface, presented to a human who can see what the agent is about to do and why. This is the single highest-leverage control. It does not prevent injection - it prevents injected instructions from causing damage. And it can be implemented quickly without rearchitecting your entire agent pipeline.

Build detection for what sanitization misses. Input sanitization - stripping invisible characters, normalizing Unicode, flagging instruction-like patterns in email content - is necessary but insufficient on its own. The teams that build the most resilient deployments also monitor agent behavior for anomalies: sudden changes in the volume or destination of external requests, data access patterns that do not match the employee's normal workflow, tool invocations that the agent has never used before in that context. Behavioral monitoring catches the attacks that bypass every other layer.

Figure 2. Defense architecture for AI email agents: five layers designed to break at least one leg of the lethal trifecta at every stage of the pipeline.

What About Prompt Injection As a Long-Term Challenge?

An honest assessment: prompt injection is not a bug that gets fixed in the next software update. In December 2025, OpenAI published a blog post acknowledging that prompt injection "is unlikely to ever be fully solved." The U.K.'s National Cyber Security Centre issued its own assessment agreeing that prompt injection "may never be totally mitigated."

This might sound discouraging, but the parallel to previous generations is instructive. BEC - plain-text social engineering with no technical payload - has not been "solved" either. It still causes billions in losses annually. But the organizations that implemented multi-step approval workflows, out-of-band verification for financial transactions, and behavioral analytics reduced their BEC exposure by orders of magnitude. They did not eliminate the threat. They made it manageable through architectural controls.

The same approach applies to prompt injection. The teams that treat it as a permanent architectural consideration - like SQL injection, like cross-site scripting, like social engineering - and build layered defenses accordingly are the ones that will deploy AI email agents confidently and safely. The key question is not whether prompt injection will be solved. It is whether your defenses are layered deeply enough that no single bypass causes real damage.

The Proactive Defense Window

Generation Six is currently between the "attack surface discovered" and "mass exploitation" stages of the pattern. EchoLeak and GeminiJack are CVE-tracked vulnerabilities in the two largest enterprise AI platforms - they are not theoretical. But mass exploitation has not started yet. Only 34.7% of organizations have deployed dedicated prompt injection defenses, according to a VentureBeat survey from early 2026. That means the proactive defense window is still open, and the organizations that act now have a significant structural advantage.

The pattern across all six generations is consistent: the teams that built defenses during the discovery phase - before mass exploitation - absorbed dramatically less damage and adapted faster when the threat landscape evolved. This is not about predicting the future. It is about recognizing a pattern that has repeated five times and positioning accordingly.

At Code Atelier, we focus on AI agent security - including email agents, document processing pipelines, and any system that handles untrusted content. If your team is deploying AI agents and wants to get the security architecture right from the start, we would welcome that conversation.