When Your AI Tools Become the Attacker

When Your AI Tools Become the Attacker

The Problem in One Sentence

The AI tools your organization adopted for productivity — Copilots, MCP servers, Skills, and coding agents — are now actively being weaponized for data theft, credential harvesting, and supply chain attacks.


How It's Happening: Four Attack Patterns With Real Incidents

1. Zero-Click Emails That Weaponize Your Copilot

EchoLeak (CVE-2025-32711) — An attacker sends a normal-looking business email with hidden prompt injection instructions. No attachment, no link. When the victim later asks Microsoft 365 Copilot to summarize emails, Copilot reads the hidden instructions, extracts sensitive data from OneDrive, SharePoint, and Teams, and silently exfiltrates it through a trusted Microsoft domain. CVSS score: 9.3. Zero user interaction required.

Reprompt (2026) — A single click on a legitimate Microsoft URL gives the attacker persistent control over a Copilot session — even after the chat is closed. The attacker's server dynamically adapts follow-up queries based on stolen data, probing deeper into the victim's environment. Client-side monitoring tools see nothing.

Claudy Day (2026) — Researchers chained three Claude.ai vulnerabilities to deliver invisible prompt injections via URL parameters, exfiltrating conversation history and sensitive data from default sessions — no integrations or MCP servers needed.

2. Malicious MCP Servers and Skills Being Distributed at Scale

Postmark MCP Backdoor (2025) — A functional but backdoored MCP server was published to a public registry. One hidden line of code blind-copied every outgoing email — internal memos, password resets, invoices — to the attacker. It looked and worked like a legitimate tool.

CVE-2025-6514 (mcp-remote) — A command injection vulnerability in the most widely-used MCP OAuth proxy affected 437,000+ environments. Malicious server URLs could execute arbitrary commands on client machines, stealing API keys, cloud credentials, and SSH keys. CVSS: 9.6.

ToxicSkills Study — Snyk found that 13% of Claude Code Skills tested contained critical security flaws, and some actively attempted to exfiltrate credentials.

Smithery.ai — A path traversal flaw exposed an auth token controlling 3,000+ hosted MCP servers, potentially allowing mass deployment of rogue servers.

3. AI Coding Agents Hijacked Through Trusted Workflows

Claude Code RCE (CVE-2025-59536) — Simply cloning a repository with a malicious .claude/settings.json file triggered remote code execution and API key theft. Developers trust config files as metadata — attackers exploit that trust.

MCPoison / CurXecute (Cursor IDE) — Once a user approved an MCP config in Cursor, it was never re-validated. Attackers swapped benign configs for malicious payloads silently. In a separate flaw, a single poisoned Slack message achieved full RCE through Cursor's MCP integration.

Supabase Cursor Agent — Attackers embedded SQL instructions in support tickets. The AI agent, running with privileged access, executed them as commands, leaking sensitive tokens into a public thread.

4. AI-Targeted Supply Chain Campaigns

TeamPCP (March 2026) — Starting from a single stolen credential, this group compromised Trivy, Checkmarx KICS, LiteLLM (95M+ monthly downloads), and the Telnyx SDK in eight days. Each package was injected with credential harvesters targeting AWS, GCP, Azure, Kubernetes secrets, and SSH keys. A new target fell every one to three days.


Why Traditional Security Can't See This

These attacks don't use malware. They use words. AI models cannot distinguish between content to process and instructions to execute — and that architectural reality cannot be patched. The attack payload is natural language hidden in emails, documents, tool metadata, and config files. Your antivirus, firewall, and DLP rules were not built for this.


What Leadership Must Do

Before deploying AI tools: Audit existing permissions. AI amplifies permission debt — data that was practically invisible to humans becomes instantly reachable to AI agents.

Before trusting AI integrations: Vet every MCP server, Skill, and connector. Read the code. Pin versions. Treat them as privileged components, not utilities.

After deployment: Log all AI agent traffic. Build behavioural baselines. Require human approval for sensitive actions. Integrate AI monitoring into your SIEM.

Always: Accept that prompt injection is the #1 risk in the OWASP LLM Top 10, with attack success rates exceeding 85% against current defences. There is no silver bullet — only layered defence and continuous vigilance.


The Bottom Line

Malicious AI tools are already circulating in public registries. Zero-click email attacks against enterprise Copilots have already been demonstrated. AI supply chain campaigns are already hitting production infrastructure.


Incident Impact Scale
EchoLeak Zero-click Copilot data exfiltration via email All M365 Copilot users
Reprompt Persistent session hijacking, single click Copilot Personal users
Postmark MCP All outgoing emails copied to attacker Registry-wide
mcp-remote (CVE-2025-6514) Full RCE on client machines 437,000+ environments
TeamPCP Campaign Cascading credential theft across 5 packages Millions of CI/CD pipelines
Claude Code RCE Token theft via malicious repo configs All Claude Code users
Cursor MCPoison / CurXecute Silent persistent RCE via config swap All Cursor users
ToxicSkills 13% of Skills had critical flaws Claude Skills ecosystem

Read more