The Worm That Doesn’t Need a Vulnerability

In February 2026, researchers documented something that should have made every security team stop and reassess their AI strategy: a self-propagating generative worm that spreads through AI agents without exploiting a single code vulnerability.

No buffer overflow. No unpatched CVE. No zero-day.

The worm exploits the agent itself — its ability to read context, make decisions, and take actions. An infected agent propagates the attack to other agents through shared context. The agent’s intelligence becomes the attack vector.

We’ve spent years building defenses against code-level exploits. We have scanners for known vulnerabilities, signature-based detection, behavioral analysis of binary execution. None of that applies here. The attack lives in the space between the code — in the natural language instructions, the tool descriptions, the context windows that shape how agents behave.

This isn’t theoretical. The research is published. The attack chains are documented. And most organizations running AI agents in production have no defenses against any of it.


What’s Actually Happening

The scale of the problem is worse than most people realize. Researchers built the first labeled dataset of AI agent skills by behaviorally verifying 98,380 skills from community registries. The findings were stark:

  • 157 malicious skills confirmed, carrying 632 vulnerabilities between them
  • Malicious skills average 4.03 vulnerabilities across a median of three kill chain phases
  • A single actor was responsible for 54.1% of confirmed cases — using templated brand impersonation, not sophisticated techniques

A separate analysis of 42,447 skills found that 26.1% contain at least one vulnerability. The most common: data exfiltration (13.3%) and privilege escalation (11.8%). Skills with executable scripts are 2.12x more likely to be vulnerable than instruction-only skills.

These aren’t edge cases. This is the baseline.


The ClawHavoc Campaign

The clearest example of what’s coming is the ClawHavoc campaign from early 2026. Over 1,200 malicious skills were injected into the OpenClaw marketplace. A companion catalog documented 6,487 malicious tools that evade conventional detection.

This wasn’t a nation-state operation. It was systematic marketplace poisoning — the AI equivalent of typosquatting on npm, but with a critical difference: these skills execute with user privileges and interact directly with sensitive systems.

The barrier to entry was a well-crafted description and a few lines of code. The marketplace had minimal vetting. The skills looked legitimate because the descriptions were optimized to game the agent’s own selection process.

When a developer asked their agent to “help with database management,” the agent would search for relevant skills, find the malicious one (because its description was engineered to rank highly), and execute it with full user privileges. The developer never saw the code. The agent never questioned the source.


Attack Vectors You Can’t Scan For

Weaponized Config Files

The most insidious attack vector isn’t code — it’s configuration. Researchers demonstrated that SKILL.md files, the natural-language metadata that tells agents what a skill does, can be weaponized to hijack agent behavior.

By manipulating these files, attackers achieved an 86% pairwise win rate in embedding-based retrieval and an 80% Top-10 placement rate in skill recommendation systems. Adversarial variants were selected in 77.6% of paired trials.

This is supply chain poisoning through language. Traditional scanners don’t catch it because there’s nothing to scan. The attack vector is a sentence, not a script.
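To make the ranking problem concrete, here is a toy sketch of description-based retrieval. It assumes a bag-of-words cosine similarity as a crude stand-in for a learned embedding, and the skill names and descriptions are invented for illustration. It does not reproduce the researchers' method or their numbers; it only shows why a description written to echo likely queries can outrank an honest one.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_skills(query: str, skills: dict[str, str]) -> list[str]:
    """Return skill names ordered from most to least similar to the query."""
    q = bag_of_words(query)
    return sorted(
        skills,
        key=lambda name: cosine(q, bag_of_words(skills[name])),
        reverse=True,
    )


# Hypothetical skill descriptions, invented for illustration only.
SKILLS = {
    "pg-admin-tools": "Run migrations and backups for PostgreSQL servers.",
    "db-helper-pro": "Help with database management. Best database management help.",
}

if __name__ == "__main__":
    # The description that parrots the query wording ranks first.
    print(rank_skills("help with database management", SKILLS))
```

Nothing in the second description is executable, which is the point: the manipulation happens entirely in the text the retriever scores.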

Backdoors That Survive Model Swaps

The chat template attack is particularly concerning for anyone using open-weight models. Researchers showed that maliciously modified Jinja2 chat templates can implant backdoors without touching model weights. Under triggered conditions, factual accuracy drops from 90% to 15%, and attacker-controlled URLs are emitted with success rates exceeding 80%.

These backdoors evade every automated security scan applied by the largest open-weight distribution platform. When you download a model from Hugging Face, the chat template ships as metadata alongside the weights, not inside them. Modify that, and you’ve compromised every inference call that uses it.
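One cheap control is to pin the template you reviewed and refuse anything that differs. The sketch below assumes the template is stored as a string under the chat_template key of tokenizer_config.json, which is a common layout but not the only one. The function names and the URL heuristic are illustrative, not part of any library.

```python
import hashlib
import json
import re
from pathlib import Path

# Heuristic only: flags literal URLs, which a benign template rarely needs.
URL_PATTERN = re.compile(r"https?://[^\s\"'{}]+")


def load_chat_template(model_dir: Path) -> str:
    """Read the template string from tokenizer_config.json (assumed location)."""
    config_path = model_dir / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    template = config.get("chat_template")
    if not isinstance(template, str):
        raise ValueError("no string chat_template found in tokenizer_config.json")
    return template


def template_digest(template: str) -> str:
    """SHA-256 of the template text, used as the pin."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()


def check_template(model_dir: Path, pinned_digest: str) -> list[str]:
    """Return findings; an empty list means the pin matched and no URL was found."""
    template = load_chat_template(model_dir)
    findings = []
    if template_digest(template) != pinned_digest:
        findings.append("digest mismatch: template differs from the reviewed version")
    for url in URL_PATTERN.findall(template):
        findings.append(f"hard-coded URL in template: {url}")
    return findings
```

A digest pin only tells you the template changed since someone last read it. It still depends on a human having actually reviewed the pinned version.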

The LoRA Problem

LoRA adapters are the primary way organizations customize foundation models. They’re shared freely, downloaded casually, and integrated without the scrutiny you’d apply to traditional dependencies.

MasqLoRA demonstrated that standalone LoRA modules can serve as attack vehicles with a 99.8% attack success rate. For robotics applications, LoRA-based backdoors in LLM-mediated ROS2 systems achieved 83% attack success while maintaining 93% accuracy on clean inputs.

The model works fine. It passes all your tests. It also exfiltrates data when it sees a specific trigger.

Pickle Deserialization: The Old Problem in a New Context

Every time you load a model with torch.load() or pickle.load(), you’re trusting that the file doesn’t contain malicious code. Researchers identified 22 distinct pickle-based model loading paths across five major frameworks. Nineteen of those paths are entirely missed by existing scanners. They also found 133 exploitable gadgets that achieve an almost 100% bypass rate.

This is the same deserialization vulnerability class we’ve been dealing with in web applications for a decade, now living in the ML supply chain. The only difference is that it largely goes unpatched, because most of it never shows up in a CVE database.
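If you can’t avoid pickle entirely, the standard library’s documented mitigation is to override Unpickler.find_class so that only an explicit allowlist of globals can be resolved. The sketch below follows that pattern. The allowlist contents and the Payload class are illustrative assumptions; a real checkpoint format will need its own reviewed list.

```python
import io
import pickle

# Hypothetical allowlist: only the globals your own checkpoints are known to need.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve anything outside ALLOWED_GLOBALS."""

    def find_class(self, module: str, name: str):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize bytes, rejecting pickles that reference non-allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    """Stand-in for a malicious object: a plain pickle.loads would call print()."""

    def __reduce__(self):
        return (print, ("code ran during load",))


if __name__ == "__main__":
    # Plain containers reference no globals, so they load normally.
    print(safe_loads(pickle.dumps({"layers": [1, 2, 3]})))
    try:
        safe_loads(pickle.dumps(Payload()))
    except pickle.UnpicklingError as err:
        print("rejected:", err)
```

An allowlist narrows the attack surface but does not eliminate it, since an allowed class can itself be a gadget. PyTorch’s torch.load also accepts a weights_only argument that restricts unpickling in a similar spirit, and non-pickle formats such as safetensors sidestep the problem.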


The MCP Problem

The Model Context Protocol is becoming the standard for connecting AI agents to external tools and data sources. It’s also becoming a significant attack surface.

The protocol itself isn’t broken. The problem is that we’re connecting agents to external systems with the same trust model we used for human users. Agents operate at machine speed and scale — a compromised MCP server can serve thousands of agent invocations before anyone notices.

Three adversary types exploit MCP:

  1. Content-injection attackers who manipulate tool descriptions to influence agent behavior
  2. Supply-chain attackers who distribute compromised MCP servers through community registries
  3. Agents that overstep their roles, escalating privileges beyond their intended scope

The most concerning finding: three moderate/high-severity advisories in OpenClaw composed into a complete unauthenticated remote code execution path. A malicious skill executed a two-stage dropper within the LLM context, bypassing the exec pipeline entirely.

Individual vulnerabilities that seem manageable in isolation compose into critical attack chains when agents are involved.
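One way to reason about composition is to model each advisory as an edge: an attacker in one state can reach another. A chain exists whenever there is a path from "unauthenticated" to the outcome you care about. The sketch below runs a breadth-first search over such a graph. The advisory names and states are hypothetical placeholders and do not describe the real OpenClaw findings.

```python
from collections import deque
from typing import NamedTuple, Optional


class Advisory(NamedTuple):
    """One finding: an attacker in state `requires` can reach state `grants`."""
    name: str
    severity: str
    requires: str
    grants: str


def find_chain(advisories: list[Advisory], start: str, goal: str) -> Optional[list[str]]:
    """Breadth-first search for the shortest advisory sequence from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for adv in advisories:
            if adv.requires == state and adv.grants not in seen:
                seen.add(adv.grants)
                queue.append((adv.grants, path + [adv.name]))
    return None  # no combination of advisories reaches the goal


# Hypothetical advisories and states, invented for illustration only.
ADVISORIES = [
    Advisory("ADV-A", "moderate", "unauthenticated", "can_install_skill"),
    Advisory("ADV-B", "moderate", "can_install_skill", "controls_llm_context"),
    Advisory("ADV-C", "high", "controls_llm_context", "remote_code_execution"),
]

if __name__ == "__main__":
    print(find_chain(ADVISORIES, "unauthenticated", "remote_code_execution"))
```

Triage that scores each advisory in isolation never asks this question. Remove any one edge and the path disappears, which is also why fixing a "moderate" finding can close a critical chain.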


What This Means for Your Monday Morning

If you’re running AI agents in production — whether it’s coding assistants, customer service bots, or internal tooling — here’s what you need to do this week:

1. Audit your agent’s tool access. Map every MCP server, every skill, every external tool your agent can reach. If you can’t enumerate it, you can’t secure it.

2. Treat agent skills as dependencies, not config. Apply the same scrutiny you’d apply to npm packages: cryptographic signing, dependency analysis, and reproducible builds.

3. Sandbox agent execution. Every skill invocation should run in an isolated environment with least-privilege defaults. If a skill needs database access, it gets read-only access to a specific table, not a connection string.

4. Monitor agent behavior, not just code. Traditional security tools scan code. Agent attacks live in natural language. You need behavioral monitoring that catches anomalous agent actions — unexpected network calls, unusual data access patterns, privilege escalation attempts.

5. Implement kill switches. If an agent starts behaving anomalously, you need the ability to halt execution immediately. Not after the incident review. Now.

6. Audit your model supply chain. Every model, LoRA adapter, and chat template should have provenance. If you can’t trace where it came from and verify its integrity, don’t use it in production.
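Several of these steps can meet at a single choke point: the place where the agent’s tool calls are dispatched. The sketch below is a minimal illustration of steps 1, 4, and 5, assuming every tool call can be routed through one in-process dispatcher. The class and method names are hypothetical, the counter is not thread-safe, and it provides no isolation, so it complements a sandbox and does not replace one.

```python
import threading
from typing import Any, Callable


class AgentHalted(RuntimeError):
    """Raised when a tool call is attempted after the kill switch has tripped."""


class GuardedDispatcher:
    """Routes tool calls through an allowlist, a violation counter, and a kill switch."""

    def __init__(self, allowed_tools: dict[str, Callable[..., Any]], max_violations: int = 3):
        self._tools = dict(allowed_tools)   # the enumerated inventory (step 1)
        self._max_violations = max_violations
        self._violations = 0
        self._halted = threading.Event()    # the kill switch (step 5)
        self.audit_log: list[tuple[str, str]] = []

    def halt(self) -> None:
        """Trip the kill switch manually, for example from an operator console."""
        self._halted.set()

    def call(self, tool_name: str, *args: Any, **kwargs: Any) -> Any:
        if self._halted.is_set():
            raise AgentHalted("agent execution is halted")
        if tool_name not in self._tools:
            # Behavioral signal (step 4): the agent reached for an unlisted tool.
            self._violations += 1
            self.audit_log.append((tool_name, "denied"))
            if self._violations >= self._max_violations:
                self._halted.set()
            raise PermissionError(f"tool not in allowlist: {tool_name}")
        self.audit_log.append((tool_name, "allowed"))
        return self._tools[tool_name](*args, **kwargs)


if __name__ == "__main__":
    dispatcher = GuardedDispatcher({"add": lambda a, b: a + b}, max_violations=2)
    print(dispatcher.call("add", 1, 2))
    for _ in range(2):
        try:
            dispatcher.call("shell", "cat /etc/passwd")
        except PermissionError as err:
            print("denied:", err)
    try:
        dispatcher.call("add", 1, 2)
    except AgentHalted as err:
        print("halted:", err)
```

The design choice worth noting is that the switch trips automatically after repeated denials and stays tripped until a human resets it, so a misbehaving agent cannot simply retry its way past the guard.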


The Uncomfortable Truth

We’re making the same mistakes with AI agents that we made with traditional software supply chains — trusting too much, verifying too little, and assuming the best about our dependencies.

The difference is that AI agents operate at machine speed, interact with sensitive systems directly, and make decisions that affect real outcomes. A compromised agent isn’t just a vulnerability in a library. It’s a compromised decision-maker with access to your infrastructure.

The research is clear. The attack vectors are real. The defenses are lagging.

The question isn’t whether AI agents will be used in supply chain attacks. It’s whether you’ll have defenses ready when it happens.