Uncovering New Attack Vectors in Model Context Protocols: Risks and Defenses
A comprehensive study reveals that Model Context Protocol (MCP) platforms lack strict vetting, users struggle to detect malicious servers, and current large language models cannot effectively resist MCP‑level injection attacks, highlighting critical security challenges and proposing mitigation strategies.
Introduction
In November 2024 Anthropic released the Model Context Protocol (MCP), an open standard that enables large‑model applications to establish bidirectional connections with external tools and data sources. By June 2025 three major aggregation platforms—Smithery.ai, MCP.so and Glama.ai—hosted over 25,000 servers.
When LLMs can connect to the whole digital world, a key question arises: how to ensure that every MCP‑connected server is safe and trustworthy, and what impact malicious MCP servers could have on users?
To address this, Ant Group’s Tianxiang Lab together with Sichuan University, Sun Yat‑sen University, University of Electronic Science and Technology of China and Zhejiang University released the research "Beyond the Protocol: Unveiling Attack Vectors in the Model Context Protocol Ecosystem".
Paper link: https://arxiv.org/abs/2506.02040
Main Findings
Major MCP aggregation platforms lack strict vetting, allowing attackers to upload malicious servers.
Users have significant difficulty identifying and analysing malicious MCP servers.
Current LLMs struggle to resist injection attacks launched at the MCP layer.
Users underestimate the new security problems introduced by MCP.
MCP applications suffer from alert fatigue due to unreasonable security mechanisms.
Responsibility attribution for MCP providers is unclear, making post‑incident accountability difficult.
Base LLMs inherently trust tool calls.
Four Novel Attack Vectors
Based on the numbered interaction paths in the workflow diagram, we categorize four new attack vectors in the MCP ecosystem.
1. Tool Poisoning Attack
Attackers embed hidden malicious commands in the tool description of an MCP server. These commands can deceive or inject the LLM to produce untrustworthy outputs or steal predefined sensitive information. This attack mainly exploits path ②→④ and succeeds during path ⑥.
2. Puppet Attack
In environments with multiple MCP servers, a malicious server injects prompts via carefully crafted tool descriptions, influencing the LLM’s tool‑selection decisions. The malicious server A typically acts during registration phases ② and ④→⑤, affecting the LLM agent and succeeding in path ⑥.
3. Rug Pull (Supply‑Chain) Attack
Many MCP servers are installed via
npx/uvx -y package-name@latest. Attackers can publish seemingly legitimate services that later change behavior, mimicking a cryptocurrency “rug pull”. This attack follows path ①→②→④ and is executed during path ⑥.
4. Malicious External Resource Exploitation
Malicious MCP servers redirect the agent to harmful third‑party resources outside the MCP ecosystem, such as attacker‑controlled APIs or webpages containing malicious content. This mainly impacts paths ⑦→⑧ and ④→⑤ (chained calls) and succeeds in path ⑦.
Even non‑malicious MCP servers (e.g., Browser‑use, Github‑MCP) can become vulnerable if they lack proper security checks and content filtering.
Proof of Concept Experiments
We designed three experiments covering aggregation platforms, end‑users, and LLM tool usage.
Q1: Can attackers upload malicious MCP servers to aggregation platforms?
We built a test MCP server containing simulated malicious code and description, and successfully uploaded it to all three platforms.
This demonstrates the lack of strict vetting on current MCP aggregation platforms.
Q2: Can users identify malicious MCP servers?
We created a mock platform displaying 13 MCP servers (4 malicious) and recruited 20 participants. 75 % selected at least one malicious server, but only one participant identified all four, indicating substantial difficulty for users.
Further analytics showed participants often ignored detailed warning dialogs and enabled “Auto Approve” features, leading to security fatigue.
Q3: Can malicious MCP servers execute harmful behavior on user machines?
We implemented the three attack vectors on five mainstream LLMs. The average attack success rate was 65.77 % while LLM rejection rates were below 23 %.
These results reveal a fundamental trust paradox: LLMs inherently trust tool descriptions and outputs, and current fine‑tuning data lack adversarial examples to detect malicious intent.
Deep Challenges for MCP Security
Insufficient user awareness of MCP security issues.
Alert fatigue and desensitization to security warnings.
Responsibility vacuum of MCP aggregation platforms.
Intrinsic trust paradox of LLMs and limited defensive capabilities.
Even experienced developers are unaware of emerging attack vectors such as prompt injection in tool descriptions. 45 % of participants would consider using LLM + MCP for sensitive data, amplifying risk.
Overall, the study systematically uncovers security threats throughout the MCP ecosystem—from malicious server infiltration to user negligence and LLM execution—providing guidance for stricter vetting, intelligent security gateways, cryptographic signing, and enhanced base‑model defenses.
AntTech
Technology is the core driver of Ant's future creation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.