How Wukong AI Agent Uncovered a Critical RCE Vulnerability in LLaMA‑Factory (CVE‑2025‑53002)
This article details how the Wukong AI Agent automatically audited the popular LLaMA‑Factory project, discovered a high‑severity remote code execution vulnerability (CVE‑2025‑53002) caused by unsafe torch.load usage, and reported it to the maintainers. It also covers the official fix, which passes weights_only=True to torch.load.
1. Introduction to LLaMA‑Factory
LLaMA‑Factory is an open‑source framework with over 53K GitHub stars that simplifies fine‑tuning of large language models (LLMs). Its ease of use and flexibility have made it a preferred tool for many developers and research teams.
During a deep security audit, the Wukong AI Agent identified a severe remote code execution vulnerability (CVE‑2025‑53002, CVSS v3.1 8.3) in the project.
2. Practical Effectiveness of Wukong AI Agent
The agent leverages a multi‑agent architecture to automate vulnerability discovery:
Audit Agent: Traces data flow from the Web UI to backend parameters, expands function calls, and spots unsafe deserialization APIs.
Review Agent: Analyzes the extracted code, applies multi‑vote verification, and assesses exploitability.
Fix Agent: Generates remediation suggestions based on CVE databases and internal knowledge, producing concrete patches.
The agent pinpointed the unsafe torch.load call in src/llamafactory/model/model_utils/valuehead.py, which allowed arbitrary code execution via a malicious checkpoint path.
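The kind of check the Audit Agent performs when it spots unsafe deserialization APIs can be approximated with a small static scan. The sketch below is not Wukong's implementation: the function name find_unsafe_torch_load and the sample source are hypothetical, and it only matches the literal pattern torch.load(...) without an explicit weights_only=True keyword.

```python
import ast


def find_unsafe_torch_load(source: str) -> list[int]:
    """Return line numbers of torch.load(...) calls lacking weights_only=True.

    Hypothetical helper for illustration; it only recognises calls written
    literally as `torch.load(...)`, not aliased or re-exported imports.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # The call counts as safe only if weights_only is the literal True.
        has_safe_flag = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not has_safe_flag:
            findings.append(node.lineno)
    return sorted(findings)


# Made-up sample: line 2 is flagged, line 3 is not.
SAMPLE = '''\
import torch
a = torch.load(path, map_location="cpu")
b = torch.load(path, map_location="cpu", weights_only=True)
'''
findings = find_unsafe_torch_load(SAMPLE)
```

A scan like this only locates candidate sinks. Deciding whether the path argument is attacker-controlled still requires the data-flow tracing described above.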
3. Official Response and Fix
After confirming the vulnerability, the researchers submitted a detailed security advisory (including PoC) via GitHub Security Advisories, which was acknowledged by the LLaMA‑Factory team and assigned a CVE number.
The fix modifies the torch.load call to include weights_only=True, preventing unsafe pickle deserialization.
# src/llamafactory/model/model_utils/valuehead.py
- state_dict = torch.load(vhead_file, map_location="cpu")
+ state_dict = torch.load(vhead_file, map_location="cpu", weights_only=True)
4. Technical Analysis
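The vulnerability stems from loading vhead_file without the weights_only=True safeguard. An attacker can supply a malicious checkpoint path through the Web UI, causing torch.load to unpickle attacker-controlled data. Because unpickling can invoke arbitrary callables, this results in arbitrary code execution.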
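The mechanism can be illustrated with the standard pickle module alone. This is a minimal, deliberately benign sketch, not the PoC from the advisory: the class names Demo and AllowNothingUnpickler are made up for illustration, and the "payload" merely calls len. It shows that the serialized data chooses which callable runs during loading, and that an unpickler which refuses to resolve globals (conceptually similar to what weights_only=True enforces, though PyTorch's actual allowlist is more permissive) rejects the same bytes.

```python
import io
import pickle


class Demo:
    """Benign stand-in for a crafted object embedded in a checkpoint."""

    def __reduce__(self):
        # Instructs pickle: "to rebuild this object, call len('checkpoint')".
        # A real attack would name a dangerous callable here instead.
        return (len, ("checkpoint",))


class AllowNothingUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (illustrative only)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    return AllowNothingUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Demo())

# Plain unpickling runs the callable chosen by the data and returns its result.
result = pickle.loads(blob)

# The restricted unpickler stops before any callable is resolved.
try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Here result is the return value of len("checkpoint") rather than a Demo instance, which is the point: loading executed a function call selected by the file's contents.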
The vulnerability affects all LLaMA‑Factory versions up to and including 0.9.3, covering a wide range of LLM training and inference scenarios. Its CVSS v3.1 score of 8.3 places it in the high‑severity range.
Note: In PyTorch versions < 2.6, the default for torch.load is weights_only=False. LLaMA‑Factory only requires torch>=2.0.0, leaving affected versions insecure by default.
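The version boundary in the note can be expressed as a small check. The helper below is a hypothetical sketch (the name torch_load_safe_by_default is not part of PyTorch or LLaMA‑Factory). It assumes version strings of the form returned by torch.__version__, such as "2.5.1+cu121", and it encodes only the 2.6 threshold stated above.

```python
import re


def torch_load_safe_by_default(version: str) -> bool:
    """Return True if torch.load defaults to weights_only=True for `version`.

    Hypothetical helper: compares the leading major.minor of a version
    string such as "2.5.1+cu121" against the 2.6 threshold.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison avoids the string-comparison trap ("2.10" < "2.6").
    return (major, minor) >= (2, 6)
```

In practice the value passed in would be torch.__version__. Regardless of the result, passing weights_only=True explicitly, as the official fix does, keeps the call safe across every PyTorch version the project supports.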
5. Conclusion
The case study demonstrates that Wukong AI Agent can efficiently discover critical vulnerabilities in complex open‑source projects, accelerate vendor response, and integrate automated security checks into CI/CD pipelines, heralding a new era of intelligent security engineering.
Tencent Technical Engineering
Official account of Tencent Technology. A platform for publishing and analyzing Tencent's technological innovations and cutting-edge developments.