
How Wukong AI Agent Uncovered a Critical RCE Vulnerability in LLaMA‑Factory (CVE‑2025‑53002)

This article details how the Wukong AI Agent automatically audited the popular LLaMA‑Factory project, discovered a high‑severity remote code execution vulnerability (CVE‑2025‑53002) caused by unsafe torch.load usage, reported it to the maintainers, and demonstrated the official fix that adds a secure weights_only flag.

Tencent Technical Engineering

1. Introduction to LLaMA‑Factory

LLaMA‑Factory is an open‑source framework with over 53K GitHub stars that simplifies fine‑tuning of large language models (LLMs). Its ease of use and flexibility have made it a preferred tool for many developers and research teams.

During a deep security audit, the Wukong AI Agent identified a severe remote code execution vulnerability (CVE‑2025‑53002, CVSS v3.1 8.3) in the project.

2. Practical Effectiveness of Wukong AI Agent

The agent leverages a multi‑agent architecture to automate vulnerability discovery:

Audit Agent: Traces data flow from the Web UI to backend parameters, expands function calls, and spots unsafe deserialization APIs.

Review Agent: Analyzes the extracted code, applies multi‑vote verification, and assesses exploitability.

Fix Agent: Generates remediation suggestions based on CVE databases and internal knowledge, producing concrete patches.

The agent pinpointed the unsafe torch.load call in src/llamafactory/model/model_utils/valuehead.py, which allowed arbitrary code execution via a malicious checkpoint path.
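The danger of an unguarded torch.load is the pickle protocol underneath it: any pickled object can name a callable for the unpickler to invoke at load time via __reduce__. A minimal standard-library sketch of that mechanism (using a harmless eval call where a real exploit would name something like os.system):

```python
import pickle

# Illustration of the pickle mechanism torch.load relies on (not the
# actual exploit): __reduce__ lets an object name any callable to be
# invoked at *load* time, before the victim touches the object at all.
class MaliciousPayload:
    def __reduce__(self):
        # A real payload would return (os.system, ("<shell command>",)).
        # Here a harmless eval shows that loading alone executes code.
        return (eval, ("6 * 7",))

blob = pickle.dumps(MaliciousPayload())
result = pickle.loads(blob)  # eval("6 * 7") runs during deserialization
print(result)  # 42 -- attacker-chosen code ran just by loading the bytes
```

Merely deserializing the checkpoint triggers the callable; no method of the loaded object ever needs to be called.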

LLaMA‑Factory GitHub stars and history
LLaMA‑Factory GitHub stars and history
Wukong AI Agent architecture diagram
Wukong AI Agent architecture diagram

3. Official Response and Fix

After confirming the vulnerability, the researchers submitted a detailed security advisory (including PoC) via GitHub Security Advisories, which was acknowledged by the LLaMA‑Factory team and assigned a CVE number.

The fix modifies the torch.load call to include weights_only=True, preventing unsafe pickle deserialization.

# src/llamafactory/model/model_utils/valuehead.py
- state_dict = torch.load(vhead_file, map_location="cpu")
+ state_dict = torch.load(vhead_file, map_location="cpu", weights_only=True)
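Conceptually, weights_only=True defends against this by restricting which globals the unpickler may resolve to a small allowlist of tensor-related types. The idea can be sketched with the standard library's Unpickler hook (RestrictedUnpickler and safe_loads are illustrative names, not part of PyTorch or LLaMA-Factory):

```python
import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve globals outside an allowlist, so __reduce__-based
    payloads fail to load. torch's weights_only mode applies the same idea
    with an allowlist of tensor classes."""
    ALLOWED = set()  # a real loader would allowlist (module, name) pairs here

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def safe_loads(blob: bytes):
    return RestrictedUnpickler(io.BytesIO(blob)).load()

# Plain containers need no globals, so they still load fine:
print(safe_loads(pickle.dumps([1, 2, 3])))  # [1, 2, 3]

# A __reduce__ payload that names eval is rejected instead of executed:
class Payload:
    def __reduce__(self):
        return (eval, ("6 * 7",))

try:
    safe_loads(pickle.dumps(Payload()))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

The key design point is that the check happens before any attacker-named callable is even resolved, let alone invoked.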
GitHub security advisory screenshot

4. Technical Analysis

The vulnerability stems from loading vhead_file without the weights_only=True safeguard. An attacker can supply a malicious checkpoint path through the Web UI, causing torch.load to deserialize an attacker-controlled pickle payload and execute arbitrary code.

All LLaMA‑Factory versions ≤ 0.9.3 are affected, covering a wide range of LLM training and inference scenarios. The CVSS 3.1 score of 8.3 indicates a high‑severity risk.

Note: In PyTorch versions < 2.6, the default for torch.load is weights_only=False. LLaMA‑Factory only requires torch>=2.0.0, leaving affected versions insecure by default.
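As a defense-in-depth measure, a deployment could check at startup whether its installed PyTorch already defaults to safe loading and pass the flag explicitly otherwise. A hypothetical helper (default_is_weights_only is not part of either project; the 2.6 cutoff follows the note above):

```python
def default_is_weights_only(torch_version: str) -> bool:
    """Return True if this PyTorch version defaults torch.load to
    weights_only=True. The default flipped in 2.6; earlier versions
    deserialize arbitrary pickles unless the flag is passed explicitly."""
    # Take only the numeric major.minor prefix (handles "2.6.0+cu121" etc.)
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    return (major, minor) >= (2, 6)

print(default_is_weights_only("2.0.0"))  # False -- pass weights_only=True
print(default_is_weights_only("2.6.1"))  # True  -- safe by default
```

Regardless of version, passing weights_only=True explicitly, as the official patch does, is the robust choice.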

Conclusion

This case study demonstrates that Wukong AI Agent can efficiently discover critical vulnerabilities in complex open‑source projects, accelerate vendor response, and integrate automated security checks into CI/CD pipelines, heralding a new era of intelligent security engineering.

Tags: remote code execution · security patch · AI security · LLaMA-Factory · vulnerability analysis · CVE-2025-53002
Written by Tencent Technical Engineering

Official account of Tencent Technology. A platform for publishing and analyzing Tencent's technological innovations and cutting-edge developments.
