How a Chinese Open‑Source AI Code Auditor with 6K Stars Uncovered 49 CVEs
DeepAudit, an open‑source AI code‑audit system with about 6,000 GitHub stars, combines a four‑agent architecture with sandboxed PoC verification. Its team reports that it automatically discovered and confirmed 49 high‑severity CVEs across popular projects. It offers both a deep‑audit mode and an instant‑analysis mode, but it is constrained by model dependency, cost, and sandbox limitations.
What DeepAudit Is
DeepAudit is an open‑source AI‑driven code‑vulnerability mining system created by lintsinghua, marketed as China’s first open‑source multi‑agent code‑audit platform. It has attracted over 6,000 stars on GitHub and is released under the AGPL‑3.0 license.
Architecture and Agents
The system runs four agents with clearly defined roles that collaborate through explicit orchestration logic:
Orchestrator : receives a project, analyses its structure and tech stack, distributes tasks, aggregates results, removes false positives and generates the final report.
Recon Agent : scans the repository, identifies framework dependencies and API entry points, and builds an attack surface map, a step where traditional SAST tools are weak.
Analysis Agent : performs the actual vulnerability mining. It queries a RAG knowledge base containing CWE/CVE datasets and conducts semantic matching, which is smarter but more token‑expensive than pure rule‑based engines.
Verification Agent : the key differentiator. For each suspected vulnerability it automatically writes a PoC script, runs it in an isolated Docker sandbox, and only keeps findings that succeed, dramatically reducing false‑positive rates.
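The division of labour above can be sketched in a few lines of plain Python. This is only an illustration of the control flow, not DeepAudit's code: the function names, the `Finding` fields and the string-matching heuristics are all assumptions, and the real system uses LangGraph and LLM calls where this sketch uses simple checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    file: str
    rule: str
    verified: bool = False


def recon(files: dict[str, str]) -> list[str]:
    """Recon stand-in: files that register a route form the attack surface."""
    return [path for path, src in files.items() if "@app." in src]


def analyse(files: dict[str, str], surface: list[str]) -> list[Finding]:
    """Analysis stand-in: flag shell calls inside entry-point files."""
    return [
        Finding(file=path, rule="command-injection")
        for path in surface
        if "os.system(" in files[path]
    ]


def orchestrate(
    files: dict[str, str],
    poc_succeeds: Callable[[Finding], bool],
) -> list[Finding]:
    """Orchestrator stand-in: run recon, analysis, then verification.

    `poc_succeeds` is a hypothetical hook playing the Verification Agent:
    only findings whose PoC succeeds are kept in the final report.
    """
    surface = recon(files)
    confirmed = []
    for finding in analyse(files, surface):
        finding.verified = poc_succeeds(finding)
        if finding.verified:
            confirmed.append(finding)
    return confirmed
```

The point of the shape is the last step: a candidate that cannot be reproduced is dropped rather than reported, which is where the claimed reduction in false positives comes from.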
The backend is built with Python FastAPI, LangGraph for agent orchestration, ChromaDB as the vector store, PostgreSQL and Redis. The frontend uses React + TypeScript with shadcn/ui. The architecture is micro‑service‑oriented, with the sandbox running in a separate container for safety.
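Running the PoC in a separate container might look roughly like the sketch below, which shells out to the Docker CLI. The image name, resource limits, mount path and helper names are assumptions chosen for illustration; the article does not say which image or settings DeepAudit's sandbox actually uses.

```python
import subprocess
from pathlib import Path


def build_sandbox_cmd(poc_path: Path, image: str = "python:3.11-slim") -> list[str]:
    """Build a `docker run` command that executes one PoC script in isolation."""
    poc_path = poc_path.resolve()
    return [
        "docker", "run", "--rm",        # remove the container when it exits
        "--network", "none",            # no network access from the PoC
        "--memory", "256m",             # cap memory (assumed limit)
        "--cpus", "1",                  # cap CPU (assumed limit)
        "--pids-limit", "64",           # guard against fork bombs
        "--read-only",                  # read-only root filesystem
        "-v", f"{poc_path}:/poc/poc.py:ro",
        image,
        "python", "/poc/poc.py",
    ]


def run_poc(poc_path: Path, timeout_s: int = 60) -> bool:
    """Return True only if the PoC exits with status 0 inside the sandbox."""
    try:
        result = subprocess.run(
            build_sandbox_cmd(poc_path),
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Only the docker CLI process is killed here; cleaning up the
        # container itself is left out of this sketch.
        return False
    return result.returncode == 0
```

With `--network none` the PoC can only exercise code inside its own container. Verifying a bug in a running web service would instead need a private network shared with the target, which hints at why multi-service vulnerabilities are harder to confirm this way.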
Interface and Modes
DeepAudit offers two operating modes:
Agent Deep Audit : users import a project via GitHub/GitLab/Gitea URL or ZIP upload. The UI shows real‑time agent reasoning and execution logs, making debugging straightforward.
Instant Analysis : users paste a code snippet and receive results within seconds, covering security issues, bugs, performance, style and maintainability, each accompanied by a What‑Why‑How explanation.
The dashboard visualises overall security posture, supports multi‑repo management, and allows one‑click export of reports in PDF, Markdown or JSON formats.
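As a rough picture of what a What‑Why‑How finding and its export might look like, the sketch below serialises a list of findings to JSON and Markdown. The `ReportItem` fields and the output layout are assumptions made for illustration; DeepAudit's real report schema is not described in the article.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReportItem:
    title: str
    severity: str
    what: str  # what the problem is
    why: str   # why it matters
    how: str   # how to fix it


def to_json(items: list[ReportItem]) -> str:
    """Serialise findings as a JSON array of objects."""
    return json.dumps([asdict(item) for item in items], indent=2)


def to_markdown(items: list[ReportItem]) -> str:
    """Render findings as a simple Markdown report."""
    lines = ["# Audit report", ""]
    for item in items:
        lines += [
            f"## {item.title} ({item.severity})",
            "",
            f"- What: {item.what}",
            f"- Why: {item.why}",
            f"- How: {item.how}",
            "",
        ]
    return "\n".join(lines)
```

Keeping one in-memory structure and deriving every export format from it means the JSON meant for tooling and the Markdown meant for humans cannot drift apart.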
Benchmark: 49 CVEs Discovered
The DeepAudit team ran the tool on several well‑known Chinese open‑source projects, and the resulting CVEs have all been recorded in the NVD. Highlights include:
ZenTao PMS – SSRF and privilege escalation (CVSS 9.1)
DataEase – three JNDI injections (CVSS 9.8)
H2O‑3 – two deserialization bugs (CVSS 9.8)
O2OA – numerous XSS issues
Jimureport – deserialization (CVSS 9.8)
Litemall – hard‑coded credentials (CVSS 9.8)
Additional projects such as Mall, xxl‑job and eladmin also yielded findings. An audit of OpenClaw produced six GHSA advisories (command injection, RCE, signature‑verification bypass, credential leaks), all confirmed by the upstream maintainers. These real‑world detections serve as the most convincing benchmark for DeepAudit.
Model Support and Security Considerations
DeepAudit supports a wide range of LLMs, including GPT‑4o, Claude 3.5 Sonnet/Opus, Gemini Pro, DeepSeek V3, and Chinese models such as Tongyi Qianwen, GLM‑4, Kimi, Wenxin Yiyan and Doubao. Crucially, it can run locally via Ollama, so models like DeepSeek‑Coder, Llama 3, Qwen 2.5 and CodeLlama can audit proprietary code without sending it to external APIs, which is vital in compliance‑sensitive environments. The author recommends Ollama for private‑code audits and notes that model configuration can be changed directly in the browser without restarting the service.
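To make the local option concrete, here is a minimal sketch of sending code to a model served by Ollama on the same machine, using Ollama's `/api/generate` endpoint on its default port. The prompt wording, the `qwen2.5:7b` model tag and the helper names are assumptions; this is not how DeepAudit itself calls the model.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(code: str, model: str = "qwen2.5:7b") -> urllib.request.Request:
    """Build a non-streaming generate request for a locally served model."""
    payload = {
        "model": model,  # assumed tag; use whichever model has been pulled
        "prompt": "List security issues in this code:\n\n" + code,
        "stream": False,  # ask for one JSON object instead of a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def audit_locally(code: str, model: str = "qwen2.5:7b") -> str:
    """Send the code to the local model and return its text answer."""
    with urllib.request.urlopen(build_request(code, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]
```

Calling `audit_locally(source_text)` requires a running Ollama server with the model already pulled; `build_request` on its own has no side effects and can be inspected offline.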
Getting Started
curl -fsSL https://raw.githubusercontent.com/lintsinghua/DeepAudit/v3.0.0/docker-compose.prod.yml | docker compose -f - up -d
A China‑accelerated mirror is also provided:
curl -fsSL https://raw.githubusercontent.com/lintsinghua/DeepAudit/v3.0.0/docker-compose.prod.cn.yml | docker compose -f - up -d
After the containers start, open http://localhost:3000, enter the LLM API key in system settings, and begin auditing. Development requires Python 3.11+, Node 20+, PostgreSQL 15+, with uv for environment management and pnpm for the frontend.
Limitations and Open Issues
Heavy model dependency : All agents follow a ReAct (Thought/Action/Action‑Input) pattern. Smaller local models (e.g., Qwen 2.5‑7B) often drift from the required format, causing the orchestration to break.
Cost and speed for large codebases : Each agent step consumes tokens. Auditing a project with hundreds of thousands of lines can become expensive (e.g., Claude 3.5 API fees) and take hours, whereas traditional SAST tools like Semgrep finish in seconds.
Sandbox verification is not universal : The PoC sandbox works well for typical SQL or command‑injection bugs but struggles with vulnerabilities that need multi‑service environments or low‑level memory exploits. A successful verification does not guarantee 100 % exploitability.
Community‑reported bugs : Users have reported project‑load failures, missing absolute paths or line‑number offsets in reports, and occasional unavailability of the sandbox Docker image from the Nanjing University mirror.
License considerations : AGPL‑3.0 permits internal use, but integrating DeepAudit into commercial products requires careful review of the license terms.
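The format‑drift problem in the first limitation can be illustrated with a strict parser for a single ReAct step plus a bounded retry loop. The exact labels (`Thought:`, `Action:`, `Action Input:`), the retry count and the function names are assumptions for this sketch; DeepAudit's own parsing may differ.

```python
import re
from typing import Callable, NamedTuple, Optional


class ReActStep(NamedTuple):
    thought: str
    action: str
    action_input: str


# The three labels must appear in this order for a step to be accepted.
_STEP = re.compile(
    r"Thought:\s*(?P<thought>.+?)\s*"
    r"Action:\s*(?P<action>.+?)\s*"
    r"Action Input:\s*(?P<action_input>.+)",
    re.DOTALL,
)


def parse_step(text: str) -> Optional[ReActStep]:
    """Return the parsed step, or None if the output drifted from the format."""
    match = _STEP.search(text)
    if match is None:
        return None
    return ReActStep(
        match["thought"].strip(),
        match["action"].strip(),
        match["action_input"].strip(),
    )


def next_step(generate: Callable[[], str], max_attempts: int = 3) -> ReActStep:
    """Call the model until it produces a well-formed step, or give up."""
    for _ in range(max_attempts):
        step = parse_step(generate())
        if step is not None:
            return step
    raise ValueError("model output never matched the ReAct format")
```

A stronger model rarely needs the retry path. A small local model may exhaust it, and every retry also adds to the token cost described in the second limitation.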
Overall, DeepAudit is currently best suited as an auxiliary tool for security researchers or small teams performing initial security assessments of their own code. Its innovative multi‑agent, AI‑driven approach shows strong potential, but the project is still maturing and may miss issues or produce false positives.
https://github.com/lintsinghua/DeepAudit