DeepSeek-V4 Launches with 1M Token Context and Leading Open-Source Agent – A Chinese AI Milestone
DeepSeek has unveiled the V4 preview, offering two open‑source large language models—Pro (1.6 T parameters) and Flash (284 B)—both supporting 1 million‑token context, sparse‑attention efficiency gains, top‑ranked Agent capabilities, and competitive reasoning performance, marking a major milestone for Chinese AI.
DeepSeek‑V4 preview released
Two variants are provided: V4‑Pro (1.6 T total parameters, 49 B active) and V4‑Flash (284 B total, 13 B active). Both variants natively support a 1 M‑token context window.
V4‑Pro (flagship)
Total parameters: 1.6 T (active 49 B)
Target: Open‑source performance ceiling against top closed‑source models
Key strengths: Agent capability, world knowledge, inference performance
V4‑Flash (lightweight)
Total parameters: 284 B (active 13 B)
Target: High efficiency, low cost, fast response
Key strengths: Inference quality close to Pro, cheaper API pricing, suitable for high‑frequency calls
Core technical breakthroughs
1. 1 M‑token context as default
Innovation: DSA sparse attention combined with token compression enables 1 M‑token context as a standard service.
Application examples: processing an entire technical manual, full project source code, or a collection of million‑word documents without segmentation.
Retrieval accuracy: 97 % on single‑pass processing of large texts.
Efficiency: V4‑Pro inference compute is 27 % of V3.2 and memory usage is 10 % of V3.2; V4‑Flash reduces compute to 10 % and memory to 7 % of V3.2.
2. Agent capability
Benchmark: Top position on the Agentic Coding leaderboard among open‑source models.
Internal testing: Outperforms Claude Sonnet 4.5 and approaches Opus 4.6 in non‑thinking mode.
Framework support: Compatible with Claude Code, OpenClaw, OpenCode, etc., delivering strong code generation and document‑processing.
Internal use: Adopted as DeepSeek’s primary agent programming model, improving development efficiency.
3. World knowledge and reasoning
World knowledge: Significantly ahead of peer open‑source models; gap to Gemini‑Pro‑3.1 described as minimal.
Math / STEM reasoning: Best open‑source performance on competition‑level coding and complex mathematical tasks, surpassing some closed‑source competitors.
MMLU baseline: Score > 84 %, placing the model in the first tier of industry performance.
Availability
Website: chat.deepseek.com – direct chat with 1 M‑token context.
Official app: Mobile client updated for V4.
API: Set model_name to deepseek-v4-pro or deepseek-v4-flash to invoke the respective model.
Open‑source weights: Published on Hugging Face with accompanying technical report for deployment and fine‑tuning.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
