Tagged articles
9 articles
Didi Tech
May 14, 2026 · Artificial Intelligence

Accelerating Training and Inference of EAGLE-3 for Multi‑Round Agent Workflows

This article analyzes the latency bottlenecks of large language models in multi‑round AI Agent scenarios, introduces SpecForge‑based speculative decoding and Unified Sequence Parallelism (USP) techniques applied to the EAGLE-3 model, and presents benchmark results showing over two‑fold Accept‑Len gains and 35‑44% reductions in P95 token‑level latency while enabling 128K context training on an 8‑GPU node.

Agent AI · EAGLE-3 · Speculative Decoding
0 likes · 26 min read
PMTalk Product Manager Community
Apr 14, 2026 · Product Management

Why Evaluation and Decomposition, Not Prototyping, Are the Core Skills for AI Product Managers

Traditional product tactics like building features first and relying on gradual rollout no longer work for AI agents; instead, AI product managers must adopt a rigorous, scenario‑driven evaluation framework that measures result quality, task completion, tool correctness, and security to ensure trustworthy, business‑critical performance.

AI product management · AI reliability · Agent AI
0 likes · 10 min read
Architect's Journey
Mar 26, 2026 · Artificial Intelligence

How Cursor’s $30B AI Coding Tool Secretly Leverages China’s Kimi K2.5 Model

An API interception revealed that Cursor’s highly valued AI programming platform relies on Moonshot AI’s Kimi K2.5 model, a trillion‑parameter MoE system, and uses a novel self‑summarization technique to compress context, achieving superior benchmark scores and exposing why Western open‑source models fall short.

AI programming · Agent AI · Cursor
0 likes · 10 min read
Xiaomi Tech
Mar 18, 2026 · Artificial Intelligence

Xiaomi’s MiMo‑V2‑Omni: A Full‑Modal Agent Base that Sees, Listens, and Acts

Xiaomi unveiled MiMo‑V2‑Omni, a full‑modal agent base that unifies text, image, video and audio perception with tool‑calling and GUI actions, outperforming leading models such as Gemini 3 Pro and Claude Opus 4.6 on benchmarks, and offering a 256K‑context API for diverse real‑world tasks.

API · Agent AI · MiMo-V2-Omni
0 likes · 8 min read
PaperAgent
Mar 9, 2026 · Artificial Intelligence

Which LLM Wins the Agent Benchmark? PinchBench Success, Speed, and Cost Rankings Revealed

PinchBench evaluates 32 mainstream large language models on success rate, execution speed, and cost for real‑world agent tasks, highlighting top performers like Gemini‑3‑flash‑preview, MiniMax‑M2.1, and Kimi‑K2.5, and explains why traditional AI benchmarks no longer predict agent effectiveness.

Agent AI · Execution Speed · LLM Benchmark
0 likes · 4 min read
DataFunSummit
Dec 20, 2025 · Artificial Intelligence

How AutoHome Built the Cangjie Large Model: From Training Architecture to Real-World AI Applications

This article details AutoHome's end‑to‑end development of the Cangjie large model, covering the training infrastructure with distributed data, pipeline and tensor parallelism, core business use cases such as video script generation and multi‑tool Agent capabilities, inference optimizations through quantization and fast serving frameworks, and future directions for personalized automotive AI services.

Agent AI · Large Language Model · Video Generation
0 likes · 19 min read
Design Hub
Dec 12, 2025 · Artificial Intelligence

GPT-5.2 Unveiled: A Cutting-Edge AI Super-Assistant Built for Real-World Work

OpenAI's newly released GPT-5.2 claims to outperform human experts on about 70% of real tasks, achieve a perfect score on the AIME 2025 competition, and deliver dramatic efficiency gains—up to 390× cost reduction—while showcasing impressive examples such as one‑shot ocean shader generation, a full 3D engine built in a single file, and visual‑perception scores rivaling top models.

AI benchmarks · Agent AI · Efficiency
0 likes · 8 min read
Data Party THU
Oct 24, 2025 · Artificial Intelligence

How 78 Samples Outperform 10,000: The LIMI Breakthrough in Agent AI

The paper introduces the LIMI framework, which achieves state‑of‑the‑art agent performance on AgencyBench using only 78 carefully crafted samples—outperforming baseline models trained on thousands of examples—by focusing on high‑quality, strategic data construction and demonstrating superior generalization across code, research, and tool‑use tasks.

AgencyBench · Agent AI · LIMI
0 likes · 11 min read