Tagged articles

9 articles

May 14, 2026 · Artificial Intelligence

Accelerating Training and Inference of EAGLE-3 for Multi‑Round Agent Workflows

This article analyzes the latency bottlenecks of large language models in multi‑round AI Agent scenarios, introduces SpecForge‑based speculative decoding and Unified Sequence Parallelism (USP) techniques applied to the EAGLE-3 model, and presents benchmark results showing over two‑fold Accept‑Len gains and 35‑44% reductions in P95 token‑level latency while enabling 128K context training on an 8‑GPU node.

Agent AI · EAGLE-3 · Speculative Decoding

26 min read

Architects' Tech Alliance

May 12, 2026 · Industry Insights

Why Tokens Have Turned Into ‘Gold Water’: How Model Providers Are Raking in AI Profits

The AI ecosystem is witnessing a massive shift where exploding hardware costs and soaring token demand have turned tokens into a high‑margin commodity, allowing model providers to capture most of the profit while Nvidia and TSMC keep prices flat for strategic reasons.

AI economics · Agent AI · Anthropic

7 min read

PMTalk Product Manager Community

Apr 14, 2026 · Product Management

Why Evaluation and Decomposition, Not Prototyping, Are the Core Skills for AI Product Managers

Traditional product tactics like building features first and relying on gradual rollout no longer work for AI agents; instead, AI product managers must adopt a rigorous, scenario‑driven evaluation framework that measures result quality, task completion, tool correctness, and security to ensure trustworthy, business‑critical performance.

AI product management · AI reliability · Agent AI

10 min read

Architect's Journey

Mar 26, 2026 · Artificial Intelligence

How Cursor’s $30B AI Coding Tool Secretly Leverages China’s Kimi K2.5 Model

An API interception revealed that Cursor’s highly valued AI programming platform relies on Moonshot AI’s Kimi K2.5 model, a trillion‑parameter MoE system, and uses a novel self‑summarization technique to compress context, achieving superior benchmark scores and exposing why Western open‑source models fall short.

AI programming · Agent AI · Cursor

10 min read

Xiaomi Tech

Mar 18, 2026 · Artificial Intelligence

Xiaomi’s MiMo‑V2‑Omni: A Full‑Modal Agent Base that Sees, Listens, and Acts

Xiaomi unveiled MiMo‑V2‑Omni, a full‑modal agent base that unifies text, image, video and audio perception with tool‑calling and GUI actions, outperforming leading models such as Gemini 3 Pro and Claude Opus 4.6 on benchmarks, and offering a 256K‑context API for diverse real‑world tasks.

API · Agent AI · MiMo-V2-Omni

8 min read

PaperAgent

Mar 9, 2026 · Artificial Intelligence

Which LLM Wins the Agent Benchmark? PinchBench Success, Speed, and Cost Rankings Revealed

PinchBench evaluates 32 mainstream large language models on success rate, execution speed, and cost for real‑world agent tasks, highlighting top performers like Gemini‑3‑flash‑preview, MiniMax‑M2.1, and Kimi‑K2.5, and explains why traditional AI benchmarks no longer predict agent effectiveness.

Agent AI · Execution Speed · LLM Benchmark

4 min read

DataFunSummit

Dec 20, 2025 · Artificial Intelligence

How AutoHome Built the Cangjie Large Model: From Training Architecture to Real-World AI Applications

This article details AutoHome's end‑to‑end development of the Cangjie large model, covering the training infrastructure with distributed data, pipeline and tensor parallelism, core business use cases such as video script generation and multi‑tool Agent capabilities, inference optimizations through quantization and fast serving frameworks, and future directions for personalized automotive AI services.

Agent AI · Large Language Model · Video Generation

19 min read

Design Hub

Dec 12, 2025 · Artificial Intelligence

GPT-5.2 Unveiled: A Cutting-Edge AI Super-Assistant Built for Real-World Work

OpenAI's newly released GPT-5.2 claims to outperform human experts on about 70% of real tasks, achieve a perfect score on the AIME 2025 competition, and deliver dramatic efficiency gains—up to 390× cost reduction—while showcasing impressive examples such as one‑shot ocean shader generation, a full 3D engine built in a single file, and visual‑perception scores rivaling top models.

AI benchmarks · Agent AI · Efficiency

8 min read

Data Party THU

Oct 24, 2025 · Artificial Intelligence

How 78 Samples Outperform 10,000: The LIMI Breakthrough in Agent AI

The paper introduces the LIMI framework, which achieves state‑of‑the‑art agent performance on AgencyBench using only 78 carefully crafted samples—outperforming baseline models trained on thousands of examples—by focusing on high‑quality, strategic data construction and demonstrating superior generalization across code, research, and tool‑use tasks.

AgencyBench · Agent AI · LIMI

11 min read
