Tagged articles
2 articles
DataFunTalk
May 29, 2026 · Artificial Intelligence

Claude Opus 4.8 Arrives with Two Historic Firsts: Zero Lie Rate and Zero Lazy Rate

Claude Opus 4.8 was released just 43 days after 4.7 at the same price. It tops the GDPval‑AA leaderboard with 1890 Elo, beating GPT‑5.5 by 121 points, while cutting steps by 15% and tokens by 35%. It achieves a 0% lie rate and a 0% lazy rate, leads SWE‑Bench, ProgramBench and FrontierSWE, and introduces massively parallel agent workflows that can rewrite 750k lines of production code in 11 days. Meanwhile, Anthropic is preparing the upcoming Claude Mythos and celebrating a $965 billion valuation.

AI benchmarks · Claude · Dynamic Workflows
ZhiKe AI
Apr 21, 2026 · Artificial Intelligence

Open-Source Kimi K2.6 Beats GPT‑5.4 and Claude Opus 4.6 in Code Generation

Kimi K2.6, an open‑source Chinese LLM, outperforms GPT‑5.4 and Claude Opus 4.6 on SWE‑Bench Pro code tests. It sustains 13 hours of uninterrupted coding, runs 300 parallel agents, and costs one‑twentieth as much as comparable closed‑source models, with a trillion‑parameter MoE architecture and an Apache 2.0 license.

AI model benchmarks · Apache-2.0 · Kimi K2.6