Java Tech Enthusiast
Feb 22, 2025 · Artificial Intelligence
Grok‑3 Evaluation Controversy and Community Reactions
Three days after Grok‑3’s launch, OpenAI employees accused xAI of inflating the model’s benchmark scores by reporting “cons@64” results, a metric that takes the consensus (majority‑vote) answer across 64 sampled responses per problem. Critics say this unfairly skews comparisons with the single‑attempt scores reported for models such as o3‑mini. Meanwhile, developers have already begun experimenting with the model in simple games.
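To make the gap between a "cons@64" score and a single‑attempt score concrete, here is a minimal sketch in Python. It assumes cons@64 means a majority vote over 64 sampled answers per problem, which is the common reading of the term. The sample answers, reference answers, and function names below are hypothetical and only illustrate the arithmetic; they are not taken from any published evaluation.

```python
from collections import Counter

def consensus_answer(samples):
    """Return the most frequent answer among the sampled responses."""
    return Counter(samples).most_common(1)[0][0]

def cons_at_k(samples_per_problem, references):
    """Fraction of problems whose majority-vote answer matches the reference."""
    correct = sum(
        consensus_answer(samples) == ref
        for samples, ref in zip(samples_per_problem, references)
    )
    return correct / len(references)

def single_attempt_accuracy(samples_per_problem, references):
    """Expected accuracy when only one sampled answer per problem is scored."""
    per_problem = [
        sum(answer == ref for answer in samples) / len(samples)
        for samples, ref in zip(samples_per_problem, references)
    ]
    return sum(per_problem) / len(per_problem)

# Hypothetical data: two problems, 64 sampled answers each.
samples = [
    ["204"] * 40 + ["197"] * 24,               # correct answer wins the vote
    ["113"] * 30 + ["17"] * 20 + ["42"] * 14,  # correct answer is only a plurality
]
refs = ["204", "113"]

print("cons@64:", cons_at_k(samples, refs))
print("single attempt:", single_attempt_accuracy(samples, refs))
```

In this made‑up case the majority vote is right on both problems even though an individual sample is right only a little more than half the time, which is why putting a cons@64 number next to a single‑attempt number can overstate the difference between two models.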
AI · Grok 3 · OpenAI