Tag

Grok 3

1 view collected around this technical thread.

Java Tech Enthusiast
Feb 22, 2025 · Artificial Intelligence

Grok‑3 Evaluation Controversy and Community Reactions

Three days after Grok‑3’s launch, xAI was accused of inflating the model’s benchmark scores by reporting “cons@64” results, which aggregate 64 sampled answers per question. Critics say this unfairly skews comparisons with single‑shot scores from models like o3‑mini. Meanwhile, developers have already begun experimenting with the model in simple games.
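To see why a “cons@64” score is not directly comparable with a single‑shot score, here is a minimal Python sketch. It assumes cons@64 means a majority vote over 64 sampled answers, which is the common reading of the term; the source does not describe the exact procedure. The `sample_answer` function, the answer strings, and the 40% per‑sample accuracy are all hypothetical stand‑ins for a real model call.

```python
import random
from collections import Counter


def cons_at_k(answers):
    """Majority vote: return the most frequent answer among the k samples."""
    return Counter(answers).most_common(1)[0][0]


def sample_answer(rng, correct, wrong_options, p_correct):
    """Hypothetical stand-in for one model call: right with probability p_correct."""
    if rng.random() < p_correct:
        return correct
    return rng.choice(wrong_options)


def simulate(num_problems=500, k=64, p_correct=0.4, seed=0):
    """Compare single-shot accuracy with cons@k accuracy on simulated problems."""
    rng = random.Random(seed)
    correct, wrong_options = "42", ["41", "43", "44"]  # assumed toy answers
    single_hits = 0
    cons_hits = 0
    for _ in range(num_problems):
        samples = [
            sample_answer(rng, correct, wrong_options, p_correct)
            for _ in range(k)
        ]
        # Single-shot: only the first sample counts.
        single_hits += samples[0] == correct
        # cons@k: the most common of all k samples counts.
        cons_hits += cons_at_k(samples) == correct
    return single_hits / num_problems, cons_hits / num_problems


if __name__ == "__main__":
    single, cons = simulate()
    print(f"single-shot accuracy: {single:.2f}")
    print(f"cons@64 accuracy:     {cons:.2f}")
```

In this toy setup the correct answer is the most likely single outcome, so voting over 64 samples pushes accuracy well above the single‑shot rate even though the underlying model is unchanged. That gap is the core of the critics' complaint about mixing the two kinds of scores in one chart.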

AI · Grok 3 · OpenAI
0 likes · 5 min read