Google Unleashes Gemma: Open‑Source LLM That Beats Llama 2 and Challenges OpenAI
Google has released the open‑source Gemma large language model in 2 B and 7 B parameter versions, claiming superior performance to Llama 2 and Mistral across 18 benchmarks, especially in math and code, while running on laptops, desktops, IoT and cloud devices.
Google announced the open‑source Gemma model, offered in 2 B‑parameter and 7 B‑parameter variants. The models are marketed as lightweight and high‑performance, capable of running on notebooks, desktops, IoT devices, mobile devices and cloud platforms.
In a set of 18 benchmark evaluations, Gemma achieved higher average scores than the current mainstream open‑source models Llama 2 and Mistral, with particular strength in mathematics and code tasks. The 7 B version topped the Hugging Face open‑source LLM leaderboard.
The technical report reveals that Gemma uses a tokenizer with a 256 k token vocabulary, which the authors suggest makes multilingual expansion easier. Training was performed on Google’s TPUv5e hardware: the 7 B model employed 4 096 TPUv5e chips, while the 2 B model used 512 TPUv5e chips. The 2 B and 7 B models were trained on 2 trillion and 6 trillion tokens respectively, drawing data mainly from web documents, mathematics and code in English.
Gemma inherits the Gemini architecture, built on the Transformer backbone, and follows the same pre‑training and instruction‑fine‑tuning pipeline. It uses a subset of Gemini’s SentencePiece tokenizer, which splits numbers, retains extra spaces, and falls back to byte‑level encoding for unknown tokens.
Benchmark results show the 7 B model winning 11 of the 18 tests, with an average score of 56.4, surpassing both Llama 2 and Mistral. In specific tasks such as question answering, reasoning, mathematics/science and code, Gemma 7 B outperformed Llama 2 13 B despite the latter’s larger size.
Alongside the model release, Google provided a Responsible Generative AI Toolkit, integration with Keras 3.0 for inference and supervised fine‑tuning across JAX, PyTorch and TensorFlow, and extensive safety evaluations including human‑feedback RLHF, red‑team testing and automated adversarial testing.
The Gemma launch is the third of three rapid moves in early 2024—following the free‑use Gemini Ultra announcement on February 9 and the Gemini 1.5 release on February 16—intended to counter OpenAI’s recent advances such as the Sora video model and to challenge Meta’s Llama 2. Google’s strategy leverages its custom TPUv5e chips, which the company claims provide compute power several times that of Nvidia GPUs, reinforcing its position in the large‑model race.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Smart Era Software Development
Committed to openness and connectivity, we build frontline engineering capabilities in software, requirements, and platform engineering. By integrating digitalization, cloud computing, blockchain, new media and other hot tech topics, we create an efficient, cutting‑edge tech exchange platform and a diversified engineering ecosystem. Provides frontline news, summit updates, and practical sharing.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
