Google DeepMind Unveils Gemma 2 2B: A Tiny AI Powerhouse Outperforming Larger Models
San Francisco, August 1, 2024 – Google DeepMind has released Gemma 2 2B, a lightweight, 2.6-billion-parameter language model that punches above its weight, exceeding the performance of larger models like GPT-3.5 and Mixtral 8x7B. This breakthrough in AI efficiency comes just weeks after the release of its larger sibling, Gemma 2 27B, which quickly gained popularity for its impressive performance.
The new Gemma 2 2B model, distilled from the 27B version, has achieved remarkable scores on the LMSYS Chatbot Arena leaderboard, surpassing both GPT-3.5 and Mixtral 8x7B. It also boasts impressive scores on the MMLU and MBPP benchmarks, achieving 56.1 and 36.6 respectively, exceeding its predecessor, Gemma 1 2B, by over 10%. This achievement further validates the growing trend in the AI industry towards smaller, more efficient models.
"Gemma 2 2B represents a significant leap forward in the development of lightweight AI models," said a spokesperson for Google DeepMind. "It demonstrates that high performance can be achieved with significantly fewer parameters, making AI accessible to a wider range of devices and applications."
The new model’s capabilities extend beyond its impressive performance. It is designed for seamless integration with various platforms, including mobile devices, laptops, and cloud environments like Vertex AI and Google Kubernetes Engine (GKE). Optimized with NVIDIA TensorRT-LLM, Gemma 2 2B can be deployed on a range of hardware, including NVIDIA NIM platforms, data centers, and edge devices like RTX GPUs and Jetson modules.
Furthermore, the model integrates seamlessly with popular frameworks such as Keras, JAX, Hugging Face, NVIDIA NeMo, Ollama, and Gemma.cpp, with integration with MediaPipe planned for the future. This ease of integration makes Gemma 2 2B a valuable tool for developers looking to build and deploy AI solutions quickly and efficiently.
Beyond its technical prowess, Gemma 2 2B is also designed for accessibility. Its smaller size allows it to run on Google Colab’s free T4 GPU tier, lowering the barrier to entry for developers. The model’s weights are available for download on Kaggle, Hugging Face, and Vertex AI Model Garden, enabling developers to explore its potential and build their own AI applications.
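For developers who want to try the model, a minimal generation sketch with the Hugging Face `transformers` library might look like the following. This is illustrative, not official sample code: it assumes `transformers` and `torch` are installed and that you have accepted the Gemma license on the Hugging Face Hub (the weights are gated), and it uses the instruction-tuned checkpoint `google/gemma-2-2b-it`.

```python
# Sketch: text generation with Gemma 2 2B via Hugging Face Transformers.
# MODEL_ID points at the instruction-tuned checkpoint on the Hugging Face Hub.
MODEL_ID = "google/gemma-2-2b-it"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # Imports are kept inside the function so this file can be read and
    # imported without the (large) model dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",          # falls back to CPU when no GPU is present
        torch_dtype=torch.bfloat16,  # fits comfortably on a free Colab T4
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Summarize what a small language model is good for."))
```

On a Colab T4 the download and first forward pass take a few minutes; subsequent calls reuse the cached weights.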
Alongside Gemma 2 2B, Google DeepMind also unveiled two additional tools:
- ShieldGemma: This state-of-the-art safety classifier is designed to ensure that AI outputs are engaging, safe, and inclusive. It works by detecting and mitigating harmful content, including hate speech, harassment, explicit content, and dangerous content. ShieldGemma comes in various model sizes (2B, 9B, and 27B), each optimized for different applications and hardware environments.
- Gemma Scope: This tool provides unprecedented insight into the inner workings of AI models by leveraging open-source sparse autoencoders. Gemma Scope allows developers to understand the decision-making process of AI models, leading to greater transparency and accountability.
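To make ShieldGemma's classification workflow concrete, here is a rough sketch of how such an LLM-based safety classifier is typically queried: you ask the model whether a message violates a policy and read off the probability it assigns to answering "Yes". The checkpoint name `google/shieldgemma-2b` and the Yes/No scoring convention follow the published model card, but the prompt template below is a simplified stand-in for the real one, so treat this as an illustration rather than the official API.

```python
# Sketch: safety scoring with ShieldGemma, an LLM-based classifier.
# The violation probability is derived from the next-token logits for
# "Yes" vs. "No" after a policy question.
SHIELD_MODEL_ID = "google/shieldgemma-2b"  # smallest ShieldGemma checkpoint

# Simplified, illustrative prompt shape; the template in the model card
# is longer and should be used verbatim in practice.
PROMPT_TEMPLATE = (
    "You are a policy expert trying to help determine whether a user\n"
    "prompt violates the defined safety policies.\n\n"
    "Human question: {question}\n\n"
    "Does the human question violate the policy on harassment?\n"
    "Answer Yes or No."
)

def violation_probability(question: str) -> float:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(SHIELD_MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        SHIELD_MODEL_ID, device_map="auto"
    )
    prompt = PROMPT_TEMPLATE.format(question=question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits
    yes_id = tokenizer.convert_tokens_to_ids("Yes")
    no_id = tokenizer.convert_tokens_to_ids("No")
    # Softmax over just the Yes/No logits yields the violation probability.
    probs = torch.softmax(logits[[yes_id, no_id]], dim=0)
    return probs[0].item()
```

A returned value near 1.0 flags the input as likely violating the policy; thresholds and the set of policies checked are left to the deploying application.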
The release of Gemma 2 2B and its accompanying tools marks a significant step forward in the development of accessible and efficient AI. By pushing the boundaries of what is possible with smaller models, Google DeepMind is paving the way for a future where AI is more readily available and impactful across a wider range of applications.