Google's new Gemma 2 is giving Llama 3 a "tough time"😨
Gemma 2 offers best-in-class performance, runs at incredible speed across a range of hardware, and integrates easily with other AI tools.
Google Releases Gemma 2: A Leap Forward in AI Development
Google has announced the global release of Gemma 2 (detailed in its technical report), the latest in its family of lightweight, state-of-the-art open models. Building on the success of the original Gemma models, Gemma 2 offers superior performance, enhanced efficiency, and seamless integration with various AI tools and frameworks.
What is Gemma 2?
Gemma 2 is an advanced AI model available in two sizes: 9 billion (9B) and 27 billion (27B) parameters. These models are designed to deliver high performance and efficiency, making them ideal for a wide range of AI applications. The 27B model, in particular, offers competitive performance compared to models more than twice its size, running efficiently on a single NVIDIA H100 Tensor Core GPU or TPU host. This efficiency significantly reduces deployment costs while maintaining top-tier performance.
Key Features of Gemma 2
Outstanding Performance
- Performance: Gemma 2’s 27B model delivers class-leading results, outperforming many models more than twice its size. The 9B model also excels, surpassing competitors in its size category.
- Efficiency: The 27B model runs inference efficiently on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU, reducing costs and making high-performance AI more accessible.
Versatile Hardware Compatibility
- Speed: Gemma 2 is optimized for fast inference across various hardware platforms, from gaming laptops to cloud-based setups. Users can experience its full-precision capabilities in Google AI Studio, or run it on their home computers on an NVIDIA RTX or GeForce RTX desktop GPU via Hugging Face Transformers.
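Running Gemma 2 locally with Hugging Face Transformers follows the library's standard causal-LM pattern. Below is a minimal sketch, assuming `torch` and `transformers` are installed and that you have accepted the Gemma license on Hugging Face (the checkpoints are gated); `google/gemma-2-9b-it` is the instruction-tuned 9B checkpoint, and the `format_gemma_prompt` helper is an illustrative wrapper around Gemma's chat-turn markup:

```python
# Sketch: local inference with Gemma 2 9B via Hugging Face Transformers.
# Assumes `torch` and `transformers` are installed and the gated Gemma
# license has been accepted on Hugging Face.

MODEL_ID = "google/gemma-2-9b-it"  # instruction-tuned 9B checkpoint


def format_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma's chat-turn format."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


if __name__ == "__main__":
    # Heavy imports and the model download happen only when run directly.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # bf16 halves memory vs. fp32 on RTX-class GPUs
        device_map="auto",           # place layers on whatever accelerator is available
    )

    prompt = format_gemma_prompt("Summarize what makes Gemma 2 efficient.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Using `device_map="auto"` lets Transformers place the model on a single consumer GPU and spill to CPU memory if needed, which is what makes the 9B model practical on a gaming laptop.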
Integration and Accessibility
- Open and Accessible: Gemma 2 is available under a commercially-friendly license, allowing developers and researchers to share and commercialize their innovations.
- Framework Compatibility: It is compatible with major AI frameworks like Hugging Face Transformers, JAX, PyTorch, and TensorFlow. Integration with NVIDIA TensorRT-LLM and optimization with NVIDIA NeMo are also supported.
Easy Deployment
- Google Cloud: Starting next month, Gemma 2 can be easily deployed and managed on Google Cloud’s Vertex AI. The new Gemma Cookbook provides practical examples and recipes for building applications and fine-tuning models.
Commitment to Responsible AI Development
Google emphasizes responsible AI development with Gemma 2. The model has been developed following rigorous safety processes, including filtering pre-training data and extensive testing to mitigate biases and risks. Google has also open-sourced tools like the LLM Comparator for evaluating language models and is working on open-sourcing SynthID, its text watermarking technology, for Gemma models.
Real-World Applications
The first Gemma model saw over 10 million downloads and numerous innovative projects, such as Navarasa’s linguistic diversity model. Gemma 2 aims to inspire even more ambitious projects by offering enhanced performance and versatility. Google plans to continue exploring new architectures and developing specialized Gemma variants for various AI tasks.
Getting Started with Gemma 2
Gemma 2 is now available in Google AI Studio, where you can test it at its full precision. Model weights can be downloaded from Kaggle and Hugging Face Models, with Vertex AI Model Garden integration coming soon. To support research and development, Gemma 2 is available free of charge through Kaggle and the free tier of Colab notebooks. Academic researchers can apply for Google Cloud credits through the Gemma 2 Academic Research Program to accelerate their work with Gemma 2.
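For the Hugging Face route, the weights can be fetched programmatically. The sketch below is illustrative, assuming `huggingface_hub` is installed and you are logged in (`huggingface-cli login`) with the Gemma license accepted; the `gemma2_repo_id` helper is a hypothetical convenience for building the published repo ids (`google/gemma-2-9b`, `google/gemma-2-27b`, and their `-it` instruction-tuned variants):

```python
# Sketch: downloading Gemma 2 weights from Hugging Face for local use.
# Assumes `huggingface_hub` is installed and you are authenticated with
# the gated Gemma license accepted.


def gemma2_repo_id(size_b: int, instruction_tuned: bool = True) -> str:
    """Build the Hugging Face repo id for a Gemma 2 checkpoint (9 or 27)."""
    if size_b not in (9, 27):
        raise ValueError("Gemma 2 ships in 9B and 27B sizes")
    suffix = "-it" if instruction_tuned else ""
    return f"google/gemma-2-{size_b}b{suffix}"


if __name__ == "__main__":
    from huggingface_hub import snapshot_download

    # Downloads the full 9B instruction-tuned snapshot into the local HF cache.
    local_path = snapshot_download(repo_id=gemma2_repo_id(9))
    print("weights cached at:", local_path)
```

The same repo-id pattern covers the 27B model; note the 27B download is considerably larger and, per the announcement, is best served on an A100/H100-class GPU or a TPU host.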
For more information and to start exploring Gemma 2, visit the technical report and check out the practical examples in the Gemma Cookbook.