

Google Gemma 4

Google DeepMind has released Gemma 4, a family of open-weight language models in four sizes (E2B, E4B, 26B MoE, and 31B Dense) under the Apache 2.0 license. The new release brings dramatic benchmark gains over Gemma 3, with AIME 2026 math jumping from 20.8% to 89.2%, LiveCodeBench coding from 29.1% to 80.0%, and GPQA science from 42.4% to 84.3%. The flagship 31B Instruct variant ranks #3 on the Arena AI text leaderboard with a 1452 Elo score, outperforming closed models twenty times its size. Gemma 4 ships with day-one support for Hugging Face, Kaggle, Ollama, and Google Cloud Vertex AI.
2026-04-09
Updated 2026-04-09 17:58:26

Google DeepMind just dropped Gemma 4 on April 2, 2026, and it is a massive jump from last year's Gemma 3. The new family ships in four sizes under a fully permissive Apache 2.0 license, with day-one support across Hugging Face, Kaggle, Ollama, Google Cloud Vertex AI, and the Transformers library. For anyone who wanted open weights that can actually compete with the frontier API models, this is the release to pay attention to.

Four Sizes, Two Architectures

Gemma 4 arrives in four variants tailored for different deployment scenarios:

  • Gemma 4 E2B: 2.3 billion effective parameters, optimized for on-device and laptop inference.
  • Gemma 4 E4B: 4.5 billion effective parameters, a sweet spot for mid-range consumer GPUs.
  • Gemma 4 26B A4B: a Mixture-of-Experts model that activates just 3.8 billion parameters per token while drawing on a 26 billion parameter pool, delivering frontier quality at a fraction of the compute cost.
  • Gemma 4 31B Dense: the flagship, a dense architecture tuned for maximum raw quality and fine-tuning headroom.
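To put the four variants in perspective, here is a back-of-envelope sizing sketch using only the parameter counts quoted above. The 2-bytes-per-parameter figure is a standard bf16 rule of thumb and an assumption on my part, not something from the announcement; it covers weights only and ignores KV cache and activations.

```python
# Rough sizing for the four Gemma 4 variants (weights only, bf16 assumed).
# Parameter counts come from the release notes; everything else is a
# back-of-envelope estimate, not an official figure.

BYTES_PER_PARAM_BF16 = 2

variants = {
    # name: (params held in memory, params active per token), in billions
    "E2B":     (2.3, 2.3),
    "E4B":     (4.5, 4.5),
    "26B A4B": (26.0, 3.8),   # MoE: full expert pool in memory, sparse compute
    "31B":     (31.0, 31.0),  # dense flagship
}

def vram_gb(total_params_b: float) -> float:
    """Weight-only memory in GB at bf16 (ignores KV cache and activations)."""
    return total_params_b * BYTES_PER_PARAM_BF16

for name, (total, active) in variants.items():
    print(f"{name}: ~{vram_gb(total):.0f} GB weights, "
          f"{active / total:.0%} of parameters active per token")
```

The interesting row is the 26B A4B: it needs the full ~52 GB of weights resident, but only about 15% of those parameters fire per token, which is where the "frontier quality at a fraction of the compute cost" claim comes from.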

Benchmark Jumps That Actually Matter

The improvements over Gemma 3 are not marginal. They are dramatic:

  • AIME 2026 (math): from 20.8% on Gemma 3 to 89.2% on Gemma 4, more than a fourfold jump.
  • LiveCodeBench (coding): 29.1% to 80.0%.
  • GPQA Diamond (science): 42.4% to 84.3%.

The flagship Gemma 4 31B Instruct now ranks #3 on Arena AI's text leaderboard with a 1452 Elo score, outperforming models twenty times its size. This is the first time an open-weight model at this scale has reached the top tier of the leaderboard without a custom license or usage restriction.

Apache 2.0 Changes Everything

Earlier Gemma releases shipped under a custom Google license with its own quirks. Gemma 4 drops that entirely and ships under Apache 2.0. You can use it commercially, modify it, fine-tune it, redistribute derivatives, and ship it in products without asking permission. For teams building production AI, this removes the last reason to prefer Llama or Qwen purely for licensing comfort.

Ecosystem Day-One Support

Google did the work to make Gemma 4 accessible from day one. You can:

  1. Download weights from Hugging Face, Kaggle, or Ollama.
  2. Run locally via Transformers, TRL, Transformers.js, Candle, llama.cpp, or MLX on Apple Silicon.
  3. Fine-tune on a single H100 for the 31B variant, or a consumer GPU for the smaller sizes.
  4. Deploy to Google Cloud Vertex AI, AWS, or any Kubernetes cluster.
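To make steps 1 and 2 concrete, here is what pulling and running the smallest variant might look like with Ollama. The `gemma4:e2b` tag is an assumption modeled on how earlier Gemma releases were tagged; check the Ollama model library for the actual names before running this.

```shell
# Hypothetical model tag -- verify the real name in the Ollama library.
ollama pull gemma4:e2b            # ~2.3B effective params, laptop-friendly
ollama run gemma4:e2b "Summarize mixture-of-experts in two sentences."
```

The same weights should load through Transformers or llama.cpp if you prefer a library over the Ollama daemon.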

Why This Release Matters

Gemma 4 is the clearest signal yet that the gap between open-weight and closed-API frontier models has closed. A 31 billion parameter dense model from Google now beats closed models twenty times its size on public benchmarks, runs on hardware most teams already own, and ships under Apache 2.0. If you skipped Gemma 3 because the license felt restrictive or the benchmarks felt behind, Gemma 4 is the one to actually evaluate.

Read the official announcement, grab weights from Google DeepMind, or jump straight to Hugging Face to start building.
