Google has unleashed Gemma 4, a brand-new family of open AI models, and the developer community is doing a collective happy dance.

Built on the same "secret sauce" powering Gemini 3, Gemma 4 comes in multiple sizes, from a tiny 2 billion parameters to a massive 31 billion. Whether you are running AI on a laptop or a beefy server farm, there is a Gemma 4 with your name on it.

Meet the Four Models: 

First, get this: Gemma 4 is organized into two distinct tiers.

1. Edge Tier: These models are designed to live natively on your devices.

  • E2B (The Tiny Giant): Don't let the name fool you. It looks like a 2B model but packs 5.1 billion parameters using a clever "Per-Layer Embeddings" trick. It handles text, images, and audio natively, making things like live offline translation possible.

  • E4B (The Pocket Mathlete): Still phone-friendly but with serious muscle. It scored 42.5% on the AIME 2026 math test, which is jaw-dropping for something that fits in your pocket.

Both edge models support a 128K-token context window, roughly an entire novel's worth of memory per conversation.
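The novel comparison checks out with some napkin math. The conversion rate below is a common rule of thumb for English text, not an official Gemma 4 figure:

```python
TOKENS = 128_000
WORDS_PER_TOKEN = 0.75   # rough rule of thumb for English; an assumption, not a Gemma spec

words = int(TOKENS * WORDS_PER_TOKEN)
print(f"~{words:,} words")   # ~96,000 words
```

That lands right in the 80,000–100,000-word range of a typical novel.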

2. Workstation Tier: These models are built for developers who need local "frontier-class" intelligence.

  • 26B A4B (The Specialist): A Mixture-of-Experts (MoE) model. It carries 25.2 billion parameters but only "wakes up" 3.8 billion at a time. You get 30B-class intelligence at the speed and cost of a 4B model.

  • 31B Dense (The Absolute Unit): All 31 billion parameters are active all the time. It scored an eye-watering 89.2% on AIME 2026, beating models 20 times its size on the Arena AI leaderboard.
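The "only wakes up a few billion parameters" idea behind the 26B A4B is called routed Mixture-of-Experts, and the core mechanic fits in a few lines: a small router scores every expert per token, and only the top-scoring few actually run. Everything below is illustrative (the expert count, dimensions, and ReLU feed-forward experts are assumptions for the sketch, not Gemma 4's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # illustrative; the real expert count isn't given in the announcement
TOP_K = 2         # only a couple of experts "wake up" per token
D_MODEL = 16

# Each expert is a tiny feed-forward layer; together they hold most of the parameters.
experts = [
    {"w_in": rng.standard_normal((D_MODEL, 4 * D_MODEL)),
     "w_out": rng.standard_normal((4 * D_MODEL, D_MODEL))}
    for _ in range(NUM_EXPERTS)
]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def moe_layer(token: np.ndarray) -> np.ndarray:
    """Route one token through only TOP_K of the NUM_EXPERTS experts."""
    logits = token @ router
    top = np.argsort(logits)[-TOP_K:]                          # winning experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over winners
    out = np.zeros_like(token)
    for w, idx in zip(weights, top):
        e = experts[idx]
        out += w * (np.maximum(token @ e["w_in"], 0.0) @ e["w_out"])
    return out

y = moe_layer(rng.standard_normal(D_MODEL))
print(f"{TOP_K / NUM_EXPERTS:.0%} of experts active per token")  # 25%
```

All experts sit in memory (hence the 25.2B total), but per token you only pay the compute for the active slice, which is how you get big-model quality at small-model speed.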

Unlike older models that had vision and audio awkwardly "stitched" on, Gemma 4 is natively multimodal from scratch:

  • Vision: It reads images, documents, and video frames with high-detail OCR.

  • Live Audio: On-device speech recognition and translation (Edge models only).

  • Function Calling: All four models are trained from the ground up to use tools and interact with software for complex, multi-step tasks.

  • Global Fluency: Supports 140+ languages natively.

But here's where it gets legendary: previous Gemma models came with Google's own custom license that had legal teams sweating. Gemma 4 ships under Apache 2.0, the same no-strings-attached license used by most of the open-source AI world.

In plain English? Startups and indie builders can grab these models, use them commercially, modify them, and build empires without paying Google a single cent. In a world of restrictive licenses, this is the AI equivalent of finding a sports car with free lifetime fuel.

The Reality Check: Gemma 4 is straight-up embarrassing its predecessor. The tiny E2B now outperforms the old Gemma 3 27B on major benchmarks. It’s like the intern outperforming last year’s senior manager. Awkward, but amazing for us.

You can grab the weights right now on Hugging Face, Kaggle, or Ollama.

