Google DeepMind

1 article tagged with Google DeepMind

May 10, 2026
model releaseGoogle DeepMind

Google DeepMind Releases Gemma 4 E4B with Multi-Token Prediction for 2x Faster Inference

Google DeepMind released the Gemma 4 E4B assistant model using Multi-Token Prediction (MTP) architecture that accelerates inference by up to 2x through speculative decoding. The 4.5B effective parameter model supports 128K context windows and handles text, image, and audio input with pricing not yet disclosed.