Google DeepMind

Google's AI research lab, maker of Gemini

https://deepmind.google

News

product updateGoogle DeepMind

Google DeepMind connects Genie world model to 280 billion Street View images, Waymo already using for self-driving train

Google DeepMind has integrated its Genie world model with Street View's 280 billion images spanning 110 countries, enabling users to explore AI-generated simulations of real locations. Waymo is already using Genie 3 to train self-driving cars on rare scenarios like tornadoes and unexpected obstacles.

2 min read
model releaseGoogle DeepMind

Google DeepMind Releases Gemma 4 26B A4B Assistant Model for 2x Faster Inference via Multi-Token Prediction

Google DeepMind has released a Multi-Token Prediction assistant model for Gemma 4 26B A4B that achieves up to 2x decoding speedup through speculative decoding. The model uses 3.8B active parameters from a 25.2B total parameter MoE architecture with 128 experts and a 256K token context window.

2 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 with 31B dense model, 256K context window, and speculative decoding drafters

Google DeepMind has released Gemma 4, a family of open-weight multimodal models including a 31B dense model with 256K context window and four size variants ranging from 2.3B to 30.7B effective parameters. The release includes Multi-Token Prediction (MTP) draft models that achieve up to 2x decoding speedup through speculative decoding while maintaining identical output quality.

3 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemini 3.1 Flash TTS with audio tags for precise speech control across 70+ languages

Google DeepMind launched Gemini 3.1 Flash TTS, a text-to-speech model that achieved an Elo score of 1,211 on the Artificial Analysis TTS leaderboard. The model introduces audio tags that allow developers to control vocal style, pace, and delivery through natural language commands embedded in text input, with support for 70+ languages.

2 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 with four model sizes, up to 256K context, multimodal support

Google DeepMind released Gemma 4, an open-weights multimodal model family in four sizes (2.3B to 31B parameters) with context windows up to 256K tokens. All models support text and image input, with audio native to E2B and E4B variants. The Gemma 4 31B dense model scores 85.2% on MMLU Pro, 89.2% on AIME 2026, and 80.0% on LiveCodeBench—significant improvements over Gemma 3.

2 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 family: multimodal models from 2.3B to 31B parameters with 256K context

Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active parameters), and 31B dense. All models support text and image input with 128K-256K context windows; E2B and E4B add native audio capabilities. Models feature reasoning modes, function calling, and multilingual support across 140+ languages.

3 min read
model releaseGoogle DeepMind

NVIDIA releases Gemma 4 31B quantized model with 256K context, multimodal capabilities

NVIDIA has released a quantized version of Google DeepMind's Gemma 4 31B IT model, compressed to NVFP4 format for efficient inference on consumer GPUs. The 30.7B-parameter multimodal model supports 256K token context windows, handles text and image inputs with video frame processing, and maintains near-baseline performance across reasoning and coding benchmarks.

2 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 with multimodal reasoning and up to 256K context window

Google DeepMind released Gemma 4, a multimodal model family supporting text, images, video, and audio with context windows up to 256K tokens. The release includes four sizes (E2B, E4B, 26B A4B, and 31B) designed for deployment from mobile devices to servers. The 31B dense model achieves 85.2% on MMLU Pro and 89.2% on AIME 2026.

3 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 with four models up to 31B parameters, 256K context window

Google DeepMind released Gemma 4, an open-weights multimodal model family in four sizes (E2B, E4B, 26B A4B, 31B) with context windows up to 256K tokens and native reasoning capabilities. The 26B A4B variant uses Mixture-of-Experts architecture with 3.8B active parameters for efficient inference. All models support text, image input and handle 140+ languages with Apache 2.0 licensing.

2 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 open models with up to 256K context and multimodal reasoning

Google DeepMind has released Gemma 4, an open-weights model family in four sizes (2.3B to 31B parameters) with multimodal capabilities handling text, images, video, and audio. The 26B A4B variant uses mixture-of-experts to achieve 4B active parameters while supporting 256K token context windows and native reasoning modes.

3 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 family with 256K context window and multimodal capabilities

Google DeepMind released the Gemma 4 family of open-weights models in four sizes (2.3B to 31B parameters) with multimodal support for text, images, video, and audio. The flagship 31B model achieves 85.2% on MMLU Pro and 89.2% on AIME 2024, with context windows up to 256K tokens. All models feature configurable reasoning modes and are optimized for deployment from mobile devices to servers under Apache 2.0 license.

3 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 with 4 model sizes, 256K context, and multimodal reasoning

Google DeepMind released Gemma 4, a family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (3.8B active), and 31B (30.7B parameters). All models support text and image input with 128K-256K context windows, while E2B and E4B add native audio capabilities and reasoning modes across 140+ languages.

3 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4 open models with multimodal capabilities and 256K context window

Google DeepMind released the Gemma 4 family of open-source models with multimodal capabilities (text, image, audio, video) and context windows up to 256K tokens. Four distinct model sizes—E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active), and 31B—are available under the Apache 2.0 license, with instruction-tuned and pre-trained variants.

3 min read
model releaseGoogle DeepMind

Google DeepMind releases Gemma 4: multimodal models up to 31B parameters with 256K context

Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (25.2B total, 3.8B active), and 31B dense. All models support text and image input with 128K-256K context windows, reasoning modes, and native function calling for agentic workflows.

2 min read
researchGoogle DeepMind

Google Deepmind identifies six attack categories that can hijack autonomous AI agents

A Google Deepmind paper introduces the first systematic framework for 'AI agent traps'—attacks that exploit autonomous agents' vulnerabilities to external tools and internet access. The researchers identify six attack categories targeting perception, reasoning, memory, actions, multi-agent networks, and human supervisors, with proof-of-concept demonstrations for each.

3 min read
product updateGoogle DeepMind

Google DeepMind launches Lyria 3 Pro with 3-minute track generation and structural awareness

Google DeepMind introduced Lyria 3 Pro, an advanced music generation model capable of creating tracks up to 3 minutes long with structural awareness of musical composition elements like intros, verses, choruses, and bridges. The model is rolling out across multiple Google products including Vertex AI, Google Vids, Gemini app, and the new ProducerAI collaborative tool.

2 min read
changelogGoogle DeepMind

Google DeepMind's Gemini 3.1 Flash-Lite generates websites in real time, 2.5x faster than predecessor

Google DeepMind released Gemini 3.1 Flash-Lite, a model that generates functional websites in real time through a new pseudo-browser demo. The model achieves first response token 2.5 times faster than Gemini 2.5 Flash and outputs over 360 tokens per second, though output pricing has tripled from $0.40 to $1.50 per million tokens.

1 min read

Models

Gemini 3.5 Flash

Google DeepMind

active

May 22, 2026

Gemini 3.5 Flash

Google DeepMind

active

May 21, 2026

Gemini Omni Flash

Google DeepMind

active

May 13, 2026

Gemma 4 E4B Assistant

Google DeepMind

active
Context128K

May 10, 2026

Gemini 3.1 Flash Lite

Google DeepMind

active
Context1049K
Input/1M$0.25

May 7, 2026

Gemma 4 31B IT Assistant (MTP Drafter)

Google DeepMind

active
Context256K

May 6, 2026

Gemma 4 26B A4B Assistant

Google DeepMind

active
Context256K

May 6, 2026

Google Gemini Flash Latest

Google DeepMind

active
Context1049K

Apr 27, 2026

Google Gemini Pro Latest

Google DeepMind

active
Context1049K

Apr 27, 2026

Gemma 4 E2B VLA

Google DeepMind

active
Context2K

Apr 22, 2026

Gemini 3.1 Flash TTS

Google DeepMind

active
Input/1M$1

Apr 15, 2026

Gemini 3.1 Flash TTS

Google DeepMind

active

Apr 15, 2026

Gemma 4 E4B Instruction-Tuned

Google DeepMind

active
Context128K

Apr 4, 2026

Gemma 4 26B A4B

Google DeepMind

active
Context262K

Apr 3, 2026

Gemma 4 31B Dense

Google DeepMind

active
Context256K

Apr 2, 2026

Gemma 4 31B Instruct

Google DeepMind

active
Context262K
Input/1M$0.14

Apr 2, 2026

Gemma 4 E2B Instruction-Tuned

Google DeepMind

active
Context128K

Apr 2, 2026

Gemma 4 31B

Google DeepMind

active
Context256K

Apr 2, 2026

Gemma 4 31B Instruct

Google DeepMind

active
Context262K
0

Apr 2, 2026

Veo 3.1 Lite

Google DeepMind

active

Mar 31, 2026

Google Lyria 3 Pro Preview

Google DeepMind

active
Context1049K
0

Mar 30, 2026

Google Lyria 3 Clip Preview

Google DeepMind

active
Context1049K
0

Mar 30, 2026

Gemini 3.1 Flash Live

Google DeepMind

active

Mar 26, 2026

Lyria 3 Pro

Google DeepMind

active

Mar 25, 2026

Gemini (Ask Maps)

Google DeepMind

deprecated

Mar 12, 2026

Gemini 3.1 Flash Lite Preview

Google DeepMind

active

Google's high-efficiency model optimized for high-volume use cases. Outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance. Improvements span audio input/ASR and RAG.

Context1050K
Input/1M$0.25

Mar 2, 2026

Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Google DeepMind

active

Google's latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. Combines advanced contextual understanding with fast, cost-efficient inference.

Context66K
Input/1M$0.5

Feb 26, 2026

Lyria 3

Google DeepMind

discontinued

Feb 19, 2026

Gemini 3.1 Pro

Google DeepMind

active

Google's most capable Gemini model. Next-generation reasoning with thinking mode and 1M context.

Context1000K
Input/1M$2

Feb 19, 2026

Gemini 3.0 Pro

Google DeepMind

deprecated

Gemini 3 Pro at launch. Superseded by Gemini 3.1 Pro. Introduced next-generation multimodal reasoning.

Context1000K
Input/1M$1.25

Dec 10, 2025

Gemini 2.5 Flash

Google DeepMind

active

Google's fastest thinking model. Configurable reasoning budget.

Context1000K
Input/1M$0.15

May 20, 2025

Gemma 4 E2B

Google DeepMind

active
0

Apr 11, 2025

Gemma 4 26B A4B IT

Google DeepMind

active
Context262K
0

Apr 3, 2025

Gemini 2.5 Pro

Google DeepMind

active

Google's top reasoning model with thinking mode. Leads on coding benchmarks.

Context1000K
Input/1M$1.25

Mar 25, 2025

Gemini 2.0 Flash

Google DeepMind

active

Google's fastest production model with 1M context and native multimodal I/O.

Context1000K
Input/1M$0.1

Feb 5, 2025

Gemini 2.0 Flash-Lite

Google DeepMind

active

Most cost-efficient Gemini model. Great throughput for high-volume applications.

Context1000K
Input/1M$0.075

Feb 5, 2025

Gemini 2.0 Flash

Google DeepMind

active
Context1049K
Input/1M$0.1

Feb 5, 2025

Lyria 3 Pro

Google DeepMind

active

Jan 16, 2025

Gemma 4

Google DeepMind

active
Context256K

Jan 8, 2025

Gemini 1.5 Flash

Google DeepMind

deprecated

Fast Gemini 1.5 variant. Superseded by Gemini 2.0 Flash.

Context1000K
Input/1M$0.075

May 15, 2024

Gemini 1.5 Pro

Google DeepMind

deprecated

Google's first 2M context model. Superseded by Gemini 2.x series.

Context2000K
Input/1M$1.25

May 15, 2024

Top Benchmark Scores

Full leaderboard →
92%
93%
96.8%
95.8%

LiveCodeBench

Gemma 4 31B
80%
91.4%
88.4%
360 tokens_per_sec

SWE-bench Verified

Gemini 3.1 Pro
80.6%