Google DeepMind

Google's AI research lab, maker of Gemini

https://deepmind.google

News

model release · Google DeepMind

Google DeepMind releases Gemma 4 with four model sizes, up to 256K context, multimodal support

Google DeepMind released Gemma 4, an open-weights multimodal model family in four sizes (2.3B to 31B parameters) with context windows up to 256K tokens. All models support text and image input, with audio native to E2B and E4B variants. The Gemma 4 31B dense model scores 85.2% on MMLU Pro, 89.2% on AIME 2026, and 80.0% on LiveCodeBench—significant improvements over Gemma 3.

2 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 family: multimodal models from 2.3B to 31B parameters with 256K context

Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active parameters), and 31B dense. All models support text and image input with 128K-256K context windows; E2B and E4B add native audio capabilities. Models feature reasoning modes, function calling, and multilingual support across 140+ languages.

3 min read
model release · Google DeepMind

NVIDIA releases Gemma 4 31B quantized model with 256K context, multimodal capabilities

NVIDIA has released a quantized version of Google DeepMind's Gemma 4 31B IT model, compressed to NVFP4 format for efficient inference on consumer GPUs. The 30.7B-parameter multimodal model supports 256K token context windows, handles text and image inputs with video frame processing, and maintains near-baseline performance across reasoning and coding benchmarks.

2 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 with multimodal reasoning and up to 256K context window

Google DeepMind released Gemma 4, a multimodal model family supporting text, images, video, and audio with context windows up to 256K tokens. The release includes four sizes (E2B, E4B, 26B A4B, and 31B) designed for deployment from mobile devices to servers. The 31B dense model achieves 85.2% on MMLU Pro and 89.2% on AIME 2026.

3 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 with four models up to 31B parameters, 256K context window

Google DeepMind released Gemma 4, an open-weights multimodal model family in four sizes (E2B, E4B, 26B A4B, 31B) with context windows up to 256K tokens and native reasoning capabilities. The 26B A4B variant uses a Mixture-of-Experts architecture with 3.8B active parameters for efficient inference. All models support text and image input and handle 140+ languages under Apache 2.0 licensing.

2 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 open models with up to 256K context and multimodal reasoning

Google DeepMind has released Gemma 4, an open-weights model family in four sizes (2.3B to 31B parameters) with multimodal capabilities handling text, images, video, and audio. The 26B A4B variant uses mixture-of-experts to run with 3.8B active parameters while supporting 256K token context windows and native reasoning modes.

3 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 family with 256K context window and multimodal capabilities

Google DeepMind released the Gemma 4 family of open-weights models in four sizes (2.3B to 31B parameters) with multimodal support for text, images, video, and audio. The flagship 31B model achieves 85.2% on MMLU Pro and 89.2% on AIME 2026, with context windows up to 256K tokens. All models feature configurable reasoning modes and are optimized for deployment from mobile devices to servers under the Apache 2.0 license.

3 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 with 4 model sizes, 256K context, and multimodal reasoning

Google DeepMind released Gemma 4, a family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (3.8B active), and 31B (30.7B parameters). All models support text and image input with 128K-256K context windows, while E2B and E4B add native audio capabilities and reasoning modes across 140+ languages.

3 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4 open models with multimodal capabilities and 256K context window

Google DeepMind released the Gemma 4 family of open-weights models with multimodal capabilities (text, image, audio, video) and context windows up to 256K tokens. Four distinct model sizes—E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active), and 31B—are available under the Apache 2.0 license, with instruction-tuned and pre-trained variants.

3 min read
model release · Google DeepMind

Google DeepMind releases Gemma 4: multimodal models up to 31B parameters with 256K context

Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (25.2B total, 3.8B active), and 31B dense. All models support text and image input with 128K-256K context windows, reasoning modes, and native function calling for agentic workflows.

2 min read
research · Google DeepMind

Google DeepMind identifies six attack categories that can hijack autonomous AI agents

A Google DeepMind paper introduces the first systematic framework for 'AI agent traps': attacks that exploit autonomous agents' exposure to external tools and internet access. The researchers identify six attack categories targeting perception, reasoning, memory, actions, multi-agent networks, and human supervisors, with proof-of-concept demonstrations for each.

3 min read
product update · Google DeepMind

Google DeepMind launches Lyria 3 Pro with 3-minute track generation and structural awareness

Google DeepMind introduced Lyria 3 Pro, an advanced music generation model capable of creating tracks up to 3 minutes long with structural awareness of musical composition elements like intros, verses, choruses, and bridges. The model is rolling out across multiple Google products including Vertex AI, Google Vids, Gemini app, and the new ProducerAI collaborative tool.

2 min read
changelog · Google DeepMind

Google DeepMind's Gemini 3.1 Flash-Lite generates websites in real time, 2.5x faster than predecessor

Google DeepMind released Gemini 3.1 Flash-Lite, a model that generates functional websites in real time through a new pseudo-browser demo. The model delivers its first response token 2.5 times faster than Gemini 2.5 Flash and outputs over 360 tokens per second, though output pricing has risen from $0.40 to $1.50 per million tokens.

1 min read
product update · Google DeepMind

Google DeepMind adds multi-tool chaining and context circulation to Gemini API

Google DeepMind has expanded the Gemini API to enable multi-tool chaining, allowing developers to combine built-in tools like Google Search and Google Maps with custom functions in a single request. Results from one tool now automatically pass to the next through context circulation, eliminating the need for separate sequential handling.
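The chaining behavior described above can be sketched in plain Python. This is a conceptual illustration only: the tool names, signatures, and the `run_chain` helper are hypothetical stand-ins, not the actual Gemini API surface.

```python
# Conceptual sketch of multi-tool chaining with "context circulation":
# each tool's result is appended to a shared context that later tools
# can read, so no manual sequential plumbing is needed.
# All names here are illustrative, not real Gemini API calls.

def search_tool(query, context):
    # Stand-in for a built-in search tool: resolves a query to a place.
    return {"place": "Mountain View"}

def maps_tool(query, context):
    # Reads the previous tool's result out of the circulated context.
    place = context[-1]["result"]["place"]
    return {"distance_km": 12, "to": place}

def run_chain(tools, user_query):
    """Run tools in order, circulating each result into shared context."""
    context = []
    for name, tool in tools:
        result = tool(user_query, context)
        context.append({"tool": name, "result": result})  # circulate
    return context

trace = run_chain([("search", search_tool), ("maps", maps_tool)],
                  "nearest office")
```

Here the maps step consumes the search step's output purely through the shared context list, which is the behavior the API update automates for built-in and custom tools.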

2 min read
model release · Google DeepMind

Google DeepMind releases Nano Banana 2 image model with Pro-level capabilities at faster speeds

Google DeepMind has released Nano Banana 2, an image generation model that combines advanced world knowledge and subject consistency with faster inference speeds comparable to its Flash offering. The model is positioned as production-ready with capabilities previously associated with Pro-tier performance.

2 min read

Models

All models listed below are from Google DeepMind.

Gemma 4 E2B · active · Context: 128K · Apr 6, 2026

Gemma 4 E4B Instruction-Tuned · active · Context: 128K · Apr 4, 2026

Gemma 4 26B A4B · active · Context: 262K · Apr 3, 2026

Gemma 4 31B Dense · active · Context: 256K · Apr 2, 2026

Gemma 4 31B Instruct · active · Context: 262K · Input/1M: $0.14 · Apr 2, 2026

Gemma 4 E2B Instruction-Tuned · active · Context: 128K · Apr 2, 2026

Gemma 4 31B · active · Context: 256K · Apr 2, 2026

Gemma 4 31B Instruct · active · Context: 262K · Input/1M: $0 · Apr 2, 2026

Veo 3.1 Lite · active · Mar 31, 2026

Google Lyria 3 Pro Preview · active · Context: 1049K · Input/1M: $0 · Mar 30, 2026

Google Lyria 3 Clip Preview · active · Context: 1049K · Input/1M: $0 · Mar 30, 2026

Gemini 3.1 Flash Live · active · Mar 26, 2026

Lyria 3 Pro · active · Mar 25, 2026

Gemini (Ask Maps) · deprecated · Mar 12, 2026

Gemini 3.1 Flash Lite Preview · active · Context: 1050K · Input/1M: $0.25 · Mar 2, 2026
Google's high-efficiency model optimized for high-volume use cases. Outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance. Improvements span audio input/ASR and RAG.

Gemini 3.1 Flash-Lite · active · Mar 1, 2026

Nano Banana 2 (Gemini 3.1 Flash Image Preview) · active · Context: 66K · Input/1M: $0.50 · Feb 26, 2026
Google's latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. Combines advanced contextual understanding with fast, cost-efficient inference.

Lyria 3 · discontinued · Feb 19, 2026

Gemini 3.1 Pro · active · Context: 1000K · Input/1M: $2.00 · Feb 19, 2026
Google's most capable Gemini model. Next-generation reasoning with thinking mode and 1M context.

Gemini 3.0 Pro · deprecated · Context: 1000K · Input/1M: $1.25 · Dec 10, 2025
Gemini 3 Pro at launch. Superseded by Gemini 3.1 Pro. Introduced next-generation multimodal reasoning.

Gemini 2.5 Flash · active · Context: 1000K · Input/1M: $0.15 · May 20, 2025
Google's fastest thinking model. Configurable reasoning budget.

Gemma 4 26B A4B IT · active · Context: 262K · Input/1M: $0 · Apr 3, 2025

Gemini 2.5 Pro · active · Context: 1000K · Input/1M: $1.25 · Mar 25, 2025
Google's top reasoning model with thinking mode. Leads on coding benchmarks.

Gemini 2.0 Flash · active · Context: 1000K · Input/1M: $0.10 · Feb 5, 2025
Google's fastest production model with 1M context and native multimodal I/O.

Gemini 2.0 Flash-Lite · active · Context: 1000K · Input/1M: $0.075 · Feb 5, 2025
Most cost-efficient Gemini model. Great throughput for high-volume applications.

Gemini 2.0 Flash · active · Context: 1049K · Input/1M: $0.10 · Feb 5, 2025

Lyria 3 Pro · active · Jan 16, 2025

Gemma 4 · active · Context: 256K · Jan 8, 2025

Gemini 1.5 Flash · deprecated · Context: 1000K · Input/1M: $0.075 · May 15, 2024
Fast Gemini 1.5 variant. Superseded by Gemini 2.0 Flash.

Gemini 1.5 Pro · deprecated · Context: 2000K · Input/1M: $1.25 · May 15, 2024
Google's first 2M context model. Superseded by Gemini 2.x series.

Top Benchmark Scores

Full leaderboard →
LiveCodeBench · Gemma 4 31B: 80%
SWE-bench Verified · Gemini 3.1 Pro: 80.6%