Gemini 3.1 Flash Lite

Name: Gemini 3.1 Flash Lite
Price: 0.25 USD per 1M input tokens
Author: Google DeepMind

Google DeepMind · 🇺🇸 United States

active

Context window: 1049K tokens

Input / 1M tokens: $0.25

Output / 1M tokens: $1.50
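As a rough illustration of how the listed rates and context window translate into per-request numbers, the sketch below multiplies token counts by the prices on this page and checks a prompt against the 1,048,576-token limit (2^20, which rounds to the 1049K shown above). The helper names `estimate_cost_usd` and `fits_in_context` and the example token counts are hypothetical, and counting reserved output tokens against the same window is an assumption. Tiered, cached, or batch pricing is not modeled.

```python
# Figures taken from this page; everything else here is illustrative.
INPUT_USD_PER_MTOK = 0.25          # USD per 1M input tokens
OUTPUT_USD_PER_MTOK = 1.50         # USD per 1M output tokens
CONTEXT_WINDOW_TOKENS = 2**20      # 1,048,576 tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int = 0) -> bool:
    """Check a prompt against the context window.

    Assumption: tokens reserved for the reply count against the same window.
    """
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Made-up request size: a 200,000-token prompt with a 2,000-token reply.
    print(f"estimated cost: ${estimate_cost_usd(200_000, 2_000):.4f}")
    print(f"fits in context: {fits_in_context(200_000, 2_000)}")
```

At these rates output tokens cost six times as much as input tokens, so reply length tends to dominate the bill for short prompts.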

Version History

3.1 · minor · May 7, 2026

Google launches the Flash Lite variant of Gemini 3.1 at a 50% cost reduction, keeping the 1M context window and adding a four-level reasoning system.

3.1 Flash-Lite · minor · March 1, 2026

Gemini 3.1 Flash-Lite achieves 2.5x faster first-token latency than Gemini 2.5 Flash with 360 tokens/second throughput. Output pricing increased to $1.50 per million tokens from $0.40.
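To put the throughput figure in concrete terms, the sketch below estimates how long a reply of a given length would take to stream at 360 tokens per second. The helper name `estimate_decode_seconds` and the example reply length are hypothetical. First-token latency is excluded because this page reports it only relative to Gemini 2.5 Flash, not as an absolute number.

```python
THROUGHPUT_TOK_PER_SEC = 360.0  # throughput figure quoted on this page


def estimate_decode_seconds(
    output_tokens: int, tok_per_sec: float = THROUGHPUT_TOK_PER_SEC
) -> float:
    """Approximate streaming time for a reply, excluding first-token latency."""
    if tok_per_sec <= 0:
        raise ValueError("tok_per_sec must be positive")
    return output_tokens / tok_per_sec


if __name__ == "__main__":
    # Made-up reply length of 1,800 tokens.
    print(f"{estimate_decode_seconds(1_800):.1f} s")
```

Real throughput varies with load, prompt size, and thinking level, so this is best treated as a lower-bound estimate.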

Benchmark Scores

Speed (tok/s): 360.0

Coverage

model release

Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens

Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576-token context window, priced at $0.25 per million input tokens and $1.50 per million output tokens. The model accepts text, image, video, audio, and PDF inputs and offers four thinking levels for trading cost against performance.

May 7, 2026 · 4:21 PM · 2 min read

google gemini multimodal

changelog · Google DeepMind

Google DeepMind's Gemini 3.1 Flash-Lite generates websites in real time, 2.5x faster than predecessor

Google DeepMind released Gemini 3.1 Flash-Lite, a model that generates functional websites in real time through a new pseudo-browser demo. The model delivers its first response token 2.5 times faster than Gemini 2.5 Flash and outputs over 360 tokens per second, though output pricing has risen 3.75x, from $0.40 to $1.50 per million tokens.

March 24, 2026 · 7:20 PM · 1 min read

gemini google-deepmind model-release