google-deepmind
44 articles tagged with google-deepmind
Google DeepMind Releases Gemma 4 26B A4B Assistant Model for 2x Faster Inference via Multi-Token Prediction
Google DeepMind has released a Multi-Token Prediction assistant model for Gemma 4 26B A4B that achieves up to 2x decoding speedup through speculative decoding. The model uses 3.8B active parameters from a 25.2B total parameter MoE architecture with 128 experts and a 256K token context window.
Google tests Remy AI agent internally, designed to act autonomously across Gemini services
Google is testing Remy, an AI personal agent for Gemini that can take actions on users' behalf across Google services, according to Business Insider. The tool is currently in employee-only testing with no confirmed public release date.
Google DeepMind releases Gemma 4 with 31B dense model, 256K context window, and speculative decoding drafters
Google DeepMind has released Gemma 4, a family of open-weight multimodal models including a 31B dense model with 256K context window and four size variants ranging from 2.3B to 30.7B effective parameters. The release includes Multi-Token Prediction (MTP) draft models that achieve up to 2x decoding speedup through speculative decoding while maintaining identical output quality.
Google Opens Gemini Notebooks to Free Users with 50-Source Limit
Google has expanded its Notebooks feature in the Gemini app to free users, allowing them to organize chats and files with up to 50 sources per notebook. The feature, which integrates with NotebookLM, was previously available only to Google AI subscribers.
Google adds Nano Banana image generation to Gemini Personal Intelligence, using Gmail and Photos data
Google has integrated its Nano Banana image generation system with Gemini's Personal Intelligence feature, enabling the AI to create images informed by user data from Gmail, Photos, Calendar, Drive, and other Google apps. The feature rolls out to Plus, Pro, and Ultra subscribers in the US first, with Europe excluded from the initial launch.
Google's Gemini now generates personalized images using your Google Photos library
Google's Gemini can now generate personalized images by pulling data from users' Google Photos libraries through its Personal Intelligence feature. The integration uses Google Photos labels to identify people and objects, then generates images via the Nano Banana 2 model that reflect users' tastes and lifestyle.
Google DeepMind releases Gemini 3.1 Flash TTS with audio tags for precise speech control across 70+ languages
Google DeepMind launched Gemini 3.1 Flash TTS, a text-to-speech model that achieved an Elo score of 1,211 on the Artificial Analysis TTS leaderboard. The model introduces audio tags that allow developers to control vocal style, pace, and delivery through natural language commands embedded in text input, with support for 70+ languages.
Google Home April 2026 update reduces Gemini interruptions, improves speech recognition in noisy environments
Google Home's April 2026 update addresses Gemini voice assistant reliability issues. The update improves speech detection to reduce mid-sentence interruptions, speeds up responses to simple queries, and enhances music playlist recognition even when names are misspoken or in noisy environments.
Google DeepMind releases Gemma 4 with four model sizes, up to 256K context, multimodal support
Google DeepMind released Gemma 4, an open-weights multimodal model family in four sizes (2.3B to 31B parameters) with context windows up to 256K tokens. All models support text and image input, with audio native to E2B and E4B variants. The Gemma 4 31B dense model scores 85.2% on MMLU Pro, 89.2% on AIME 2026, and 80.0% on LiveCodeBench—significant improvements over Gemma 3.
Google releases Gemma 4 26B with 256K context and multimodal support, free to use
Google DeepMind has released Gemma 4 26B A4B, a free instruction-tuned Mixture-of-Experts model with 262,144 token context window and multimodal capabilities including text, images, and video input. Despite 25.2B total parameters, only 3.8B activate per token, delivering performance comparable to larger 31B models at reduced compute cost.
Google releases Gemma 4 31B free model with 256K context and multimodal support
Google DeepMind has released Gemma 4 31B Instruct, a free 30.7-billion parameter model with a 256K token context window, multimodal text and image input capabilities, and native function calling. The model supports configurable reasoning mode and 140+ languages, with strong performance on coding and document understanding tasks under Apache 2.0 license.
Google DeepMind releases Gemma 4 family: multimodal models from 2.3B to 31B parameters with 256K context
Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active parameters), and 31B dense. All models support text and image input with 128K-256K context windows; E2B and E4B add native audio capabilities. Models feature reasoning modes, function calling, and multilingual support across 140+ languages.
NVIDIA releases Gemma 4 31B quantized model with 256K context, multimodal capabilities
NVIDIA has released a quantized version of Google DeepMind's Gemma 4 31B IT model, compressed to NVFP4 format for efficient inference on consumer GPUs. The 30.7B-parameter multimodal model supports 256K token context windows, handles text and image inputs with video frame processing, and maintains near-baseline performance across reasoning and coding benchmarks.
Google DeepMind releases Gemma 4 with multimodal reasoning and up to 256K context window
Google DeepMind released Gemma 4, a multimodal model family supporting text, images, video, and audio with context windows up to 256K tokens. The release includes four sizes (E2B, E4B, 26B A4B, and 31B) designed for deployment from mobile devices to servers. The 31B dense model achieves 85.2% on MMLU Pro and 89.2% on AIME 2026.
Gemma 4 success hinges on tooling and fine-tuning ease, not benchmark scores
Google's Gemma 4 release marks a shift in open model strategy with Apache 2.0 licensing and competitive benchmarks, but real success depends on factors rarely measured: tooling stability, fine-tuning ease, and ecosystem adoption. The open model landscape is now crowded with alternatives like Qwen 3.5, Nemotron 3, and others—a maturation that changes what separates winners from the field.
Google DeepMind releases Gemma 4 with four models up to 31B parameters, 256K context window
Google DeepMind released Gemma 4, an open-weights multimodal model family in four sizes (E2B, E4B, 26B A4B, 31B) with context windows up to 256K tokens and native reasoning capabilities. The 26B A4B variant uses Mixture-of-Experts architecture with 3.8B active parameters for efficient inference. All models support text, image input and handle 140+ languages with Apache 2.0 licensing.
Google DeepMind releases Gemma 4, open multimodal models with 256K context and reasoning
Google DeepMind has released Gemma 4, a family of open-weights multimodal models ranging from 2.3B to 31B parameters with support for text, images, video, and audio. The models feature context windows up to 256K tokens, built-in reasoning modes, and native function calling for agentic workflows.
Google DeepMind releases Gemma 4 open models with up to 256K context and multimodal reasoning
Google DeepMind has released Gemma 4, an open-weights model family in four sizes (2.3B to 31B parameters) with multimodal capabilities handling text, images, video, and audio. The 26B A4B variant uses mixture-of-experts to achieve 4B active parameters while supporting 256K token context windows and native reasoning modes.
Google DeepMind releases Gemma 4 family with 256K context window and multimodal capabilities
Google DeepMind released the Gemma 4 family of open-weights models in four sizes (2.3B to 31B parameters) with multimodal support for text, images, video, and audio. The flagship 31B model achieves 85.2% on MMLU Pro and 89.2% on AIME 2024, with context windows up to 256K tokens. All models feature configurable reasoning modes and are optimized for deployment from mobile devices to servers under Apache 2.0 license.
Google launches Gemma 4 open-weights models with Apache 2.0 license to compete with Chinese LLMs
Google released Gemma 4, a new line of open-weights models available in sizes from 2 billion to 31 billion parameters, under a permissive Apache 2.0 license. The release includes multimodal capabilities, support for 140+ languages, native function calling, and a 256,000-token context window for the larger variants.
Google DeepMind releases Gemma 4 with 4 model sizes, 256K context, and multimodal reasoning
Google DeepMind released Gemma 4, a family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (3.8B active), and 31B (30.7B parameters). All models support text and image input with 128K-256K context windows, while E2B and E4B add native audio capabilities and reasoning modes across 140+ languages.
Google DeepMind releases Gemma 4 open models with multimodal capabilities and 256K context window
Google DeepMind released the Gemma 4 family of open-source models with multimodal capabilities (text, image, audio, video) and context windows up to 256K tokens. Four distinct model sizes—E2B (2.3B effective parameters), E4B (4.5B effective), 26B A4B (3.8B active), and 31B—are available under the Apache 2.0 license, with instruction-tuned and pre-trained variants.
Google DeepMind releases Gemma 4: multimodal models up to 31B parameters with 256K context
Google DeepMind released the Gemma 4 family of open-weights multimodal models in four sizes: E2B (2.3B effective), E4B (4.5B effective), 26B A4B (25.2B total, 3.8B active), and 31B dense. All models support text and image input with 128K-256K context windows, reasoning modes, and native function calling for agentic workflows.
Google releases Gemma 4 31B with 256K context and configurable reasoning mode
Google DeepMind has released Gemma 4 31B, a 30.7-billion-parameter multimodal model supporting text and image input. The model features a 262,144-token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages under Apache 2.0 license.
Google releases Gemma 4 family with 31B model, 256K context, multimodal capabilities
Google DeepMind released the Gemma 4 family of open-weights models ranging from 2.3B to 31B parameters, featuring up to 256K token context windows and native support for text, image, video, and audio inputs. The flagship 31B model scores 85.2% on MMLU Pro and 89.2% on AIME 2026, with a smaller 26B MoE variant requiring only 3.8B active parameters for faster inference.
NVIDIA Optimizes Google Gemma 4 for Local Agentic AI on RTX and Spark
NVIDIA has optimized Google's Gemma 4 models for local deployment on RTX and Spark platforms, targeting the emerging wave of on-device agentic AI. The optimization enables small, efficient models to access real-time local context for autonomous decision-making without cloud dependency.
Google DeepMind releases Gemma 4: open models ranking #3 and #6 on Arena AI leaderboard
Google DeepMind released Gemma 4, a family of four open models ranging from 2B to 31B parameters, all licensed under Apache 2.0. The 31B dense model ranks #3 on Arena AI's text leaderboard and the 26B mixture-of-experts variant ranks #6, outperforming closed models significantly larger in size.
Google Deepmind identifies six attack categories that can hijack autonomous AI agents
A Google Deepmind paper introduces the first systematic framework for 'AI agent traps'—attacks that exploit autonomous agents' vulnerabilities to external tools and internet access. The researchers identify six attack categories targeting perception, reasoning, memory, actions, multi-agent networks, and human supervisors, with proof-of-concept demonstrations for each.
Gemini 3.1 Flash Live scores 95.9% on Big Bench Audio, Google's fastest voice model
Google has released Gemini 3.1 Flash Live, its new voice and audio AI model, scoring 95.9% on the Big Bench Audio Benchmark at high thinking levels—second only to Step-Audio R1.1 Realtime at 97.0%. Response times range from 0.96 seconds at minimal thinking to 2.98 seconds at high thinking, with pricing held at $0.35 per hour of audio input and $1.40 per hour of audio output.
Google releases Gemini 3.1 Flash Live, its highest-quality audio model for real-time voice AI
Google has released Gemini 3.1 Flash Live, its highest-quality audio model designed for natural and reliable real-time voice interactions. The model scores 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge with thinking enabled. It's now available to developers via the Gemini Live API, enterprises through Gemini Enterprise for Customer Experience, and consumers in Search Live and Gemini Live across 200+ countries.
Google releases Gemini 3.1 Flash Live, its highest-quality audio model for real-time voice AI
Google has released Gemini 3.1 Flash Live, its highest-quality audio and voice model designed for real-time dialogue. The model scores 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge with reasoning enabled, with improved tonal understanding and lower latency compared to previous versions.
Google's TurboQuant compression cuts LLM memory needs by 6x, sparks memory chip stock selloff
Google unveiled TurboQuant, a compression technique that reduces memory required to run large language models by six times by optimizing key-value cache storage. Memory chipmakers Samsung, SK Hynix, and Micron fell 5-6% on concern the efficiency breakthrough could reduce future chip demand. Analysts expect the decline reflects profit-taking rather than a fundamental shift, as more powerful models will eventually require more advanced hardware.
Google's Lyria 3 Pro extends AI music generation to 3-minute songs with structural control
Google released Lyria 3 Pro, an updated music generation model capable of creating full 3-minute songs—six times longer than the 30-second limit of its predecessor launched last month. The new version adds granular control over song structure, allowing users to specify intros, verses, choruses, and bridges. It's available now for paid Gemini users, enterprise customers, and developers via API.
Google's Gemini app now creates 3-minute songs with Lyria 3 Pro
Google announced Lyria 3 Pro, expanding the Gemini app's music generation capability from 30-second tracks to full 3-minute songs. The model improves structural understanding of musical composition, allowing users to prompt for specific elements like intros, verses, choruses, and bridges. Available now for Gemini subscribers with tier-based daily limits (10-50 tracks/day) and in Vertex AI, Google AI Studio, and the Gemini API for developers.
Google DeepMind launches Lyria 3 Pro with 3-minute track generation and structural awareness
Google DeepMind introduced Lyria 3 Pro, an advanced music generation model capable of creating tracks up to 3 minutes long with structural awareness of musical composition elements like intros, verses, choruses, and bridges. The model is rolling out across multiple Google products including Vertex AI, Google Vids, Gemini app, and the new ProducerAI collaborative tool.
Google DeepMind's Gemini 3.1 Flash-Lite generates websites in real time, 2.5x faster than predecessor
Google DeepMind released Gemini 3.1 Flash-Lite, a model that generates functional websites in real time through a new pseudo-browser demo. The model achieves first response token 2.5 times faster than Gemini 2.5 Flash and outputs over 360 tokens per second, though output pricing has tripled from $0.40 to $1.50 per million tokens.
Google Deepmind adds multi-tool chaining and context circulation to Gemini API
Google Deepmind has expanded the Gemini API to enable multi-tool chaining, allowing developers to combine built-in tools like Google Search and Google Maps with custom functions in a single request. Results from one tool now automatically pass to the next through context circulation, eliminating the need for separate sequential handling.
Google releases Gemini 3.1 Flash-Lite, fastest model in 3 series
Google has released Gemini 3.1 Flash-Lite, positioning it as the fastest and most cost-efficient model in its Gemini 3 series. The release targets deployment scenarios requiring high-speed inference at reduced computational cost.
Google releases Gemini 3.1 Flash-Lite, fastest model in 3 series
Google DeepMind has released Gemini 3.1 Flash-Lite, positioning it as the fastest and most cost-efficient model in the Gemini 3 series. The release targets applications requiring high-speed inference at scale, continuing Google's multi-tier model strategy across the Gemini family.
Google DeepMind releases Nano Banana 2 image model with Pro-level capabilities at faster speeds
Google DeepMind has released Nano Banana 2, an image generation model that combines advanced world knowledge and subject consistency with faster inference speeds comparable to its Flash offering. The model is positioned as production-ready with capabilities previously associated with Pro-tier performance.
Google's Gemini 3.1 Pro now available in GitHub Copilot public preview
Google's Gemini 3.1 Pro, an agentic coding model optimized for autonomous development workflows, is now available in public preview within GitHub Copilot. The model emphasizes efficient edit-then-test loops and tool use in early testing.
Google DeepMind argues chatbot ethics require same rigor as coding benchmarks
Google DeepMind is pushing for moral behavior in large language models to be evaluated with the same technical rigor applied to coding and math benchmarks. As LLMs take on roles like companions, therapists, and medical advisors, the research group argues current evaluation standards are insufficient.
Google integrates Lyria 3 music generation into Gemini with text-to-music and cover art
Google Deepmind has integrated its Lyria 3 model into Gemini, enabling users to generate 30-second music tracks with vocals, lyrics, and cover art from text prompts or uploaded media. The model represents an expansion of Google's multimodal AI capabilities into creative audio generation.
Google announces Gemini 3.1 Pro for complex problem-solving tasks
Google announced Gemini 3.1 Pro, positioning the model for complex problem-solving tasks requiring deeper reasoning than previous versions. The release follows Gemini 3 Pro (November 2025) and Gemini 3 Flash (December 2025).